# Grep

The *Grep* filter plugin lets you match or exclude specific records based on regular expression patterns for values or nested values.

## Configuration parameters

The plugin supports the following configuration parameters:

| Key          | Value Format | Description                                                                                                                                                                                                                           |
| ------------ | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Regex`      | `KEY REGEX`  | Keep records where the content of `KEY` matches the regular expression.                                                                                                                                                               |
| `Exclude`    | `KEY REGEX`  | Exclude records where the content of `KEY` matches the regular expression.                                                                                                                                                            |
| `Logical_Op` | `Operation`  | Specify a logical operator: `AND`, `OR`, or `legacy` (default). In `legacy` mode, the behavior is either `AND` or `OR`, depending on whether the `grep` filter is including (uses `AND`) or excluding (uses `OR`). Available in version 2.1 and later. |

### Record accessor enabled

You can use the [record accessor](https://docs.fluentbit.io/manual/4.0/administration/configuring-fluent-bit/classic-mode/record-accessor) feature to specify the `KEY`. Record accessor syntax lets a rule match against nested values.

## Filter records

To start filtering records, run the filter from the command line or through the configuration file. The following example assumes that you have a file named `lines.txt` with the following content:

```
{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}
```

### Command line

When using the command line, take care to quote the regular expressions correctly. Using a configuration file might be easier.

The following command loads the [tail](https://docs.fluentbit.io/manual/4.0/data-pipeline/inputs/tail) plugin and reads the content of `lines.txt`. Then the `grep` filter applies a regular expression rule over the `log` field created by the `tail` plugin and only passes records with a field value containing `aa`. The pattern isn't anchored, so it matches anywhere in the value:

```shell
fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```

### Configuration file

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
service:
  parsers_file: /path/to/parsers.conf

pipeline:
  inputs:
    - name: tail
      path: lines.txt
      parser: json

  filters:
    - name: grep
      match: '*'
      regex: log aa

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[SERVICE]
  parsers_file /path/to/parsers.conf

[INPUT]
  name   tail
  path   lines.txt
  parser json

[FILTER]
  name   grep
  match  *
  regex  log aa

[OUTPUT]
  name   stdout
  match  *
```

{% endtab %}
{% endtabs %}

The filter lets you use multiple rules, which are applied in order. You can have as many `Regex` and `Exclude` entries as required ([more information](#multiple-conditions)).
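
As a minimal sketch of stacking rules, the following filter lists two `Exclude` entries. It assumes that the YAML list form shown for `regex` later on this page also applies to `exclude`, and it relies on the default `legacy` mode, where excluding rules are combined with `OR`. Under those assumptions, records whose `log` value contains `aa` or `bb` would be dropped:

```yaml
pipeline:
  filters:
    - name: grep
      match: '*'
      # logical_op is unset, so the default legacy mode applies:
      # a record is dropped if any exclude rule matches.
      exclude:
        - log aa
        - log bb
```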

### Nested fields example

To match or exclude records based on nested values, you can use [Record Accessor](https://docs.fluentbit.io/manual/4.0/administration/configuring-fluent-bit/classic-mode/record-accessor) format as the `KEY` name.

Consider the following record example:

```json
{
  "log": "something",
  "kubernetes": {
    "pod_name": "myapp-0",
    "namespace_name": "default",
    "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
    "labels": {
      "app": "myapp"
    },
    "host": "minikube",
    "container_name": "myapp",
    "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
  }
}
```

For example, to exclude records where the value of the nested field `kubernetes.labels.app` matches `myapp`, use the following rule:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  filters:
    - name: grep
      match: '*'
      exclude: $kubernetes['labels']['app'] myapp
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[FILTER]
  Name    grep
  Match   *
  Exclude $kubernetes['labels']['app'] myapp
```

{% endtab %}
{% endtabs %}
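
The same accessor syntax can also serve as the `KEY` of a `Regex` rule. As a sketch against the sample record above, the following filter keeps only records whose `kubernetes.namespace_name` value is exactly `default`. The anchored pattern is an illustrative choice, not part of the original example:

```yaml
pipeline:
  filters:
    - name: grep
      match: '*'
      # Keep a record only when the nested namespace_name value is exactly "default".
      regex: $kubernetes['namespace_name'] ^default$
```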

### Excluding records with missing or invalid fields

You might want to drop records that are missing certain keys.

One way to do this is to add a `Regex` rule for the key. A record that's missing the key can't match the rule, so it's dropped.

The following example checks for a specific valid value for the key:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  filters:
    # Use Grep to verify the contents of the iot_timestamp value.
    # If the iot_timestamp key does not exist, this will fail
    # and exclude the row.
    - name: grep
      alias: filter-iots-grep
      match: iots_thread.*
      regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
[FILTER]
  Name                     grep
  Alias                    filter-iots-grep
  Match                    iots_thread.*
  Regex                    iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}
{% endtabs %}

The value of the specified key `iot_timestamp` must match the expected expression. If the value doesn't match, or the key is missing or empty, the record is excluded.
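
If only the presence of the key matters and any non-empty value is acceptable, a looser pattern can be used. This sketch assumes that `iot_timestamp` holds a string value and reuses the tag pattern from the previous example. The pattern `.+` matches any value with at least one character:

```yaml
pipeline:
  filters:
    - name: grep
      match: iots_thread.*
      # Records without iot_timestamp, or with an empty value,
      # fail this rule and are dropped.
      regex: iot_timestamp .+
```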

### Multiple conditions

To use both `Regex` and `Exclude` rules in the same filter, you must use `legacy` mode. In this mode, the `Exclude` rules must come first, and you can have only one `Regex` rule. You can have multiple `Exclude` entries. If any `Exclude` rule matches, the record is blocked. Otherwise, if there is no `Regex` rule, the record is sent to the output.

If there is a `Regex` rule and it matches, the record is sent to the output. Otherwise, it's blocked.
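
As a sketch of the `legacy` behavior described above, the following filter places one `Exclude` rule ahead of a single `Regex` rule and leaves `logical_op` unset, so the default `legacy` mode applies. It assumes the `lines.txt` input and `json` parser from the earlier configuration file example, and the patterns are illustrative. Under the rules above, a record whose `log` value is `aab` would be blocked by the `Exclude` rule, and of the remaining records only those whose `log` value starts with `aa` would reach the output:

```yaml
pipeline:
  filters:
    - name: grep
      match: '*'
      # Exclude comes first: block records whose log value is exactly "aab".
      exclude: log ^aab$
      # Single Regex rule: of the rest, keep only values starting with "aa".
      regex: log ^aa
```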

To combine multiple `Regex` rules or multiple `Exclude` rules, set the `Logical_Op` property to `AND` for logical conjunction or `OR` for logical disjunction.

If `Logical_Op` is set, setting both `Regex` and `Exclude` results in an error.

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
  inputs:
    - name: dummy
      dummy: '{"endpoint":"localhost", "value":"something"}'
      tag: dummy

  filters:
    - name: grep
      match: '*'
      logical_op: or
      regex:
        - value something
        - value error

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[INPUT]
  Name dummy
  Dummy {"endpoint":"localhost", "value":"something"}
  Tag dummy

[FILTER]
  Name grep
  Match *
  Logical_Op or
  Regex value something
  Regex value error

[OUTPUT]
  Name stdout
  Match *
```

{% endtab %}
{% endtabs %}

The output looks similar to:

```
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
```
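
For comparison, the following sketch sets `logical_op` to `and`, so a record passes only when every `Regex` rule matches. The rule on `endpoint` is an illustrative addition. With the dummy record from the preceding example, both rules match, so the record would still be expected to pass:

```yaml
pipeline:
  filters:
    - name: grep
      match: '*'
      # AND: every regex rule must match for the record to be kept.
      logical_op: and
      regex:
        - value something
        - endpoint localhost
```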

