# Grep

{% hint style="info" %}
**Supported event types:** `logs`
{% endhint %}

The *Grep* filter plugin lets you match or exclude specific records based on regular expression patterns for values or nested values.

## Configuration parameters

The plugin supports the following configuration parameters:

| Key          | Value Format | Description                                                                                                                                                                                                                   | Default  |
| ------------ | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- |
| `exclude`    | `KEY REGEX`  | Exclude records where the content of `KEY` matches the regular expression.                                                                                                                                                    | *none*   |
| `logical_op` | `Operation`  | Specify a logical operator: `AND`, `OR`, or `legacy`. In `legacy` mode, the behavior is either `AND` or `OR` depending on whether the `grep` is including (uses `AND`) or excluding (uses `OR`). Available from version 2.1.  | `legacy` |
| `regex`      | `KEY REGEX`  | Keep records where the content of `KEY` matches the regular expression.                                                                                                                                                       | *none*   |

### Record accessor enabled

This plugin supports the [record accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) syntax for specifying the `KEY`. Use a record accessor pattern to match against nested values.

## Filter records

To start filtering records, run the filter from the command line or through the configuration file. The following example assumes that you have a file named `lines.txt` with the following content:

```
{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}
```

### Command line

When using the command line, take care to quote the regular expressions correctly. Using a configuration file might be easier.

The following command loads the [tail](https://docs.fluentbit.io/manual/data-pipeline/inputs/tail) plugin and reads the content of `lines.txt`. Then the `grep` filter applies a regular expression rule over the `log` field created by the `tail` plugin and only passes records whose field value contains `aa`. Because the pattern isn't anchored, it can match anywhere in the value:

```shell
fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```

### Configuration file

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
service:
  parsers_file: /path/to/parsers.conf

pipeline:
  inputs:
    - name: tail
      path: lines.txt
      parser: json

  filters:
    - name: grep
      match: '*'
      regex: log aa

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[SERVICE]
  parsers_file /path/to/parsers.conf

[INPUT]
  Name   tail
  Path   lines.txt
  Parser json

[FILTER]
  Name   grep
  Match  *
  Regex  log aa

[OUTPUT]
  Name   stdout
  Match  *
```

{% endtab %}
{% endtabs %}
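
An unanchored pattern such as `aa` matches any value that contains `aa`. As a minimal sketch, assuming the same `lines.txt` input and `json` parser as in the previous example, anchoring the pattern with `^` restricts the match to values that start with `aa`:

```yaml
service:
  parsers_file: /path/to/parsers.conf

pipeline:
  inputs:
    - name: tail
      path: lines.txt
      parser: json

  filters:
    - name: grep
      match: '*'
      # ^ anchors the pattern to the start of the log value
      regex: log ^aa

  outputs:
    - name: stdout
      match: '*'
```

With the sample file, both patterns should select the same two records (`aaa` and `aab`). They would differ only for a value such as `baa`, which isn't in the sample file.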

The filter lets you use multiple rules, which are applied in order. You can have as many `regex` and `exclude` entries as required ([more information](#multiple-conditions)).

### Nested fields example

To match or exclude records based on nested values, you can use [Record Accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) format as the `KEY` name.

Consider the following record example:

```json
{
  "log": "something",
  "kubernetes": {
    "pod_name": "myapp-0",
    "namespace_name": "default",
    "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
    "labels": {
      "app": "myapp"
    },
    "host": "minikube",
    "container_name": "myapp",
    "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
  }
}
```

For example, to exclude records whose nested field `kubernetes.labels.app` matches `myapp`, use the following rule:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  filters:
    - name: grep
      match: '*'
      exclude: $kubernetes['labels']['app'] myapp
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[FILTER]
  Name    grep
  Match   *
  Exclude $kubernetes['labels']['app'] myapp
```

{% endtab %}
{% endtabs %}
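
The same record accessor syntax works with `regex`. As an illustrative sketch based on the sample record above (the choice of field and pattern is an assumption, not a recommendation), the following rule keeps only records whose `kubernetes.namespace_name` value is exactly `default`:

```yaml
pipeline:

  filters:
    - name: grep
      match: '*'
      # Keep only records whose kubernetes.namespace_name is exactly "default".
      # ^ and $ anchor the pattern so that "default-2" wouldn't match.
      regex: $kubernetes['namespace_name'] ^default$
```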

### Excluding records with missing or invalid fields

You might want to drop records that are missing certain keys.

One way to do this is to use a `regex` rule with a regular expression that matches anything. A record that's missing the key fails this check and is dropped.

The following example checks for a specific valid value for the key:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  filters:
    # Use Grep to verify the contents of the iot_timestamp value.
    # If the iot_timestamp key does not exist, this will fail
    # and exclude the row.
    - name: grep
      alias: filter-iots-grep
      match: iots_thread.*
      regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
[FILTER]
  Name                     grep
  Alias                    filter-iots-grep
  Match                    iots_thread.*
  Regex                    iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}
{% endtabs %}

The value of the specified key `iot_timestamp` must match the expected expression. If the value doesn't match, or if the key is missing or empty, the record is excluded.
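
If you only care that the key is present, and not about the format of its value, a looser pattern can be used. The following is a minimal sketch that reuses the `iot_timestamp` key and `iots_thread.*` match from the previous example and assumes the value is a string:

```yaml
pipeline:

  filters:
    - name: grep
      match: iots_thread.*
      # .+ matches any non-empty value. A record without the
      # iot_timestamp key can't match, so it's dropped.
      regex: iot_timestamp .+
```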

### Multiple conditions

To use both `regex` and `exclude` rules in the same filter, you must use `legacy` mode, which is the default. In this mode, the `exclude` rules must come first, and you can have only one `regex` rule. You can have multiple `exclude` entries. If any `exclude` rule matches, the record is blocked. Otherwise, if there is no `regex` rule, the record is sent to the output.

If there is a `regex` rule and it matches, the record is sent to the output. Otherwise, the record is blocked.
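
As a sketch of the `legacy` ordering described above, the following filter lists one `exclude` rule before a single `regex` rule. It assumes records with a `log` field like those in the `lines.txt` example, and the patterns are illustrative only:

```yaml
pipeline:

  filters:
    - name: grep
      match: '*'
      # logical_op isn't set, so the filter runs in legacy mode.
      # The exclude rule comes first: block records whose log is exactly "aab".
      exclude: log ^aab$
      # Then a single regex rule: keep the remaining records whose log starts with "a".
      regex: log ^a
```

Under these rules, a record with `"log": "aab"` would be blocked by the `exclude` rule, a record with `"log": "aaa"` would pass the `regex` rule, and records such as `"log": "bbb"` would be blocked because the `regex` rule doesn't match.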

To combine multiple `regex` rules, or multiple `exclude` rules, with a logical conjunction or disjunction, set the `logical_op` property to `AND` or `OR`.

If `logical_op` is set to `AND` or `OR`, setting both `regex` and `exclude` in the same filter results in an error.

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
  inputs:
    - name: dummy
      dummy: '{"endpoint":"localhost", "value":"something"}'
      tag: dummy

  filters:
    - name: grep
      match: '*'
      logical_op: or
      regex:
        - value something
        - value error

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[INPUT]
  Name  dummy
  Dummy {"endpoint":"localhost", "value":"something"}
  Tag   dummy

[FILTER]
  Name       grep
  Match      *
  Logical_Op or
  Regex      value something
  Regex      value error

[OUTPUT]
  Name stdout
  Match *
```

{% endtab %}
{% endtabs %}

The output looks similar to:

```
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
```
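
For comparison, the following sketch uses the same `dummy` input with `logical_op` set to `and`, so a record is kept only when every `regex` rule matches. The choice of the `endpoint` and `value` keys is illustrative:

```yaml
pipeline:
  inputs:
    - name: dummy
      dummy: '{"endpoint":"localhost", "value":"something"}'
      tag: dummy

  filters:
    - name: grep
      match: '*'
      # Keep a record only if both rules match.
      logical_op: and
      regex:
        - endpoint localhost
        - value something

  outputs:
    - name: stdout
      match: '*'
```

The dummy record satisfies both rules, so it should still reach the output. Changing either pattern to one that doesn't match the record would cause it to be dropped.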
