# Grep

The *Grep Filter* plugin lets you keep or exclude records by matching the values of their keys, including nested keys, against regular expression patterns.

## Configuration Parameters

The plugin supports the following configuration parameters:

| Key         | Value Format | Description                                                                                                                                                                                                                                                                                             |
| ----------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Regex       | KEY REGEX    | Keep records in which the content of KEY matches the regular expression.                                                                                                                                                                                                                                |
| Exclude     | KEY REGEX    | Exclude records in which the content of KEY matches the regular expression.                                                                                                                                                                                                                             |
| Logical\_Op | Operation    | Specify which logical operator to use. `AND`, `OR` and `legacy` are allowed as an Operation. Default is `legacy` for backward compatibility. In `legacy` mode the behaviour is either AND or OR, depending on whether the `grep` is including (uses AND) or excluding (uses OR). Only available from 2.1+. |

#### Record Accessor Enabled

This plugin supports the [Record Accessor](https://docs.fluentbit.io/manual/3.0/administration/configuring-fluent-bit/classic-mode/record-accessor) feature for specifying the KEY. Use a *record accessor* when the value you want to match is nested inside the record.

## Getting Started

To start filtering records, you can run the filter from the command line or through a configuration file. The following example assumes that you have a file called `lines.txt` with the following content:

```
{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}
```

### Command Line

> Note: using the command line mode requires special attention to quote the regular expressions properly. Using a configuration file is suggested instead.

The following command loads the *tail* plugin and reads the content of the `lines.txt` file. The *grep* filter then applies a regular expression rule to the *log* field (created by the tail plugin) and only *passes* the records whose field value contains *aa*. The pattern is not anchored, so it matches anywhere in the value:

```
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```

### Configuration File

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[SERVICE]
    parsers_file /path/to/parsers.conf

[INPUT]
    name   tail
    path   lines.txt
    parser json

[FILTER]
    name   grep
    match  *
    regex  log aa

[OUTPUT]
    name   stdout
    match  *
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
service:
    parsers_file: /path/to/parsers.conf
pipeline:
    inputs:
        - name: tail
          path: lines.txt
          parser: json
    filters:
        - name: grep
          match: '*'
          regex: log aa
    outputs:
        - name: stdout
          match: '*'

```

{% endtab %}
{% endtabs %}
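
The pattern `aa` in this rule is not anchored. As a rough illustration, the following Python sketch applies the same pattern to the parsed `log` values from `lines.txt`. It uses Python's `re` module as a stand-in for the regular expression engine in Fluent Bit, which is an assumption: the two engines agree on simple patterns like these but are not identical. The value `baa` is a hypothetical extra value added only for the comparison.

```python
import re

# Parsed `log` values from lines.txt, plus one hypothetical value ("baa").
values = ["aaa", "aab", "bbb", "ccc", "ddd", "eee", "fff", "ggg", "baa"]

# An unanchored pattern matches anywhere in the value.
unanchored = [v for v in values if re.search(r"aa", v)]

# Anchoring with ^ restricts the match to the start of the value.
anchored = [v for v in values if re.search(r"^aa", v)]

print(unanchored)  # ['aaa', 'aab', 'baa']
print(anchored)    # ['aaa', 'aab']
```

If you only want values that begin with `aa`, anchoring the pattern as `^aa` is the usual way to express that.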

The filter supports multiple rules, which are applied in order. You can add as many *Regex* and *Exclude* entries as required.
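
The default (`legacy`) behaviour with several rules can be pictured with a small Python model. This is a simplified sketch under stated assumptions, not the plugin's implementation: the `keep` function and the rule tuples are hypothetical names, the second rule (`b$`) is made up for the example, and Python's `re` module stands in for the Fluent Bit regular expression engine.

```python
import re

def keep(record, rules):
    """Apply (kind, key, pattern) rules in order; return True to keep the record."""
    for kind, key, pattern in rules:
        value = record.get(key)
        matched = isinstance(value, str) and re.search(pattern, value) is not None
        if kind == "regex" and not matched:
            return False  # a Regex rule that fails drops the record
        if kind == "exclude" and matched:
            return False  # an Exclude rule that matches drops the record
    return True

records = [{"log": v} for v in ["aaa", "aab", "bbb", "ccc"]]
rules = [("regex", "log", "aa"), ("exclude", "log", "b$")]

kept = [r["log"] for r in records if keep(r, rules)]
print(kept)  # ['aaa']
```

In this model every *Regex* rule has to match and no *Exclude* rule may match, which corresponds to the AND/OR description of `legacy` mode in the parameter table.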

### Nested fields example

If you want to match or exclude records based on nested values, you can use a [Record Accessor](https://docs.fluentbit.io/manual/3.0/administration/configuring-fluent-bit/classic-mode/record-accessor) format as the KEY name. Consider the following record example:

```javascript
{
    "log": "something",
    "kubernetes": {
        "pod_name": "myapp-0",
        "namespace_name": "default",
        "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
        "labels": {
            "app": "myapp"
        },
        "host": "minikube",
        "container_name": "myapp",
        "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
    }
}
```

If you want to exclude records whose nested field (for example `kubernetes.labels.app`) matches a given pattern, you can use the following rule:

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['labels']['app'] myapp
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
    filters:
        - name: grep
          match: '*'
          exclude: $kubernetes['labels']['app'] myapp
```

{% endtab %}
{% endtabs %}
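
To make the accessor syntax concrete, the following Python sketch resolves a path such as `$kubernetes['labels']['app']` against a nested record and then applies the exclude pattern. The `resolve` helper is hypothetical and handles only this simple `$key['sub']['sub']` form. The real Record Accessor supports more syntax than this sketch does, and Python's `re` module is used only as a stand-in for the Fluent Bit regular expression engine.

```python
import re

def resolve(record, accessor):
    """Resolve a path like $a['b']['c'] against nested dicts; None if missing."""
    m = re.fullmatch(r"\$(\w+)((?:\['[^']+'\])*)", accessor)
    if m is None:
        raise ValueError(f"unsupported accessor: {accessor}")
    # First key follows the $, the remaining keys are the quoted parts.
    keys = [m.group(1)] + re.findall(r"\['([^']+)'\]", m.group(2))
    value = record
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

# Trimmed-down version of the record shown above.
record = {
    "log": "something",
    "kubernetes": {"pod_name": "myapp-0", "labels": {"app": "myapp"}},
}

app = resolve(record, "$kubernetes['labels']['app']")
excluded = isinstance(app, str) and re.search("myapp", app) is not None
print(app, excluded)  # myapp True
```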

### Excluding records with missing or invalid fields

Your processing pipeline might need to drop records that are missing certain keys.

A simple way to do this is to add a `Regex` rule for the key with a pattern that matches anything: a record without the key fails the check and is excluded.

The following example also checks that the key holds a specific valid value:

{% tabs %}
{% tab title="fluent-bit.conf" %}

```
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
[FILTER]
    Name                     grep
    Alias                    filter-iots-grep
    Match                    iots_thread.*
    Regex                    iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
    filters:
        - name: grep
          alias: filter-iots-grep
          match: iots_thread.*
          regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}
{% endtabs %}

The specified key `iot_timestamp` must match the expected expression. If it does not, or if it is missing or empty, the record is excluded.
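
The effect of this rule on different records can be sketched in Python. This is an illustration only: the `passes` function and the sample records are hypothetical, and Python's `re` module stands in for the Fluent Bit regular expression engine.

```python
import re

PATTERN = r"^\d{4}-\d{2}-\d{2}"

def passes(record, key="iot_timestamp", pattern=PATTERN):
    """Return True if the key exists and its string value matches the pattern."""
    value = record.get(key)
    return isinstance(value, str) and re.search(pattern, value) is not None

samples = [
    {"iot_timestamp": "2023-01-22T09:46:49Z"},  # valid date prefix: kept
    {"iot_timestamp": "not-a-date"},            # wrong format: dropped
    {"iot_timestamp": ""},                      # empty value: dropped
    {"other": "value"},                         # key missing: dropped
]

print([passes(r) for r in samples])  # [True, False, False, False]
```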

### Multiple conditions

If you want to set multiple `Regex` or `Exclude` rules, you can use the `Logical_Op` property to combine them with a logical conjunction (`AND`) or disjunction (`OR`).

Note: If `Logical_Op` is set, setting both `Regex` and `Exclude` results in an error.

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[INPUT]
    Name dummy
    Dummy {"endpoint":"localhost", "value":"something"}
    Tag dummy

[FILTER]
    Name grep
    Match *
    Logical_Op or
    Regex value something
    Regex value error

[OUTPUT]
    Name stdout
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
    inputs:
        - name: dummy
          dummy: '{"endpoint":"localhost", "value":"something"}'
          tag: dummy
    filters:
        - name: grep
          match: '*'
          logical_op: or
          regex:
            - value something
            - value error
    outputs:
        - name: stdout
```

{% endtab %}
{% endtabs %}

The output will be similar to the following:

```
Fluent Bit v2.0.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/01/22 09:46:49] [ info] [fluent bit] version=2.0.9, commit=16eae10786, pid=33268
[2023/01/22 09:46:49] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/01/22 09:46:49] [ info] [cmetrics] version=0.5.8
[2023/01/22 09:46:49] [ info] [ctraces ] version=0.2.7
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] initializing
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2023/01/22 09:46:49] [ info] [filter:grep:grep.0] OR mode
[2023/01/22 09:46:49] [ info] [sp] stream processor started
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
```
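
The way `Logical_Op` combines rules in this example can be modelled with a few lines of Python. Treat it as a sketch under assumptions rather than a description of the plugin's internals: the `keep` function is a hypothetical name, the handling of `Exclude` rules (drop the record when the combined result is true) is an assumption not spelled out on this page, and Python's `re` module stands in for the Fluent Bit regular expression engine.

```python
import re

def keep(record, logical_op, regex=(), exclude=()):
    """Return True if the record is kept under AND/OR rule combination."""
    op = logical_op.lower()
    if op not in ("and", "or"):
        raise ValueError("this sketch only models AND and OR")
    if regex and exclude:
        # Mirrors the note above: both rule types together are an error.
        raise ValueError("Logical_Op cannot be combined with both Regex and Exclude")

    def matched(rule):
        key, pattern = rule
        value = record.get(key)
        return isinstance(value, str) and re.search(pattern, value) is not None

    combine = all if op == "and" else any
    if regex:
        return combine(matched(r) for r in regex)
    if exclude:
        return not combine(matched(r) for r in exclude)  # assumption, see lead-in
    return True

record = {"endpoint": "localhost", "value": "something"}
rules = [("value", "something"), ("value", "error")]

print(keep(record, "or", regex=rules))   # True
print(keep(record, "and", regex=rules))  # False
```

With `or`, one matching rule is enough, which is why the dummy record appears in the output above. With `and`, the same record would be dropped because `value` does not match `error`.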


