Fluent Bit: Official Manual
SlackGitHubCommunity MeetingsSandbox and LabsWebinars
3.2
3.2
  • Fluent Bit v3.2 Documentation
  • About
    • What is Fluent Bit?
    • A Brief History of Fluent Bit
    • Fluentd & Fluent Bit
    • License
    • Sandbox and Lab Resources
  • Concepts
    • Key Concepts
    • Buffering
    • Data Pipeline
      • Input
      • Parser
      • Filter
      • Buffer
      • Router
      • Output
  • Installation
    • Getting Started with Fluent Bit
    • Upgrade Notes
    • Supported Platforms
    • Requirements
    • Sources
      • Download Source Code
      • Build and Install
      • Build with Static Configuration
    • Linux Packages
      • Amazon Linux
      • Redhat / CentOS
      • Debian
      • Ubuntu
      • Raspbian / Raspberry Pi
    • Docker
    • Containers on AWS
    • Amazon EC2
    • Kubernetes
    • macOS
    • Windows
    • Yocto / Embedded Linux
    • Buildroot / Embedded Linux
  • Administration
    • Configuring Fluent Bit
      • YAML Configuration
        • Service
        • Parsers
        • Multiline Parsers
        • Pipeline
        • Plugins
        • Upstream Servers
        • Environment Variables
        • Includes
      • Classic mode
        • Format and Schema
        • Configuration File
        • Variables
        • Commands
        • Upstream Servers
        • Record Accessor
      • Unit Sizes
      • Multiline Parsing
    • Transport Security
    • Buffering & Storage
    • Backpressure
    • Scheduling and Retries
    • Networking
    • Memory Management
    • Monitoring
    • Multithreading
    • HTTP Proxy
    • Hot Reload
    • Troubleshooting
    • Performance Tips
    • AWS credentials
  • Local Testing
    • Validating your Data and Structure
    • Running a Logging Pipeline Locally
  • Data Pipeline
    • Pipeline Monitoring
    • Inputs
      • Collectd
      • CPU Log Based Metrics
      • Disk I/O Log Based Metrics
      • Docker Events
      • Docker Log Based Metrics
      • Dummy
      • Elasticsearch
      • Exec
      • Exec Wasi
      • Ebpf
      • Fluent Bit Metrics
      • Forward
      • Head
      • Health
      • HTTP
      • Kafka
      • Kernel Logs
      • Kubernetes Events
      • Memory Metrics
      • MQTT
      • Network I/O Log Based Metrics
      • NGINX Exporter Metrics
      • Node Exporter Metrics
      • OpenTelemetry
      • Podman Metrics
      • Process Exporter Metrics
      • Process Log Based Metrics
      • Prometheus Remote Write
      • Prometheus Scrape Metrics
      • Random
      • Serial Interface
      • Splunk
      • Standard Input
      • StatsD
      • Syslog
      • Systemd
      • Tail
      • TCP
      • Thermal
      • UDP
      • Windows Event Log
      • Windows Event Log (winevtlog)
      • Windows Exporter Metrics
    • Parsers
      • Configuring Parser
      • JSON
      • Regular Expression
      • LTSV
      • Logfmt
      • Decoders
    • Processors
      • Content Modifier
      • Labels
      • Metrics Selector
      • OpenTelemetry Envelope
      • SQL
    • Filters
      • AWS Metadata
      • CheckList
      • ECS Metadata
      • Expect
      • GeoIP2 Filter
      • Grep
      • Kubernetes
      • Log to Metrics
      • Lua
      • Parser
      • Record Modifier
      • Modify
      • Multiline
      • Nest
      • Nightfall
      • Rewrite Tag
      • Standard Output
      • Sysinfo
      • Throttle
      • Type Converter
      • Tensorflow
      • Wasm
    • Outputs
      • Amazon CloudWatch
      • Amazon Kinesis Data Firehose
      • Amazon Kinesis Data Streams
      • Amazon S3
      • Azure Blob
      • Azure Data Explorer
      • Azure Log Analytics
      • Azure Logs Ingestion API
      • Counter
      • Dash0
      • Datadog
      • Dynatrace
      • Elasticsearch
      • File
      • FlowCounter
      • Forward
      • GELF
      • Google Chronicle
      • Google Cloud BigQuery
      • HTTP
      • InfluxDB
      • Kafka
      • Kafka REST Proxy
      • LogDNA
      • Loki
      • Microsoft Fabric
      • NATS
      • New Relic
      • NULL
      • Observe
      • OpenObserve
      • OpenSearch
      • OpenTelemetry
      • Oracle Log Analytics
      • PostgreSQL
      • Prometheus Exporter
      • Prometheus Remote Write
      • SkyWalking
      • Slack
      • Splunk
      • Stackdriver
      • Standard Output
      • Syslog
      • TCP & TLS
      • Treasure Data
      • Vivo Exporter
      • WebSocket
  • Stream Processing
    • Introduction to Stream Processing
    • Overview
    • Changelog
    • Getting Started
      • Fluent Bit + SQL
      • Check Keys and NULL values
      • Hands On! 101
  • Fluent Bit for Developers
    • C Library API
    • Ingest Records Manually
    • Golang Output Plugins
    • WASM Filter Plugins
    • WASM Input Plugins
    • Developer guide for beginners on contributing to Fluent Bit
Powered by GitBook
On this page
  • Configuration parameters
  • Record Accessor Enabled
  • Filter records
  • Command line
  • Configuration file
  • Nested fields example
  • Excluding records with missing or invalid fields
  • Multiple conditions

Was this helpful?

Export as PDF
  1. Data Pipeline
  2. Filters

Grep

Select or exclude records using patterns

Last updated 7 months ago

Was this helpful?

The Grep Filter plugin lets you match or exclude specific records based on regular expression patterns for values or nested values.

Configuration parameters

The plugin supports the following configuration parameters:

Key
Value Format
Description

Regex

KEY REGEX

Keep records where the content of KEY matches the regular expression.

Exclude

KEY REGEX

Exclude records where the content of KEY matches the regular expression.

Logical_Op

Operation

Specify a logical operator: AND, OR or legacy (default). In legacy mode the behaviour is either AND or OR depending on whether the grep is including (uses AND) or excluding (uses OR). Available from 2.1 or higher.

Record Accessor Enabled

Enable the feature to specify the KEY. Use the record accessor to match values against nested values.

Filter records

To start filtering records, run the filter from the command line or through the configuration file. The following example assumes that you have a file named lines.txt with the following content:

{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}

Command line

When using the command line, pay close attention to quote the regular expressions. Using a configuration file might be easier.

$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout

Configuration file

[SERVICE]
    parsers_file /path/to/parsers.conf

[INPUT]
    name   tail
    path   lines.txt
    parser json

[FILTER]
    name   grep
    match  *
    regex  log aa

[OUTPUT]
    name   stdout
    match  *
service:
    parsers_file: /path/to/parsers.conf
pipeline:
    inputs:
        - name: tail
          path: lines.txt
          parser: json
    filters:
        - name: grep
          match: '*'
          regex: log aa
    outputs:
        - name: stdout
          match: '*'

The filter lets you use multiple rules which are applied in order. You can have as many Regex and Exclude entries as required.

Nested fields example

Consider the following record example:

{
    "log": "something",
    "kubernetes": {
        "pod_name": "myapp-0",
        "namespace_name": "default",
        "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
        "labels": {
            "app": "myapp"
        },
        "host": "minikube",
        "container_name": "myapp",
        "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
    }
}

For example, to exclude records that match the nested field kubernetes.labels.app, use the following rule:

[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['labels']['app'] myapp
    filters:
        - name: grep
          match: '*'
          exclude: $kubernetes['labels']['app'] myapp

Excluding records with missing or invalid fields

You might want to drop records that are missing certain keys.

One way to do this is to exclude with a regex that matches anything. A missing key fails this check.

The followinfg example checks for a specific valid value for the key:

# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
[FILTER]
    Name                     grep
    Alias                    filter-iots-grep
    Match                    iots_thread.*
    Regex                    iot_timestamp ^\d{4}-\d{2}-\d{2}
    filters:
        - name: grep
          alias: filter-iots-grep
          match: iots_thread.*
          regex: iot_timestamp ^\d{4}-\d{2}-\d{2}

The specified key iot_timestamp must match the expected expression. If it doesn't, or is missing or empty, then it will be excluded.

Multiple conditions

If you want to set multiple Regex or Exclude, use the Logical_Op property to use a logical conjuction or disjunction.

If Logical_Op is set, setting both Regex and Exclude results in an error.

[INPUT]
    Name dummy
    Dummy {"endpoint":"localhost", "value":"something"}
    Tag dummy

[FILTER]
    Name grep
    Match *
    Logical_Op or
    Regex value something
    Regex value error

[OUTPUT]
    Name stdout
pipeline:
    inputs:
        - name: dummy
          dummy: '{"endpoint":"localhost", "value":"something"}'
          tag: dummy
    filters:
        - name: grep
          match: '*'
          logical_op: or
          regex:
            - value something
            - value error
    outputs:
        - name: stdout

The output looks similar to:

Fluent Bit v2.0.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/01/22 09:46:49] [ info] [fluent bit] version=2.0.9, commit=16eae10786, pid=33268
[2023/01/22 09:46:49] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/01/22 09:46:49] [ info] [cmetrics] version=0.5.8
[2023/01/22 09:46:49] [ info] [ctraces ] version=0.2.7
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] initializing
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2023/01/22 09:46:49] [ info] [filter:grep:grep.0] OR mode
[2023/01/22 09:46:49] [ info] [sp] stream processor started
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]

The following command loads the plugin and reads the content of lines.txt. Then the grep filter applies a regular expression rule over the log field created by the tail plugin and only passes records with a field value starting with aa:

To match or exclude records based on nested values, you can use format as the KEY name.

Record Accessor
tail
Record Accessor