Filters

AWS Metadata

The AWS filter enriches logs with AWS metadata. Currently the plugin adds the EC2 instance ID and availability zone to log records. To use this plugin, you must be running in EC2 and have the instance metadata service enabled.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Default

Note: If you run Fluent Bit in a container, you may have to use instance metadata v1. The plugin behaves the same regardless of which version is used.

Command Line

Configuration File

EC2 Tags

EC2 Tags are a useful feature that enables you to label and organize your EC2 instances with custom-defined key-value pairs. These tags are commonly used for resource management, cost allocation, and automation. Consequently, including them in the logs generated by Fluent Bit is often essential.

To achieve this, the AWS filter can be configured with tags_enabled true to tag logs with the relevant EC2 instance tags. This setup ensures that logs are appropriately tagged, making it easier to manage and analyze them based on specific criteria.

Requirements

To use the tags_enabled true functionality in Fluent Bit, the instance-metadata-tags option must be enabled on the EC2 instance where Fluent Bit is running. Without this option enabled, Fluent Bit will not be able to retrieve the tags associated with the EC2 instance. This does not mean that Fluent Bit will fail or stop working altogether. If the option is not enabled, Fluent Bit will continue to operate normally and capture other values, such as the EC2 instance ID or availability zone, based on its configuration.
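If the instance-metadata-tags option is not yet enabled, it can be turned on from the EC2 console or with the AWS CLI. The following is a minimal sketch; the instance ID and region are placeholders you must replace:

$ aws ec2 modify-instance-metadata-options \
    --instance-id i-0123456789abcdef0 \
    --instance-metadata-tags enabled \
    --region us-west-2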

Example

tags_include

Assume that our EC2 instance has many tags, some of which have lengthy values that are irrelevant to the logs we want to collect. Only two tags, department and project, seem to be valuable for our purpose. Here is a configuration which reflects this requirement:

If we run Fluent Bit, what will the logs look like? Here is an example of what the logs might contain:

tags_exclude

Suppose our EC2 instance has three tags: Name:fluent-bit-docs-example, project:fluentbit, and department:it. In this example, we want to exclude the department tag since we consider it redundant. This is because all of our projects belong to the it department, and we do not need to waste storage space on redundant labels.

Here is an example configuration that achieves this:

The resulting logs might look like this:

ami_id

The EC2 instance image id.

false

account_id

The account ID of the current EC2 instance.

false

hostname

The hostname of the current EC2 instance.

false

vpc_id

The VPC ID of the current EC2 instance.

false

tags_enabled

Specifies whether to attach EC2 instance tags. The EC2 instance must have the instance-metadata-tags option enabled (it is disabled by default).

false

tags_include

Defines a list of specific EC2 tag keys to inject into the logs. Tag keys must be separated by the "," character. Tags not present in this list will be ignored. Example: Name,tag1,tag2.

tags_exclude

Defines a list of specific EC2 tag keys not to inject into the logs. Tag keys must be separated by the "," character. Tags not present in this list will be injected into the logs. If both tags_include and tags_exclude are specified, the configuration is invalid and the plugin fails. Example: Name,tag1,tag2

imds_version

Specify which version of the instance metadata service to use. Valid values are 'v1' or 'v2'.

v2

az

The availability zone; for example, "us-east-1a".

true

ec2_instance_id

The EC2 instance ID.

true

ec2_instance_type

The EC2 instance type.

false

private_ip

The EC2 instance private IP.


false

$ bin/fluent-bit -c /PATH_TO_CONF_FILE/fluent-bit.conf

[2020/01/17 07:57:17] [ info] [engine] started (pid=32744)
[0] dummy: [1579247838.000171227, {"message"=>"dummy", "az"=>"us-west-2c", "ec2_instance_id"=>"i-0c862eca9038f5aae", "ec2_instance_type"=>"t2.medium", "private_ip"=>"172.31.6.59", "vpc_id"=>"vpc-7ea11c06", "ami_id"=>"ami-0841edc20334f9287", "account_id"=>"YOUR_ACCOUNT_ID", "hostname"=>"ip-172-31-6-59.us-west-2.compute.internal"}]
[0] dummy: [1601274509.970235760, {"message"=>"dummy", "az"=>"us-west-2c", "ec2_instance_id"=>"i-0c862eca9038f5aae", "ec2_instance_type"=>"t2.medium", "private_ip"=>"172.31.6.59", "vpc_id"=>"vpc-7ea11c06", "ami_id"=>"ami-0841edc20334f9287", "account_id"=>"YOUR_ACCOUNT_ID", "hostname"=>"ip-172-31-6-59.us-west-2.compute.internal"}]
[INPUT]
    Name dummy
    Tag dummy

[FILTER]
    Name aws
    Match *
    imds_version v1
    az true
    ec2_instance_id true
    ec2_instance_type true
    private_ip true
    ami_id true
    account_id true
    hostname true
    vpc_id true
    tags_enabled true

[OUTPUT]
    Name stdout
    Match *
[FILTER]
    Name aws
    Match *
    tags_enabled true
    tags_include department,project
{"log"=>"fluentbit is awesome", "az"=>"us-east-1a", "ec2_instance_id"=>"i-0e66fc7f9809d7168", "department"=>"it", "project"=>"fluentbit"}
[FILTER]
    Name aws
    Match *
    tags_enabled true
    tags_exclude department
{"log"=>"aws is awesome", "az"=>"us-east-1a", "ec2_instance_id"=>"i-0e66fc7f9809d7168", "Name"=>"fluent-bit-docs-example", "project"=>"fluentbit"}

Standard Output

The stdout filter plugin prints the data that flows through it to standard output, which can be very useful while debugging.

The plugin has no configuration parameters and is very simple to use.

Command Line

$ fluent-bit -i cpu -F stdout -m '*' -o null

This command gathers CPU usage metrics and prints them out in a human-readable way as they flow through the stdout filter.

Fluent Bit v1.x.x
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/06/04 14:53:59] [ info] [engine] started (pid=3236719)
[2021/06/04 14:53:59] [ info] [storage] version=1.1.1, initializing...
[2021/06/04 14:53:59] [ info] [storage] in-memory
[2021/06/04 14:53:59] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/06/04 14:53:59] [ info] [sp] stream processor started
[0] cpu.0: [1622789640.379532062, {"cpu_p"=>9.000000, "user_p"=>6.500000, "system_p"=>2.500000, "cpu0.p_cpu"=>8.000000, "cpu0.p_user"=>6.000000, "cpu0.p_system"=>2.000000, "cpu1.p_cpu"=>9.000000, "cpu1.p_user"=>6.000000, "cpu1.p_system"=>3.000000}]
[0] cpu.0: [1622789641.379529426, {"cpu_p"=>22.500000, "user_p"=>18.000000, "system_p"=>4.500000, "cpu0.p_cpu"=>34.000000, "cpu0.p_user"=>30.000000, "cpu0.p_system"=>4.000000, "cpu1.p_cpu"=>11.000000, "cpu1.p_user"=>6.000000, "cpu1.p_system"=>5.000000}]
[0] cpu.0: [1622789642.379544020, {"cpu_p"=>26.500000, "user_p"=>16.000000, "system_p"=>10.500000, "cpu0.p_cpu"=>30.000000, "cpu0.p_user"=>24.000000, "cpu0.p_system"=>6.000000, "cpu1.p_cpu"=>22.000000, "cpu1.p_user"=>8.000000, "cpu1.p_system"=>14.000000}]
[0] cpu.0: [1622789643.379507371, {"cpu_p"=>39.500000, "user_p"=>34.500000, "system_p"=>5.000000, "cpu0.p_cpu"=>52.000000, "cpu0.p_user"=>48.000000, "cpu0.p_system"=>4.000000, "cpu1.p_cpu"=>28.000000, "cpu1.p_user"=>21.000000, "cpu1.p_system"=>7.000000}]
^C[2021/06/04 14:54:04] [engine] caught signal (SIGINT)
[2021/06/04 14:54:04] [ info] [input] pausing cpu.0
[2021/06/04 14:54:04] [ warn] [engine] service will stop in 5 seconds
[2021/06/04 14:54:08] [ info] [engine] service stopped

GeoIP2 Filter

Look up Geo data from IP

The GeoIP2 filter allows you to enrich the incoming data stream using location data from the GeoIP2 database.

Configuration Parameters

This plugin supports the following configuration parameters:

Key
Description

Getting Started

The following configuration will process the incoming remote_addr field and append country information retrieved from the GeoLite2 database.

Each Record parameter above specifies the following triplet:

  1. The field name to be added to records (country)

  2. The lookup key to process (remote_addr)

  3. The query for GeoIP2 database (%{country.names.en})

By running Fluent Bit with the configuration above, you will see the following output:

Note that the GeoLite2-City.mmdb database is available from MaxMind's official site.

Expect

Made for testing: make sure that your records contain the expected key and values

The expect filter plugin allows you to validate that records match certain criteria in their structure, such as validating that a key exists or that it has a specific value.

This page describes only the available configuration properties. For a detailed explanation of its usage and use cases, please refer to the following page:

  • Validating your Data and Structure

Configuration Parameters

The plugin supports the following configuration parameters:

Property
Description

Getting Started

As mentioned at the top, refer to the Validating your Data and Structure page for specific details on the usage of this filter.
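In the meantime, here is a minimal sketch that combines properties documented for this filter (key_exists and action); the dummy record is illustrative, and depending on your Fluent Bit version the key may need to be referenced with the record accessor form (for example $color):

[INPUT]
    name  dummy
    dummy {"color": "blue"}

[FILTER]
    name       expect
    match      *
    key_exists color
    action     warn

[OUTPUT]
    name  stdout
    match *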

Sysinfo

The Sysinfo filter plugin allows you to append system information such as the Fluent Bit version or the hostname.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Supported platform

Wasm

Use Wasm programs as a filter

The Wasm filter allows you to modify incoming records using Wasm (WebAssembly) technology.

Because a flexible filtering mechanism is needed, it is now possible to extend Fluent Bit's capabilities by writing custom filters as built Wasm programs executed by its runtime. Using a Wasm-based filter takes two steps:

  1. (Optional) Compile the Wasm program as an AOT (Ahead Of Time) object to optimize the Wasm execution pipeline

  2. Configure the Filter in the main configuration

action

Action to take when a rule does not match. The available options are warn, exit and result_key. With warn, a warning message is sent to the logging layer when a mismatch of the rules above is found; exit makes Fluent Bit abort with status code 255; result_key adds a matching result to each record.

result_key

Specify the key name for the matching result. This key is used only when action is result_key.

key_exists

Check if a key with a given name exists in the record.

key_not_exists

Check if a key does not exist in the record.

key_val_is_null

Check that the value of the key is NULL.

key_val_is_not_null

Check that the value of the key is NOT NULL.

key_val_eq

Check that the value of the key equals the given value in the configuration.

Validating your Data and Structure

database

Path to the GeoIP2 database.

lookup_key

Field name to process

record

Defines the KEY LOOKUP_KEY VALUE triplet. See below for how to set up this option.


Prepare a Wasm program that will be used by the Filter

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description

Wasm_Path

Path to the built Wasm program that will be used. This can be a relative path against the main configuration file.

Function_Name

Wasm function name that will be triggered to do filtering. It's assumed that the function is built inside the Wasm program specified above.

Accessible_Paths

Specify the whitelist of paths that can be accessed from Wasm programs.

Configuration Examples

Here is a configuration example.

Wasm
[INPUT]
    Name   dummy
    Dummy  {"remote_addr": "8.8.8.8"}

[FILTER]
    Name geoip2
    Match *
    Database GeoLite2-City.mmdb
    Lookup_key remote_addr
    Record country remote_addr %{country.names.en}
    Record isocode remote_addr %{country.iso_code}

[OUTPUT]
    Name   stdout
    Match  *
{"remote_addr": "8.8.8.8", "country": "United States", "isocode": "US"}
[INPUT]
    Name   dummy
    Tag    dummy.local

[FILTER]
    Name wasm
    Match dummy.*
    WASM_Path /path/to/wasm_program.wasm
    Function_Name filter_function_name
    Accessible_Paths .,/path/to/accessible

[OUTPUT]
    Name   stdout
    Match  *

fluentbit_version_key

Specify the key name for the Fluent Bit version.

All

os_name_key

Specify the key name for the OS name, e.g. linux, win64 or macos.

All

hostname_key

Specify the key name for the hostname.

All

os_version_key

Specify the key name for the OS version. It is not supported on some platforms.

Linux

kernel_version_key

Specify the key name for the kernel version. It is not supported on some platforms.

Linux

Some properties are only supported on specific platforms.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

The following configuration file appends the Fluent Bit version and OS name to each record.

[INPUT]
    Name dummy
    Tag test

[FILTER]
    Name sysinfo
    Match *
    Fluentbit_version_key flb_ver
    Os_name_key os_name

[OUTPUT]
    name stdout
    match *
pipeline:
    inputs:
        - name: dummy
          tag: test
    filters:
        - name: sysinfo
          match: '*'

You can also run the filter from the command line.

The output will be

CheckList

The CheckList plugin looks up whether a value in a specified list exists, and then allows the addition of a record to indicate if it was found. Introduced in version 1.8.4.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description

Example Configuration

In the following configuration we will read a file test1.log that includes the following values:

Additionally, we will use the following lookup file, which contains a list of malicious IPs (ip_list.txt):

In the configuration we are using $remote_addr as the lookup key, and 7.7.7.7 is a malicious IP. This means the output for the last record would look like the following:

Type Converter

The Type Converter filter plugin allows you to convert data types and append new key-value pairs.

This plugin is useful in combination with plugins which expect incoming string values, e.g. filter_grep and filter_modify.

Configuration Parameters

The plugin supports the following configuration parameters. Each entry consists of four parts:

<config_parameter> <src_key_name> <dst_key_name> <dst_data_type>

dst_data_type allows int, uint, float and string.

e.g. int_key id id_str string

Key
Description

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

This is a sample in_mem record to filter.

The in_mem plugin outputs uint values and filter_type_converter converts them into string type.

Convert uint to string

You can also run the filter from the command line.

The output will be

Nightfall

The Nightfall filter scans logs for sensitive data and redacts the sensitive portions. This filter supports scanning for various sensitive information, ranging from API keys and personally identifiable information (PII) to custom regexes you define. You can configure what to scan for in the Nightfall Dashboard.

Due to a typo, this filter is not enabled by default in 1.9.0. It must be enabled by setting the flag -DFLB_FILTER_NIGHTFALL=ON when building. This is fixed in 1.9.1 and above.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Default

Command Line

Configuration File

Record Modifier

The Record Modifier filter plugin allows you to append fields or to exclude specific fields.

Configuration Parameters

The plugin supports the following configuration parameters. Note that Remove_key and Allowlist_key are mutually exclusive.

Key
Description
fluent-bit -i dummy -o stdout -F sysinfo -m '*' -p fluentbit_version_key=flb_ver -p os_name_key=os_name
[0] dummy.0: [[1699172858.989654355, {}], {"message"=>"dummy", "flb_ver"=>"2.2.0", "os_name"=>"linux"}]
          Fluentbit_version_key: flb_ver
          Os_name_key: os_name
    outputs:
        - name: stdout
          match: '*'

file

The single-value file that Fluent Bit will use as a lookup table to determine if the specified lookup_key exists.

lookup_key

The specific key to look up and determine if it exists. Supports record accessor syntax.

record

The record to add if the lookup_key is found in the specified file. Note you may add multiple record parameters.

mode

Set the check mode. exact and partial are supported. Default: exact.

print_query_time

Print to stdout the elapsed query time for every matched record. Default: false.

ignore_case

Compare strings by ignoring case. Default: false

int_key

This parameter is for integer source.

uint_key

This parameter is for unsigned integer source.

float_key

This parameter is for float source.

str_key

This parameter is for string source.

Debug level between 0 (nothing) and 4 (every detail).

0

tls.verify

When enabled, turns on certificate validation when connecting to the Nightfall API.

true

tls.ca_path

Absolute path to root certificates, required if tls.verify is true.

nightfall_api_key

The Nightfall API key to scan your logs with, obtainable from the Nightfall Dashboard

policy_id

The Nightfall dev platform policy to scan your logs with, configurable in the Nightfall Dashboard.

sampling_rate

The rate controlling how much of your logs you wish to be scanned, must be a float between (0,1]. 1 means all logs will be scanned. Useful for avoiding rate limits in conjunction with Fluent Bit's match rule.

1

tls.debug

[INPUT]
    name           tail
    tag            test1
    path           test1.log
    read_from_head true
    parser         json

[FILTER]
    name       checklist
    match      test1
    file       ip_list.txt
    lookup_key $remote_addr
    record     ioc    abc
    record     badurl null
    log_level  debug

[OUTPUT]
    name       stdout
    match      test1
{"remote_addr": true, "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.2", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.3", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.4", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.5", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.6", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
{"remote_addr": "7.7.7.7", "ioc":"false", "url":"https://badurl.com/payload.htm","badurl":"no"}
1.2.3.4
6.6.4.232
7.7.7.7
{"remote_addr": "7.7.7.7", "ioc":"abc", "url":"https://badurl.com/payload.htm","badurl":"null"}
{"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724}
[INPUT]
    Name mem

[FILTER]
    Name type_converter
    Match *
    uint_key Mem.total Mem.total_str string
    uint_key Mem.used  Mem.used_str  string
    uint_key Mem.free  Mem.free_str  string

[OUTPUT]
    Name stdout
    Match *
pipeline:
    inputs:
        - name: mem
    filters:
        - name: type_converter
          match: '*'
          uint_key:
            - Mem.total Mem.total_str string
            - Mem.used  Mem.used_str  string
            - Mem.free  Mem.free_str  string
    outputs:
        - name: stdout
          match: '*'
$ fluent-bit -i mem -o stdout -F type_converter -p 'uint_key=Mem.total Mem.total_str string' -p 'uint_key=Mem.used Mem.used_str string' -p 'uint_key=Mem.free Mem.free_str string' -m '*'
[0] mem.0: [1639915154.160159749, {"Mem.total"=>8146052, "Mem.used"=>4513564, "Mem.free"=>3632488, "Swap.total"=>1918356, "Swap.used"=>0, "Swap.free"=>1918356, "Mem.total_str"=>"8146052", "Mem.used_str"=>"4513564", "Mem.free_str"=>"3632488"}]
$ bin/fluent-bit -c /PATH_TO_CONF_FILE/fluent-bit.conf

[2022/02/09 19:46:22] [ info] [engine] started (pid=53844)
[2022/02/09 19:46:22] [ info] [storage] version=1.1.5, initializing...
[2022/02/09 19:46:22] [ info] [storage] in-memory
[2022/02/09 19:46:22] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2022/02/09 19:46:22] [ info] [cmetrics] version=0.2.2
[2022/02/09 19:46:22] [ info] [input:http:http.0] listening on 0.0.0.0:8000
[2022/02/09 19:46:22] [ info] [sp] stream processor started
[2022/02/09 19:46:30] [ info] [filter:nightfall:nightfall.0] Nightfall request http_do=0, HTTP Status: 200
[0] app.log: [1644464790.280412000, {"A"=>"there is nothing sensitive here", "B"=>[{"A"=>"my credit card number is *******************"}, {"A"=>"*********** is my social security."}], "C"=>false, "D"=>"key ********************"}]
[2022/02/09 19:47:25] [ info] [filter:nightfall:nightfall.0] Nightfall request http_do=0, HTTP Status: 200
[0] app.log: [1644464845.675431000, {"A"=>"a very safe string"}]
[INPUT]
    name http
    host 0.0.0.0
    port 8000

[FILTER]
    Name nightfall
    Match *
    nightfall_api_key <API key>
    policy_id 5991946b-1cc8-4c38-9240-72677029a3f7
    sampling_rate 1
    tls.ca_path /etc/ssl/certs

[OUTPUT]
    Name stdout

Record

Append fields. This parameter needs a key and value pair.

Remove_key

If the key is matched, that field is removed.

Allowlist_key

If the key is not matched, that field is removed.

Whitelist_key

An alias of Allowlist_key for backwards compatibility.

Uuid_key

If set, the plugin appends a UUID to each record. The value assigned becomes the key in the map.
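For example, a minimal sketch that appends a UUID under an arbitrary key name (log_uuid is only an illustrative choice):

[FILTER]
    Name     record_modifier
    Match    *
    Uuid_key log_uuid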

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

This is a sample in_mem record to filter.

Append fields

The following configuration file appends the product name and hostname (via an environment variable) to the record.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Record hostname ${HOSTNAME}
    Record product Awesome_Tool
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: record_modifier
          match: '*'

You can also run the filter from the command line.

The output will be

Remove fields with Remove_key

The following configuration file removes the 'Swap.*' fields.

You can also run the filter from the command line.

The output will be

Remove fields with Allowlist_key

The following configuration file retains only the 'Mem.*' fields.

You can also run the filter from the command line.

The output will be

Grep

Select or exclude records per patterns

The Grep Filter plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Value Format
Description

Throttle

The Throttle filter plugin sets the average Rate of messages per Interval, based on a leaky bucket and sliding window algorithm. In case of a flood, it will leak messages within a certain rate.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Value Format

Tensorflow

Tensorflow

The Tensorflow filter allows running Machine Learning inference tasks on the records of data coming from input plugins or the stream processor. This filter uses Tensorflow Lite as the inference engine, and requires the Tensorflow Lite shared library to be present during build and at runtime.

Tensorflow Lite is a lightweight open-source deep learning framework that is used for mobile and IoT applications. Tensorflow Lite only handles inference (not training); therefore, it loads pre-trained models (.tflite files) that are converted into the Tensorflow Lite format (FlatBuffer). You can read more about converting Tensorflow models in the Tensorflow documentation.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Remove_key Swap.total
    Remove_key Swap.used
    Remove_key Swap.free
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: record_modifier
          match: '*'
          remove_key: 
             - Swap.total
             - Swap.used
             - Swap.free
    outputs:
        - name: stdout
          match: '*'
[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Allowlist_key Mem.total
    Allowlist_key Mem.used
    Allowlist_key Mem.free
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: record_modifier
          match: '*'
          Allowlist_key: 
             - Mem.total
             - Mem.used
             - Mem.free
    outputs:
        - name: stdout
          match: '*'
{"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724}
$ fluent-bit -i mem -o stdout -F record_modifier -p 'Record=hostname ${HOSTNAME}' -p 'Record=product Awesome_Tool' -m '*'
[0] mem.local: [1492436882.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724, "hostname"=>"localhost.localdomain", "product"=>"Awesome_Tool"}]
$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Remove_key=Swap.total' -p 'Remove_key=Swap.free' -p 'Remove_key=Swap.used' -m '*'
[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]
$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Allowlist_key=Mem.total' -p 'Allowlist_key=Mem.free' -p 'Allowlist_key=Mem.used' -m '*'
[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]
          record:
             - hostname ${HOSTNAME}
             - product Awesome_Tool
    outputs:
        - name: stdout
          match: '*'

Regex

KEY REGEX

Keep records in which the content of KEY matches the regular expression.

Exclude

KEY REGEX

Exclude records in which the content of KEY matches the regular expression.

Logical_Op

Operation

Specify which logical operator to use. AND, OR and legacy are allowed as an Operation. The default is legacy for backwards compatibility. In legacy mode the behaviour is either AND or OR depending on whether the grep is including (uses AND) or excluding (uses OR). Only available from 2.1+.

Record Accessor Enabled

This plugin enables the Record Accessor feature to specify the KEY. Using the record accessor is suggested if you want to match values against nested values.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called lines.txt with the following content:

Command Line

Note: using the command line mode requires special attention to quote the regular expressions properly. It's suggested to use a configuration file.

The following command will load the tail plugin and read the content of the lines.txt file. Then the grep filter will apply a regular expression rule over the log field (created by the tail plugin) and only pass records whose field value starts with aa:

Configuration File

The filter allows you to use multiple rules, which are applied in order; you can have as many Regex and Exclude entries as required.

Nested fields example

If you want to match or exclude records based on nested values, you can use a Record Accessor format as the KEY name. Consider the following record example:

If you want to exclude records that match a given nested field (for example kubernetes.labels.app), you can use the following rule:

Excluding records missing/invalid fields

It may be that in your processing pipeline you want to drop records that are missing certain keys.

A simple way to do this is to use a Regex rule whose pattern matches anything; a record that is missing the key will fail the check and be excluded.
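For example, the following sketch keeps only records that contain an iot_timestamp key with any value; records without that key fail the Regex rule and are dropped (the key name is illustrative):

[FILTER]
    Name   grep
    Match  *
    Regex  iot_timestamp .*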

Here is an example that checks for a specific valid value for the key as well:

The specified key iot_timestamp must match the expected expression; if it does not, or is missing or empty, the record will be excluded.

Multiple conditions

If you want to set multiple Regex or Exclude rules, you can use the Logical_Op property to apply a logical conjunction or disjunction.

Note: If Logical_Op is set, setting both Regex and Exclude results in an error.

The output will be:

Description

Rate

Integer

Amount of messages allowed per Interval.

Window

Integer

Amount of intervals to calculate average over. Default 5.

Interval

String

Time interval, expressed in "sleep" format, e.g. 3s, 1.5m, 0.5h, etc.

Print_Status

Bool

Whether to print status messages with the current rate and limits to the information logs.

Functional description

Let's imagine we have configured:

and we received 1 message in the first second, 3 messages in the second, and 5 in the third. As you can see, even though Window is actually 5, we use a "slow" start to prevent flooding during startup.

But as soon as we reach Window size * Interval, we will have a true sliding window with aggregation over the complete window.

When the average over the window becomes greater than Rate, we will start dropping messages, so that

will become:

As you can see, the last pane of the window was overwritten and 1 message was dropped.

Interval vs Window size

You might have noticed the possibility of configuring the Interval of the Window shift. It is counterintuitive, but there is a difference between the following two examples:

and

Even though both examples allow a maximum Rate of 60 messages per minute, the first example may get all 60 messages within the first second and will drop all the rest for the entire minute:

While the second example will not allow more than 1 message per second every second, making the output rate smoother:

It may drop some data if the rate is ragged. We recommend using a bigger Interval and Rate for streams of rare but important events, while keeping Window big and Interval small for constantly intensive inputs.

Command Line

Note: It's suggested to use a configuration file.

The following command will load the tail plugin and read the content of the lines.txt file. Then the throttle filter will apply a rate limit and only pass records which are read below the configured rate:

Configuration File

The example above will pass 1000 messages per second on average over 300 seconds.

[SERVICE]
    parsers_file /path/to/parsers.conf

[INPUT]
    name   tail
    path   lines.txt
    parser json

[FILTER]
    name   grep
    match  *
    regex  log aa

[OUTPUT]
    name   stdout
    match  *
service:
    parsers_file: /path/to/parsers.conf
pipeline:
    inputs:
        - name: tail
          path: lines.txt
          parser: json
    filters:
        - name: grep
          match: '*'
          regex: log aa
    outputs:
        - name: stdout
          match: '*'
[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['labels']['app'] myapp
    filters:
        - name: grep
          match: '*'
          exclude: $kubernetes['labels']['app'] myapp
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
[FILTER]
    Name                     grep
    Alias                    filter-iots-grep
    Match                    iots_thread.*
    Regex                    iot_timestamp ^\d{4}-\d{2}-\d{2}
    filters:
        - name: grep
          alias: filter-iots-grep
          match: iots_thread.*
          regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
[INPUT]
    Name dummy
    Dummy {"endpoint":"localhost", "value":"something"}
    Tag dummy

[FILTER]
    Name grep
    Match *
    Logical_Op or
    Regex value something
    Regex value error

[OUTPUT]
    Name stdout
pipeline:
    inputs:
        - name: dummy
          dummy: '{"endpoint":"localhost", "value":"something"}'
          tag: dummy
    filters:
        - name: grep
          match: '*'
          logical_op: or
          regex:
            - value something
            - value error
    outputs:
        - name: stdout
{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
{
    "log": "something",
    "kubernetes": {
        "pod_name": "myapp-0",
        "namespace_name": "default",
        "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
        "labels": {
            "app": "myapp"
        },
        "host": "minikube",
        "container_name": "myapp",
        "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
    }
}
Fluent Bit v2.0.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/01/22 09:46:49] [ info] [fluent bit] version=2.0.9, commit=16eae10786, pid=33268
[2023/01/22 09:46:49] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/01/22 09:46:49] [ info] [cmetrics] version=0.5.8
[2023/01/22 09:46:49] [ info] [ctraces ] version=0.2.7
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] initializing
[2023/01/22 09:46:49] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2023/01/22 09:46:49] [ info] [filter:grep:grep.0] OR mode
[2023/01/22 09:46:49] [ info] [sp] stream processor started
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
Rate 5
Window 5
Interval 1s
+-------+-+-+-+ 
|1|3|5| | | | | 
+-------+-+-+-+ 
|  3  |         average = 3, and not 1.8 if you calculate 0 for last 2 panes. 
+-----+
+-------------+ 
|1|3|5|7|3|4| | 
+-------------+ 
  |  4.4    |   
  ----------+
+-------------+
|1|3|5|7|3|4|7|
+-------------+
    |   5.2   |
    +---------+
+-------------+
|1|3|5|7|3|4|6|
+-------------+
    |   5     |
    +---------+
Rate 60
Window 5
Interval 1m
Rate 1
Window 300
Interval 1s
XX        XX        XX
XX        XX        XX
XX        XX        XX
XX        XX        XX
XX        XX        XX
XX        XX        XX
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  X    X     X    X    X    X
XXXX XXXX  XXXX XXXX XXXX XXXX
+-+-+-+-+-+--+-+-+-+-+-+-+-+-+-+
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F throttle -p 'rate=1' -m '*' -o stdout
[INPUT]
    Name   tail
    Path   lines.txt

[FILTER]
    Name     throttle
    Match    *
    Rate     1000
    Window   300
    Interval 1s

[OUTPUT]
    Name   stdout
    Match  *

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Default

input_field

Specify the name of the field in the record to apply inference on.

model_file

Path to the model file (.tflite) to be loaded by Tensorflow Lite.

include_input_fields

Include all input fields in the filter's output.

True

Creating Tensorflow Lite shared library

Clone the Tensorflow repository, install the bazel package manager, and run the following command to create the shared library:

The script creates the shared library bazel-bin/tensorflow/lite/c/libtensorflowlite_c.so. You need to copy the library to a location (such as /usr/lib) that can be used by Fluent Bit.
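For example, assuming a standard Linux library path (adjust as needed for your system):

$ sudo cp bazel-bin/tensorflow/lite/c/libtensorflowlite_c.so /usr/lib/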

Building Fluent Bit with Tensorflow filter plugin

The Tensorflow filter plugin is disabled by default. You need to build Fluent Bit with the Tensorflow plugin enabled. In addition, it requires access to the Tensorflow Lite header files to compile. Therefore, you also need to pass the path of the Tensorflow source code on your machine to the build script:

Command line

If the Tensorflow plugin initializes correctly, it reports successful creation of the interpreter and prints a summary of the model's input/output types and dimensions.

Configuration File

Limitations

  1. Currently supports single-input models

  2. Uses Tensorflow 2.3 header files


Parser

The Parser Filter plugin allows for parsing fields in event records.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Default
$ bazel build -c opt //tensorflow/lite/c:tensorflowlite_c  # see https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/c
cmake -DFLB_FILTER_TENSORFLOW=On -DTensorflow_DIR=<AddressOfTensorflowSourceCode> ...
$ bin/fluent-bit -i mqtt -p 'tag=mqtt.data' -F tensorflow -m '*' -p 'input_field=image' -p 'model_file=/home/user/model.tflite' -p 'include_input_fields=false' -p 'normalization_value=255' -o stdout
[2020/08/04 20:00:00] [ info] Tensorflow Lite interpreter created!
[2020/08/04 20:00:00] [ info] [tensorflow] ===== input #1 =====
[2020/08/04 20:00:00] [ info] [tensorflow] type: FLOAT32  dimensions: {1, 224, 224, 3}
[2020/08/04 20:00:00] [ info] [tensorflow] ===== output #1 ====
[2020/08/04 20:00:00] [ info] [tensorflow] type: FLOAT32  dimensions: {1, 2}
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info

[INPUT]
    Name mqtt
    Tag mqtt.data

[FILTER]
    Name tensorflow
    Match mqtt.data
    input_field image
    model_file /home/m/model.tflite
    include_input_fields false
    normalization_value 255

[OUTPUT]
    Name stdout
    Match *

normalization_value

Divide input values by normalization_value.

Key_Name

Specify the field name in the record to parse.

Parser

Specify the parser name to interpret the field. Multiple Parser entries are allowed (one per line).

Preserve_Key

Keep original Key_Name field in the parsed result. If false, the field will be removed.

False

Reserve_Data

Keep all other original fields in the parsed result. If false, all other original fields will be removed.

False

Getting Started

Configuration File

This is an example of parsing a record {"data":"100 0.5 true This is example"}.

The plugin needs a parser file which defines how to parse each field.

The path of the parser file should be written in the configuration file under the [SERVICE] section.

The output is

You can see that the record {"data":"100 0.5 true This is example"} is parsed.

Preserve original fields

By default, the parser plugin only keeps the parsed fields in its output.

If you enable Reserve_Data, all other fields are preserved:

This will produce the output:

If you enable Reserve_Data and Preserve_Key, the original key field will be preserved as well:

This will produce the following output:

[PARSER]
    Name dummy_test
    Format regex
    Regex ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$
[SERVICE]
    Parsers_File /path/to/parsers.conf

[INPUT]
    Name dummy
    Tag  dummy.data
    Dummy {"data":"100 0.5 true This is example"}

[FILTER]
    Name parser
    Match dummy.*
    Key_Name data
    Parser dummy_test

[OUTPUT]
    Name stdout
    Match *
$ fluent-bit -c dummy.conf
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/07/06 22:33:12] [ info] [engine] started
[0] dummy.data: [1499347993.001371317, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}]
[1] dummy.data: [1499347994.001303118, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}]
[2] dummy.data: [1499347995.001296133, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}]
[3] dummy.data: [1499347996.001320284, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}]
[PARSER]
    Name dummy_test
    Format regex
    Regex ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$
[SERVICE]
    Parsers_File /path/to/parsers.conf

[INPUT]
    Name dummy
    Tag  dummy.data
    Dummy {"data":"100 0.5 true This is example", "key1":"value1", "key2":"value2"}

[FILTER]
    Name parser
    Match dummy.*
    Key_Name data
    Parser dummy_test
    Reserve_Data On
$ fluent-bit -c dummy.conf
Fluent-Bit v0.12.0
Copyright (C) Treasure Data

[2017/07/06 22:33:12] [ info] [engine] started
[0] dummy.data: [1499347993.001371317, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}, "key1":"value1", "key2":"value2"]
[1] dummy.data: [1499347994.001303118, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}, "key1":"value1", "key2":"value2"]
[2] dummy.data: [1499347995.001296133, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}, "key1":"value1", "key2":"value2"]
[3] dummy.data: [1499347996.001320284, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}, "key1":"value1", "key2":"value2"]
[PARSER]
    Name dummy_test
    Format regex
    Regex ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$
[SERVICE]
    Parsers_File /path/to/parsers.conf

[INPUT]
    Name dummy
    Tag  dummy.data
    Dummy {"data":"100 0.5 true This is example", "key1":"value1", "key2":"value2"}

[FILTER]
    Name parser
    Match dummy.*
    Key_Name data
    Parser dummy_test
    Reserve_Data On
    Preserve_Key On

[OUTPUT]
    Name stdout
    Match *
$ fluent-bit -c dummy.conf
Fluent Bit v2.1.1
* Copyright (C) 2015-2022 The Fluent Bit Authors
...
...
[0] dummy.data: [[1687122778.299116136, {}], {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example", "data"=>"100 0.5 true This is example", "key1"=>"value1", "key2"=>"value2"}]
[0] dummy.data: [[1687122779.296906553, {}], {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example", "data"=>"100 0.5 true This is example", "key1"=>"value1", "key2"=>"value2"}]
[0] dummy.data: [[1687122780.297475803, {}], {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example", "data"=>"100 0.5 true This is example", "key1"=>"value1", "key2"=>"value2"}]

ECS Metadata

The ECS filter enriches logs with AWS Elastic Container Service metadata. The plugin can enrich logs with task, cluster and container metadata. The plugin uses the ECS Agent introspection API to obtain metadata. This filter only works with the ECS EC2 launch type: Fluent Bit must be running on an ECS EC2 Container Instance and have access to the ECS Agent introspection API. The filter is not supported on ECS Fargate. To obtain metadata on ECS Fargate, use the built-in FireLens metadata or the AWS for Fluent Bit init project.

Configuration Parameters

The plugin supports the following configuration parameters:

Rewrite Tag

Powerful and flexible routing

Tags are what make routing possible. Tags are set in the configuration of the Input definitions where the records are generated, but there are certain scenarios where it might be useful to modify the Tag in the pipeline so we can perform more advanced and flexible routing.

The rewrite_tag filter allows you to re-emit a record under a new Tag. Once a record has been re-emitted, the original record can be preserved or discarded.

How it Works

It works by defining rules that match specific record key content against a regular expression. If a match exists, a new record with the defined Tag will be emitted, entering from the beginning of the pipeline.

Multiple rules can be specified and they are processed in order until one of them matches.

The new Tag to define can be composed by:

  • Alphabet characters & Numbers

  • Original Tag string or part of it

  • Regular expression group captures

  • Any key or sub-key of the processed record

  • Environment variables

Configuration Parameters

The rewrite_tag filter supports the following configuration parameters:

Key
Description

Rule

Defines the matching criteria and the format of the Tag for the matching record. The Rule format has four components: KEY REGEX NEW_TAG KEEP. For more specific details of the Rule format and its composition, read the next section.

Emitter_Name

When the filter emits a record under the new Tag, there is an internal emitter plugin that takes care of the job. Since this emitter exposes metrics like any other component of the pipeline, you can use this property to configure an optional name for it.

Emitter_Storage.type

Define a buffering mechanism for the new records created. Note these records are part of the emitter plugin. This option supports the values memory (default) or filesystem. If the destination for the new records might face backpressure due to latency or a slow network, we strongly recommend enabling filesystem mode.

Emitter_Mem_Buf_Limit

Set a limit on the amount of memory the tag rewrite emitter can consume if the outputs provide backpressure. The default for this limit is 10M. The pipeline will pause once the buffer exceeds the value of this setting. For example, if the value is set to 10M then the pipeline will pause if the buffer exceeds 10M. The pipeline will remain paused until the output drains the buffer below the 10M limit.
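As an illustrative sketch combining the properties above with the Rule syntax described below, a filter that enables filesystem buffering for its emitter and raises the memory limit might look like this (values are examples, not recommendations):

[FILTER]
    Name                  rewrite_tag
    Match                 test_tag
    Rule                  $tool ^(fluent)$ from.$TAG.new false
    Emitter_Name          re_emitted
    Emitter_Storage.type  filesystem
    Emitter_Mem_Buf_Limit 20M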

Rules

A rule aims to define matching criteria and specify how to create a new Tag for a record. You can define one or multiple rules in the same configuration section. The rules have the following format:

Key

The key represents the name of the record key that holds the value that we want to use to match our regular expression. A key name is specified and prefixed with a $. Consider the following structured record (formatted for readability):

If we wanted to match against the value of the key name, we must use $name. The key selector is flexible enough to allow matching nested levels of sub-maps in the structure. If we wanted to check the value of the nested key s2, we can do so by specifying $ss['s1']['s2']. In short:

  • $name = "abc-123"

  • $ss['s1']['s2'] = "flb"

Note that a key must point to a value that contains a string; it's not valid for numbers, booleans, maps or arrays.

Regex

Using a simple regular expression we can specify a matching pattern to use against the value of the key specified above. We can also take advantage of group capturing to create custom placeholder values.

If we wanted to match any record whose $name contains a value of the format string-number, like the example provided above, we might use:

Note that in our example we are using parentheses; this means that we are specifying groups of data. If the pattern matches the value, placeholders will be created that can be consumed by the NEW_TAG section.

If $name equals abc-123, then the following placeholders will be created:

  • $0 = "abc-123"

  • $1 = "abc"

  • $2 = "123"

If the regular expression does not match an incoming record, the rule will be skipped and the next rule (if any) will be processed.

New Tag

If a regular expression has matched the value of the defined key in the rule, we are ready to compose a new Tag for that specific record. The tag is a concatenated string that can contain any of the following characters: a-z, A-Z, 0-9 and .-,.

A Tag can take any string value from the matching record, the original tag itself, an environment variable, or a general placeholder.

Consider the following incoming data for the rule:

  • Tag = aa.bb.cc

  • Record = {"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}

  • Environment variable $HOSTNAME = fluent

With this information we could create a very customized Tag for our record, like the following:

The expected Tag to be generated will be:

We make use of placeholders, record content and environment variables.

Keep

If a rule matches, the filter will emit a copy of the record with the newly defined Tag. The KEEP property takes a boolean value to define whether the original record with the old Tag must be preserved and continue in the pipeline, or be discarded.

You can use true or false to decide the expected behavior. There is no default value and this is a mandatory field in the rule.

Configuration Example

The following configuration example will emit a dummy (hand-crafted) record; the filter will rewrite the tag, discard the old record, and print the new record to the standard output interface:

The original tag test_tag will be rewritten as from.test_tag.new.fluent.bit.out:

Monitoring

As described in the Monitoring section, every component of the Fluent Bit pipeline exposes metrics. The basic metrics exposed by this filter are drop_records and add_records; they summarize the total number of records dropped from the incoming data chunk and the number of new records added.

Since rewrite_tag emits new records that go through the beginning of the pipeline, it exposes an additional metric called emit_records that summarizes the total number of emitted records.

Understanding the Metrics

Using the configuration provided above, if we query the metrics exposed in the HTTP interface we will see the following:

Command:

Metrics output:

The dummy input generated two records, the filter dropped two from the chunks and emitted two new ones under a different Tag.

The records generated are handled by the internal Emitter, so the new records are summarized in the Emitter metrics; take a look at the entry called emitter_for_rewrite_tag.0.

What is the Emitter?

The Emitter is an internal Fluent Bit plugin that allows other components of the pipeline to emit custom records. In this case, rewrite_tag creates an Emitter instance and uses it exclusively to emit records; that way we can have granular control over who is emitting what.

The Emitter name in the metrics can be changed by setting the Emitter_Name configuration property described above.

[SERVICE]
    Flush     1
    Log_Level info

[INPUT]
    NAME   dummy
    Dummy  {"tool": "fluent", "sub": {"s1": {"s2": "bit"}}}
    Tag    test_tag

[FILTER]
    Name          rewrite_tag
    Match         test_tag
    Rule          $tool ^(fluent)$  from.$TAG.new.$tool.$sub['s1']['s2'].out false
    Emitter_Name  re_emitted

[OUTPUT]
    Name   stdout
    Match  from.*
service:
    flush: 1
    log_level: info
pipeline:
    inputs:
        - name: dummy
          tag:  test_tag
          dummy: '{"tool": "fluent", "sub": {"s1": {"s2": "bit"}}}'
    filters:
        - name: rewrite_tag
          match: test_tag
          rule: $tool ^(fluent)$  from.$TAG.new.$tool.$sub['s1']['s2'].out false
          emitter_name: re_emitted
    outputs:
        - name: stdout
          match: from.*
$KEY  REGEX  NEW_TAG  KEEP
{
  "name": "abc-123",
  "ss": {
    "s1": {
      "s2": "flb"
    }
  }
}
^([a-z]+)-([0-9]+)$
newtag.$TAG.$TAG[1].$1.$ss['s1']['s2'].out.${HOSTNAME}
newtag.aa.bb.cc.bb.abc.flb.out.fluent
$ bin/fluent-bit -c example.conf
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

...
[0] from.test_tag.new.fluent.bit.out: [1580436933.000050569, {"tool"=>"fluent", "sub"=>{"s1"=>{"s2"=>"bit"}}}]
$ curl  http://127.0.0.1:2020/api/v1/metrics/ | jq
{
  "input": {
    "dummy.0": {
      "records": 2,
      "bytes": 80
    },
    "emitter_for_rewrite_tag.0": {
      "records": 1,
      "bytes": 40
    }
  },
  "filter": {
    "rewrite_tag.0": {
      "drop_records": 2,
      "add_records": 0,
      "emit_records": 2
    }
  },
  "output": {
    "stdout.0": {
      "proc_records": 1,
      "proc_bytes": 40,
      "errors": 0,
      "retries": 0,
      "retries_failed": 0
    }
  }
}
Key
Description
Default

Add

This parameter is similar to the ADD option in the modify filter. You can specify it any number of times and it takes two arguments, a KEY name and VALUE. The value uses Fluent Bit record_accessor syntax to create a template that uses ECS metadata values. See the list below for supported metadata templating keys. This option is designed to give you full control over both the key names for metadata and the format of metadata values. See the examples below for more.

No default

ECS_Tag_Prefix

This parameter is similar to the Kube_Tag_Prefix option in the Kubernetes filter and performs the same function. The full log tag should be prefixed with this string, and after the prefix the filter must find the next characters in the tag to be the Docker Container Short ID (the first 12 characters of the full container ID). The filter uses this to identify which container the log came from so it can find which task it is a part of. See the design section below for more information. If not specified, it defaults to the empty string, meaning that the tag must be prefixed with the 12-character container short ID. If you just want to attach cluster metadata to system/OS logs from processes that do not run as part of containers or ECS Tasks, then do not set this parameter and enable the Cluster_Metadata_Only option.

empty string

Cluster_Metadata_Only

When enabled, the plugin will only attempt to attach cluster metadata values. This is useful if you want to attach cluster metadata to system/OS logs from processes that do not run as part of containers or ECS Tasks.

Off

ECS_Meta_Cache_TTL

The filter builds a hash table in memory mapping each unique container short ID to its metadata. This option sets a max TTL for objects in the hash table. You should set this if you have frequent container/task restarts. For example, if your cluster runs short-running batch jobs that complete in less than 10 minutes, there is no reason to keep any stored metadata longer than 10 minutes, so you would set this parameter to "10m".

1h (1 hour)
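As an illustrative sketch combining parameters documented on this page, a filter that caches metadata for at most 10 minutes might look like this:

[FILTER]
    Name               ecs
    Match              *
    ECS_Tag_Prefix     ecs.var.lib.docker.containers.
    ECS_Meta_Cache_TTL 10m
    ADD cluster $ClusterName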

Supported Templating Variables for the ADD option

The following template variables can be used for values with the Add option. See the tutorial below for examples.

Variable
Description
Supported with Cluster_Metadata_Only On

$ClusterName

The ECS cluster name. Fluent Bit is running on EC2 instance(s) that are part of this cluster.

Yes

$ContainerInstanceARN

The full ARN of the ECS EC2 Container Instance. This is the instance that Fluent Bit is running on.

Yes

$ContainerInstanceID

The ID of the ECS EC2 Container Instance.

Yes

Configuration File

Example 1: Attach Task ID and cluster name to container logs

The output log should be similar to:

Example 2: Attach customized resource name to container logs

The output log would be similar to:

Notice that the template variables in the value for the resource key are separated by dot characters; only dots and commas (. and ,) can come after a template variable. For more information, please check the Record accessor limitations section.

Example 3: Attach cluster metadata to non-container logs

This example shows a use case for the Cluster_Metadata_Only option: attaching cluster metadata to ECS Agent logs.

[INPUT]
    Name                tail
    Tag                 ecs.*
    Path                /var/lib/docker/containers/*/*.log
    Docker_Mode         On
    Docker_Mode_Flush   5
    Docker_Mode_Parser  container_firstline
    Parser              docker
    DB                  /var/fluent-bit/state/flb_container.db
    Mem_Buf_Limit       50MB
    Skip_Long_Lines     On
    Refresh_Interval    10
    Rotate_Wait         30
    storage.type        filesystem
    Read_From_Head      Off

[FILTER]
    Name ecs
    Match *
    ECS_Tag_Prefix ecs.var.lib.docker.containers.
    ADD ecs_task_id $TaskID
    ADD cluster $ClusterName

[OUTPUT]
    Name stdout
    Match *
    Format json_lines
{
    "date":1665003546.0,
    "log":"some message from your container",
    "ecs_task_id" "1234567890abcdefghijklmnop",
    "cluster": "your_cluster_name",
}
[INPUT]
    Name                tail
    Tag                 ecs.*
    Path                /var/lib/docker/containers/*/*.log
    Docker_Mode         On
    Docker_Mode_Flush   5
    Docker_Mode_Parser  container_firstline
    Parser              docker
    DB                  /var/fluent-bit/state/flb_container.db
    Mem_Buf_Limit       50MB
    Skip_Long_Lines     On
    Refresh_Interval    10
    Rotate_Wait         30
    storage.type        filesystem
    Read_From_Head      Off

[FILTER]
    Name ecs
    Match *
    ECS_Tag_Prefix ecs.var.lib.docker.containers.
    ADD resource $ClusterName.$TaskDefinitionFamily.$TaskID.$ECSContainerName

[OUTPUT]
    Name stdout
    Match *
    Format json_lines
{
    "date":1665003546.0,
    "log":"some message from your container",
    "resource" "cluster.family.1234567890abcdefghijklmnop.app",
}
[INPUT]
    Name                tail
    Tag                 ecsagent.*
    Path                /var/log/ecs/*
    DB                  /var/fluent-bit/state/flb_ecs.db
    Mem_Buf_Limit       50MB
    Skip_Long_Lines     On
    Refresh_Interval    10
    Rotate_Wait         30
    storage.type        filesystem
    # Collect all logs on instance
    Read_From_Head      On

[FILTER]
    Name ecs
    Match *
    Cluster_Metadata_Only On
    ADD cluster $ClusterName

[OUTPUT]
    Name stdout
    Match *
    Format json_lines

$ECSAgentVersion

The Version string of the ECS Agent that is running on the container instance.

Yes

$ECSContainerName

The name of the container from which the log originated. This is the name in your ECS Task Definition.

No

$DockerContainerName

The name of the container from which the log originated. This is the name obtained from Docker and is the name shown if you run docker ps on the instance.

No

$ContainerID

The ID of the container from which the log originated. This is the full 64 character long container ID.

No

$TaskDefinitionFamily

The family name of the task definition for the task from which the log originated.

No

$TaskDefinitionVersion

The version/revision of the task definition for the task from which the log originated.

No

$TaskID

The ID of the ECS Task from which the log originated.

No

$TaskARN

The full ARN of the ECS Task from which the log originated.

No

modify filter
record_accessor
Kubernetes filter

Multiline

Concatenate Multiline or Stack trace log messages. Available on Fluent Bit >= v1.8.2.

The Multiline Filter helps to concatenate messages that originally belong to one context but were split across multiple records or log lines. Common examples are stack traces or applications that print logs in multiple lines.

As part of the built-in functionality, and without major configuration effort, you can enable one of our built-in parsers, which provide auto-detection and multi-format support:

  • go

  • python

  • ruby

  • java (Google Cloud Platform Java stacktrace format)

Some comments about this filter:

  • The usage of this filter depends on a previous configuration of a Multiline Parser definition.

  • If you wish to concatenate messages read from a log file, it is highly recommended to use the multiline support in the Tail plugin itself, since performing the concatenation while reading the log file is more performant. Concatenating messages originally split by the Docker or CRI container engines is also supported in the Tail plugin.

This filter only performs buffering that persists across different chunks when buffer is enabled. Otherwise, the filter processes one chunk at a time, which is not suitable for most inputs, since they might send multiline messages in separate chunks.

When buffering is enabled, the filter does not immediately emit the messages it receives. It uses the in_emitter plugin, the same as the Rewrite Tag filter, and emits messages once they are fully concatenated or a timeout is reached.

Since concatenated records are re-emitted to the head of the Fluent Bit log pipeline, you cannot configure multiple multiline filter definitions that match the same tags. This would cause an infinite loop in the Fluent Bit pipeline; to use multiple parsers on the same logs, configure a single filter definition with a comma-separated list of parsers for multiline.parser. For more, see issue #5235.

For the same reason, the multiline filter should be the first filter. Logs are re-emitted by the multiline filter to the head of the pipeline; the filter will ignore its own re-emitted records, but other filters won't. If there are filters before the multiline filter, they will be applied twice.
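
For illustration, a minimal sketch of this ordering with a tail input and a grep filter placed after the multiline filter (the path, tag, parser, and grep expression here are placeholders, not taken from the examples below):

[INPUT]
    # placeholder input; the multiline filter works best with tail without buffering
    name                  tail
    path                  /var/log/app.log
    tag                   app.log

[FILTER]
    # the multiline filter runs first; it ignores its own re-emitted records
    name                  multiline
    match                 app.*
    multiline.key_content log
    multiline.parser      go

[FILTER]
    # any other filter comes afterwards, so it only sees concatenated records
    name                  grep
    match                 app.*
    regex                 log panic

[OUTPUT]
    name                  stdout
    match                 *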

Configuration Parameters

The plugin supports the following configuration parameters:

Property
Description

Configuration Example

The following example aims to parse a log file called test.log that contains some full lines, a custom Java stacktrace and a Go stacktrace.

The following example files can be located at:

Example files content:

This is the primary Fluent Bit configuration file. It includes the parsers_multiline.conf and tails the file test.log by applying the multiline parsers multiline-regex-test and go. Then it sends the processed records to the standard output.

This second file defines a multiline parser for the example. Note that a second multiline parser called go is used in fluent-bit.conf, but this one is a built-in parser.

An example file with multiline and multiformat content:

By running Fluent Bit with the given configuration file you will obtain:

The lines that did not match a pattern are not considered as part of the multiline message, while the ones that matched the rules were concatenated properly.

Docker Partial Message Use Case

When Fluent Bit is consuming logs from a container runtime, such as Docker, these logs will be split above a certain limit, usually 16KB. If your application emits a 100K log line, it will be split into 7 partial messages. If you are using the Fluentd Docker Log Driver to send the logs to Fluent Bit, they might look like this:

Fluent Bit can re-combine these logs that were split by the runtime and remove the partial message fields. The filter example below is for this use case.

The two options for mode are mutually exclusive in the filter. If you set the mode to partial_message then the multiline.parser option is not allowed.

Log to Metrics

Generate metrics from logs

The Log To Metrics Filter plugin allows you to generate log-derived metrics. It currently supports modes to count records, provide a gauge for field values, or create a histogram. You can also match or exclude specific records based on regular expression patterns for values or nested values. This filter plugin does not actually act as a record filter and does not change or drop records. All records will pass through this filter untouched, and the generated metrics will be emitted into a separate metric pipeline.

Please note that this plugin is an experimental feature and is not recommended for production use. Configuration parameters and plugin functionality are subject to change without notice.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Mandatory
Value Format

Getting Started

The following example takes records from two dummy inputs and counts all messages passing through the log_to_metrics filter. It then generates metric records which are provided to the prometheus_exporter:

Configuration - Counter

You can then use, for example, the curl command to retrieve the generated metric:

Configuration - Gauge

The gauge mode requires a value_field to specify which field the metric values are taken from. In this example we also apply a regex filter and enable the kubernetes_mode option:

You can then use, for example, the curl command to retrieve the generated metric:

As you can see in the output, only one line is printed: the records from the first input plugin are ignored because they do not match the regex.

The filter also allows you to use multiple rules, which are applied in order; you can have as many Regex and Exclude entries as required (see the grep filter plugin).

If you execute the above curl command multiple times, you will see that in this example the metric value stays at 60, because the messages generated by the dummy plugin do not change. In a real-world scenario the values would change, and the metric would report the last processed value.

Metric label_values

As you can see, the label sets defined by add_label and label_field are added to the metric. The lines in the metric represent every combination of labels; only combinations that are actually used are displayed here. To see this, you can add another dummy input to your configuration.

The metric output would then look like:

You can also see that all the Kubernetes labels have been attached to the metric accordingly.

Configuration - Histogram

Similar to the gauge mode, histogram mode requires a value_field to specify which field the metric values are taken from. In this example we also apply a regex filter and enable the kubernetes_mode option:

You can then use, for example, the curl command to retrieve the generated metric:

As you can see in the output, the default buckets are 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0 and +Inf, into which the values are sorted. A sum and a counter are also part of this metric. You can specify your own buckets in the configuration, as in the following example:

Please note that the +Inf bucket will always be included implicitly. The buckets in a histogram are cumulative, so a value added to one bucket is also added to all larger buckets.

You can also see that all the Kubernetes labels have been attached to the metric, identical to the behavior of label_field described in the previous chapter. That results in two label sets for the histogram.

Nest

The Nest Filter plugin allows you to operate on or with nested data. Its modes of operation are

  • nest - Take a set of records and place them in a map

  • lift - Take a map by key and lift its records up

emitter_name

Name for the emitter input instance which re-emits the completed records at the beginning of the pipeline.

emitter_storage.type

The storage type for the emitter input instance. This option supports the values memory (default) and filesystem.

emitter_mem_buf_limit

Set a limit on the amount of memory the emitter can consume if the outputs provide backpressure. The default limit is 10M. The pipeline pauses once the buffer exceeds this limit and remains paused until the output drains the buffer back below it.

multiline.parser

Specify one or multiple Multiline Parser definitions to apply to the content. You can specify multiple multiline parsers to detect different formats by separating them with a comma.

multiline.key_content

Key name that holds the content to process. Note that a Multiline Parser definition can already specify the key_content to use, but this option allows you to override that value for the purposes of the filter.

mode

Mode can be parser for regex concatenation, or partial_message to concatenate split Docker logs.

buffer

Enable buffered mode. In buffered mode, the filter can concatenate multilines from inputs that ingest records one by one (for example, Forward), rather than in chunks, re-emitting them into the beginning of the pipeline (with the same tag) using the in_emitter instance. With buffer off, this filter will not work with most inputs, except tail.

flush_ms

Flush time for pending multiline records. Defaults to 2000.
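
As a minimal sketch of these buffering options working together with an input that ingests records one by one (the forward listener address, parser choice, and emitter name are illustrative assumptions):

[INPUT]
    # forward delivers records one by one, so buffered mode is required
    name   forward
    listen 0.0.0.0
    port   24224

[FILTER]
    name                  multiline
    match                 *
    multiline.key_content log
    multiline.parser      java
    buffer                on
    flush_ms              2000
    emitter_name          multiline_emitter
    emitter_storage.type  memory
    emitter_mem_buf_limit 10M

[OUTPUT]
    name  stdout
    match *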

Multiline Parser
Tail plugin
Tail plugin
Rewrite Tag Filter
#5235
https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/filter_multiline
Fluentd Docker Log Driver
[SERVICE]
    flush                 1
    log_level             info
    parsers_file          parsers_multiline.conf

[INPUT]
    name                  tail
    path                  test.log
    read_from_head        true

[FILTER]
    name                  multiline
    match                 *
    multiline.key_content log
    multiline.parser      go, multiline-regex-test

[OUTPUT]
    name                  stdout
    match                 *
    
[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                  | next state
    # ------|---------------|--------------------------------------------
    rule      "start_state"   "/([A-Za-z]+ \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
    rule      "cont"          "/^\s+at.*/"                     "cont"
    
single line...
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
another line...
panic: my panic

goroutine 4 [running]:
panic(0x45cb40, 0x47ad70)
  /usr/local/go/src/runtime/panic.go:542 +0x46c fp=0xc42003f7b8 sp=0xc42003f710 pc=0x422f7c
main.main.func1(0xc420024120)
  foo.go:6 +0x39 fp=0xc42003f7d8 sp=0xc42003f7b8 pc=0x451339
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003f7e0 sp=0xc42003f7d8 pc=0x44b4d1
created by main.main
  foo.go:5 +0x58

goroutine 1 [chan receive]:
runtime.gopark(0x4739b8, 0xc420024178, 0x46fcd7, 0xc, 0xc420028e17, 0x3)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc420053e30 sp=0xc420053e00 pc=0x42503c
runtime.goparkunlock(0xc420024178, 0x46fcd7, 0xc, 0x1000f010040c217, 0x3)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc420053e70 sp=0xc420053e30 pc=0x42512e
runtime.chanrecv(0xc420024120, 0x0, 0xc420053f01, 0x4512d8)
  /usr/local/go/src/runtime/chan.go:506 +0x304 fp=0xc420053f20 sp=0xc420053e70 pc=0x4046b4
runtime.chanrecv1(0xc420024120, 0x0)
  /usr/local/go/src/runtime/chan.go:388 +0x2b fp=0xc420053f50 sp=0xc420053f20 pc=0x40439b
main.main()
  foo.go:9 +0x6f fp=0xc420053f80 sp=0xc420053f50 pc=0x4512ef
runtime.main()
  /usr/local/go/src/runtime/proc.go:185 +0x20d fp=0xc420053fe0 sp=0xc420053f80 pc=0x424bad
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420053fe8 sp=0xc420053fe0 pc=0x44b4d1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x4739b8, 0x4ad720, 0x47001e, 0xf, 0x14, 0x1)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc42003e768 sp=0xc42003e738 pc=0x42503c
runtime.goparkunlock(0x4ad720, 0x47001e, 0xf, 0xc420000114, 0x1)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc42003e7a8 sp=0xc42003e768 pc=0x42512e
runtime.forcegchelper()
  /usr/local/go/src/runtime/proc.go:238 +0xcc fp=0xc42003e7e0 sp=0xc42003e7a8 pc=0x424e5c
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003e7e8 sp=0xc42003e7e0 pc=0x44b4d1
created by runtime.init.4
  /usr/local/go/src/runtime/proc.go:227 +0x35

goroutine 3 [GC sweep wait]:
runtime.gopark(0x4739b8, 0x4ad7e0, 0x46fdd2, 0xd, 0x419914, 0x1)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc42003ef60 sp=0xc42003ef30 pc=0x42503c
runtime.goparkunlock(0x4ad7e0, 0x46fdd2, 0xd, 0x14, 0x1)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc42003efa0 sp=0xc42003ef60 pc=0x42512e
runtime.bgsweep(0xc42001e150)
  /usr/local/go/src/runtime/mgcsweep.go:52 +0xa3 fp=0xc42003efd8 sp=0xc42003efa0 pc=0x419973
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003efe0 sp=0xc42003efd8 pc=0x44b4d1
created by runtime.gcenable
  /usr/local/go/src/runtime/mgc.go:216 +0x58
one more line, no multiline
$ fluent-bit -c fluent-bit.conf 

[0] tail.0: [1626736433.143567481, {"log"=>"single line..."}]
[1] tail.0: [1626736433.143570538, {"log"=>"Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)"}]
[2] tail.0: [1626736433.143572538, {"log"=>"another line..."}]
[3] tail.0: [1626736433.143572894, {"log"=>"panic: my panic

goroutine 4 [running]:
panic(0x45cb40, 0x47ad70)
  /usr/local/go/src/runtime/panic.go:542 +0x46c fp=0xc42003f7b8 sp=0xc42003f710 pc=0x422f7c
main.main.func1(0xc420024120)
  foo.go:6 +0x39 fp=0xc42003f7d8 sp=0xc42003f7b8 pc=0x451339
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003f7e0 sp=0xc42003f7d8 pc=0x44b4d1
created by main.main
  foo.go:5 +0x58

goroutine 1 [chan receive]:
runtime.gopark(0x4739b8, 0xc420024178, 0x46fcd7, 0xc, 0xc420028e17, 0x3)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc420053e30 sp=0xc420053e00 pc=0x42503c
runtime.goparkunlock(0xc420024178, 0x46fcd7, 0xc, 0x1000f010040c217, 0x3)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc420053e70 sp=0xc420053e30 pc=0x42512e
runtime.chanrecv(0xc420024120, 0x0, 0xc420053f01, 0x4512d8)
  /usr/local/go/src/runtime/chan.go:506 +0x304 fp=0xc420053f20 sp=0xc420053e70 pc=0x4046b4
runtime.chanrecv1(0xc420024120, 0x0)
  /usr/local/go/src/runtime/chan.go:388 +0x2b fp=0xc420053f50 sp=0xc420053f20 pc=0x40439b
main.main()
  foo.go:9 +0x6f fp=0xc420053f80 sp=0xc420053f50 pc=0x4512ef
runtime.main()
  /usr/local/go/src/runtime/proc.go:185 +0x20d fp=0xc420053fe0 sp=0xc420053f80 pc=0x424bad
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420053fe8 sp=0xc420053fe0 pc=0x44b4d1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x4739b8, 0x4ad720, 0x47001e, 0xf, 0x14, 0x1)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc42003e768 sp=0xc42003e738 pc=0x42503c
runtime.goparkunlock(0x4ad720, 0x47001e, 0xf, 0xc420000114, 0x1)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc42003e7a8 sp=0xc42003e768 pc=0x42512e
runtime.forcegchelper()
  /usr/local/go/src/runtime/proc.go:238 +0xcc fp=0xc42003e7e0 sp=0xc42003e7a8 pc=0x424e5c
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003e7e8 sp=0xc42003e7e0 pc=0x44b4d1
created by runtime.init.4
  /usr/local/go/src/runtime/proc.go:227 +0x35

goroutine 3 [GC sweep wait]:
runtime.gopark(0x4739b8, 0x4ad7e0, 0x46fdd2, 0xd, 0x419914, 0x1)
  /usr/local/go/src/runtime/proc.go:280 +0x12c fp=0xc42003ef60 sp=0xc42003ef30 pc=0x42503c
runtime.goparkunlock(0x4ad7e0, 0x46fdd2, 0xd, 0x14, 0x1)
  /usr/local/go/src/runtime/proc.go:286 +0x5e fp=0xc42003efa0 sp=0xc42003ef60 pc=0x42512e
runtime.bgsweep(0xc42001e150)
  /usr/local/go/src/runtime/mgcsweep.go:52 +0xa3 fp=0xc42003efd8 sp=0xc42003efa0 pc=0x419973
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003efe0 sp=0xc42003efd8 pc=0x44b4d1
created by runtime.gcenable
  /usr/local/go/src/runtime/mgc.go:216 +0x58"}]
[4] tail.0: [1626736433.143585473, {"log"=>"one more line, no multiline"}]
{"source": "stdout", "log": "... omitted for brevity...", "partial_message": "true", "partial_id": "dc37eb08b4242c41757d4cd995d983d1cdda4589193755a22fcf47a638317da0", "partial_ordinal": "1", "partial_last": "false", "container_id": "a96998303938eab6087a7f8487ca40350f2c252559bc6047569a0b11b936f0f2", "container_name": "/hopeful_taussig"}]
[FILTER]
     name                  multiline
     match                 *
     multiline.key_content log
     mode                  partial_message

Yes

metric_description

Sets a help text for the metric.

Yes

bucket

Defines a bucket for histogram

Yes, for mode histogram

e.g. 0.75

add_label

Add a custom label NAME and set the value to the value of KEY

label_field

Includes a record field as label dimension in the metric.

Name of record key. Supports notation for nested fields.

value_field

Specify the record field that holds a numerical value

Yes, for modes [gauge and histogram]

Name of record key. Supports notation for nested fields.

kubernetes_mode

If enabled, it will automatically put pod_id, pod_name, namespace_name, docker_id and container_name into the metric as labels. This option is intended to be used in combination with the kubernetes filter plugin, which fills those fields.

Regex

Include records in which the content of KEY matches the regular expression.

KEY REGEX

Exclude

Exclude records in which the content of KEY matches the regular expression.

KEY REGEX

tag

Defines the tag for the generated metrics record

Yes

metric_mode

Defines the mode for the metric. Valid values are [counter, gauge or histogram]

Yes

metric_name

grep
the previous chapter

Sets the name of the metric.

Example usage (nest)

As an example using JSON notation, to nest keys matching the Wildcard value Key* under a new key NestKey the transformation becomes,

Example (input)

Example (output)

Example usage (lift)

As an example using JSON notation, to lift keys nested under the Nested_under value NestKey* the transformation becomes,

Example (input)

Example (output)

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Value Format
Operation
Description

Operation

ENUM [nest or lift]

Select the operation nest or lift

Wildcard

FIELD WILDCARD

nest

Nest records which field matches the wildcard

Nest_under

FIELD STRING

nest

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file. The following invokes the Memory Usage Input Plugin, which outputs the following (example),

Example #1 - nest

Command Line

Note: Using the command line mode requires quotes to parse the wildcard properly. The use of a configuration file is recommended.

The following command will load the mem plugin. Then the nest filter will match the wildcard rule to the keys and nest the keys matching Mem.* under the new key NEST.

Configuration File

Result

The output of both the command line and configuration invocations should be identical and result in the following output.

Example #2 - nest and lift undo

This example nests all Mem.* and Swap.* items under the Stats key and then reverses these actions with a lift operation. The output appears unchanged.

Configuration File

Result

Example #3 - nest 3 levels deep

This example takes the keys starting with Mem.* and nests them under LAYER1, which itself is then nested under LAYER2, which is nested under LAYER3.

Configuration File

Result

Example #4 - multiple nest and lift filters with prefix

This example starts with the 3-level deep nesting of Example #3 and applies the lift filter three times to reverse the operations. The end result is that all records are at the top level, without nesting, again. One prefix is added for each level that is lifted.

Configuration file

Result

Modify

The Modify Filter plugin allows you to change records using rules and conditions.

Example usage

As an example using JSON notation to,

  • Rename Key2 to RenamedKey

  • Add a key OtherKey with value Value3 if OtherKey does not yet exist

Example (input)

Example (output)

Configuration Parameters

Rules

The plugin supports the following rules:

Operation
Parameter 1
Parameter 2
Description
  • Rules are case insensitive, parameters are not

  • Any number of rules can be set in a filter instance.

  • Rules are applied in the order they appear, with each rule operating on the result of the previous rule.

Conditions

The plugin supports the following conditions:

Condition
Parameter
Parameter 2
Description
  • Conditions are case insensitive, parameters are not

  • Any number of conditions can be set.

  • Conditions apply to the whole filter instance and all its rules. Not to individual rules.

  • All conditions have to be true for the rules to be applied.

Example #1 - Add and Rename

In order to start filtering records, you can run the filter from the command line or through the configuration file. The following invokes the Memory Usage Input Plugin (mem), which outputs the following (example):

Using command Line

Note: Using the command line mode requires quotes to parse the wildcard properly. The use of a configuration file is recommended.

Configuration File

Result

The output of both the command line and configuration invocations should be identical and result in the following output.

Example #2 - Conditionally Add and Remove

Configuration File

Result

Example #3 - Emoji

Configuration File

Result

[SERVICE]
    flush              1
    log_level          info

[INPUT]
    Name               dummy
    Dummy              {"message":"dummy", "kubernetes":{"namespace_name": "default", "docker_id": "abc123", "pod_name": "pod1", "container_name": "mycontainer", "pod_id": "def456", "labels":{"app": "app1"}}, "duration": 20, "color": "red", "shape": "circle"}
    Tag                dummy.log

[INPUT]
    Name               dummy
    Dummy              {"message":"hello", "kubernetes":{"namespace_name": "default", "docker_id": "abc123", "pod_name": "pod1", "container_name": "mycontainer", "pod_id": "def456", "labels":{"app": "app1"}}, "duration": 60, "color": "blue", "shape": "square"}
    Tag                dummy.log2

[FILTER]
    name               log_to_metrics
    match              dummy.log*
    tag                test_metric
    metric_mode        counter
    metric_name        count_all_dummy_messages
    metric_description This metric counts dummy messages

[OUTPUT]
    name               prometheus_exporter
    match              *
    host               0.0.0.0
    port               2021
> curl -s http://127.0.0.1:2021/metrics


# HELP log_metric_counter_count_all_dummy_messages This metric counts dummy messages
# TYPE log_metric_counter_count_all_dummy_messages counter
log_metric_counter_count_all_dummy_messages 49
[FILTER]
    name               log_to_metrics
    match              dummy.log*
    tag                test_metric
    metric_mode        gauge
    metric_name        current_duration
    metric_description This metric shows the current duration
    value_field        duration
    kubernetes_mode    on
    regex              message .*el.*
    add_label          app $kubernetes['labels']['app']
    label_field        color
    label_field        shape
> curl -s http://127.0.0.1:2021/metrics


# HELP log_metric_gauge_current_duration This metric shows the current duration
# TYPE log_metric_gauge_current_duration gauge
log_metric_gauge_current_duration{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="square"} 60
> curl -s http://127.0.0.1:2021/metrics

# HELP log_metric_gauge_current_duration This metric shows the current duration
# TYPE log_metric_gauge_current_duration gauge
log_metric_gauge_current_duration{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="square"} 60
log_metric_gauge_current_duration{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 20
[FILTER]
    name               log_to_metrics
    match              dummy.log*
    tag                test_metric
    metric_mode        histogram
    metric_name        current_duration
    metric_description This metric shows the request duration
    value_field        duration
    kubernetes_mode    on
    regex              message .*el.*
    add_label          app $kubernetes['labels']['app']
    label_field        color
    label_field        shape
> curl -s http://127.0.0.1:2021/metrics


# HELP log_metric_histogram_current_duration This metric shows the request duration
# TYPE log_metric_histogram_current_duration histogram
log_metric_histogram_current_duration_bucket{le="0.005",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.01",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.025",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.05",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.1",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.25",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.5",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="1.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="2.5",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="5.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="10.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="+Inf",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 28
log_metric_histogram_current_duration_sum{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 560
log_metric_histogram_current_duration_count{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="red",shape="circle"} 28
log_metric_histogram_current_duration_bucket{le="0.005",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.01",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.025",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.05",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.1",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.25",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="0.5",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="1.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="2.5",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="5.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="10.0",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 0
log_metric_histogram_current_duration_bucket{le="+Inf",namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 27
log_metric_histogram_current_duration_sum{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 1620
log_metric_histogram_current_duration_count{namespace_name="default",pod_name="pod1",container_name="mycontainer",docker_id="abc123",pod_id="def456",app="app1",color="blue",shape="circle"} 27
[FILTER]
    name               log_to_metrics
    match              dummy.log*
    tag                test_metric
    metric_mode        histogram
    metric_name        current_duration
    metric_description This metric shows the HTTP request duration as histogram in milliseconds
    value_field        duration
    kubernetes_mode    on
    bucket             1
    bucket             5
    bucket             10
    bucket             50
    bucket             100
    bucket             250
    bucket             500
    bucket             1000
    regex              message .*el.*
    label_field        color
    label_field        shape
[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard Mem.*
    Nest_under Memstats
    Remove_prefix Mem.
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: nest
          match: '*'
          operation: nest
          wildcard: Mem.*
          nest_under: Memstats
          remove_prefix: Mem.
    outputs:
        - name: stdout
          match: '*'
[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard Mem.*
    Wildcard Swap.*
    Nest_under Stats
    Add_prefix NESTED

[FILTER]
    Name nest
    Match *
    Operation lift
    Nested_under Stats
    Remove_prefix NESTED
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: nest
          match: '*'
          Operation: nest
          Wildcard:
            - Mem.*
            - Swap.*
          Nest_under: Stats
          Add_prefix: NESTED
        - name: nest
          match: '*'
          Operation: lift
          Nested_under: Stats
          Remove_prefix: NESTED
    outputs:
        - name: stdout
          match: '*'
[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard Mem.*
    Nest_under LAYER1

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard LAYER1*
    Nest_under LAYER2

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard LAYER2*
    Nest_under LAYER3
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: Mem.*
          Nest_under: LAYER1
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: LAYER1*
          Nest_under: LAYER2
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: LAYER2*
          Nest_under: LAYER3
    outputs:
        - name: stdout
          match: '*'
[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard Mem.*
    Nest_under LAYER1

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard LAYER1*
    Nest_under LAYER2

[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard LAYER2*
    Nest_under LAYER3

[FILTER]
    Name nest
    Match *
    Operation lift
    Nested_under LAYER3
    Add_prefix Lifted3_

[FILTER]
    Name nest
    Match *
    Operation lift
    Nested_under Lifted3_LAYER2
    Add_prefix Lifted3_Lifted2_

[FILTER]
    Name nest
    Match *
    Operation lift
    Nested_under Lifted3_Lifted2_LAYER1
    Add_prefix Lifted3_Lifted2_Lifted1_
pipeline:
    inputs:
        - name: mem
          tag: mem.local
    filters:
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: Mem.*
          Nest_under: LAYER1
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: LAYER1*
          Nest_under: LAYER2
        - name: nest
          match: '*'
          Operation: nest
          Wildcard: LAYER2*
          Nest_under: LAYER3
        - name: nest
          match: '*'
          Operation: lift
          Nested_under: LAYER3
          Add_prefix: Lifted3_
        - name: nest
          match: '*'
          Operation: lift
          Nested_under: Lifted3_LAYER2
          Add_prefix: Lifted3_Lifted2_
        - name: nest
          match: '*'
          Operation: lift
          Nested_under: Lifted3_Lifted2_LAYER1
          Add_prefix: Lifted3_Lifted2_Lifted1_
    outputs:
        - name: stdout
          match: '*'
{
  "Key1"     : "Value1",
  "Key2"     : "Value2",
  "OtherKey" : "Value3"
}
{
  "OtherKey" : "Value3"
  "NestKey"  : {
    "Key1"     : "Value1",
    "Key2"     : "Value2",
  }
}
{
  "OtherKey" : "Value3"
  "NestKey"  : {
    "Key1"     : "Value1",
    "Key2"     : "Value2",
  }
}
{
  "Key1"     : "Value1",
  "Key2"     : "Value2",
  "OtherKey" : "Value3"
}
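
The JSON transformations above correspond to nest filter definitions along these lines (a sketch using the key names from the examples; the Match value is an assumption):

[FILTER]
    # nest: place keys matching Key* under the new map NestKey
    Name       nest
    Match      *
    Operation  nest
    Wildcard   Key*
    Nest_under NestKey

[FILTER]
    # lift: bring keys nested under NestKey back to the top level
    Name         nest
    Match        *
    Operation    lift
    Nested_under NestKey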
[0] memory: [1488543156, {"Mem.total"=>1016044, "Mem.used"=>841388, "Mem.free"=>174656, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
$ bin/fluent-bit -i mem -p 'tag=mem.local' -F nest -p 'Operation=nest' -p 'Wildcard=Mem.*' -p 'Nest_under=Memstats' -p 'Remove_prefix=Mem.' -m '*' -o stdout
[2018/04/06 01:35:13] [ info] [engine] started
[0] mem.local: [1522978514.007359767, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Memstats"=>{"total"=>4050908, "used"=>714984, "free"=>3335924}}]
[2018/06/21 17:42:37] [ info] [engine] started (pid=17285)
[0] mem.local: [1529566958.000940636, {"Mem.total"=>8053656, "Mem.used"=>6940380, "Mem.free"=>1113276, "Swap.total"=>16532988, "Swap.used"=>1286772, "Swap.free"=>15246216}]
[0] mem.local: [1524795923.009867831, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "LAYER3"=>{"LAYER2"=>{"LAYER1"=>{"Mem.total"=>4050908, "Mem.used"=>1112036, "Mem.free"=>2938872}}}}]


{
  "Swap.total"=>1046524,
  "Swap.used"=>0,
  "Swap.free"=>1046524,
  "LAYER3"=>{
    "LAYER2"=>{
      "LAYER1"=>{
        "Mem.total"=>4050908,
        "Mem.used"=>1112036,
        "Mem.free"=>2938872
      }
    }
  }
}
[0] mem.local: [1524862951.013414798, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Lifted3_Lifted2_Lifted1_Mem.total"=>4050908, "Lifted3_Lifted2_Lifted1_Mem.used"=>1253912, "Lifted3_Lifted2_Lifted1_Mem.free"=>2796996}]


{
  "Swap.total"=>1046524, 
  "Swap.used"=>0, 
  "Swap.free"=>1046524, 
  "Lifted3_Lifted2_Lifted1_Mem.total"=>4050908, 
  "Lifted3_Lifted2_Lifted1_Mem.used"=>1253912, 
  "Lifted3_Lifted2_Lifted1_Mem.free"=>2796996
}

Nest records matching the Wildcard under this key

Nested_under

FIELD STRING

lift

Lift records nested under the Nested_under key

Add_prefix

FIELD STRING

ANY

Prefix affected keys with this string

Remove_prefix

FIELD STRING

ANY

Remove prefix from affected keys if it matches this string

Record Accessor
Record Accessor
kubernetes

Remove_wildcard

WILDCARD:KEY

NONE

Remove all key/value pairs with key matching wildcard KEY

Remove_regex

REGEXP:KEY

NONE

Remove all key/value pairs with key matching regexp KEY

Rename

STRING:KEY

STRING:RENAMED_KEY

Rename a key/value pair with key KEY to RENAMED_KEY if KEY exists AND RENAMED_KEY does not exist

Hard_rename

STRING:KEY

STRING:RENAMED_KEY

Rename a key/value pair with key KEY to RENAMED_KEY if KEY exists. If RENAMED_KEY already exists, this field is overwritten

Copy

STRING:KEY

STRING:COPIED_KEY

Copy a key/value pair with key KEY to COPIED_KEY if KEY exists AND COPIED_KEY does not exist

Hard_copy

STRING:KEY

STRING:COPIED_KEY

Copy a key/value pair with key KEY to COPIED_KEY if KEY exists. If COPIED_KEY already exists, this field is overwritten

Move_to_start

WILDCARD:KEY

NONE

Move key/value pairs with keys matching KEY to the start of the message

Move_to_end

WILDCARD:KEY

NONE

Move key/value pairs with keys matching KEY to the end of the message

No_key_matches

REGEXP:KEY

NONE

Is true if no key matches regex KEY

Key_value_equals

STRING:KEY

STRING:VALUE

Is true if KEY exists and its value is VALUE

Key_value_does_not_equal

STRING:KEY

STRING:VALUE

Is true if KEY exists and its value is not VALUE

Key_value_matches

STRING:KEY

REGEXP:VALUE

Is true if key KEY exists and its value matches VALUE

Key_value_does_not_match

STRING:KEY

REGEXP:VALUE

Is true if key KEY exists and its value does not match VALUE

Matching_keys_have_matching_values

REGEXP:KEY

REGEXP:VALUE

Is true if all keys matching KEY have values that match VALUE

Matching_keys_do_not_have_matching_values

REGEXP:KEY

REGEXP:VALUE

Is true if all keys matching KEY have values that do not match VALUE

  • You can set a Record Accessor as STRING:KEY for nested keys.

  • Set

    STRING:KEY

    STRING:VALUE

    Add a key/value pair with key KEY and value VALUE. If KEY already exists, this field is overwritten

    Add

    STRING:KEY

    STRING:VALUE

    Add a key/value pair with key KEY and value VALUE if KEY does not exist

    Remove

    STRING:KEY

    NONE

    Key_exists

    STRING:KEY

    NONE

    Is true if KEY exists

    Key_does_not_exist

    STRING:KEY

    NONE

    Is true if KEY does not exist

    A_key_matches

    REGEXP:KEY

    NONE

    Memory Usage Input Plugin

    Remove a key/value pair with key KEY if it exists

    Is true if a key matches regex KEY

    {
      "Key1"     : "Value1",
      "Key2"     : "Value2"
    }
    {
      "Key1"       : "Value1",
      "RenamedKey" : "Value2",
      "OtherKey"   : "Value3"
    }
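
    A modify filter definition matching the example above might look like the following sketch (the Match value is an assumption):

    [FILTER]
        # rename Key2 and add OtherKey only if it does not exist yet
        Name   modify
        Match  *
        Rename Key2 RenamedKey
        Add    OtherKey Value3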
    [0] memory: [1488543156, {"Mem.total"=>1016044, "Mem.used"=>841388, "Mem.free"=>174656, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
    [1] memory: [1488543157, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
    [2] memory: [1488543158, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
    [3] memory: [1488543159, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
    bin/fluent-bit -i mem \
      -p 'tag=mem.local' \
      -F modify \
      -p 'Add=Service1 SOMEVALUE' \
      -p 'Add=Service2 SOMEVALUE3' \
      -p 'Add=Mem.total2 TOTALMEM2' \
      -p 'Rename=Mem.free MEMFREE' \
      -p 'Rename=Mem.used MEMUSED' \
      -p 'Rename=Swap.total SWAPTOTAL' \
      -p 'Add=Mem.total TOTALMEM' \
      -m '*' \
      -o stdout
    [INPUT]
        Name mem
        Tag  mem.local
    
    [OUTPUT]
        Name  stdout
        Match *
    
    [FILTER]
        Name modify
        Match *
        Add Service1 SOMEVALUE
        Add Service3 SOMEVALUE3
        Add Mem.total2 TOTALMEM2
        Rename Mem.free MEMFREE
        Rename Mem.used MEMUSED
        Rename Swap.total SWAPTOTAL
        Add Mem.total TOTALMEM
    pipeline:
        inputs:
            - name: mem
              tag: mem.local
        filters:
            - name: modify
              match: '*'
              Add:
                - Service1 SOMEVALUE
                - Service3 SOMEVALUE3
                - Mem.total2 TOTALMEM2
                - Mem.total TOTALMEM
              Rename:
                - Mem.free MEMFREE
                - Mem.used MEMUSED
                - Swap.total SWAPTOTAL
        outputs:
            - name: stdout
              match: '*'
    [2018/04/06 01:35:13] [ info] [engine] started
    [0] mem.local: [1522980610.006892802, {"Mem.total"=>4050908, "MEMUSED"=>738100, "MEMFREE"=>3312808, "SWAPTOTAL"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Service1"=>"SOMEVALUE", "Service3"=>"SOMEVALUE3", "Mem.total2"=>"TOTALMEM2"}]
    [1] mem.local: [1522980611.000658288, {"Mem.total"=>4050908, "MEMUSED"=>738068, "MEMFREE"=>3312840, "SWAPTOTAL"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Service1"=>"SOMEVALUE", "Service3"=>"SOMEVALUE3", "Mem.total2"=>"TOTALMEM2"}]
    [2] mem.local: [1522980612.000307652, {"Mem.total"=>4050908, "MEMUSED"=>738068, "MEMFREE"=>3312840, "SWAPTOTAL"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Service1"=>"SOMEVALUE", "Service3"=>"SOMEVALUE3", "Mem.total2"=>"TOTALMEM2"}]
    [3] mem.local: [1522980613.000122671, {"Mem.total"=>4050908, "MEMUSED"=>738068, "MEMFREE"=>3312840, "SWAPTOTAL"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Service1"=>"SOMEVALUE", "Service3"=>"SOMEVALUE3", "Mem.total2"=>"TOTALMEM2"}]
    [INPUT]
        Name mem
        Tag  mem.local
        Interval_Sec 1
    
    [FILTER]
        Name    modify
        Match   mem.*
    
        Condition Key_Does_Not_Exist cpustats
        Condition Key_Exists Mem.used
    
        Set cpustats UNKNOWN
    
    [FILTER]
        Name    modify
        Match   mem.*
    
        Condition Key_Value_Does_Not_Equal cpustats KNOWN
    
        Add sourcetype memstats
    
    [FILTER]
        Name    modify
        Match   mem.*
    
        Condition Key_Value_Equals cpustats UNKNOWN
    
        Remove_wildcard Mem
        Remove_wildcard Swap
        Add cpustats_more STILL_UNKNOWN
    
    [OUTPUT]
        Name           stdout
        Match          *
    pipeline:
        inputs:
            - name: mem
              tag: mem.local
              interval_sec: 1
        filters:
            - name: modify
              match: mem.*
              Condition:
                - Key_Does_Not_Exist cpustats
                - Key_Exists Mem.used
              Set: cpustats UNKNOWN
            - name: modify
              match: mem.*
              Condition: Key_Value_Does_Not_Equal cpustats KNOWN
              Add: sourcetype memstats
            - name: modify
              match: mem.*
              Condition: Key_Value_Equals cpustats UNKNOWN
              Remove_wildcard:
                - Mem
                - Swap
              Add: cpustats_more STILL_UNKNOWN
        outputs:
            - name: stdout
              match: '*'
    [2018/06/14 07:37:34] [ info] [engine] started (pid=1493)
    [0] mem.local: [1528925855.000223110, {"cpustats"=>"UNKNOWN", "sourcetype"=>"memstats", "cpustats_more"=>"STILL_UNKNOWN"}]
    [1] mem.local: [1528925856.000064516, {"cpustats"=>"UNKNOWN", "sourcetype"=>"memstats", "cpustats_more"=>"STILL_UNKNOWN"}]
    [2] mem.local: [1528925857.000165965, {"cpustats"=>"UNKNOWN", "sourcetype"=>"memstats", "cpustats_more"=>"STILL_UNKNOWN"}]
    [3] mem.local: [1528925858.000152319, {"cpustats"=>"UNKNOWN", "sourcetype"=>"memstats", "cpustats_more"=>"STILL_UNKNOWN"}]
    [INPUT]
        Name mem
        Tag  mem.local
    
    [OUTPUT]
        Name  stdout
        Match *
    
    [FILTER]
        Name modify
        Match *
    
        Remove_Wildcard Mem
        Remove_Wildcard Swap
        Set This_plugin_is_on 🔥
        Set 🔥 is_hot
        Copy 🔥 💦
        Rename  💦 ❄️
        Set ❄️ is_cold
        Set 💦 is_wet
    pipeline:
        inputs:
            - name: mem
              tag: mem.local
              interval_sec: 1
        filters:
            - name: modify
              match: mem.*
              Remove_wildcard:
                - Mem
                - Swap
              Set:
                - This_plugin_is_on 🔥
                - 🔥 is_hot
              Copy: 🔥 💦
              Rename:  💦 ❄️
              Set:
                - ❄️ is_cold
                - 💦 is_wet
        outputs:
            - name: stdout
              match: '*'
    [2018/06/14 07:46:11] [ info] [engine] started (pid=21875)
    [0] mem.local: [1528926372.000197916, {"This_plugin_is_on"=>"🔥", "🔥"=>"is_hot", "❄️"=>"is_cold", "💦"=>"is_wet"}]
    [1] mem.local: [1528926373.000107868, {"This_plugin_is_on"=>"🔥", "🔥"=>"is_hot", "❄️"=>"is_cold", "💦"=>"is_wet"}]
    [2] mem.local: [1528926374.000181042, {"This_plugin_is_on"=>"🔥", "🔥"=>"is_hot", "❄️"=>"is_cold", "💦"=>"is_wet"}]
    [3] mem.local: [1528926375.000090841, {"This_plugin_is_on"=>"🔥", "🔥"=>"is_hot", "❄️"=>"is_cold", "💦"=>"is_wet"}]
    [0] mem.local: [1528926376.000610974, {"This_plugin_is_on"=>"🔥", "🔥"=>"is_hot", "❄️"=>"is_cold", "💦"=>"is_wet"}]

    Lua

    The Lua filter allows you to modify the incoming records (even split one record into multiple records) using custom Lua scripts.

    Due to the necessity of having a flexible filtering mechanism, it is possible to extend Fluent Bit capabilities by writing custom filters using the Lua programming language. A Lua-based filter requires two steps:

    1. Configure the Filter in the main configuration

    2. Prepare a Lua script that will be used by the Filter

    Configuration Parameters

    The plugin supports the following configuration parameters:

    Key
    Description

    Getting Started

    In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples use the dummy input plugin for data ingestion, invoke the Lua filter using the test.lua script, and call the cb_print() function, which only prints the same information to the standard output:

    Command Line

    From the command line you can use the following options:

    Configuration File

    In your main configuration file append the following Input, Filter & Output sections:

    Lua Script Filter API

    The life cycle of a filter has the following steps:

    1. Upon Tag matching by this filter, it may process or bypass the record.

    2. If the tag matches, it will accept the record and invoke the function defined in the call property, which is the name of a function defined in the Lua script.

    3. Invoke Lua function and pass each record in JSON format.

    4. Upon return, validate the return value and continue the pipeline.

    Callback Prototype

    The Lua script can have one or multiple callbacks that can be used by this filter. The function prototype is as follows:

    Function Arguments

    name
    description

    Return Values

    Each callback must return three values:

    name
    data type
    description

    Code Examples

    For functional examples of this interface, please refer to the code samples provided in the source code of the project located here:

    Inline configuration

    The Fluent Bit smoke tests include examples to verify during CI.

    In classic mode:

    Environment variable processing

    As an example that combines a bit of Lua processing with the Kubernetes filter, the following demonstrates using environment variables with Lua regex and substitutions.

    Kubernetes pods generally have various environment variables set by the infrastructure automatically which may contain useful information.

    In this example, we want to extract part of the Kubernetes cluster API name.

    The environment variable is set like so: KUBERNETES_SERVICE_HOST: api.sandboxbsh-a.project.domain.com

    We want to extract the sandboxbsh name and add it to our record as a special key.

    Number Type

    Lua treats numbers as doubles. This means an integer field (e.g. IDs, log levels) will be converted to a double. To avoid this type conversion, the type_int_key property is available.
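
    As a minimal sketch (the key names log_level and request_id are placeholders for integer fields you want to preserve):

    [FILTER]
        # keep the listed keys as integers instead of converting them to doubles
        Name         lua
        Match        *
        script       test.lua
        call         cb_print
        type_int_key log_level request_id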

    Protected Mode

    Fluent Bit supports protected mode to prevent crashes when executing an invalid Lua script. See also Error Handling in Application Code.

    Record Split

    The Lua callback function can return an array of tables (i.e., an array of records) as its third return value. With this feature, the Lua filter can split one input record into multiple records according to custom logic.

    For example:

    Lua script

    Configuration

    Input

    Output

    See also Fluent Bit: PR 811.

    Response code filtering

    In this example, we want to filter Istio logs to exclude lines with response codes between 1 and 399. Istio is configured to write the logs in JSON format.

    Lua script

    Script response_code_filter.lua

    Configuration

    Configuration to get istio logs and apply response code filter to them.

    Input

    Output

    In the output only the messages with response code 0 or greater than 399 are shown.

    By default, when the Lua script is invoked, the record timestamp is passed as a floating point number, which might lead to precision loss when it is converted back. If you need timestamp precision, enabling this option will pass the timestamp as a Lua table with keys sec for seconds since epoch and nsec for nanoseconds.
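
    A minimal sketch of this option with an inline callback that copies the two timestamp components into the record (the function and key names are illustrative):

    [FILTER]
        Name          lua
        Match         *
        time_as_table on
        call          cb_ts
        code          function cb_ts(tag, ts, record) record["epoch_sec"] = ts["sec"] record["epoch_nsec"] = ts["nsec"] return 2, ts, record end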

    code

    Inline LUA code instead of loading from a path via script.

    enable_flb_null

    If enabled, null will be converted to flb_null in Lua. This is useful to prevent removing key/value pairs, since nil is a special value that removes a key from a map in Lua. Default is false.


    script

    Path to the Lua script that will be used. This can be a relative path against the main configuration file.

    call

    Lua function name that will be triggered to do filtering. It's assumed that the function is declared inside the script parameter defined above.

    type_int_key

    If these keys are matched, the fields are converted to integer. If more than one key, delimit by space. Note that starting from Fluent Bit v1.6 integer data types are preserved and not converted to double as in previous versions.

    type_array_key

    If these keys are matched, the fields are handled as an array. If more than one key, delimit by space. This is useful when the array can be empty.

    protected_mode

    If enabled, the Lua script will be executed in protected mode. This prevents Fluent Bit from crashing when an invalid Lua script is executed or the triggered Lua function throws an exception. Default is true.

    tag

    Name of the tag associated with the incoming record.

    timestamp

    Unix timestamp with nanoseconds associated with the incoming record. The original format is a double (seconds.nanoseconds)

    record

    Lua table with the record content

    code

    integer

    The code return value represents the result and the further action that may follow. If code equals -1, the record will be dropped. If code equals 0, the record will not be modified. If code equals 1, the original timestamp and record have been modified, so they must be replaced by the returned values from timestamp (second return value) and record (third return value). If code equals 2, the original timestamp is not modified and the record has been modified, so it must be replaced by the returned value from record (third return value). Code 2 is supported from v1.4.3.

    timestamp

    double

    If code equals 1, the original record timestamp will be replaced with this new value.

    record

    table

    If code equals 1, the original record information will be replaced with this new value. Note that the record value must be a valid Lua table. This value can be an array of tables (i.e., array of objects in JSON format), and in that case the input record is effectively split into multiple records. (see below for more details)
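
    For illustration, here is a callback that exercises these return codes: it drops records without a message key (code -1) and otherwise adds a field while keeping the original timestamp (code 2). The function and key names are placeholders:

    [FILTER]
        Name  lua
        Match *
        call  cb_codes
        code  function cb_codes(tag, timestamp, record) if record["message"] == nil then return -1, timestamp, record end record["seen"] = "yes" return 2, timestamp, record end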

    dummy
    test.lua
    cb_print()
    https://github.com/fluent/fluent-bit/tree/master/scripts
    Fluent Bit smoke tests
    Kubernetes filter
    Error Handling in Application Code
    Fluent Bit: PR 811

    time_as_table

    $ fluent-bit -i dummy -F lua -p script=test.lua -p call=cb_print -m '*' -o null
    [INPUT]
        Name    dummy
    
    [FILTER]
        Name    lua
        Match   *
        script  test.lua
        call    cb_print
    
    [OUTPUT]
        Name    null
        Match   *
    function cb_print(tag, timestamp, record)
        ...
        return code, timestamp, record
    end
    service:
        flush:           1
        daemon:          off
        log_level:       info
    
    pipeline:
        inputs:
            - name:    random
              tag:     test
              samples: 10
    
        filters:
            - name:  lua
              match: "*"
              call:  append_tag
              code:  |
                  function append_tag(tag, timestamp, record)
                     new_record = record
                     new_record["tag"] = tag
                     return 1, timestamp, new_record
                  end
    
        outputs:
            - name:  stdout
              match: "*"
    [SERVICE]
    	flush 1
    	daemon off
    	log_level debug
    
    [INPUT]
    	Name random
    	Tag test
    	Samples 10
    
    [FILTER]
    	Name Lua
    	Match *
    	call append_tag
    	code function append_tag(tag, timestamp, record) new_record = record new_record["tag"] = tag return 1, timestamp, new_record end
    
    [OUTPUT]
    	Name stdout
    	Match *
          [FILTER]
              Name                lua
              Alias               filter-iots-lua
              Match               iots_thread.*
              Script              filters.lua
              Call                set_landscape_deployment
    
      filters.lua: |
        -- Use a Lua function to create some additional entries based
        -- on substrings from the kubernetes properties.
        function set_landscape_deployment(tag, timestamp, record)
            local landscape = os.getenv("KUBERNETES_SERVICE_HOST")
            if landscape then
                -- Strip the landscape name from this field, KUBERNETES_SERVICE_HOST
                -- Should be of this format
                -- api.sandboxbsh-a.project.domain.com
                -- Take off the leading "api."
                -- sandboxbsh-a.project.domain.com
                --print("landscape1:" .. landscape)
                landscape = landscape:gsub("^[^.]+.", "")
                --print("landscape2:" .. landscape)
                -- Take off everything including and after the - in the cluster name
                -- sandboxbsh
                landscape = landscape:gsub("-.*$", "")
                -- print("landscape3:" .. landscape)
                record["iot_landscape"] = landscape
            end
            -- 2 - replace existing record with this update
            return 2, timestamp, record
        end
    function cb_split(tag, timestamp, record)
        -- If the record contains an "x" key holding an array of maps,
        -- return that array so the input record is split into one record
        -- per element (code 2 keeps the original timestamp).
        if record["x"] ~= nil then
            return 2, timestamp, record["x"]
        else
            -- Otherwise return the record unchanged
            return 2, timestamp, record
        end
    end
    [Input]
        Name    stdin
    
    [Filter]
        Name    lua
        Match   *
        script  test.lua
        call    cb_split
    
    [Output]
        Name    stdout
        Match   *
    {"x": [ {"a1":"aa", "z1":"zz"}, {"b1":"bb", "x1":"xx"}, {"c1":"cc"} ]}
    {"x": [ {"a2":"aa", "z2":"zz"}, {"b2":"bb", "x2":"xx"}, {"c2":"cc"} ]}
    {"a3":"aa", "z3":"zz", "b3":"bb", "x3":"xx", "c3":"cc"}
    [0] stdin.0: [1538435928.310583591, {"a1"=>"aa", "z1"=>"zz"}]
    [1] stdin.0: [1538435928.310583591, {"x1"=>"xx", "b1"=>"bb"}]
    [2] stdin.0: [1538435928.310583591, {"c1"=>"cc"}]
    [3] stdin.0: [1538435928.310588359, {"z2"=>"zz", "a2"=>"aa"}]
    [4] stdin.0: [1538435928.310588359, {"b2"=>"bb", "x2"=>"xx"}]
    [5] stdin.0: [1538435928.310588359, {"c2"=>"cc"}]
    [6] stdin.0: [1538435928.310589790, {"z3"=>"zz", "x3"=>"xx", "c3"=>"cc", "a3"=>"aa", "b3"=>"bb"}]
    function cb_response_code_filter(tag, timestamp, record)
      response_code = record["response_code"]
      -- No response code present: keep the record untouched (code 0)
      if (response_code == nil or response_code == '') then
        return 0,0,0
      -- Successful responses (non-zero and below 400): drop the record (code -1)
      elseif (response_code ~= 0 and response_code < 400) then
        return -1,0,0
      -- Anything else (response codes >= 400, or 0): keep the record
      else
        return 0,0,0
      end
    end
        [INPUT]
            Name                tail
            Path                /var/log/containers/*_istio-proxy-*.log
            multiline.parser    docker, cri
            Tag                 istio.*
            Mem_Buf_Limit       64MB
            Skip_Long_Lines     Off
    
        [FILTER]
            Name                lua
            Match               istio.*
            Script              response_code_filter.lua
            call                cb_response_code_filter
    
        [OUTPUT]
            Name                stdout
            Match               *
    {
        "log": {
            "response_code": 200,
            "bytes_sent": 111328341,
            "authority": "randomservice.randomservice",
            "duration": 14493,
            "request_id": "2e9d38f8-36a9-40a6-bdb2-47c8eb7d399d",
            "upstream_local_address": "10.11.82.178:42738",
            "downstream_local_address": "10.10.21.17:80",
            "upstream_cluster": "outbound|80||randomservice.svc.cluster.local",
            "x_forwarded_for": null,
            "route_name": "default",
            "upstream_host": "10.11.6.90:80",
            "user_agent": "RandomUserAgent",
            "response_code_details": "via_upstream",
            "downstream_remote_address": "10.11.82.178:51096",
            "bytes_received": 1148,
            "path": "/?parameter=random",
            "response_flags": "-",
            "start_time": "2022-07-28T11:16:51.663Z",
            "upstream_transport_failure_reason": null,
            "method": "POST",
            "connection_termination_details": null,
            "protocol": "HTTP/1.1",
            "requested_server_name": null,
            "upstream_service_time": "6161"
        },
        "stream": "stdout",
        "time": "2022-07-28T11:17:06.704109897Z"
    }

    Kubernetes

    The Fluent Bit Kubernetes Filter allows you to enrich your log files with Kubernetes metadata.

    When Fluent Bit is deployed in Kubernetes as a DaemonSet and configured to read the log files from the containers (using tail or systemd input plugins), this filter aims to perform the following operations:

    • Analyze the Tag and extract the following metadata:

      • Pod Name

      • Namespace

      • Container Name

      • Container ID

    • Query Kubernetes API Server to obtain extra metadata for the POD in question:

      • Pod ID

      • Labels

      • Annotations

    The data is cached locally in memory and appended to each record.

    Configuration Parameters

    The plugin supports the following configuration parameters:

    Key
    Description
    Default

    Processing the 'log' value

    Kubernetes Filter aims to provide several ways to process the data contained in the log key. The following explanation of the workflow assumes that your original Docker parser defined in parsers.conf is as follows:

    Since Fluent Bit v1.2, we do not suggest the use of decoders (Decode_Field_As) if you are using an Elasticsearch database in the output, in order to avoid data type conflicts.

    To perform processing of the log key, it's mandatory to enable the Merge_Log configuration property in this filter. The following processing order is then applied:

    • If the Pod suggests a parser, the filter will use that parser to process the content of log.

    • If the option Merge_Parser was set and the Pod did not suggest a parser, process the log content using the parser set in the configuration.

    • If the Pod did not suggest a parser and Merge_Parser is not set, try to handle the content as JSON.

    If log value processing fails, the value is left untouched. The options above are exclusive, not chained: the filter will try only one of them, not all of them.
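    As a sketch of the second case above, the filter snippet below enables Merge_Log and supplies a fallback parser via Merge_Parser for Pods that do not suggest one; the parser name my_app_json is an assumption for this example and must exist in your parsers file.

    [FILTER]
        Name          kubernetes
        Match         kube.*
        Merge_Log     On
        Merge_Parser  my_app_json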

    Kubernetes Annotations

    A flexible feature of the Fluent Bit Kubernetes filter is that it allows Kubernetes Pods to suggest certain behaviors for the log processor pipeline when processing their records. At the moment it supports:

    • Suggest a pre-defined parser

    • Request to exclude logs

    The following annotations are available:

    Annotation
    Description
    Default

    Annotation Examples in Pod definition

    Suggest a parser

    The following Pod definition runs a Pod that emits Apache logs to the standard output. In the Annotations it suggests that the data should be processed using the pre-defined parser called apache:

    Request to exclude logs

    There are certain situations where the user would like to request that the log processor simply skip the logs from the Pod in question:

    Note that the annotation value is a boolean that can be true or false, and it must be quoted.

    Workflow of Tail + Kubernetes Filter

    Kubernetes Filter depends on either the Tail or Systemd input plugins to process and enrich records with Kubernetes metadata. Here we will explain the workflow of Tail and how its configuration correlates with the Kubernetes filter. Consider the following configuration example (just for demo purposes, not for production):

    In the input section, the plugin will monitor all files ending in .log in path /var/log/containers/. For every file it will read every line and apply the docker parser. Then the records are emitted to the next step with an expanded tag.

    Tail supports tag expansion, which means that if a tag has a star character (*), it will be replaced with the absolute path of the monitored file, so if your file name and path is:

    then the Tag for every record of that file becomes:

    Note that slashes are replaced with dots.

    When Kubernetes Filter runs, it will try to match all records that start with kube. (note the ending dot), so records from the file mentioned above will hit the matching rule and the filter will try to enrich the records.

    Kubernetes Filter does not care where the logs come from, but it does care about the absolute name of the monitored file, because that information contains the pod name and namespace name that are used to retrieve the metadata associated with the running Pod from the Kubernetes Master/API Server.

    If you have large pod specifications (which can be caused by a large number of environment variables, etc.), be sure to increase the Buffer_Size parameter of the kubernetes filter. If object sizes exceed this buffer, some metadata will fail to be injected into the logs.
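    For example, a sketch of raising the limit from the default 32k might look like the following; the 256k value is only an illustrative guess and should be sized to your largest pod specification (a value of 0 removes the limit entirely):

    [FILTER]
        Name         kubernetes
        Match        kube.*
        Buffer_Size  256k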

    If the configuration property Kube_Tag_Prefix was configured (available on Fluent Bit >= 1.1.x), it will use that value to remove the prefix that was appended to the Tag in the previous Input section. Note that the configuration property defaults to kube.var.log.containers., so the previous Tag content will be transformed from:

    to:

    The transformation above does not modify the original Tag; it just creates a new representation for the filter to perform the metadata lookup.

    That new value is used by the filter to look up the pod name and namespace; for that purpose it uses an internal regular expression:

    If you want to know more details, check the source code of that definition.

    You can see how this operation is performed on the Rubular.com web site; check the following demo link: https://rubular.com/r/HZz3tYAahj6JCd

    Custom Regex

    Under certain, uncommon conditions, a user may want to alter that hard-coded regular expression; for that purpose the option Regex_Parser can be used (documented above).

    Final Comments

    At this point the filter is able to gather the values of pod_name and namespace. With that information it will check in the local cache (an internal hash table) whether some metadata for that key pair already exists; if so, it will enrich the record with the cached metadata, otherwise it will connect to the Kubernetes Master/API Server and retrieve that information.

    Optional Feature: Using Kubelet to Get Metadata

    There is a reported issue where the kube-apiserver can fall over and become unresponsive when the cluster is too large and too many requests are sent to it. With this feature, the Fluent Bit Kubernetes filter sends requests to the kubelet /pods endpoint instead of the kube-apiserver to retrieve pod information and uses it to enrich the logs. Since the kubelet runs locally on each node, requests are answered faster and each node only receives one request at a time. This saves kube-apiserver capacity to handle other requests. When this feature is enabled, you should see no difference in the Kubernetes metadata added to logs, but the kube-apiserver bottleneck should be avoided when the cluster is large.

    Configuration Setup

    Some configuration setup is needed for this feature.

    Role Configuration for Fluent Bit DaemonSet Example:

    The difference is that using the kubelet requires a special permission for the resource nodes/proxy to allow the HTTP request. When creating the Role or ClusterRole, you need to add nodes/proxy to the resources in the rule.

    Fluent Bit Configuration Example:

    In the Fluent Bit configuration, you need to set Use_Kubelet to true to enable this feature.

    DaemonSet config Example:

    The key point is to set hostNetwork to true and dnsPolicy to ClusterFirstWithHostNet so the Fluent Bit DaemonSet can call the kubelet locally. Otherwise it cannot resolve the DNS name for the kubelet.

    Now you are good to use this new feature!

    Verify that the Use_Kubelet option is working

    Basically, you should see no difference in your experience of enriching your log files with Kubernetes metadata.

    To check whether Fluent Bit is using the kubelet, look at the Fluent Bit logs; there should be a line like this:

    And if you are in debug mode, you could see more:

    Troubleshooting

    The following section goes over specific log messages you may run into and how to solve them to ensure that Fluent Bit's Kubernetes filter is operating properly.

    I can't see metadata appended to my pod or other Kubernetes objects

    If you are not seeing metadata added to your Kubernetes logs and see the following in your log messages, you may be facing connectivity issues with the Kubernetes API server.

    Potential fix #1: Check Kubernetes roles

    When Fluent Bit is deployed as a DaemonSet, it generally runs with specific roles that allow the application to talk to the Kubernetes API server. If you are deployed in a more restricted environment, check that all the Kubernetes roles are set correctly.

    You can test this by running the following command (replace fluentbit-system with the namespace where your fluentbit is installed):

    If the roles are configured correctly, it should simply respond with yes.

    For instance, using Azure AKS, running the above command may respond with:

    If you have connectivity to the API server but still get "could not get meta for POD", debug logging might give you a message containing Azure does not have opinion for this user. In that case, the following subject may need to be included in the fluentbit ClusterRoleBinding, appended to the subjects array:

    Potential fix #2: Check Kubernetes IPv6

    There may be cases where IPv6 is enabled in the environment and you need to enable it within Fluent Bit as well. In the [SERVICE] section, set the option ipv6 to on.
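    A minimal sketch of that setting in classic configuration format:

    [SERVICE]
        Flush  1
        ipv6   On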

    Potential fix #3: Check connectivity to Kube_URL

    By default, Kube_URL is set to https://kubernetes.default.svc:443. Ensure that you have connectivity to this endpoint from within the cluster and that there are no special permissions interfering with the connection.

    I can't see new objects getting metadata

    In some cases, you may see only some objects being appended with metadata while other objects are not enriched. This can occur when local data is cached and does not contain the correct ID for the Kubernetes object that requires enrichment. For most Kubernetes objects the Kubernetes API server is updated, which will then be reflected in Fluent Bit logs; however, in some cases for Pod objects this refresh to the Kubernetes API server can be skipped, causing metadata to be missed.
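    If you hit this, one hedged workaround is to give cached entries a finite lifetime with Kube_Meta_Cache_TTL so that stale metadata is eventually re-fetched from the API server; the 60s value below is only an example.

    [FILTER]
        Name                 kubernetes
        Match                kube.*
        Kube_Meta_Cache_TTL  60s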

    Kube_CA_Path

    Absolute path to scan for certificate files

    Kube_Token_File

    Token file

    /var/run/secrets/kubernetes.io/serviceaccount/token

    Kube_Tag_Prefix

    When the source records come from the Tail input plugin, this option specifies the prefix used in the Tail configuration.

    kube.var.log.containers.

    Merge_Log

    When enabled, it checks if the log field content is a JSON string map; if so, it appends the map fields as part of the log structure.

    Off

    Merge_Log_Key

    When Merge_Log is enabled, the filter assumes the log field of the incoming message is a JSON string and makes a structured representation of it at the same level as the log field in the map. If Merge_Log_Key is set (a string name), all the new structured fields taken from the original log content are instead inserted under this new key.

    Merge_Log_Trim

    When Merge_Log is enabled, trim (remove possible \n or \r) field values.

    On

    Merge_Parser

    Optional parser name to specify how to parse the data contained in the log key. Recommended use is for developers or testing only.

    Keep_Log

    When Keep_Log is disabled, the log field is removed from the incoming message once it has been successfully merged (Merge_Log must be enabled as well).

    On

    tls.debug

    Debug level between 0 (nothing) and 4 (every detail).

    -1

    tls.verify

    When enabled, turns on certificate validation when connecting to the Kubernetes API server.

    On

    Use_Journal

    When enabled, the filter reads logs coming in Journald format.

    Off

    Cache_Use_Docker_Id

    When enabled, metadata will be fetched from K8s when docker_id is changed.

    Off

    Regex_Parser

    Set an alternative Parser to process the record Tag and extract pod_name, namespace_name, container_name and docker_id. The parser must be registered in a parsers file (refer to parser filter-kube-test as an example).

    K8S-Logging.Parser

    Allow Kubernetes Pods to suggest a pre-defined Parser (read more about it in Kubernetes Annotations section)

    Off

    K8S-Logging.Exclude

    Allow Kubernetes Pods to exclude their logs from the log processor (read more about it in Kubernetes Annotations section).

    Off

    Labels

    Include Kubernetes resource labels in the extra metadata.

    On

    Annotations

    Include Kubernetes resource annotations in the extra metadata.

    On

    Kube_meta_preload_cache_dir

    If set, Kubernetes meta-data can be cached/pre-loaded from files in JSON format in this directory, named as namespace-pod.meta

    Dummy_Meta

    If set, use dummy-meta data (for test/dev purposes)

    Off

    DNS_Retries

    DNS lookup retries N times until the network starts working

    6

    DNS_Wait_Time

    DNS lookup interval between network status checks

    30

    Use_Kubelet

    This is an optional feature flag to get metadata information from the kubelet instead of calling the Kube Server API to enrich the log. This could mitigate the Kube API heavy traffic issue for large clusters.

    Off

    Kubelet_Port

    kubelet port to use for the HTTP request; this only works when Use_Kubelet is set to On.

    10250

    Kubelet_Host

    kubelet host to use for the HTTP request; this only works when Use_Kubelet is set to On.

    127.0.0.1

    Kube_Meta_Cache_TTL

    Configurable TTL for K8s cached metadata. By default, it is set to 0, which means the TTL for cache entries is disabled and cache entries are evicted at random when capacity is reached. To enable this option, set the value to a time interval. For example, set it to 60 or 60s and cache entries created more than 60 seconds ago will be evicted.

    0

    Kube_Token_TTL

    Configurable 'time to live' for the K8s token. By default, it is set to 600 seconds. After this time, the token is reloaded from Kube_Token_File or from the Kube_Token_Command.

    600

    Kube_Token_Command

    Command to get the Kubernetes authorization token. By default, it is NULL and the token file will be used to get the token. If you want to manually choose a command to get it, you can set the command here. For example, run aws-iam-authenticator -i your-cluster-name token --token-only to get the token. This option is currently Linux-only.

    Buffer_Size

    Set the buffer size for the HTTP client when reading responses from the Kubernetes API server. The value must conform to the Unit Size specification. A value of 0 results in no limit, and the buffer will expand as needed. Note that if pod specifications exceed the buffer limit, the API response will be discarded when retrieving metadata, and some Kubernetes metadata will fail to be injected into the logs.

    32k

    Kube_URL

    API Server end-point

    https://kubernetes.default.svc:443

    Kube_CA_File

    CA certificate file

    /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

    fluentbit.io/parser[_stream][-container]

    Suggest a pre-defined parser. The parser must already be registered in Fluent Bit. This option is only processed if the Fluent Bit configuration (Kubernetes Filter) has enabled the option K8S-Logging.Parser. If present, the stream (stdout or stderr) restricts the annotation to that specific stream. If present, the container restricts the annotation to a specific container in the Pod.

    fluentbit.io/exclude[_stream][-container]

    Request Fluent Bit to exclude (or not) the logs generated by the Pod. This option is only processed if the Fluent Bit configuration (Kubernetes Filter) has enabled the option K8S-Logging.Exclude.

    False



    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L
        Time_Keep    On
    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-logs
      labels:
        app: apache-logs
      annotations:
        fluentbit.io/parser: apache
    spec:
      containers:
      - name: apache
        image: edsiper/apache_logs
    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-logs
      labels:
        app: apache-logs
      annotations:
        fluentbit.io/exclude: "true"
    spec:
      containers:
      - name: apache
        image: edsiper/apache_logs
    [INPUT]
        Name    tail
        Tag     kube.*
        Path    /var/log/containers/*.log
        Parser  docker
    
    [FILTER]
        Name             kubernetes
        Match            kube.*
        Kube_URL         https://kubernetes.default.svc:443
        Kube_CA_File     /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File  /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix  kube.var.log.containers.
        Merge_Log        On
        Merge_Log_Key    log_processed
    /var/log/container/apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
    kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
    kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
    apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
    (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: fluentbitds
      namespace: fluentbit-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: fluentbit
    rules:
      - apiGroups: [""]
        resources:
          - namespaces
          - pods
          - nodes
          - nodes/proxy
        verbs: 
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: fluentbit
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: fluentbit
    subjects:
      - kind: ServiceAccount
        name: fluentbitds
        namespace: fluentbit-system
        [INPUT]
            Name              tail
            Tag               kube.*
            Path              /var/log/containers/*.log
            DB                /var/log/flb_kube.db
            Parser            docker
            Docker_Mode       On
            Mem_Buf_Limit     50MB
            Skip_Long_Lines   On
            Refresh_Interval  10
    
        [FILTER]
            Name                kubernetes
            Match               kube.*
            Kube_URL            https://kubernetes.default.svc.cluster.local:443
            Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
            Merge_Log           On
            Buffer_Size         0
            Use_Kubelet         true
            Kubelet_Port        10250
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluentbit
      namespace: fluentbit-system
      labels:
        app.kubernetes.io/name: fluentbit
    spec:
      selector:
        matchLabels:
          name: fluentbit
      template:
        metadata:
          labels:
            name: fluentbit
        spec:
          serviceAccountName: fluentbitds
          containers:
            - name: fluent-bit
              imagePullPolicy: Always
              image: fluent/fluent-bit:latest
              volumeMounts:
                - name: varlog
                  mountPath: /var/log
                - name: varlibdockercontainers
                  mountPath: /var/lib/docker/containers
                  readOnly: true
                - name: fluentbit-config
                  mountPath: /fluent-bit/etc/
              resources:
                limits:
                  memory: 1500Mi
                requests:
                  cpu: 500m
                  memory: 500Mi
          hostNetwork: true
          dnsPolicy: ClusterFirstWithHostNet
          volumes:
            - name: varlog
              hostPath:
                path: /var/log
            - name: varlibdockercontainers
              hostPath:
                path: /var/lib/docker/containers
            - name: fluentbit-config
              configMap:
                name: fluentbit-config
    [ info] [filter:kubernetes:kubernetes.0] testing connectivity with Kubelet...
    [debug] [filter:kubernetes:kubernetes.0] Send out request to Kubelet for pods information.
    [debug] [filter:kubernetes:kubernetes.0] Request (ns=<namespace>, pod=node name) http_do=0, HTTP Status: 200
    [ info] [filter:kubernetes:kubernetes.0] connectivity OK
    [2021/02/05 10:33:35] [debug] [filter:kubernetes:kubernetes.0] Request (ns=<Namespace>, pod=<podName>) http_do=0, HTTP Status: 200
    [2021/02/05 10:33:35] [debug] [filter:kubernetes:kubernetes.0] kubelet find pod: <podName> and ns: <Namespace> match
    [2020/10/15 03:48:57] [ info] [filter_kube] testing connectivity with API server...
    [2020/10/15 03:48:57] [error] [filter_kube] upstream connection error
    [2020/10/15 03:48:57] [ warn] [filter_kube] could not get meta for POD
    kubectl auth can-i list pods --as=system:serviceaccount:fluentbit-system:fluentbit
    no - Azure does not have opinion for this user.
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:serviceaccounts