1 of 14

Configuration

Fluent Bit is flexible enough to be configured either from the command line or through a configuration file. For production environments, we strongly recommend to use the configuration file approach.

Note that all configuration files use a specific fixed and strict schema, please proceed to the following sections for a better understanding:

File Schema (must read)
Configuration Files
Configuration Variables
Configuration Commands
Monitoring
Unit Sizes
TLS / SSL
Backpressure
Memory Usage

Configuration Schema

Fluent Bit may optionally use a configuration file to define how the service will behave, and before proceeding we need to understand how the configuration schema works. The schema is defined by three concepts:

Sections
Entries: Key/Value
Indented Configuration Mode

A simple example of a configuration file is as follows:

[SERVICE]
    # This is a commented line
    Daemon    off
    log_level debug

Sections

A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using [SERVICE] definition. Section rules:

All section content must be indented (4 spaces ideally).
Multiple sections can exist on the same file.
A section is expected to have comments and entries, it cannot be empty.
Any commented line under a section, must be indented too.

Entries: Key/Value

A section may contain Entries, an entry is defined by a line of text that contains a Key and a Value, using the above example, the [SERVICE] section contains two entries, one is the key Daemon with value off and the other is the key Log_Level with the value debug. Entries rules:

An entry is defined by a key and a value.
A key must be indented.
A key must contain a value which ends in the breakline.
Multiple keys with the same name can exist.

Also commented lines are set prefixing the # character, those lines are not processed but they must be indented too.

Indented Configuration Mode

Fluent Bit configuration files are based in a strict Indented Mode, that means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example:

[FIRST_SECTION]
    # This is a commented line
    Key1  some value
    Key2  another value
    # more comments

[SECOND_SECTION]
    KeyN  3.14

As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented.

Configuration File

There are some cases where using the command line to start Fluent Bit is not ideal. When running Fluent Bit as a service, a configuration file is preferred.

Fluent Bit allows to use one configuration file which works at a global scope and uses the schema defined previously.

The configuration file supports four types of sections:

Service
Input
Filter
Output

In addition there is an additional feature to include external files:

Include File

Service

The Service section defines global properties of the service, the keys available as of this version are described in the following table:

Example

The following is an example of a SERVICE section:

[SERVICE]
    Flush           5
    Daemon          off
    Log_Level       debug

Input

An INPUT section defines a source (related to an input plugin), here we will describe the base configuration for each INPUT section. Note that each input plugin may add it own configuration keys:

The Name is mandatory and it let Fluent Bit know which input plugin should be loaded. The Tag is mandatory for all plugins except for the input forward plugin (as it provides dynamic tags).

Example

The following is an example of an INPUT section:

[INPUT]
    Name cpu
    Tag  my_cpu

Filter

A FILTER section defines a filter (related to an filter plugin), here we will describe the base configuration for each FILTER section. Note that each filter plugin may add it own configuration keys:

The Name is mandatory and it let Fluent Bit know which filter plugin should be loaded. The Match or Match_Regex is mandatory for all plugins. If both are specified, Match_Regex takes precedence.

Example

The following is an example of an FILTER section:

[FILTER]
    Name  stdout
    Match *

Output

The OUTPUT section specify a destination that certain records should follow after a Tag match. The configuration support the following keys:

Example

The following is an example of an OUTPUT section:

[OUTPUT]
    Name  stdout
    Match my*cpu

Example: collecting CPU metrics

The following configuration file example demonstrates how to collect CPU metrics and flush the results every five seconds to the standard output:

[SERVICE]
    Flush     5
    Daemon    off
    Log_Level debug

[INPUT]
    Name  cpu
    Tag   my_cpu

[OUTPUT]
    Name  stdout
    Match my*cpu

Include File

To avoid complicated long configuration files is better to split specific parts in different files and call them (include) from one main file.

Starting from Fluent Bit 0.12 the new configuration command @INCLUDE has been added and can be used in the following way:

@INCLUDE somefile.conf

The configuration reader will try to open the path somefile.conf, if not found, it will assume it's a relative path based on the path of the base configuration file, e.g:

Main configuration file path: /tmp/main.conf
Included file: somefile.conf
Fluent Bit will try to open somefile.conf, if it fails it will try /tmp/somefile.conf.

The @INCLUDE command only works at top-left level of the configuration line, it cannot be used inside sections.

Wildcard character (*) is supported to include multiple files, e.g:

@INCLUDE input_*.conf

Configuration Variables

Fluent Bit support the usage of environment variables in any value associated to a key when using a configuration file.

The variables are case sensitive and can be used in the following format:

${MY_VARIABLE}

When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve it value.

Example

Create the following configuration file (fluent-bit.conf):

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info

[INPUT]
    Name cpu
    Tag  cpu.local

[OUTPUT]
    Name  ${MY_OUTPUT}
    Match *

Open a terminal and set the environment variable:

$ export MY_OUTPUT=stdout

The above command set the 'stdout' value to the variable MY_OUTPUT.

Run Fluent Bit with the recently created configuration file:

$ bin/fluent-bit -c fluent-bit.conf
Fluent-Bit v0.11.0
Copyright (C) Treasure Data

[2017/04/03 12:25:25] [ info] [engine] started
[0] cpu.local: [1491243925, {"cpu_p"=>1.750000, "user_p"=>1.750000, "system_p"=>0.000000, "cpu0.p_cpu"=>3.000000, "cpu0.p_user"=>2.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>4.000000, "cpu2.p_user"=>4.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

As you can see the service worked properly as the configuration was valid.

Configuration Commands

Configuration files must be flexible enough for any deployment need, but they must keep a clean and readable format.

Fluent Bit Commands extends a configuration file with specific built-in features. The list of commands available as of Fluent Bit 0.12 series are:

@INCLUDE Command

Configuring a logging pipeline might lead to an extensive configuration file. In order to maintain a human-readable configuration, it's suggested to split the configuration in multiple files.

The @INCLUDE command allows the configuration reader to include an external configuration file, e.g:

The above example defines the main service configuration file and also include two files to continue the configuration:

inputs.conf

outputs.conf

Note that despites the order of inclusion, Fluent Bit will ALWAYS respect the following order:

Service
Inputs
Filters
Outputs

@SET Command

The @SET command can only be used at root level of each line, meaning it cannot be used inside a section, e.g:

Buffering / Storage

The end-goal of Fluent Bit is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases and one of the critical pieces is the ability to do buffering : a mechanism to place processed data into a temporal location until is ready to be shipped.

By default when Fluent Bit process data, it uses Memory as a primary and temporal place to store the record logs, but there are certain scenarios where would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities.

Starting with Fluent Bit v1.0, we introduced a new storage layer that can either work in memory or in the file system. Input plugins can be configured to use one or the other upon demand at start time.

Configuration

The storage layer configuration takes place in two areas:

Service Section
Input Section

The known Service section configure a global environment for the storage layer, and then in the Input sections defines which mechanism to use.

Service Section Configuration

a Service section will look like this:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M

that configuration configure an optional buffering mechanism where it root for data is /var/log/flb-storage/, it will use normal synchronization mode, without checksum and up to a maximum of 5MB of memory when processing backlog data.

Input Section Configuration

Optionally, any Input plugin can configure their storage preference, the following table describe the options available:

The following example configure a service that offers filesystem buffering capabilities and two Input plugins being the first based in memory and the second with the filesystem:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M

[INPUT]
    name          cpu
    storage.type  filesystem

[INPUT]
    name          mem
    storage.type  memory

Monitoring

Fluent Bit comes with a built-in HTTP Server that can be used to query internal information and monitor metrics of each running plugin.

Getting Started

To get started, the first step is to enable the HTTP Server from the configuration file:

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020

[INPUT]
    Name cpu

[OUTPUT]
    Name  stdout
    Match *

the above configuration snippet will instruct Fluent Bit to start it HTTP Server on TCP Port 2020 and listening on all network interfaces:

$ bin/fluent-bit -c fluent-bit.conf
Fluent-Bit v0.14.x
Copyright (C) Treasure Data

[2017/10/27 19:08:24] [ info] [engine] started
[2017/10/27 19:08:24] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020

now with a simple curl command is enough to gather some information:

$ curl -s http://127.0.0.1:2020 | jq
{
  "fluent-bit": {
    "version": "0.13.0",
    "edition": "Community",
    "flags": [
      "FLB_HAVE_TLS",
      "FLB_HAVE_METRICS",
      "FLB_HAVE_SQLDB",
      "FLB_HAVE_TRACE",
      "FLB_HAVE_HTTP_SERVER",
      "FLB_HAVE_FLUSH_LIBCO",
      "FLB_HAVE_SYSTEMD",
      "FLB_HAVE_VALGRIND",
      "FLB_HAVE_FORK",
      "FLB_HAVE_PROXY_GO",
      "FLB_HAVE_REGEX",
      "FLB_HAVE_C_TLS",
      "FLB_HAVE_SETJMP",
      "FLB_HAVE_ACCEPT4",
      "FLB_HAVE_INOTIFY"
    ]
  }
}

Note that we are sending the curl command output to the jq program which helps to make the JSON data easy to read from the terminal. Fluent Bit don't aim to do JSON pretty-printing.

REST API Interface

Fluent Bit aims to expose useful interfaces for monitoring, as of Fluent Bit v0.14 the following end points are available:

Uptime Example

Query the service uptime with the following command:

$ curl -s http://127.0.0.1:2020/api/v1/uptime | jq

it should print a similar output like this:

{
  "uptime_sec": 8950000,
  "uptime_hr": "Fluent Bit has been running:  103 days, 14 hours, 6 minutes and 40 seconds"
}

Metrics Examples

Query internal metrics in JSON format with the following command:

$ curl -s http://127.0.0.1:2020/api/v1/metrics | jq

it should print a similar output like this:

{
  "input": {
    "cpu.0": {
      "records": 8,
      "bytes": 2536
    }
  },
  "output": {
    "stdout.0": {
      "proc_records": 5,
      "proc_bytes": 1585,
      "errors": 0,
      "retries": 0,
      "retries_failed": 0
    }
  }
}

Metrics in Prometheus format

Query internal metrics in Prometheus Text 0.0.4 format:

$ curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus

this time the same metrics will be in Prometheus format instead of JSON:

fluentbit_input_records_total{name="cpu.0"} 57 1509150350542
fluentbit_input_bytes_total{name="cpu.0"} 18069 1509150350542
fluentbit_output_proc_records_total{name="stdout.0"} 54 1509150350542
fluentbit_output_proc_bytes_total{name="stdout.0"} 17118 1509150350542
fluentbit_output_errors_total{name="stdout.0"} 0 1509150350542
fluentbit_output_retries_total{name="stdout.0"} 0 1509150350542
fluentbit_output_retries_failed_total{name="stdout.0"} 0 1509150350542

Configuring Aliases

By default configured plugins on runtime get an internal name in the format plugin_name.ID. For monitoring purposes this can be confusing if many plugins of the same type were configured. To make a distinction each configured input or output section can get an alias that will be used as the parent name for the metric.

The following example set an alias to the INPUT section which is using the CPU input plugin:

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020

[INPUT]
    Name  cpu
    Alias server1_cpu

[OUTPUT]
    Name  stdout
    Alias raw_output
    Match *

Now when querying the metrics we get the aliases in place instead of the plugin name:

{
  "input": {
    "server1_cpu": {
      "records": 8,
      "bytes": 2536
    }
  },
  "output": {
    "raw_output": {
      "proc_records": 5,
      "proc_bytes": 1585,
      "errors": 0,
      "retries": 0,
      "retries_failed": 0
    }
  }
}

Unit Sizes

Certain configuration directives in Fluent Bit refer to unit sizes such as when defining the size of a buffer or specific limits, we can find these in plugins like Tail Input, Forward Input or in generic properties like Mem_Buf_Limit.

Starting from Fluent Bit v0.11.10, all unit sizes have been standardized across the core and plugins, the following table describes the options that can be used and what they mean:

TLS / SSL

Fluent Bit provides integrated support for Transport Layer Security (TLS) and it predecessor Secure Sockets Layer (SSL) respectively. In this section we will refer as TLS only for both implementations.

Each output plugin that requires to perform Network I/O can optionally enable TLS and configure the behavior. The following table describes the properties available:

The listed properties can be enabled in the configuration file, specifically on each output plugin section or directly through the command line. The following output plugins can take advantage of the TLS feature:

Example: enable TLS on HTTP output

By default HTTP output plugin uses plain TCP, enabling TLS from the command line can be done with:

In the command line above, the two properties tls and tls.verify where enabled for demonstration purposes (we strongly suggest always keep verification ON).

The same behavior can be accomplished using a configuration file:

Backpressure

In certain environments is common to see that logs or data being ingested is faster than the ability to flush it to some destinations. The common case is reading from big log files and dispatching the logs to a backend over the network which takes some time to respond, this generate backpressure leading to a high memory consumption in the service.

In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restrict the amount of data than an input plugin can ingest, this is done through the configuration parameter Mem_Buf_Limit.

Mem_Buf_Limit

This option is disabled by default and can be applied to all input plugins. Let's explain it behavior using the following scenario:

Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine route the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB

At this exact point, the engine will allow to append those 500KB of data into the engine: in total we have 1.2MB. The options works in a permissive mode before to reach the limit, but the limit is exceeded the following actions are taken:

block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback

The engine will protect it self and will not append more data coming from the input plugin in question; Note that is the plugin responsibility to keep their state and take some decisions about what to do on that paused state.

After some seconds if the scheduler was able to flush the initial 700KB of data or it gave up after retrying, that amount memory is released and internally the following actions happens:

Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data

About pause and resume Callbacks

Each plugin is independent and not all of them implements the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.

Memory Usage

In certain scenarios would be ideal to estimate how much memory Fluent Bit could be using, this is very useful for containerized environments where memory limits are a must.

In order to estimate we will assume that the input plugins have set the Mem_Buf_Limit option (you can learn more about it in the section).

Estimating

Input plugins append data independently, so in order to do an estimation a limit should be imposed through the Mem_Buf_Limit option. If the limit was set to 10MB we need to estimate that in the worse case, the output plugin likely could use 20MB.

Fluent Bit has an internal binary representation for the data being processed, but when this data reach an output plugin, this one will likely create their own representation in a new memory buffer for processing. The best example are the and output plugins, both needs to convert the binary representation to their respective-custom JSON formats before to talk to their backend servers.

So, if we impose a limit of 10MB for the input plugins and considering the worse case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.

Glibc and Memory Fragmentation

Is well known that in intensive environments where memory allocations happens in the order of magnitude, the default memory allocator provided by Glibc could lead to a high fragmentation, reporting a high memory usage by the service.

It's strongly suggested that in any production environment, Fluent Bit should be built with enabled (e.g. -DFLB_JEMALLOC=On). Jemalloc is an alternative memory allocator that can reduce fragmentation (among others things) resulting in better performance.

You can check if Fluent Bit has been built with Jemalloc using the following command:

The output should looks like:

If the FLB_HAVE_JEMALLOC option is listed in Build Flags, everything will be fine.

Upstream Servers

It's common that Fluent Bit aims to connect to external services to deliver the logs over the network, this is the case of , and within others. Being able to connect to one node (host) is normal and enough for more of the use cases, but there are other scenarios where balancing across different nodes is required. The Upstream feature provides such capability.

An Upstream defines a set of nodes that will be targeted by an output plugin, by the nature of the implementation an output plugin must support the Upstream feature. The following plugin(s) have Upstream support:

The current balancing mode implemented is round-robin.

Configuration

To define an Upstream it's required to create an specific configuration file that contains an UPSTREAM and one or multiple NODE sections. The following table describe the properties associated to each section. Note that all of them are mandatory:

Nodes and specific plugin configuration

A Node might contain additional configuration keys required by the plugin, on that way we provide enough flexibility for the output plugin, a common use case is Forward output where if TLS is enabled, it requires a shared key (more details in the example below).

Nodes and TLS (Transport Layer Security)

In addition to the properties defined in the table above, the network operations against a defined node can optionally be done through the use of TLS for further encryption and certificates use.

The TLS options available are described in the section and can be added to the any Node section.

Configuration File Example

The following example defines an Upstream called forward-balancing which aims to be used by Forward output plugin, it register three Nodes:

node-1: connects to 127.0.0.1:43000
node-2: connects to 127.0.0.1:44000
node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called shared_key.

Note that every Upstream definition must exists on it own configuration file in the file system. Adding multiple Upstreams in the same file or different files is not allowed.

Scheduler

has an Engine that helps to coordinate the data ingestion from input plugins and call the Scheduler to decide when is time to flush the data through one or multiple output plugins. The Scheduler flush new data every a fixed time of seconds and Schedule retries when asked.

Once an output plugin gets call to flush some data, after processing that data it can notify the Engine three possible return statuses:

OK
Retry
Error

If the return status was OK, it means it was successfully able to process and flush the data, if it returned an Error status, means that an unrecoverable error happened and the engine should not try to flush that data again. If a Retry was requested, the Engine will ask the Scheduler to retry to flush that data, the Scheduler will decide how many seconds to wait before that happen.

Configuring Retries

The Scheduler provides a simple configuration option called Retry_Limit which can be set independently on each output section. This option allows to disable retries or impose a limit to try N times and then discard the data after reaching that limit:

Example

The following example configure two outputs where the HTTP plugin have an unlimited number of retries and the Elasticsearch plugin have a limit of 5 times:

Stream Processor

Fluent Bit v1.1 comes with a new and optional Stream Processor Engine that allows to do data processing through SQL queries. This article covers the format of the expected configuration file.

For more details about the Stream Processor Engine use please refer to the following guide:

Concepts

The stream processor can be configured through defining Tasks which have a name and an execution SQL statement:

Streams File Configuration

The Stream Processor is configured through a streams file that is referenced from the main fluent-bit.conf configuration file through the Streams_File key. The content of the streams file must have the following format specified in the table below:

Configuration Example

Consider the following fluent-bit.conf configuration file:

Now creates a stream_processor.conf configuration file with the following content:

On the query there are a few things happening:

Stream Processor have a Task attached to any incoming Stream of data called cpu_data (check the alias set in the Input section).
Stream Processor will aggregate the value of cpu_p record field and calculate it average during a window of 5 seconds.
Stream Processor every 5 seconds will send the results back into Fluent Bit pipeline with a tag called results.
Fluent Bit output section will match results tagged records and print them to the standard output interface.

You should see the following output in your terminal: