Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Fluent Bit support the usage of environment variables in any value associated to a key when using a configuration file.
The variables are case sensitive and can be used in the following format:
When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve it value.
Create the following configuration file (fluent-bit.conf):
Open a terminal and set the environment variable:
The above command set the 'stdout' value to the variable MY_OUTPUT.
Run Fluent Bit with the recently created configuration file:
As you can see the service worked properly as the configuration was valid.
In certain environments is common to see that logs or data being ingested is faster than the ability to flush it to some destinations. The common case is reading from big log files and dispatching the logs to a backend over the network which takes some time to respond, this generate backpressure leading to a high memory consumption in the service.
In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restrict the amount of data than an input plugin can ingest, this is done through the configuration parameter Mem_Buf_Limit.
This option is disabled by default and can be applied to all input plugins. Let's explain it behavior using the following scenario:
Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine route the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB
At this exact point, the engine will allow to append those 500KB of data into the engine: in total we have 1.2MB. The options works in a permissive mode before to reach the limit, but the limit is exceeded the following actions are taken:
block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback
The engine will protect it self and will not append more data coming from the input plugin in question; Note that is the plugin responsibility to keep their state and take some decisions about what to do on that paused state.
After some seconds if the scheduler was able to flush the initial 700KB of data or it gave up after retrying, that amount memory is released and internally the following actions happens:
Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data
Each plugin is independent and not all of them implements the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.
The plugin who implements and keep a good state is the Tail Input plugin. When the pause callback is triggered, it stop their collectors and stop appending data. Upon resume, it re-enable the collectors.
Fluent Bit comes with a built-in HTTP Server that can be used to query internal information and monitor metrics of each running plugin.
To get started, the first step is to enable the HTTP Server from the configuration file:
the above configuration snippet will instruct Fluent Bit to start it HTTP Server on TCP Port 2020 and listening on all network interfaces:
now with a simple curl command is enough to gather some information:
Note that we are sending the curl command output to the jq program which helps to make the JSON data easy to read from the terminal. Fluent Bit don't aim to do JSON pretty-printing.
Fluent Bit aims to expose useful interfaces for monitoring, as of Fluent Bit v0.14 the following end points are available:
Query the service uptime with the following command:
it should print a similar output like this:
Query internal metrics in JSON format with the following command:
it should print a similar output like this:
Query internal metrics in Prometheus Text 0.0.4 format:
this time the same metrics will be in Prometheus format instead of JSON:
By default configured plugins on runtime get an internal name in the format plugin_name.ID. For monitoring purposes this can be confusing if many plugins of the same type were configured. To make a distinction each configured input or output section can get an alias that will be used as the parent name for the metric.
The following example set an alias to the INPUT section which is using the CPU input plugin:
Now when querying the metrics we get the aliases in place instead of the plugin name:
The end-goal of Fluent Bit is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases and one of the critical pieces is the ability to do buffering : a mechanism to place processed data into a temporal location until is ready to be shipped.
By default when Fluent Bit process data, it uses Memory as a primary and temporal place to store the record logs, but there are certain scenarios where would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities.
Starting with Fluent Bit v1.0, we introduced a new storage layer that can either work in memory or in the file system. Input plugins can be configured to use one or the other upon demand at start time.
The storage layer configuration takes place in two areas:
Service Section
Input Section
The known Service section configure a global environment for the storage layer, and then in the Input sections defines which mechanism to use.
a Service section will look like this:
that configuration configure an optional buffering mechanism where it root for data is /var/log/flb-storage/, it will use normal synchronization mode, without checksum and up to a maximum of 5MB of memory when processing backlog data.
Optionally, any Input plugin can configure their storage preference, the following table describe the options available:
The following example configure a service that offers filesystem buffering capabilities and two Input plugins being the first based in memory and the second with the filesystem:
Configuration files must be flexible enough for any deployment need, but they must keep a clean and readable format.
Fluent Bit Commands extends a configuration file with specific built-in features. The list of commands available as of Fluent Bit 0.12 series are:
Configuring a logging pipeline might lead to an extensive configuration file. In order to maintain a human-readable configuration, it's suggested to split the configuration in multiple files.
The @INCLUDE command allows the configuration reader to include an external configuration file, e.g:
The above example defines the main service configuration file and also include two files to continue the configuration:
Note that despites the order of inclusion, Fluent Bit will ALWAYS respect the following order:
Service
Inputs
Filters
Outputs
Fluent Bit supports configuration variables, one way to expose this variables to Fluent Bit is through setting a Shell environment variable, the other is through the @SET command.
The @SET command can only be used at root level of each line, meaning it cannot be used inside a section, e.g:
Fluent Bit provides integrated support for Transport Layer Security (TLS) and it predecessor Secure Sockets Layer (SSL) respectively. In this section we will refer as TLS only for both implementations.
Each output plugin that requires to perform Network I/O can optionally enable TLS and configure the behavior. The following table describes the properties available:
The listed properties can be enabled in the configuration file, specifically on each output plugin section or directly through the command line. The following output plugins can take advantage of the TLS feature:
By default HTTP output plugin uses plain TCP, enabling TLS from the command line can be done with:
In the command line above, the two properties tls and tls.verify where enabled for demonstration purposes (we strongly suggest always keep verification ON).
The same behavior can be accomplished using a configuration file:
Fluent Bit supports TLS server name indication. If you are serving multiple hostnames on a single IP address (a.k.a. virtual hosting), you can make use of tls.vhost
to connect to a specific hostname.
Certain configuration directives in Fluent Bit refer to unit sizes such as when defining the size of a buffer or specific limits, we can find these in plugins like Tail Input, Forward Input or in generic properties like Mem_Buf_Limit.
Starting from Fluent Bit v0.11.10, all unit sizes have been standardized across the core and plugins, the following table describes the options that can be used and what they mean:
There are some cases where using the command line to start Fluent Bit is not ideal. When running Fluent Bit as a service, a configuration file is preferred.
Fluent Bit allows to use one configuration file which works at a global scope and uses the schema defined previously.
The configuration file supports four types of sections:
In addition there is an additional feature to include external files:
The Service section defines global properties of the service, the keys available as of this version are described in the following table:
The following is an example of a SERVICE section:
An INPUT section defines a source (related to an input plugin), here we will describe the base configuration for each INPUT section. Note that each input plugin may add it own configuration keys:
The Name is mandatory and it let Fluent Bit know which input plugin should be loaded. The Tag is mandatory for all plugins except for the input forward plugin (as it provides dynamic tags).
The following is an example of an INPUT section:
A FILTER section defines a filter (related to an filter plugin), here we will describe the base configuration for each FILTER section. Note that each filter plugin may add it own configuration keys:
The Name is mandatory and it let Fluent Bit know which filter plugin should be loaded. The Match or Match_Regex is mandatory for all plugins. If both are specified, Match_Regex takes precedence.
The following is an example of an FILTER section:
The OUTPUT section specify a destination that certain records should follow after a Tag match. The configuration support the following keys:
The following is an example of an OUTPUT section:
The following configuration file example demonstrates how to collect CPU metrics and flush the results every five seconds to the standard output:
To avoid complicated long configuration files is better to split specific parts in different files and call them (include) from one main file.
Starting from Fluent Bit 0.12 the new configuration command @INCLUDE has been added and can be used in the following way:
The configuration reader will try to open the path somefile.conf, if not found, it will assume it's a relative path based on the path of the base configuration file, e.g:
Main configuration file path: /tmp/main.conf
Included file: somefile.conf
Fluent Bit will try to open somefile.conf, if it fails it will try /tmp/somefile.conf.
The @INCLUDE command only works at top-left level of the configuration line, it cannot be used inside sections.
Wildcard character (*) is supported to include multiple files, e.g:
Fluent Bit may optionally use a configuration file to define how the service will behave, and before proceeding we need to understand how the configuration schema works. The schema is defined by three concepts:
A simple example of a configuration file is as follows:
A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using [SERVICE] definition. Section rules:
All section content must be indented (4 spaces ideally).
Multiple sections can exist on the same file.
A section is expected to have comments and entries, it cannot be empty.
Any commented line under a section, must be indented too.
A section may contain Entries, an entry is defined by a line of text that contains a Key and a Value, using the above example, the [SERVICE] section contains two entries, one is the key Daemon with value off and the other is the key Log_Level with the value debug. Entries rules:
An entry is defined by a key and a value.
A key must be indented.
A key must contain a value which ends in the breakline.
Multiple keys with the same name can exist.
Also commented lines are set prefixing the # character, those lines are not processed but they must be indented too.
Fluent Bit configuration files are based in a strict Indented Mode, that means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example:
As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented.
Fluent Bit is flexible enough to be configured either from the command line or through a configuration file. For production environments, we strongly recommend to use the configuration file approach.
Note that all configuration files use a specific fixed and strict schema, please proceed to the following sections for a better understanding:
(must read)
In certain scenarios would be ideal to estimate how much memory Fluent Bit could be using, this is very useful for containerized environments where memory limits are a must.
In order to estimate we will assume that the input plugins have set the Mem_Buf_Limit option (you can learn more about it in the section).
Input plugins append data independently, so in order to do an estimation a limit should be imposed through the Mem_Buf_Limit option. If the limit was set to 10MB we need to estimate that in the worse case, the output plugin likely could use 20MB.
Fluent Bit has an internal binary representation for the data being processed, but when this data reach an output plugin, this one will likely create their own representation in a new memory buffer for processing. The best example are the and output plugins, both needs to convert the binary representation to their respective-custom JSON formats before to talk to their backend servers.
So, if we impose a limit of 10MB for the input plugins and considering the worse case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.
Is well known that in intensive environments where memory allocations happens in the order of magnitude, the default memory allocator provided by Glibc could lead to a high fragmentation, reporting a high memory usage by the service.
It's strongly suggested that in any production environment, Fluent Bit should be built with enabled (e.g. -DFLB_JEMALLOC=On
). Jemalloc is an alternative memory allocator that can reduce fragmentation (among others things) resulting in better performance.
You can check if Fluent Bit has been built with Jemalloc using the following command:
The output should looks like:
If the FLB_HAVE_JEMALLOC option is listed in Build Flags, everything will be fine.
It's common that Fluent Bit aims to connect to external services to deliver the logs over the network, this is the case of , and within others. Being able to connect to one node (host) is normal and enough for more of the use cases, but there are other scenarios where balancing across different nodes is required. The Upstream feature provides such capability.
An Upstream defines a set of nodes that will be targeted by an output plugin, by the nature of the implementation an output plugin must support the Upstream feature. The following plugin(s) have Upstream support:
The current balancing mode implemented is round-robin.
To define an Upstream it's required to create an specific configuration file that contains an UPSTREAM and one or multiple NODE sections. The following table describe the properties associated to each section. Note that all of them are mandatory:
A Node might contain additional configuration keys required by the plugin, on that way we provide enough flexibility for the output plugin, a common use case is Forward output where if TLS is enabled, it requires a shared key (more details in the example below).
In addition to the properties defined in the table above, the network operations against a defined node can optionally be done through the use of TLS for further encryption and certificates use.
The TLS options available are described in the section and can be added to the any Node section.
The following example defines an Upstream called forward-balancing which aims to be used by Forward output plugin, it register three Nodes:
node-1: connects to 127.0.0.1:43000
node-2: connects to 127.0.0.1:44000
node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called shared_key.
Note that every Upstream definition must exists on it own configuration file in the file system. Adding multiple Upstreams in the same file or different files is not allowed.
has an Engine that helps to coordinate the data ingestion from input plugins and call the Scheduler to decide when is time to flush the data through one or multiple output plugins. The Scheduler flush new data every a fixed time of seconds and Schedule retries when asked.
Once an output plugin gets call to flush some data, after processing that data it can notify the Engine three possible return statuses:
OK
Retry
Error
If the return status was OK, it means it was successfully able to process and flush the data, if it returned an Error status, means that an unrecoverable error happened and the engine should not try to flush that data again. If a Retry was requested, the Engine will ask the Scheduler to retry to flush that data, the Scheduler will decide how many seconds to wait before that happen.
The Scheduler provides a simple configuration option called Retry_Limit which can be set independently on each output section. This option allows to disable retries or impose a limit to try N times and then discard the data after reaching that limit:
The following example configure two outputs where the HTTP plugin have an unlimited number of retries and the Elasticsearch plugin have a limit of 5 times:
Fluent Bit v1.1 comes with a new and optional Stream Processor Engine that allows to do data processing through SQL queries. This article covers the format of the expected configuration file.
For more details about the Stream Processor Engine use please refer to the following guide:
The stream processor can be configured through defining Tasks which have a name and an execution SQL statement:
The Stream Processor is configured through a streams file that is referenced from the main fluent-bit.conf configuration file through the Streams_File key. The content of the streams file must have the following format specified in the table below:
Consider the following fluent-bit.conf configuration file:
Now creates a stream_processor.conf configuration file with the following content:
On the query there are a few things happening:
Stream Processor have a Task attached to any incoming Stream of data called cpu_data (check the alias set in the Input section).
Stream Processor will aggregate the value of cpu_p record field and calculate it average during a window of 5 seconds.
Stream Processor every 5 seconds will send the results back into Fluent Bit pipeline with a tag called results.
Fluent Bit output section will match results tagged records and print them to the standard output interface.
You should see the following output in your terminal:
Fluent Bit will gather CPU usage metrics through (metrics are calculated by default every second).
If you want to learn more about our Stream Processor engine please read the .
Key
Description
Default
storage.path
Set an optional location in the file system to store streams and chunks of data. If this parameter is not set, Input plugins can only use in-memory buffering.
storage.sync
Configure the synchronization mode used to store the data into the file system. It can take the values normal or full.
normal
storage.checksum
Enable the data integrity check when writing and reading data from the filesystem. The storage layer uses the CRC32 algorithm.
Off
storage.backlog.mem_limit
If storage.path is set, Fluent Bit will look for data chunks that were not delivered and are still in the storage layer, these are called backlog data. This option configure a hint of maximum value of memory to use when processing these records.
5M
Key
Description
Default
storage.type
Specify the buffering mechanism to use. It can be memory or filesystem.
memory
Command
Prototype
Description
@INCLUDE FILE
Include a configuration file
@SET KEY=VAL
Set a configuration variable
Property
Description
Default
tls
enable or disable TLS support
Off
tls.verify
force certificate validation
On
tls.debug
Set TLS debug verbosity level. It accept the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 Verbose
1
tls.ca_file
absolute path to CA certificate file
tls.ca_path
absolute path to scan for certificate files
tls.crt_file
absolute path to Certificate file
tls.key_file
absolute path to private Key file
tls.key_passwd
optional password for tls.key_file file
tls.vhost
hostname to be used for TLS SNI extension
Suffix
Description
Example
When a suffix is not specified, it's assumed that the value given is a bytes representation.
Specifying a value of 32000, means 32000 bytes
k, K, KB, kb
Kilobyte: a unit of memory equal to 1,000 bytes.
32k means 32000 bytes.
m, M, MB, mb
Megabyte: a unit of memory equal to 1,000,000 bytes
1M means 1000000 bytes
g, G, GB, gb
Gigabyte: a unit of memory equal to 1,000,000,000 bytes
1G means 1000000000 bytes
URI
Description
Data Format
/
Fluent Bit build information
JSON
/api/v1/uptime
Get uptime information in seconds and human readable format
JSON
/api/v1/metrics
Internal metrics per loaded plugin
JSON
/api/v1/metrics/prometheus
Internal metrics per loaded plugin ready to be consumed by a Prometheus Server
Prometheus Text 0.0.4
Key
Description
Default Value
Flush
Set the flush time in seconds. Everytime it timeouts, the engine will flush the records to the output plugin.
5
Daemon
Boolean value to set if Fluent Bit should run as a Daemon (background) or not. Allowed values are: yes, no, on and off.
Off
Log_File
Absolute path for an optional log file.
Log_Level
Set the logging verbosity level. Allowed values are: error, warning, info, debug and trace. Values are accumulative, e.g: if 'debug' is set, it will include error, warning, info and debug. Note that trace mode is only available if Fluent Bit was built with the WITH_TRACE option enabled.
info
Parsers_File
Path for a parsers configuration file. Multiple Parsers_File entries can be used.
Plugins_File
Path for a plugins configuration file. A plugins configuration file allows to define paths for external plugins, for an example see here.
Streams_File
Path for the Stream Processor configuration file. For details about the format of SP configuration file see here.
HTTP_Server
Enable built-in HTTP Server
Off
HTTP_Listen
Set listening interface for HTTP Server when it's enabled
0.0.0.0
HTTP_Port
Set TCP Port for the HTTP Server
2020
Coro_Stack_Size
Set the coroutines stack size in bytes. The value must be greater than the page size of the running system. Don't set too small value (say 4096), or coroutine threads can overrun the stack buffer.
24576
Key
Description
Name
Name of the input plugin.
Tag
Tag name associated to all records comming from this plugin.
Key
Description
Name
Name of the filter plugin.
Match
A pattern to match against the tags of incoming records. It's case sensitive and support the star (*) character as a wildcard.
Match_Regex
A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax.
Key
Description
Name
Name of the output plugin.
Match
A pattern to match against the tags of incoming records. It's case sensitive and support the star (*) character as a wildcard.
Match_Regex
A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax.
Section | Key | Description |
UPSTREAM | name | Defines a name for the Upstream in question. |
NODE | name | Defines a name for the Node in question. |
host | IP address or hostname of the target host. |
port | TCP port of the target service. |
Value | Description |
Retry_Limit | N | Integer value to set the maximum number of retries allowed. N must be >= 1 (default: 2) |
Retry_Limit | False | When Retry_Limit is set to False, means that there is not limit for the number of retries that the Scheduler can do. |
Concept | Description |
Task | Definition of a Stream Processor task to be executed. A task is defined through a section called STREAM_TASK. |
Name | Tasks have a name for debugging and testing purposes. |
Exec | SQL statement to be executed when a Task runs. |
Section | Key | Description | Mandatory? |
STREAM_TASK |
Name | Set a name for the task in question. The value is used as a reference only. | Yes |
Exec | SQL statement to be executed by the task. Note that the SQL statement must be finished with a semicolon. The SQL statement must be set in one single line (no multiline support in the configuration) | Yes |