Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Fluent Bit is an open source, multi-platform log processor that aims to be a generic Swiss Army knife for log processing and distribution.
The number of sources of information in our environments is ever increasing. Handling data collection at scale is complex, and collecting and aggregating diverse data requires a specialized tool that can deal with:
Different sources of information
Different data formats
Data Reliability
Security
Flexible Routing
Multiple destinations
Fluent Bit has been designed with performance and low resource consumption in mind.
Every project has a story
In 2014, the Fluentd team at Treasure Data forecast the need for a lightweight log processor for constrained environments like Embedded Linux and Gateways. The project aimed to be part of the Fluentd ecosystem; we called it Fluent Bit, fully open source and available under the terms of the Apache License v2.0.
After the project had been around for some time, it gained traction in the Embedded market, but we also started getting requests for several features from the Cloud community, like more inputs, filters, and outputs. Not long after that, Fluent Bit became one of the preferred solutions to solve the logging challenges in Cloud environments.
There are a few key concepts that are really important to understand how Fluent Bit operates.
Before diving into Fluent Bit it’s good to get acquainted with some of the key concepts of the service. This document provides a gentle introduction to those concepts and common Fluent Bit terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our log and stream processor.
Event or Record
Filtering
Tag
Timestamp
Match
Structured Message
Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record.
As an example consider the following content of a Syslog file:
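An illustrative sample of such a file:

```
Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server
Jan 18 12:52:16 flb dbus-daemon[2243]: Successfully activated service 'org.gnome.Terminal'
Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.
Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/"
```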
It contains four lines, each of which represents an independent Event.
Internally, an Event always has two components (in array form): the Timestamp and the Message.
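```
[TIMESTAMP, MESSAGE]
```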
In some cases it is necessary to modify the content of an Event; the process to alter, enrich or drop Events is called Filtering.
There are many use cases where Filtering is required, such as:
Append specific information to the Event like an IP address or metadata.
Select a specific piece of the Event content.
Drop Events that match a certain pattern.
Every Event that gets into Fluent Bit gets assigned a Tag. This tag is an internal string that is used in a later stage by the Router to decide which Filter or Output phase it must go through.
Most tags are assigned manually in the configuration. If a tag is not specified, Fluent Bit assigns the name of the Input plugin instance where the Event was generated.
The only input plugin that does NOT assign tags is Forward input. This plugin speaks the Fluentd wire protocol called Forward where every Event already comes with a Tag associated. Fluent Bit will always use the incoming Tag set by the client.
A Tagged record must always have a Matching rule. To learn more about Tags and Matches check the Routing section.
The Timestamp represents the time when an Event was created. Every Event has an associated Timestamp. The Timestamp is a numeric fractional integer in the format SECONDS.NANOSECONDS, where:
SECONDS is the number of seconds that have elapsed since the Unix epoch.
NANOSECONDS is the fractional second, in nanoseconds (one thousand-millionth of a second).
A timestamp always exists, either set by the Input plugin or discovered through a data parsing process.
Fluent Bit allows you to deliver your collected and processed Events to one or multiple destinations; this is done through a routing phase. A Match represents a simple rule to select Events whose Tag matches a defined rule.
To learn more about Tags and Matches check the Routing section.
Source events may or may not have a structure. A structure defines a set of keys and values inside the Event message. As an example, consider the following two messages:
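```
"Project Fluent Bit created on 1398289291"

{"project": "Fluent Bit", "created": 1398289291}
```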
At a low level both are just arrays of bytes, but the structured message defines keys and values; having a structure helps to implement faster operations on data modifications.
Fluent Bit always handles every Event message as a structured message. For performance reasons, we use a binary serialization data format called MessagePack.
Consider MessagePack as a binary version of JSON on steroids.
A Strong Commitment to Openness and Collaboration
Fluent Bit, including its core, plugins and tools, is distributed under the terms of the Apache License v2.0:
The Production Grade Ecosystem
Logging and data processing in general can be complex, especially at scale; that's why Fluentd was born. Fluentd has become more than a simple tool: it has grown into a full-scale ecosystem that contains SDKs for different languages and sub-projects like Fluent Bit.
On this page, we describe the relationship between the Fluentd and Fluent Bit open source projects. As a summary, both are:
Licensed under the terms of Apache License v2.0
Hosted projects by the Cloud Native Computing Foundation (CNCF)
Production Grade solutions: deployed thousands of times every single day, millions per month.
Community driven projects
Widely Adopted by the Industry: trusted by major companies like AWS, Microsoft, Google Cloud and hundreds of others.
Originally created by Treasure Data.
Both projects share many similarities: Fluent Bit is fully designed and built on top of the best ideas of Fluentd's architecture and general design. Which one to use depends on the end-user's needs.
The following table describes a comparison in different areas of the projects:
Both Fluentd and Fluent Bit can work as Aggregators or Forwarders; they can complement each other or be used as standalone solutions.
High Performance Log and Metrics Processor
Fluent Bit is a Fast and Lightweight Logs and Metrics Processor and Forwarder for Linux, OSX, Windows and BSD family operating systems. It has been made with a strong focus on performance to allow the collection of events from different sources without complexity.
High Performance
Metrics Collection (Prometheus compatible)
Reliability and Data Integrity
Backpressure Handling
Data Buffering in memory and file system
Networking
Security: built-in TLS/SSL support
Asynchronous I/O
Pluggable Architecture and Extensibility: Inputs, Filters and Outputs
More than 80 built-in plugins available
Extensibility
Write any input, filter or output plugin in C language
Bonus: write Filters in Lua or Output plugins in Golang
Monitoring: expose internal metrics over HTTP in JSON and Prometheus format
Stream Processing: Perform data selection and transformation using simple SQL queries
Create new streams of data using query results
Aggregation Windows
Data analysis and prediction: Timeseries forecasting
Portable: runs on Linux, MacOS, Windows and BSD systems
Fluent Bit is a CNCF sub-project under the umbrella of Fluentd; it's licensed under the terms of the Apache License v2.0. The project was originally created by Treasure Data and is currently a vendor-neutral and community-driven project.
| | Fluentd | Fluent Bit |
|---|---|---|
| Scope | Containers / Servers | Embedded Linux / Containers / Servers |
| Language | C & Ruby | C |
| Memory | ~40MB | ~650KB |
| Performance | High Performance | High Performance |
| Dependencies | Built as a Ruby Gem, it requires a certain number of gems. | Zero dependencies, unless some special plugin requires them. |
| Plugins | More than 1000 plugins available | Around 70 plugins available |
| License | Apache License v2.0 | Apache License v2.0 |
Performance and Data Safety
When Fluent Bit processes data, it uses the system memory (heap) as the primary and temporary place to store the records before they get delivered; the records are processed in this private memory area.
Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. Buffering in memory is the fastest mechanism, but there are certain scenarios where the mechanism requires special strategies to deal with backpressure, data safety or reduce memory consumption by the service in constraint environments.
Network failures or latency on third-party services are pretty common, and in scenarios where we cannot deliver data as fast as we receive it, we will likely face backpressure.
Our buffering strategies are designed to solve problems associated with backpressure and general delivery failures.
As buffering strategies, Fluent Bit offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can accommodate any use case safely and keep high performance while processing your data.
The two mechanisms are not mutually exclusive: when data is ready to be processed or delivered it will always be in memory, while other data in the queue might be in the file system until it is ready to be processed and moved up to memory.
To learn more about the buffering configuration in Fluent Bit, please jump to the Buffering & Storage section.
Data processing with reliability
As defined in the Buffering concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, either using the primary in-memory model or the filesystem-based mode. Data in the buffer phase is in an immutable state, meaning no other filter can be applied.
Note that buffered data is not raw text, it's in Fluent Bit's internal binary representation.
Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.
Modify, Enrich or Drop your records
In production environments we want to have full control of the data we are collecting; filtering is an important feature that allows us to alter the data before delivering it to some destination.
Filtering is implemented through plugins, so each filter available could be used to match, exclude or enrich your logs with some specific metadata.
Many filters are available. A common use case for filtering is Kubernetes deployments, where every Pod log needs the proper metadata associated with it.
Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.
For more details about the Filters available and their usage, please refer to the Filters section.
Convert Unstructured to Structured messages
Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally we want to impose a structure on incoming data via the Input Plugins as soon as it is collected:
The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:
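An illustrative entry in the common Apache access-log format:

```
192.168.2.10 - - [17/Feb/2018:04:14:58 +0000] "GET / HTTP/1.1" 200 16
```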
The above log line is a raw string without format; ideally we would like to give it a structure that can be processed easily later. With the proper configuration, the log entry could be converted to:
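```
{
  "host":   "192.168.2.10",
  "user":   "-",
  "method": "GET",
  "path":   "/",
  "code":   "200",
  "size":   "16"
}
```

(The exact keys depend on the parser used; these are illustrative.)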
Parsers are fully configurable and are independently and optionally handled by each input plugin, for more details please refer to the Parsers section.
The way to gather data from your sources
Fluent Bit provides different Input Plugins to gather information from different sources, some of them just collect data from log files while others can gather metrics information from the operating system. There are many plugins for different needs.
When an input plugin is loaded, an internal instance is created. Every instance has its own and independent configuration. Configuration keys are often called properties.
Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.
For more details, please refer to the Input Plugins section.
Create flexible routing rules
Routing is a core feature that allows you to route your data through Filters and finally to one or multiple destinations. The router relies on the concept of Tags and Matching rules.
There are two important concepts in Routing:
Tag
Match
When data is generated by an input plugin, it comes with a Tag (most of the time the Tag is configured manually). The Tag is a human-readable indicator that helps to identify the data source.
In order to define where the data should be routed, a Match rule must be specified in the output configuration.
Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:
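```
[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name  es
    Match my_cpu

[OUTPUT]
    Name  stdout
    Match my_mem
```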
Note: the above is a simple example demonstrating how Routing is configured.
Routing works automatically by reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match at routing time, the data is deleted.
Routing is flexible enough to support wildcards in the Match pattern. The example below defines a common destination for both sources of data:
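```
[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name  stdout
    Match my_*
```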
The match rule is set to my_* which means it will match any Tag that starts with my_.
The following article covers the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes you must be aware of.
If you are migrating from a previous version of Fluent Bit, please review the following important changes:
Tail input plugin: by default the plugin now follows a file from the end once the service starts (the old behavior was to always read from the beginning). Every file found at start is followed from its last position; new files discovered at runtime, or rotated files, are read from the beginning. If you want to keep the old behavior you can set the option read_from_head to true.
If you have any existing queries based on the resource's project_id, please update your query accordingly.
The migration from v1.4 to v1.5 is pretty straightforward.
If you are migrating from Fluent Bit v1.3, there are no breaking changes. Just new exciting features to enjoy :)
If you are migrating from Fluent Bit v1.2 to v1.3, there are no breaking changes. If you are upgrading from an older version please review the incremental changes below.
In Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding; hence, when parsing Docker logs, it is no longer necessary to use decoders. The new Docker parser looks like this:
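```
[PARSER]
    # Docker JSON log parser, no decoders required
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
```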
Note: again, do not use decoders.
We have also improved how the Kubernetes Filter handles stringified log messages. If the option Merge_Log is enabled, it will try to handle the log content as a JSON map; if so, it will add the keys to the root map.
In addition, we have fixed and improved the option called Merge_Log_Key. If the log merge succeeds, all new keys will be packaged under the key specified by this option. A suggested configuration is as follows:
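```
[FILTER]
    Name          kubernetes
    Match         kube.*
    Merge_Log     On
    Merge_Log_Key log_processed
```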
As an example, if the original log content is the following map:
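```
{"key1": 12345, "key2": "abc"}
```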
the final record will be composed as follows:
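```
{
  "log": "{\"key1\": 12345, \"key2\": \"abc\"}",
  "log_processed": {
    "key1": 12345,
    "key2": "abc"
  }
}
```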
If you are upgrading from Fluent Bit <= 1.0.x, you should take into consideration the following relevant changes when switching to the Fluent Bit v1.1 series:
We introduced a new configuration property called Kube_Tag_Prefix to help Tag prefix resolution and address an unexpected behavior that landed in previous versions.
During the 1.0.x release cycle, a commit in the Tail input plugin changed the default behavior of how the Tag was composed when using the wildcard for expansion, breaking compatibility with other services. Consider the following configuration example:
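```
[INPUT]
    Name    tail
    Path    /var/log/containers/*.log
    Tag     kube.*
```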
The expected behavior is that the Tag will be expanded to the full path, e.g. kube.var.log.containers.apache.log for a monitored file named apache.log, but the change introduced in the 1.0 series switched from the absolute path to the base file name only: kube.apache.log.
In the Fluent Bit v1.1 release we restored the default behavior; the Tag is now composed using the absolute path of the monitored file.
Having the absolute path in the Tag is relevant for routing and flexible configuration, and it also helps to keep compatibility with Fluentd behavior.
This behavior switch in the Tail input plugin affects how the Kubernetes Filter operates. When the filter is used, it needs to perform local metadata lookups based on the file names when using Tail as a source. With the new Kube_Tag_Prefix option you can now specify the prefix used in the Tail input plugin; for the configuration example above, the new configuration will look as follows:
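```
[FILTER]
    Name             kubernetes
    Match            kube.*
    Kube_Tag_Prefix  kube.var.log.containers.
```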
The proper value for Kube_Tag_Prefix must be composed of the Tag prefix set in the Tail input plugin plus the monitored directory path with slashes replaced by dots.
For more details about the changes in each release, please refer to the release notes.
The project_id of resources sent to Google Cloud Logging is now set to the project ID rather than the project number. To learn the difference between a project ID and a project number, see the Google Cloud documentation for more details.
If you enabled keepalive mode in your configuration, note that this configuration property has been renamed to net.keepalive. All network I/O keepalive is now enabled by default; to learn more about this and other associated configuration properties, read the Networking section.
If you use the Elasticsearch output plugin, note that the default value of type is now _doc. Many versions of Elasticsearch will tolerate this, but ES v5.6 through v6.1 require a type without a leading underscore. See the Elasticsearch output plugin documentation for more.
Destinations for your data: databases, cloud services and more!
The output interface allows us to define destinations for the data. Common destinations are remote services, local file systems, or standard interfaces, among others. Outputs are implemented as plugins and many are available.
When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.
Every output plugin has its own documentation section specifying how it can be used and what properties are available.
For more details, please refer to the Output Plugins section.
The following operating systems and architectures are supported in Fluent Bit.
From an architecture support perspective, Fluent Bit is fully functional on x86_64, Arm64v8 and Arm32v7 based processors.
| Deployment Type |
|---|
| Kubernetes |
| Docker |
| Containers on AWS |

| Linux Packages |
|---|
| CentOS / Red Hat |
| Ubuntu |
| Debian |
| Amazon Linux |
| Raspbian / Raspberry Pi |
| Yocto / Embedded Linux |

| Windows Packages |
|---|
| Windows Server 2019 |
| Windows 10 2019.03 |

| Compile from Source |
|---|
| Linux, FreeBSD, MacOS |
| Windows |

| Operating System | Distribution | Architectures |
|---|---|---|
| Linux | CentOS / Red Hat | x86_64, Arm64v8 |
| Linux | Debian | x86_64, Arm64v8 |
| Linux | Ubuntu | x86_64, Arm64v8 |
| Linux | Amazon Linux | x86_64, Arm64v8 |
| Linux | Yocto / Embedded Linux | x86_64 |
| Linux | Raspbian / Raspberry Pi | Arm32v7 |
| Windows | Windows Server 2019 | x86_64, x86 |
| Windows | Windows 10 2019.03 | x86_64, x86 |

Fluent Bit can also work on OSX and *BSD systems, but not all plugins will be available on all platforms. Official support will expand based on community demand. Fluent Bit may run on older operating systems, though it will need to be built from source or installed from custom packages.
Fluent Bit has very low CPU and memory consumption and is compatible with most x86, x86_64, arm32v7 and arm64v8 based platforms. In order to build it you need the following components in your system:
Compiler: GCC or clang
CMake
Flex & Bison: only if you enable the Stream Processor or Record Accessor feature (both enabled by default)
The core has no other dependencies. Certain features depend on third-party components, like output plugins with special backend libraries (e.g. Kafka); those components are included in the main source code repository.
From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.
The GPG Key fingerprint of the new key is:
The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.
The GPG Key fingerprint of the old key is:
Refer to the supported platform documentation to see which platforms are supported in each release.

Migration to Fluent Bit

From version 1.9, td-agent-bit is a deprecated package and will be removed in the future. The correct package name to use now is fluent-bit. Both are currently provided to allow migration.
Fluent Bit uses CMake as its build system. The suggested procedure to prepare the build system consists of the following steps:
In the following steps you can find the exact commands to build and install the project with the default options. If you already know how CMake works you can skip this part and look at the build options available. Note that Fluent Bit requires CMake 3.x; you may need to use cmake3 instead of cmake to complete the following steps on your system.
Change to the build/ directory inside the Fluent Bit sources:
Let CMake configure the project specifying where the root path is located:
Now you are ready to start the compilation process through the simple make command:
To continue installing the binary on the system, just run make install; you will likely need root privileges, so try prefixing the command with sudo.
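Putting the steps together, a sketch of the default flow from the Fluent Bit sources root:

```
cd build/
cmake ..
make
sudo make install
```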
Fluent Bit provides certain options to CMake that can be enabled or disabled when configuring. Please refer to the following tables under the General Options, Development Options, Input Plugins and Output Plugins sections.
The input plugins provide features to gather information from a specific source type, which can be a network interface, some built-in metric or a specific input device. The following input plugins are available:
The filter plugins allow you to modify, enrich or drop records. The following table describes the filters available in this version:
The output plugins give you the ability to flush information to some external interface, service or terminal. The following table describes the output plugins available in this version:
General Options:

| Option | Description | Default |
|---|---|---|
| FLB_ALL | Enable all features available | No |
| FLB_JEMALLOC | Use Jemalloc as default memory allocator | No |
| FLB_TLS | Build with SSL/TLS support | Yes |
| FLB_BINARY | Build executable | Yes |
| FLB_EXAMPLES | Build examples | Yes |
| FLB_SHARED_LIB | Build shared library | Yes |
| FLB_MTRACE | Enable mtrace support | No |
| FLB_INOTIFY | Enable Inotify support | Yes |
| FLB_POSIX_TLS | Force POSIX thread storage | No |
| FLB_SQLDB | Enable SQL embedded database support | No |
| FLB_HTTP_SERVER | Enable HTTP Server | No |
| FLB_LUAJIT | Enable Lua scripting support | Yes |
| FLB_RECORD_ACCESSOR | Enable record accessor | Yes |
| FLB_SIGNV4 | Enable AWS Signv4 support | Yes |
| FLB_STATIC_CONF | Build binary using static configuration files. The value of this option must be a directory containing configuration files. | |
| FLB_STREAM_PROCESSOR | Enable Stream Processor | Yes |
Development Options:

| Option | Description | Default |
|---|---|---|
| FLB_DEBUG | Build binaries with debug symbols | No |
| FLB_VALGRIND | Enable Valgrind support | No |
| FLB_TRACE | Enable trace mode | No |
| FLB_SMALL | Minimise binary size | No |
| FLB_TESTS_RUNTIME | Enable runtime tests | No |
| FLB_TESTS_INTERNAL | Enable internal tests | No |
| FLB_TESTS | Enable tests | No |
| FLB_BACKTRACE | Enable backtrace/stacktrace support | Yes |
Input Plugins Options (option names follow the FLB_IN_<PLUGIN> pattern):

| Option | Description | Default |
|---|---|---|
| FLB_IN_COLLECTD | Enable Collectd input plugin | On |
| FLB_IN_CPU | Enable CPU input plugin | On |
| FLB_IN_DISK | Enable Disk I/O Metrics input plugin | On |
| FLB_IN_DOCKER | Enable Docker metrics input plugin | On |
| FLB_IN_EXEC | Enable Exec input plugin | On |
| FLB_IN_FORWARD | Enable Forward input plugin | On |
| FLB_IN_HEAD | Enable Head input plugin | On |
| FLB_IN_HEALTH | Enable Health input plugin | On |
| FLB_IN_KMSG | Enable Kernel log input plugin | On |
| FLB_IN_MEM | Enable Memory input plugin | On |
| FLB_IN_MQTT | Enable MQTT Server input plugin | On |
| FLB_IN_NETIF | Enable Network I/O metrics input plugin | On |
| FLB_IN_PROC | Enable Process monitoring input plugin | On |
| FLB_IN_RANDOM | Enable Random input plugin | On |
| FLB_IN_SERIAL | Enable Serial input plugin | On |
| FLB_IN_STDIN | Enable Standard input plugin | On |
| FLB_IN_SYSLOG | Enable Syslog input plugin | On |
| FLB_IN_SYSTEMD | Enable Systemd / Journald input plugin | On |
| FLB_IN_TAIL | Enable Tail (follow files) input plugin | On |
| FLB_IN_TCP | Enable TCP input plugin | On |
| FLB_IN_THERMAL | Enable system temperature(s) input plugin | On |
| FLB_IN_WINLOG | Enable Windows Event Log input plugin (Windows Only) | On |
Filter Plugins Options (option names follow the FLB_FILTER_<PLUGIN> pattern):

| Option | Description | Default |
|---|---|---|
| FLB_FILTER_AWS | Enable AWS metadata filter | On |
| FLB_FILTER_EXPECT | Enable Expect data test filter | On |
| FLB_FILTER_GREP | Enable Grep filter | On |
| FLB_FILTER_KUBERNETES | Enable Kubernetes metadata filter | On |
| FLB_FILTER_LUA | Enable Lua scripting filter | On |
| FLB_FILTER_MODIFY | Enable Modify filter | On |
| FLB_FILTER_NEST | Enable Nest filter | On |
| FLB_FILTER_PARSER | Enable Parser filter | On |
| FLB_FILTER_RECORD_MODIFIER | Enable Record Modifier filter | On |
| FLB_FILTER_REWRITE_TAG | Enable Rewrite Tag filter | On |
| FLB_FILTER_STDOUT | Enable Stdout filter | On |
| FLB_FILTER_THROTTLE | Enable Throttle filter | On |
Output Plugins Options (option names follow the FLB_OUT_<PLUGIN> pattern):

| Option | Description | Default |
|---|---|---|
| FLB_OUT_AZURE | Enable Microsoft Azure output plugin | On |
| FLB_OUT_BIGQUERY | Enable Google BigQuery output plugin | On |
| FLB_OUT_COUNTER | Enable Counter output plugin | On |
| FLB_OUT_CLOUDWATCH_LOGS | Enable Amazon CloudWatch output plugin | On |
| FLB_OUT_DATADOG | Enable Datadog output plugin | On |
| FLB_OUT_ES | Enable Elasticsearch output plugin | On |
| FLB_OUT_FILE | Enable File output plugin | On |
| FLB_OUT_KINESIS_FIREHOSE | Enable Amazon Kinesis Data Firehose output plugin | On |
| FLB_OUT_KINESIS_STREAMS | Enable Amazon Kinesis Data Streams output plugin | On |
| FLB_OUT_FLOWCOUNTER | Enable Flowcounter output plugin | On |
| FLB_OUT_FORWARD | Enable Fluentd output plugin | On |
| FLB_OUT_GELF | Enable Gelf output plugin | On |
| FLB_OUT_HTTP | Enable HTTP output plugin | On |
| FLB_OUT_INFLUXDB | Enable InfluxDB output plugin | On |
| FLB_OUT_KAFKA | Enable Kafka output | Off |
| FLB_OUT_KAFKA_REST | Enable Kafka REST Proxy output plugin | On |
| FLB_OUT_LIB | Enable Lib output plugin | On |
| FLB_OUT_NATS | Enable NATS output plugin | On |
| FLB_OUT_NULL | Enable NULL output plugin | On |
| FLB_OUT_PGSQL | Enable PostgreSQL output plugin | On |
| FLB_OUT_PLOT | Enable Plot output plugin | On |
| FLB_OUT_SLACK | Enable Slack output plugin | On |
| FLB_OUT_S3 | Enable Amazon S3 output plugin | On |
| FLB_OUT_SPLUNK | Enable Splunk output plugin | On |
| FLB_OUT_STACKDRIVER | Enable Google Stackdriver output plugin | On |
| FLB_OUT_STDOUT | Enable STDOUT output plugin | On |
| FLB_OUT_TCP | Enable TCP/TLS output plugin | On |
| FLB_OUT_TD | Enable Treasure Data output plugin | On |
For production systems, we strongly suggest that you always get the latest stable release of the source code, in either zip or tarball format, from GitHub using the following link pattern:
https://github.com/fluent/fluent-bit/archive/refs/tags/v<release version>.tar.gz
https://github.com/fluent/fluent-bit/archive/refs/tags/v<release version>.zip
For example for version 1.8.12 the link is the following: https://github.com/fluent/fluent-bit/archive/refs/tags/v1.8.12.tar.gz
For anyone who aims to contribute to the project by testing or extending the code base, you can get the development version from our Git repository:
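```
git clone https://github.com/fluent/fluent-bit
```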
Note that our master branch is where the development of Fluent Bit happens. Since it's a development version, expect issues when compiling or at run time.
We encourage everybody to help us test every development version; in the end, this is what will become stable.
In normal operation, Fluent Bit can be configured through text files or specific command line arguments. While this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.
Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.
The following steps assume you are familiar with configuring Fluent Bit using text files and that you have experience building it from scratch, as described in the Build and Install section.
In your file system, prepare a specific directory that will be used as an entry point for the build system to look up and parse the configuration files. This directory must contain at a minimum one configuration file called fluent-bit.conf with the required SERVICE, INPUT and OUTPUT sections. As an example, create a new fluent-bit.conf file with the following content:
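```
[SERVICE]
    Flush     1
    Daemon    off
    Log_Level info

[INPUT]
    Name cpu

[OUTPUT]
    Name  stdout
    Match *
```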
The configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.
Inside the Fluent Bit source code, get into the build/ directory and run CMake appending the FLB_STATIC_CONF option pointing to the configuration directory recently created, e.g.:
then build it:
At this point the generated fluent-bit binary is ready to run without the need for further configuration:
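Putting the steps together (a sketch; replace /path/to/confdir with the directory created above):

```
cd fluent-bit/build/
cmake -DFLB_STATIC_CONF=/path/to/confdir ..
make
bin/fluent-bit
```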
Fluent Bit is distributed as the td-agent-bit package and is available for the latest (and older) stable Debian systems: Buster, Stretch and Jessie.
The first step is to add our server GPG key to your keyring; that way you can get our signed packages:
From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.
The GPG Key fingerprint of the new key is:
The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.
The GPG Key fingerprint of the old key is:
Refer to the supported platform documentation to see which platforms are supported in each release.
On Debian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:
Now let your system update the apt database:
We recommend upgrading your system (sudo apt-get upgrade); this could avoid potential issues with expired certificates.
Using the following apt-get command you can now install the latest td-agent-bit:
The next step is to instruct systemd to enable the service:
If you do a status check, you should see output similar to this:
The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output; you can see the outgoing data in your /var/log/syslog file.
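A consolidated sketch of the steps above (the suite name, buster here, should match your Debian release):

```
# Add the server GPG key
wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

# /etc/apt/sources.list entry:
#   deb https://packages.fluentbit.io/debian/buster buster main

sudo apt-get update
sudo apt-get install td-agent-bit

sudo systemctl enable td-agent-bit
sudo systemctl start td-agent-bit
sudo systemctl status td-agent-bit
```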
Fluent Bit is distributed as the td-agent-bit package and is available for the latest stable CentOS systems. The following architectures are supported:
x86_64
aarch64 / arm64v8
We provide td-agent-bit through a Yum repository. To add the repository reference to your system, create a new file called td-agent-bit.repo in /etc/yum.repos.d/ with the following content:
Note: we encourage you to always enable gpgcheck for security reasons. All our packages are signed.
From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.
The GPG Key fingerprint of the new key is:
The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.
The GPG Key fingerprint of the old key is:
Refer to the supported platform documentation to see which platforms are supported in each release.
Once your repository is configured, run the following command to install the package:
The next step is to instruct systemd to enable the service:
If you do a status check, you should see output similar to this:
The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output; you can see the outgoing data in your /var/log/messages file.
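A consolidated sketch of the steps above (the baseurl shown assumes CentOS 7; adjust for your release):

```
# /etc/yum.repos.d/td-agent-bit.repo
[td-agent-bit]
name = TD Agent Bit
baseurl = https://packages.fluentbit.io/centos/7/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
```

```
sudo yum install td-agent-bit
sudo systemctl enable td-agent-bit
sudo systemctl start td-agent-bit
sudo systemctl status td-agent-bit
```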
Fluent Bit is distributed as the td-agent-bit package and is available for the latest Amazon Linux 2. The following architectures are supported:
x86_64
aarch64 / arm64v8
We provide td-agent-bit through a Yum repository. To add the repository reference to your system, create a new file called td-agent-bit.repo in /etc/yum.repos.d/ with the following content:
Note: we encourage you to always enable gpgcheck for security reasons. All our packages are signed.
From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.
The GPG Key fingerprint of the new key is:
The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.
The GPG Key fingerprint of the old key is:
Refer to the supported platform documentation to see which platforms are supported in each release.
Once your repository is configured, run the following command to install the package:
The next step is to instruct systemd to enable the service:
If you do a status check, you should see output similar to this:
The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output; you can see the outgoing data in your /var/log/messages file.
AWS maintains a distribution of Fluent Bit combining the latest official release with a set of Go Plugins for sending logs to AWS services. AWS and Fluent Bit are working together to rewrite their plugins for inclusion in the official Fluent Bit distribution.
Fluent Bit includes the Amazon CloudWatch Logs plugin named cloudwatch_logs, the Amazon Kinesis Firehose plugin named kinesis_firehose, and the Amazon Kinesis Data Streams plugin named kinesis_streams, which offer higher performance than the Go plugins. Fluent Bit also includes an Amazon S3 output plugin named s3.
AWS vends SSM Public Parameters with the regional repository link for each image. These parameters can be queried by any AWS account.
To see a list of available version tags in a given region, run the following command:
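A sketch of the query (the parameter path is as published by AWS; the region is an example):

```
aws ssm get-parameters-by-path --region us-east-1 --path /aws/service/aws-for-fluent-bit/
```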
To see the ECR repository URI for a given image tag in a given region, run the following:
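For example (the tag 2.21.0 is hypothetical; substitute a real tag from the previous query):

```
aws ssm get-parameter --region us-east-1 --name /aws/service/aws-for-fluent-bit/2.21.0
```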
You can use these SSM public parameters as parameters in your CloudFormation templates:
Fluent Bit is distributed as the td-agent-bit package and is available for the latest stable Ubuntu system: Focal Fossa.
The first step is to add our server GPG key to your keyring; that way you can get our signed packages:
The GPG Key fingerprint of the new key is:
The GPG Key fingerprint of the old key is:
On Ubuntu, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:
Now let your system update the apt database:
We recommend upgrading your system (sudo apt-get upgrade); this could avoid potential issues with expired certificates.
Using the following apt-get command you can now install the latest td-agent-bit:
The next step is to instruct systemd to enable the service:
If you do a status check, you should see output similar to this:
The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output; you can see the outgoing data in your /var/log/syslog file.
Raspbian Buster (10)
Raspbian Stretch (9)
Raspbian Jessie (8)
The first step is to add our server GPG key to your keyring; that way you can get our signed packages:
The GPG Key fingerprint of the new key is:
The GPG Key fingerprint of the old key is:
On Debian and derivative systems such as Raspbian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:
Now let your system update the apt database:
We recommend upgrading your system (sudo apt-get upgrade); this could avoid potential issues with expired certificates.
Using the following apt-get command you can now install the latest td-agent-bit:
The next step is to instruct systemd to enable the service:
If you do a status check, you should see output similar to this:
The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output; you can see the outgoing data in your /var/log/syslog file.
Fluent Bit container images are available on Docker Hub ready for production usage. Current available images can be deployed in multiple architectures.
It's strongly suggested that you always use the latest image of Fluent Bit.
In addition, the main manifest provides images for the arm64v8 and arm32v7 architectures. From a deployment perspective there is no need to specify an architecture: the container client tool that pulls the image gets the proper layer for the running architecture.
For every architecture we build the layers using the base images listed in the table later in this document.
Download the last stable image from 1.8 series:
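```
docker pull fluent/fluent-bit:1.8
```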
Once the image is in place, now run the following (useless) test which makes Fluent Bit measure CPU usage by the container:
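```
docker run -ti fluent/fluent-bit:1.8 /fluent-bit/bin/fluent-bit -i cpu -o stdout -f 1
```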
That command will let Fluent Bit measure CPU usage every second and flush the results to the standard output.
Alpine Linux uses the Musl C library instead of Glibc. Musl is not fully compatible with Glibc, which generated many issues in the following areas when used with Fluent Bit:
Memory Allocator: to run Fluent Bit properly in high-load environments, we use Jemalloc as the default memory allocator, which reduces fragmentation and provides better performance for our needs. Jemalloc cannot run smoothly with Musl and requires extra work.
Alpine Linux Musl function bootstrapping has a compatibility issue when loading Golang shared libraries; this generates problems when trying to load Golang output plugins in Fluent Bit.
The Alpine Linux Musl time format parser does not support Glibc extensions.
The maintainers' preference for base images, for security and maintenance reasons, is Distroless and Debian.
Our Docker container images are deployed thousands of times per day; we take security and stability very seriously.
The latest tag most of the time points to the latest stable image. When we release a major update to Fluent Bit, for example from v1.3.x to v1.4.0, we don't move the latest tag until two weeks after the release. That gives us extra time to verify with our community that everything works as expected.
Currently, the image contains Go Plugins for:
AWS vends their container image via Docker Hub and a set of highly available regional Amazon ECR repositories. For more information, see the AWS for Fluent Bit GitHub repository. The AWS for Fluent Bit image uses a custom versioning scheme because it contains multiple projects. To see what each release contains, check out the release notes in that repository.
From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key, so ensure this new one is added. The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions. Refer to the supported platform documentation to see which platforms are supported in each release.
Fluent Bit is distributed as the td-agent-bit package and is available for Raspberry Pi, specifically the Raspbian distribution; the supported versions are Raspbian Buster (10), Raspbian Stretch (9) and Raspbian Jessie (8). From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key, so ensure this new one is added. The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions. Refer to the supported platform documentation to see which platforms are supported in each release.
The following table describes the tags that are available on the Docker Hub fluent/fluent-bit repository:

| Tag(s) | Manifest Architectures | Description |
|---|---|---|
| 1.8, 1.8.15 | x86_64, arm64v8, arm32v7 | |
| 1.8-debug, 1.8.15-debug | x86_64 | v1.8.x releases (production + debug) |
| 1.8.14 | x86_64, arm64v8, arm32v7 | |
| 1.8.14-debug | x86_64 | v1.8.x releases (production + debug) |
| 1.8.13 | x86_64, arm64v8, arm32v7 | |
| 1.8.13-debug | x86_64 | v1.8.x releases (production + debug) |
| 1.8.12 | x86_64, arm64v8, arm32v7 | |
| 1.8.12-debug | x86_64 | v1.8.x releases (production + debug) |
| 1.8.11 | x86_64, arm64v8, arm32v7 | |
| 1.8.11-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.10 | x86_64, arm64v8, arm32v7 | |
| 1.8.10-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.9 | x86_64, arm64v8, arm32v7 | |
| 1.8.9-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.8 | x86_64, arm64v8, arm32v7 | |
| 1.8.8-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.7 | x86_64, arm64v8, arm32v7 | |
| 1.8.7-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.6 | x86_64, arm64v8, arm32v7 | |
| 1.8.6-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.5 | x86_64, arm64v8, arm32v7 | |
| 1.8.5-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.4 | x86_64, arm64v8, arm32v7 | |
| 1.8.4-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.3 | x86_64, arm64v8, arm32v7 | |
| 1.8.3-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.2 | x86_64, arm64v8, arm32v7 | |
| 1.8.2-debug | x86_64 | v1.8.x releases + Busybox |
| 1.8.1 | x86_64, arm64v8, arm32v7 | |
| 1.8.1-debug | x86_64 | v1.8.x releases + Busybox |

Our x86_64 stable image is based on Distroless, focusing on security: it contains just the Fluent Bit binary, minimal system libraries and basic configuration. Optionally, we provide debug images for x86_64 which contain a full shell and package manager that can be used to troubleshoot or for testing purposes.

For every architecture we build the layers using the following base images:

| Architecture | Base Image |
|---|---|
| x86_64 | Distroless |
| arm64v8 | arm64v8/debian:bullseye-slim |
| arm32v7 | arm32v7/debian:bullseye-slim |
The Fluent Bit source code provides Bitbake recipes to configure, build and package the software for a Yocto-based image. Note that the specific steps for using these recipes in your Yocto environment (Poky) are out of the scope of this documentation.
We distribute two main recipes: one for testing/development purposes and another for the latest stable release.
It's strongly recommended to always use the stable release recipe of Fluent Bit, and not the one from Git master, for production deployments.
Fluent Bit >= v1.1.x fully supports x86_64, x86, arm32v7 and arm64v8.
Kubernetes Production Grade Log Processor
Fluent Bit is a lightweight and extensible Log Processor that comes with full support for Kubernetes:
Process Kubernetes containers logs from the file system or Systemd/Journald.
Enrich logs with Kubernetes Metadata.
Centralize your logs in third party storage services like Elasticsearch, InfluxDB, HTTP, etc.
Before getting started it is important to understand how Fluent Bit will be deployed. Kubernetes manages a cluster of nodes, so our log agent tool needs to run on every node to collect logs from every Pod; hence Fluent Bit is deployed as a DaemonSet (a Pod that runs on every node of the cluster).
When Fluent Bit runs, it will read, parse and filter the logs of every Pod and will enrich each entry with the following metadata:
Pod Name
Pod ID
Container Name
Container ID
Labels
Annotations
To obtain this information, the built-in filter plugin called kubernetes talks to the Kubernetes API Server to retrieve relevant information such as the pod_id, labels and annotations; other fields such as pod_name, container_id and container_name are retrieved locally from the log file names. All of this is handled automatically; no intervention is required on the configuration side.
Our Kubernetes Filter plugin is fully inspired by the Fluentd Kubernetes Metadata Filter written by Jimmi Dyson.
Fluent Bit must be deployed as a DaemonSet so that it will be available on every node of your Kubernetes cluster. To get started, run the following commands to create the namespace, service account and role setup:
For Kubernetes v1.21 and below
For Kubernetes v1.22
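A sketch of the typical sequence, assuming the manifests hosted in the fluent/fluent-bit-kubernetes-logging repository (file names taken from that repo; the role files have a -1.22 variant for Kubernetes v1.22, shown here):

```
kubectl create namespace logging
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-1.22.yaml
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-binding-1.22.yaml
```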
The next step is to create a ConfigMap that will be used by our Fluent Bit DaemonSet:
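Assuming the Elasticsearch-oriented ConfigMap from the same repository (the path is an assumption):

```
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-configmap.yaml
```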
The default ConfigMap assumes that dockershim is utilized for the cluster. If a CRI runtime, such as containerd or CRI-O, is being utilized, the CRI parser should be used instead; more specifically, change the Parser described in input-kubernetes.conf from docker to cri.
If you are using Red Hat OpenShift you will also need to run the following
For Kubernetes versions older than v1.16, the DaemonSet resource is not available on apps/v1; it is available on apiVersion: extensions/v1beta1. Our current DaemonSet YAML files use the new apiVersion.
If you are using an older Kubernetes version, manually grab a copy of your DaemonSet YAML file and replace the value of apiVersion from apps/v1 to extensions/v1beta1.
You can read more about this deprecation on Kubernetes v1.14 Changelog here:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#deprecations
Fluent Bit DaemonSet ready to be used with Elasticsearch on a normal Kubernetes Cluster:
If you are using Minikube for testing purposes, use the following alternative DaemonSet manifest:
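A sketch, assuming the DaemonSet manifests from the fluent/fluent-bit-kubernetes-logging repository (paths are assumptions):

```
# Normal cluster
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds.yaml

# Minikube
kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds-minikube.yaml
```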
Helm is a package manager for Kubernetes and allows you to quickly deploy application packages into your running cluster. Fluent Bit is distributed via a helm chart found in the Fluent Helm Charts repo: https://github.com/fluent/helm-charts.
To add the Fluent Helm Charts repo, use the following command:
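```
helm repo add fluent https://fluent.github.io/helm-charts
```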
To validate that the repo was added, you can run helm search repo fluent to ensure the charts were listed. The default chart can then be installed by running the following:
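```
helm upgrade --install fluent-bit fluent/fluent-bit
```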
The default chart values include configuration to read container logs (with Docker parsing) and systemd logs, apply Kubernetes metadata enrichment, and finally output to an Elasticsearch cluster. You can modify the included values file (https://github.com/fluent/helm-charts/blob/master/charts/fluent-bit/values.yaml) to specify additional outputs, health checks, monitoring endpoints, or other configuration options.
The default configuration of Fluent Bit ensures the following:
Consume all containers logs from the running Node.
The Tail input plugin will not append more than 5MB into the engine until the records are flushed to the Elasticsearch backend. This limit aims to provide a workaround for backpressure scenarios.
The Kubernetes filter will enrich the logs with Kubernetes metadata, specifically labels and annotations. The filter only goes to the API Server when it cannot find the cached info, otherwise it uses the cache.
The default backend in the configuration is Elasticsearch, set by the Elasticsearch Output Plugin. It uses the Logstash format to ingest the logs. If you need a different Index and Type, please refer to the plugin options and make your own adjustments.
There is an option called Retry_Limit set to False, which means that if Fluent Bit cannot flush the records to Elasticsearch, it will retry indefinitely until it succeeds.
Fluent Bit by default assumes that logs are formatted by the Docker interface standard. However, when using CRI you can run into issues with malformed JSON if you do not modify the parser used. Fluent Bit includes a CRI log parser that can be used instead. An example of the parser is shown below:
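```
[PARSER]
    # CRI log format parser
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
```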
To use this parser, change the Input section of your configuration from docker to cri.
Since v1.5.0, Fluent Bit supports deployment to Windows pods.
When deploying Fluent Bit to Kubernetes, there are three log files that you need to pay attention to.
C:\k\kubelet.err.log
This is the error log file from kubelet daemon running on host.
You will need to retain this file for future troubleshooting (to debug deployment failures etc.)
C:\var\log\containers\<pod>_<namespace>_<container>-<docker>.log
This is the main log file you need to watch. Configure Fluent Bit to follow this file.
It is actually a symlink to the Docker log file in C:\ProgramData\, with some additional metadata in its file name.
C:\ProgramData\Docker\containers\<docker>\<docker>.log
This is the log file produced by Docker.
Normally you don't directly read from this file, but you need to make sure that this file is visible from Fluent Bit.
Typically, your deployment YAML contains the following volume configuration:
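A sketch of the typical hostPath volumes (a fragment; the names are illustrative):

```
# In the Pod spec:
volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: progdata
    hostPath:
      path: /ProgramData

# In the container spec:
volumeMounts:
  - name: varlog
    mountPath: /var/log
  - name: progdata
    mountPath: /ProgramData
```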
Assuming the basic volume configuration described above, you can apply the following config to start logging:
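A sketch of such a config (the parser and output choices are illustrative):

```
[SERVICE]
    Parsers_File parsers.conf

[INPUT]
    Name   tail
    Tag    kube.*
    Path   C:\var\log\containers\*.log
    Parser docker

[FILTER]
    Name   kubernetes
    Match  kube.*

[OUTPUT]
    Name   stdout
    Match  *
```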
Windows pods often lack working DNS immediately after boot (kubernetes issue #78479). To mitigate this issue, filter_kubernetes provides a built-in mechanism to wait until the network starts up:

DNS_Retries - Retries N times until the network starts working (default: 6)
DNS_Wait_Time - Lookup interval between network status checks (default: 30)
By default, Fluent Bit waits for 3 minutes (30 seconds x 6 times). If it's not enough for you, tweak the configuration as follows.
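For example, to wait up to one hour (the values here are just an illustration):

```
[FILTER]
    Name          kubernetes
    Match         kube.*
    DNS_Retries   120
    DNS_Wait_Time 30
```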
Fluent Bit is distributed as the td-agent-bit package for Windows. Fluent Bit has two flavours of Windows installers: a ZIP archive (for quick testing) and an EXE installer (for system installation).
Currently the default configuration is intended for Linux only, so it will not function on Windows. Make sure to provide a valid Windows configuration with the installation; a sample one is shown below:
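A minimal sketch (the winlog channel chosen is illustrative):

```
[SERVICE]
    Flush        5
    Daemon       off
    Log_Level    info

[INPUT]
    # Read from the Windows Event Log
    Name         winlog
    Channels     Setup
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```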
The latest stable version is 1.8.15. Each version is available on the GitHub releases page as well as at https://fluentbit.io/releases/<Major Version>/fluent-bit-<Full Version>-win[32|64].exe:
Legacy td-agent-bit packages are also available; just substitute fluent-bit with td-agent-bit in the URLs above.
To check the integrity, use the Get-FileHash cmdlet on PowerShell.
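```
PS> Get-FileHash fluent-bit-1.8.15-win64.exe
```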
Download a ZIP archive from above. There are installers for 32-bit and 64-bit environments, so choose one suitable for your environment.
Then you need to expand the ZIP archive. You can do this by clicking "Extract All" in Explorer or, if you're using PowerShell, with the Expand-Archive cmdlet.
The ZIP package contains the following set of files.
Now, launch cmd.exe or PowerShell on your machine, and execute fluent-bit.exe as follows:
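```
PS> .\bin\fluent-bit.exe -i dummy -o stdout
```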
If you see the following output, it's working fine!
To halt the process, press CTRL-C in the terminal.
Download an EXE installer from the download page. It has both 32-bit and 64-bit builds. Choose one which is suitable for you.
Then, double-click the EXE installer you've downloaded. The installation wizard will automatically start.
Click Next and proceed. By default, Fluent Bit is installed into C:\Program Files\td-agent-bit\, so you should be able to launch fluent-bit as follows after installation.
The Windows installer is built by CPack using NSIS (https://cmake.org/cmake/help/latest/cpack_gen/nsis.html), and so it supports the default options that all NSIS installers provide for silent installation and for selecting the installation directory.
To silently install to the C:\fluent-bit directory, here is an example:
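A sketch (the installer file name is illustrative; /S and /D= are the standard NSIS silent-install options):

```
PS> .\fluent-bit-1.8.15-win64.exe /S /D=C:\fluent-bit
```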
The automatically provided uninstaller also supports a silent uninstall using the same /S flag. This may be useful for provisioning with automation like Ansible, Puppet, etc.
Windows services are equivalent to "daemons" in UNIX (i.e. long-running background processes). Since v1.5.0, Fluent Bit has native support for Windows services.
Suppose you have the following installation layout:
To register Fluent Bit as a Windows service, you need to execute the following command on Command Prompt. Be careful: a single space is required after binpath=.
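A sketch, assuming an installation layout under C:\fluent-bit:

```
sc.exe create fluent-bit binpath= "C:\fluent-bit\bin\fluent-bit.exe -c C:\fluent-bit\conf\fluent-bit.conf"
```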
Now Fluent Bit can be started and managed as a normal Windows service.
To halt the Fluent Bit service, just execute the "stop" command.
To start Fluent Bit automatically on boot, execute the following:
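```
sc.exe start fluent-bit
sc.exe stop fluent-bit

rem Start automatically on boot
sc.exe config fluent-bit start= auto
```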
Quotations are required if file paths contain spaces, for example paths under C:\Program Files. Here is an example:
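A sketch: the inner quotes are escaped with backslashes so sc.exe passes the spaced paths through intact.

```
sc.exe create fluent-bit binpath= "\"C:\Program Files\fluent-bit\bin\fluent-bit.exe\" -c \"C:\Program Files\fluent-bit\conf\fluent-bit.conf\""
```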
Instead of sc.exe, PowerShell can be used to manage Windows services.
Create a Fluent Bit service:
Start the service:
Query the service status:
Stop the service:
Remove the service (requires PowerShell 6.0 or later)
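A sketch of the equivalent PowerShell cmdlets (paths are assumed; adjust to your layout):

```powershell
# Create the service
New-Service fluent-bit -BinaryPathName '"C:\fluent-bit\bin\fluent-bit.exe" -c "C:\fluent-bit\conf\fluent-bit.conf"' -StartupType Automatic

# Start / query / stop
Start-Service fluent-bit
Get-Service fluent-bit
Stop-Service fluent-bit

# Remove (requires PowerShell 6.0 or later)
Remove-Service fluent-bit
```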
If you need to create a custom executable, you can use the following procedure to compile Fluent Bit by yourself.
First, you need Microsoft Visual C++ to compile Fluent Bit. You can install the minimum toolkit by the following command:
When asked which packages to install, choose "C++ Build Tools" (make sure that "C++ CMake tools for Windows" is selected too) and wait until the process finishes.
You also need to install flex and bison; one way to install them on Windows is to use winflexbison.
Add the path C:\WinFlexBison to your system's "Path" environment variable.
You also need to install Git to pull the source code from the repository.
Open the start menu on Windows and type "Developer Command Prompt".
Clone the source code of Fluent Bit.
Compile the source code.
Now you should be able to run Fluent Bit:
To create a ZIP package, call cpack as follows:
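A consolidated sketch of the clone/build/package sequence (the generator choice is an assumption):

```
git clone https://github.com/fluent/fluent-bit
cd fluent-bit\build
cmake .. -G "NMake Makefiles"
cmake --build .
cpack -G ZIP
```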
Fluent Bit supports the usage of environment variables in any value associated to a key when using a configuration file.
The variables are case sensitive and are referenced in the format ${MY_VARIABLE}.
When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve its value.
Create the following configuration file (fluent-bit.conf):
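```
[SERVICE]
    Flush        1
    Daemon       off
    Log_Level    info

[INPUT]
    Name cpu
    Tag  cpu.local

[OUTPUT]
    Name  ${MY_OUTPUT}
    Match *
```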
Open a terminal and set the environment variable:
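```
export MY_OUTPUT=stdout
```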
The above command sets the value 'stdout' for the variable MY_OUTPUT.
Run Fluent Bit with the recently created configuration file:
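```
fluent-bit -c fluent-bit.conf
```

(Invoke the binary from wherever it is installed, e.g. bin/fluent-bit in a source build.)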
As you can see the service worked properly as the configuration was valid.
Fluent Bit can optionally use a configuration file to define how the service will behave. Before proceeding we need to understand how the configuration schema works.
The schema is defined by three concepts:
Sections
Entries: Key/Value
Indented Configuration Mode
A simple example of a configuration file is as follows:
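```
[SERVICE]
    # This is a commented line
    Daemon    off
    Log_Level debug
```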
A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using [SERVICE] definition. Section rules:
All section content must be indented (4 spaces ideally).
Multiple sections can exist on the same file.
A section is expected to have comments and entries; it cannot be empty.
Any commented line under a section must be indented too.
A section may contain Entries. An entry is defined by a line of text that contains a Key and a Value. Using the example above, the [SERVICE] section contains two entries: one is the key Daemon with value off, and the other is the key Log_Level with the value debug. Entry rules:
An entry is defined by a key and a value.
A key must be indented.
A key must contain a value, which ends at the line break.
Multiple keys with the same name can exist.
Commented lines are set by prefixing them with the # character; those lines are not processed, but they must be indented too.
Fluent Bit configuration files are based on a strict indented mode: each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example:
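```
[FIRST_SECTION]
    # This is a commented line
    Key1  some value
    Key2  another value

[SECOND_SECTION]
    # More comments
    KeyN  3.14
```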
As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented.
This page describes the main configuration file used by Fluent Bit
The main configuration file supports four types of sections:
Service
Input
Filter
Output
In addition, it's also possible to split the main configuration file in multiple files using the feature to include external files:
Include File
The Service section defines global properties of the service, the keys available as of this version are described in the following table:
The following is an example of a SERVICE section:
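```
[SERVICE]
    Flush     5
    Daemon    off
    Log_Level debug
```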
An INPUT section defines a source (related to an input plugin). Here we will describe the base configuration for each INPUT section; note that each input plugin may add its own configuration keys:
The Name is mandatory; it lets Fluent Bit know which input plugin should be loaded. The Tag is mandatory for all plugins except for the input forward plugin (as it provides dynamic tags).
The following is an example of an INPUT section:
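```
[INPUT]
    Name cpu
    Tag  my_cpu
```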
A FILTER section defines a filter (related to a filter plugin). Here we will describe the base configuration for each FILTER section; note that each filter plugin may add its own configuration keys:
The Name is mandatory; it lets Fluent Bit know which filter plugin should be loaded. Match or Match_Regex is mandatory for all plugins; if both are specified, Match_Regex takes precedence.
The following is an example of a FILTER section:
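```
[FILTER]
    Name  grep
    Match *
    # Keep records whose 'log' field matches 'aa'
    Regex log aa
```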
The OUTPUT section specifies a destination that certain records should follow after a Tag match. The configuration supports the following keys:
The following is an example of an OUTPUT section:
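```
[OUTPUT]
    Name  stdout
    Match my*cpu
```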
The following configuration file example demonstrates how to collect CPU metrics and flush the results every five seconds to the standard output:
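```
[SERVICE]
    Flush     5
    Daemon    off
    Log_Level debug

[INPUT]
    Name cpu
    Tag  my_cpu

[OUTPUT]
    Name  stdout
    Match my*cpu
```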
To avoid complicated long configuration files, it is better to split specific parts into different files and call them (include them) from one main file.
Starting from Fluent Bit 0.12 the new configuration command @INCLUDE has been added and can be used in the following way:
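```
@INCLUDE somefile.conf
```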
The configuration reader will try to open the path somefile.conf, if not found, it will assume it's a relative path based on the path of the base configuration file, e.g:
Main configuration file path: /tmp/main.conf
Included file: somefile.conf
Fluent Bit will try to open somefile.conf, if it fails it will try /tmp/somefile.conf.
The @INCLUDE command only works at top-left level of the configuration line; it cannot be used inside sections.
Wildcard character (*) is supported to include multiple files, e.g:
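```
@INCLUDE input_*.conf
```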
One of the ways to configure Fluent Bit is using a main configuration file. Fluent Bit allows the use of one configuration file that works at a global scope and uses the schema defined previously.
You can also visualize Fluent Bit INPUT, FILTER, and OUTPUT configuration via Calyptia.
Version | Description |
---|---|
devel | Build Fluent Bit from GIT master. This recipe aims to be used for development and testing purposes only. |
v1.8.12 | Build the latest stable version of Fluent Bit. |
Base configuration keys for the INPUT section:

Key | Description |
---|---|
Name | Name of the input plugin. |
Tag | Tag name associated to all records coming from this plugin. |
Base configuration keys for the FILTER section:

Key | Description |
---|---|
Name | Name of the filter plugin. |
Match | A pattern to match against the tags of incoming records. It's case sensitive and supports the star (*) character as a wildcard. |
Match_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. |
Base configuration keys for the OUTPUT section:

Key | Description |
---|---|
Name | Name of the output plugin. |
Match | A pattern to match against the tags of incoming records. It's case sensitive and supports the star (*) character as a wildcard. |
Match_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. |
Keys available in the SERVICE section:

Key | Description | Default Value |
---|---|---|
flush | Set the flush time in seconds.nanoseconds. The engine loop uses the Flush timeout to define when the records ingested by input plugins should be flushed through the defined output plugins. | 5 |
grace | Set the grace time in seconds; the engine uses it as a wait time on exit. | 5 |
daemon | Boolean value to set if Fluent Bit should run as a Daemon (background) or not. Allowed values are: yes, no, on and off. Note: if you are using a Systemd based unit, such as the one we provide in our packages, do not turn on this option. | Off |
dns.mode | Set the primary transport layer protocol used by the asynchronous DNS resolver; it can be overridden on a per plugin basis. | UDP |
log_file | Absolute path for an optional log file. By default all logs are redirected to the standard error interface (stderr). | |
log_level | Set the logging verbosity level. Allowed values are: off, error, warn, info, debug and trace. Values are accumulative, e.g: if 'debug' is set, it will include error, warning, info and debug. Note that trace mode is only available if Fluent Bit was built with the WITH_TRACE option enabled. | info |
parsers_file | Path for a parsers configuration file. | |
plugins_file | Path for a plugins configuration file. A plugins configuration file allows the definition of paths for external plugins. | |
streams_file | Path for the Stream Processor configuration file. | |
http_server | Enable the built-in HTTP Server. | Off |
http_listen | Set the listening interface for the HTTP Server when it's enabled. | 0.0.0.0 |
http_port | Set the TCP port for the HTTP Server. | 2020 |
coro_stack_size | Set the coroutine stack size in bytes. The value must be greater than the page size of the running system. Don't set it to a very small value (say 4096), or coroutine threads can overrun the stack buffer. Do not change the default value of this parameter unless you know what you are doing. | 24576 |
scheduler.cap | Set a maximum retry time in seconds. The property is supported from v1.8.7. | 2000 |
scheduler.base | Set the base of the exponential backoff. The property is supported from v1.8.7. | 5 |
Configuration files must be flexible enough for any deployment need, but they must keep a clean and readable format.
Fluent Bit Commands extend a configuration file with specific built-in features. The list of commands available as of the Fluent Bit 0.12 series is:
Configuring a logging pipeline might lead to an extensive configuration file. In order to maintain a human-readable configuration, it's suggested to split the configuration in multiple files.
The @INCLUDE command allows the configuration reader to include an external configuration file, e.g:
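```
[SERVICE]
    Flush 1

@INCLUDE inputs.conf
@INCLUDE outputs.conf
```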
The above example defines the main service configuration and also includes two files to continue the configuration:
Note that despite the order of inclusion, Fluent Bit will ALWAYS respect the following order:
Service
Inputs
Filters
Outputs
Fluent Bit supports configuration variables. One way to expose these variables to Fluent Bit is through setting a Shell environment variable; the other is through the @SET command.
The @SET command can only be used at the root level of each line; it cannot be used inside a section, e.g:
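```
@SET my_input=cpu
@SET my_output=stdout

[SERVICE]
    Flush 1

[INPUT]
    Name ${my_input}

[OUTPUT]
    Name ${my_output}
```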
An Upstream defines a set of nodes that will be targeted by an output plugin; by the nature of the implementation, an output plugin must support the Upstream feature. Currently the Forward output plugin has Upstream support.
The current balancing mode implemented is round-robin.
To define an Upstream it's required to create a specific configuration file that contains an UPSTREAM and one or multiple NODE sections. The following table describes the properties associated with each section; note that all of them are mandatory:
A Node might contain additional configuration keys required by the plugin; in that way we provide enough flexibility for the output plugin. A common use case is the Forward output: if TLS is enabled, it requires a shared key (more details in the example below).
In addition to the properties defined in the table above, the network operations against a defined node can optionally be done through TLS for further encryption and the use of certificates.
The following example defines an Upstream called forward-balancing, which aims to be used by the Forward output plugin; it registers three Nodes:
node-1: connects to 127.0.0.1:43000
node-2: connects to 127.0.0.1:44000
node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by the Forward output called shared_key.
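Put together, the Upstream definition would look like this (the shared key value is illustrative):

```
[UPSTREAM]
    name       forward-balancing

[NODE]
    name       node-1
    host       127.0.0.1
    port       43000

[NODE]
    name       node-2
    host       127.0.0.1
    port       44000

[NODE]
    name       node-3
    host       127.0.0.1
    port       45000
    tls        on
    tls.verify off
    shared_key secret
```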
Note that every Upstream definition must exist in its own configuration file in the file system. Adding multiple Upstreams in the same file or different files is not allowed.
It's common for Fluent Bit to connect to external services to deliver logs over the network, as is the case for several output plugins. Being able to connect to one node (host) is normal and enough for most of the use cases, but there are other scenarios where balancing across different nodes is required. The Upstream feature provides such capability.
The TLS options available are described in the TLS section of this document and can be added to any Node section.
Certain configuration directives in Fluent Bit refer to unit sizes, such as when defining the size of a buffer or specific limits; these can be found in several plugins and in generic properties.
Starting from v0.11.10, all unit sizes have been standardized across the core and plugins. The following table describes the options that can be used and what they mean:
Command | Prototype | Description |
---|---|---|
@INCLUDE | @INCLUDE FILE | Include a configuration file. |
@SET | @SET KEY=VAL | Set a configuration variable. |
Section | Key | Description |
---|---|---|
UPSTREAM | name | Defines a name for the Upstream in question. |
NODE | name | Defines a name for the Node in question. |
NODE | host | IP address or hostname of the target host. |
NODE | port | TCP port of the target service. |
Suffix | Description | Example |
---|---|---|
(none) | When a suffix is not specified, it's assumed that the value given is a bytes representation. | Specifying a value of 32000 means 32000 bytes. |
k, K, KB, kb | Kilobyte: a unit of memory equal to 1,000 bytes. | 32k means 32000 bytes. |
m, M, MB, mb | Megabyte: a unit of memory equal to 1,000,000 bytes. | 1M means 1000000 bytes. |
g, G, GB, gb | Gigabyte: a unit of memory equal to 1,000,000,000 bytes. | 1G means 1000000000 bytes. |
In an ideal world, applications would log their messages in a single line, but in reality applications generate multiple log messages that sometimes belong to the same context. When it's time to process such information, it gets really complex; consider application stack traces, which always span multiple log lines.
Starting from Fluent Bit v1.8, we have implemented a unified Multiline core functionality to solve all the user corner cases. In this section, you will learn about the features and configuration options available.
The Multiline parser engine exposes two ways to configure and use the functionality:
Built-in multiline parser
Configurable multiline parser
Without any extra configuration, Fluent Bit exposes certain pre-configured parsers (built-in) to solve specific multiline parser cases, e.g:
Besides the built-in parsers listed above, it is possible to define your own Multiline parsers, with their own rules, through the configuration files.
A multiline parser is defined in a parsers configuration file by using a [MULTILINE_PARSER] section definition. The Multiline parser must have a unique name and a type, plus other configured properties associated with each type.
To understand which Multiline parser type is required for your use case, you have to know beforehand what conditions in the content determine the beginning of a multiline message and the continuation of subsequent lines. We provide a regex based configuration that supports states, to handle cases from the simplest to the most difficult.
Before you start configuring your parser, you need to know the answers to the following questions:
What is the regular expression (regex) that matches the first line of a multiline message?
What are the regular expressions (regex) that match the continuation lines of a multiline message?
When matching regexes, we have to define states: some states define the start of a multiline message, while others are states for the continuation of multiline messages. You can have multiple continuation state definitions to solve complex cases.
The regex that matches the start of a multiline message is assigned the state called start_state; the regexes for continuation lines can have different state names.
A rule specifies how to match a multiline pattern and perform the concatenation. A rule is defined by 3 specific components:
state name
regular expression pattern
next state
A rule might be defined as follows (comments added to simplify the definition):
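```
# rules |   state name   | regex pattern                           | next state
# ------|----------------|-----------------------------------------|-----------
rule      "start_state"    "/([A-Za-z]+ \d+ \d+\:\d+\:\d+)(.*)/"     "cont"
rule      "cont"           "/^\s+at.*/"                              "cont"
```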
In the example above, we have defined two rules; each one has its own state name, regex pattern, and next state name. Every field that composes a rule must be inside double quotes.
The state name of the first rule must always be start_state, and its regex pattern must match the first line of a multiline message; a next state must also be set to specify what the possible continuation lines would look like.
To simplify the configuration of regular expressions, you can use the Rubular web site. We have posted an example by using the regex described above plus a log line that matches the pattern: https://rubular.com/r/NDuyKwlTGOvq2g
The following example provides a full Fluent Bit configuration file for multiline parsing by using the definition explained above.
The following example files can be located at: https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/regex-001
Example files content:
This is the primary Fluent Bit configuration file. It includes the parsers_multiline.conf and tails the file test.log, applying the multiline parser multiline-regex-test; then it sends the processed output to the standard output.
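A sketch of that primary file:

```
[SERVICE]
    flush                 1
    log_level             info
    parsers_file          parsers_multiline.conf

[INPUT]
    name                  tail
    path                  test.log
    read_from_head        true
    multiline.parser      multiline-regex-test

[OUTPUT]
    name                  stdout
    match                 *
```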
This second file defines a multiline parser for the example.
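A sketch of the parser file, using the rules described earlier:

```
[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    #
    # rules |   state name  | regex pattern                          | next state
    # ------|---------------|----------------------------------------|-----------
    rule      "start_state"   "/([A-Za-z]+ \d+ \d+\:\d+\:\d+)(.*)/"    "cont"
    rule      "cont"          "/^\s+at.*/"                             "cont"
```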
An example file with multiline content:
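```
single line...
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.main(MyProject.java:6)
another line...
```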
Running Fluent Bit with the given configuration file, the lines that did not match a pattern are not considered part of the multiline message, while the ones that matched the rules are concatenated properly.
A full feature set to access content of your records
Fluent Bit works internally with structured records, which can be composed of an unlimited number of keys and values. Values can be anything like a number, string, array, or map.
Having a way to select a specific part of the record is critical for certain core functionalities and plugins; this feature is called Record Accessor.
Consider the Record Accessor a simple grammar to specify record content and other miscellaneous values. As an example, take the following structured record:
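```
{
  "log": "some message",
  "stream": "stdout",
  "labels": {
     "color": "blue",
     "unset": null,
     "project": {
         "env": "production"
      }
  }
}
```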
A record accessor rule starts with the character $. Using the structured content above as an example, the following table describes some accessing rules and the expected returned values:
If the accessor key does not exist in the record, as in the last example $labels['undefined'], the operation is simply omitted; no exception will occur.
The feature is enabled on a per plugin basis; not all plugins enable it. As an example, consider a configuration that aims to filter records using the grep filter, matching only records whose labels have the color blue:
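```
[SERVICE]
    flush        1
    log_level    info
    parsers_file parsers.conf

[INPUT]
    name         tail
    path         test.log
    parser       json

[FILTER]
    name         grep
    match        *
    regex        $labels['color'] ^blue$

[OUTPUT]
    name         stdout
    match        *
    format       json_lines
```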
The file content to process in test.log is the following:
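```
{"log": "message 1", "labels": {"color": "blue"}}
{"log": "message 2", "labels": {"color": "red"}}
{"log": "message 3", "labels": {"color": "green"}}
{"log": "message 4", "labels": {"color": "blue"}}
```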
Running Fluent Bit with the configuration above, the output will contain only the records whose $labels['color'] equals blue.
Fluent Bit provides integrated support for Transport Layer Security (TLS) and its predecessor Secure Sockets Layer (SSL); in this section we will refer to both as TLS.
Each output plugin that needs to perform Network I/O can optionally enable TLS and configure its behavior. The following table describes the properties available:
The listed properties can be enabled in the configuration file, specifically on each output plugin section or directly through the command line.
A number of output plugins can take advantage of the full TLS feature set, while other plugins implement a subset of TLS support, meaning restricted configuration.
By default the HTTP output plugin uses plain TCP; enabling TLS from the command line can be done with:
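```
fluent-bit -i cpu -t cpu -o http://192.168.2.3:80/something \
    -p tls=on         \
    -p tls.verify=off \
    -m '*'
```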
In the command line above, the two properties tls and tls.verify were enabled for demonstration purposes (we strongly suggest always keeping verification ON).
The same behavior can be accomplished using a configuration file:
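```
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name       http
    Match      *
    Host       192.168.2.3
    Port       80
    URI        /something
    tls        On
    tls.verify Off
```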
By default, when Fluent Bit processes data, it uses memory as a primary and temporary place to store the records, but there are certain scenarios where it would be ideal to have a persistent buffering mechanism based on the filesystem to provide aggregation and data safety capabilities.
Choosing the right configuration is critical, and the behavior of the service can be conditioned by the backpressure settings. Before jumping into the configuration properties, let's understand the relationship between Chunks, Memory, Filesystem and Backpressure.
Understanding the chunks, buffering and backpressure concepts is critical for a proper configuration. Let's do a recap of the meaning of these concepts.
When an input plugin (source) emits records, the engine groups the records together in a Chunk. A Chunk's size is usually around 2MB. By configuration, the engine decides where to place this Chunk; the default is that all chunks are created only in memory.
As mentioned above, the Chunks generated by the engine are placed in memory but this is configurable.
If memory is the only mechanism set for the input plugin, it will store as much data as it can in memory. This is the fastest mechanism with the least system overhead, but if the service is not able to deliver the records fast enough, because of a slow network or an unresponsive remote service, Fluent Bit's memory usage will increase, since it accumulates more data than it can deliver.
In a high load environment with backpressure, the risk of high memory usage is the chance of getting killed by the Kernel (OOM Killer). A workaround for this backpressure scenario is to limit the amount of memory in records that an input plugin can register, through the configuration property called mem_buf_limit: if a plugin has enqueued more than mem_buf_limit, it won't be able to ingest more data until its data can be delivered or flushed properly. In this scenario the input plugin in question is paused.
The mem_buf_limit workaround is good for certain scenarios and environments: it helps to control the memory usage of the service. But it comes at a cost: if a file gets rotated while the plugin is paused, you might lose that data, since the plugin won't be able to register new records. This can happen with any input source plugin. The goal of mem_buf_limit is memory control and survival of the service.
For full data safety guarantee, use filesystem buffering.
Filesystem buffering enabled helps with backpressure and overall memory control.
Behind the scenes, the Memory and Filesystem buffering mechanisms are not mutually exclusive. Indeed, when enabling filesystem buffering for your input plugin (source) you get the best of both worlds: performance and data safety.
How does this Filesystem buffering mechanism deal with high memory usage and backpressure? Fluent Bit controls the number of Chunks that are up in memory.
By default, the engine allows a total of 128 Chunks up in memory (considering all Chunks); this value is controlled by the service property storage.max_chunks_up. The active Chunks that are up are either ready for delivery or still receiving records. Any other remaining Chunk is in a down state, which means it exists only in the filesystem and won't be up in memory unless it is ready to be delivered.
If the input plugin has enabled mem_buf_limit and storage.type as filesystem, when reaching the mem_buf_limit threshold, instead of the plugin being paused, all new data will go to Chunks that are down in the filesystem. This allows the service to control memory usage and also provides a guarantee that no data will be lost.
Limiting Filesystem space for Chunks
Fluent Bit implements the concept of logical queues: a Chunk, based on its Tag, can be routed to multiple destinations, so internally we keep a reference to where a Chunk was created and where it needs to go.
It's common to find cases where, if we have multiple destinations for a Chunk, one of the destinations is slower than the others, or one of the destinations is generating backpressure while the others aren't. In this scenario, how do we limit the number of filesystem Chunks that we are logically queueing?
Starting from Fluent Bit v1.6, we introduced a new configuration property for output plugins called storage.total_limit_size, which limits the number of Chunks that exist in the file system for a certain logical output destination. If a destination reaches the storage.total_limit_size limit, the oldest Chunk from its queue for that logical output destination will be discarded.
The storage layer configuration takes place in three areas:
Service Section
Input Section
Output Section
The known Service section configures a global environment for the storage layer, the Input sections define which buffering mechanism to use, and the Output sections define the limits for the logical queues.
As an example, a Service section will look like this:
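```
[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M
```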
That configuration sets an optional buffering mechanism where the root for data is /var/log/flb-storage/; it will use normal synchronization mode, without checksum, and up to a maximum of 5MB of memory when processing backlog data.
Optionally, any Input plugin can configure its storage preference; the following table describes the options available:
The following example configures a service that offers filesystem buffering capabilities and two Input plugins, the first based on filesystem and the second with memory only:
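```
[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M

[INPUT]
    name          cpu
    storage.type  filesystem

[INPUT]
    name          mem
    storage.type  memory
```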
If certain chunks are filesystem based (storage.type filesystem), it's possible to control the size of the logical queue for an output plugin. The following table describes the options available:
The following example creates records with CPU usage samples in the filesystem, which are then delivered to the Google Stackdriver service, limiting the logical queue (buffering) to 5M:
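```
[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128

[INPUT]
    name                      cpu
    storage.type              filesystem

[OUTPUT]
    name                      stackdriver
    match                     *
    storage.total_limit_size  5M
```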
If for some reason Fluent Bit gets offline because of a network issue, it will continue buffering CPU samples, keeping only a maximum of 5M of the newest data.
Fluent Bit supports TLS server name indication (SNI). If you are serving multiple hostnames on a single IP address (a.k.a. virtual hosting), you can make use of tls.vhost to connect to a specific hostname.
The end-goal of Fluent Bit is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases, and one of the critical pieces is the ability to do buffering: a mechanism to place processed data into a temporary location until it is ready to be shipped.
When Filesystem buffering is enabled, the behavior of the engine is different: upon Chunk creation, it stores the content in memory and also maps a copy on disk (through mmap(2)). A Chunk that is active in memory and backed up on disk is said to be up, which means "the chunk content is up in memory".
The Service section refers to the section defined in the main configuration file:
The built-in multiline parsers are:

Parser | Description |
---|---|
docker | Process a log entry generated by a Docker container engine. This parser supports the concatenation of log entries split by Docker. |
cri | Process a log entry generated by the CRI-O container engine. Same as the docker parser, it supports concatenation of log entries. |
go | Process log entries generated by a Go based language application and perform concatenation if multiline messages are detected. |
python | Process log entries generated by a Python based language application and perform concatenation if multiline messages are detected. |
java | Process log entries generated by a Google Cloud Java language application and perform concatenation if multiline messages are detected. |
A multiline parser definition supports the following properties:

Property | Description | Default |
---|---|---|
name | Specify a unique name for the Multiline Parser definition. A good practice is to prefix the name with the word multiline_ to avoid confusion with normal parser definitions. | |
type | Set the multiline mode; for now, we support the type regex. | |
parser | Name of a pre-defined parser that must be applied to the incoming content before applying the regex rule. If no parser is defined, it's assumed that the content is raw text and not a structured message. Note: when a parser is applied to raw text, the regex is applied against a specific key of the structured message by using the key_content configuration property (see below). | |
key_content | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | |
flush_timeout | Timeout in milliseconds to flush a non-terminated multiline buffer. Default is set to 5 seconds. | 5s |
rule | Configure a rule to match a multiline pattern. The rule has the specific format described earlier. Multiple rules can be defined. | |
Record accessor examples for the structured record shown earlier:

Format | Accessed Value |
---|---|
$log | "some message" |
$labels['color'] | "blue" |
$labels['project']['env'] | "production" |
$labels['unset'] | null |
$labels['undefined'] | |
Available TLS properties:

Property | Description | Default |
---|---|---|
tls | Enable or disable TLS support | Off |
tls.verify | Force certificate validation | On |
tls.debug | Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose) | 1 |
tls.ca_file | Absolute path to CA certificate file | |
tls.ca_path | Absolute path to scan for certificate files | |
tls.crt_file | Absolute path to Certificate file | |
tls.key_file | Absolute path to private Key file | |
tls.key_passwd | Optional password for the tls.key_file file | |
tls.vhost | Hostname to be used for the TLS SNI extension | |
Input section storage properties:

Key | Description | Default |
---|---|---|
storage.type | Specify the buffering mechanism to use. It can be memory or filesystem. | memory |
storage.max_chunks_pause | Specify if file storage is to be paused when reaching the chunk limit. | off |
Output section storage properties:

Key | Description | Default |
---|---|---|
storage.total_limit_size | Limit the maximum number of Chunks in the filesystem for the current output logical destination. | |
In certain environments it is common to see that logs or data are ingested faster than they can be flushed to some destinations. A common case is reading from big log files and dispatching the logs to a backend over the network, which takes some time to respond; this generates backpressure, leading to high memory consumption in the service.
In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restricts the amount of data that an input plugin can ingest; this is done through the configuration parameter Mem_Buf_Limit.
As described in the Buffering concepts section, Fluent Bit offers a hybrid mode for data handling: in-memory and filesystem (optional).
In-memory buffering is always available and can be restricted with Mem_Buf_Limit. If your plugin gets restricted because of this configuration and you are under a backpressure scenario, you won't be able to ingest more data until the data chunks that are in memory can be flushed.
Depending on the input plugin in use, this might lead to discarding incoming data (e.g. with the TCP input plugin), but you can rely on the secondary filesystem buffering to be safe.
If, in addition to Mem_Buf_Limit, the input plugin defines a storage.type of filesystem (as described in Buffering & Storage), when the limit is reached all new data will be stored safely in the file system.
This option is disabled by default and can be applied to all input plugins. Let's explain its behavior using the following scenario:
Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine routes the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB
At this exact point, the engine will allow appending those 500KB of data into memory, for a total of 1.2MB of enqueued data. The option works in a permissive mode up to the limit, but when the limit is exceeded the following actions are taken:
block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback
The engine will protect itself and will not append more data coming from the input plugin in question; note that it is the plugin's responsibility to keep its state and decide what to do in that paused state.
After some seconds, if the scheduler was able to flush the initial 700KB of data, or it gave up after retrying, that amount of memory is released and internally the following actions happen:
Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data
Each plugin is independent and not all of them implement the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.
One plugin that implements the callbacks and keeps a good state is the Tail input plugin. When the pause callback is triggered, it stops its collectors and stops appending data; upon resume, it re-enables the collectors.
In certain scenarios it would be ideal to estimate how much memory Fluent Bit could be using; this is very useful for containerized environments where memory limits are a must.
In order to estimate we will assume that the input plugins have set the Mem_Buf_Limit option (you can learn more about it in the Backpressure section).
Input plugins append data independently, so in order to do an estimation, a limit should be imposed through the Mem_Buf_Limit option. If the limit is set to 10MB, we can estimate that in the worst case the output plugin could use an additional 20MB.
Fluent Bit has an internal binary representation for the data being processed, but when this data reaches an output plugin, the plugin will likely create its own representation in a new memory buffer for processing. The best examples are the InfluxDB and Elasticsearch output plugins; both need to convert the binary representation into their respective custom JSON formats before delivering them to their backend servers.
So, if we impose a limit of 10MB for the input plugins and consider the worst-case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.
It is well known that in intensive environments, where memory allocations happen in high volumes, the default memory allocator provided by Glibc can lead to high fragmentation, causing the service to report high memory usage.
It's strongly suggested that in any production environment, Fluent Bit should be built with jemalloc enabled (e.g. -DFLB_JEMALLOC=On). Jemalloc is an alternative memory allocator that can reduce fragmentation (among other things), resulting in better performance.
You can check if Fluent Bit has been built with Jemalloc using the following command:
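```
bin/fluent-bit -h | grep JEMALLOC
```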
In the output, look at the Build Flags: if the FLB_HAVE_JEMALLOC option is listed, everything is fine.
Key | Description | Default |
---|---|---|
storage.path | Set an optional location in the file system to store streams and chunks of data. If this parameter is not set, Input plugins can only use in-memory buffering. | |
storage.sync | Configure the synchronization mode used to store the data in the file system. It can take the values normal or full. | normal |
storage.checksum | Enable the data integrity check when writing and reading data from the filesystem. The storage layer uses the CRC32 algorithm. | Off |
storage.max_chunks_up | If the input plugin has enabled filesystem storage type, this property sets the maximum number of Chunks that can be up in memory. | 128 |
storage.backlog.mem_limit | If storage.path is set, Fluent Bit will look for data chunks that were not delivered and are still in the storage layer; these are called backlog data. This option configures a hint of the maximum amount of memory to use when processing these records. | 5M |
storage.metrics | If the http_server option has been enabled in the main [SERVICE] section, this option registers a new endpoint where internal metrics of the storage layer can be consumed. | off |
Fluent Bit has an Engine that helps to coordinate the data ingestion from input plugins, and it calls the Scheduler to decide when it is time to flush the data through one or multiple output plugins. The Scheduler flushes new data at a fixed interval of seconds and schedules retries when asked.
Once an output plugin gets called to flush some data, after processing that data it can notify the Engine of three possible return statuses:
OK
Retry
Error
If the return status was OK, it means the data was successfully processed and flushed; if it returned Error, an unrecoverable error happened and the engine should not try to flush that data again. If a Retry was requested, the Engine will ask the Scheduler to retry flushing that data; the Scheduler will decide how many seconds to wait before that happens.
The Scheduler provides a simple configuration option called Retry_Limit, which can be set independently for each output section. This option allows you to disable retries or impose a limit to try N times and then discard the data after reaching that limit:
The following example configures two outputs, where the HTTP plugin has an unlimited number of retries and the Elasticsearch plugin has a limit of 5 retries:
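```
[OUTPUT]
    Name        http
    Host        192.168.5.6
    Port        8080
    Retry_Limit False

[OUTPUT]
    Name            es
    Host            192.168.5.20
    Port            9200
    Logstash_Format On
    Retry_Limit     5
```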
Fluent Bit implements a unified networking interface that is exposed to components like plugins. This interface abstracts all the complexity of general I/O and is fully configurable.
A common use case is when a component or plugin needs to connect to a service to send and receive data. Although the operational mode sounds easy to deal with, there are many factors that can make things hard: unresponsive services, networking latency or any kind of connectivity error. The networking interface aims to abstract and simplify the network I/O handling, minimize risks and optimize performance.
Most of the time creating a new TCP connection to a remote server is straightforward and takes a few milliseconds. But there are cases where DNS resolving, slow network or incomplete TLS handshakes might create long delays, or incomplete connection statuses.
The net.connect_timeout property allows configuring the maximum time to wait for a connection to be established; note that this value already considers the TLS handshake process.
The net.connect_timeout_log_error property indicates whether an error should be logged in case of a connect timeout. If disabled, the timeout is logged as a debug-level message instead.
In environments with multiple network interfaces, it might be desirable to choose which interface to use for the data that will flow through the network.
The net.source_address property allows specifying which network address must be used for a TCP connection and data flow.
TCP is a connection-oriented channel: to deliver and receive data from a remote end-point, in most cases we use a TCP connection. This TCP connection can be created and destroyed once it is no longer needed; this approach has pros and cons. Here we will refer to the opposite case: keeping the connection open.
The concept of Connection Keepalive refers to the ability of the client (Fluent Bit in this case) to keep the TCP connection open in a persistent way; that means that once the connection is created and used, instead of closing it, it can be recycled. This feature offers many performance benefits, since communication channels are already established beforehand.
Any component that uses TCP channels, like HTTP or TLS, can take advantage of this feature. For configuration purposes use the net.keepalive property.
If keepalive is enabled for a connection, there might be scenarios where the connection remains unused for long periods of time. Having an idle keepalive connection is not helpful, and it is recommendable to keep connections alive only if they are being used.
In order to control how long a keepalive connection can remain idle, we expose the configuration property net.keepalive_idle_timeout.
If a transport layer protocol is specified, the plugin whose configuration section contains the net.dns.mode setting overrides the global dns.mode value and issues DNS requests using the specified protocol, which can be either TCP or UDP.
For plugins that rely on networking I/O, the following section describes the network configuration properties available and how they can be used to optimize performance or adjust to different configuration needs:
As an example, we will send 5 random messages through a TCP output connection; on the remote side we will use the nc (netcat) utility to see the data.
Put the following configuration snippet in a file called fluent-bit.conf:
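```
[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      random
    samples   5

[OUTPUT]
    name      tcp
    match     *
    host      127.0.0.1
    port      9090
    format    json_lines
    # Networking Setup
    net.dns.mode                TCP
    net.connect_timeout         5
    net.source_address          127.0.0.1
    net.keepalive               on
    net.keepalive_idle_timeout  10
```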
In another terminal, start nc and make it listen for messages on TCP port 9090:
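```
nc -l 9090
```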
Now start Fluent Bit with the configuration file written above and you will see the data flowing to netcat.
If the net.keepalive option is not enabled, Fluent Bit will close the TCP connection and netcat will quit; here we can see how the keepalive connection works.
After the 5 records arrive, the connection will remain idle, and after 10 seconds it will be closed due to net.keepalive_idle_timeout.
Value | Description |
---|---|
Retry_Limit N | Integer value to set the maximum number of retries allowed. N must be >= 1 (default: 1). |
Retry_Limit no_limits or False | There is no limit to the number of retries the Scheduler can do. |
Retry_Limit no_retries | Retries are disabled; the Scheduler will not try to send the data to the destination again if the first attempt failed. |
Property | Description | Default |
---|---|---|
net.connect_timeout | Set the maximum time expressed in seconds to wait for a TCP connection to be established; this includes the TLS handshake time. | 10 |
net.connect_timeout_log_error | On connection timeout, specify if it should log an error. When disabled, the timeout is logged as a debug message. | true |
net.source_address | Specify the network address (interface) to use for connection and data traffic. | |
net.keepalive | Enable or disable connection keepalive support. Accepts a boolean value: on / off. | on |
net.keepalive_idle_timeout | Set the maximum time expressed in seconds for an idle keepalive connection. | 30 |
net.keepalive_max_recycle | Set the maximum number of times a keepalive connection can be used before it is destroyed. | 0 |
net.dns.mode | Set the primary transport layer protocol used by the asynchronous DNS resolver for connections established in the plugin where this configuration value is used. | UDP |
Fluent Bit is a powerful log processing tool that can deal with different sources and formats; in addition, it provides several filters that can be used to perform custom modifications. This flexibility is really good, but as your pipeline grows, it's strongly recommended to validate your data and structure.
We encourage Fluent Bit users to integrate data validation in their CI systems
A simplified view of our data processing pipeline is as follows:
In a normal production environment, many Inputs, Filters, and Outputs are defined in the configuration, so integrating a continuous validation of your configuration against expected results is a must. For this requirement, Fluent Bit provides a specific Filter called Expect which can be used to validate expected Keys and Values from your records and takes some action when an exception is found.
As an example, consider the following pipeline where your source of data is a normal file with JSON content on it and then two filters: grep to exclude certain records and record_modifier to alter the record content adding and removing specific keys.
Ideally you want to add validation checkpoints for your data between each step, so you can know if your data structure is correct; we do this by using the expect filter.
The expect filter sets rules that aim to validate certain criteria, like:
does the record contain a key A?
does the record not contain a key A?
is the value of record key A NULL?
is the value of record key A different from NULL?
does the value of record key A equal B?
Every expect filter configuration can expose specific rules to validate the content of your records, it supports the following configuration properties:
Consider the following JSON file called data.log with the following content:
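```
{"color": "blue", "label": {"name": null}}
{"color": "red", "label": {"name": "abc"}, "meta": "data"}
{"color": "green", "label": {"name": "abc"}, "meta": null}
```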
The following Fluent Bit configuration file will configure a pipeline to consume the log above and apply an expect filter to validate that the keys color and label exist:
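```
[SERVICE]
    flush        1
    log_level    info
    parsers_file parsers.conf

[INPUT]
    name         tail
    path         ./data.log
    parser       json
    exit_on_eof  on

# First 'expect' filter to validate that our data was structured properly
[FILTER]
    name         expect
    match        *
    key_exists   color
    key_exists   label
    action       exit

[OUTPUT]
    name         stdout
    match        *
```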
Note that if for some reason the JSON parser fails or is missing in the tail input (line 9 in the example above), the expect filter will trigger the exit action. As a test, go ahead and comment out or remove that line.
As a second step, we will extend our pipeline by adding a grep filter to match records whose label map contains a key called name with value abc, and then an expect filter to re-validate that condition:
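```
[SERVICE]
    flush        1
    log_level    info
    parsers_file parsers.conf

[INPUT]
    name         tail
    path         ./data.log
    parser       json
    exit_on_eof  on

# First 'expect' filter to validate that our data was structured properly
[FILTER]
    name         expect
    match        *
    key_exists   color
    key_exists   label
    action       exit

# Match records where the map 'label' has key 'name' = 'abc'
[FILTER]
    name         grep
    match        *
    regex        $label['name'] ^abc$

# Re-validate that the surviving records meet the condition
[FILTER]
    name         expect
    match        *
    key_val_eq   $label['name'] abc
    action       exit

[OUTPUT]
    name         stdout
    match        *
```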
When deploying your configuration in production, you might want to remove the expect filters from your configuration, since they are unnecessary extra work unless you want 100% coverage of checks at runtime.
Learn how to monitor your Fluent Bit data pipelines
Fluent Bit comes with built-in features to allow you to monitor the internals of your pipeline, connect to Prometheus and Grafana, perform health checks, and use external services for such purposes:
Fluent Bit comes with a built-in HTTP Server that can be used to query internal information and monitor metrics of each running plugin.
The monitoring interface can be easily integrated with Prometheus, since we support its native format.
To get started, the first step is to enable the HTTP Server from the configuration file:
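```
[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020

[INPUT]
    Name cpu

[OUTPUT]
    Name  stdout
    Match *
```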
The above configuration snippet will instruct Fluent Bit to start its HTTP Server on TCP port 2020, listening on all network interfaces.
Now a simple curl command is enough to gather some information:
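```
curl -s http://127.0.0.1:2020 | jq
```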
Note that we are piping the curl command output into the jq program, which helps to make the JSON data easy to read from the terminal. Fluent Bit doesn't aim to do JSON pretty-printing.
Fluent Bit aims to expose useful interfaces for monitoring; as of Fluent Bit v0.14 the following endpoints are available:
Query the service uptime with the following command:
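```
curl -s http://127.0.0.1:2020/api/v1/uptime | jq
```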
It prints the uptime both in seconds and in a human-readable form.
Query internal metrics in JSON format with the following command:
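```
curl -s http://127.0.0.1:2020/api/v1/metrics | jq
```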
It prints a JSON document with the metrics of each loaded plugin.
Query internal metrics in Prometheus Text 0.0.4 format:
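```
curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus
```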
This time the same metrics will be in Prometheus format instead of JSON.
By default, configured plugins at runtime get an internal name in the format plugin_name.ID. For monitoring purposes this can be confusing if many plugins of the same type were configured. To make a distinction, each configured input or output section can be given an alias that will be used as the parent name for the metric.
Now when querying the metrics we get the aliases in place instead of the plugin name:
Fluent Bit now supports four configuration properties to set up the health check.
Note: not every error log entry is counted as an error; the error and retry failure counters only track specific kinds of errors, as exemplified in the configuration table description.
The feature works as follows: based on the configured HC_Period, if the real error count is over HC_Errors_Count, or the retry failure count is over HC_Retry_Failure_Count, Fluent Bit is considered unhealthy; the health endpoint will return HTTP status 500 and the string error. Otherwise it is healthy and will return HTTP status 200 and the string ok.
The equation is:
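```
# Evaluated over the last HC_Period seconds (counter names are illustrative):
(error_count > HC_Errors_Count) OR (retry_failure_count > HC_Retry_Failure_Count)
    => unhealthy; otherwise healthy
```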
Note: HC_Errors_Count and HC_Retry_Failure_Count only count for output plugins; they are a sum of the errors and retry failures from all output plugins that are running.
See the configuration example:
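```
[SERVICE]
    HTTP_Server            On
    HTTP_Listen            0.0.0.0
    HTTP_PORT              2020
    Health_Check           On
    HC_Errors_Count        5
    HC_Retry_Failure_Count 5
    HC_Period              5

[INPUT]
    Name cpu

[OUTPUT]
    Name  stdout
    Match *
```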
The command to call the health endpoint:
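```
curl -s http://127.0.0.1:2020/api/v1/health
```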
Based on the Fluent Bit status, the result will be:
HTTP status 200 and "ok" in response to healthy status
HTTP status 500 and "error" in response for unhealthy status
With the example configuration, the health status is determined by the following equation:
If (HC_Errors_Count > 5) OR (HC_Retry_Failure_Count > 5) IN 5 seconds is TRUE, then it's unhealthy.
If (HC_Errors_Count > 5) OR (HC_Retry_Failure_Count > 5) IN 5 seconds is FALSE, then it's healthy.
Registering your Fluent Bit agent will take less than one minute. Steps:
In your Fluent Bit configuration file, append the following configuration section:
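```
[CUSTOM]
    name     calyptia
    api_key  <YOUR_API_KEY>
```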
Make sure to replace the API key placeholder in the configuration with your own key. A few seconds after restarting your Fluent Bit agent, the Calyptia Cloud Dashboard will list it. Metrics will take around 30 seconds to show up.
Fluent Bit v1.4 introduces the Dump Internals feature, which can be triggered easily from the command line by sending the CONT Unix signal.
note: this feature is only available on Linux and BSD family operating systems
Run the following kill command to signal Fluent Bit:
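```
kill -CONT $(pidof fluent-bit)
```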
The pidof command looks up the Process ID of Fluent Bit; you can replace it with the process ID obtained by other means.
Fluent Bit will dump the following information to the standard output interface (stdout):
The dump provides insights for every input instance configured.
Overall ingestion status of the plugin.
When an input plugin ingests data into the engine, a Chunk is created. A Chunk can contain multiple records. At flush time, the engine creates a Task that contains the routes for the Chunk in question.
The Task dump describes the tasks associated to the input plugin:
The Chunks dump gives more details about all the chunks that the input plugin has generated and that are still being processed.
Depending on the buffering strategy and the limits imposed by configuration, some Chunks might be up (in memory) or down (filesystem).
Fluent Bit relies on a custom storage layer interface designed for hybrid buffering. The Storage Layer entry contains a total summary of Chunks registered by Fluent Bit:
The following example sets an alias for an INPUT section that uses the cpu input plugin:
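```
[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020

[INPUT]
    Name  cpu
    Alias server1_cpu

[OUTPUT]
    Name  stdout
    Match *
```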
Fluent Bit's exposed Prometheus metrics can be leveraged to create dashboards and alerts.
The provided example dashboard is heavily inspired by existing community dashboards, but with a few key differences, such as the use of the instance label, stacked graphs, and a focus on Fluent Bit metrics. Sample alerts are available as well.
Calyptia Cloud is a hosted service that allows you to monitor your Fluent Bit agents, including data flow, metrics and configurations.
Go to the Calyptia Cloud site and sign in.
In the left menu, generate and copy your API key.
If you want to get in touch with the Calyptia team, just send them an email.
When the service is running we can export metrics to see the overall status of the data flow of the service. But there are other use cases where we would like to know the current status of the internals of the service, specifically to answer questions like: what's the current status of the internal buffers? The Dump Internals feature is the answer.
You may wish to test a logging pipeline locally to observe how it deals with log messages. The following is a walk-through for running Fluent Bit and Elasticsearch locally with Docker Compose, which can serve as an example for testing other plugins locally.
Refer to the Configuration File section to create a configuration to test.
Use Docker Compose to run Fluent Bit (with the configuration file mounted) and Elasticsearch.
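A minimal sketch of such a compose file (image tags, paths and settings are illustrative):

```yaml
version: "3.7"

services:
  fluent-bit:
    image: fluent/fluent-bit
    volumes:
      # mount the local configuration file into the container
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
    depends_on:
      - elasticsearch

  elasticsearch:
    image: elasticsearch:7.17.0
    ports:
      - "9200:9200"
    environment:
      - discovery.type=single-node
```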
Expect filter configuration properties:

Property | Description |
---|---|
key_exists | Check if a key with a given name exists in the record. |
key_not_exists | Check if a key does not exist in the record. |
key_val_is_null | Check that the value of the key is NULL. |
key_val_is_not_null | Check that the value of the key is NOT NULL. |
key_val_eq | Check that the value of the key equals the given value in the configuration. |
action | Action to take when a rule does not match. The available options are warn or exit. On warn, a warning message is sent to the logging layer when a mismatch of the rules above is found; using exit makes Fluent Bit abort with status code 255. |
URI | Description | Data Format |
---|---|---|
/ | Fluent Bit build information | JSON |
/api/v1/uptime | Get uptime information in seconds and human readable format | JSON |
/api/v1/metrics | Internal metrics per loaded plugin | JSON |
/api/v1/metrics/prometheus | Internal metrics per loaded plugin, ready to be consumed by a Prometheus server | Prometheus Text 0.0.4 |
/api/v1/storage | Get internal metrics of the storage layer / buffered data. This option is enabled only if storage.metrics has been enabled in the SERVICE section. | JSON |
/api/v1/health | Fluent Bit health check result | String |
Config Name | Description | Default Value |
---|---|---|
Health_Check | Enable the health check feature. | Off |
HC_Errors_Count | The error count required to meet the unhealthy condition; this is a sum for all output plugins within the defined HC_Period. | 5 |
HC_Retry_Failure_Count | The retry failure count required to meet the unhealthy condition; this is a sum for all output plugins within the defined HC_Period. | 5 |
HC_Period | The time period, in seconds, over which errors and retry failures are counted. | 60 |
Entry | Description |
---|---|
total_tasks | Total number of active tasks associated with data generated by the input plugin. |
new | Number of tasks not yet assigned to an output plugin. Tasks are in a new state for a very short period of time. |
running | Number of active tasks being processed by output plugins. |
size | Amount of memory used by the Chunks being processed (total chunks size). |
Entry | Description |
---|---|
total_chunks | Total number of Chunks generated by the input plugin that are still being processed by the engine. |
up_chunks | Total number of Chunks that are loaded in memory. |
down_chunks | Total number of Chunks that are stored in the filesystem but not loaded in memory yet. |
busy_chunks | Chunks marked as busy (being flushed) or locked. Busy Chunks are immutable and likely are ready to be (or are being) processed. |
size | Amount of bytes used by the Chunk. |
size err | Number of Chunks in an error state where the size could not be retrieved. |
Entry | Description |
---|---|
total chunks | Total number of Chunks. |
mem chunks | Total number of memory-based Chunks. |
fs chunks | Total number of filesystem-based Chunks. |
up | Total number of filesystem chunks up in memory. |
down | Total number of filesystem chunks down (not loaded in memory). |
Enable traffic through a proxy server via HTTP_PROXY environment variable
Fluent Bit supports setting up an HTTP proxy for all egress HTTP/HTTPS traffic by setting the HTTP_PROXY environment variable:
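```
# the proxy host and port here are illustrative
HTTP_PROXY='http://proxy.example.com:8080' fluent-bit -c fluent-bit.conf
```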
You can set up basic authentication with HTTP_PROXY=http://<username>:<password>@<proxy host>:<port> to provide your username and password when connecting to the proxy.
You can also set HTTP_PROXY=http://<proxy host>:<port> to omit the username and password if there are none.
The HTTP_PROXY environment variable is a standard way of setting an HTTP proxy in a containerized environment, and it is also natively supported by any application written in Go. Therefore, we follow and implement the same convention for Fluent Bit.
Note: an HTTP proxy is also supported via the HTTP output plugin's own configuration. That configuration continues to work; however, it should not be used together with the HTTP_PROXY environment variable. This is because, under the hood, the HTTP_PROXY-based proxy support is implemented by setting up a TCP connection tunnel via HTTP CONNECT. Unlike the plugin's implementation, this supports both HTTP and HTTPS egress traffic.
In some environments, we may not want HTTP traffic for certain domains to go through the HTTP proxy; this is where the NO_PROXY environment variable comes in.
NO_PROXY is a comma-separated list of host names that shouldn't go through any proxy (only an asterisk, *, matches all hosts), e.g. foo.com,bar.com. This follows the curl convention.
One typical use case for NO_PROXY is when running fluent-bit in a Kubernetes environment, where we want:
All real egress traffic goes through a HTTP proxy.
All "Kubernetes local" traffic does not go through the HTTP proxy.
We can set NO_PROXY=127.0.0.1,localhost,kubernetes.default.svc in this case.
Entry | Description |
---|---|
overlimit | If the plugin has been configured with Mem_Buf_Limit, this entry reports whether the plugin is over the limit at the moment of the dump; it prints yes or no. |
mem_size | Current memory size in use by the input plugin in-memory. |
mem_limit | Limit set by Mem_Buf_Limit. |
The collectd input plugin allows you to receive datagrams from the collectd service.
The plugin supports the following configuration parameters:
Here is a basic configuration example:
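```
[INPUT]
    Name    collectd
    Listen  0.0.0.0
    Port    25826
    TypesDB /usr/share/collectd/types.db,/etc/collectd/custom.db

[OUTPUT]
    Name   stdout
    Match  *
```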
With this configuration, Fluent Bit listens on 0.0.0.0:25826 and outputs incoming datagram packets to stdout.
You must set the same types.db files that your collectd server uses. Otherwise, Fluent Bit may not be able to interpret the payload properly.
Learn how to monitor your data pipeline with external services
A Data Pipeline represents a flow of data that goes through the inputs (sources), filters, and outputs (sinks). There are a couple of ways to monitor the pipeline. We recommend the following sections for a better understanding and steps to get started:
The cpu input plugin measures the CPU usage of a process or of the whole system (by default, considering each CPU core). It reports values as percentages for every configured interval of time. At the moment this plugin is only available for Linux.
The following table describes the information generated by the plugin. The keys below represent the data used by the overall system; all values associated with the keys are percentages (0 to 100%):
The CPU metrics plugin creates metrics that are log-based (i.e. a JSON payload). If you are looking for Prometheus-based metrics, please see the Node Exporter Metrics input plugin.
In addition to the keys reported in the above table, a similar content is created per CPU core. The cores are listed from 0 to N as the Kernel reports:
The plugin supports the following configuration parameters:
In order to get the CPU usage statistics of your system, you can run the plugin from the command line or through the configuration file:
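```
fluent-bit -i cpu -t my_cpu -o stdout -m '*'
```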
In your main configuration file append the following Input & Output sections:
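```
[INPUT]
    Name cpu
    Tag  my_cpu

[OUTPUT]
    Name  stdout
    Match my_cpu
```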
A plugin based on Prometheus Node Exporter to collect system / host level metrics
The initial release of Node Exporter Metrics contains a subset of collectors and metrics available from Prometheus Node Exporter and we plan to expand them over time.
Important note: Metrics collected with Node Exporter Metrics flow through a separate pipeline from logs and current filters do not operate on top of metrics.
This plugin is currently only supported on Linux based operating systems.
The following table describes the collectors available as part of this plugin. All of them are enabled by default and respect the original metric names, descriptions, and types from Prometheus Exporter, so you can use your current dashboards without any compatibility problems.
note: the Version column specifies the Fluent Bit version where the collector is available.
You can test the exposed metrics by using curl:
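```
curl http://127.0.0.1:2021/metrics
```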
When deploying Fluent Bit in a container you will need to specify additional settings to ensure that Fluent Bit has access to the host operating system. The following docker command deploys Fluent Bit with specific mount paths and settings enabled to ensure that Fluent Bit can collect from the host. These are then exposed over port 2021.
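A sketch of such a command (the image tag and flags are illustrative):

```
docker run -ti -v /proc:/host/proc \
               -v /sys:/host/sys   \
               -p 2021:2021        \
               fluent/fluent-bit:1.8.0 \
               /fluent-bit/bin/fluent-bit \
               -i node_exporter_metrics \
               -p path.procfs=/host/proc -p path.sysfs=/host/sys \
               -o prometheus_exporter -p port=2021 \
               -f 1
```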
If you like dashboards for monitoring, Grafana is one of the preferred options. In our Fluent Bit source code repository, we have pushed a simple docker-compose example. Steps:
Now open your browser at the address http://127.0.0.1:3000. When asked for the credentials to access Grafana, just use the admin username and admin password.
Note that by default Grafana dashboard plots the data from the last 24 hours, so just change it to Last 5 minutes to see the recent data being collected.
As described above, the CPU input plugin gathers the overall usage every second and flushes the information to the output every five seconds. In this example we used the stdout plugin to demonstrate the output records. In a real use-case you may want to flush this information to some central aggregator.
Prometheus Node Exporter is a popular way to collect system level metrics from operating systems, such as CPU / Disk / Network / Process statistics. Fluent Bit 1.8.0 includes the node exporter metrics plugin, which builds on the Prometheus design to collect system level metrics without having to manage two separate processes or agents.
In the following configuration file, the input plugin node_exporter_metrics collects metrics every 2 seconds and exposes them through our prometheus_exporter output plugin on HTTP/TCP port 2021:
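```
# Node Exporter Metrics + Prometheus Exporter
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name            node_exporter_metrics
    tag             node_metrics
    scrape_interval 2

[OUTPUT]
    name            prometheus_exporter
    match           node_metrics
    host            0.0.0.0
    port            2021
```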
Our current plugin implements a subset of the collectors available in the original Prometheus Node Exporter. If you would like us to prioritize a specific collector, please open a GitHub issue using the corresponding issue template.
Key | Description | Default |
---|---|---|
Listen | Set the address to listen to. | 0.0.0.0 |
Port | Set the port to listen to. | 25826 |
TypesDB | Set the data specification file. | /usr/share/collectd/types.db |
Key | Description |
---|---|
cpu_p | CPU usage of the overall system; this value is the sum of time spent in user and kernel space. The result takes into consideration the number of CPU cores in the system. |
user_p | CPU usage in User mode; in short, the CPU usage by user space programs. The result takes into consideration the number of CPU cores in the system. |
system_p | CPU usage in Kernel mode; in short, the CPU usage by the Kernel. The result takes into consideration the number of CPU cores in the system. |
Key | Description |
---|---|
cpuN.p_cpu | Represents the total CPU usage by core N. |
cpuN.p_user | Total CPU time spent in user mode or user space programs associated with this core. |
cpuN.p_system | Total CPU time spent in system or kernel mode associated with this core. |
Key | Description | Default |
---|---|---|
Interval_Sec | Polling interval in seconds | 1 |
Interval_NSec | Polling interval in nanoseconds | 0 |
PID | Specify the ID (PID) of a running process in the system. By default the plugin monitors the whole system, but if this option is set, it will only monitor the given process ID. | |
Key | Description | Default |
---|---|---|
scrape_interval | The rate at which metrics are collected from the host operating system | 5 seconds |
path.procfs | The mount point used to collect process information and metrics | /proc/ |
path.sysfs | The path in the filesystem used to collect system metrics | /sys/ |
Name | Description | OS | Version |
---|---|---|---|
cpu | Exposes CPU statistics. | Linux | v1.8 |
cpufreq | Exposes CPU frequency statistics. | Linux | v1.8 |
diskstats | Exposes disk I/O statistics. | Linux | v1.8 |
filefd | Exposes file descriptor statistics from /proc/sys/fs/file-nr. | Linux | v1.8.2 |
loadavg | Exposes load average. | Linux | v1.8 |
meminfo | Exposes memory statistics. | Linux | v1.8 |
netdev | Exposes network interface statistics such as bytes transferred. | Linux | v1.8.2 |
stat | Exposes various statistics from /proc/stat. | Linux | v1.8 |
time | Exposes the current system time. | Linux | v1.8 |
uname | Exposes system information as provided by the uname system call. | Linux | v1.8 |
vmstat | Exposes statistics from /proc/vmstat. | Linux | v1.8.2 |
The disk input plugin gathers information about the disk throughput of the running system at a fixed interval and reports it.
The Disk I/O metrics plugin creates metrics that are log-based (i.e., a JSON payload). If you are looking for Prometheus-based metrics, please see the Node Exporter Metrics input plugin.
The plugin supports the following configuration parameters:
In order to get disk usage from your system, you can run the plugin from the command line or through the configuration file:
In your main configuration file append the following Input & Output sections:
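```
# A sketch; tag and intervals are illustrative
[INPUT]
    Name          disk
    Tag           disk
    Interval_Sec  1
    Interval_NSec 0

[OUTPUT]
    Name   stdout
    Match  *
```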
Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).
e.g. 1.5s = 1s + 500000000ns
The docker input plugin allows you to collect Docker container metrics such as memory usage and CPU consumption.
The plugin supports the following configuration parameters:
If you set neither Include nor Exclude, the plugin will try to get metrics from all running containers.
Here is an example configuration that collects metrics from two docker instances (6bab19c3a0f9 and 14159be4ca2c).
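A sketch of that configuration:

```
[INPUT]
    Name    docker
    Include 6bab19c3a0f9 14159be4ca2c

[OUTPUT]
    Name   stdout
    Match  *
```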
This configuration will produce records like below.
The exec input plugin allows you to execute an external program and collect event logs from its output.
This plugin will not function in the distroless production images (currently AMD64) as it needs a functional /bin/sh, which is not present. It will function in the -debug images from 1.8.12 onward, as well as the ARM production images, as these include a full shell.
The plugin supports the following configuration parameters:
You can run the plugin from the command line or through the configuration file:
The following example will read events from the output of ls.
In your main configuration file append the following Input & Output sections:
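```
# A sketch; the command, tag and interval are illustrative
[INPUT]
    Name         exec
    Tag          exec_ls
    Command      ls /var/log
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```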
The docker events input plugin uses the Docker API to capture server events. A complete list of possible events returned by this plugin can be found in the Docker documentation.
Key | Description | Default |
---|---|---|
Interval_Sec | Polling interval (seconds). | 1 |
Interval_NSec | Polling interval (nanosecond). | 0 |
Dev_Name | Device name to limit the target (e.g. sda). If not set, in_disk gathers information from all disks and partitions. | all disks |
Key | Description | Default |
---|---|---|
Interval_Sec | Polling interval in seconds | 1 |
Include | A space-separated list of containers to include | |
Exclude | A space-separated list of containers to exclude | |
Key | Description | Default |
---|---|---|
Unix_Path | The docker socket unix path | /var/run/docker.sock |
Buffer_Size | The size of the buffer used to read docker events (in bytes) | 8192 |
Parser | Specify the name of a parser to interpret the entry as a structured message. | None |
Key | When a message is unstructured (no parser applied), it's appended as a string under the key name message. | message |
Reconnect.Retry_limits | The maximum number of retries allowed. The plugin tries to reconnect with the docker socket when EOF is detected. | 5 |
Reconnect.Retry_interval | The retry interval. Unit is seconds. | 1 |
Key | Description |
---|---|
Command | The command to execute. |
Parser | Specify the name of a parser to interpret the entry as a structured message. |
Interval_Sec | Polling interval (seconds). |
Interval_NSec | Polling interval (nanosecond). |
Buf_Size | Size of the buffer (check the Unit Size specification for allowed values). |
Oneshot | Only run once at startup. This allows collection of data prior to Fluent Bit's startup (bool, default: false). |
The dummy input plugin generates dummy events. It is useful for testing, debugging, benchmarking and getting started with Fluent Bit.
The plugin supports the following configuration parameters:
You can run the plugin from the command line or through the configuration file:
In your main configuration file append the following Input & Output sections:
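```
# A sketch; the record content is illustrative
[INPUT]
    Name  dummy
    Tag   dummy.log
    Dummy {"message":"custom dummy"}

[OUTPUT]
    Name   stdout
    Match  *
```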
A plugin to collect Fluent Bit's own metrics
Fluent Bit exposes its own metrics to allow you to monitor the internals of your pipeline. The collected metrics can be processed similarly to those from the Prometheus Node Exporter input plugin. They can be sent to output plugins including Prometheus Exporter or Prometheus Remote Write.
Important note: Metrics collected with Node Exporter Metrics flow through a separate pipeline from logs and current filters do not operate on top of metrics.
In the following configuration file, the input plugin fluentbit_metrics collects metrics every 2 seconds and exposes them through our Prometheus Exporter output plugin on HTTP/TCP port 2021.
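A sketch of that configuration (tag and service values are illustrative):

```
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name            fluentbit_metrics
    tag             internal_metrics
    scrape_interval 2

[OUTPUT]
    name            prometheus_exporter
    match           internal_metrics
    host            0.0.0.0
    port            2021
```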
You can verify that the metrics are exposed by using curl:
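```
curl http://127.0.0.1:2021/metrics
```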
The head input plugin allows you to read events from the head of a file. Its behavior is similar to the head(1) command.
The plugin supports the following configuration parameters:
This mode is useful to get a specific line. This is an example to get the CPU frequency from /proc/cpuinfo.
/proc/cpuinfo is a special file that provides CPU information.
The CPU frequency appears as "cpu MHz : 2791.009". We can get that line with the following configuration file.
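A sketch of such a configuration (the Lines value depends on where the frequency line appears on your system):

```
[INPUT]
    Name       head
    Tag        head.cpu
    File       /proc/cpuinfo
    Lines      8
    Split_line true

[OUTPUT]
    Name   stdout
    Match  *
```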
Output is
In order to read the head of a file, you can run the plugin from the command line or through the configuration file:
The following example will read events from the /proc/uptime file, tag the records with the uptime name and flush them back to the stdout plugin:
In your main configuration file append the following Input & Output sections:
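```
# A sketch; buffer size and interval are illustrative
[INPUT]
    Name         head
    Tag          uptime
    File         /proc/uptime
    Buf_Size     256
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```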
Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).
e.g. 1.5s = 1s + 500000000ns
Forward is the protocol used by Fluent Bit and Fluentd to route messages between peers. This plugin implements the input service to listen for Forward messages.
The plugin supports the following configuration parameters:
In order to receive Forward messages, you can run the plugin from the command line or through the configuration file as shown in the following examples.
From the command line you can let Fluent Bit listen for Forward messages with the following options:
By default the service will listen on all interfaces (0.0.0.0) through TCP port 24224; optionally you can change this directly, e.g:
In the example, Forward messages will only arrive through the network interface at address 192.168.3.2 and TCP port 9090.
In your main configuration file append the following Input & Output sections:
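```
[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224

[OUTPUT]
    Name   stdout
    Match  *
```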
Once Fluent Bit is running, you can send some messages using the fluent-cat tool (provided by Fluentd):
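For example (tag and payload are illustrative):

```
echo '{"key 1": 123456789, "key 2": "abcdefg"}' | fluent-cat my_tag
```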
In Fluent Bit we should see the following output:
The kmsg input plugin reads the Linux kernel log buffer from the beginning; it gets every record and parses its fields as priority, sequence, seconds, useconds, and message.
In order to start getting the Linux Kernel messages, you can run the plugin from the command line or through the configuration file:
As described above, the plugin processes all messages that the Linux kernel reported; the output above has been truncated for clarity.
In your main configuration file append the following Input & Output sections:
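```
[INPUT]
    Name kmsg
    Tag  kernel

[OUTPUT]
    Name   stdout
    Match  *
```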
The health input plugin allows you to check how healthy a TCP server is. It does the check by issuing a TCP connection at a fixed interval.
The plugin supports the following configuration parameters:
In order to start performing the checks, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit generate the checks with the following options:
In your main configuration file append the following Input & Output sections:
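```
# A sketch; host and port are illustrative
[INPUT]
    Name         health
    Host         127.0.0.1
    Port         80
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```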
Once Fluent Bit is running, you will see some random values in the output interface similar to this:
Key | Description |
---|---|
Dummy | Dummy JSON record. Default: {"message":"dummy"} |
Start_time_sec | Dummy base timestamp in seconds. Default: 0 |
Start_time_nsec | Dummy base timestamp in nanoseconds. Default: 0 |
Rate | Number of events generated per second. Default: 1 |
Samples | If set, the number of events is limited. e.g. if Samples=3, the plugin generates only three events and stops. |
Key | Description | Default |
---|---|---|
scrape_interval | The rate at which metrics are collected from the host operating system | 2 seconds |
scrape_on_start | Scrape metrics upon start, useful to avoid waiting for 'scrape_interval' for the first round of metrics. | false |
Key | Description |
---|---|
File | Absolute path to the target file, e.g: /proc/uptime |
Buf_Size | Buffer size to read the file. |
Interval_Sec | Polling interval (seconds). |
Interval_NSec | Polling interval (nanosecond). |
Add_Path | If enabled, the filepath is appended to each record. Default value is false. |
Key | Rename a key. Default: head. |
Lines | Line number to read. If the number N is set, in_head reads the first N lines like head(1) -n. |
Split_line | If enabled, in_head generates a key-value pair per line. |
Key | Description | Default |
---|---|---|
Listen | Listener network interface. | 0.0.0.0 |
Port | TCP port to listen for incoming connections. | 24224 |
Unix_Path | Specify the path to a Unix socket to receive Forward messages. If set, Listen and Port are ignored. | |
Buffer_Max_Size | Specify the maximum buffer memory size used to receive a Forward message. The value must be according to the Unit Size specification. | 6144000 |
Buffer_Chunk_Size | By default the buffer that stores incoming Forward messages does not allocate the maximum memory allowed up front; instead it allocates memory as required. The allocation rounds are set by Buffer_Chunk_Size. The value must be according to the Unit Size specification. | 1024000 |
Tag_Prefix | Prefix incoming tag with the defined value. | |
Key | Description |
---|---|
Host | Name of the target host or IP address to check. |
Port | TCP port where to perform the connection check. |
Interval_Sec | Interval in seconds between the service checks. Default value is 1. |
Interval_Nsec | Specify a nanoseconds interval for service checks; it works in conjunction with the Interval_Sec configuration key. Default value is 0. |
Alert | If enabled, it will only generate messages if the target TCP service is down. By default this option is disabled. |
Add_Host | If enabled, the hostname is appended to each record. Default value is false. |
Add_Port | If enabled, the port number is appended to each record. Default value is false. |
The mem input plugin gathers information about the memory and swap usage of the running system at a fixed interval and reports the total amount of memory and the amount of free memory available.
In order to get memory and swap usage from your system, you can run the plugin from the command line or through the configuration file:
In your main configuration file append the following Input & Output sections:
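```
[INPUT]
    Name         mem
    Tag          memory
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```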
The MQTT input plugin allows you to retrieve messages/data from MQTT control packets over a TCP connection. The incoming data to receive must be a JSON map.
The plugin supports the following configuration parameters:
In order to start listening for MQTT messages, you can run the plugin from the command line or through the configuration file:
Since the MQTT input plugin lets Fluent Bit behave as a server, we need to dispatch some messages using an MQTT client; in the following example the mosquitto tool is used for this purpose.
The following command line will send a message to the MQTT input plugin:
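For example (topic and payload are illustrative):

```
mosquitto_pub -m '{"key1": 123, "key2": 456}' -t some/topic
```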
In your main configuration file append the following Input & Output sections:
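```
[INPUT]
    Name   mqtt
    Tag    data
    Listen 0.0.0.0
    Port   1883

[OUTPUT]
    Name   stdout
    Match  *
```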
The netif input plugin gathers network traffic information of the running system at a fixed interval and reports it.
The Network I/O Metrics plugin creates metrics that are log-based (i.e., a JSON payload). If you are looking for Prometheus-based metrics, please see the Node Exporter Metrics input plugin.
The plugin supports the following configuration parameters:
In order to monitor network traffic from your system, you can run the plugin from the command line or through the configuration file:
In your main configuration file append the following Input & Output sections:
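```
# A sketch; the interface name is illustrative
[INPUT]
    Name         netif
    Tag          netif
    Interface    eth0
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```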
Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).
e.g. 1.5s = 1s + 500000000ns
The process input plugin allows you to check how healthy a process is. It does so by performing a service check at a fixed interval specified by the user.
The Process metrics plugin creates metrics that are log-based (i.e., a JSON payload). If you are looking for Prometheus-based metrics, please see the Node Exporter Metrics input plugin.
The plugin supports the following configuration parameters:
In order to start performing the checks, you can run the plugin from the command line or through the configuration file:
The following example will check the health of crond process.
In your main configuration file append the following Input & Output sections:
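```
[INPUT]
    Name         proc
    Proc_Name    crond
    Interval_Sec 1
    Fd           true
    Mem          true

[OUTPUT]
    Name   stdout
    Match  *
```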
Once Fluent Bit is running, you will see the health of process:
Key | Description |
---|---|
Listen | Listener network interface, default: 0.0.0.0 |
Port | TCP port where listening for connections, default: 1883 |
Key | Description |
---|---|
Interface | Specify the network interface to monitor, e.g. eth0 |
Interval_Sec | Polling interval (seconds). default: 1 |
Interval_NSec | Polling interval (nanosecond). default: 0 |
Verbose | If true, gather metrics precisely. default: false |
Key | Description |
---|---|
Proc_Name | Name of the target process to check. |
Interval_Sec | Interval in seconds between the service checks. Default value is 1. |
Interval_Nsec | Specify a nanoseconds interval for service checks; it works in conjunction with the Interval_Sec configuration key. Default value is 0. |
Alert | If enabled, it will only generate messages if the target process is down. By default this option is disabled. |
Fd | If enabled, the number of file descriptors is appended to each record. Default value is true. |
Mem | If enabled, the memory usage of the process is appended to each record. Default value is true. |
The stdin plugin allows you to retrieve valid JSON text messages over the standard input interface (stdin). In order to use it, specify the plugin name as the input, e.g:
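```
fluent-bit -i stdin -o stdout
```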
As input data, the stdin plugin recognizes the following JSON data formats:
A better way to demonstrate how it works is through a Bash script that generates messages and writes them to Fluent Bit. Write the following content in a file named test.sh:
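A minimal sketch of such a script (the JSON content is illustrative):

```bash
#!/bin/sh

# Emit a JSON message every second, forever
while :; do
  echo '{"key": "some value"}'
  sleep 1
done
```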
Give the script execution permission:
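```
chmod 755 test.sh
```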
Now let's start the script and Fluent Bit in the following way:
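```
./test.sh | fluent-bit -i stdin -o stdout
```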
The plugin supports the following configuration parameters:
The serial input plugin allows you to retrieve messages/data from a serial interface.
In order to retrieve messages over the Serial interface, you can run the plugin from the command line or through the configuration file:
The following example loads the serial input plugin, sets a bitrate of 9600, listens on the /dev/tnt0 interface, and uses the custom tag data to route the messages.
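A sketch of that configuration:

```
[INPUT]
    Name    serial
    Tag     data
    File    /dev/tnt0
    Bitrate 9600

[OUTPUT]
    Name   stdout
    Match  *
```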
The above interface (/dev/tnt0) is an emulation of a serial interface (more details at the bottom); for demonstration purposes we will write some messages to the other end of the interface, in this case /dev/tnt1, e.g:
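```
echo 'this is some message' > /dev/tnt1
```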
In Fluent Bit you should see an output like this:
Now, using the Separator configuration, we could send multiple messages at once (run this command after starting Fluent Bit):
In your main configuration file append the following Input & Output sections:
The following content is extra information that will allow you to emulate a serial interface on your Linux system, so you can test the serial input plugin locally in case you don't have such an interface on your computer. The following procedure has been tested on Ubuntu 15.04 running Linux kernel 4.0.
Download the sources
Unpack and compile
Copy the new kernel module into the kernel modules directory
Load the module
You should see new serial ports in /dev/ (ls /dev/tnt*). Give appropriate permissions to the new serial ports:
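Assuming the tty0tty null-modem emulator is used, the steps above might look like the following sketch (repository URL and module path are assumptions; adjust for your kernel version):

```bash
# Download the sources
git clone https://github.com/freemed/tty0tty

# Unpack and compile
cd tty0tty/module
make

# Copy the new kernel module into the kernel modules directory
sudo cp tty0tty.ko /lib/modules/$(uname -r)/kernel/drivers/misc/

# Load the module
sudo depmod
sudo modprobe tty0tty

# Give appropriate permissions to the new serial ports
sudo chmod 666 /dev/tnt*
```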
When the module is loaded, it will interconnect the following virtual interfaces:
The random input plugin generates very simple random value samples using the device interface /dev/urandom; if that is not available, it will use a Unix timestamp as the value.
The plugin supports the following configuration parameters:
In order to start generating random samples, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit generate the samples with the following options:
In your main configuration file append the following Input & Output sections:
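```
[INPUT]
    Name         random
    Tag          random
    Samples      -1
    Interval_Sec 1

[OUTPUT]
    Name   stdout
    Match  *
```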
Once Fluent Bit is running, you will see the reports in the output interface similar to this:
Key | Description | Default |
---|---|---|
Buffer_Size | Set the buffer size to read data. This value is used to increase the buffer size. The value must be according to the Unit Size specification. | 16k |
Key | Description |
---|---|
File | Absolute path to the device entry, e.g: /dev/ttyS0 |
Bitrate | The bitrate for the communication, e.g: 9600, 38400, 115200, etc |
Min_Bytes | The serial interface will expect at least Min_Bytes to be available before processing the message (default: 1) |
Separator | Allows you to specify a separator string that is used to determine when a message ends. |
Format | Specify the format of the incoming data stream. The only option available is 'json'. Note that Format and Separator cannot be used at the same time. |
Key | Description |
---|---|
Samples | If set, it will only generate a specific number of samples. By default this value is set to -1, which will generate unlimited samples. |
Interval_Sec | Interval in seconds between samples generation. Default value is 1. |
Interval_Nsec | Specify a nanoseconds interval for samples generation; it works in conjunction with the Interval_Sec configuration key. Default value is 0. |
The statsd input plugin allows you to receive metrics via StatsD protocol.
The plugin supports the following configuration parameters:
Here is a configuration example.
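```
[INPUT]
    Name   statsd
    Listen 0.0.0.0
    Port   8125

[OUTPUT]
    Name   stdout
    Match  *
```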
Now you can input metrics through the UDP port as follows:
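For example (metric names and values are illustrative):

```
echo "click:10|c|@0.1" | nc -q0 -u 127.0.0.1 8125
echo "active:99|g"     | nc -q0 -u 127.0.0.1 8125
```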
Fluent Bit will produce the following records:
The systemd input plugin allows you to collect log messages from the Journald daemon in Linux environments.
The plugin supports the following configuration parameters:
In order to receive Systemd messages, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit listen for Systemd messages with the following options:
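For instance (the unit name is illustrative):

```
fluent-bit -i systemd -p systemd_filter=_SYSTEMD_UNIT=docker.service -p tag='host.*' -o stdout
```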
In the example above we are collecting all messages coming from the Docker service.
In your main configuration file append the following Input & Output sections:
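```
[SERVICE]
    Flush 1

[INPUT]
    Name           systemd
    Tag            host.*
    Systemd_Filter _SYSTEMD_UNIT=docker.service

[OUTPUT]
    Name   stdout
    Match  *
```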
Key | Description | Default |
---|---|---|
Listen | Listener network interface. | 0.0.0.0 |
Port | UDP port where listening for connections | 8125 |
Key | Description | Default |
---|---|---|
Path | Optional path to the Systemd journal directory; if not set, the plugin will use default paths to read local-only logs. | |
Max_Fields | Set a maximum number of fields (keys) allowed per record. | 8000 |
Max_Entries | When Fluent Bit starts, the Journal might have a high number of logs in the queue. In order to avoid delays and reduce memory usage, this option allows you to specify the maximum number of log entries that can be processed per round. Once the limit is reached, Fluent Bit will continue processing the remaining log entries once Journald performs the notification. | 5000 |
Systemd_Filter | Allows you to perform a query over logs that contain specific Journald key/value pairs, e.g: _SYSTEMD_UNIT=UNIT. The Systemd_Filter option can be specified multiple times in the input section to apply multiple filters as required. | |
Systemd_Filter_Type | Define the filter type when Systemd_Filter is specified multiple times. Allowed values are And and Or. With And a record is matched only when all of the Systemd_Filter have a match. With Or a record is matched when any of the Systemd_Filter has a match. | Or |
Tag | The tag is used to route messages, but on the Systemd plugin there is extra functionality: if the tag includes a star/wildcard, it will be expanded with the Systemd Unit file name. | |
DB | Specify the absolute path of a database file to keep track of the Journald cursor. | |
DB.Sync | Set a default synchronization (I/O) method. Values: Extra, Full, Normal, Off. This flag affects how the internal SQLite engine synchronizes to disk. Note: this option was introduced in Fluent Bit v1.4.6. | Full |
Read_From_Tail | Start reading new entries. Skip entries already stored in Journald. | Off |
Lowercase | Lowercase the Journald field (key). | Off |
Strip_Underscores | Remove the leading underscore of the Journald field (key). For example the Journald field _PID becomes the key PID. | Off |
The tcp input plugin allows you to retrieve structured JSON or raw messages over a TCP network interface (TCP port).
The plugin supports the following configuration parameters:
In order to receive JSON messages over TCP, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit listen for JSON messages with the following options:
By default the service will listen on all interfaces (0.0.0.0) through TCP port 5170; optionally you can change this directly, e.g:
In the example, JSON messages will only arrive through the network interface at address 192.168.3.2 and TCP port 9090.
In your main configuration file append the following Input & Output sections:
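```
[INPUT]
    Name        tcp
    Listen      0.0.0.0
    Port        5170
    Chunk_Size  32
    Buffer_Size 64
    Format      json

[OUTPUT]
    Name   stdout
    Match  *
```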
Once Fluent Bit is running, you can send some messages using netcat:
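```
echo '{"key 1": 123456789, "key 2": "abcdefg"}' | nc 127.0.0.1 5170
```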
In Fluent Bit we should see the following output:
When receiving payloads in JSON format, there are high performance penalties. Parsing JSON is a very expensive task, so you can expect CPU usage to increase under high-load environments. To get faster data ingestion, consider using the option Format none to avoid JSON parsing when it is not needed.
Key | Description | Default |
---|---|---|
Listen | Listener network interface. | 0.0.0.0 |
Port | TCP port where listening for connections | 5170 |
Buffer_Size | Specify the maximum buffer size in KB to receive a JSON message. If not set, the default size will be the value of Chunk_Size. | |
Chunk_Size | By default the buffer that stores incoming JSON messages does not allocate the maximum memory allowed up front; instead it allocates memory as required. The allocation rounds are set by Chunk_Size in KB. If not set, Chunk_Size is equal to 32 (32KB). | 32 |
Format | Specify the expected payload format. It supports the options json and none. When using json, it expects JSON maps; when set to none, it will split every record using the defined Separator (option below). | json |
Separator | When the expected Format is set to none, Fluent Bit needs a separator string to split the records. By default it uses the line feed character (LF or 0x0A). | |
The syslog input plugin allows you to collect Syslog messages through a Unix socket server (UDP or TCP) or over the network using TCP or UDP.
The plugin supports the following configuration parameters:
When using the syslog input plugin, Fluent Bit requires access to the parsers.conf file; the path to this file can be specified with the option -R or through the Parsers_File key in the [SERVICE] section (more details below).
When udp or unix_udp is used, the buffer size to receive messages is configurable only through the Buffer_Chunk_Size option, which defaults to 32KB.
In order to receive Syslog messages, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit listen for Syslog messages with the following options:
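For example (the parsers path is illustrative):

```
fluent-bit -R /path/to/parsers.conf -i syslog -p path=/tmp/in_syslog -o stdout
```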
By default the service will create and listen for Syslog messages on the Unix socket /tmp/in_syslog.
In your main configuration file append the following Input & Output sections:
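```
[SERVICE]
    Flush        1
    Parsers_File parsers.conf

[INPUT]
    Name  syslog
    Path  /tmp/in_syslog

[OUTPUT]
    Name   stdout
    Match  *
```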
Once Fluent Bit is running, you can send some messages using the logger tool:
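For example (ident and message are illustrative):

```
logger -u /tmp/in_syslog my_ident my_message
```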
In Fluent Bit we should see the following output:
The following content aims to provide configuration examples for different use cases to integrate Fluent Bit and make it listen for Syslog messages from your systems.
Put the following content in your fluent-bit.conf file:
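A sketch of a TCP-mode listener for this use case:

```
[SERVICE]
    Flush        1
    Parsers_File parsers.conf

[INPUT]
    Name   syslog
    Mode   tcp
    Listen 0.0.0.0
    Port   5140

[OUTPUT]
    Name   stdout
    Match  *
```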
then start Fluent Bit.
Add a new file to your rsyslog config rules called 60-fluent-bit.conf inside the directory /etc/rsyslog.d/ and add the following content:
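Assuming the TCP listener above, a forwarding rule might look like:

```
*.* @@127.0.0.1:5140
```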
then make sure to restart your rsyslog daemon:
Put the following content in your fluent-bit.conf file:
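A sketch of a Unix socket listener for this use case (socket path is illustrative):

```
[SERVICE]
    Flush        1
    Parsers_File parsers.conf

[INPUT]
    Name      syslog
    Mode      unix_udp
    Path      /tmp/fluent-bit.sock
    Unix_Perm 0644

[OUTPUT]
    Name   stdout
    Match  *
```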
then start Fluent Bit.
Add a new file to your rsyslog config rules called 60-fluent-bit.conf inside the directory /etc/rsyslog.d/ and place the following content:
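Assuming rsyslog's omuxsock output module, the content might look like:

```
$ModLoad omuxsock
$OMUxSockSocket /tmp/fluent-bit.sock
*.* :omuxsock:
```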
Make sure that the socket file is readable by rsyslog (tweak the Unix_Perm option shown above).
The tail input plugin allows you to monitor one or several text files. Its behavior is similar to the tail -f shell command.
The plugin reads every matched file in the Path pattern and for every new line found (separated by a newline character, \n), it generates a new record. Optionally a database file can be used so the plugin can keep a history of tracked files and a state of offsets; this is very useful to resume a state if the service is restarted.
The plugin supports the following configuration parameters:
Note that if the database parameter DB is not specified, by default the plugin will start reading each target file from the beginning. This might also cause some unwanted behavior; for example, when a line is bigger than Buffer_Chunk_Size and Skip_Long_Lines is not turned on, the file will be read from the beginning on each Refresh_Interval until the file is rotated.
Starting from Fluent Bit v1.8 we have introduced a new Multiline core functionality. For the Tail input plugin, this means that it now supports the old configuration mechanism as well as the new one. In order to avoid breaking changes, we will keep both, but we encourage our users to use the latest one. We will refer to the two mechanisms as:
Multiline Core
Old Multiline
The new multiline core is exposed through the multiline.parser configuration property (see the options table below).
As stated in the Multiline Parser documentation, we now provide built-in configuration modes. Note that when using a new multiline.parser definition, you must disable the old configuration keys in your tail section, such as:
parser
parser_firstline
parser_N
multiline
multiline_flush
docker_mode
If you are running Fluent Bit to process logs coming from containers like Docker or CRI, you can use the new built-in modes for such purposes. This will help to reassembly multiline messages originally split by Docker or CRI:
The two options separated by a comma mean multi-format: try the docker and cri multiline formats, as shown in the sketch below.
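```
[INPUT]
    name              tail
    path              /var/log/containers/*.log
    multiline.parser  docker, cri
```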
We are still working on extending support to do multiline for nested stack traces and such. Over the Fluent Bit v1.8.x release cycle we will be updating the documentation.
For the old multiline configuration, the following options exist to configure the handling of multilines logs:
Docker mode exists to recombine JSON log lines split by the Docker daemon due to its line length limit. To use this feature, configure the tail plugin with the corresponding parser and then enable Docker mode:
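A sketch of such a configuration (the path is illustrative):

```
[SERVICE]
    Parsers_File parsers.conf

[INPUT]
    Name        tail
    Path        /var/log/containers/*.log
    Docker_Mode On
    Parser      docker
```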
In order to tail text or log files, you can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit parse text files with the following options:
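```
fluent-bit -i tail -p path=/var/log/syslog -o stdout
```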
In your main configuration file append the following Input & Output sections. An example visualization can be found here.
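```
[INPUT]
    Name  tail
    Path  /var/log/syslog

[OUTPUT]
    Name   stdout
    Match  *
```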
When using multi-line configuration you need to first specify Multiline On in the configuration and use the Parser_Firstline parameter plus additional parser parameters Parser_N if needed. Suppose we are trying to read the following Java stacktrace as a single event.
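An illustrative stacktrace of that shape:

```
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.main(MyProject.java:6)
```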
We need to specify a Parser_Firstline parameter that matches the first line of a multi-line event. Once a match is made, Fluent Bit will read all subsequent lines until another Parser_Firstline match is made.
In the case above we can use the following parser, which extracts the time as time and the remaining portion of the multiline as log.
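A sketch of such a parser (the name and regex are illustrative):

```
[PARSER]
    Name        multiline
    Format      regex
    Regex       /(?<time>[A-Za-z]+ \d+ \d+\:\d+\:\d+)(?<log>.*)/
    Time_Key    time
    Time_Format %b %d %H:%M:%S
```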
If we want to further parse the entire event, we can add additional parsers with Parser_N, where N is an integer. The final Fluent Bit configuration looks like the following:
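```
# A sketch; the log path and parser name are illustrative
[INPUT]
    Name             tail
    Multiline        On
    Parser_Firstline multiline
    Path             /var/log/java.log

[OUTPUT]
    Name   stdout
    Match  *
```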
Our output will be as follows.
The tail input plugin has a feature to save the state of the tracked files; it is strongly suggested that you enable it. For this purpose the db property is available, e.g:
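```
[INPUT]
    Name  tail
    Path  /var/log/syslog
    DB    /path/to/logs.db
```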
When running, the database file /path/to/logs.db will be created. This database is backed by SQLite3, so if you are interested in exploring the content, you can open it with the SQLite client tool, e.g:
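```
sqlite3 /path/to/logs.db
```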
Make sure to explore the database only when Fluent Bit is not busy writing to it; otherwise you will see Error: database is locked messages.
By default the SQLite client tool does not format the columns in a human-readable way, so to explore the in_tail_files table you can create a config file in ~/.sqliterc with the following content:
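```
.headers on
.mode column
.width 52 7 7 7
```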
Fluent Bit keeps the state, or checkpoint, of each file through a SQLite database file, so if the service is restarted, it can continue consuming files from its last checkpoint position (offset). The default options are set for high performance and corruption safety.
The SQLite journaling mode enabled is Write Ahead Log, or WAL. This improves the performance of read and write operations to disk. When enabled, you will see additional files being created in your file system. Consider the following configuration statement:
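```
[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    DB    test.db
```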
The above configuration enables a database file called test.db, and in the same path as that file SQLite will create two additional files:
test.db-shm
test.db-wal
Those two files support the WAL mechanism, which helps to improve performance and reduce the number of system calls required. The -wal file stores the new changes to be committed; at some point the WAL file transactions are moved back to the real database file. The -shm file is a shared-memory file that allows concurrent users of the WAL file.
The WAL mechanism gives us higher performance but might also increase the memory usage by Fluent Bit. Most of this usage comes from memory-mapped and cached pages. In some cases you might see that memory usage stays a bit high, giving the impression of a memory leak, but this is not relevant unless you want your memory metrics back to normal. Starting from Fluent Bit v1.7.3 we introduced the option db.journal_mode, which sets the journal mode for databases; by default it is WAL (Write-Ahead Logging). The currently allowed values for db.journal_mode are DELETE | TRUNCATE | PERSIST | MEMORY | WAL | OFF.
File rotation is properly handled, including logrotate's copytruncate mode.
Note that the Path patterns cannot match the rotated files; otherwise, the rotated file would be read again and lead to duplicate records.
The winlog input plugin allows you to read Windows Event Log.
The plugin supports the following configuration parameters:
Note that if you do not set db, the plugin will read channels from the beginning on each startup.
Here is a minimum configuration example.
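```
# A sketch; channels and database path are illustrative
[INPUT]
    Name         winlog
    Channels     Setup,Windows PowerShell
    Interval_Sec 1
    DB           winlog.sqlite

[OUTPUT]
    Name   stdout
    Match  *
```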
Note that some Windows Event Log channels (like Security) require admin privileges for reading. In this case, you need to run fluent-bit as an administrator.
If you want to do a quick test, you can run this plugin from the command line.
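```
fluent-bit -i winlog -p 'channels=Setup' -o stdout
```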
The JSON parser is the simplest option: if the original log source is a JSON map string, it will take its structure and convert it directly to the internal binary representation.
A simple configuration that can be found in the default parsers configuration file is the entry to parse Docker log files (when the tail input plugin is used):
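```
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S %z
```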
The following log entry is a valid content for the parser defined above:
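An illustrative entry of that shape:

```
{"key1": 12345, "key2": "abc", "time": "2006-07-28T13:22:04Z"}
```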
After processing, its internal representation will be:
The time has been converted to Unix timestamp (UTC) and the map reduced to each component of the original message.
The thermal input plugin reports system temperatures periodically; the default interval is one second. Currently this plugin is only available for Linux.
The following tables describe the information generated by the plugin.
The plugin supports the following configuration parameters:
In order to get temperature(s) of your system, you can run the plugin from the command line or through the configuration file:
Some systems provide multiple thermal zones. In this example we monitor only thermal_zone0 by name, once per minute.
In your main configuration file append the following Input & Output sections:
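```
[INPUT]
    Name         thermal
    Tag          temperature
    Interval_Sec 60
    name_regex   thermal_zone0

[OUTPUT]
    Name   stdout
    Match  *
```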
The parser engine is fully configurable and can process log entries based on two types of format: JSON maps and regular expressions (named capture).
By default, Fluent Bit provides a set of pre-configured parsers that can be used for different use cases such as logs from:
Apache
Nginx
Docker
Syslog rfc5424
Syslog rfc3164
Parsers are defined in one or multiple configuration files that are loaded at start time, either from the command line or through the main Fluent Bit configuration file.
Multiple parsers can be defined and each section has its own properties. The following table describes the available options for each parser definition:
All parsers must be defined in a parsers.conf file, not in the Fluent Bit global configuration file. The parsers file exposes all available parsers that can be used by the input plugins that are aware of this feature. A parsers file can have multiple entries like this:
For more information about the parsers available, please refer to the default parsers file distributed with Fluent Bit source code:
In addition, we extended our time resolution to support fractional seconds like 2017-05-17T15:44:31**.187512963**Z. Since Fluent Bit v0.12 we have full support for nanoseconds resolution, the %L format option for Time_Format is provided as a way to indicate that content must be interpreted as fractional seconds.
Note: The option %L is only valid when used after seconds (%S) or seconds since the Epoch (%s), e.g: %S.%L or %s.%L.
The regex parser allows you to define a custom Ruby regular expression that uses named capture groups to define which content belongs to which key name.
Note: understanding how regular expressions works is out of the scope of this content.
From a configuration perspective, when the format is set to regex, it is mandatory that a Regex configuration key exists.
The following parser configuration example aims to provide rules that can be applied to an Apache HTTP Server log entry:
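A sketch along the lines of the default apache parser shipped with Fluent Bit:

```
[PARSER]
    Name        apache
    Format      regex
    Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
```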
As an example, take the following Apache HTTP Server log entry:
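An illustrative entry of that shape:

```
192.168.2.10 - - [04/Mar/2017:16:37:41 +0600] "GET /wp-content/uploads/2017/03/go.jpg HTTP/1.0" 200 0 "-" "Mozilla/5.0"
```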
The above content does not provide a defined structure for Fluent Bit, but by enabling the proper parser we can generate a structured representation of it:
A common pitfall is that you cannot use characters other than letters, numbers, and underscores in group names. For example, a group name like (?<user-name>.*) will cause an error because it contains an invalid character (-).
Parsers are an important component of Fluent Bit; with them you can take any unstructured log entry and give it a structure that makes processing and further filtering easier.
Note: If you are using regular expressions, note that Fluent Bit uses Ruby-based regular expressions, and we encourage you to use an online editor to test them.
Time resolution and its supported formats are handled by using the strptime(3) libc system function.
Fluent Bit uses the Onigmo regular expression library in Ruby mode; for testing purposes you can use an online web editor to test your expressions.
Important: do not attempt to add multiline support in your regular expressions if you are using the Tail input plugin, since each line is handled as a separate entity. Instead, use the Tail plugin's multiline support configuration feature.
Security warning: Onigmo is a backtracking regex engine. You need to be careful not to use expensive regex patterns, or Onigmo can take a very long time to perform pattern matching. For details, please read the article on ReDoS on the OWASP site.
In order to understand, learn, and test regular expressions like the example above, we suggest you try a Ruby regular expression editor.
Key | Description | Default |
---|---|---|
Mode | Defines transport protocol mode: unix_udp (UDP over Unix socket), unix_tcp (TCP over Unix socket), tcp or udp | unix_udp |
Listen | If Mode is set to tcp or udp, specify the network interface to bind. | 0.0.0.0 |
Port | If Mode is set to tcp or udp, specify the TCP port to listen for incoming connections. | 5140 |
Path | If Mode is set to unix_tcp or unix_udp, set the absolute path to the Unix socket file. | |
Unix_Perm | If Mode is set to unix_tcp or unix_udp, set the permission of the Unix socket file. | 0644 |
Parser | Specify an alternative parser for the message. If Mode is set to tcp or udp, the default parser is syslog-rfc5424; otherwise syslog-rfc3164-local is used. If your syslog messages have fractional seconds, set this Parser value to syslog-rfc5424 instead. | |
Buffer_Chunk_Size | By default the buffer that stores incoming Syslog messages does not allocate the maximum memory allowed up front; instead it allocates memory as required. The allocation rounds are set by Buffer_Chunk_Size. If not set, Buffer_Chunk_Size is equal to 32000 bytes (32KB). Read the considerations below when using udp or unix_udp mode. | |
Buffer_Max_Size | Specify the maximum buffer size to receive a Syslog message. If not set, the default size will be the value of Buffer_Chunk_Size. | |
Key | Description | Default |
---|---|---|
Buffer_Chunk_Size | Set the initial buffer size to read file data. This value is also used to increase the buffer size. The value must be according to the Unit Size specification. | 32k |
Buffer_Max_Size | Set the limit of the buffer size per monitored file. When a buffer needs to be increased (e.g: very long lines), this value is used to restrict how much the memory buffer can grow. If reading a file exceeds this limit, the file is removed from the monitored file list. The value must be according to the Unit Size specification. | 32k |
Path | Pattern specifying a specific log file or multiple ones through the use of common wildcards. Multiple patterns separated by commas are also allowed. | |
Path_Key | If enabled, it appends the name of the monitored file as part of the record. The value assigned becomes the key in the map. | |
Exclude_Path | Set one or multiple shell patterns separated by commas to exclude files matching certain criteria, e.g: Exclude_Path *.gz,*.zip | |
Offset_Key | If enabled, Fluent Bit appends the offset of the current monitored file as part of the record. The value assigned becomes the key in the map. | |
Read_from_Head | For newly discovered files on start (without a database offset/position), read the content from the head of the file, not the tail. | False |
Refresh_Interval | The interval of refreshing the list of watched files in seconds. | 60 |
Rotate_Wait | Specify the number of extra seconds to monitor a file once it is rotated in case some pending data is flushed. | 5 |
Ignore_Older | Ignores files whose modification date is older than this time in seconds. Supports m, h, d (minutes, hours, days) syntax. | |
Skip_Long_Lines | When a monitored file reaches its buffer capacity due to a very long line (Buffer_Max_Size), the default behavior is to stop monitoring that file. Skip_Long_Lines alters that behavior and instructs Fluent Bit to skip long lines and continue processing other lines that fit into the buffer size. | Off |
Skip_Empty_Lines | Skips empty lines in the log file from any further processing or output. | Off |
DB | Specify the database file to keep track of monitored files and offsets. | |
DB.sync | Set a default synchronization (I/O) method. Values: Extra, Full, Normal, Off. This flag affects how the internal SQLite engine synchronizes to disk; for more details about each option please refer to this section. Most workload scenarios will be fine with normal mode, but if you really need full synchronization after every write operation you should set full mode. Note that full has a high I/O performance cost. | normal |
DB.locking | Specify that the database will be accessed only by Fluent Bit. Enabling this feature helps to increase performance when accessing the database, but it restricts any external tool from querying the content. | false |
DB.journal_mode | Sets the journal mode for databases (WAL). Enabling WAL provides higher performance. Note that WAL is not compatible with shared network file systems. | WAL |
Mem_Buf_Limit | Set a limit of memory that the Tail plugin can use when appending data to the Engine. If the limit is reached, it will be paused; when the data is flushed it resumes. | |
Exit_On_Eof | When reading a file, exit as soon as it reaches the end of the file. Useful for bulk loads and tests. | false |
Parser | Specify the name of a parser to interpret the entry as a structured message. | |
Key | When a message is unstructured (no parser applied), it's appended as a string under the key name log. This option allows you to define an alternative name for that key. | log |
Inotify_Watcher | Set to false to use the file stat watcher instead of inotify. | true |
Tag | Set a tag (with regex-extract fields) that will be placed on lines read, e.g. kube.<namespace_name>.<pod_name>.<container_name>. Note that "tag expansion" is supported: if the tag includes an asterisk (*), that asterisk will be replaced with the absolute path of the monitored file (also see Workflow of Tail + Kubernetes Filter). | |
Tag_Regex | Set a regex to extract fields from the file name, e.g. (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)- | |
Static_Batch_Size | Set the maximum number of bytes to process per iteration for the monitored static files (files that already exist upon Fluent Bit start). | 50M |
Key | Description | Default |
---|---|---|
multiline.parser | Specify one or multiple Multiline Parser definitions to apply to the content. | |
Multiline | If enabled, the plugin will try to discover multiline messages and use the proper parsers to compose the outgoing messages. Note that when this option is enabled, the Parser option is not used. | Off |
Multiline_Flush | Wait period time in seconds to process queued multiline messages. | 4 |
Parser_Firstline | Name of the parser that matches the beginning of a multiline message. Note that the regular expression defined in the parser must include a group name (named capture), and the value of the last match group must be a string. | |
Parser_N | Optional extra parser to interpret and structure multiline entries. This option can be used to define multiple parsers, e.g: Parser_1 ab1, Parser_2 ab2, Parser_N abN. | |
Docker_Mode | If enabled, the plugin will recombine split Docker log lines before passing them to any parser as configured above. This mode cannot be used at the same time as Multiline. | Off |
Docker_Mode_Flush | Wait period time in seconds to flush queued unfinished split lines. | 4 |
Docker_Mode_Parser | Specify an optional parser for the first line of the Docker multiline mode. The parser name must be registered in the parsers.conf file. | |
Key | Description | Default |
---|---|---|
Channels | A comma-separated list of channels to read from. | |
Interval_Sec | Set the polling interval for each channel. (optional) | 1 |
DB | Set the path to save the read offsets. (optional) | |
Name | Description |
---|---|
name | The name of the thermal zone, such as thermal_zone0 |
type | The type of the thermal zone, such as x86_pkg_temp |
temp | Current temperature in Celsius |

Key | Description |
---|---|
Interval_Sec | Polling interval (seconds). default: 1 |
Interval_NSec | Polling interval (nanoseconds). default: 0 |
name_regex | Optional name filter regex. default: None |
type_regex | Optional type filter regex. default: None |
The logfmt parser allows you to parse the logfmt format described in https://brandur.org/logfmt. A more formal description is in https://godoc.org/github.com/kr/logfmt.
Here is an example configuration:
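```
[PARSER]
    Name   logfmt
    Format logfmt
```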
The following log entry is a valid content for the parser defined above:
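An illustrative entry of that shape:

```
key1=val1 key2=val2
```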
After processing, its internal representation will be:
The ltsv parser allows you to parse LTSV-formatted text.
Labeled Tab-separated Values (LTSV) format is a variant of Tab-separated Values (TSV). Each record in an LTSV file is represented as a single line. Each field is separated by a TAB and has a label and a value. The label and the value are separated by ':'.
Here is an example of how to use this format in the Apache access log.
Configure this in httpd.conf:
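A sketch of such a log format definition:

```
LogFormat "host:%h\tident:%l\tuser:%u\ttime:%t\treq:%r\tstatus:%>s\tsize:%b\treferer:%{Referer}i\tua:%{User-Agent}i" combined_ltsv
CustomLog "logs/access_log" combined_ltsv
```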
The parser.conf:
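```
# A sketch; the parser name and Types mapping are illustrative
[PARSER]
    Name        access_log_ltsv
    Format      ltsv
    Time_Key    time
    Time_Format [%d/%b/%Y:%H:%M:%S %z]
    Types       status:integer size:integer
```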
The following log entry is a valid content for the parser defined above:
After processing, its internal representation will be:
The time has been converted to Unix timestamp (UTC).
Key | Description |
---|---|
Name | Set a unique name for the parser in question. |
Format | Specify the format of the parser; the available options are json, regex, ltsv or logfmt. |
Regex | If format is regex, this option must be set specifying the Ruby regular expression that will be used to parse and compose the structured message. |
Time_Key | If the log entry provides a field with a timestamp, this option specifies the name of that field. |
Time_Format | Specify the format of the time field so it can be recognized and analyzed properly. Fluent Bit uses strptime(3) to parse time, so you can refer to the strptime documentation for available modifiers. |
Time_Offset | Specify a fixed UTC time offset (e.g. -0600, +0200, etc.) for local dates. |
Time_Keep | By default when a time key is recognized and parsed, the parser will drop the original time field. Enabling this option will make the parser keep the original time field and its value in the log entry. |
Types | Specify the data type of a parsed field. The syntax is a space-separated list of key:type pairs. |
Decode_Field | Decode a field value; the only decoder available is json. |
There are certain cases where the log messages being parsed contain encoded data. A typical use case can be found in containerized environments with Docker: the application logs its data in JSON format, but it becomes an escaped string. Consider the following example.
Original message generated by the application:
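An illustrative message:

```
{"status": "up and running"}
```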
Then the Docker log message becomes encapsulated as follows:
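```
{"log": "{\"status\": \"up and running\"}\r\n", "stream": "stdout", "time": "2018-03-09T01:01:44.851160855Z"}
```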
As you can see, the original message is handled as an escaped string. Ideally in Fluent Bit we would like to keep the original structured message, not a string.
Decoders are a built-in feature available through the Parsers file; each parser definition can optionally set one or multiple decoders. There are two types of decoders:
Decode_Field: if the content can be decoded into a structured message, append that structured message (keys and values) to the original log message.
Decode_Field_As: any content decoded (unstructured or structured) will be replaced in the same key/value; no extra keys are added.
Our pre-defined Docker parser has the following definition:
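A sketch close to the definition shipped in the default parsers.conf:

```
[PARSER]
    Name            docker
    Format          json
    Time_Key        time
    Time_Format     %Y-%m-%dT%H:%M:%S %z
    Decode_Field_As escaped_utf8 log do_next
    Decode_Field_As escaped      log
```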
Each line in the parser with a key Decode_Field instructs the parser to apply a specific decoder on a given field; optionally, it offers the option to take an extra action if the decoder cannot succeed.
If a decoder fails to decode the field, or you want to try another decoder, it is possible to define an optional action. Available actions are:
Note that actions are affected by some restrictions:
On Decode_Field_As, if successful, another decoder of the same type on the same field can be applied only if the data continues to be an unstructured message (raw text).
On Decode_Field, if successful, it can only be applied once to the same field. By nature, Decode_Field aims to decode a structured message.
Example input (from /path/to/log.log in the configuration below)
Example output
Configuration file
The fluent-bit-parsers.conf file:
The AWS filter enriches logs with AWS metadata. Currently the plugin adds the EC2 instance ID and availability zone to log records. To use this plugin, you must be running in EC2 and have the instance metadata service enabled.
The plugin supports the following configuration parameters:
Note: If you run Fluent Bit in a container, you may have to use instance metadata v1. The plugin behaves the same regardless of which version is used.
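An example filter entry might look like the following sketch (the keys shown are a subset):

```
[FILTER]
    Name            aws
    Match           *
    imds_version    v2
    az              true
    ec2_instance_id true
```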
This plugin looks up whether a value in a specified list exists and then allows the addition of a record to indicate if it was found. Introduced in version 1.8.4.
The plugin supports the following configuration parameters:
In the following configuration we will read a file test1.log that includes the following values.
Additionally, we will use the following lookup file, which contains a list of malicious IPs (ip_list.txt).
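An illustrative lookup file (one value per line):

```
# ip_list.txt
7.7.7.7
```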
In the configuration we are using $remote_addr as the lookup key, and 7.7.7.7 is malicious. This means the record we would output for the last record would look like the following.
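A sketch of such a filter entry (the record key/value is an assumption for illustration):

```
[FILTER]
    name       checklist
    match      *
    file       ip_list.txt
    lookup_key $remote_addr
    record     ioc_detected true
```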
Look up Geo data from IP
The GeoIP2 filter allows you to enrich the incoming data stream using location data from a GeoIP2 database.
This plugin supports the following configuration parameters:
The following configuration will process the incoming remote_addr field and append country information retrieved from the GeoLite2 database.
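A sketch of that configuration:

```
[FILTER]
    Name       geoip2
    Match      *
    Database   GeoLite2-City.mmdb
    Lookup_key remote_addr
    Record     country remote_addr %{country.names.en}
```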
Each Record parameter above specifies the following triplet:
The field name to be added to records (country)
The lookup key to process (remote_addr)
The query for the GeoIP2 database (%{country.names.en})
By running Fluent Bit with the configuration above, you will see the following output:
Note that the GeoLite2-City.mmdb database is available from MaxMind.
Name | Description |
---|---|
json | Handle the field content as a JSON map. If it finds a JSON map, it will replace the content with a structured map. |
escaped | Decode an escaped string. |
escaped_utf8 | Decode a UTF8 escaped string. |
Name | Description |
---|---|
try_next | If the decoder failed, apply the next decoder in the list for the same field. |
do_next | If the decoder succeeded or failed, apply the next decoder in the list for the same field. |
Key | Description | Default |
---|---|---|
imds_version | Specify which version of the instance metadata service to use. Valid values are 'v1' or 'v2'. | v2 |
az | The availability zone; for example, "us-east-1a". | true |
ec2_instance_id | The EC2 instance ID. | true |
ec2_instance_type | The EC2 instance type. | false |
private_ip | The EC2 instance private IP. | false |
ami_id | The EC2 instance image ID. | false |
account_id | The account ID for the current EC2 instance. | false |
hostname | The hostname for the current EC2 instance. | false |
vpc_id | The VPC ID for the current EC2 instance. | false |
Key | Description |
---|---|
file | The single-value file that Fluent Bit will use as a lookup table to determine if the specified lookup_key exists. |
lookup_key | The specific key to look up and determine if it exists; supports record accessor. |
record | The record to add if the lookup_key is found in the specified file. |
Key | Description |
---|---|
database | Path to the GeoIP2 database. |
lookup_key | Field name to process. |
record | Defines the KEY LOOKUP_KEY VALUE triplet to be added to records (see the example below); can be specified multiple times. |
Made for testing: make sure that your records contain the expected key and values
The expect filter plugin allows you to validate that records match certain criteria in their structure, such as validating that a key exists or that it has a specific value.
This page describes only the available configuration properties; for a detailed explanation of its usage and use cases, please refer to the following page:
The plugin supports the following configuration parameters:
As mentioned on top, refer to the following page for specific details of usage of this filter:
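As an illustration, a minimal sketch of an expect filter entry:

```
[FILTER]
    Name       expect
    Match      *
    key_exists key1
    action     exit
```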
Select or exclude records per patterns
The Grep Filter plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.
The plugin supports the following configuration parameters:
In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called lines.txt with the following content:
Note: using the command-line mode requires special attention to quote the regular expressions properly. It's suggested to use a configuration file.
The following command will load the tail plugin and read the content of the lines.txt file. Then the grep filter will apply a regular expression rule over the log field (created by the tail plugin) and only pass records whose field value starts with aa:
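```
fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```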
The filter allows you to use multiple rules, which are applied in order; you can have as many Regex and Exclude entries as required.
If you want to exclude records that match a given nested field (for example kubernetes.labels.app), you can use the following rule:
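A sketch of such a rule (the label value is illustrative):

```
[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['labels']['app'] myapp
```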
Due to the necessity of having a flexible filtering mechanism, it is now possible to extend Fluent Bit capabilities by writing custom filters using the Lua programming language. A Lua-based filter requires two steps:
Configure the Filter in the main configuration
Prepare a Lua script that will be used by the Filter
The plugin supports the following configuration parameters:
From the command line you can use the following options:
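For example (script and function names are illustrative):

```
fluent-bit -i dummy -F lua -p script=test.lua -p call=cb_print -m '*' -o stdout
```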
In your main configuration file append the following Input, Filter & Output sections:
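```
# A sketch; script and function names are illustrative
[INPUT]
    Name   dummy

[FILTER]
    Name   lua
    Match  *
    script test.lua
    call   cb_print

[OUTPUT]
    Name   stdout
    Match  *
```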
The life cycle of the filter has the following steps:
Upon Tag matching by this filter, it may process or bypass the record.
If the tag matched, it will accept the record and invoke the function defined in the call property, which is the name of a function defined in the Lua script.
Invoke Lua function and pass each record in JSON format.
Upon return, validate return value and continue the pipeline.
The Lua script can have one or multiple callbacks that can be used by this filter. The function prototype is as follows:
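A sketch of that prototype (the function name is illustrative):

```lua
function cb_print(tag, timestamp, record)
    -- return code 0: keep the record unmodified
    return 0, timestamp, record
end
```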
Each callback must return three values:
For functional examples of this interface, please refer to the code samples provided in the source code of the project located here:
Lua treats numbers as double. This means an integer field (e.g. IDs, log levels) will be converted to double. To avoid this type conversion, the type_int_key property is available.
The Lua callback function can return an array of tables (i.e., array of records) in its third record return value. With this feature, the Lua filter can split one input record into multiple records according to custom logic.
For example:
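A sketch of a callback that splits a record (the key name x is illustrative):

```lua
function cb_split(tag, timestamp, record)
    if record.x ~= nil then
        -- return the array stored under key x as multiple records
        return 2, timestamp, record.x
    else
        return 2, timestamp, record
    end
end
```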
This plugin enables the Record Accessor feature to specify the KEY. Using the record accessor is suggested if you want to match values against nested values.
If you want to match or exclude records based on nested values, you can use a Record Accessor format as the KEY name. Consider the following record example:
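An illustrative record of that shape:

```
{
    "log": "something",
    "kubernetes": {
        "pod_name": "myapp-0",
        "labels": {
            "app": "myapp"
        }
    }
}
```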
The Lua filter allows you to modify the incoming records (even split one record into multiple records) using custom scripts.
In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples use the dummy input plugin for data ingestion, invoke the Lua filter using the test.lua script, and call the cb_print() function, which only prints the same information to the standard output:
Fluent Bit supports protected mode to prevent crashes when executing an invalid Lua script (see the protected_mode option above).
Property | Description |
---|---|
key_exists | Check if a key with a given name exists in the record. |
key_not_exists | Check if a key does not exist in the record. |
key_val_is_null | Check that the value of the key is NULL. |
key_val_is_not_null | Check that the value of the key is NOT NULL. |
key_val_eq | Check that the value of the key equals the given value in the configuration. |
action | Action to take when a rule does not match. The available options are warn or exit. On warn, a warning message is sent to the logging layer when a mismatch of the rules above is found; using exit makes Fluent Bit abort with status code 255. |
Key | Value Format | Description |
---|---|---|
Regex | KEY REGEX | Keep records in which the content of KEY matches the regular expression. |
Exclude | KEY REGEX | Exclude records in which the content of KEY matches the regular expression. |
Key | Description |
---|---|
script | Path to the Lua script that will be used. This can be a relative path against the main configuration file. |
call | Lua function name that will be triggered to do filtering. It's assumed that the function is declared inside the script parameter defined above. |
type_int_key | If these keys are matched, the fields are converted to integer. If more than one key, delimit by space. Note that starting from Fluent Bit v1.6 integer data types are preserved and not converted to double as in previous versions. |
type_array_key | If these keys are matched, the fields are handled as an array. If more than one key, delimit by space. It is useful when the array can be empty. |
protected_mode | If enabled, the Lua script will be executed in protected mode. It prevents Fluent Bit from crashing when an invalid Lua script is executed or the triggered Lua function throws exceptions. Default is true. |
time_as_table | By default when the Lua script is invoked, the record timestamp is passed as a floating point number, which might lead to precision loss when it is converted back. If you desire timestamp precision, enabling this option will pass the timestamp as a Lua table with the seconds and nanoseconds as separate keys. |
name | description |
---|---|
tag | Name of the tag associated with the incoming record. |
timestamp | Unix timestamp with nanoseconds associated with the incoming record. The original format is a double (seconds.nanoseconds). |
record | Lua table with the record content. |
Name | Data type | Description |
---|---|---|
code | integer | The code return value represents the result and the further action that may follow. If code equals -1, the record will be dropped. If code equals 0, the record will not be modified. If code equals 1, the original timestamp and record have been modified, so they must be replaced by the returned values from timestamp (second return value) and record (third return value). If code equals 2, the original timestamp is not modified and the record has been modified, so it must be replaced by the returned value from record (third return value). Code 2 is supported from v1.4.3. |
timestamp | double | If code equals 1, the original record timestamp will be replaced with this new value. |
record | table | If code equals 1, the original record information will be replaced with this new value. Note that the record value must be a valid Lua table. This value can be an array of tables (i.e., an array of objects in JSON format), and in that case the input record is effectively split into multiple records (see above for more details). |
The Record Modifier filter plugin allows you to append fields to a record or to exclude specific fields.
The plugin supports the following configuration parameters. Note that Remove_key and Allowlist_key are mutually exclusive.
In order to start filtering records, you can run the filter from the command line or through the configuration file.
This is a sample in_mem record to filter.
The following configuration file appends a product name and the hostname (via an environment variable) to the record, as sketched below.
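A sketch along these lines (the product name is illustrative):

```
[INPUT]
    Name  mem
    Tag   mem.local

[FILTER]
    Name    record_modifier
    Match   *
    # Append the host name from the environment and a static product name
    Record  hostname ${HOSTNAME}
    Record  product Awesome_Tool

[OUTPUT]
    Name   stdout
    Match  *
```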
You can also run the filter from command line.
The output will be
The following configuration file is to remove 'Swap.*' fields.
You can also run the filter from command line.
The output will be
The following configuration file keeps only the 'Mem.*' fields.
You can also run the filter from command line.
The output will be
The Parser Filter plugin allows for parsing fields in event records.
The plugin supports the following configuration parameters:
This is an example of parsing a record {"data":"100 0.5 true This is example"}.
The plugin needs a parser file which defines how to parse each field.
The path of the parser file should be written in the configuration file under the [SERVICE] section, as in the sketch below.
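A sketch of a matching parser and configuration (the parser name dummy_test and the file names are illustrative):

```
# parsers.conf: split the "data" field into typed sub-fields
[PARSER]
    Name    dummy_test
    Format  regex
    Regex   ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$

# fluent-bit.conf
[SERVICE]
    Parsers_File  parsers.conf

[INPUT]
    Name   dummy
    Tag    dummy.data
    Dummy  {"data":"100 0.5 true This is example"}

[FILTER]
    Name      parser
    Match     dummy.*
    Key_Name  data
    Parser    dummy_test

[OUTPUT]
    Name   stdout
    Match  *
```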
The output is
You can see the record {"data":"100 0.5 true This is example"} is parsed.
By default, the parser plugin only keeps the parsed fields in its output.
If you enable Reserve_Data, all other fields are preserved:
This will produce the output:
If you enable Reserve_Data and Preserve_Key, the original key field will be preserved as well:
This will produce the following output:
The Modify Filter plugin allows you to change records using rules and conditions.
As an example using JSON notation, to:
Rename Key2 to RenamedKey
Add a key OtherKey with value Value3 if OtherKey does not yet exist
Example (input)
Example (output)
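A filter section implementing those two rules might look like this sketch:

```
[FILTER]
    Name    modify
    Match   *
    # Rename Key2 to RenamedKey
    Rename  Key2 RenamedKey
    # Add OtherKey with value Value3 only if it does not yet exist
    Add     OtherKey Value3
```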
The plugin supports the following rules:
Rules are case insensitive, parameters are not
Any number of rules can be set in a filter instance.
Rules are applied in the order they appear, with each rule operating on the result of the previous rule.
The plugin supports the following conditions:
Conditions are case insensitive, parameters are not
Any number of conditions can be set.
Conditions apply to the whole filter instance and all its rules, not to individual rules.
All conditions have to be true for the rules to be applied.
You can set the Record Accessor format as STRING:KEY for nested keys.
In order to start filtering records, you can run the filter from the command line or through the configuration file. The following invokes the Memory Usage input plugin, which outputs data like the following (example):
Note: using command line mode requires quotes so that the wildcard is parsed properly. The use of a configuration file is recommended.
The output of both the command line and configuration invocations should be identical and result in the following output.
The Fluent Bit Kubernetes filter allows you to enrich your log files with Kubernetes metadata.
When Fluent Bit is deployed in Kubernetes as a DaemonSet and configured to read the log files from the containers (using tail or systemd input plugins), this filter aims to perform the following operations:
Analyze the Tag and extract the following metadata:
Pod Name
Namespace
Container Name
Container ID
Query the Kubernetes API Server to obtain extra metadata for the Pod in question:
Pod ID
Labels
Annotations
The data is cached locally in memory and appended to each record.
The plugin supports the following configuration parameters:
Kubernetes Filter aims to provide several ways to process the data contained in the log key. The following explanation of the workflow assumes that your original Docker parser defined in parsers.conf is as follows:
Since Fluent Bit v1.2, we do not suggest the use of decoders (Decode_Field_As) if you are using Elasticsearch in the output, to avoid data type conflicts.
To perform processing of the log key, it's mandatory to enable the Merge_Log configuration property in this filter, then the following processing order will be done:
If a Pod suggests a parser, the filter will use that parser to process the content of log.
If the option Merge_Parser was set and the Pod did not suggest a parser, process the log content using the parser suggested in the configuration.
If the Pod did not suggest a parser and Merge_Parser is not set, try to handle the content as JSON.
If log value processing fails, the value is left untouched. The order above is not chained; it is exclusive, and the filter will try only one of the options above, not all of them.
A flexible feature of the Fluent Bit Kubernetes filter is that it allows Kubernetes Pods to suggest certain behaviors for the log processor pipeline when processing the records. At the moment it supports:
Suggest a pre-defined parser
Request to exclude logs
The following annotations are available:
The following Pod definition runs a Pod that emits Apache logs to the standard output; in its annotations it suggests that the data should be processed using the pre-defined parser called apache, as in the sketch below:
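A sketch of such a Pod definition (the container image is illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apache-logs
  labels:
    app: apache-logs
  annotations:
    fluentbit.io/parser: apache
spec:
  containers:
    - name: apache
      image: edsiper/apache_logs
```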
There are certain situations where the user would like to request that the log processor simply skip the logs from the Pod in question:
Note that the annotation value is a boolean, which can be true or false, and must be quoted.
The Kubernetes filter depends on either the Tail or Systemd input plugins to process and enrich records with Kubernetes metadata. Here we will explain the workflow of Tail and how its configuration is correlated with the Kubernetes filter. Consider the following configuration example (just for demo purposes, not production):
In the input section, the Tail plugin will monitor all files ending in .log in path /var/log/containers/. For every file it will read every line and apply the docker parser. Then the records are emitted to the next step with an expanded tag.
Tail supports Tag expansion, which means that if a tag has a star character (*), it will replace the value with the absolute path of the monitored file, so if your file name and path is:
then the Tag for every record of that file becomes:
Note that slashes are replaced with dots.
When the Kubernetes filter runs, it will try to match all records that start with kube. (note the ending dot), so records from the file mentioned above will hit the matching rule and the filter will try to enrich the records.
The Kubernetes filter does not care where the logs come from, but it does care about the absolute name of the monitored file, because that information contains the Pod name and namespace name that are used to retrieve the associated metadata for the running Pod from the Kubernetes API server.
If you have large Pod specifications (which can be caused by large numbers of environment variables, etc.), be sure to increase the Buffer_Size parameter of the Kubernetes filter. If object sizes exceed this buffer, some metadata will fail to be injected into the logs.
If the configuration property Kube_Tag_Prefix was configured (available on Fluent Bit >= 1.1.x), it will use that value to remove the prefix that was appended to the Tag in the previous Input section. Note that the configuration property defaults to kube.var.log.containers., so the previous Tag content will be transformed from:
to:
The transformation above does not modify the original Tag; it just creates a new representation for the filter to perform metadata lookup.
That new value is used by the filter to look up the Pod name and namespace. For that purpose it uses an internal regular expression:
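At the time of writing, that expression looks roughly like the following (a sketch; check the source code linked below for the authoritative version):

```
(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
```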
If you want to know more details, check the source code of that definition here.
You can see how this operation is performed on the Rubular.com web site; check the following demo link:
Under certain uncommon conditions, a user might want to alter that hard-coded regular expression; for that purpose, the option Regex_Parser can be used (documented above).
At this point the filter is able to gather the values of pod_name and namespace. With that information, it checks in the local cache (an internal hash table) whether metadata for that key pair exists. If so, it enriches the record with the metadata value; otherwise it connects to the Kubernetes API server to retrieve that information.
There is a reported issue where the kube-apiserver can fall over and become unresponsive when the cluster is too large and too many requests are sent to it. With this feature, the Fluent Bit Kubernetes filter sends requests to the kubelet /pods endpoint instead of the kube-apiserver to retrieve Pod information and uses it to enrich the logs. Since the kubelet runs locally on each node, requests are answered faster and each node only receives one request. This saves kube-apiserver capacity to handle other requests. When this feature is enabled, you should see no difference in the Kubernetes metadata added to logs, but the kube-apiserver bottleneck is avoided when the cluster is large.
Some configuration setup is needed for this feature.
Role Configuration for Fluent Bit DaemonSet Example:
The difference is that the kubelet needs a special permission for the resource nodes/proxy to accept HTTP requests. When creating the role or clusterRole, you need to add nodes/proxy into the rules for resources.
Fluent Bit Configuration Example:
In the Fluent Bit configuration, you need to set Use_Kubelet to true to enable this feature, as in the sketch below.
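A sketch of the filter section (values illustrative):

```
[FILTER]
    Name          kubernetes
    Match         kube.*
    Kube_URL      https://kubernetes.default.svc:443
    Use_Kubelet   true
    Kubelet_Port  10250
```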
DaemonSet config Example:
The key point is to set hostNetwork to true and dnsPolicy to ClusterFirstWithHostNet so that the Fluent Bit DaemonSet can call the kubelet locally; otherwise it cannot resolve DNS for the kubelet.
Now you are ready to use this new feature. You should see no difference in your experience of enriching your log files with Kubernetes metadata.
To check if Fluent Bit is using the kubelet, check the Fluent Bit logs; there should be a line like this:
And if you are in debug mode, you will see more:
The following section goes over specific log messages you may run into and how to solve them, to ensure that Fluent Bit's Kubernetes filter is operating properly.
If you are not seeing metadata added to your kubernetes logs and see the following in your log message, then you may be facing connectivity issues with the Kubernetes API server.
Potential fix #1: Check Kubernetes roles
When Fluent Bit is deployed as a DaemonSet it generally runs with specific roles that allow the application to talk to the Kubernetes API server. If you are deployed in a more restricted environment check that all the Kubernetes roles are set correctly.
You can test this by running the following command (replace fluentbit-system with the namespace where your Fluent Bit is installed):
If the roles are configured correctly, it should simply respond with yes.
For instance, using Azure AKS, running the above command may respond with:
If you have connectivity to the API server but still see "could not get meta for POD", debug logging might show a message like Azure does not have opinion for this user. In that case, the following subject may need to be appended to the subjects array of the fluentbit ClusterRoleBinding:
Potential fix #2: Check Kubernetes IPv6
There may be cases where IPv6 is enabled in the environment and you need to enable it within Fluent Bit as well. Under the service section, set the option ipv6 to on.
Potential fix #3: Check connectivity to Kube_URL
By default the Kube_URL is set to https://kubernetes.default.svc:443. Ensure that you have connectivity to this endpoint from within the cluster and that there are no special permissions interfering with the connection.
In some cases, you may see only some objects being appended with metadata while other objects are not enriched. This can occur when local data is cached and does not contain the correct ID for the Kubernetes object that requires enrichment. For most Kubernetes objects, updates to the Kubernetes API server are then reflected in the Fluent Bit logs; however, in some cases for Pod objects this refresh to the Kubernetes API server can be skipped, causing metadata to be missed.
Concatenate Multiline or Stack trace log messages. Available on Fluent Bit >= v1.8.2.
The Multiline Filter helps to concatenate messages that originally belong to one context but were split across multiple records or log lines. Common examples are stack traces or applications that print logs in multiple lines.
As part of the built-in functionality, without major configuration effort, you can enable one of our built-in parsers, with auto-detection and multi-format support:
go
python
ruby
java (Google Cloud Platform Java stacktrace format)
Some comments about this filter:
This filter does not perform buffering that persists across different chunks. It processes one chunk at a time, and is not suitable for sources that might send multiline messages in separate chunks.
For cases where multiline mode is required and the source plugin does not support it, please file a GitHub enhancement request with the requirement and specific details of the use case.
The plugin supports the following configuration parameters:
The following example aims to parse a log file called test.log
that contains some full lines, a custom Java stacktrace and a Go stacktrace.
Example files content:
This is the primary Fluent Bit configuration file. It includes the parsers_multiline.conf
and tails the file test.log
by applying the multiline parsers multiline-regex-test
and go
. Then it sends the processing to the standard output.
This second file defines a multiline parser for the example, as sketched below. Note that a second multiline parser called go is used in fluent-bit.conf, but that one is a built-in parser.
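A sketch of such a definition (the rules below match the shape of the example log; adjust the regexes to your data):

```
[MULTILINE_PARSER]
    name           multiline-regex-test
    type           regex
    flush_timeout  1000
    # rules: state name | regex pattern | next state
    # a line starting with a date opens a multiline block...
    rule  "start_state"  "/(Dec \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
    # ...continuation lines (e.g. stack frames) attach to it
    rule  "cont"         "/^\s+at.*/"                     "cont"
```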
An example file with multiline and multiformat content:
By running Fluent Bit with the given configuration file you will obtain:
The lines that did not match a pattern are not considered as part of the multiline message, while the ones that matched the rules were concatenated properly.
Powerful and flexible routing
The rewrite_tag filter allows you to re-emit a record under a new Tag. Once a record has been re-emitted, the original record can be preserved or discarded.
It works by defining rules that match specific record key content against a regular expression. If a match exists, a new record with the defined Tag is emitted, entering from the beginning of the pipeline. Multiple rules can be specified, and they are processed in order until one of them matches.
The new Tag to define can be composed of:
Alphabetic characters and numbers
The original Tag string or part of it
Regular expression group captures
Any key or sub-key of the processed record
Environment variables
The rewrite_tag filter supports the following configuration parameters:
A rule aims to define matching criteria and specify how to create a new Tag for a record. You can define one or multiple rules in the same configuration section. The rules have the following format:
The key represents the name of the record key that holds the value we want to match against the regular expression. A key name is specified and prefixed with a $. Consider the following structured record (formatted for readability):
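```
{
  "name": "abc-123",
  "ss": {
    "s1": {
      "s2": "flb"
    }
  }
}
```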
If we wanted to match against the value of the key name, we must use $name. The key selector is flexible enough to allow matching nested levels of sub-maps in the structure. If we wanted to check the value of the nested key s2, we can do it by specifying $ss['s1']['s2']. In short:
$name
= "abc-123"
$ss['s1']['s2']
= "flb"
Note that a key must point to a value that contains a string; it's not valid for numbers, booleans, maps or arrays.
Using a simple regular expression, we can specify a matching pattern to use against the value of the key specified above; we can also take advantage of group capturing to create custom placeholder values.
If we wanted to match any record whose $name contains a value of the format string-number, like the example provided above, we might use a pattern like the one sketched below.
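For instance, a pattern along these lines (an illustrative sketch; the exact expression depends on your data) captures the two parts of the value:

```
^([a-z]+)-([0-9]+)$
```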
Note that in our example we are using parentheses; this means that we are specifying groups of data. If the pattern matches the value, a placeholder will be created that can be consumed by the NEW_TAG section.
If $name
equals abc-123
, then the following placeholders will be created:
$0
= "abc-123"
$1
= "abc"
$2
= "123"
If the regular expression does not match an incoming record, the rule will be skipped and the next rule (if any) will be processed.
If a regular expression has matched the value of the defined key in the rule, we are ready to compose a new Tag for that specific record. The Tag is a concatenated string that can contain any of the following characters: a-z, A-Z, 0-9 and .-,.
A Tag can take any string value from the matching record, the original Tag itself, an environment variable, or a general placeholder.
Consider the following incoming data on the rule:
Tag = aa.bb.cc
Record = {"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}
Environment variable $HOSTNAME = fluent
With this information we can create a very custom Tag for our record, making use of placeholders, record content, and environment variables.
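An illustrative composition (a sketch using the placeholders defined above, not the only possibility):

```
newtag.$TAG[1].$1.$ss['s1']['s2'].$HOSTNAME
```

With the sample record and environment above, the generated Tag would be newtag.bb.abc.flb.fluent.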
If a rule matches, the filter will emit a copy of the record with the newly defined Tag. The property keep takes a boolean value to define whether the original record with the old Tag must be preserved and continue in the pipeline, or just be discarded.
You can use true or false to decide the expected behavior. There is no default value and this is a mandatory field in the rule.
The following configuration example will emit a dummy (hand-crafted) record; the filter will rewrite the tag, discard the old record, and print the new record to the standard output interface:
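A configuration sketch consistent with the rewritten tag shown below (the dummy record content is illustrative):

```
[INPUT]
    Name   dummy
    Dummy  {"tool": "fluent", "sub": {"s1": {"s2": "bit"}}}
    Tag    test_tag

[FILTER]
    Name          rewrite_tag
    Match         test_tag
    Rule          $tool ^(fluent)$ from.$TAG.new.$tool.$sub['s1']['s2'].out false
    Emitter_Name  re_emitted

[OUTPUT]
    Name   stdout
    Match  from.*
```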
The original tag test_tag will be rewritten as from.test_tag.new.fluent.bit.out:
Since rewrite_tag emits new records that go through the beginning of the pipeline, it exposes an additional metric called emit_records that summarizes the total number of emitted records.
Using the configuration provided above, if we query the metrics exposed in the HTTP interface we will see the following:
Command:
Metrics output:
The dummy input generated two records, the filter dropped two from the chunks and emitted two new ones under a different Tag.
The records generated are handled by the internal Emitter, so the new records are summarized in the Emitter metrics; take a look at the entry called emitter_for_rewrite_tag.0.
The Emitter is an internal Fluent Bit plugin that allows other components of the pipeline to emit custom records. In this case, rewrite_tag creates an Emitter instance and uses it exclusively to emit records; that way we have granular control over who is emitting what.
The Emitter name in the metrics can be changed by setting the Emitter_Name configuration property described above.
The Nest Filter plugin allows you to operate on or with nested data. Its modes of operation are
nest
- Take a set of records and place them in a map
lift
- Take a map by key and lift its records up
As an example using JSON notation, to nest keys matching the Wildcard value Key* under a new key NestKey, the transformation becomes:
Example (input)
Example (output)
As an example using JSON notation, to lift keys nested under the Nested_under value NestKey*, the transformation becomes:
Example (input)
Example (output)
The plugin supports the following configuration parameters:
Note: using command line mode requires quotes so that the wildcard is parsed properly. The use of a configuration file is recommended.
The following command will load the mem plugin. Then the nest filter will match the wildcard rule to the keys and nest the keys matching Mem.* under the new key NEST.
The output of both the command line and configuration invocations should be identical and result in the following output.
This example nests all Mem.* and Swap.* items under the Stats key and then reverses these actions with a lift operation. The output appears unchanged.
This example takes the keys starting with Mem.* and nests them under LAYER1, which itself is then nested under LAYER2, which is nested under LAYER3.
This example starts with the 3-level deep nesting of Example 2 and applies the lift filter three times to reverse the operations. The end result is that all records are at the top level, without nesting, again. One prefix is added for each level that is lifted.
The usage of this filter depends on a previous configuration of a multiline parser definition.
If you aim to concatenate messages split originally by Docker or CRI container engines, we recommend doing the concatenation in the Tail input plugin; this same functionality exists there.
The following example files can be located at:
Tags are what make routing possible. Tags are set in the configuration of the Input definitions where the records are generated, but there are certain scenarios where it might be useful to modify the Tag in the pipeline so we can perform more advanced and flexible routing.
As described in the Monitoring section, every component of the Fluent Bit pipeline exposes metrics. The basic metrics exposed by this filter are drop_records and add_records; they summarize the total of dropped records from the incoming data chunk and the new records added.
In order to start filtering records, you can run the filter from the command line or through the configuration file. The following invokes the Memory Usage input plugin, which outputs data like the following (example):
Key | Description |
---|---|
Record | Append fields. This parameter needs a key/value pair. |
Remove_key | If the key is matched, that field is removed. |
Allowlist_key | If the key is not matched, that field is removed. |
Whitelist_key | An alias of Allowlist_key for backwards compatibility. |
Key | Description | Default |
---|---|---|
Key_Name | Specify the field name in the record to parse. | |
Parser | Specify the parser name to interpret the field. Multiple Parser entries are allowed (one per line). | |
Preserve_Key | Keep the original Key_Name field in the parsed result. If false, the field will be removed. | False |
Reserve_Data | Keep all other original fields in the parsed result. If false, all other original fields will be removed. | False |
Unescape_Key | If the key is an escaped string (e.g. stringified JSON), unescape the string before applying the parser. | False |
Operation | Parameter 1 | Parameter 2 | Description |
---|---|---|---|
Set | STRING:KEY | STRING:VALUE | Add a key/value pair with key KEY and value VALUE. If KEY already exists, this field is overwritten |
Add | STRING:KEY | STRING:VALUE | Add a key/value pair with key KEY and value VALUE if KEY does not exist |
Remove | STRING:KEY | NONE | Remove a key/value pair with key KEY if it exists |
Remove_wildcard | WILDCARD:KEY | NONE | Remove all key/value pairs with key matching wildcard KEY |
Remove_regex | REGEXP:KEY | NONE | Remove all key/value pairs with key matching regexp KEY |
Rename | STRING:KEY | STRING:RENAMED_KEY | Rename a key/value pair with key KEY to RENAMED_KEY if KEY exists AND RENAMED_KEY does not exist |
Hard_rename | STRING:KEY | STRING:RENAMED_KEY | Rename a key/value pair with key KEY to RENAMED_KEY if KEY exists. If RENAMED_KEY already exists, this field is overwritten |
Copy | STRING:KEY | STRING:COPIED_KEY | Copy a key/value pair with key KEY to COPIED_KEY if KEY exists AND COPIED_KEY does not exist |
Hard_copy | STRING:KEY | STRING:COPIED_KEY | Copy a key/value pair with key KEY to COPIED_KEY if KEY exists. If COPIED_KEY already exists, this field is overwritten |
Condition | Parameter | Parameter 2 | Description |
---|---|---|---|
Key_exists | STRING:KEY | NONE | Is true if KEY exists |
Key_does_not_exist | STRING:KEY | NONE | Is true if KEY does not exist |
A_key_matches | REGEXP:KEY | NONE | Is true if a key matches regex KEY |
No_key_matches | REGEXP:KEY | NONE | Is true if no key matches regex KEY |
Key_value_equals | STRING:KEY | STRING:VALUE | Is true if KEY exists and its value is VALUE |
Key_value_does_not_equal | STRING:KEY | STRING:VALUE | Is true if KEY exists and its value is not VALUE |
Key_value_matches | STRING:KEY | REGEXP:VALUE | Is true if key KEY exists and its value matches VALUE |
Key_value_does_not_match | STRING:KEY | REGEXP:VALUE | Is true if key KEY exists and its value does not match VALUE |
Matching_keys_have_matching_values | REGEXP:KEY | REGEXP:VALUE | Is true if all keys matching KEY have values that match VALUE |
Matching_keys_do_not_have_matching_values | REGEXP:KEY | REGEXP:VALUE | Is true if all keys matching KEY have values that do not match VALUE |
Key | Description | Default |
---|---|---|
Buffer_Size | Set the buffer size for the HTTP client when reading responses from the Kubernetes API server. The value must conform to the Unit Size specification. A value of 0 results in no limit, and the buffer will expand as needed. Note that if pod specifications exceed the buffer limit, the API response will be discarded when retrieving metadata, and some Kubernetes metadata will fail to be injected into the logs. | 32k |
Kube_URL | API Server end-point | https://kubernetes.default.svc:443 |
Kube_CA_File | CA certificate file | /var/run/secrets/kubernetes.io/serviceaccount/ca.crt |
Kube_CA_Path | Absolute path to scan for certificate files | |
Kube_Token_File | Token file | /var/run/secrets/kubernetes.io/serviceaccount/token |
Kube_Tag_Prefix | When the source records come from the Tail input plugin, this option specifies the prefix used in the Tail configuration. | kube.var.log.containers. |
Merge_Log | When enabled, check if the log field content is a JSON string map; if so, append the map fields as part of the log structure. | Off |
Merge_Log_Key | When Merge_Log is enabled, the filter assumes the log field from the incoming message is a JSON string and makes a structured representation of it at the same level as the log field in the map. If Merge_Log_Key is set (a string name), all the new structured fields taken from the original log content are inserted under the new key instead. | |
Merge_Log_Trim | When Merge_Log is enabled, trim (remove possible \n or \r) field values. | On |
Merge_Parser | Optional parser name to specify how to parse the data contained in the log key. Recommended for developers or testing only. | |
Keep_Log | When Keep_Log is disabled, the log field is removed from the incoming message once it has been successfully merged (Merge_Log must be enabled as well). | On |
tls.debug | Debug level between 0 (nothing) and 4 (every detail). | -1 |
tls.verify | When enabled, turns on certificate validation when connecting to the Kubernetes API server. | On |
Use_Journal | When enabled, the filter reads logs coming in Journald format. | Off |
Cache_Use_Docker_Id | When enabled, metadata will be fetched from K8s when docker_id is changed. | Off |
Regex_Parser | Set an alternative Parser to process the record Tag and extract pod_name, namespace_name, container_name and docker_id. The parser must be registered in a parsers file (refer to parser filter-kube-test as an example). | |
K8S-Logging.Parser | Allow Kubernetes Pods to suggest a pre-defined Parser (read more about it in the Kubernetes Annotations section). | Off |
K8S-Logging.Exclude | Allow Kubernetes Pods to exclude their logs from the log processor (read more about it in the Kubernetes Annotations section). | Off |
Labels | Include Kubernetes resource labels in the extra metadata. | On |
Annotations | Include Kubernetes resource annotations in the extra metadata. | On |
Kube_meta_preload_cache_dir | If set, Kubernetes metadata can be cached/pre-loaded from files in JSON format in this directory, named as namespace-pod.meta | |
Dummy_Meta | If set, use dummy-meta data (for test/dev purposes) | Off |
DNS_Retries | DNS lookup retries N times until the network starts working | 6 |
DNS_Wait_Time | DNS lookup interval between network status checks | 30 |
Use_Kubelet | Optional feature flag to get metadata information from the kubelet instead of calling the Kube Server API, to enhance the logs. This can mitigate the Kube API heavy traffic issue for large clusters. | Off |
Kubelet_Port | Kubelet port to use for HTTP requests; this only works when Use_Kubelet is set to On. | 10250 |
Kube_Meta_Cache_TTL | Configurable TTL for K8s cached metadata. By default it is set to 0, which means the TTL for cache entries is disabled and cache entries are evicted at random when capacity is reached. To enable this option, set the number to a time interval; for example, set the value to 60 or 60s so that cache entries created more than 60s ago are evicted. | 0 |
Kube_Token_Command | Command to get the Kubernetes authorization token. By default it is NULL and the token file will be used. If you want to manually choose a command to get the token, you can set it here; for example, run aws-iam-authenticator -i your-cluster-name token --token-only to set the token. This option is currently Linux-only. | |
Annotation | Description | Default |
---|---|---|
fluentbit.io/parser[_stream][-container] | Suggest a pre-defined parser. The parser must be registered already in Fluent Bit. This option is only processed if the Fluent Bit configuration (Kubernetes filter) has enabled the K8S-Logging.Parser option. If the stream (stdout or stderr) is present, the annotation is restricted to that specific stream; if the container is present, it can override a specific container in a Pod. | |
fluentbit.io/exclude[_stream][-container] | Request Fluent Bit to exclude (or not) the logs generated by the Pod. This option is only processed if the Fluent Bit configuration (Kubernetes filter) has enabled the K8S-Logging.Exclude option. | False |
Key | Description |
---|---|
Rule | Defines the matching criteria and the format of the Tag for the matching record. The Rule format has four components: KEY REGEX NEW_TAG KEEP, each described in the sections below. |
Emitter_Name | When the filter emits a record under the new Tag, an internal emitter plugin takes care of the job. Since this emitter exposes metrics like any other component of the pipeline, you can use this property to configure an optional name for it. |
Emitter_Storage.type | Defines the buffering mechanism for the new records created. Note these records are part of the emitter plugin. This option supports the values memory (default) and filesystem. |
Emitter_Mem_Buf_Limit | Set a limit on the amount of memory the tag rewrite emitter can consume if the outputs provide backpressure. The default for this limit is 10M. |
Key | Value Format | Operation | Description |
---|---|---|---|
Operation | ENUM [nest or lift] | | Select the operation to perform: nest or lift |
Wildcard | FIELD WILDCARD | nest | Nest records whose field matches the wildcard |
Nest_under | FIELD STRING | nest | Nest records matching the Wildcard under this key |
Nested_under | FIELD STRING | lift | Lift records nested under the Nested_under key |
Add_prefix | FIELD STRING | ANY | Prefix affected keys with this string |
Remove_prefix | FIELD STRING | ANY | Remove prefix from affected keys if it matches this string |
The stdout filter plugin prints the data that flows through the pipeline to the standard output, which can be very useful while debugging.
The plugin has no configuration parameters and is very simple to use.
We have specified to gather CPU usage metrics and print them out in a human-readable way when they flow through the stdout plugin.
Property | Description |
---|---|
multiline.parser | Specify one or multiple multiline parsers to apply to the content. You can specify multiple multiline parsers to detect different formats by separating them with a comma. |
multiline.key_content | Key name that holds the content to process. Note that a multiline parser definition can already specify the key_content to use, but this option allows you to override that value for the purposes of the filter. |
Tensorflow Filter allows running Machine Learning inference tasks on the records of data coming from input plugins or the stream processor. This filter uses Tensorflow Lite as the inference engine, and requires the Tensorflow Lite shared library to be present during build and at runtime.
Tensorflow Lite is a lightweight open-source deep learning framework used for mobile and IoT applications. Tensorflow Lite only handles inference (not training); therefore, it loads pre-trained models (.tflite files) that are converted into the Tensorflow Lite format (FlatBuffer). You can read more on converting Tensorflow models here.
The plugin supports the following configuration parameters:
Clone the Tensorflow repository, install the bazel package manager, and run the following command to create the shared library:
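A sketch of the build command (check the Tensorflow documentation for the authoritative target):

```
bazel build -c opt //tensorflow/lite/c:tensorflowlite_c
```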
The script creates the shared library bazel-bin/tensorflow/lite/c/libtensorflowlite_c.so
. You need to copy the library to a location (such as /usr/lib
) that can be used by Fluent Bit.
Tensorflow filter plugin is disabled by default. You need to build Fluent Bit with Tensorflow plugin enabled. In addition, it requires access to Tensorflow Lite header files to compile. Therefore, you also need to pass the address of the Tensorflow source code on your machine to the build script:
If the Tensorflow plugin initializes correctly, it reports successful creation of the interpreter and prints a summary of the model's input/output types and dimensions.
Currently supports single-input models
Uses Tensorflow 2.3 header files
The Throttle Filter plugin sets the average Rate of messages per Interval, based on a leaky bucket and sliding window algorithm. In case of overflow, it will leak at a certain rate.
The plugin supports the following configuration parameters:
Let's imagine we have configured the filter as in the sketch below.
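For instance, a configuration sketch (values illustrative):

```
[FILTER]
    Name      throttle
    Match     *
    Rate      5
    Window    5
    Interval  1s
```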
Suppose we received 1 message in the first second, 3 messages in the second, and 5 in the third. As you can see, even though the Window is actually 5, we use a "slow" start to prevent flooding during startup.
As soon as we have reached Window size * Interval, we have a true sliding window with aggregation over the complete window.
When the average over the window is more than the Rate, we will start dropping messages, so that the sequence above
will become:
As you can see, the last pane of the window was overwritten and 1 message was dropped.
You might have noticed the possibility to configure the Interval of the window shift. It is counterintuitive, but there is a difference between the two examples above:
and
Even though both examples will allow a maximum Rate of 60 messages per minute, the first example may get all 60 messages within the first second, and will drop all the rest for the entire minute:
While the second example will not allow more than 1 message per second every second, making the output rate smoother:
It may drop some data if the rate is ragged. We recommend using a bigger Interval and Rate for streams of rare but important events, while keeping the Window big and the Interval small for constantly intensive inputs.
Note: It's suggested to use a configuration file.
The following command will load the tail plugin and read the content of the lines.txt file. Then the throttle filter will apply a rate limit and only pass records which are read below the certain rate:
The example above will pass 1000 messages per second on average over 300 seconds.
An output plugin to submit Prometheus Metrics using the remote write protocol
The prometheus remote write plugin allows you to take metrics from Fluent Bit and submit them to a Prometheus server through the remote write mechanism.
Important Note: The Prometheus remote write plugin only works with metric plugins, such as Node Exporter Metrics.
The Prometheus remote write plugin only works with metrics collected by one of the metric input plugins. In the following example, host metrics are collected by the node exporter metrics plugin and then delivered by the prometheus remote write output plugin.
The following are examples of using Prometheus remote write with the hosted services below.
With Grafana Cloud hosted metrics you will need to use the specific host that is mentioned as well as specify the HTTP username and password given within the Grafana Cloud page.
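A sketch for Grafana Cloud (the host shown is an example; use the host and credentials from your Grafana Cloud account):

```
[OUTPUT]
    Name        prometheus_remote_write
    Match       *
    Host        prometheus-us-central1.grafana.net
    Port        443
    Uri         /api/prom/push
    Tls         On
    http_user   <GRAFANA_USERNAME>
    http_passwd <GRAFANA_API_KEY>
```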
With Logz.io hosted prometheus you will need to make use of the header option and add the Authorization Bearer with the proper key. The host and port may also differ within your specific hosted instance.
With Coralogix Metrics you may need to customize the URI. Additionally, you will make use of the header key with Coralogix private key.
Send logs and metrics to Amazon CloudWatch
The Amazon CloudWatch output plugin allows you to ingest your records into the CloudWatch Logs service. Support for CloudWatch Metrics is also provided via EMF.
This is the documentation for the core Fluent Bit CloudWatch plugin written in C. It can replace the aws/amazon-cloudwatch-logs-for-fluent-bit Golang Fluent Bit plugin released last year. The Golang plugin was named cloudwatch
; this new high performance CloudWatch plugin is called cloudwatch_logs
to prevent conflicts/confusion. Check the amazon repo for the Golang plugin for details on the deprecation/migration plan for the original plugin.
See here for details on how AWS credentials are fetched.
In order to send records into Amazon Cloudwatch, you can run the plugin from the command line or through the configuration file:
The cloudwatch plugin can read the parameters from the command line through the -p argument (property), e.g.:
In your main configuration file append the following Output section:
Fluent Bit 1.7 adds a new feature called workers
which enables outputs to have dedicated threads. This cloudwatch_logs
plugin has partial support for workers. The plugin can support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.
Example:
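A sketch enabling one worker (group/stream names are illustrative):

```
[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            us-east-1
    log_group_name    fluent-bit-cloudwatch
    log_stream_prefix from-fluent-bit-
    auto_create_group On
    workers           1
```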
If you enable a single worker, you are enabling a dedicated thread for your CloudWatch output. We recommend starting without workers, evaluating the performance, and then enabling a worker if needed. For most users, the plugin can provide sufficient throughput without workers.
Fluent Bit has different input plugins (cpu, mem, disk, netif) to collect host resource usage metrics. The cloudwatch_logs output plugin can be used to send these host metrics to CloudWatch in Embedded Metric Format (EMF). If data comes from any of the above-mentioned input plugins, the cloudwatch_logs output plugin will convert them to EMF format and send them to CloudWatch as JSON logs. Additionally, if we set json/emf as the value of the log_format config option, CloudWatch will extract custom metrics from the embedded JSON payload.
Note: Right now, only cpu
and mem
metrics can be sent to CloudWatch.
For using the mem
input plugin and sending memory usage metrics to CloudWatch, we can consider the following example config file. Here, we use the aws
filter which adds ec2_instance_id
and az
(availability zone) to the log records. Later, in the output config section, we set ec2_instance_id
as our metric dimension.
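A sketch of such a config file (names and region are illustrative):

```
[INPUT]
    Name  mem
    Tag   mem_usage

[FILTER]
    Name   aws
    Match  *

[OUTPUT]
    Name              cloudwatch_logs
    Match             mem_usage
    region            us-west-2
    log_group_name    fluent-bit-emf
    log_stream_prefix from-fluent-bit-
    auto_create_group On
    log_format        json/emf
    metric_namespace  local-mem-test
    metric_dimensions ec2_instance_id
```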
The following config will set two dimensions on all of our metrics: ec2_instance_id and az.
Amazon distributes a container image with Fluent Bit and these plugins.
github.com/aws/aws-for-fluent-bit
Our images are available in Amazon ECR Public Gallery. You can download images with different tags using the following command:
For example, you can pull the image with latest version by:
If you see errors for image pull limits, try logging in to the public ECR with your AWS credentials:
You can check the Amazon ECR Public official doc for more details
You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:
For more see the AWS for Fluent Bit github repo.
An output plugin to expose Prometheus Metrics
The prometheus exporter allows you to take metrics from Fluent Bit and expose them such that a Prometheus instance can scrape them.
Important Note: The prometheus exporter only works with metric plugins, such as Node Exporter Metrics
The Prometheus exporter only works with metrics captured from metric plugins. In the following example, host metrics are captured by the node exporter metrics plugin and then are routed to prometheus exporter. Within the output plugin two labels are added app="fluent-bit"
and color="blue"
Send logs to Amazon Kinesis Firehose
In order to send records into Amazon Kinesis Data Firehose, you can run the plugin from the command line or through the configuration file:
The firehose plugin can read the parameters from the command line through the -p argument (property), e.g.:
In your main configuration file append the following Output section:
Fluent Bit 1.7 adds a new feature called workers
which enables outputs to have dedicated threads. This kinesis_firehose
plugin fully supports workers.
Example:
If you enable a single worker, you are enabling a dedicated thread for your Firehose output. We recommend starting without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient.
Amazon distributes a container image with Fluent Bit and these plugins.
Our images are available in Amazon ECR Public Gallery. You can download images with different tags using the following command:
For example, you can pull the image with latest version by:
If you see errors for image pull limits, try logging in to the public ECR with your AWS credentials:
You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:
Send logs to Amazon Kinesis Streams
In order to send records into Amazon Kinesis Data Streams, you can run the plugin from the command line or through the configuration file:
The kinesis_streams plugin can read the parameters from the command line through the -p argument (property), e.g.:
In your main configuration file append the following Output section:
Fluent Bit 1.7 adds a new feature called workers
which enables outputs to have dedicated threads. This kinesis_streams
plugin fully supports workers.
Example:
If you enable a single worker, you are enabling a dedicated thread for your Kinesis output. We recommend starting without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient.
Amazon distributes a container image with Fluent Bit and these plugins.
Our images are available in Amazon ECR Public Gallery. You can download images with different tags using the following command:
For example, you can pull the image with latest version by:
If you see errors for image pull limits, try logging in to the public ECR with your AWS credentials:
You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:
Send logs, data, metrics to Amazon S3
The plugin allows you to specify a maximum file size and a timeout for uploads. A file will be created in S3 when the max size is reached or the timeout is reached, whichever comes first.
Records are stored in files in S3 as newline delimited JSON.
The plugin requires s3:PutObject
permission.
In Fluent Bit, all logs have an associated tag. The s3_key_format
option lets you inject the tag into the s3 key using the following syntax:
$TAG => the full tag
$TAG[n] => the nth part of the tag (index starting at zero). This syntax is copied from the rewrite_tag filter. By default, "parts" of the tag are separated with dots, but you can change this with s3_key_format_tag_delimiters.
In the example below, assume the date is January 1st, 2020 00:00:00 and the tag associated with the logs in question is my_app_name-logs.prod.
With the delimiters as . and -, the tag will be split into parts as follows:
$TAG[0]
= my_app_name
$TAG[1]
= logs
$TAG[2]
= prod
So the key in S3 will be /prod/my_app_name/2020/01/01/00/00/00/bgdHN1NM.gz.
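A key format consistent with that result might look like the following sketch (the bucket name is illustrative):

```
[OUTPUT]
    Name                          s3
    Match                         *
    bucket                        my-bucket
    region                        us-east-1
    s3_key_format                 /$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S/$UUID.gz
    s3_key_format_tag_delimiters  .-
```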
The store_dir is used to temporarily store data before it is uploaded. If Fluent Bit is stopped suddenly, it will try to send all data and complete all uploads before it shuts down. If it cannot send some data, on restart it will look in the store_dir for existing data and will try to send it.
Multipart uploads are ideal for most use cases because they allow the plugin to upload data in small chunks over time. For example, a 1 GB file can be created from 200 5 MB chunks. While the file size in S3 will be 1 GB, only 5 MB will be buffered on disk at any one point in time.
In situations where buffered data cannot be preserved across restarts (see the note on persistent disk below), we recommend using the PutObject API and sending data frequently, to avoid local buffering as much as possible. This will limit data loss in the event Fluent Bit is killed unexpectedly.
The following settings are recommended for this use case:
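A sketch of such settings (bucket and sizes illustrative):

```
[OUTPUT]
    Name             s3
    Match            *
    bucket           my-bucket
    region           us-east-1
    total_file_size  1M
    upload_timeout   1m
    use_put_object   On
```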
Fluent Bit 1.7 adds a new feature called workers
which enables outputs to have dedicated threads. This s3
plugin has partial support for workers. The plugin can only support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.
Example:
If you enable a single worker, you are enabling a dedicated thread for your S3 output. We recommend starting without workers, evaluating the performance, and then enabling a worker if needed. For most users, the plugin can provide sufficient throughput without workers.
In order to send records into Amazon S3, you can run the plugin from the command line or through the configuration file.
The s3 plugin can read the parameters from the command line through the -p argument (property), e.g.:
In your main configuration file append the following Output section:
An example using PutObject instead of multipart:
Amazon distributes a container image with Fluent Bit and this plugin.
Our images are available in Amazon ECR Public Gallery. You can download images with different tags using the following command:
For example, you can pull the image with latest version by:
If you see errors for image pull limits, try logging in to the public ECR with your AWS credentials:
You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:
The Amazon Kinesis Data Firehose output plugin allows you to ingest your records into the Firehose service.
This is the documentation for the core Fluent Bit Firehose plugin written in C. It can replace the Golang Fluent Bit plugin released last year. The Golang plugin was named firehose; this new high performance and highly efficient Firehose plugin is called kinesis_firehose to prevent conflicts/confusion.
See the AWS credentials documentation for details on how AWS credentials are fetched.
You can check the Amazon ECR Public official doc for more details.
For more information, see the AWS for Fluent Bit GitHub repo.
The Amazon Kinesis Data Streams output plugin allows you to ingest your records into the Kinesis Data Streams service.
This is the documentation for the core Fluent Bit Kinesis plugin written in C. It has all the core features of the Golang Fluent Bit plugin released in 2019. The Golang plugin was named kinesis; this new high performance and highly efficient Kinesis plugin is called kinesis_streams to prevent conflicts/confusion.
See the AWS credentials documentation for details on how AWS credentials are fetched.
You can check the Amazon ECR Public official doc for more details.
For more information, see the AWS for Fluent Bit GitHub repo.
The Amazon S3 output plugin allows you to ingest your records into the S3 cloud object store.
The plugin can upload data to S3 using the multipart upload API or using the S3 PutObject API. Multipart is the default and is recommended; Fluent Bit will stream data in a series of 'parts'. This limits the amount of data it has to buffer on disk at any point in time. By default, every time 5 MiB of data have been received, a new 'part' will be uploaded. The plugin can create files up to gigabytes in size from many small chunks/parts using the multipart API. All aspects of the upload process are configurable using the configuration options.
See the AWS credentials documentation for details on how AWS credentials are fetched.
There is one minor drawback to multipart uploads: the file and data will not be visible in S3 until the upload is completed with a CompleteMultipartUpload call. The plugin will attempt to make this call whenever Fluent Bit is shut down, to ensure your data is available in S3. It will also store metadata about each upload in the store_dir, ensuring that uploads can be completed when Fluent Bit restarts (assuming it has access to persistent disk and the store_dir files will still be present on restart).
If you run Fluent Bit in an environment without persistent disk, or without the ability to restart Fluent Bit and give it access to the data stored in the store_dir from previous executions, some considerations apply. This might occur if you run Fluent Bit on AWS Fargate, for example.
You can check the Amazon ECR Public official doc for more details.
For more information, see the AWS for Fluent Bit GitHub repo.
Key | Description | Default |
---|---|---|
input_field | Specify the name of the field in the record to apply inference on. | |
model_file | Path to the model file (.tflite) to be loaded by Tensorflow Lite. | |
include_input_fields | Include all input fields in the filter's output. | True |
normalization_value | Divide input values by normalization_value. | |
Key | Value Format | Description |
---|---|---|
Rate | Integer | Amount of messages for the time. |
Window | Integer | Amount of intervals to calculate average over. Default 5. |
Interval | String | Time interval, expressed in "sleep" format, e.g. 3s, 1.5m, 0.5h, etc. |
Print_Status | Bool | Whether to print status messages with the current rate and the limits to information logs. |
Key | Description | Default |
---|---|---|
host | IP address or hostname of the target HTTP server | 127.0.0.1 |
http_user | Basic Auth Username | |
http_passwd | Basic Auth Password. Requires http_user to be set | |
port | TCP port of the target HTTP server | 80 |
proxy | Specify an HTTP proxy. The expected format of this value is http://host:port. Note that https is not supported yet. Please consider not setting this and using the HTTP_PROXY environment variable instead, which supports both http and https. | |
uri | Specify an optional HTTP URI for the target web server, e.g. /something | / |
header | Add an HTTP header key/value pair. Multiple headers can be set. | |
log_response_payload | Log the response payload within the Fluent Bit log | false |
add_label | This allows you to add custom labels to all metrics exposed through the plugin. You may have multiple of these fields | |
Workers | Enables dedicated thread(s) for this output. The default value applies since version 1.8.13; for previous versions it is 0. | 2 |
Key | Description |
---|---|
region | The AWS region. |
log_group_name | The name of the CloudWatch Log Group that you want log records sent to. |
log_stream_name | The name of the CloudWatch Log Stream that you want log records sent to. |
log_stream_prefix | Prefix for the Log Stream name. The tag is appended to the prefix to construct the full log stream name. Not compatible with the log_stream_name option. |
log_key | By default, the whole log record will be sent to CloudWatch. If you specify a key name with this option, then only the value of that key will be sent to CloudWatch. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to CloudWatch. |
log_format | An optional parameter that can be used to tell CloudWatch the format of the data. A value of json/emf enables CloudWatch to extract custom metrics embedded in a JSON payload. See the Embedded Metric Format. |
role_arn | ARN of an IAM role to assume (for cross account access). |
auto_create_group | Automatically create the log group. Valid values are "true" or "false" (case insensitive). Defaults to false. |
log_retention_days | If set to a number greater than zero, any newly created log group's retention policy is set to this many days. Valid values are: [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 3653] |
endpoint | Specify a custom endpoint for the CloudWatch Logs API. |
metric_namespace | An optional string representing the CloudWatch namespace for the metrics. See the Metrics Tutorial section below for a full configuration. |
metric_dimensions | A list of lists containing the dimension keys that will be applied to all metrics. The values within a dimension set MUST also be members on the root-node. For more information about dimensions, see Dimension and Dimensions. In the fluent-bit config, metric_dimensions is a comma and semicolon separated string. If you have only one list of dimensions, put the values as a comma separated string. If you want to put a list of lists, use semicolon separated strings. For example, if you set the value as 'dimension_1,dimension_2;dimension_3', we will convert it to [[dimension_1, dimension_2],[dimension_3]] |
sts_endpoint | Specify a custom STS endpoint for the AWS STS API. |
auto_retry_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. |
Key | Description | Default |
---|---|---|
host | The address Fluent Bit will bind to when hosting Prometheus metrics. | 0.0.0.0 |
port | The port Fluent Bit will bind to when hosting Prometheus metrics. | 2021 |
add_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields. | |
Key | Description |
---|---|
region | The AWS region. |
delivery_stream | The name of the Kinesis Firehose Delivery stream that you want log records sent to. |
time_key | Add the timestamp to the record under this key. By default the timestamp from Fluent Bit will not be added to records sent to Kinesis. |
time_key_format | strftime compliant format string for the timestamp; for example, the default is '%Y-%m-%dT%H:%M:%S'. This option is used with time_key. |
log_key | By default, the whole log record will be sent to Firehose. If you specify a key name with this option, then only the value of that key will be sent to Firehose. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to Firehose. |
role_arn | ARN of an IAM role to assume (for cross account access). |
endpoint | Specify a custom endpoint for the Firehose API. |
sts_endpoint | Custom endpoint for the STS API. |
auto_retry_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. |
Key | Description |
---|---|
region | The AWS region. |
stream | The name of the Kinesis Streams Delivery stream that you want log records sent to. |
time_key | Add the timestamp to the record under this key. By default the timestamp from Fluent Bit will not be added to records sent to Kinesis. |
time_key_format | strftime compliant format string for the timestamp; for example, the default is '%Y-%m-%dT%H:%M:%S'. This option is used with time_key. |
log_key | By default, the whole log record will be sent to Kinesis. If you specify a key name with this option, then only the value of that key will be sent to Kinesis. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to Kinesis. |
role_arn | ARN of an IAM role to assume (for cross account access). |
endpoint | Specify a custom endpoint for the Kinesis API. |
sts_endpoint | Custom endpoint for the STS API. |
auto_retry_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. |
region | The AWS region of you S3 bucket | us-east-1 |
bucket | S3 Bucket name | None |
json_date_key | Specify the name of the time key in the output record. To disable the time key just set the value to | date |
json_date_format | Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681) | iso8601 |
total_file_size | Specifies the size of files in S3. Maximum size is 50G, minimim is 1M. | 100M |
upload_chunk_size | The size of each 'part' for multipart uploads. Max: 50M | 5,242,880 bytes |
upload_timeout | Whenever this amount of time has elapsed, Fluent Bit will complete an upload and create a new file in S3. For example, set this value to 60m and you will get a new file every hour. | 10m |
store_dir | Directory to locally buffer data before sending. When multipart uploads are used, data will only be buffered until the | /tmp/fluent-bit/s3 |
s3_key_format | Format string for keys in S3. This option supports a UUID, strftime time formatters, a syntax for selecting parts of the Fluent log tag using a syntax inspired by the rewrite_tag filter. Add $UUID in the format string to insert a random string. Add $TAG in the format string to insert the full log tag; add $TAG[0] to insert the first part of the tag in the s3 key. The tag is split into “parts” using the characters specified with the | /fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S |
s3_key_format_tag_delimiters | A series of characters which will be used to split the tag into 'parts' for use with the s3_key_format option. See the in depth examples and tutorial in the documentation. | . |
use_put_object | Use the S3 PutObject API, instead of the multipart upload API. When this option is on, key extension is only available when $UUID is specified in s3_key_format. | false |
role_arn | ARN of an IAM role to assume (ex. for cross account access). | None |
endpoint | Custom endpoint for the S3 API. | None |
sts_endpoint | Custom endpoint for the STS API. | None |
canned_acl | Predefined Canned ACL policy for S3 objects. | None |
compression | Compression type for S3 objects. 'gzip' is currently the only supported value. The Content-Encoding HTTP Header will be set to 'gzip'. Compression can be enabled when use_put_object is on. | None |
content_type | A standard MIME type for the S3 object; this will be set as the Content-Type HTTP header. This option can be enabled when use_put_object is on. | None |
send_content_md5 | Send the Content-MD5 header with PutObject and UploadPart requests, as is required when Object Lock is enabled. | false |
auto_retry_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. | false |
Send logs, metrics to Azure Log Analytics
In order to insert records into an Azure Log Analytics instance, you can run the plugin from the command line or through the configuration file:
The azure plugin can read the parameters from the command line through the -p argument (property), e.g:
In your main configuration file append the following Input & Output sections:
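The original example is not reproduced here; the following is a minimal sketch using the cpu input, where the Customer_ID and Shared_Key values are placeholders you must replace with your own:

```
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name        azure
    Match       *
    Customer_ID your-log-analytics-workspace-id
    Shared_Key  your-shared-key
    Log_Type    fluentbit
```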
Counter is a very simple plugin that counts how many records it's getting upon flush time. Plugin output is as follows:
You can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit count up data with the following options:
In your main configuration file append the following Input & Output sections:
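A minimal sketch, using the cpu input as the record source:

```
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  counter
    Match *
```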
Once Fluent Bit is running, you will see the reports in the output interface similar to this:
Official and Microsoft Certified Azure Storage Blob connector
Before getting started, make sure you already have an Azure Storage account. As a reference, the following link explains step-by-step how to set up your account:
We expose different configuration properties. The following table lists all the options available, and the next section has specific configuration details for the official service or the emulator.
As mentioned above, you can either deliver records to the official service or an emulator. Below we have an example for each use case.
The following configuration example generates a random message with a custom tag:
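A sketch of such a configuration; the account_name, shared_key, container and tag values below are illustrative placeholders:

```
[INPUT]
    Name  dummy
    Dummy {"name": "Fluent Bit", "year": 2021}
    Tag   var.log.containers.app-default-96cbdef2340.log

[OUTPUT]
    Name                  azure_blob
    Match                 *
    account_name          YOUR_ACCOUNT_NAME
    shared_key            YOUR_SHARED_KEY
    container_name        logs
    auto_create_container on
    path                  kubernetes
    tls                   on
```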
After you run the configuration file above, you will be able to query the data using the Azure Storage Explorer. The example above will generate the following content in the explorer:
The quickest way to get started is to install Azurite using npm:
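The standard npm global install:

```
npm install -g azurite
```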
then run the service:
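A sketch assuming the Blob service emulator on its default host and port:

```
azurite-blob --blobHost 127.0.0.1 --blobPort 10000
```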
After running that Fluent Bit configuration you will see the data flowing into Azurite:
Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Therefore, before using the BigQuery output plugin, you must create a service account, create a BigQuery dataset and table, authorize the service account to write to the table, and provide the service account credentials to Fluent Bit.
To stream data into BigQuery, the first step is to create a Google Cloud service account for Fluent Bit:
Fluent Bit does not create datasets or tables for your data, so you must create these ahead of time. You must also grant the service account WRITER permission on the dataset:
Within the dataset you will need to create a table for the data to reside in. You can follow these instructions for creating your table. Pay close attention to the schema: it must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs (e.g. mem or cpu).
The Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication. Download the credentials file by following these instructions:
If you are using a Google Cloud Credentials File, the following configuration is enough to get you started:
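A minimal sketch; the credentials path, project, dataset and table names are placeholders:

```
[INPUT]
    Name dummy
    Tag  dummy

[OUTPUT]
    Name                       bigquery
    Match                      *
    google_service_credentials /path/to/my_google_service_credentials.json
    project_id                 my_project_id
    dataset_id                 my_dataset
    table_id                   my_table
```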
for S3 objects.
The Azure output plugin allows you to ingest your records into the Azure Log Analytics service.
To get more details about how to set up Azure Log Analytics, please refer to the following documentation:
Key | Description | default |
---|---|---|
The Azure Blob output plugin allows ingesting your records into the Azure Blob Storage service. This connector is designed to use the Append Blob and Block Blob API.
Our plugin works with the official Azure Service and can also be configured to be used with a service emulator such as Azurite.
Key | Description | default |
---|---|---|
Azurite comes with a default account_name and shared_key, so make sure to use the specific values provided in the example below (do an exact copy/paste):
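A sketch of such a configuration; the account_name and shared_key below are Azurite's published development defaults (not real credentials), and the endpoint assumes the emulator's default port:

```
[INPUT]
    Name  dummy

[OUTPUT]
    Name                  azure_blob
    Match                 *
    account_name          devstoreaccount1
    shared_key            Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
    container_name        logs
    auto_create_container on
    tls                   off
    emulator_mode         on
    endpoint              http://127.0.0.1:10000
```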
The BigQuery output plugin is an experimental plugin that allows you to stream records into the Google Cloud BigQuery service. The implementation does not support the following, which would be expected in a full production version:
Data deduplication using insertId.
Template tables using templateSuffix.
Key | Description | default |
---|---|---|
See Google's documentation for further details.
Customer_ID | Customer ID or WorkspaceID string. |
Shared_Key | The primary or the secondary Connected Sources client authentication key. |
Log_Type | The name of the event type. | fluentbit |
google_service_credentials | Absolute path to a Google Cloud credentials JSON file | Value of the environment variable $GOOGLE_SERVICE_CREDENTIALS |
project_id | The project id containing the BigQuery dataset to stream into. | The value of the project_id in the credentials file |
dataset_id | The dataset id of the BigQuery dataset to write into. This dataset must exist in your project. |
table_id | The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output. |
skip_invalid_rows | Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. | Off |
ignore_unknown_values | Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors. | Off |
account_name | Azure Storage account name. This configuration property is mandatory |
shared_key | Specify the Azure Storage Shared Key to authenticate against the service. This configuration property is mandatory. |
container_name | Name of the container that will contain the blobs. This configuration property is mandatory |
blob_type | Specify the desired blob type. Fluent Bit supports appendblob and blockblob. | appendblob |
auto_create_container | If container_name does not exist in the remote service, enabling this option handles the exception and auto-creates the container. | on |
path | Optional path to store your blobs. If your blob name is myblob, you can specify sub-directories to store it in by setting path, e.g. setting path to /logs/ stores your blob as /logs/myblob. |
emulator_mode | If you want to send data to an Azure emulator service like Azurite, enable this option so the plugin will format the requests to the expected format. | off |
endpoint | If you are using an emulator, this option allows you to specify the absolute HTTP address of such service, e.g. http://127.0.0.1:10000. |
tls | Enable or disable TLS encryption. Note that Azure service requires this to be turned on. | off |
Send logs to Datadog
The Datadog output plugin allows you to ingest your logs into Datadog.
Before you begin, you need a Datadog account, a Datadog API key, and you need to activate Datadog Logs Management.
Get started quickly with this configuration file:
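A minimal sketch based on the plugin's properties; the apikey, service, source and tag values are placeholders:

```
[INPUT]
    Name dummy

[OUTPUT]
    Name       datadog
    Match      *
    Host       http-intake.logs.datadoghq.com
    TLS        on
    compress   gzip
    apikey     <your-datadog-api-key>
    dd_service my-service
    dd_source  my-source
    dd_tags    team:logs
```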
If you get a 403 Forbidden
error response, double check that you have a valid Datadog API key and that you have activated Datadog Logs Management.
FlowCounter is a protocol to count records. The flowcounter output plugin counts records and their size.
The plugin supports the following configuration parameters:
You can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit count up data with the following options:
In your main configuration file append the following Input & Output sections:
Once Fluent Bit is running, you will see the reports in the output interface similar to this:
Forward is the protocol used by Fluentd to route messages between peers. The forward output plugin provides interoperability between Fluent Bit and Fluentd. There are no configuration steps required besides specifying where Fluentd is located, which can be a local or a remote destination.
This plugin offers two different transports and modes:
Forward (TCP): It uses a plain TCP connection.
Secure Forward (TLS): when TLS is enabled, the plugin switches to Secure Forward mode.
The following parameters are mandatory for either Forward or Secure Forward mode:
When using Secure Forward mode, TLS must be enabled. The following additional configuration parameters are available:
Before proceeding, make sure that Fluentd is installed; if it is not, please refer to the Fluentd Installation documentation before continuing.
Once Fluentd is installed, create the following configuration file example that will allow us to stream data into it:
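A sketch of such a Fluentd configuration, matching the description just below:

```
<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

<match fluent_bit>
  @type stdout
</match>
```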
That configuration file specifies that it will listen for TCP connections on port 24224 through the forward input type. Then, for every message with a fluent_bit TAG, it will print the message to the standard output.
In one terminal launch Fluentd specifying the new configuration file created:
Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:
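A sketch of the general shape, where INPUT, HOST and PORT are placeholders:

```
fluent-bit -i INPUT -o forward://HOST:PORT
```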
If the TAG parameter is not set, the plugin will retain the tag. Keep in mind that TAG is important for routing rules inside Fluentd.
Using the CPU input plugin as an example we will flush CPU metrics to Fluentd with tag fluent_bit:
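One way to do that, assuming Fluentd listens locally on port 24224 as configured above:

```
fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224
```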
Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:
So we gathered CPU metrics and flushed them out to Fluentd properly.
DISCLAIMER: the following example does not consider the generation of certificates for best practice on production environments.
Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS.
Paste this content in a file called flb.conf:
Paste this content in a file called fld.conf:
If you're using Fluentd v1, set it up as below:
Start Fluentd:
Start Fluent Bit:
After five seconds, Fluent Bit will write records to Fluentd. In Fluentd output you will see a message like this:
The file output plugin allows writing the data received through the input plugin to a file.
The plugin supports the following configuration parameters:
Outputs time, tag and JSON records. There are no configuration parameters for out_file.
Outputs the records as JSON (without additional tag and timestamp attributes). There are no configuration parameters for the plain format.
Outputs the records as CSV. CSV supports an additional configuration parameter.
Outputs the records as LTSV. LTSV supports an additional configuration parameter.
Outputs the records using a custom format template. This accepts a formatting template and fills placeholders using corresponding values in a record.
For example, if you set up the configuration as below:
You will get the following output:
You can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit count up data with the following options:
In your main configuration file append the following Input & Output sections:
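A minimal sketch; the Path value is a placeholder directory:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  file
    Match *
    Path  /tmp/output
```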
The following instructions assume that you have a fully operational Graylog server running in your environment.
If you're using Fluent Bit to collect Docker logs, note that Docker places your log in JSON under the key log. So you can set log as your Gelf_Short_Message_Key to send everything in Docker logs to Graylog. In this case, you need your log value to be a string, so don't parse it using a JSON parser.
The order of looking up the timestamp in this plugin is as follows:
Value of Gelf_Timestamp_Key provided in configuration
Value of the timestamp key
The timestamp is not set by Fluent Bit. In this case, your Graylog server will set it to the current timestamp (now).
The version field of a GELF message is also mandatory, and Fluent Bit sets it to 1.1, the current latest version of GELF.
If you use udp as the transport protocol and set Compress to true, Fluent Bit compresses your packets in GZIP format, which is the default compression that Graylog offers. This can be used to trade more CPU load for saving network bandwidth.
If you're using Fluent Bit for shipping Kubernetes logs, you can use something like this as your configuration file:
By default, GELF over TCP uses port 12201, and Docker places your logs in the /var/log/containers directory. The log content itself is placed in the value of the log key. For example, this is a log saved by Docker:
Now, this is what happens to this log:
The Fluent Bit GELF plugin adds "version": "1.1" to it.
We used the data key as Gelf_Short_Message_Key, so the GELF plugin changes it to short_message.
Timestamp is generated.
Finally, this is what our Graylog server input sees:
In order to insert records into an HTTP server, you can run the plugin from the command line or through the configuration file:
The http plugin can read the parameters from the command line in two ways, through the -p argument (property) or setting them directly through the service URI. The URI format is the following:
Using the format specified, you could start Fluent Bit through:
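For instance (the address and URI below are examples only):

```
fluent-bit -i cpu -t cpu -o http://192.168.2.3:80/something -m '*'
```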
In your main configuration file, append the following Input & Output sections:
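A minimal sketch of the equivalent configuration file (same example address):

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  http
    Match *
    host  192.168.2.3
    port  80
    uri   /something
```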
By default, the URI becomes the tag of the message and the original tag is ignored. To retain the tag, multiple configuration sections have to be defined, each flushing to a different URI.
Another supported approach is sending the original message tag in a configurable header. It's up to the receiver to do what it wants with that header field: parse it and use it as the tag, for example.
To configure this behaviour, add this config:
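A sketch using a hypothetical header name FLUENT-TAG:

```
[OUTPUT]
    Name       http
    Match      *
    host       127.0.0.1
    port       9090
    header_tag FLUENT-TAG
```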
Provided you are using Fluentd as the data receiver, you can combine in_http and out_rewrite_tag_filter to make use of this HTTP header.
Notice how we override the tag, which comes from the URI path, with our custom header.
Suggested configuration for Sumo Logic using json_lines with iso8601 timestamps. The PrivateKey is specific to a configured HTTP collector.
Send logs to Elasticsearch (including Amazon OpenSearch Service)
In order to insert records into an Elasticsearch service, you can run the plugin from the command line or through the configuration file:
The es plugin can read the parameters from the command line in two ways, through the -p argument (property) or setting them directly through the service URI. The URI format is the following:
Using the format specified, you could start Fluent Bit through:
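For example (the address, index and type are placeholders):

```
fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type -o stdout -m '*'
```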
which is similar to do:
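That is, the URI form is equivalent to passing the same values as properties:

```
fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 -p Index=my_index -p Type=my_type -o stdout -m '*'
```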
Some input plugins may generate messages where the field names contain dots. Since Elasticsearch 2.0 this is no longer allowed, so the current es plugin replaces them with an underscore, e.g.:
becomes
Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.
If you see an error message like below, you'll need to fix your configuration to use a single type on each index.
Rejecting mapping update to [search] as the final mapping would have more than 1 type
The Amazon OpenSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon OpenSearch Service with IAM Authentication.
Example configuration:
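A sketch with a placeholder Amazon OpenSearch domain endpoint; note the values called out just below:

```
[OUTPUT]
    Name       es
    Match      *
    Host       my-domain.us-west-2.es.amazonaws.com
    Port       443
    Index      my_index
    AWS_Auth   On
    AWS_Region us-west-2
    tls        On
```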
Notice that the Port is set to 443, tls is enabled, and AWS_Region is set.
Example configuration:
Since v1.8.2, Fluent Bit started using the create method (instead of index) for data submission. This makes Fluent Bit compatible with the Data Streams feature introduced in Elasticsearch 7.9.
If you see action_request_validation_exception errors on your pipeline with Fluent Bit >= v1.8.2, you can fix it up by turning on Generate_ID as follows:
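A sketch of that fix:

```
[OUTPUT]
    Name        es
    Match       *
    Host        127.0.0.1
    Port        9200
    Generate_ID On
```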
Key | Description | Default |
---|---|---|
Key | Description | Default |
---|---|---|
Key | Description | Default |
---|---|---|
Key | Description | Default |
---|---|---|
Key | Description | Default |
---|---|---|
Key | Description |
---|---|
Key | Description |
---|---|
Key | Description |
---|---|
GELF is the Graylog Extended Log Format. The GELF output plugin allows sending logs in GELF format directly to a Graylog input using the TLS, TCP or UDP protocols.
According to the GELF Payload Specification, there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are set with the Gelf_*_Key properties in this plugin.
Key | Description | default |
---|---|---|
The GELF output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
If you're using the Docker JSON parser, this parser can parse time and use it as the timestamp of the message. If all of the above fail, Fluent Bit tries to get the timestamp extracted by your parser.
Your log timestamp has to be in UNIX epoch timestamp format. If the Gelf_Timestamp_Key value of your log is not in this format, your Graylog server will ignore it.
If you're using Fluent Bit in Kubernetes and you're using the Kubernetes filter, this plugin adds the host value to your log by default, and you don't need to add it yourself.
If you use a Parser like the docker parser shown above, it decodes your message and extracts the data field (and any others present). This is how this log looks after decoding:
The nest filter unnests fields inside the log key. In our example, it puts data alongside stream and time.
The Kubernetes filter adds the host name.
Any custom field (not present in the GELF specification) is prefixed with an underscore.
The http output plugin allows flushing your records into an HTTP endpoint. For now the functionality is pretty basic: it issues a POST request with the data records in MessagePack (or JSON) format.
Key | Description | default |
---|---|---|
The HTTP output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
A sample Sumo Logic query for the input. (Requires json_lines format with iso8601 date format for the timestamp field.)
The es output plugin allows ingesting your records into an Elasticsearch database. The following instructions assume that you have a fully operational Elasticsearch service running in your environment.
Key | Description | default |
---|---|---|
The parameters index and type can be confusing if you are new to Elastic; if you have used a common relational database before, they can be compared to the database and table concepts.
The Elasticsearch output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
In your main configuration file append the following Input & Output sections.
For details, please read the official Elasticsearch documentation.
Fluent Bit v1.5 changed the default mapping type from flb_type to _doc, which matches the recommendation from Elasticsearch from version 6.2 forwards. This doesn't work in Elasticsearch versions 5.6 through 6.1. Ensure you set an explicit map (such as doc or flb_type) in the configuration, as seen on the last line:
See the AWS credentials documentation for details on how AWS credentials are fetched.
Fluent Bit supports connecting to Elastic Cloud by providing just the cloud_id and cloud_auth settings.
Host | Required - The Datadog server where you are sending your logs. | http-intake.logs.datadoghq.com |
TLS | Required - End-to-end communications security protocol. Datadog recommends setting this to on. | off |
compress | Recommended - Compresses the payload in GZIP format. Datadog supports and recommends setting this to gzip. |
apikey | Required - Your Datadog API key. |
Proxy | Optional - Specify an HTTP Proxy. The expected format of this value is http://host:port. Note that https is not supported yet. |
provider | To activate the remapping, specify configuration flag provider with value ecs. |
json_date_key | Date key name for output. | timestamp |
include_tag_key | If enabled, a tag is appended to the output. The key name is determined by the tag_key property. | false |
tag_key | The key name of the tag. If include_tag_key is false, this property is ignored. | tagkey |
dd_service | Recommended - The human readable name for your service generating the logs - the name of your application or database. |
dd_source | Recommended - A human readable name for the underlying technology of your service. For example, postgres or nginx. |
dd_tags | Optional - The tags you want to assign to your logs in Datadog. |
dd_message_key | By default, the plugin searches for the key 'log' and remaps the value to the key 'message'. If the property is set, the plugin will search the property name key. |
Unit | The unit of duration. (second/minute/hour/day) | minute |
Host | Target host where Fluent Bit or Fluentd are listening for Forward messages. | 127.0.0.1 |
Port | TCP Port of the target service. | 24224 |
Time_as_Integer | Set timestamps in integer format; it enables compatibility mode for the Fluentd v0.12 series. | False |
Upstream | If Forward will connect to an Upstream instead of a simple host, this property defines the absolute path for the Upstream configuration file; for more details about this refer to the Upstream Servers documentation section. |
Tag | Overwrite the tag as we transmit. This allows the receiving pipeline to start fresh, or to attribute the source. |
Send_options | Always send options (with "size"=count of messages) | False |
Require_ack_response | Send "chunk"-option and wait for "ack" response from server. Enables at-least-once delivery, and the receiving server can control the rate of traffic. (Requires Fluentd v0.14.0+ server) | False |
Compress | Set to "gzip" to enable gzip compression. Incompatible with Time_as_Integer=True and tags set dynamically using the Rewrite Tag filter. (Requires Fluentd v0.14.7+ server) |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
Shared_Key | A key string known by the remote Fluentd used for authorization. |
Empty_Shared_Key | Use this option to connect to Fluentd with a zero-length secret. | False |
Username | Specify the username to present to a Fluentd server that enables user_auth. |
Password | Specify the password corresponding to the username. |
Self_Hostname | Default value of the auto-generated certificate common name (CN). | localhost |
tls | Enable or disable TLS support | Off |
tls.verify | Force certificate validation | On |
tls.debug | Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose) | 1 |
tls.ca_file | Absolute path to CA certificate file |
tls.crt_file | Absolute path to Certificate file. |
tls.key_file | Absolute path to private Key file. |
tls.key_passwd | Optional password for tls.key_file file. |
Path | Directory path to store files. If not set, Fluent Bit will write the files in its own positioned directory. Note: this option was added in Fluent Bit v1.4.6. |
File | Set file name to store the records. If not set, the file name will be the tag associated with the records. |
Format | The format of the file content. See also the Format section. | out_file |
Mkdir | Recursively create output directory if it does not exist. Permissions set to 0755. |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 1 |
Delimiter | The character to separate each data. | ',' |
Delimiter | The character to separate each pair. | '\t' (TAB) |
Label_Delimiter | The character to separate label and the value. | ':' |
Template | The format string. | '{time} {message}' |
Match | Pattern to match which tags of logs to be outputted by this plugin |
Host | IP address or hostname of the target Graylog server | 127.0.0.1 |
Port | The port that your Graylog GELF input is listening on | 12201 |
Mode | The protocol to use (tls, tcp or udp) | udp |
Gelf_Short_Message_Key | A short descriptive message (MUST be set in GELF) | short_message |
Gelf_Timestamp_Key | Your log timestamp (SHOULD be set in GELF) | timestamp |
Gelf_Host_Key | Key which its value is used as the name of the host, source or application that sent this message. (MUST be set in GELF) | host |
Gelf_Full_Message_Key | Key to use as the long message that can i.e. contain a backtrace. (Optional in GELF) | full_message |
Gelf_Level_Key | Key to be used as the log level. Its value must be a standard syslog level (between 0 and 7). (Optional in GELF) | level |
Packet_Size | If transport protocol is udp, you can set the size of packets to be sent. | 1420 |
Compress | If transport protocol is udp, you can set this if you want your UDP packets to be compressed. | true |
host | IP address or hostname of the target HTTP Server | 127.0.0.1 |
http_User | Basic Auth Username |
http_Passwd | Basic Auth Password. Requires HTTP_User to be set |
port | TCP port of the target HTTP Server | 80 |
Proxy | Specify an HTTP Proxy. The expected format of this value is http://HOST:PORT. Note that HTTPS is not currently supported. It is recommended not to set this and to configure the HTTP proxy environment variables instead, as they support both HTTP and HTTPS. |
uri | Specify an optional HTTP URI for the target web server, e.g: /something | / |
compress | Set payload compression mechanism. Option available is 'gzip' |
format | Specify the data format to be used in the HTTP request body, by default it uses msgpack. Other supported formats are json, json_stream and json_lines and gelf. | msgpack |
allow_duplicated_headers | Specify if duplicated headers are allowed. If a duplicated header is found, the latest key/value set is preserved. | true |
log_response_payload | Specify if the response payload should be logged or not. | true |
header_tag | Specify an optional HTTP header field for the original message tag. |
header | Add a HTTP header key/value pair. Multiple headers can be set. |
json_date_key | Specify the name of the time key in the output record. To disable the time key just set the value to | date |
json_date_format | Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681) | double |
gelf_timestamp_key | Specify the key to use for |
gelf_host_key | Specify the key to use for the |
gelf_short_message_key | Specify the key to use as the |
gelf_full_message_key | Specify the key to use for the |
gelf_level_key | Specify the key to use for the |
successful_response_code | Specify what a successful HTTP response code is, in case you need to retry for other HTTP codes (e.g. 204). |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
Host | IP address or hostname of the target Elasticsearch instance | 127.0.0.1 |
Port | TCP port of the target Elasticsearch instance | 9200 |
Path | Elasticsearch accepts new data on HTTP query path "/_bulk". But it is also possible to serve Elasticsearch behind a reverse proxy on a subpath. This option defines such path on the fluent-bit side. It simply adds a path prefix in the indexing HTTP POST URI. | Empty string |
Buffer_Size | Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where it is required to read full responses; note that the response size grows depending on the number of records inserted. To set an unlimited amount of memory, set this value to False; otherwise the value must conform to the Unit Size specification. | 4KB |
Pipeline | Newer versions of Elasticsearch allow setting up filters called pipelines. This option defines which pipeline the database should use. For performance reasons it is strongly suggested to do parsing and filtering on the Fluent Bit side and avoid pipelines. |
AWS_Auth | Enable AWS Sigv4 Authentication for Amazon OpenSearch Service | Off |
AWS_Region | Specify the AWS region for Amazon OpenSearch Service |
AWS_STS_Endpoint | Specify the custom sts endpoint to be used with STS API for Amazon OpenSearch Service |
AWS_Role_ARN | AWS IAM Role to assume to put records to your Amazon cluster |
AWS_External_ID | External ID for the AWS IAM Role specified with |
Cloud_ID | If you are using Elastic's Elasticsearch Service you can specify the cloud_id of the cluster running |
Cloud_Auth | Specify the credentials to use to connect to Elastic's Elasticsearch Service running on Elastic Cloud |
HTTP_User | Optional username credential for Elastic X-Pack access |
HTTP_Passwd | Password for user defined in HTTP_User |
Index | Index name | fluent-bit |
Type | Type name | _doc |
Logstash_Format | Enable Logstash format compatibility. This option takes a boolean value: True/False, On/Off | Off |
Logstash_Prefix | When Logstash_Format is enabled, the Index name is composed using a prefix and the date, e.g: if Logstash_Prefix is equal to 'mydata', your index will become 'mydata-YYYY.MM.DD'. The last string appended belongs to the date when the data is being generated. | logstash |
Logstash_DateFormat | Time format (based on strftime) to generate the second part of the Index name. | %Y.%m.%d |
Time_Key | When Logstash_Format is enabled, each record will get a new timestamp field. The Time_Key property defines the name of that field. | @timestamp |
Time_Key_Format | When Logstash_Format is enabled, this property defines the format of the timestamp. | %Y-%m-%dT%H:%M:%S |
Time_Key_Nanos | When Logstash_Format is enabled, enabling this property sends nanosecond precision timestamps. | Off |
Include_Tag_Key | When enabled, it appends the Tag name to the record. | Off |
Tag_Key | When Include_Tag_Key is enabled, this property defines the key name for the tag. | _flb-key |
Generate_ID | When enabled, generate _id for outputs. | Off |
Id_Key | If set, _id will be the value of the key from the incoming record and the Generate_ID option is ignored. |
Replace_Dots | When enabled, replace field name dots with underscore, required by Elasticsearch 2.0-2.3. | Off |
Trace_Output | When enabled print the elasticsearch API calls to stdout (for diag only) | Off |
Trace_Error | When enabled print the elasticsearch API calls to stdout when elasticsearch returns an error (for diag only) | Off |
Current_Time_Index | Use current time for index generation instead of message record | Off |
Logstash_Prefix_Key | When included: the value in the record that belongs to the key will be looked up and will overwrite the Logstash_Prefix for index generation. If the key/value is not found in the record then the Logstash_Prefix option will act as a fallback. Nested keys are not supported (if desired, you can use the nest filter plugin to remove nesting). |
Suppress_Type_Name | When enabled, mapping types are removed and the Type option is ignored. Types are deprecated in APIs in v7.0. This option is for v7.0 or later. | Off |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
The Kafka output plugin allows ingesting your records into an Apache Kafka service. This plugin uses the official librdkafka C library (built-in dependency).
Setting rdkafka.log.connection.close to false and rdkafka.request.required.acks to 1 are examples of recommended settings for librdkafka properties.
In order to insert records into Apache Kafka, you can run the plugin from the command line or through the configuration file:
The kafka plugin can read the parameters from the command line through the -p argument (property), e.g:
In your main configuration file append the following Input & Output sections:
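A minimal sketch; the broker address and topic are placeholders:

```
[INPUT]
    Name cpu

[OUTPUT]
    Name    kafka
    Match   *
    Brokers 192.168.1.3:9092
    Topics  test
```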
Fluent Bit comes with support for avro encoding for the out_kafka plugin. Avro support is optional and must be activated at build time by using a build definition with cmake: -DFLB_AVRO_ENCODER=On, such as in the following example (shown after this list), which activates:
out_kafka with avro encoding
fluent-bit's prometheus
metrics via an embedded http endpoint
debugging support
builds the test suites
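A sketch of such a build invocation; the exact combination of flags is an assumption based on the list above:

```
cmake -DFLB_AVRO_ENCODER=On \
      -DFLB_HTTP_SERVER=On \
      -DFLB_DEBUG=On \
      -DFLB_TESTS_RUNTIME=On \
      -DFLB_TESTS_INTERNAL=On ../
```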
This example Fluent Bit config tails Kubernetes logs, decorates the log lines with Kubernetes metadata via the kubernetes filter, and then sends the fully decorated log lines to a Kafka broker, encoded with a specific avro schema.
The influxdb output plugin allows flushing your records into an InfluxDB time series database. The following instructions assume that you have a fully operational InfluxDB service running in your system.
The InfluxDB output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
In order to start inserting records into an InfluxDB service, you can run the plugin from the command line or through the configuration file:
The influxdb plugin can read the parameters from the command line in two ways, through the -p argument (property) or setting them directly through the service URI. The URI format is the following:
Using the format specified, you could start Fluent Bit through:
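For example, a sketch using the URI form with the default host and port:

```
fluent-bit -i cpu -t cpu -o influxdb://127.0.0.1:8086 -m '*'
```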
In your main configuration file append the following Input & Output sections:
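A minimal sketch of the equivalent configuration:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name     influxdb
    Match    *
    Host     127.0.0.1
    Port     8086
    Database fluentbit
```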
A basic example of Tag_Keys usage:
Using Auto_Tags=On in this example would cause an error, because every parsed field value is a string. This option is best used with metrics-like records where one or more field values are not strings.
A basic example of Tags_List_Key usage:
Before starting Fluent Bit, make sure the target database exists on InfluxDB. Using the above example, we will insert the data into a fluentbit database.
Log into InfluxDB console:
Create the database:
Check the database exists:
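A sketch of those three steps using the influx CLI (InfluxDB 1.x):

```
$ influx
> CREATE DATABASE fluentbit
> SHOW DATABASES
```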
The following command will gather CPU metrics from the system and send the data to InfluxDB database every five seconds:
Note that all records coming from the cpu input plugin have a tag cpu; this tag is used to generate the measurement in InfluxDB.
From InfluxDB console, choose your database:
Now query some specific fields:
The CPU input plugin gathers more metrics per CPU core; in the above example we selected just three specific metrics. The following query will give a full result:
Query tagged keys:
And now query method key values:
The kafka-rest output plugin allows flushing your records into a Kafka REST Proxy server. The following instructions assume that you have fully operational Kafka REST Proxy and Kafka services running in your environment.
The Kafka REST Proxy output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
In order to insert records into a Kafka REST Proxy service, you can run the plugin from the command line or through the configuration file:
The kafka-rest plugin can read the parameters from the command line through the -p argument (property), e.g:
In your main configuration file append the following Input & Output sections:
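A minimal sketch based on the plugin's defaults:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  kafka-rest
    Match *
    Host  127.0.0.1
    Port  8082
    Topic fluent-bit
```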
LogDNA is an intuitive cloud-based log management system that provides an easy interface to query your logs once they are stored.
The Fluent Bit logdna output plugin allows you to send your logs or events to a LogDNA compliant service like:
Before getting started with the plugin configuration, make sure to obtain the proper account to get access to the service. You can start with a free trial at the following link:
One of the features of Fluent Bit + LogDNA integration is the ability to auto enrich each record with further context.
When the plugin processes each record (or log), it tries to look up specific key names that might contain context for the record in question. The following table describes the keys and the discovery logic:
The following configuration example will emit a dummy example record and ingest it on LogDNA. Copy and paste the following content in a file called logdna.conf:
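A sketch of such a file; the api_key is a placeholder and the dummy message is illustrative (the tags aa and bb match the dashboard example further below):

```
[SERVICE]
    Flush 1

[INPUT]
    Name  dummy
    Dummy {"log":"a simple log message"}

[OUTPUT]
    Name    logdna
    Match   *
    api_key your-logdna-api-key
    tags    aa,bb
```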
Run Fluent Bit with the new configuration file:
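```
fluent-bit -c logdna.conf
```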
Fluent Bit output:
Your record will be available and visible in your LogDNA dashboard after a few seconds.
In your LogDNA dashboard, go to the top filters and mark the Tags aa and bb; then you will be able to see your records as in the example below:
In order to flush records, the nats plugin requires two parameters:
In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g:
As described above, the target service and storage point can be changed, e.g:
For every set of records flushed to a NATS Server, Fluent Bit uses the following JSON format:
Each record is an individual entity represented in a JSON array that contains a UNIX_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin will look like this:
The Fluent Bit loki built-in output plugin allows you to send your logs or events to a Loki service. It supports data enrichment with Kubernetes labels, custom label keys and Tenant ID, among others.
Loki stores the record logs inside streams. A stream is defined by a set of labels; at least one label is required.
Fluent Bit implements a flexible mechanism to set labels using fixed key/value pairs of text, but it also allows setting as labels certain keys that exist as part of the records being processed. Consider the following JSON record (pretty printed for readability):
If you decide that your Loki Stream will be composed of two labels called job and the value of the record key called stream, your labels configuration properties might look as follows:
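A sketch consistent with the explanation further below (job fixed to fluentbit, plus a record accessor for the nested sub map's stream key):

```
[OUTPUT]
    name   loki
    match  *
    labels job=fluentbit, $sub['stream']
```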
When processing the above configuration, internally the final labels for the stream in question become:
Another feature of Labels management is the ability to provide custom key names: using the same record accessor pattern, we can specify the key name manually and let the value be populated automatically at runtime, e.g:
When processing that new configuration, the internal labels will be:
label_keys property
The additional configuration property called label_keys allows specifying multiple record keys that need to be placed as part of the outgoing Stream Labels; this is a feature similar to the one explained above in the labels property. Consider it another way to set a record key in the Stream, with the limitation that you cannot use a custom name for the key value.
The following configuration examples generate the same Stream Labels:
The above configuration accomplishes the same as this one:
Both will generate the following stream labels:
Note that if you are running in a Kubernetes environment, you might want to enable the option auto_kubernetes_labels, which will auto-populate the streams with the Pod labels for you. Consider the following configuration:
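A minimal sketch of that option:

```
[OUTPUT]
    name                   loki
    match                  *
    labels                 job=fluentbit
    auto_kubernetes_labels on
```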
Based on the JSON example provided above, the internal stream labels will be:
This plugin inherits core Fluent Bit features to customize the network behavior and optionally enable TLS in the communication channel. For more details about the specific options available, refer to the following articles:
Note that all options mentioned in the articles above must be enabled in the plugin configuration in question.
An example configuration - make sure to set the credentials and ensure the host URL matches the correct one for your deployment:
The following configuration example will emit a dummy example record and ingest it on Loki. Copy and paste the following content into a file called out_loki.conf:
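A sketch of such a file, assuming a local Loki instance on its default port:

```
[SERVICE]
    Flush 1

[INPUT]
    Name dummy

[OUTPUT]
    Name   loki
    Match  *
    Host   127.0.0.1
    Port   3100
    Labels job=fluentbit
```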
Run Fluent Bit with the new configuration file:
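```
fluent-bit -c out_loki.conf
```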
Fluent Bit output:
Key | Description | default |
---|---|---|
Key | Description | default |
---|---|---|
Key | Description | default |
---|---|---|
Key | Description | Default |
---|---|---|
Key | Description |
---|---|
The nats output plugin allows flushing your records into a NATS Server end point. The following instructions assume that you have a fully operational NATS Server in place.
parameter | description | default |
---|---|---|
Fluent Bit only requires knowing that it needs to use the nats output plugin; if no extra information is given, it will use the default values specified in the above table.
Loki is a multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate.
Be aware there is a separate Golang output plugin provided by Grafana with different configuration options.
Key | Description | Default |
---|---|---|
As you can see, the label job has the value fluentbit, and the second label is configured to access the nested map called sub, targeting the value of the key stream. Note that the second label name must start with a $; that means it is a record accessor pattern, which provides the ability to retrieve values from nested maps by using key names.
Networking Setup: timeouts, keepalive and source address
TLS/SSL: all about TLS configuration and certificates
Fluent Bit supports sending logs (and metrics) to Grafana Cloud by providing the appropriate URL and ensuring TLS is enabled.
format | Specify data format. Options available: json, msgpack. | json |
message_key | Optional key to store the message |
message_key_field | If set, the value of Message_Key_Field in the record will indicate the message key. If not set nor found in the record, Message_Key will be used (if set). |
timestamp_key | Set the key to store the record timestamp | @timestamp |
timestamp_format | 'iso8601' or 'double' | double |
brokers | Single or multiple list of Kafka Brokers, e.g: 192.168.1.3:9092, 192.168.1.4:9092. |
topics | Single entry or list of topics separated by comma (,) that Fluent Bit will use to send messages to Kafka. If only one topic is set, that one will be used for all records. Instead, if multiple topics exist, the one set in the record by Topic_Key will be used. | fluent-bit |
topic_key | If multiple Topics exist, the value of Topic_Key in the record will indicate the topic to use. E.g: if Topic_Key is router and the record is {"key1": 123, "router": "route_2"}, Fluent Bit will use topic route_2. Note that if the value of Topic_Key is not present in Topics, then by default the first topic in the Topics list will indicate the topic to be used. |
dynamic_topic | Adds unknown topics (found in Topic_Key) to Topics. So in Topics only a default topic needs to be configured. | Off |
queue_full_retries | Fluent Bit queues data into the rdkafka library; if for some reason the underlying library cannot flush the records, the queue might fill up, blocking new additions of records. The queue_full_retries option sets the number of local retries to enqueue the data. The default value is 10 times, and the interval between each retry is 1 second. Setting queue_full_retries to 0 sets an unlimited number of retries. | 10 |
rdkafka.{property} | {property} can be any librdkafka property |
Host | IP address or hostname of the target InfluxDB service | 127.0.0.1 |
Port | TCP port of the target InfluxDB service | 8086 |
Database | InfluxDB database name where records will be inserted | fluentbit |
Bucket | InfluxDB bucket name where records will be inserted - if specified, database is ignored and v2 of the API is used |
Org | InfluxDB organization name where the bucket is (v2 only) | fluent |
Sequence_Tag | The name of the tag whose value is incremented for consecutive simultaneous events. | _seq |
HTTP_User | Optional username for HTTP Basic Authentication |
HTTP_Passwd | Password for user defined in HTTP_User |
HTTP_Token | Authentication token used with InfluxDB v2 - if specified, both HTTP_User and HTTP_Passwd are ignored |
Tag_Keys | Space separated list of keys that need to be tagged |
Auto_Tags | Automatically tag keys where value is string. This option takes a boolean value: True/False, On/Off. | Off |
Tags_List_Enabled | Dynamically tag keys which are in the string array at the Tags_List_Key key. This option takes a boolean value: True/False, On/Off. | Off |
Tags_List_Key | Key of the string array optionally contained within each log record that contains tag keys for that record | tags |
Host | IP address or hostname of the target Kafka REST Proxy server | 127.0.0.1 |
Port | TCP port of the target Kafka REST Proxy server | 8082 |
Topic | Set the Kafka topic | fluent-bit |
Partition | Set the partition number (optional) |
Message_Key | Set a message key (optional) |
Time_Key | The Time_Key property defines the name of the field that holds the record timestamp. | @timestamp |
Time_Key_Format | Defines the format of the timestamp. | %Y-%m-%dT%H:%M:%S |
Include_Tag_Key | Append the Tag name to the final record. | Off |
Tag_Key | If Include_Tag_Key is enabled, this property defines the key name for the tag. | _flb-key |
logdna_host | LogDNA API host address | logs.logdna.com |
logdna_port | LogDNA TCP Port | 443 |
api_key | API key to get access to the service. This property is mandatory. |
hostname | Name of the local machine or device where Fluent Bit is running. When this value is not set, Fluent Bit looks up the hostname and auto-populates the value. If it cannot be found, an unknown value will be set instead. |
mac | Mac address. This value is optional. |
ip | IP address of the local hostname. This value is optional. |
tags | A list of comma separated strings to group records in LogDNA and simplify the query with filters. |
file | Optional name of a file being monitored. Note that this value is only set if the record does not contain a reference to it. |
app | Name of the application. This value is auto discovered on each record; if not found, the default value is used. | Fluent Bit |
level | If the record contains a key called level or severity, it will populate the context level key with that value. If not found, the context key is not set. |
file | If the record contains a key called file, it will populate the context file with the value found; otherwise, if the plugin configuration provided a file property, that value will be used instead (see table above). |
app | If the record contains a key called app, it will populate the context app with the value found; otherwise it will use the value set for app in the configuration property (see table above). |
meta | If the record contains a key called meta, it will populate the context meta with the value found. |
host | IP address or hostname of the NATS Server | 127.0.0.1 |
port | TCP port of the target NATS Server | 4222 |
host | Loki hostname or IP address. Do not include the subpath, i.e. /loki/api/v1/push. | 127.0.0.1 |
port | Loki TCP port | 3100 |
http_user | Set HTTP basic authentication user name |
http_passwd | Set HTTP basic authentication password |
tenant_id | Tenant ID used by default to push logs to Loki. If omitted or empty it assumes Loki is running in single-tenant mode and no X-Scope-OrgID header is sent. |
labels | Stream labels for API request. It can be multiple comma separated strings specifying key=value pairs. | job=fluentbit |
label_keys | Optional list of record keys that will be placed as stream labels. This configuration property is for records key only. More details in the Labels section. |
remove_keys | Optional list of keys to remove. |
drop_single_key | If set to true and after extracting labels only a single key remains, the log line sent to Loki will be the value of that key in line_format. | off |
line_format | Format to use when flattening the record to a log line. Valid values are json or key_value. | json |
auto_kubernetes_labels | If set to true, it will add all Kubernetes labels to the Stream labels | off |
tenant_id_key | Specify the name of the key from the original record that contains the Tenant ID. The value of the key is set as the X-Scope-OrgID HTTP header. |
The null output plugin just throws away events.
The plugin doesn't support configuration parameters.
You can run the plugin from the command line or through the configuration file:
From the command line you can let Fluent Bit throw away events with the following options:
In your main configuration file append the following Input & Output sections:
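A minimal sketch:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  null
    Match *
```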
The Slack output plugin delivers records or messages to your preferred Slack channel. It formats the outgoing content in JSON format for readability.
This connector uses the Slack Incoming Webhooks feature to post messages to Slack channels. Using this plugin in conjunction with the Stream Processor is a good combination for alerting.
Before configuring this plugin, make sure to set up your Incoming Webhook. For detailed step-by-step instructions, review the following official documentation:
Once you have obtained the Webhook address you can place it in the configuration below.
Get started quickly with this configuration file:
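A minimal sketch; the webhook URL below is a placeholder you must replace with your own:

```
[OUTPUT]
    Name    slack
    Match   *
    webhook https://hooks.slack.com/services/XXXXXXXX/XXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXX
```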
New Relic is a data management platform that gives you real-time insights into your data for developers, operations and management teams.
The Fluent Bit nrlogs output plugin allows you to send your logs to the New Relic service.
Before getting started with the plugin configuration, make sure to obtain the proper account to get access to the service. You can register and start with a free trial at the following link:
The following configuration example will emit a dummy example record and ingest it on New Relic. Copy and paste the following content in a file called newrelic.conf:
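A sketch of such a file; the api_key is a placeholder:

```
[SERVICE]
    Flush 1

[INPUT]
    Name dummy

[OUTPUT]
    Name    nrlogs
    Match   *
    api_key your-api-key-here
```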
Run Fluent Bit with the new configuration file:
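```
fluent-bit -c newrelic.conf
```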
Fluent Bit output:
PostgreSQL is a very popular and versatile open source database management system that supports the SQL language and that is capable of storing both structured and unstructured data, such as JSON objects.
Given that Fluent Bit is designed to work with JSON objects, the pgsql output plugin allows users to send their data to a PostgreSQL database and store it using the JSONB type.
PostgreSQL 9.4 or higher is required.
According to the parameters you have set in the configuration file, the plugin will create the table defined by the table option in the database defined by the database option, hosted on the server defined by the host option. It will use the PostgreSQL user defined by the user option, which needs to have the right privileges to create such a table in that database.
NOTE: If you are not familiar with how PostgreSQL's users and grants system works, you might find it useful to read the recommended links in the "References" section at the bottom.
A typical installation normally consists of a self-contained database for Fluent Bit in which you can store the output of one or more pipelines. Ultimately, it is your choice to store them in the same table, in separate tables, or even in separate databases, based on several factors including workload, scalability, data protection and security.
In this example, for the sake of simplicity, we use a single table called fluentbit in a database called fluentbit that is owned by the user fluentbit. Feel free to use different names. Preferably, for security reasons, do not use the postgres user (which has SUPERUSER privileges).
fluentbit user
Generate a robust random password (e.g. pwgen 20 1) and store it safely. Then, as the postgres system user on the server where PostgreSQL is installed, execute:
At the prompt, please provide the password that you previously generated.
As a result, the user fluentbit without superuser privileges will be created.
If you prefer, instead of the createuser application, you can directly use the SQL command CREATE USER.
fluentbit database
As the postgres system user, please run:
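A sketch using the createdb utility (-O sets the owner):

```
createdb -O fluentbit fluentbit
```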
This will create a database called fluentbit owned by the fluentbit user. As a result, the fluentbit user will be able to safely create the data table.
Alternatively, you can use the SQL command CREATE DATABASE.
Make sure that the fluentbit user can connect to the fluentbit database on the specified target host. This might require you to properly configure the pg_hba.conf file.
Fluent Bit relies on libpq, the PostgreSQL native client API, written in the C language. For this reason, default values might be affected by environment variables and compilation settings. The above table lists, in brackets, the most common default values for each connection option.
For security reasons, it is advised to follow the directives included in the password file section.
In your main configuration file add the following section:
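A minimal sketch matching the user, database and table created above; the host and password are placeholders:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name     pgsql
    Match    *
    Host     127.0.0.1
    Port     5432
    User     fluentbit
    Password YOUR_PASSWORD
    Database fluentbit
    Table    fluentbit
```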
The output plugin automatically creates a table with the name specified by the table configuration option, made up of the following fields:
tag TEXT
time TIMESTAMP WITHOUT TIME ZONE
data JSONB
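In SQL terms, the created table is equivalent to something like the following (table name per the table option):

```
CREATE TABLE fluentbit (
    tag  TEXT,
    time TIMESTAMP WITHOUT TIME ZONE,
    data JSONB
);
```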
As you can see, the timestamp does not contain any information about the time zone; it therefore refers to the time zone used by the connection to PostgreSQL (the timezone setting).
For more information on the JSONB data type in PostgreSQL, please refer to the JSON types page in the official documentation, where you can find instructions on how to index or query the objects (including jsonpath, introduced in PostgreSQL 12).
PostgreSQL 10 introduces support for declarative partitioning. In order to improve vertical scalability of the database, you can decide to partition your tables on time ranges (for example on a monthly basis). PostgreSQL supports also subpartitions, allowing you to even partition by hash your records (version 11+), and default partitions (version 11+).
For more information on horizontal partitioning in PostgreSQL, please refer to the Table partitioning page in the official documentation.
If you are starting now, our recommendation at the moment is to choose the latest major version of PostgreSQL.
PostgreSQL is a really powerful and extensible database engine. More expert users can indeed take advantage of BEFORE INSERT triggers on the main table and re-route records to normalised tables, depending on tags and the content of the actual JSON objects.
For example, you can use Fluent Bit to send HTTP log records to the landing table defined in the configuration file. This table contains a BEFORE INSERT trigger (a function in plpgsql language) that normalises the content of the JSON object and inserts the record in another table (with its own structure and partitioning model). This kind of trigger allows you to discard the record from the landing table by returning NULL.
Here follows a list of useful resources from the PostgreSQL documentation:
Send logs to Splunk HTTP Event Collector
Connectivity, transport and authentication configuration properties:
Content and Splunk metadata (fields) handling configuration properties:
In order to insert records into a Splunk service, you can run the plugin from the command line or through the configuration file:
The splunk plugin can read the parameters from the command line through the -p argument (property), e.g:
In your main configuration file append the following Input & Output sections:
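A minimal sketch; the Splunk_Token value is a placeholder:

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name         splunk
    Match        *
    Host         127.0.0.1
    Port         8088
    Splunk_Token xxxxx-xxxxx-xxxxx-xxxxx
    TLS          On
    TLS.Verify   Off
```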
By default, the Splunk output plugin nests the record under the event key in the payload sent to the HEC. It will also append the time of the record to a top level time key.
If you would like to customize any of the Splunk event metadata, such as the host or target index, you can set Splunk_Send_Raw On in the plugin configuration, and add the metadata as keys/values in the record. Note: with Splunk_Send_Raw enabled, you are responsible for creating and populating the event section of the payload.
For example, to add a custom index and hostname:
This will create a payload that looks like:
If the option splunk_send_raw has been enabled, the user must take care to put all log details in the event field and only specify fields known to Splunk in the top level object; if there is a mismatch, Splunk will return an HTTP 400 error.
Consider the following example:
splunk_send_raw off
splunk_send_raw on
For up to date information about the valid keys in the top level object, refer to the Splunk documentation:
Sending to a Splunk metric index requires the Splunk_Send_Raw option to be enabled and the message to be formatted properly. This includes three specific operations:
Nest metric events under a "fields" property
Add metric_name: to all metrics
Add index, source, sourcetype as fields in the message
The following configuration gathers CPU metrics, nests the appropriate field, adds the required identifiers and then sends to Splunk.
Before getting started with the plugin configuration, make sure to obtain the proper credentials to get access to the service. We strongly recommend using a common JSON credentials file; reference link:
Your goal is to obtain a credentials JSON file that will be used later by the Fluent Bit Stackdriver output plugin.
If you are using a Google Cloud Credentials File, the following configuration is enough to get started:
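A minimal sketch; the credentials path is a placeholder:

```
[SERVICE]
    Flush 1

[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name                       stackdriver
    Match                      *
    google_service_credentials /path/to/my_google_service_credentials.json
```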
Example configuration file for k8s resource type:
The local_resource_id is used by the stackdriver output plugin to set the labels field for different k8s resource types. The Stackdriver plugin will try to find the local_resource_id field in the log entry. If there is no logging.googleapis.com/local_resource_id field in the log, the plugin will construct it using the tag value of the log.
The local_resource_id should be in format:
k8s_container.<namespace_name>.<pod_name>.<container_name>
k8s_node.<node_name>
k8s_pod.<namespace_name>.<pod_name>
This implies that if there is no local_resource_id in the log entry, then the tag of the logs should match this format. Note that there is a tag_prefix option, so it is not mandatory to use k8s_container (or node/pod) as the tag prefix.
An upstream connection error means Fluent Bit was not able to reach Google services; the error looks like this:
This is caused by a network issue in the environment where Fluent Bit is running; make sure that from the host, container or pod you can reach the following Google endpoints:
The error looks like this:
Perform the following checks:
If the log entry does not contain the local_resource_id field, does the tag of the log match the expected format?
If tag_prefix is configured, does the prefix of the tag specified in the input plugin match the tag_prefix?
Other implementations
Here we specify gathering CPU usage metrics and printing them to the standard output in a human-readable way:
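A possible invocation, assuming the stdout output plugin is the intended destination:

```
fluent-bit -i cpu -o stdout -m '*'
```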
The Splunk output plugin allows you to ingest your records into a Splunk service through the HTTP Event Collector (HEC) interface.
To get more details about how to setup the HEC in Splunk please refer to the following documentation:
The Splunk output plugin supports TLS/SSL; for more details about the available properties and general configuration, please refer to the TLS/SSL section.
For more information on the Splunk HEC payload format and all the event metadata Splunk accepts, see here:
With Splunk version 8.0 and above you can also use the Fluent Bit Splunk output plugin to send data to metric indices. This allows you to perform visualizations, metric queries, and analysis alongside other metrics you may be collecting. It builds on Splunk 8.0's support for sending multiple metrics in a single JSON payload; more details can be found in the Splunk documentation:
The Stackdriver output plugin allows you to ingest your records into the Google Cloud Stackdriver Logging service.
GitHub reference:
Stackdriver officially supports a logging agent based on Fluentd.
We plan to support some special fields in structured payloads; use cases for special fields are documented separately.
Key | Description | Default |
---|---|---|
webhook | Absolute address of the Webhook provided by Slack | |
Key | Description | Default |
---|---|---|
base_uri | Full address of the New Relic API endpoint. By default the value points to the US endpoint. If you want to use the EU endpoint you can set this key to https://log-api.eu.newrelic.com/log/v1 | https://log-api.newrelic.com/log/v1 |
api_key | Your key for data ingestion. The API key is also called the ingestion key; you can get more details on how to generate it in the official documentation here. From a configuration perspective either an api_key or a license_key is required; New Relic suggests primarily using the api_key. | |
license_key | Optional authentication parameter for data ingestion. Note that New Relic suggests using the api_key instead. You can read more about the License Key here. | |
compress | Set the compression mechanism for the payload. This option allows two values: gzip (enabled by default) or false to disable compression. | gzip |
Key | Description | Default |
---|---|---|
Host | Hostname/IP address of the PostgreSQL instance | - (127.0.0.1) |
Port | PostgreSQL port | - (5432) |
User | PostgreSQL username | - (current user) |
Password | Password of PostgreSQL username | - |
Database | Database name to connect to | - (current user) |
Table | Table name where to store data | - |
Timestamp_Key | Key in the JSON object containing the record timestamp | date |
Async | Define if we will use async or sync connections | false |
min_pool_size | Minimum number of connections in async mode | 1 |
max_pool_size | Maximum number of connections in async mode | 4 |
cockroachdb | Set to true if you will connect the plugin with a CockroachDB | false |
Key | Description | Default |
---|---|---|
Format | Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream. | msgpack |
json_date_key | Specify the name of the time key in the output record. To disable the time key just set the value to false. | date |
json_date_format | Specify the format of the date. Supported formats are double, epoch, iso8601 (e.g. 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (e.g. 2018-05-30 09:39:52.000681). | double |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 1 |
Key | Description | Default |
---|---|---|
host | IP address or hostname of the target Splunk service. | 127.0.0.1 |
port | TCP port of the target Splunk service. | 8088 |
splunk_token | Specify the Authentication Token for the HTTP Event Collector interface. | |
http_user | Optional username for Basic Authentication on HEC. | |
http_passwd | Password for the user defined in http_user. | |
http_buffer_size | Buffer size used to receive Splunk HTTP responses. | 2M |
compress | Set payload compression mechanism. The only available option is gzip. | |
channel | Specify the X-Splunk-Request-Channel header for the HTTP Event Collector interface. | |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
splunk_send_raw | When enabled, the record keys and values are set at the top level of the map instead of under the event key. Refer to the Sending Raw Events section of the docs for details on making this option work properly. | off |
event_key | Specify the key name that will be used to send a single value as part of the record. | |
event_host | Specify the key name that contains the host value. This option allows a record accessor pattern. | |
event_source | Set the source value to assign to the event data. | |
event_sourcetype | Set the sourcetype value to assign to the event data. | |
event_sourcetype_key | Set a record key that will populate 'sourcetype'. If the key is found, it will have precedence over the value set in event_sourcetype. | |
event_index | The name of the index by which the event data is to be indexed. | |
event_index_key | Set a record key that will populate the index field. If the key is found, it will have precedence over the value set in event_index. | |
event_field | Set event fields for the record. This option can be set multiple times and the format is key_name record_accessor_pattern. | |
Key | Description | Default |
---|---|---|
google_service_credentials | Absolute path to a Google Cloud credentials JSON file. | Value of environment variable $GOOGLE_SERVICE_CREDENTIALS |
service_account_email | Account email associated to the service. Only available if no credentials file has been provided. | Value of environment variable $SERVICE_ACCOUNT_EMAIL |
service_account_secret | Private key content associated with the service account. Only available if no credentials file has been provided. | Value of environment variable $SERVICE_ACCOUNT_SECRET |
metadata_server | Prefix for a metadata server. Can also be set via the environment variable $METADATA_SERVER. | |
location | The GCP or AWS region in which to store data about the resource. If the resource type is one of generic_node or generic_task, then this field is required. | |
namespace | A namespace identifier, such as a cluster name or environment. If the resource type is one of generic_node or generic_task, then this field is required. | |
node_id | A unique identifier for the node within the namespace, such as a hostname or IP address. If the resource type is generic_node, then this field is required. | |
job | An identifier for a grouping of related tasks, such as the name of a microservice or distributed batch. If the resource type is generic_task, then this field is required. | |
task_id | A unique identifier for the task within the namespace and job, such as a replica index identifying the task within the job. If the resource type is generic_task, then this field is required. | |
export_to_project_id | The GCP project that should receive these logs. | Defaults to the project ID of the google_service_credentials file, or the project_id from Google's metadata.google.internal server. |
resource | Set the resource type of the data. Supported resource types: k8s_container, k8s_node, k8s_pod, global, generic_node, generic_task, and gce_instance. | global, gce_instance |
k8s_cluster_name | The name of the cluster that the container (node or pod, based on the resource type) is running in. If the resource type is one of k8s_container, k8s_node or k8s_pod, then this field is required. | |
k8s_cluster_location | The physical location of the cluster that contains the container (node or pod, based on the resource type). If the resource type is one of k8s_container, k8s_node or k8s_pod, then this field is required. | |
labels_key | The value of this field is used by the Stackdriver output plugin to find the related labels from jsonPayload and then extract their values to set the LogEntry labels. | logging.googleapis.com/labels |
tag_prefix | Set the tag_prefix used to validate the tag of logs with a k8s resource type. Without this option, the tag of the log must be in the format k8s_container(pod/node).* in order to use the k8s_container resource type. The tag prefix is configurable via this option (note the ending dot). | k8s_container., k8s_pod., k8s_node. |
severity_key | Specify the name of the key from the original record that contains the severity information. | |
autoformat_stackdriver_trace | Rewrite the trace field to include the projectID and format it for use with Cloud Trace. When this flag is enabled, the user can get the correct result by printing only the traceID (usually 32 characters). | false |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
The tcp output plugin allows you to send records to a remote TCP server. The payload can be formatted in different ways as required.
The following parameters are available to configure a secure channel connection through TLS:
Here we specify gathering CPU usage metrics and sending them in JSON lines format to a remote endpoint, using netcat as the receiving service, e.g.:
Run the following in a separate terminal; netcat will start listening for messages on TCP port 5170:
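A plausible listener command (flags vary slightly between netcat implementations):

```
nc -l 5170
```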
Start Fluent Bit
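For instance, mirroring the scenario above (address and format are the ones just described):

```
fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v
```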
No more, no less, it just works.
The Syslog output plugin allows you to deliver messages to Syslog servers. It supports RFC3164 and RFC5424 formats through different transports such as UDP, TCP or TLS.
As of Fluent Bit v1.5.3 the configuration is very strict. You must be aware of the structure of your original record so you can configure the plugin to use specific keys to compose your outgoing Syslog message.
Future versions of Fluent Bit are expanding this plugin feature set to support better handling of keys and message composing.
Get started quickly with this configuration file:
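A minimal sketch, assuming a local Syslog server on UDP port 514; the dummy input is used only because its default record carries a message key:

```
[INPUT]
    Name  dummy
    Tag   my_logs

[OUTPUT]
    Name               syslog
    Match              *
    Host               127.0.0.1
    Port               514
    Mode               udp
    Syslog_Format      rfc5424
    Syslog_Message_Key message
```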
The following is an example of how to configure the syslog_sd_key to send Structured Data to the remote Syslog server.
Example log:
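For illustration, assume an incoming record like this (all field names and values are hypothetical):

```
{
    "hostname": "myhost",
    "appname": "myapp",
    "message": "Hello, Syslog!",
    "sd": {
        "mySDID@12345": {
            "key1": "value1"
        }
    }
}
```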
Example configuration file:
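A matching configuration sketch, mapping each record key from the hypothetical log above to its Syslog counterpart:

```
[OUTPUT]
    Name                syslog
    Match               *
    Host                127.0.0.1
    Port                514
    Mode                udp
    Syslog_Format       rfc5424
    Syslog_Hostname_Key hostname
    Syslog_Appname_Key  appname
    Syslog_Message_Key  message
    Syslog_SD_Key       sd
```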
Example output:
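Under those assumptions, the resulting RFC5424 message could look roughly like this (the priority value and timestamp will differ in practice):

```
<14>1 2024-01-01T12:00:00.000000Z myhost myapp - - [mySDID@12345 key1="value1"] Hello, Syslog!
```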
The websocket output plugin allows you to flush your records into a WebSocket endpoint. For now the functionality is fairly basic: it issues an HTTP GET request to perform the handshake and then uses a TCP connection to send the data records in either JSON or MessagePack format.
In order to insert records into an HTTP server, you can run the plugin from the command line or through the configuration file:
The websocket plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:
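A plausible shape (host, port and URI path are placeholders):

```
websocket://host:port/something
```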
Using the format specified, you could start Fluent Bit through:
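A possible invocation under those assumptions:

```
fluent-bit -i cpu -t cpu -o websocket://127.0.0.1:8080/something -m '*'
```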
In your main configuration file, append the following Input & Output sections:
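A sketch of equivalent Input & Output sections (host, port and URI are placeholders):

```
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name    websocket
    Match   *
    Host    127.0.0.1
    Port    8080
    URI     /something
    Format  json
```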
The websocket plugin works in TCP keepalive mode; please refer to the networking section for details. Since websocket is a stateful plugin, it decides when to send the handshake to the server side, for example when the plugin has just started or after the connection with the server has been dropped. In general, the interval for initiating a new websocket handshake will be less than the keepalive interval. With that strategy, it can detect and resume websocket connections.
Once Fluent Bit is running, you can send some messages using netcat:
In Fluent Bit we should see the following output:
From the Fluent Bit log output, we can see that once data has been ingested into Fluent Bit, the plugin performs the handshake. If no data or traffic flows for a while, the TCP connection is aborted. When another piece of data arrives, the websocket plugin triggers a retry, with another handshake and data flush.
There is another scenario: if the websocket server flaps within a short time, meaning it goes down and comes back up quickly, Fluent Bit will resume the TCP connection immediately. In that case, however, the websocket output plugin is left in a malfunctioning state, and Fluent Bit needs to be restarted to get back to work.
The plugin supports the following configuration parameters:
Ideally you don't want to expose your API key on the command line; using a configuration file is highly recommended.
In your main configuration file append the following Input & Output sections:
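A minimal sketch, assuming this section documents the New Relic (nrlogs) output; the key value is a placeholder:

```
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name    nrlogs
    Match   *
    api_key YOUR_API_KEY_HERE
```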
The HTTP input plugin allows you to send custom records to an HTTP endpoint.
The http input plugin allows Fluent Bit to open up an HTTP port that you can then route data to in a dynamic way. This plugin supports dynamic tags, which allow you to send data with different tags through the same input. An example curl message can be seen below.
How to set tag
The tag for the HTTP input plugin is set by adding the tag to the end of the request URL. This tag is then used to route the event through the system. For example, in the curl message below the tag set is app.log. If you do not set the tag, http.0 is automatically used. If you have multiple HTTP inputs then they will follow a pattern of http.N, where N is an integer representing the input.
Example Curl message
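A plausible request, assuming the input was configured to listen on port 8888 (the plugin's default is 9880) and using a hypothetical JSON body:

```
curl -d '{"key1":"value1","key2":"value2"}' -XPOST -H "content-type: application/json" http://localhost:8888/app.log
```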
The td output plugin allows you to flush your records into the Treasure Data cloud service.
In order to start inserting records into Treasure Data, you can run the plugin from the command line or through the configuration file:
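A possible command-line sketch; the API key, database and table names are placeholders:

```
fluent-bit -i cpu -o td -p API=YOUR_API_KEY -p Database=my_db -p Table=my_table -m '*'
```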
Key | Description | Default |
---|---|---|
Host | Target host where Fluent Bit or Fluentd are listening for Forward messages. | 127.0.0.1 |
Port | TCP port of the target service. | 5170 |
Format | Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream. | msgpack |
json_date_key | Specify the name of the time key in the output record. To disable the time key just set the value to false. | date |
json_date_format | Specify the format of the date. Supported formats are double, epoch, iso8601 (e.g. 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (e.g. 2018-05-30 09:39:52.000681). | double |
Workers | Enables dedicated thread(s) for this output. The default value is set since version 1.8.13; for previous versions it is 0. | 2 |
Key | Description | Default |
---|---|---|
tls | Enable or disable TLS support | Off |
tls.verify | Force certificate validation | On |
tls.debug | Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose) | 1 |
tls.ca_file | Absolute path to CA certificate file | |
tls.crt_file | Absolute path to Certificate file | |
tls.key_file | Absolute path to private Key file | |
tls.key_passwd | Optional password for tls.key_file file | |
Key | Description | Default |
---|---|---|
host | Domain or IP address of the remote Syslog server. | 127.0.0.1 |
port | TCP or UDP port of the remote Syslog server. | 514 |
mode | Desired transport type. Available options are tcp, tls and udp. | udp |
syslog_format | The Syslog protocol format to use. Available options are rfc3164 and rfc5424. | rfc5424 |
syslog_maxsize | The maximum size allowed per message. The value must be an integer representing the number of bytes allowed. If no value is provided, the default size is set depending on the protocol version specified by syslog_format: rfc3164 sets the max size to 1024 bytes, while rfc5424 sets it to 2048 bytes. | |
syslog_severity_key | The key name from the original record that contains the Syslog severity number. This configuration is optional. | |
syslog_facility_key | The key name from the original record that contains the Syslog facility number. This configuration is optional. | |
syslog_hostname_key | The key name from the original record that contains the hostname that generated the message. This configuration is optional. | |
syslog_appname_key | The key name from the original record that contains the application name that generated the message. This configuration is optional. | |
syslog_procid_key | The key name from the original record that contains the Process ID that generated the message. This configuration is optional. | |
syslog_msgid_key | The key name from the original record that contains the Message ID associated to the message. This configuration is optional. | |
syslog_sd_key | The key name from the original record that contains the Structured Data (SD) content. This configuration is optional. | |
syslog_message_key | The key name from the original record that contains the message to deliver. Note that this property is mandatory; otherwise the message will be empty. | |
Key | Description | Default |
---|---|---|
Host | IP address or hostname of the target WebSocket server | 127.0.0.1 |
Port | TCP port of the target WebSocket server | 80 |
URI | Specify an optional HTTP URI for the target websocket server, e.g. /something | / |
Format | Specify the data format to be used in the HTTP request body. By default it uses msgpack; other supported formats are json, json_stream, json_lines and gelf. | msgpack |
json_date_key | Specify the name of the date field in output | date |
json_date_format | Specify the format of the date. Supported formats are double, epoch, iso8601 (e.g. 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (e.g. 2018-05-30 09:39:52.000681) | double |
Key | Description | Default |
---|---|---|
host | The address to listen on | 0.0.0.0 |
port | The port for Fluent Bit to listen on | 9880 |
buffer_max_size | Specify the maximum buffer size in KB to receive a JSON message. | 4M |
buffer_chunk_size | This sets the chunk size for incoming JSON messages. These chunks are then stored/managed in the space available by buffer_max_size. | 512K |
Key | Description | Default |
---|---|---|
API | The API key. To obtain it, please log into the Treasure Data console and, in the API keys box, copy the API key hash. | |
Database | Specify the name of your target database. | |
Table | Specify the name of your target table where the records will be stored. | |
Region | Set the service region; available values: US and JP. | US |