Fluent Bit v2.1 Documentation

High Performance Telemetry Agent for Logs, Metrics and Traces

Fluent Bit is a fast and lightweight telemetry agent for logs, metrics, and traces for Linux, macOS, Windows, and BSD family operating systems. It has been built with a strong focus on performance to allow the collection and processing of telemetry data from different sources without complexity.

Features

High Performance: High throughput with low resource consumption
Data Parsing
- Convert your unstructured messages using our parsers: JSON, Regex, LTSV and Logfmt
Metrics Support: Prometheus and OpenTelemetry compatible
Reliability and Data Integrity
- Backpressure Handling
- Data Buffering in memory and file system
Networking
- Security: built-in TLS/SSL support
- Asynchronous I/O
Pluggable Architecture and Extensibility: Inputs, Filters and Outputs
- More than 100 built-in plugins are available
- Extensibility
  - Write any input, filter or output plugin in the C language
  - WASM: WASM Filter Plugins or WASM Input Plugins
  - Bonus: write Filters in Lua or Output plugins in Golang
Monitoring: expose internal metrics over HTTP in JSON and Prometheus format
Stream Processing: perform data selection and transformation using simple SQL queries
- Create new streams of data using query results
- Aggregation Windows
- Data analysis and prediction: Timeseries forecasting
Portable: runs on Linux, macOS, Windows and BSD systems

Fluent Bit, Fluentd and CNCF

Fluent Bit is a CNCF graduated sub-project under the umbrella of Fluentd. Fluent Bit is licensed under the terms of the Apache License v2.0.

Fluent Bit was originally created by Eduardo Silva. As a CNCF-hosted project, it is a fully vendor-neutral and community-driven project.

About

What is Fluent Bit?

Fluent Bit is a CNCF sub-project under the umbrella of Fluentd

Fluent Bit is an open-source telemetry agent specifically designed to efficiently handle the challenges of collecting and processing telemetry data across a wide range of environments, from constrained systems to complex cloud infrastructures. Managing telemetry data from various sources and formats can be a constant challenge, particularly when performance is a critical factor.

Rather than serving as a drop-in replacement, Fluent Bit enhances the observability strategy for your infrastructure by adapting and optimizing your existing logging layer, as well as metrics and traces processing. Furthermore, Fluent Bit supports a vendor-neutral approach, seamlessly integrating with other ecosystems such as Prometheus and OpenTelemetry. Trusted by major cloud providers, banks, and companies in need of a ready-to-use telemetry agent solution, Fluent Bit effectively manages diverse data sources and formats while maintaining optimal performance.

Fluent Bit can be deployed as an edge agent for localized telemetry data handling or utilized as a central aggregator/collector for managing telemetry data across multiple sources and environments.

Fluent Bit has been designed with performance and low resource consumption in mind.

A Brief History of Fluent Bit

Every project has a story

In 2014, the Fluentd team at Treasure Data forecast the need for a lightweight log processor for constrained environments like embedded Linux and gateways. The project aimed to be part of the Fluentd ecosystem. At that moment, Eduardo created Fluent Bit, a new open source solution written from scratch and available under the terms of the Apache License v2.0.

After the project had been around for some time, it gained traction on regular Linux systems. With the new containerized world, the Cloud Native community also asked to extend the project scope to support more sources, filters, and destinations. Not long after, Fluent Bit became one of the preferred solutions to solve the logging challenges in cloud environments.

Fluentd & Fluent Bit

The Production Grade Telemetry Ecosystem

Telemetry data processing can be complex, and even more so at scale. That's why Fluentd was born. Fluentd has become more than a simple tool: it has grown into a full-scale ecosystem that contains SDKs for different languages and sub-projects like Fluent Bit.

On this page, we describe the relationship between the Fluentd and Fluent Bit open source projects. In summary, both are:

Licensed under the terms of Apache License v2.0
Graduated hosted projects of the CNCF
Production-grade solutions: deployed millions of times every single day.
Vendor neutral and community driven projects
Widely Adopted by the Industry: trusted by all major companies like AWS, Microsoft, Google Cloud and hundreds of others.

Both projects share a lot of similarities. Fluent Bit is fully designed and built on top of the best ideas of the Fluentd architecture and general design. Choosing which one to use depends on the end user's needs.

The following table describes a comparison of different areas of the projects:

Fluentd

Fluent Bit

Both Fluentd and Fluent Bit can work as Aggregators or Forwarders. They can complement each other or be used as standalone solutions. In recent years, cloud providers have switched from Fluentd to Fluent Bit for performance and compatibility reasons. Fluent Bit is now considered the next-generation solution.

Concepts

Buffering

Performance and Data Safety

When Fluent Bit processes data, it uses the system memory (heap) as a primary and temporary place to store the records before they get delivered. The records are processed in this private memory area.

Buffering refers to the ability to store the records somewhere and, while they are processed and delivered, still be able to store more. Buffering in memory is the fastest mechanism, but there are certain scenarios where it requires special strategies to deal with backpressure, ensure data safety, or reduce memory consumption by the service in constrained environments.

Network failures or latency in third-party services are pretty common. In scenarios where data cannot be delivered as fast as new data arrives to be processed, we will likely face backpressure.

Our buffering strategies are designed to solve problems associated with backpressure and general delivery failures.

As for buffering strategies, Fluent Bit offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can accommodate any use case safely and keep high performance while processing your data.

The two mechanisms are not mutually exclusive. When data is ready to be processed or delivered it will always be in memory, while other data in the queue might be in the file system until it is ready to be processed and moved up to memory.
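To make the hybrid model more concrete, the following Python sketch keeps a bounded number of chunks in memory and spills the rest to the file system, moving them back up as memory frees. It is an illustration only and not Fluent Bit's implementation: the HybridBuffer class, its mem_limit parameter and the JSON-on-disk format are assumptions made for the example.

```python
import json
from collections import deque
from pathlib import Path


class HybridBuffer:
    """FIFO buffer (hypothetical): a few chunks in memory, overflow on disk."""

    def __init__(self, mem_limit, spill_dir):
        self.mem_limit = mem_limit          # max chunks held in memory
        self.spill_dir = Path(spill_dir)    # directory for overflow chunks
        self.memory = deque()               # chunks ready to be processed
        self.on_disk = deque()              # paths of chunks waiting on disk
        self._seq = 0

    def push(self, chunk):
        # Use memory while there is room and nothing older waits on disk,
        # otherwise spill so that ordering is preserved.
        if len(self.memory) < self.mem_limit and not self.on_disk:
            self.memory.append(chunk)
            return
        path = self.spill_dir / f"chunk-{self._seq}.json"
        self._seq += 1
        path.write_text(json.dumps(chunk))
        self.on_disk.append(path)

    def pop(self):
        # Data handed out for processing always comes from memory.
        chunk = self.memory.popleft()
        if self.on_disk:
            # Move the oldest on-disk chunk up to memory.
            path = self.on_disk.popleft()
            self.memory.append(json.loads(path.read_text()))
            path.unlink()
        return chunk
```

With mem_limit set to 2 and five chunks pushed, two chunks stay in memory and three go to disk; popping returns them in the order they were pushed.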

To learn more about the buffering configuration in Fluent Bit, please jump to the Buffering & Storage section.

Data Pipeline

Input

The way to gather data from your sources

Fluent Bit provides different Input Plugins to gather information from different sources. Some of them just collect data from log files, while others can gather metrics information from the operating system. There are many plugins for different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.

For more details, please refer to the Input Plugins section.

Parser

Convert Unstructured to Structured messages

Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally we want to set a structure on the incoming data gathered by the Input Plugins as soon as it is collected.

The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:

192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above log line is a raw string without format. Ideally we would like to give it a structure that can be processed later easily. If the proper configuration is used, the log entry could be converted to:

{
  "host":    "192.168.2.20",
  "user":    "-",
  "method":  "GET",
  "path":    "/cgi-bin/try/",
  "code":    "200",
  "size":    "3395",
  "referer": "",
  "agent":   ""
}

Parsers are fully configurable and are independently and optionally handled by each input plugin. For more details, please refer to the Parsers section.
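As a rough illustration of what such a conversion involves, the Python sketch below applies a regular expression with named groups to the Apache log entry shown above. The expression and the parse_apache helper are assumptions written for this example; they are not the parser definition that ships with Fluent Bit.

```python
import re

# Hypothetical pattern for the Apache access log format used in the example.
# The trailing referer/agent pair is optional because the sample line omits it.
APACHE_RE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d+) (?P<size>\d+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_apache(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = APACHE_RE.match(line)
    if match is None:
        return None
    # Groups that did not take part in the match become empty strings.
    record = match.groupdict(default="")
    # The structured example in the text omits the timestamp field, so drop it.
    record.pop("time")
    return record
```

Calling parse_apache on the sample entry yields the same keys and values as the structured record above, with empty strings for referer and agent.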

Filter

Modify, Enrich or Drop your records

In production environments we want to have full control of the data we are collecting. Filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each available filter can be used to match, exclude or enrich your logs with some specific metadata.

We support many filters. A common use case for filtering is Kubernetes deployments, where every Pod log needs to get the proper metadata associated with it.

Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.

For more details about the Filters available and their usage, please refer to the Filters section.
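The idea of excluding or enriching records can be sketched in a few lines of Python. This is a conceptual model only: the exclude_if, enrich and run_filters helpers and the record keys are hypothetical and do not correspond to any Fluent Bit plugin API.

```python
def exclude_if(key, value):
    """Build a filter that drops records whose `key` equals `value`."""
    def _filter(record):
        return None if record.get(key) == value else record
    return _filter


def enrich(metadata):
    """Build a filter that returns a copy of the record with extra metadata."""
    def _filter(record):
        return {**record, **metadata}
    return _filter


def run_filters(record, filters):
    """Apply filters in order; stop as soon as one of them drops the record."""
    for apply_filter in filters:
        record = apply_filter(record)
        if record is None:
            return None
    return record
```

A chain such as [exclude_if("level", "debug"), enrich({"pod": "web-0"})] drops debug records and adds the pod key to everything else.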

Buffer

Data processing with reliability

Previously defined in the Buffering concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, using either the primary in-memory model or the filesystem-based mode.

The buffer phase already contains the data in an immutable state, meaning that no other filter can be applied.

Note that buffered data is not raw text; it is in Fluent Bit's internal binary representation.

Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.

Output

Destinations for your data: databases, cloud services and more!

The output interface allows us to define destinations for the data. Common destinations are remote services, the local file system, or a standard interface shared with other programs. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.

For more details, please refer to the Output Plugins section.

Installation

Requirements

Fluent Bit has very low CPU and memory consumption. It is compatible with most x86, x86_64, arm32v7 and arm64v8 based platforms. To build it, you need the following components on your system:

Compiler: GCC or clang
CMake
Flex & Bison: only if you enable the Stream Processor or Record Accessor feature (both enabled by default)
Libyaml development headers and libraries

The core has no other dependencies. Third-party components required by certain features, such as output plugins with special backend libraries (e.g. kafka), are included in the main source code repository.

Sources

Build with Static Configuration

Fluent Bit in normal operation mode can be configured through text files or specific command-line arguments. While this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.

Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.

Getting Started

Requirements

The following steps assume you are familiar with configuring Fluent Bit using text files and have experience building it from scratch, as described in the section on building from sources.

Configuration Directory

In your file system, prepare a specific directory that will be used as an entry point for the build system to look up and parse the configuration files. This directory must contain at least one configuration file called fluent-bit.conf containing the required SERVICE, INPUT and OUTPUT sections. As an example, create a new fluent-bit.conf file with the following content:

The configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.

Build with Custom Configuration

Inside the Fluent Bit source code, go to the build/ directory and run CMake, appending the FLB_STATIC_CONF option pointing to the configuration directory just created, e.g.:

Then build it:

At this point the generated fluent-bit binary is ready to run without any further configuration:

Linux Packages

The most secure option is to create the repositories according to the instructions for your specific OS.

A simple installation script is provided to be used for most Linux targets. This will by default install the most recent version released.

This is purely a convenience helper and should always be validated prior to use.

GPG key updates

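From the 1.9.0 and 1.8.15 releases, please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key, so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.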

Migration to Fluent Bit

From version 1.9, td-agent-bit is a deprecated package and is removed after 1.9.9. The correct package name to use now is fluent-bit.

Amazon Linux

Install on Amazon Linux

Fluent Bit is distributed as the fluent-bit package and is available for the latest Amazon Linux 2 and Amazon Linux 2023. The following architectures are supported:

x86_64
aarch64 / arm64v8

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Amazon Linux 2022

Amazon Linux 2022 was previously supported but has been removed since it became GA as Amazon Linux 2023.

Configure Yum

We provide fluent-bit through a Yum repository. In order to add the repository reference to your system, please add a new file called fluent-bit.repo in /etc/yum.repos.d/ with the following content:

Amazon Linux 2

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Amazon Linux 2023

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2023/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Note: we encourage you to always enable gpgcheck for security reasons. All our packages are signed.

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Install

Once your repository is configured, run the following command to install it:

sudo yum install fluent-bit

The next step is to instruct systemd to start the service:

sudo systemctl start fluent-bit

If you do a status check, you should see output similar to this:

$ systemctl status fluent-bit
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/usr/lib/systemd/system/fluent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (fluent-bit)
   CGroup: /system.slice/fluent-bit.service
           └─3820 /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/messages file.

Redhat / CentOS

Install on Redhat / CentOS

Fluent Bit is distributed as the fluent-bit package and is available for the latest stable CentOS system.

The following architectures are supported:

x86_64
aarch64 / arm64v8

For CentOS 9+ we use CentOS Stream as the canonical base system.

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

CentOS 8

CentOS 8 is now EOL so the default Yum repositories are unavailable.

Make sure to configure Yum to use an appropriate mirror, for example:

$ sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-* && \
  sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*

An alternative is to use Rocky or Alma Linux which should be equivalent.

Configure Yum

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/$releasever/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
repo_gpgcheck=1
enabled=1

It is best practice to always enable the gpgcheck and repo_gpgcheck for security reasons. We sign our repository metadata as well as all of our packages.

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Install

Once your repository is configured, run the following command to install it:

sudo yum install fluent-bit

The next step is to instruct systemd to start the service:

sudo systemctl start fluent-bit

If you do a status check, you should see output similar to this:

$ systemctl status fluent-bit
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/usr/lib/systemd/system/fluent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (fluent-bit)
   CGroup: /system.slice/fluent-bit.service
           └─3820 /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/messages file.

FAQ

Yum install fails with a "404 - Page not found" error for the package mirror

The fluent-bit.repo file for the latest installations of Fluent Bit uses a $releasever variable to determine the correct version of the package to install on your system:

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/$releasever/$basearch/
...

Depending on your Red Hat distribution version, this variable may return a value other than the OS major release version (e.g., RHEL7 Server distributions return "7Server" instead of just "7"). The Fluent Bit package URL uses just the major OS release version, so any other value here will cause a 404.

In order to resolve this issue, you can replace the $releasever variable with your system's OS major release version. For example:

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/7/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
repo_gpgcheck=1
enabled=1
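If you need to derive the major release programmatically, for instance when templating the repo file, a small helper can strip the suffix. The following Python sketch is only an illustration under that assumption; the major_release and repo_baseurl names are hypothetical, and $basearch is deliberately left for Yum to expand.

```python
import re


def major_release(releasever):
    """Return the leading digits of a $releasever value, e.g. '7Server' -> '7'."""
    match = re.match(r"\d+", releasever)
    if match is None:
        raise ValueError(f"no major version found in {releasever!r}")
    return match.group(0)


def repo_baseurl(releasever):
    """Build the baseurl line value using only the major OS release."""
    # $basearch is not substituted here; Yum resolves it at install time.
    return f"https://packages.fluentbit.io/centos/{major_release(releasever)}/$basearch/"
```

For a RHEL7 Server system, repo_baseurl("7Server") produces the same URL as the baseurl entry in the corrected repo file above.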

Debian

Fluent Bit is distributed as the fluent-bit package and is available for the latest (and legacy) stable Debian systems: Bookworm and Bullseye. The following architectures are supported:

x86_64
aarch64 / arm64v8

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Server GPG key

The first step is to add our server GPG key to your keyring so that you can get our signed packages. Follow the official Debian wiki guidance:

Updated key from March 2022

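From the 1.9.0 and 1.8.15 releases, please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key, so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.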

Update your sources lists

On Debian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file, making sure to set CODENAME to your specific release codename (e.g. bookworm for Debian 12):

Update your repositories database

Now let your system update the apt database:

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

Install Fluent Bit

Use the following apt-get command to install the latest fluent-bit:

The next step is to instruct systemd to start the service:

If you do a status check, you should see output similar to this:

The default configuration of fluent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Ubuntu

Fluent Bit is distributed as the fluent-bit package and is available for the latest stable Ubuntu system: Jammy Jellyfish.

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Server GPG key

The first step is to add our server GPG key to your keyring to ensure you can get our signed packages. Follow the official Debian wiki guidance:

Updated key from March 2022

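From the 1.9.0 and 1.8.15 releases, please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key, so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.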

Update your sources lists

On Ubuntu, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file, making sure to set CODENAME to your specific release codename (e.g. focal for Ubuntu 20.04):

Update your repositories database

Now let your system update the apt database:

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

If you see the error "Certificate verification failed", check that the ca-certificates package is properly installed (sudo apt-get install ca-certificates).

Install Fluent Bit

Use the following apt-get command to install the latest fluent-bit:

The next step is to instruct systemd to start the service:

If you do a status check, you should see output similar to this:

The default configuration of fluent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Raspbian / Raspberry Pi

Fluent Bit is distributed as the fluent-bit package and is available for the Raspberry Pi, specifically for the Raspbian distribution. The following versions are supported:

Raspbian Bullseye (11)
Raspbian Buster (10)

Server GPG key

The first step is to add our server GPG key to your keyring so that you can get our signed packages:

curl https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Update your sources lists

On Debian and derivative systems such as Raspbian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file.

Raspbian 11 (Bullseye)

deb https://packages.fluentbit.io/raspbian/bullseye bullseye main

Raspbian 10 (Buster)

deb https://packages.fluentbit.io/raspbian/buster buster main

Update your repositories database

Now let your system update the apt database:

sudo apt-get update

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

Install Fluent Bit

Use the following apt-get command to install the latest fluent-bit:

sudo apt-get install fluent-bit

The next step is to instruct systemd to start the service:

sudo service fluent-bit start

If you do a status check, you should see output similar to this:

sudo service fluent-bit status
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/lib/systemd/system/fluent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (fluent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/fluent-bit.service
           └─6739 /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Containers on AWS

AWS maintains a distribution of Fluent Bit combining the latest official release with a set of Go Plugins for sending logs to AWS services. AWS and Fluent Bit are working together to rewrite their plugins for inclusion in the official Fluent Bit distribution.

Plugins

Currently, the image contains Go Plugins for:

Fluent Bit includes an Amazon CloudWatch Logs plugin named cloudwatch_logs, an Amazon Kinesis Firehose plugin named kinesis_firehose and an Amazon Kinesis Data Streams plugin named kinesis_streams, which are higher performance than the Go plugins.

Fluent Bit also includes an S3 output plugin named s3.

Versions and Regional Repositories

AWS vends their container image via Docker Hub and a set of highly available regional Amazon ECR repositories. For more information, see the AWS for Fluent Bit GitHub repository.

The AWS for Fluent Bit image uses a custom versioning scheme because it contains multiple projects. To see what each release contains, check out the release notes on GitHub.

SSM Public Parameters

AWS vends SSM Public Parameters with the regional repository link for each image. These parameters can be queried by any AWS account.

To see a list of available version tags in a given region, run the following command:

To see the ECR repository URI for a given image tag in a given region, run the following:

You can use these SSM public parameters as parameters in your CloudFormation templates:

Amazon EC2

Learn how to install Fluent Bit and the AWS output plugins on Amazon Linux 2 via AWS Systems Manager.

Yocto / Embedded Linux

The Fluent Bit source code provides BitBake recipes to configure, build and package the software for a Yocto-based image. Note that the specific steps for using these recipes in your Yocto environment (Poky) are out of the scope of this documentation.

We distribute two main recipes, one for testing/dev purposes and another with the latest stable release.

Version

Recipe

Description

For production deployments, it's strongly recommended to always use the stable release recipe of Fluent Bit and not the one from Git master.

Fluent Bit and other architectures

Fluent Bit >= v1.1.x fully supports x86_64, x86, arm32v7 and arm64v8.

Administration

Classic mode

Format and Schema

Fluent Bit might optionally use a configuration file to define how the service will behave.

Before proceeding we need to understand how the configuration schema works.

The schema is defined by three concepts:

Sections
Entries: Key/Value
Indented Configuration Mode

A simple example of a configuration file is as follows:

Sections

A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using the [SERVICE] definition. Section rules:

All section content must be indented (4 spaces ideally).
Multiple sections can exist in the same file.
A section is expected to have comments and entries; it cannot be empty.
Any commented line under a section must be indented too.

Entries: Key/Value

A section may contain Entries. An entry is defined by a line of text that contains a Key and a Value. Using the above example, the [SERVICE] section contains two entries: one is the key Daemon with the value off and the other is the key Log_Level with the value debug. Entries rules:

An entry is defined by a key and a value.
A key must be indented.
A key must contain a value, which ends at the line break.
Multiple keys with the same name can exist.

Commented lines are marked by prefixing them with the # character. Those lines are not processed, but they must be indented too.

Indented Configuration Mode

Fluent Bit configuration files are based on a strict Indented Mode, which means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default, an indentation level of four spaces from left to right is suggested. Example:

As you can see, there are two sections with multiple entries and comments. Note also that empty lines are allowed and they do not need to be indented.
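The section and entry rules can be summarized in a short Python sketch. It is a simplified model written for this page, not Fluent Bit's configuration reader: the parse_classic function is hypothetical, it treats a section without entries as an error (an assumption), and it skips comment lines without checking their indentation.

```python
def parse_classic(text):
    """Parse classic-mode text into [(section, [(key, value), ...]), ...]."""
    sections = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # empty lines and comments are not processed
        if stripped.startswith("[") and stripped.endswith("]"):
            sections.append((stripped[1:-1], []))
            continue
        if not raw[0].isspace():
            raise ValueError(f"line {lineno}: entries must be indented")
        if not sections:
            raise ValueError(f"line {lineno}: entry outside of a section")
        parts = stripped.split(None, 1)  # key, then the rest of the line
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: key {parts[0]!r} has no value")
        # A list of pairs (not a dict) so repeated keys are preserved.
        sections[-1][1].append((parts[0], parts[1]))
    for name, entries in sections:
        if not entries:
            raise ValueError(f"section [{name}] is empty")
    return sections
```

Given a [SERVICE] section with the indented entries "Daemon off" and "Log_Level debug", the function returns one section holding those two key/value pairs; an entry that starts in the first column raises ValueError.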

Variables

Fluent Bit supports the usage of environment variables in any value associated with a key when using a configuration file.

The variables are case sensitive and can be used in the following format:

${MY_VARIABLE}

When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve its value.
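The substitution step can be modeled with a short Python sketch. This only illustrates the idea and is not the actual configuration reader: the resolve helper is hypothetical, and replacing an unset variable with an empty string is an assumption of the sketch.

```python
import os
import re

# Matches ${NAME}; the lookup is case sensitive because NAME is used as-is.
_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(value, env=None):
    """Replace every ${NAME} in `value` with its value from the environment."""
    env = os.environ if env is None else env
    # Assumption: variables that are not set resolve to an empty string.
    return _VARIABLE.sub(lambda m: env.get(m.group(1), ""), value)
```

For example, resolve("${MY_VARIABLE}", {"MY_VARIABLE": "stdout"}) returns "stdout", while a lowercase ${my_variable} would not match the same entry.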

When Fluent Bit is running under systemd (using the official packages), environment variables can be set in the following files:

/etc/default/fluent-bit (Debian based system)
/etc/sysconfig/fluent-bit (Others)

These files are ignored if they do not exist.

Example

Create the following configuration file (fluent-bit.conf):

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info

[INPUT]
    Name cpu
    Tag  cpu.local

[OUTPUT]
    Name  ${MY_OUTPUT}
    Match *

Open a terminal and set the environment variable:

$ export MY_OUTPUT=stdout

The above command sets the MY_OUTPUT variable to the value 'stdout'.

Run Fluent Bit with the recently created configuration file:

$ bin/fluent-bit -c fluent-bit.conf
Fluent Bit v1.4.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/03/03 12:25:25] [ info] [engine] started
[0] cpu.local: [1491243925, {"cpu_p"=>1.750000, "user_p"=>1.750000, "system_p"=>0.000000, "cpu0.p_cpu"=>3.000000, "cpu0.p_user"=>2.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>4.000000, "cpu2.p_user"=>4.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

As you can see the service worked properly as the configuration was valid.

Upstream Servers

Fluent Bit output plugins commonly connect to external services to deliver logs over the network; this is the case for HTTP, Elasticsearch and Forward, among others. Connecting to a single node (host) is enough for most use cases, but there are scenarios where balancing across different nodes is required. The Upstream feature provides that capability.

An Upstream defines a set of nodes that will be targeted by an output plugin. By the nature of the implementation, an output plugin must support the Upstream feature. The following plugin(s) have Upstream support:

Forward

The current balancing mode implemented is round-robin.
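As a rough illustration of round-robin balancing, the following Python sketch cycles through a list of nodes in order and wraps around after the last one. The node list and the round_robin helper are hypothetical and only model the selection order; they say nothing about how Fluent Bit handles failed nodes or connection reuse.

```python
from itertools import cycle

# Illustrative node list; names, hosts and ports are placeholders.
nodes = [
    {"name": "node-1", "host": "127.0.0.1", "port": 43000},
    {"name": "node-2", "host": "127.0.0.1", "port": 44000},
    {"name": "node-3", "host": "127.0.0.1", "port": 45000},
]


def round_robin(node_list):
    """Return an endless iterator that visits the nodes in order, then wraps."""
    return cycle(node_list)


picker = round_robin(nodes)
# Four consecutive picks: the fourth wraps around to the first node again.
first_four = [next(picker)["name"] for _ in range(4)]
```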

Configuration

To define an Upstream, you must create a specific configuration file that contains an UPSTREAM section and one or more NODE sections. The following table describes the properties associated with each section. Note that all of them are mandatory:

| Section | Key | Description |
| --- | --- | --- |
| UPSTREAM | name | Defines a name for the Upstream in question. |
| NODE | name | Defines a name for the Node in question. |
| NODE | host | IP address or hostname of the target host. |
| NODE | port | TCP port of the target service. |

Nodes and specific plugin configuration

A Node might contain additional configuration keys required by the plugin, which gives the output plugin enough flexibility. A common use case is the Forward output: if TLS is enabled, it requires a shared key (more details in the example below).

Nodes and TLS (Transport Layer Security)

In addition to the properties defined in the table above, network operations against a defined node can optionally be done over TLS for encryption and certificate use.

The TLS options available are described in the TLS/SSL section and can be added to any Node section.

Configuration File Example

The following example defines an Upstream called forward-balancing, intended for use by the Forward output plugin. It registers three Nodes:

node-1: connects to 127.0.0.1:43000
node-2: connects to 127.0.0.1:44000
node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called shared_key.

[UPSTREAM]
    name       forward-balancing

[NODE]
    name       node-1
    host       127.0.0.1
    port       43000

[NODE]
    name       node-2
    host       127.0.0.1
    port       44000

[NODE]
    name       node-3
    host       127.0.0.1
    port       45000
    tls        on
    tls.verify off
    shared_key secret

Note that every Upstream definition must exist in its own configuration file in the file system. Adding multiple Upstreams in the same file or different files is not allowed.

YAML Configuration

The YAML configuration feature was introduced in Fluent Bit version 1.9 as experimental, and it has been production ready since Fluent Bit 2.0.

Unit Sizes

Certain configuration directives in Fluent Bit refer to unit sizes, such as when defining the size of a buffer or specific limits. These appear in plugins like Tail Input and Forward Input, and in generic properties like Mem_Buf_Limit.

Starting from Fluent Bit v0.11.10, all unit sizes have been standardized across the core and plugins. The following table describes the options that can be used and what they mean:

| Suffix | Description | Example |
| --- | --- | --- |
| (none) | When a suffix is not specified, the value given is assumed to be in bytes. | Specifying a value of 32000 means 32000 bytes. |
| k, K, KB, kb | Kilobyte: a unit of memory equal to 1,000 bytes. | 32k means 32000 bytes. |
| m, M, MB, mb | Megabyte: a unit of memory equal to 1,000,000 bytes. | 1M means 1000000 bytes. |
| g, G, GB, gb | Gigabyte: a unit of memory equal to 1,000,000,000 bytes. | 1G means 1000000000 bytes. |
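The suffix rules above can be expressed as a small conversion routine. The Python sketch below is an illustration built only from the listed suffixes and multipliers; the function name size_to_bytes is hypothetical, and rejecting any suffix spelling that is not listed is an assumption of the sketch.

```python
import re

# Suffixes and decimal multipliers exactly as listed above.
_UNITS = {}
for suffixes, factor in (
    (("k", "K", "KB", "kb"), 1_000),
    (("m", "M", "MB", "mb"), 1_000_000),
    (("g", "G", "GB", "gb"), 1_000_000_000),
):
    for suffix in suffixes:
        _UNITS[suffix] = factor

_SIZE_PATTERN = re.compile(r"^(\d+)([A-Za-z]*)$")


def size_to_bytes(text):
    """Convert a size such as '32k' or '1M' to a number of bytes."""
    match = _SIZE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    number, suffix = match.groups()
    if suffix == "":
        return int(number)  # no suffix: the value is already in bytes
    if suffix not in _UNITS:
        raise ValueError(f"unknown unit suffix: {suffix!r}")
    return int(number) * _UNITS[suffix]
```

With these rules, size_to_bytes("32k") gives 32000 and size_to_bytes("1M") gives 1000000, matching the examples in the table.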

Backpressure

Under certain scenarios it is possible for logs or data to be ingested or created faster than the ability to flush it to some destinations. One such common scenario is when reading from big log files, especially with a large backlog, and dispatching the logs to a backend over the network, which takes time to respond. This generates backpressure leading to high memory consumption in the service.

In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restricts the amount of data that an input plugin can ingest. This is done through the configuration parameters Mem_Buf_Limit and storage.Max_Chunks_Up.

As described in the Buffering concepts section, Fluent Bit offers two modes for data handling: in-memory only (default) and in-memory + filesystem (optional).

The default storage.type memory buffer can be restricted with Mem_Buf_Limit. If memory reaches this limit and you reach a backpressure scenario, you will not be able to ingest more data until the data chunks that are in memory can be flushed. The input will be paused and Fluent Bit will emit a [warn] [input] {input name or alias} paused (mem buf overlimit) log message. Depending on the input plugin in use, this might lead to incoming data being discarded (e.g. TCP input plugin). The tail plugin can handle a pause without data loss; it will store its current file offset and resume reading later. When buffer memory is available, the input will resume collecting/accepting logs and Fluent Bit will emit a [info] [input] {input name or alias} resume (mem buf overlimit) message.

This risk of data loss can be mitigated by configuring secondary storage on the filesystem using the storage.type of filesystem (as described in Buffering & Storage). Initially, logs will be buffered to both memory and filesystem. When the storage.max_chunks_up limit is reached, all the new data will be stored safely only in the filesystem. Fluent Bit will stop enqueueing new data in memory and will only buffer to the filesystem. Please note that when storage.type filesystem is set, the Mem_Buf_Limit setting no longer has any effect, instead, the [SERVICE] level storage.max_chunks_up setting controls the size of the memory buffer.

Mem_Buf_Limit

This option is disabled by default and can be applied to all input plugins. Please note that Mem_Buf_Limit only applies with the default storage.type memory. Let's explain its behavior using the following scenario:

Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine routes the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB

At this exact point, the engine will allow appending those 500KB of data into the memory; in total it will have 1.2MB of data buffered. The limit is permissive and will allow a single write past the limit, but once the limit is exceeded the following actions are taken:

block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback

The engine will protect itself and will not append more data coming from the input plugin in question; note that it is the responsibility of the plugin to keep state and decide what to do in that paused state.

After some time, usually measured in seconds, if the scheduler was able to flush the initial 700KB of data or it has given up after retrying, that amount of memory is released and the following actions will occur:

Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data
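The scenario above can be replayed with a toy model. The Python sketch below is an illustration of the permissive limit as described in this section, not the engine's actual code; the InputBuffer class and its methods are hypothetical, and the pause and resume callbacks are reduced to a boolean flag.

```python
class InputBuffer:
    """Toy model of the permissive Mem_Buf_Limit check described above."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0
        self.paused = False

    def append(self, size):
        """Try to buffer `size` bytes; return True if the data was accepted."""
        if self.paused:
            return False  # local buffers are blocked while paused
        self.used += size  # a single write is allowed to go past the limit
        if self.used > self.limit:
            self.paused = True  # stands in for the pause callback
        return True

    def release(self, size):
        """Free `size` bytes after a flush, or after retries were given up."""
        self.used -= size
        if self.paused and self.used < self.limit:
            self.paused = False  # stands in for the resume callback
```

Replaying the steps: with a limit of 1,000,000 bytes, appending 700,000 and then 500,000 bytes succeeds and leaves 1,200,000 bytes buffered with the input paused; releasing the first 700,000 bytes brings the counter to 500,000 and resumes the input.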

storage.max_chunks_up

Please note that when storage.type filesystem is set, the Mem_Buf_Limit setting no longer has any effect, instead, the [SERVICE] level storage.max_chunks_up setting controls the size of the memory buffer.

The setting behaves similarly to the above scenario with Mem_Buf_Limit when the non-default storage.pause_on_chunks_overlimit is enabled.

When storage.pause_on_chunks_overlimit is disabled (the default), the input will not pause when the memory limit is reached. Instead, it will switch to only buffering logs in the filesystem. The disk space used for filesystem buffering can be limited with storage.total_limit_size.

Please consult the Buffering & Storage docs for more information.

About pause and resume Callbacks

Each plugin is independent and not all of them implement the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.

One example of a plugin that implements these callbacks and keeps state correctly is the Tail Input plugin. When the pause callback is triggered, it pauses its collectors and stops appending data. Upon resume, it resumes the collectors and continues ingesting data. Tail will track the current file offset when it pauses and resume at the same position. If the file has not been deleted or moved, it can still be read.

With the default storage.type memory and Mem_Buf_Limit, the following log messages will be emitted for pause and resume:

[warn] [input] {input name or alias} paused (mem buf overlimit)
[info] [input] {input name or alias} resume (mem buf overlimit)

With storage.type filesystem and storage.max_chunks_up, the following log messages will be emitted for pause and resume:

[input] {input name or alias} paused (storage buf overlimit
[input] {input name or alias} resume (storage buf overlimit

Memory Management

In certain scenarios it would be ideal to estimate how much memory Fluent Bit could be using. This is very useful for containerized environments where memory limits are a must.

In order to do that, we will assume that the input plugins have set the Mem_Buf_Limit option (you can learn more about it in the Backpressure section).

Estimating

Input plugins append data independently, so in order to do an estimation, a limit should be imposed through the Mem_Buf_Limit option. If the limit is set to 10MB, we need to estimate that in the worst case the output plugin could use 20MB.

Fluent Bit has an internal binary representation for the data being processed, but when this data reaches an output plugin, it will likely create its own representation in a new memory buffer for processing. The best examples are the InfluxDB and Elasticsearch output plugins: both need to convert the binary representation to their respective custom JSON formats before it can be sent to the backend servers.

So, if we impose a limit of 10MB for the input plugins and consider the worst case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.
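The arithmetic can be wrapped in a small helper for trying other limits. This Python sketch simply restates the formula used above; the function name estimate_memory_mb is hypothetical, and the 1.2 factor is taken from the text as an extra margin without further justification.

```python
def estimate_memory_mb(mem_buf_limit_mb, output_overhead_mb, margin=1.2):
    """Rough minimum memory estimate in MB.

    mem_buf_limit_mb:   limit imposed on the input plugins (Mem_Buf_Limit)
    output_overhead_mb: worst-case extra memory used by the output plugin
    margin:             multiplier applied on top, 1.2 as in the text above
    """
    return (mem_buf_limit_mb + output_overhead_mb) * margin


# The example from the text: 10MB input limit, 20MB worst-case output usage.
estimate = estimate_memory_mb(10, 20)
```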

Glibc and Memory Fragmentation

It is well known that in intensive environments where memory allocations happen at a very high rate, the default memory allocator provided by Glibc could lead to high fragmentation, which shows up as high memory usage by the service.

It's strongly suggested that in any production environment, Fluent Bit should be built with jemalloc enabled (e.g. -DFLB_JEMALLOC=On). Jemalloc is an alternative memory allocator that can reduce fragmentation (among other things), resulting in better performance.

You can check if Fluent Bit has been built with Jemalloc using the following command:

The output should look like:

If the FLB_HAVE_JEMALLOC option is listed in Build Flags, everything will be fine.

HTTP Proxy

Enable traffic through a proxy server via HTTP_PROXY environment variable

Fluent Bit supports configuring an HTTP proxy for all egress HTTP/HTTPS traffic via the HTTP_PROXY or http_proxy environment variable.

The format for the HTTP proxy environment variable is http://USER:PASS@HOST:PORT, where:

USER is the username when using basic authentication.
PASS is the password when using basic authentication.
HOST is the HTTP proxy hostname or IP address.
PORT is the port the HTTP proxy is listening on.

To use an HTTP proxy with basic authentication, provide the username and password:

When no authentication is required, omit the username and password:

The HTTP_PROXY environment variable is a standard way of setting an HTTP proxy in a containerized environment, and it is also natively supported by any application written in Go. Therefore, we follow and implement the same convention for Fluent Bit. For convenience and compatibility, the http_proxy environment variable is also supported. When both the HTTP_PROXY and http_proxy environment variables are provided, HTTP_PROXY will be preferred.
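To make the URL format and the precedence rule concrete, here is a minimal Python sketch. It is not Fluent Bit's implementation; the helper names pick_proxy and parse_proxy and the example hostname and credentials are hypothetical. It relies only on the standard library's urlsplit to pull out the USER, PASS, HOST and PORT parts.

```python
from urllib.parse import urlsplit


def pick_proxy(env):
    """Return the proxy URL from a mapping of environment variables.

    HTTP_PROXY is preferred over http_proxy when both are set.
    """
    return env.get("HTTP_PROXY") or env.get("http_proxy")


def parse_proxy(url):
    """Split http://USER:PASS@HOST:PORT into its four parts.

    USER and PASS are None when no authentication is given.
    """
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "pass": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }
```

For example, parse_proxy("http://example_user:example_pass@proxy.example.com:8080") returns the user, password, host and port separately, and the same call without the credentials returns None for both user and pass.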

Note: The HTTP output plugin also supports configuring an HTTP proxy. This configuration continues to work, however it should not be used together with the HTTP_PROXY or http_proxy environment variable. This is because under the hood, the environment variable based proxy configuration is implemented by setting up a TCP connection tunnel via HTTP CONNECT. Unlike the plugin's implementation, this supports both HTTP and HTTPS egress traffic.

NO_PROXY

Not all traffic should flow through the HTTP proxy. In this case, the NO_PROXY or no_proxy environment variable should be used.

The format for the no proxy environment variable is a comma-separated list of hostnames or IP addresses whose traffic should not flow through the HTTP proxy.

A domain name matches itself and all its subdomains (i.e. foo.com matches foo.com and bar.foo.com):

A domain with a leading . only matches its subdomains (i.e. .foo.com matches bar.foo.com but not foo.com):
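The two matching rules can be sketched in Python as follows. This is only a model of the rules as stated in this section, with a hypothetical function name; comparing hostnames case-insensitively and ignoring empty list entries are assumptions of the sketch.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if `host` matches an entry of a comma-separated NO_PROXY list."""
    host = host.lower()
    for entry in (item.strip().lower() for item in no_proxy.split(",")):
        if not entry:
            continue  # skip empty items such as a trailing comma
        if entry.startswith("."):
            # Leading dot: matches subdomains only, not the bare domain.
            if host.endswith(entry):
                return True
        elif host == entry or host.endswith("." + entry):
            # Plain name: matches itself and all of its subdomains.
            return True
    return False
```

With this model, bar.foo.com bypasses the proxy for both foo.com and .foo.com, while foo.com itself bypasses it only for the entry without the leading dot.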

One typical use case for NO_PROXY is when running Fluent Bit in a Kubernetes environment, where we want:

All real egress traffic to flow through an HTTP proxy.
All local Kubernetes traffic to not flow through the HTTP proxy.

In this case, we can set:

For convenience and compatibility, the no_proxy environment variable is also supported. When both the NO_PROXY and no_proxy environment variables are provided, NO_PROXY will be preferred.

Hot Reload

Enable hot reload through SIGHUP signal or an HTTP endpoint

Fluent Bit supports the hot reloading feature when enabled via the configuration file or command line with -Y or --enable-hot-reload option.

Getting Started

To get started with reloading via HTTP, the first step is to enable the HTTP Server from the configuration file:

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020
    Hot_Reload   On
...

The above configuration snippet will enable the HTTP endpoint for hot reloading.

How to reload

Via HTTP

Hot reloading can be triggered via the following HTTP endpoints:

PUT /api/v2/reload
POST /api/v2/reload

If users don't enable the hot reloading feature, hot reloading via these endpoints will not work.

To reload Fluent Bit with curl, users must specify an empty request body as follows:

curl -X POST -d '{}' localhost:2020/api/v2/reload
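The same request can be issued from a script. The Python sketch below mirrors the curl command using only the standard library; the helper names are hypothetical, the host and port are the ones from the example configuration, and nothing is assumed here about the content of the response.

```python
import urllib.request


def build_reload_request(host="localhost", port=2020):
    """Build a POST request to the reload endpoint with an empty JSON body."""
    url = f"http://{host}:{port}/api/v2/reload"
    return urllib.request.Request(url, data=b"{}", method="POST")


def reload_fluent_bit(host="localhost", port=2020, timeout=5):
    """Send the reload request and return the HTTP status code.

    Requires a running Fluent Bit with the HTTP server and hot reload enabled.
    """
    request = build_reload_request(host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending keeps the first helper usable without a running server, for example when checking the URL in a deployment script.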

Via Signal

Hot reloading can also be triggered via SIGHUP.

SIGHUP signal is not supported on Windows. So, users can't enable this feature on Windows.

Limitations

The hot reloading feature currently works on Linux and macOS. Windows is not supported yet.

Local Testing

Running a Logging Pipeline Locally

You may wish to test a logging pipeline locally to observe how it deals with log messages. The following is a walk-through for running Fluent Bit and Elasticsearch locally with Docker Compose which can serve as an example for testing other plugins locally.

Create a Configuration File

Refer to the Configuration File section to create a configuration to test.

fluent-bit.conf:

[INPUT]
  Name dummy
  Dummy {"top": {".dotted": "value"}}

[OUTPUT]
  Name es
  Host elasticsearch
  Replace_Dots On

Docker Compose

Use Docker Compose to run Fluent Bit (with the configuration file mounted) and Elasticsearch.

docker-compose.yaml:

version: "3.7"

services:
  fluent-bit:
    image: fluent/fluent-bit
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
    depends_on:
      - elasticsearch
  elasticsearch:
    image: elasticsearch:7.6.2
    ports:
      - "9200:9200"
    environment:
      - discovery.type=single-node

View indexed logs

To view indexed logs run:

curl "localhost:9200/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d'{ "query": { "match_all": {} }}'

To "start fresh", delete the index by running:

curl -X DELETE "localhost:9200/fluent-bit?pretty"

Data Pipeline

Pipeline Monitoring

Learn how to monitor your data pipeline with external services

A Data Pipeline represents a flow of data that goes through the inputs (sources), filters, and outputs (sinks). There are a couple of ways to monitor the pipeline. We recommend the following sections for a better understanding and steps to get started:

HTTP Server: JSON and Prometheus Exporter-style metrics
Grafana Dashboards and Alerts
Health Checks
Calyptia Cloud: hosted service to monitor and visualize your pipelines

Inputs

Collectd

The collectd input plugin allows you to receive datagrams from the collectd service.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

Configuration Examples

Here is a basic configuration example.

With this configuration, Fluent Bit listens to 0.0.0.0:25826, and outputs incoming datagram packets to stdout.

You must set the same types.db files that your collectd server uses. Otherwise, Fluent Bit may not be able to interpret the payload properly.

Docker Log Based Metrics

The docker input plugin allows you to collect Docker container metrics such as memory usage and CPU consumption.

Configuration Parameters

The plugin supports the following configuration parameters:

| Key | Description | Default |
| --- | --- | --- |
| Interval_Sec | Polling interval in seconds | |
| Include | A space-separated list of containers to include | |
| Exclude | A space-separated list of containers to exclude | |

If you set neither Include nor Exclude, the plugin will try to get metrics from all the running containers.

Configuration File

Here is an example configuration that collects metrics from two docker instances (6bab19c3a0f9 and 14159be4ca2c).

[INPUT]
    Name         docker
    Include      6bab19c3a0f9 14159be4ca2c
[OUTPUT]
    Name   stdout
    Match  *

This configuration will produce records like below.

[1] docker.0: [1571994772.00555745, {"id"=>"6bab19c3a0f9", "name"=>"postgresql", "cpu_used"=>172102435, "mem_used"=>5693400, "mem_limit"=>4294963200}]

Docker Events

The docker events input plugin uses the docker API to capture server events. A complete list of possible events returned by this plugin can be found

Configuration Parameters

This plugin supports the following configuration parameters:

Key

Description

Default

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Dummy

The dummy input plugin generates dummy events. It is useful for testing, debugging, benchmarking and getting started with Fluent Bit.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Fluent Bit Metrics

A plugin to collect Fluent Bit's own metrics

Fluent Bit exposes its own metrics to allow you to monitor the internals of your pipeline. The collected metrics can be processed similarly to those from the Prometheus Node Exporter input plugin. They can be sent to output plugins including Prometheus Exporter, Prometheus Remote Write or OpenTelemetry.

Important note: Metrics collected with Node Exporter Metrics flow through a separate pipeline from logs and current filters do not operate on top of metrics.

Configuration

| Key | Description | Default |
| --- | --- | --- |
| scrape_interval | The rate at which metrics are collected from the host operating system | 2 seconds |
| scrape_on_start | Scrape metrics upon start, useful to avoid waiting for 'scrape_interval' for the first round of metrics. | false |

Getting Started

Simple Configuration File

In the following configuration file, the input plugin fluentbit_metrics collects metrics every 2 seconds and exposes them through the Prometheus Exporter output plugin on HTTP/TCP port 2021.

# Fluent Bit Metrics + Prometheus Exporter
# -------------------------------------------
# The following example collects Fluent Bit metrics and exposes
# them through a Prometheus HTTP end-point.
#
# After starting the service try it with:
#
# $ curl http://127.0.0.1:2021/metrics
#
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name            fluentbit_metrics
    tag             internal_metrics
    scrape_interval 2

[OUTPUT]
    name            prometheus_exporter
    match           internal_metrics
    host            0.0.0.0
    port            2021

You can verify that the metrics are exposed by using curl:

curl http://127.0.0.1:2021/metrics

Health

The Health input plugin allows you to check how healthy a TCP server is. It performs the check by opening a TCP connection at a regular interval.

Configuration Parameters

The plugin supports the following configuration parameters:

| Key | Description |
| --- | --- |
| Host | Name of the target host or IP address to check. |
| Port | TCP port where to perform the connection check. |
| Interval_Sec | Interval in seconds between the service checks. Default value is 1. |
| Interval_NSec | Specify a nanoseconds interval for service checks. It works in conjunction with the Interval_Sec configuration key. Default value is 0. |
| Alert | If enabled, it will only generate messages if the target TCP service is down. By default this option is disabled. |
| Add_Host | If enabled, the hostname is appended to each record. Default value is false. |
| Add_Port | If enabled, the port number is appended to each record. Default value is false. |

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the checks with the following options:

$ fluent-bit -i health -p host=127.0.0.1 -p port=80 -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          health
    Host          127.0.0.1
    Port          80
    Interval_Sec  1
    Interval_NSec 0

[OUTPUT]
    Name   stdout
    Match  *

Testing

Once Fluent Bit is running, you will see the health check results in the output interface, similar to this:

$ fluent-bit -i health -p host=127.0.0.1 -p port=80 -o stdout
Fluent Bit v1.8.0
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/06/20 08:39:47] [ info] [engine] started (pid=4621)
[2021/06/20 08:39:47] [ info] [storage] version=1.1.1, initializing...
[2021/06/20 08:39:47] [ info] [storage] in-memory
[2021/06/20 08:39:47] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/06/20 08:39:47] [ info] [sp] stream processor started
[0] health.0: [1624145988.305640385, {"alive"=>true}]
[1] health.0: [1624145989.305575360, {"alive"=>true}]
[2] health.0: [1624145990.306498573, {"alive"=>true}]
[3] health.0: [1624145991.305595498, {"alive"=>true}]
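The check behind the alive field can be approximated in a few lines of Python. This is a sketch of the idea (attempt a TCP connection and report whether it succeeded), not the plugin's implementation; the function names and the timeout parameter are hypothetical, and combining Interval_Sec and Interval_NSec by simple addition is an assumption.

```python
import socket


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved hostnames.
        return False


def interval_seconds(interval_sec=1, interval_nsec=0):
    """Combine the seconds and nanoseconds settings into one interval."""
    return interval_sec + interval_nsec / 1_000_000_000
```

A monitoring loop would call is_alive(host, port) once per interval_seconds(...) and emit one record per check.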

Kernel Logs

The kmsg input plugin reads the Linux Kernel log buffer from the beginning. It gets every record and parses its fields as priority, sequence, seconds, useconds, and message.

Configuration Parameters

Key

Description

Default

Getting Started

In order to start getting the Linux Kernel messages, you can run the plugin from the command line or through the configuration file:

Command Line

As described above, the plugin processed all messages that the Linux Kernel reported. The output has been truncated for clarity.

Configuration File

In your main configuration file append the following Input & Output sections:

Memory Metrics

The mem input plugin gathers information about the memory and swap usage of the running system at a regular interval and reports the total amount of memory and the amount of free memory available.

Getting Started

In order to get memory and swap usage from your system, you can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

MQTT

The MQTT input plugin retrieves messages/data from MQTT control packets over a TCP connection. The incoming data must be a JSON map.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Getting Started

In order to start listening for MQTT messages, you can run the plugin from the command line or through the configuration file:

Command Line

Since the MQTT input plugin lets Fluent Bit behave as a server, we need to dispatch some messages using an MQTT client. In the following example the mosquitto tool is used for that purpose:

The following command line will send a message to the MQTT input plugin:

Configuration File

In your main configuration file append the following Input & Output sections:

Process Log Based Metrics

The Process input plugin allows you to check how healthy a process is. It does so by performing a service check at a regular interval specified by the user.

The Process metrics plugin creates metrics that are log-based (i.e. JSON payload). If you are looking for Prometheus-based metrics please see the Node Exporter Metrics input plugin.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

The following example will check the health of crond process.

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the health of the process:

Random

The Random input plugin generates very simple random value samples using the device interface /dev/urandom. If it is not available, it will use a Unix timestamp as the value.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Getting Started

In order to start generating random samples, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the samples with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

StatsD

The statsd input plugin allows you to receive metrics via StatsD protocol.

Configuration Parameters

The plugin supports the following configuration parameters:

| Key | Description | Default |
| --- | --- | --- |
| Listen | Listener network interface. | 0.0.0.0 |
| Port | UDP port to listen on for connections | 8125 |

Configuration Examples

Here is a configuration example.

[INPUT]
    Name   statsd
    Listen 0.0.0.0
    Port   8125

[OUTPUT]
    Name   stdout
    Match  *

Now you can input metrics through the UDP port as follows:

echo "click:10|c|@0.1" | nc -q0 -u 127.0.0.1 8125
echo "active:99|g"     | nc -q0 -u 127.0.0.1 8125

Fluent Bit will produce the following records:

[0] statsd.0: [1574905088.971380537, {"type"=>"counter", "bucket"=>"click", "value"=>10.000000, "sample_rate"=>0.100000}]
[0] statsd.0: [1574905141.863344517, {"type"=>"gauge", "bucket"=>"active", "value"=>99.000000, "incremental"=>0}]
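The mapping from a StatsD line to the record fields shown above can be sketched in Python. This is an illustration covering only the two metric types from the example, not the plugin's parser; the function name is hypothetical, and two details are assumptions of the sketch: a counter without an @rate field gets a sample_rate of 1.0, and a gauge value with a leading + or - sign is treated as incremental.

```python
def parse_statsd(line):
    """Parse one 'bucket:value|type[|@rate]' line into a record-like dict."""
    head, *fields = line.strip().split("|")
    bucket, raw_value = head.split(":", 1)
    kind = fields[0]
    if kind == "c":
        record = {
            "type": "counter",
            "bucket": bucket,
            "value": float(raw_value),
            "sample_rate": 1.0,  # assumed default when no @rate is given
        }
        for extra in fields[1:]:
            if extra.startswith("@"):
                record["sample_rate"] = float(extra[1:])
        return record
    if kind == "g":
        # Assumption: a leading sign marks a relative (incremental) update.
        incremental = 1 if raw_value[:1] in ("+", "-") else 0
        return {
            "type": "gauge",
            "bucket": bucket,
            "value": float(raw_value),
            "incremental": incremental,
        }
    raise ValueError(f"metric type not covered by this sketch: {kind!r}")
```

Feeding it the two lines sent with nc above yields dictionaries with the same keys and values as the records printed by Fluent Bit.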

Node Exporter Metrics

A plugin based on Prometheus Node Exporter to collect system / host level metrics

Prometheus Node Exporter is a popular way to collect system level metrics from operating systems, such as CPU / Disk / Network / Process statistics. Fluent Bit 1.8.0 includes a node exporter metrics plugin that builds off the Prometheus design to collect system level metrics without having to manage two separate processes or agents.

The initial release of Node Exporter Metrics contains a subset of collectors and metrics available from Prometheus Node Exporter and we plan to expand them over time.

Important note: Metrics collected with Node Exporter Metrics flow through a separate pipeline from logs and current filters do not operate on top of metrics.

This plugin is currently only supported on Linux based operating systems.

Configuration

Key

Description

Default

scrape_interval

The rate at which metrics are collected from the host operating system

5 seconds

path.procfs

The mount point used to collect process information and metrics

/proc/

path.sysfs

The path in the filesystem used to collect system metrics

/sys/

collector.cpu.scrape_interval

The rate in seconds at which cpu metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.cpufreq.scrape_interval

The rate in seconds at which cpufreq metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.meminfo.scrape_interval

The rate in seconds at which meminfo metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.diskstats.scrape_interval

The rate in seconds at which diskstats metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.filesystem.scrape_interval

The rate in seconds at which filesystem metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.uname.scrape_interval

The rate in seconds at which uname metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.stat.scrape_interval

The rate in seconds at which stat metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.time.scrape_interval

The rate in seconds at which time metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.loadavg.scrape_interval

The rate in seconds at which loadavg metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.vmstat.scrape_interval

The rate in seconds at which vmstat metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

collector.filefd.scrape_interval

The rate in seconds at which filefd metrics are collected from the host operating system. If a value greater than 0 is used then it overrides the global default otherwise the global default is used.

0 seconds

metrics

Specifies which metrics are collected from the host operating system. These metrics depend on /proc or /sys; the actual values are read from /proc or /sys when needed. cpu, cpufreq, meminfo, diskstats, filesystem, stat, loadavg, vmstat, netdev, and filefd depend on procfs. cpufreq metrics also depend on sysfs.

"cpu,cpufreq,meminfo,diskstats,filesystem,uname,stat,time,loadavg,vmstat,netdev,filefd"

filesystem.ignore_mount_point_regex

Specify the regex matching the mount points to exclude from collection.

`^/(dev

filesystem.ignore_filesystem_type_regex

Specify the regex matching the filesystem types to exclude from collection.

`^(autofs

diskstats.ignore_device_regex

Specify the regex matching the devices to exclude from diskstats collection.

`^(ram

systemd_service_restart_metrics

Determines if the collector will include service restart metrics

false

systemd_unit_start_time_metrics

Determines if the collector will include unit start time metrics

false

systemd_include_service_task_metrics

Determines if the collector will include service task metrics

false

systemd_include_pattern

Regex that determines which units are included in the metrics produced by the systemd collector.

It is not applied unless explicitly set

systemd_exclude_pattern

Regex that determines which units are excluded from the metrics produced by the systemd collector.

`.+\.(automount

Note: The plugin's top-level scrape_interval setting is the global default. Each collector.xxx.scrape_interval option overrides the interval only for that specific collector and updates the associated set of provided metrics.

The overridden intervals only change the collection interval, not the interval for publishing the metrics which is taken from the global setting. For example, if the global interval is set to 5s and an override interval of 60s is used then the published metrics will be reported every 5s but for the specific collector they will stay the same for 60s until it is collected again. This feature aims to help with down-sampling when collecting metrics.
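The down-sampling behavior described above can be sketched with a configuration like the following. It is a minimal illustration rather than a recommended setup: the 5 and 60 second values mirror the example in the text, and meminfo is picked arbitrarily as the collector to override.

```text
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name                               node_exporter_metrics
    tag                                node_metrics
    # Global default: every collector is scraped, and metrics are
    # published, every 5 seconds.
    scrape_interval                    5
    # Override for one collector: meminfo values are refreshed only
    # every 60 seconds and repeat unchanged in between.
    collector.meminfo.scrape_interval  60

[OUTPUT]
    name            prometheus_exporter
    match           node_metrics
    host            0.0.0.0
    port            2021
```

With this setup, all metrics are still published every 5 seconds, but the meminfo series only change once per minute.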

Collectors available

The following table describes the collectors available as part of this plugin. All of them are enabled by default and respect the original metric names, descriptions, and types from Prometheus Exporter, so you can use your current dashboards without any compatibility problems.

Note: the Version column specifies the Fluent Bit version where the collector is available.

Name

Description

OS

Version

cpu

Exposes CPU statistics.

Linux

v1.8

cpufreq

Exposes CPU frequency statistics.

Linux

v1.8

diskstats

Exposes disk I/O statistics.

Linux

v1.8

filefd

Exposes file descriptor statistics from /proc/sys/fs/file-nr.

Linux

v1.8.2

loadavg

Exposes load average.

Linux

v1.8

meminfo

Exposes memory statistics.

Linux

v1.8

netdev

Exposes network interface statistics such as bytes transferred.

Linux

v1.8.2

stat

Exposes various statistics from /proc/stat. This includes boot time, forks, and interruptions.

Linux

v1.8

time

Exposes the current system time.

Linux

v1.8

uname

Exposes system information as provided by the uname system call.

Linux

v1.8

vmstat

Exposes statistics from /proc/vmstat.

Linux

v1.8.2

systemd collector

Exposes statistics from systemd.

Linux

v2.1.3
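If only some of these collectors are needed, the metrics option from the configuration parameters can list them explicitly. The sketch below assumes that a shorter comma-separated list is accepted in the same form as the documented default. The three collectors chosen are arbitrary.

```text
[INPUT]
    name            node_exporter_metrics
    tag             node_metrics
    scrape_interval 2
    # Collect only CPU, memory and load average statistics
    # instead of the full default list.
    metrics         cpu,meminfo,loadavg
```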

Getting Started

Simple Configuration File

In the following configuration file, the input plugin node_exporter_metrics collects metrics every 2 seconds and exposes them through our Prometheus Exporter output plugin on HTTP/TCP port 2021.

# Node Exporter Metrics + Prometheus Exporter
# -------------------------------------------
# The following example collects host metrics on Linux and exposes
# them through a Prometheus HTTP endpoint.
#
# After starting the service try it with:
#
# $ curl http://127.0.0.1:2021/metrics
#
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name            node_exporter_metrics
    tag             node_metrics
    scrape_interval 2

[OUTPUT]
    name            prometheus_exporter
    match           node_metrics
    host            0.0.0.0
    port            2021

You can check that the metrics are exposed by using curl:

curl http://127.0.0.1:2021/metrics

Container to Collect Host Metrics

When deploying Fluent Bit in a container you will need to specify additional settings to ensure that Fluent Bit has access to the host operating system. The following docker command deploys Fluent Bit with specific mount paths and settings enabled to ensure that Fluent Bit can collect from the host. These are then exposed over port 2021.

docker run -ti -v /proc:/host/proc \
               -v /sys:/host/sys   \
               -p 2021:2021        \
               fluent/fluent-bit:1.8.0 \
               /fluent-bit/bin/fluent-bit \
                         -i node_exporter_metrics -p path.procfs=/host/proc -p path.sysfs=/host/sys \
                         -o prometheus_exporter -p "add_label=host $HOSTNAME" \
                         -f 1
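The same settings can also be written as a configuration file instead of command-line flags. The following is a sketch of an equivalent classic-mode file. It assumes the host paths are mounted at /host/proc and /host/sys as in the command above, and that a HOSTNAME environment variable is visible inside the container (on the command line it is expanded by the host shell instead).

```text
[SERVICE]
    # Equivalent of "-f 1"
    flush           1

[INPUT]
    name            node_exporter_metrics
    # Read from the host's mounted procfs and sysfs, not the container's
    path.procfs     /host/proc
    path.sysfs      /host/sys

[OUTPUT]
    name            prometheus_exporter
    match           *
    port            2021
    # ${HOSTNAME} is resolved from the environment of the Fluent Bit process
    add_label       host ${HOSTNAME}
```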

Fluent Bit + Prometheus + Grafana

If you like dashboards for monitoring, Grafana is one of the preferred options. In our Fluent Bit source code repository, we have pushed a simple docker-compose example. Steps:

Get a copy of Fluent Bit source code

git clone https://github.com/fluent/fluent-bit
cd fluent-bit/docker_compose/node-exporter-dashboard/

Start the service and view your Dashboard

docker-compose up --force-recreate -d --build

Now open your browser at the address http://127.0.0.1:3000. When asked for the credentials to access Grafana, use the admin username and the admin password.

Note that by default the Grafana dashboard plots the data from the last 24 hours, so change it to Last 5 minutes to see the recent data being collected.

Stop the Service

docker-compose down

Enhancement Requests

Our current plugin implements a subset of the collectors available in the original Prometheus Node Exporter. If you would like us to prioritize a specific collector, please open a GitHub issue using the in_node_exporter_metrics template.

Tail

The tail input plugin monitors one or several text files. It behaves similarly to the tail -f shell command.

The plugin reads every file matched by the Path pattern and generates a new record for every new line found (separated by a newline character, \n). Optionally, a database file can be used so the plugin keeps a history of tracked files and their offsets, which is very useful for resuming state if the service is restarted.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

Buffer_Chunk_Size

Set the initial buffer size to read files data. This value is used to increase buffer size. The value must be according to the specification.

32k

Buffer_Max_Size

Set the limit of the buffer size per monitored file. When a buffer needs to be increased (e.g: very long lines), this value is used to restrict how much the memory buffer can grow. If reading a file exceeds this limit, the file is removed from the monitored file list. The value must be according to the specification.

32k

Path

Pattern specifying a specific log file or multiple ones through the use of common wildcards. Multiple patterns separated by commas are also allowed.

Path_Key

If enabled, it appends the name of the monitored file as part of the record. The value assigned becomes the key in the map.

Exclude_Path

Set one or multiple shell patterns separated by commas to exclude files matching certain criteria, e.g: Exclude_Path *.gz,*.zip

Offset_Key

If enabled, Fluent Bit appends the offset of the current monitored file as part of the record. The value assigned becomes the key in the map

Read_from_Head

For new discovered files on start (without a database offset/position), read the content from the head of the file, not tail.

False

Refresh_Interval

The interval of refreshing the list of watched files in seconds.

Rotate_Wait

Specify the extra time, in seconds, to keep monitoring a file once it is rotated, in case some pending data still needs to be flushed.

Ignore_Older

Ignores files older than ignore_older. Supports m, h, d (minutes, hours, days) syntax. Default behavior is to read all.

Skip_Long_Lines

When a monitored file reaches its buffer capacity due to a very long line (Buffer_Max_Size), the default behavior is to stop monitoring that file. Skip_Long_Lines alters that behavior and instructs Fluent Bit to skip long lines and continue processing other lines that fit into the buffer size.

Off

Skip_Empty_Lines

Skips empty lines in the log file from any further processing or output.

Off

DB

Specify the database file to keep track of monitored files and offsets.

DB.sync

Set a default synchronization (I/O) method. Values: Extra, Full, Normal, Off. This flag affects how the internal SQLite engine synchronizes to disk; for more details about each option, refer to the SQLite documentation. Most workload scenarios will be fine with normal mode, but if you really need full synchronization after every write operation, set full mode. Note that full has a high I/O performance cost.

normal

DB.locking

Specify that the database will be accessed only by Fluent Bit. Enabling this feature helps to increase performance when accessing the database, but it prevents any external tool from querying the content.

false

DB.journal_mode

Sets the journal mode for databases (WAL). Enabling WAL provides higher performance. Note that WAL is not compatible with shared network file systems.

WAL

Mem_Buf_Limit

Set a limit on the memory that the Tail plugin can use when appending data to the Engine. If the limit is reached, the plugin is paused; when the data is flushed, it resumes.

Exit_On_Eof

When reading a file, exit as soon as the end of the file is reached. Useful for bulk loads and tests.

false

Parser

Specify the name of a parser to interpret the entry as a structured message.

Key

When a message is unstructured (no parser applied), it's appended as a string under the key name log. This option defines an alternative name for that key.

log

Inotify_Watcher

Set to false to use file stat watcher instead of inotify.

true

Tag

Set a tag (with regex-extract fields) that will be placed on lines read. E.g. kube.<namespace_name>.<pod_name>.<container_name>. Note that "tag expansion" is supported: if the tag includes an asterisk (*), that asterisk will be replaced with the absolute path of the monitored file.

Tag_Regex

Set a regex to extract fields from the file name. E.g. (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-

Static_Batch_Size

Set the maximum number of bytes to process per iteration for the monitored static files (files that already exist upon Fluent Bit start).

50M

Note that if the database parameter DB is not specified, by default the plugin will start reading each target file from the beginning. This might also cause some unwanted behavior: for example, when a line is bigger than Buffer_Chunk_Size and Skip_Long_Lines is not turned on, the file will be read from the beginning at each Refresh_Interval until the file is rotated.
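As an illustration of how several of these parameters fit together, the following sketch tails a set of application logs while keeping state in a database and tolerating very long lines. The paths, the filename key, and the buffer and memory sizes are hypothetical values chosen for the example, not recommendations.

```text
[INPUT]
    Name              tail
    # Hypothetical application log location
    Path              /var/log/app/*.log
    Exclude_Path      *.gz,*.zip
    # Add the source file name to each record under the key "filename"
    Path_Key          filename
    # Persist offsets so a restart resumes where it left off
    DB                /var/lib/fluent-bit/tail.db
    Buffer_Chunk_Size 32k
    Buffer_Max_Size   256k
    # Skip lines that exceed Buffer_Max_Size instead of dropping the file
    Skip_Long_Lines   On
    # Pause the input when this much data is waiting to be flushed
    Mem_Buf_Limit     5MB

[OUTPUT]
    Name   stdout
    Match  *
```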

Multiline Support

Starting from Fluent Bit v1.8 we have introduced a new Multiline core functionality. For the Tail input plugin, this means that it supports both the old configuration mechanism and the new one. To avoid breaking changes, we will keep both but encourage our users to use the latest one. We refer to the two mechanisms as:

Multiline Core
Old Multiline

Multiline Core (v1.8)

The new multiline core is exposed by the following configuration:

Key

Description

multiline.parser

Specify one or multiple Multiline Parser definitions to apply to the content.

As stated in the Multiline Parser documentation, we now provide built-in configuration modes. Note that when using a new multiline.parser definition, you must remove the old configuration options from your tail section, such as:

parser
parser_firstline
parser_N
multiline
multiline_flush
docker_mode

Multiline and Containers (v1.8)

If you are running Fluent Bit to process logs coming from containers like Docker or CRI, you can use the new built-in modes for such purposes. This will help to reassemble multiline messages originally split by Docker or CRI:

[INPUT]
    name              tail
    path              /var/log/containers/*.log
    multiline.parser  docker, cri

pipeline:
  inputs:
    - tail:
      path: /var/log/containers/*.log
      multiline.parser: docker, cri

The two options separated by a comma mean multi-format: try the docker and cri multiline formats.

We are still working on extending support to do multiline for nested stack traces and such. Over the Fluent Bit v1.8.x release cycle we will be updating the documentation.

Old Multiline Configuration Parameters

For the old multiline configuration, the following options exist to configure the handling of multiline logs:

Key

Description

Default

Multiline

If enabled, the plugin will try to discover multiline messages and use the proper parsers to compose the outgoing messages. Note that when this option is enabled the Parser option is not used.

Off

Multiline_Flush

Wait period time in seconds to process queued multiline messages

Parser_Firstline

Name of the parser that matches the beginning of a multiline message. Note that the regular expression defined in the parser must include a group name (named capture), and the value of the last match group must be a string

Parser_N

Optional extra parser to interpret and structure multiline entries. This option can be used to define multiple parsers, e.g: Parser_1 ab1, Parser_2 ab2, Parser_N abN.

Old Docker Mode Configuration Parameters

Docker mode exists to recombine JSON log lines split by the Docker daemon due to its line length limit. To use this feature, configure the tail plugin with the corresponding parser and then enable Docker mode:

Key

Description

Default

Docker_Mode

If enabled, the plugin will recombine split Docker log lines before passing them to any parser as configured above. This mode cannot be used at the same time as Multiline.

Off

Docker_Mode_Flush

Wait period time in seconds to flush queued unfinished split lines.

Docker_Mode_Parser

Specify an optional parser for the first line of the docker multiline mode. The parser name to be specified must be registered in the parsers.conf file.
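A minimal sketch of the old Docker mode follows. It assumes that a parser named docker is registered in parsers.conf and that container logs live under /var/lib/docker/containers. Both are assumptions about the environment, and the flush value is only an example.

```text
[INPUT]
    Name              tail
    # Assumed location of Docker's JSON log files
    Path              /var/lib/docker/containers/*/*.log
    # Parser applied after split lines have been recombined
    Parser            docker
    Docker_Mode       On
    # Seconds to wait before flushing an unfinished split line
    Docker_Mode_Flush 5
```

Because Docker_Mode cannot be combined with Multiline, the Multiline option is left at its default (Off) here.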

Getting Started

In order to tail text or log files, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit parse text files with the following options:

$ fluent-bit -i tail -p path=/var/log/syslog -o stdout

Configuration File

In your main configuration file append the following Input & Output sections.

[INPUT]
    Name        tail
    Path        /var/log/syslog

[OUTPUT]
    Name   stdout
    Match  *

pipeline:
  inputs:
    - tail:
      path: /var/log/syslog
      
  outputs:
    - stdout:
      match: '*'

Old Multi-line example

When using multi-line configuration you need to first specify Multiline On in the configuration and use the Parser_Firstline and additional parser parameters Parser_N if needed. Suppose we are trying to read the following Java stack trace as a single event:

Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)

We need to specify a Parser_Firstline parameter that matches the first line of a multi-line event. Once a match is made, Fluent Bit will read all future lines until another match with Parser_Firstline is made.

In the case above we can use the following parser, which extracts the Time as time and the remaining portion of the multiline as message:

[PARSER]
    Name multiline
    Format regex
    Regex /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
    Time_Key  time
    Time_Format %b %d %H:%M:%S

If we want to further parse the entire event we can add additional parsers with Parser_N where N is an integer. The final Fluent Bit configuration looks like the following:

# Note this is generally added to parsers.conf and referenced in [SERVICE]
[PARSER]
    Name multiline
    Format regex
    Regex /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
    Time_Key  time
    Time_Format %b %d %H:%M:%S

[INPUT]
    Name             tail
    Multiline        On
    Parser_Firstline multiline
    Path             /var/log/java.log

[OUTPUT]
    Name             stdout
    Match            *

Our output will be as follows.

[0] tail.0: [1607928428.466041977, {"message"=>"Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)", "message"=>"at com.myproject.module.MyProject.main(MyProject.java:6)"}]
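For comparison, the same stack trace could be handled with the newer multiline core described earlier. The sketch below is an assumption-laden illustration, not part of the original example: the parser name multiline-java-example and the file name parsers_multiline.conf are hypothetical, and the rule syntax follows the Multiline Parser documentation (the first state is named start_state, and every rule field is in double quotes). The block shows two separate files.

```text
# --- parsers_multiline.conf (hypothetical file name) ---
[MULTILINE_PARSER]
    name          multiline-java-example
    type          regex
    flush_timeout 1000
    # rule | state name    | regex pattern                       | next state
    rule     "start_state"   "/(Dec \d+ \d+\:\d+\:\d+)(.*)/"       "cont"
    rule     "cont"          "/^\s+at.*/"                          "cont"

# --- main configuration file ---
[SERVICE]
    flush            1
    parsers_file     parsers_multiline.conf

[INPUT]
    name             tail
    path             /var/log/java.log
    multiline.parser multiline-java-example

[OUTPUT]
    name             stdout
    match            *
```

The old options (Multiline, Parser_Firstline, Parser_N) are omitted here because they must not be combined with multiline.parser.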

Tailing files keeping state

The tail input plugin has a feature to save the state of the tracked files; enabling it is strongly suggested. For this purpose the db property is available, e.g:

$ fluent-bit -i tail -p path=/var/log/syslog -p db=/path/to/logs.db -o stdout

When running, the database file /path/to/logs.db will be created. This database is backed by SQLite3, so if you are interested in exploring the content, you can open it with the SQLite client tool, e.g:

$ sqlite3 /path/to/logs.db
-- Loading resources from /home/edsiper/.sqliterc

SQLite version 3.14.1 2016-08-11 18:53:32
Enter ".help" for usage hints.
sqlite> SELECT * FROM in_tail_files;
id     name                              offset        inode         created
-----  --------------------------------  ------------  ------------  ----------
1      /var/log/syslog                   73453145      23462108      1480371857
sqlite>

Make sure to explore the database when Fluent Bit is not actively working on the file; otherwise you will see some Error: database is locked messages.

Formatting SQLite

By default the SQLite client tool does not format the columns in a human-readable way, so to explore the in_tail_files table you can create a config file in ~/.sqliterc with the following content:

.headers on
.mode column
.width 5 32 12 12 10

SQLite and Write Ahead Logging

Fluent Bit keeps the state or checkpoint of each file in a SQLite database file, so if the service is restarted, it can continue consuming files from its last checkpoint position (offset). The default options are set for high performance and corruption safety.

The SQLite journaling mode enabled is Write Ahead Log or WAL. This improves the performance of read and write operations to disk. When enabled, you will see additional files being created in your file system. Consider the following configuration statement:

[INPUT]
    name    tail
    path    /var/log/containers/*.log
    db      test.db

The above configuration enables a database file called test.db and in the same path for that file SQLite will create two additional files:

test.db-shm
test.db-wal

Those two files support the WAL mechanism, which helps to improve performance and reduce the number of system calls required. The -wal file stores the new changes to be committed; at some point the WAL file transactions are moved back to the real database file. The -shm file is a shared-memory file that allows concurrent users of the WAL file.

WAL and Memory Usage

The WAL mechanism gives us higher performance but might also increase the memory usage of Fluent Bit. Most of this usage comes from memory-mapped and cached pages. In some cases you might see memory usage stay a bit high, giving the impression of a memory leak, but this is not relevant unless you want your memory metrics back to normal. Starting from Fluent Bit v1.7.3 we introduced the new option db.journal_mode, which sets the journal mode for databases. By default it is WAL (Write-Ahead Logging); the currently allowed values for db.journal_mode are DELETE | TRUNCATE | PERSIST | MEMORY | WAL | OFF.
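If WAL is not suitable, for instance because the database sits on a shared network file system, the journal mode can be changed per input. The following sketch reuses the test.db example from above and picks DELETE purely as an illustration. Which mode fits best depends on your environment.

```text
[INPUT]
    name             tail
    path             /var/log/containers/*.log
    db               test.db
    # Replace the default WAL journal with SQLite's DELETE journal mode
    db.journal_mode  DELETE
```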

File Rotation

File rotation is properly handled, including logrotate's copytruncate mode.

Note that the Path patterns cannot match the rotated files. Otherwise, the rotated file would be read again and lead to duplicate records.
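One way to respect this constraint is to keep the Path pattern narrow enough that rotated copies fall outside it. The sketch below assumes a hypothetical application that writes /var/log/app/app.log and is rotated to names such as app.log.1. The Rotate_Wait value and the database path are example values only.

```text
[INPUT]
    Name          tail
    # Matches app.log but not rotated copies such as app.log.1,
    # because those names do not end in ".log"
    Path          /var/log/app/*.log
    # Keep reading a rotated file for 10 more seconds to catch late writes
    Rotate_Wait   10
    DB            /var/lib/fluent-bit/app.db
```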

Docker

Fluent Bit container images are available on Docker Hub, ready for production usage. Currently available images can be deployed on multiple architectures.

Quick Start

Get started by simply typing the following command:

docker run -ti cr.fluentbit.io/fluent/fluent-bit

Tags and Versions

The following table describes the Linux container tags that are available in the Docker Hub fluent/fluent-bit repository:

Tag(s)

Manifest Architectures

Description

2.1.10-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.10

x86_64, arm64v8, arm32v7

Release

2.1.9-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.9

x86_64, arm64v8, arm32v7

Release

2.1.8-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.8

x86_64, arm64v8, arm32v7

Release

2.1.7-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.7

x86_64, arm64v8, arm32v7

Release

2.1.6-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.6

x86_64, arm64v8, arm32v7

Release

2.1.5

x86_64, arm64v8, arm32v7

Release

2.1.5-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.4

x86_64, arm64v8, arm32v7

Release

2.1.4-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.3

x86_64, arm64v8, arm32v7

Release

2.1.3-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.2

x86_64, arm64v8, arm32v7

Release

2.1.2-debug

x86_64, arm64v8, arm32v7

Debug images

2.1.1

x86_64, arm64v8, arm32v7

Release

2.1.1-debug

x86_64, arm64v8, arm32v7

v2.1.x releases (production + debug)

2.1.0

x86_64, arm64v8, arm32v7

Release

2.1.0-debug

x86_64, arm64v8, arm32v7

v2.1.x releases (production + debug)

2.0.11

x86_64, arm64v8, arm32v7

Release

2.0.11-debug

x86_64, arm64v8, arm32v7