1 of 100

1.4 Fluent Bit v1.4 Documentation

High Performance Logs Processor

Fluent Bit is a Fast and Lightweight Log Processor, Stream Processor and Forwarder for Linux, OSX, Windows and BSD family operating systems. It has been made with a strong focus on performance to allow the collection of events from different sources without complexity.

Features

High Performance
Data Parsing
- Convert your unstructured messages using our parsers: JSON, Regex, LTSV and Logfmt
Reliability and Data Integrity
- Backpressure Handling
- Data Buffering in memory and file system
Networking
- Security: built-in TLS/SSL support
- Asynchronous I/O
Pluggable Architecture and Extensibility: Inputs, Filters and Outputs
- More than 50 built-in plugins available
- Extensibility
  - Write any input, filter or output plugin in C language
  - Bonus: write Filters in Lua or Output plugins in Golang
Monitoring: expose internal metrics over HTTP in JSON and Prometheus format
Stream Processing: Perform data selection and transformation using simple SQL queries
- Create new streams of data using query results
- Aggregation Windows
- Data analysis and prediction: Timeseries forecasting
Portable: runs on Linux, MacOS, Windows and BSD systems

Fluent Bit, Fluentd and CNCF

Fluent Bit is a sub-component of the Fluentd project ecosystem, it's licensed under the terms of the Apache License v2.0. This project was created by Treasure Data and is its current primary sponsor.

Nowadays Fluent Bit get contributions from several companies and individuals and same as Fluentd, it's hosted as a CNCF subproject.

About

What is Fluent Bit ?

Fluent Bit is a CNCF sub-project under the umbrella of Fluentd

Fluent Bit is an open source and multi-platform log processor tool which aims to be a generic Swiss knife for logs processing and distribution.

Nowadays the number of sources of information in our environments is ever increasing. Handling data collection at scale is complex, and collecting and aggregating diverse data requires a specialized tool that can deal with:

Different sources of information
Different data formats
Data Reliability
Security
Flexible Routing
Multiple destinations

Fluent Bit has been designed with performance and low resources consumption in mind.

A Brief History of Fluent Bit

Every project has a story

On 2014, the team at forecasted the need of a lightweight log processor for constraint environments like Embedded Linux and Gateways, the project aimed to be part of the Fluentd Ecosystem and we called it , fully open source and available under the terms of the .

After the project was around for some time, it got some traction in the Embedded market but we also started getting requests for several features from the Cloud community like more inputs, filters, and outputs. Not so long after that, Fluent Bit becomes one of the preferred solutions to solve the logging challenges in Cloud environments.

Fluentd & Fluent Bit

The Production Grade Ecosystem

Logging and data processing in general can be complex, and at scale a bit more, that's why Fluentd was born. But now is more than a simple tool, it's a full ecosystem that contains SDKs for different languages and sub projects like Fluent Bit.

On this page, we will describe the relationship between the Fluentd and Fluent Bit open source projects, as a summary we can say both are:

Licensed under the terms of Apache License v2.0
Hosted projects by the Cloud Native Computing Foundation (CNCF)
Production Grade solutions: deployed thousands of times every single day, millions per month.
Community driven projects
Widely Adopted by the Industry: trusted by all major companies like AWS, Microsoft, Google Cloud and hundred of others.
Originally created by Treasure Data.

Both projects share a lot of similarities, Fluent Bit is fully designed and built on top of the best ideas of Fluentd architecture and general design. Choosing which one to use depends on the end-user needs.

The following table describes a comparison in different areas of the projects:

Both Fluentd and Fluent Bit can work as Aggregators or Forwarders, they both can complement each other or use them as standalone solutions.

License

Strong Commitment to the Openness and Collaboration

, including it core, plugins and tools are distributed under the terms of the :

Concepts

Key Concepts

There are a few key concepts that are really important to understand how Fluent Bit operates.

Before diving into Fluent Bit it’s good to get acquainted with some of the key concepts of the service. This document provides a gentle introduction to those concepts and common Fluent Bit terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our log and stream processor.

Event or Record
Filtering
Tag
Timestamp
Match
Structured Message

Event or Record

Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record.

As an example consider the following content of a Syslog file:

Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server
Jan 18 12:52:16 flb dbus-daemon[2243]: [session uid=1000 pid=2243] Successfully activated service 'org.gnome.Terminal'
Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.
Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)

It contains four lines and all of them represents four independent Events.

Internally, an Event always has two components (in an array form):

[TIMESTAMP, MESSAGE]

Filtering

In some cases is required to perform modifications on the Events content, the process to alter, enrich or drop Events is called Filtering.

There are many use cases when Filtering is required like:

Append specific information to the Event like an IP address or metadata.
Select a specific piece of the Event content.
Drop Events that matches certain pattern.

Tag

Every Event that gets into Fluent Bit gets assigned a Tag. This tag is an internal string that is used in a later stage by the Router to decide which Filter or Output phase it must go through.

Most of the tags are assigned manually in the configuration. If a tag is not specified, Fluent Bit will assign the name of the Input plugin instance from where that Event was generated from.

The only input plugin that don't assign Tags is Forward input. This plugin speaks the Fluentd wire protocol called Forward where every Event already comes with a Tag associated. Fluent Bit will always use the incoming Tag set by the client.

A Tagged record must always have a Matching rule. To learn more about Tags and Matches check the Routing section.

Timestamp

The Timestamp represents the time when an Event was created. Every Event contains a Timestamp associated. The Timestamp is a numeric fractional integer in the format:

SECONDS.NANOSECONDS

Seconds

It is the number of seconds that have elapsed since the Unix epoch.

Nanoseconds

Fractional second or one thousand-millionth of a second.

A timestamp always exists, either set by the Input plugin or discovered through a data parsing process.

Match

Fluent Bit allows to deliver your collected and processed Events to one or multiple destinations, this is done through a routing phase. A Match represent a simple rule to select Events where it Tags matches a defined rule.

To learn more about Tags and Matches check the Routing section.

Structured Messages

Source events can have or not have a structure. A structure defines a set of keys and values inside the Event message. As an example consider the following two messages:

No structured message

"Project Fluent Bit created on 1398289291"

Structured Message

{"project": "Fluent Bit", "created": 1398289291}

At a low level both are just an array of bytes, but the Structured message defines keys and values, having a structure helps to implement faster operations on data modifications.

Fluent Bit always handle every Event message as a structured message. For performance reasons, we use a binary serialization data format called MessagePack.

Consider MessagePack as a binary version of JSON on steroids.

Buffering

Performance and Data Safety

When process data, it uses the system memory (heap) as a primary and temporal place to store the record logs before they get delivered, on this private memory area the records are processed.

Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. Buffering in memory is the fastest mechanism, but there are certain scenarios where the mechanism requires special strategies to deal with , data safety or reduce memory consumption by the service in constraint environments.

Network failures or latency on third party service is pretty common, and on scenarios where we cannot deliver data fast enough as we receive new data to process, we likely will face backpressure. Our buffering strategies are designed to solve problems associated with backpressure and general delivery failures.

Fluent Bit as buffering strategies, offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can adjust to any use case safety and keep a high performance while processing your data.

Both mechanisms are not exclusive and when the data is ready to be processed or delivered it will be always in memory, while other data in the queue might be in the file system until is ready to be processed and moved up to memory.

To learn more about the buffering configuration in Fluent Bit, please jump to the section.

Data Pipeline

Input

The way to gather data from your sources

provides different Input Plugins to gather information from different sources, some of them just collect data from log files while others can gather metrics information from the operating system. There are many plugins for different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own and independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.

Parser

Convert Unstructured to Structured messages

Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally we want to set a structure to the incoming data by the Input Plugins as soon as they are collected:

The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:

The above log line is a raw string without format, ideally we would like to give it a structure that can be processed later easily. If the proper configuration is used, the log entry could be converted to:

Filter

Modify, Enrich or Drop your records

In production environments we want to have full control of the data we are collecting, filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each filter available could be used to match, exclude or enrich your logs with some specific metadata.

We support many filters, A common use case for filtering is Kubernetes deployments. Every Pod log needs to get the proper metadata associated

Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.

Buffer

Data processing with reliability

Previously defined in the concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, either using the primary in-memory model or using the filesystem based mode.

The buffer phase already contains the data in an immutable state, meaning, no other filter can be applied.

Note that buffered data is not longer a raw text, instead it's in Fluent Bit internal binary representation.

Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.

Router

Create flexible routing rules

Routing is a core feature that allows to route your data through Filters and finally to one or multiple destinations. The router relies on the concept of Tags and Matching rules

There are two important concepts in Routing:

Tag
Match

When the data is generated by the input plugins, it comes with a Tag (most of the time the Tag is configured manually), the Tag is a human-readable indicator that helps to identify the data source.

In order to define where the data should be routed, a Match rule must be specified in the output configuration.

Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   es
    Match  my_cpu

[OUTPUT]
    Name   stdout
    Match  my_mem

Note: the above is a simple example demonstrating how Routing is configured.

Routing works automatically reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match upon routing time, the data is deleted.

Routing with Wildcard

Routing is flexible enough to support wildcard in the Match pattern. The below example defines a common destination for both sources of data:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   stdout
    Match  my_*

The match rule is set to my_* which means it will match any Tag that starts with my_.

Output

Destinations for your data: databases, cloud services and more!

The output interface allows us to define destinations for the data. Common destinations are remote services, local file system or standard interface with others. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.

For more details, please refer to the Output Plugins section.

Installation

Upgrade Notes

The following article cover the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of.

For more details about changes on each release please refer to the Official Release Notes.

Fluent Bit v1.4

If you are migrating from Fluent Bit v1.3, there are not breaking changes. Just new exciting features to enjoy :)

Fluent Bit v1.3

If you are migrating from Fluent Bit v1.2 to v1.3, there are not breaking changes. If you are upgrading from an older version please review the incremental changes below.

Fluent Bit v1.2

Docker, JSON, Parsers and Decoders

On Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding, for hence when parsing Docker logs is no longer necessary to use decoders. The new Docker parser looks like this:

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On

Note: again, do not use decoders.

Kubernetes Filter

We have done improvements also on how Kubernetes Filter handle the stringified log message. If the option Merge_Log is enabled, it will try to handle the log content as a JSON map, if so, it will add the keys to the root map.

In addition, we have fixed and improved the option called Merge_Log_Key. If a merge log succeed, all new keys will be packaged under the key specified by this option, a suggested configuration is as follows:

[FILTER]
    Name             Kubernetes
    Match            kube.*
    Kube_Tag_Prefix  kube.var.log.containers.
    Merge_Log        On
    Merge_Log_Key    log_processed

As an example, if the original log content is the following map:

{"key1": "val1", "key2": "val2"}

the final record will be composed as follows:

{
    "log": "{\"key1\": \"val1\", \"key2\": \"val2\"}",
    "log_processed": {
        "key1": "val1",
        "key2": "val2"
    }
}

Fluent Bit v1.1

If you are upgrading from Fluent Bit <= 1.0.x you should take in consideration the following relevant changes when switching to Fluent Bit v1.1 series:

Kubernetes Filter

We introduced a new configuration property called Kube_Tag_Prefix to help Tag prefix resolution and address an unexpected behavior that landed in previous versions.

During 1.0.x release cycle, a commit in Tail input plugin changed the default behavior on how the Tag was composed when using the wildcard for expansion generating breaking compatibility with other services. Consider the following configuration example:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

The expected behavior is that Tag will be expanded to:

kube.var.log.containers.apache.log

but the change introduced in 1.0 series switched from absolute path to the base file name only:

kube.apache.log

On Fluent Bit v1.1 release we restored to our default behavior and now the Tag is composed using the absolute path of the monitored file.

Having absolute path in the Tag is relevant for routing and flexible configuration where it also helps to keep compatibility with Fluentd behavior.

This behavior switch in Tail input plugin affects how Filter Kubernetes operates. As you know when the filter is used it needs to perform local metadata lookup that comes from the file names when using Tail as a source. Now with the new Kube_Tag_Prefix option you can specify what's the prefix used in Tail input plugin, for the configuration example above the new configuration will look as follows:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

[FILTER]
    Name             kubernetes
    Match            *
    Kube_Tag_Prefix  kube.var.log.containers.

So the proper for Kube_Tag_Prefix value must be composed by Tag prefix set in Tail input plugin plus the converted monitored directory replacing slashes with dots.

Supported Platforms

The following operating systems and architectures are supported in Fluent Bit.

From an architecture support perspective, Fluent Bit is fully functional on x86_64, Arm64v8 and Arm32v7 based processors.

Fluent Bit can work also on OSX and *BSD systems, but not all plugins will be available on all platforms. Official support will be expanding based on community demand.

Requirements

uses very low CPU and Memory consumption, it's compatible with most of x86, x86_64, arm32v7 and arm64v8 based platforms. In order to build it you need the following components in your system for the build process:

Compiler: GCC or clang
CMake
Flex & Bison: only if you enable the Stream Processor or Record Accessor feature (both enabled by default)

In the core there are not other dependencies, For certain features that depends on third party components like output plugins with special backend libraries (e.g: kafka), those are included in the main source code repository.

Sources

Download Source Code

Stable

For production systems, we strongly suggest that you always get the latest stable release from our web site, you can get the official tarballs (.tar.gz) from the following link:

https://fluentbit.io/download/

Development

For people who aims to contribute to the project testing or extending the code base, can get the development version from our GIT repository:

$ git clone https://github.com/fluent/fluent-bit

Note that our master branch is where the development of Fluent Bit happens. Since it's a development version, expect issues when compiling or at run time.

We encourage everybody to help us testing every development version, at the end this is what will become stable.

Build and Install

Fluent Bit uses CMake as it build system. The suggested procedure to prepare the build system consists on the following steps:

Prepare environment

In the following steps you can find exact commands to build and install the project with the default options. If you already know how CMake works you can skip this part and look at the build options available.

Change to the build/ directory inside the Fluent Bit sources:

$ cd build/

Let CMake configure the project specifying where the root path is located:

$ cmake ../
-- The C compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- The CXX compiler identification is GNU 4.9.2
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
...
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE)
-- Looking for accept4
-- Looking for accept4 - not found
-- Configuring done
-- Generating done
-- Build files have been written to: /home/edsiper/coding/fluent-bit/build

Now you are ready to start the compilation process through the simple make command:

$ make
Scanning dependencies of target msgpack
[  2%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/unpack.c.o
[  4%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/objectc.c.o
[  7%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/version.c.o
...
[ 19%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_file.c.o
[ 21%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_rconf.c.o
[ 23%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_string.c.o
...
Scanning dependencies of target fluent-bit-static
[ 66%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_pack.c.o
[ 69%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_input.c.o
[ 71%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_output.c.o
...
Linking C executable ../bin/fluent-bit
[100%] Built target fluent-bit-bin

to continue installing the binary on the system just do:

$ make install

it's likely you may need root privileges so you can try to prefixing the command with sudo.

Build Options

Fluent Bit provides certain options to CMake that can be enabled or disabled when configuring, please refer to the following tables under the General Options, Development Options, Input Plugins and _Output Plugins sections.

General Options

Development Options

Input Plugins

The input plugins provides certain features to gather information from a specific source type which can be a network interface, some built-in metric or through a specific input device, the following input plugins are available:

Filter Plugins

The filter plugins allows to modify, enrich or drop records. The following table describes the filters available on this version:

Output Plugins

The output plugins gives the capacity to flush the information to some external interface, service or terminal, the following table describes the output plugins available as of this version:

Build with Static Configuration

Fluent Bit in normal operation mode allows to be configurable through text files or using specific arguments in the command line, while this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.

Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.

Getting Started

Requirements

The following steps assumes you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the Build and Install section.

Configuration Directory

In your file system prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. It is mandatory that this directory contain as a minimum one configuration file called fluent-bit.conf containing the required SERVICE, INPUT and OUTPUT sections. As an example create a new fluent-bit.conf file with the following content:

[SERVICE]
    Flush     1
    Daemon    off
    Log_Level info

[INPUT]
    Name      cpu

[OUTPUT]
    Name      stdout
    Match     *

the configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.

Build with Custom Configuration

Inside Fluent Bit source code, get into the build/ directory and run CMake appending the FLB_STATIC_CONF option pointing the configuration directory recently created, e.g:

$ cd fluent-bit/build/
$ cmake -DFLB_STATIC_CONF=/path/to/my/confdir/

then build it:

$ make

At this point the fluent-bit binary generated is ready to run without necessity of further configuration:

$ bin/fluent-bit 
Fluent-Bit v0.15.0
Copyright (C) Treasure Data

[2018/10/19 15:32:31] [ info] [engine] started (pid=15186)
[0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

Linux Packages

Amazon Linux

Install on Amazon Linux 2

Fluent Bit is distributed as td-agent-bit package and is available for the latest Amazon Linux 2. The following architectures are supported

x86_64
aarch64 / arm64v8

Configure Yum

We provide td-agent-bit through a Yum repository. In order to add the repository reference to your system, please add a new file called td-agent-bit.repo in /etc/yum.repos.d/ with the following content:

note: we encourage you always enable the gpgcheck for security reasons. All our packages are signed.

The GPG Key fingerprint is F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Install

Once your repository is configured, run the following command to install it:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/messages file.

Redhat / CentOS

Install on Redhat / CentOS

Fluent Bit is distributed as td-agent-bit package and is available for the latest stable CentOS system. The following architectures are supported

x86_64
aarch64 / arm64v8

Configure Yum

note: we encourage you always enable the gpgcheck for security reasons. All our packages are signed.

The GPG Key fingerprint is F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Install

Once your repository is configured, run the following command to install it:

Now the following step is to instruct Systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/messages file.

Debian

Fluent Bit is distributed as td-agent-bit package and is available for the latest (and old) stable Debian systems: Buster, Stretch and Jessie.

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

Update your sources lists

On Debian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Debian 10 (Buster)

Debian 9 (Stretch)

Debian 8 (Jessie)

Update your repositories database

Now let your system update the apt database:

Install TD Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Ubuntu

Fluent Bit is distributed as td-agent-bit package and is available for the latest stable Ubuntu system: Focal Fossa.

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Ubuntu, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Ubuntu 20.04 LTS (Focal Fossa)

deb https://packages.fluentbit.io/ubuntu/focal focal main

Ubuntu 18.04 LTS (Bionic Beaver)

deb https://packages.fluentbit.io/ubuntu/bionic bionic main

Ubuntu 16.04 LTS (Xenial Xerus)

deb https://packages.fluentbit.io/ubuntu/xenial xenial main

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD-Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

Now the following step is to instruct systemd to enable the service:

$ sudo service td-agent-bit start

If you do a status check, you should see a similar output like this:

sudo service td-agent-bit status
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (td-agent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/td-agent-bit.service
           └─6739 /opt/td-agent-bit/bin/td-agent-bit -c /etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Raspbian / Raspberry Pi

Fluent Bit is distributed as td-agent-bit package and is available for the Raspberry, specifically for Raspbian distribution, the following versions are supported:

Raspbian Buster (10)
Raspbian Stretch (9)
Raspbian Jessie (8)

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Debian and derivated systems such as Raspbian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Raspbian 10 (Buster)

deb https://packages.fluentbit.io/raspbian/buster buster main

Raspbian 9 (Stretch)

deb https://packages.fluentbit.io/raspbian/stretch stretch main

Raspbian 8 (Jessie)

deb https://packages.fluentbit.io/raspbian/jessie jessie main

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD-Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

Now the following step is to instruct systemd to enable the service:

$ sudo service td-agent-bit start

If you do a status check, you should see a similar output like this:

sudo service td-agent-bit status
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (td-agent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/td-agent-bit.service
           └─6739 /opt/td-agent-bit/bin/td-agent-bit -c /etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Docker

Fluent Bit container images are available on Docker Hub ready for production usage. Current available images can be deployed in multiple architectures.

Tags and Versions

The following table describe the tags are available on Docker Hub repository:

It's strongly suggested that you always use the latest image of Fluent Bit.

Multi Architecture Images

In addition, the main manifest provides images for arm64v8 and arm32v7 architectures. From a deployment perspective there is no need to specify an architecture, the container client tool that pulls the image gets the proper layer for the running architecture.

For every architecture we build the layers using the following base images:

Getting Started

Download the last stable image from 1.4 series:

Once the image is in place, now run the following (useless) test which makes Fluent Bit measure CPU usage by the container:

That command will let Fluent Bit measure CPU usage every second and flush the results to the standard output, e.g:

F.A.Q

Why there is no Fluent Bit Docker image based on Alpine Linux ?

Alpine Linux uses Musl C library instead of Glibc. Musl is not fully compatible with Glibc which generated many issues in the following areas when used with Fluent Bit:

Memory Allocator: to run Fluent Bit properly in high-load environments, we use Jemalloc as a default memory allocator which reduce fragmentation and provides better performance for our needs. Jemalloc cannot run smoothly with Musl and requires extra work.
Alpine Linux Musl functions bootstrap have a compatibility issue when loading Golang shared libraries, this generate problems when trying to load Golang output plugins in Fluent Bit.
Alpine Linux Musl Time format parser does not support Glibc extensions
Maintainers preference in terms of base image due to security and maintenance reasons are Distroless and Debian.

Where 'latest' Tag points to ?

Our Docker containers images are deployed thousands of times per day, we take security and stability very seriously.

The latest tag most of the time points to the latest stable image. When we release a major update to Fluent Bit like for example from v1.3.x to v1.4.0, we don't move latest tag until 2 weeks after the release. That give us extra time to verify with our community that everything works as expected.

Amazon

Containers on AWS

AWS maintains a distribution of Fluent Bit combining the latest official release with a set of Go Plugins for sending logs to AWS services. AWS and Fluent Bit are working together to rewrite their plugins for inclusion in the official Fluent Bit distribution.

Plugins

Currently, the AWS for Fluent Bit image contains Go Plugins for:

Versions and Regional Repositories

AWS vends their container image via Docker Hub, and a set of highly available regional Amazon ECR repositories. For more information, see the AWS for Fluent Bit GitHub repo.

The AWS for Fluent Bit image uses a custom versioning scheme because it contains multiple projects. To see what each release contains, check out the release notes on GitHub.

SSM Public Parameters

AWS vends SSM Public Parameters with the regional repository link for each image. These parameters can be queried by any AWS account.

To see a list of available version tags in a given region, run the following command:

aws ssm get-parameters-by-path --region eu-central-1 --path /aws/service/aws-for-fluent-bit/ --query 'Parameters[*].Name'

To see the ECR repository URI for a given image tag in a given region, run the following:

$ aws ssm get-parameter --region ap-northeast-1 --name /aws/service/aws-for-fluent-bit/2.0.0

You can use these SSM public parameters as parameters in your CloudFormation templates:

Parameters:
  FireLensImage:
    Description: Fluent Bit image for the FireLens Container
    Type: AWS::SSM::Parameter::Value<String>
    Default: /aws/service/aws-for-fluent-bit/latest

Amazon EC2

Learn how to .

Kubernetes

Kubernetes Production Grade Log Processor

is a lightweight and extensible Log Processor that comes with full support for Kubernetes:

Process Kubernetes containers logs from the file system or Systemd/Journald.
Enrich logs with Kubernetes Metadata.
Centralize your logs in third party storage services like Elasticsearch, InfluxDB, HTTP, etc.

Concepts

Before getting started it is important to understand how Fluent Bit will be deployed. Kubernetes manages a cluster of nodes, so our log agent tool will need to run on every node to collect logs from every POD, hence Fluent Bit is deployed as a DaemonSet (a POD that runs on every node of the cluster).

When Fluent Bit runs, it will read, parse and filter the logs of every POD and will enrich each entry with the following information (metadata):

Pod Name
Pod ID
Container Name
Container ID
Labels
Annotations

To obtain these information, a built-in filter plugin called kubernetes talks to the Kubernetes API Server to retrieve relevant information such as the pod_id, labels and annotations, other fields such as pod_name, container_id and container_name are retrieved locally from the log file names. All of this is handled automatically, no intervention is required from a configuration aspect.

Installation

The next step is to create a ConfigMap that will be used by our Fluent Bit DaemonSet:

Note for Kubernetes < v1.16

For Kubernetes versions olden than v1.16, the DaemonSet resource is not available on apps/v1 , the resource is available on apiVersion: extensions/v1beta1 . Our current Daemonset Yaml files uses the new apiVersion.

If you are using and older Kubernetes, grab manually a copy of your Daemonset Yaml file and replace the value of apiVersion from:

You can read more about this deprecation on Kubernetes v1.14 Changelog here:

Fluent Bit to Elasticsearch

Fluent Bit DaemonSet ready to be used with Elasticsearch on a normal Kubernetes Cluster:

Fluent Bit to Elasticsearch on Minikube

If you are using Minikube for testing purposes, use the following alternative DaemonSet manifest:

Details

The default configuration of Fluent Bit makes sure of the following:

Consume all containers logs from the running Node.
The Kubernetes filter will enrich the logs with Kubernetes metadata, specifically labels and annotations. The filter only goes to the API Server when it cannot find the cached info, otherwise it uses the cache.
There is an option called Retry_Limit set to False, that means if Fluent Bit cannot flush the records to Elasticsearch it will re-try indefinitely until it succeed.

Yocto / Embedded Linux

source code provides Bitbake recipes to configure, build and package the software for a Yocto based image. Note that specific steps of usage of these recipes in your Yocto environment (Poky) is out of the scope of this documentation.

We distribute two main recipes, one for testing/dev purposes and other with the latest stable release.

It's strongly recommended to always use the stable release of Fluent Bit recipe and not the one from GIT master for production deployments.

Fluent Bit and other architectures

Fluent Bit >= v1.1.x fully supports x86_64, x86, arm32v7 and arm64v8.

Windows

Fluent Bit is distributed as td-agent-bit package for Windows. Fluent Bit has two flavours of Windows installers: a ZIP archive (for quick testing) and an EXE installer (for system installation).

Installation Packages

The latest stable version is 1.4.6.

To check the integrity, use Get-FileHash commandlet on PowerShell.

Installing from ZIP archive

Download a ZIP archive . There are installers for 32-bit and 64-bit environments, so choose one suitable for your environment.

Then you need to expand the ZIP archive. You can do this by clicking "Extract All" on Explorer, or if you're using PowerShell, you can use Expand-Archive commandlet.

The ZIP package contains the following set of files.

Now, launch cmd.exe or PowerShell on your machine, and execute fluent-bit.exe as follows.

If you see the following output, it's working fine!

To halt the process, press CTRL-C in the terminal.

Installing from EXE installer

Then, double-click the EXE installer you've downloaded. Installation wizard will automatically start.

Click Next and proceed. By default, Fluent Bit is installed into C:\Program Files\td-agent-bit\, so you should be able to launch fluent-bit as follow after installation.

Administration

Configuring Fluent Bit

Format and Schema

Fluent Bit might optionally use a configuration file to define how the service will behave, and before proceeding we need to understand how the configuration schema works.

The schema is defined by three concepts:

Sections
Entries: Key/Value
Indented Configuration Mode

A simple example of a configuration file is as follows:

Sections

A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using [SERVICE] definition. Section rules:

All section content must be indented (4 spaces ideally).
Multiple sections can exist on the same file.
A section is expected to have comments and entries, it cannot be empty.
Any commented line under a section, must be indented too.

Entries: Key/Value

A section may contain Entries, an entry is defined by a line of text that contains a Key and a Value, using the above example, the [SERVICE] section contains two entries, one is the key Daemon with value off and the other is the key Log_Level with the value debug. Entries rules:

An entry is defined by a key and a value.
A key must be indented.
A key must contain a value which ends in the breakline.
Multiple keys with the same name can exist.

Also commented lines are set prefixing the # character, those lines are not processed but they must be indented too.

Indented Configuration Mode

Fluent Bit configuration files are based in a strict Indented Mode, that means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example:

As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented.

Data Pipeline

Inputs

Parsers

Filters

Outputs

Rewrite Tag

Powerful and flexible routing

Tags are what makes routing possible. Tags are set in the configuration of the Input definitions where the records are generated, but there are certain scenarios where might be useful to modify the Tag in the pipeline so we can perform more advanced and flexible routing.

The rewrite_tag filter, allows to re-emit a record under a new Tag. Once a record has been re-emitted, the original record can be preserved or discarded.

How it Works

The way it works is defining rules that matches specific record key content against a regular expression, if a match exists, a new record with the defined Tag will be emitted. Multiple rules can be specified and they are processed in order until one of them matches.

The new Tag to define can be composed by:

Alphabet characters & Numbers
Original Tag string or part of it
Regular Expressions groups capture
Any key or sub-key of the processed record
Environment variables

Configuration Parameters

The rewrite_tag filter supports the following configuration parameters:

Rules

A rule aims to define matching criteria and specify how to create a new Tag for a record. You can define one or multiple rules in the same configuration section. The rules have the following format:

$KEY  REGEX  NEW_TAG  KEEP

Key

The key represents the name of the record key that holds the value that we want to use to match our regular expression. A key name is specified and prefixed with a $. Consider the following structured record (formatted for readability):

{
  "name": "abc-123",
  "ss": {
    "s1": {
      "s2": "flb"
    }
  }
}

If we wanted to match against the value of the key name we must use $name. The key selector is flexible enough to allow to match nested levels of sub-maps from the structure. If we wanted to check the value of the nested key s2 we can do it specifying $ss['s1']['s2'], for short:

$name = "abc-123"
$ss['s1']['s2'] = "flb"

Note that a key must point a value that contains a string, it's not valid for numbers, booleans, maps or arrays.

Regex

Using a simple regular expression we can specify a matching pattern to use against the value of the key specified above, also we can take advantage of group capturing to create custom placeholder values.

If we wanted to match any record that it $name contains a value of the format string-number like the example provided above, we might use:

^([a-z]+)-([0-9]+)$

Note that in our example we are using parentheses, this teams that we are specifying groups of data. If the pattern matches the value a placeholder will be created that can be consumed by the NEW_TAG section.

If $name equals abc-123 , then the following placeholders will be created:

$0 = "abc-123"
$1 = "abc"
$2 = "123"

If the Regular expression do not matches an incoming record, the rule will be skipped and the next rule (if any) will be processed.

New Tag

If a regular expression has matched the value of the defined key in the rule, we are ready to compose a new Tag for that specific record. The tag is a concatenated string that can contain any of the following characters: a-z,A-Z, 0-9 and .-,.

A Tag can take any string value from the matching record, the original tag it self, environment variable or general placeholder.

Consider the following incoming data on the rule:

Tag = aa.bb.cc
Record = {"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}
Environment variable $HOSTNAME = fluent

With such information we could create a very custom Tag for our record like the following:

newtag.$TAG.$TAG[1].$1.$ss['s1']['s2'].out.${HOSTNAME}

the expected Tag to generated will be:

newtag.aa.bb.cc.bb.abc.flb.out.fluent

We make use of placeholders, record content and environment variables.

Keep

If a rule matches the criteria the filter will emit a copy of the record with the new defined Tag. The property keep takes a boolean value to define if the original record with the old Tag must be preserved and continue in the pipeline or just be discarded.

You can use true or false to decide the expected behavior. There is no default value and this is a mandatory field in the rule.

Configuration Example

The following configuration example will emit a dummy (hand-crafted) record, the filter will rewrite the tag, discard the old record and print the new record to the standard output interface:

[SERVICE]
    Flush     1
    Log_Level info

[INPUT]
    NAME   dummy
    Dummy  {"tool": "fluent", "sub": {"s1": {"s2": "bit"}}}
    Tag    test_tag

[FILTER]
    Name          rewrite_tag
    Match         test_tag
    Rule          $tool ^(fluent)$  from.$TAG.new.$tool.$sub['s1']['s2'].out false
    Emitter_Name  re_emitted

[OUTPUT]
    Name   stdout
    Match  from.*

The original tag test_tag will be rewritten as from.test_tag.new.fluent.bit.out:

$ bin/fluent-bit -c example.conf
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

...
[0] from.test_tag.new.fluent.bit.out: [1580436933.000050569, {"tool"=>"fluent", "sub"=>{"s1"=>{"s2"=>"bit"}}}]

Monitoring

As described in the Monitoring section, every component of the pipeline of Fluent Bit exposes metrics. The basic metrics exposed by this filter are drop_records and add_records, they summarize the total of dropped records from the incoming data chunk or the new records added.

Since rewrite_tag emit new records that goes through the beginning of the pipeline, it exposes an additional metric called emit_records that summarize the total number of emitted records.

Understanding the Metrics

Using the configuration provided above, if we query the metrics exposed in the HTTP interface we will see the following:

Command:

$ curl  http://127.0.0.1:2020/api/v1/metrics/ | jq

Metrics output:

{
  "input": {
    "dummy.0": {
      "records": 2,
      "bytes": 80
    },
    "emitter_for_rewrite_tag.0": {
      "records": 1,
      "bytes": 40
    }
  },
  "filter": {
    "rewrite_tag.0": {
      "drop_records": 2,
      "add_records": 0,
      "emit_records": 2
    }
  },
  "output": {
    "stdout.0": {
      "proc_records": 1,
      "proc_bytes": 40,
      "errors": 0,
      "retries": 0,
      "retries_failed": 0
    }
  }
}

The dummy input generated two records, the filter dropped two from the chunks and emitted two new ones under a different Tag.

The records generated are handled by the internal Emitter, so the new records are summarized in the Emitter metrics, take a look at the entry called emitter_for_rewrite_tag.0.

What is the Emitter ?

The Emitter is an internal Fluent Bit plugin that allows other components of the pipeline to emit custom records. On this case rewrite_tag creates an Emitter instance to use it exclusively to emit records, on that way we can have a granular control of who is emitting what.

The Emitter name in the metrics can be changed setting up the Emitter_Name configuration property described above.