1 of 100

2.0 Fluent Bit v2.0 Documentation

High Performance Log and Metrics Processor

Fluent Bit is a Fast and Lightweight Telemetry Agent for Logs, Metrics, and Traces for Linux, macOS, Windows, and BSD family operating systems. It has been made with a strong focus on performance to allow the collection and processing of telemetry data from different sources without complexity.

Features

High Performance: High throughput with low resources consumption
Data Parsing
- Convert your unstructured messages using our parsers: JSON, Regex, LTSV and Logfmt
Metrics Support: Prometheus and OpenTelemetry compatible
Reliability and Data Integrity
- Backpressure Handling
- Data Buffering in memory and file system
Networking
- Security: built-in TLS/SSL support
- Asynchronous I/O
Pluggable Architecture and Extensibility: Inputs, Filters and Outputs
- More than 100 built-in plugins are available
- Extensibility
  - Write any input, filter or output plugin in C language
  - WASM: WASM Filter Plugins or WASM Input Plugins
  - Bonus: write Filters in Lua or Output plugins in Golang
Monitoring: expose internal metrics over HTTP in JSON and Prometheus format
Stream Processing: Perform data selection and transformation using simple SQL queries
- Create new streams of data using query results
- Aggregation Windows
- Data analysis and prediction: Timeseries forecasting
Portable: runs on Linux, macOS, Windows and BSD systems

Fluent Bit, Fluentd and CNCF

Fluent Bit is a CNCF graduated sub-project under the umbrella of Fluentd, it's licensed under the terms of the Apache License v2.0.

Fluent Bit was originally created by Eduardo Silva; as a CNCF-hosted project is a fully vendor-neutral and community-driven project.

About

What is Fluent Bit?

Fluent Bit is a CNCF sub-project under the umbrella of Fluentd

Fluent Bit is an open-source telemetry agent specifically designed to efficiently handle the challenges of collecting and processing telemetry data across a wide range of environments, from constrained systems to complex cloud infrastructures. Managing telemetry data from various sources and formats can be a constant challenge, particularly when performance is a critical factor.

Rather than serving as a drop-in replacement, Fluent Bit enhances the observability strategy for your infrastructure by adapting and optimizing your existing logging layer, as well as metrics and traces processing. Furthermore, Fluent Bit supports a vendor-neutral approach, seamlessly integrating with other ecosystems such as Prometheus and OpenTelemetry. Trusted by major cloud providers, banks, and companies in need of a ready-to-use telemetry agent solution, Fluent Bit effectively manages diverse data sources and formats while maintaining optimal performance.

Fluent Bit can be deployed as an edge agent for localized telemetry data handling or utilized as a central aggregator/collector for managing telemetry data across multiple sources and environments.

Fluent Bit has been designed with performance and low resource consumption in mind.

A Brief History of Fluent Bit

Every project has a story

On 2014, the Fluentd team at Treasure Data was forecasting the need for a lightweight log processor for constraint environments like Embedded Linux and Gateways, the project aimed to be part of the Fluentd Ecosystem; at that moment, Eduardo created Fluent Bit, a new open source solution written from scratch available under the terms of the Apache License v2.0.\

After the project was around for some time, it got more traction for normal Linux systems, also with the new containerized world, the Cloud Native community asked to extend the project scope to support more sources, filters, and destinations. Not so long after, Fluent Bit became one of the preferred solutions to solve the logging challenges in Cloud environments.

Fluentd & Fluent Bit

The Production Grade Telemetry Ecosystem

Telemetry data processing in general can be complex, and at scale a bit more, that's why Fluentd was born. Fluentd has become more than a simple tool, it has grown into a fullscale ecosystem that contains SDKs for different languages and sub-projects like Fluent Bit.

On this page, we will describe the relationship between the Fluentd and Fluent Bit open source projects, as a summary we can say both are:

Licensed under the terms of Apache License v2.0
Graduated Hosted projects by the Cloud Native Computing Foundation (CNCF)
Production Grade solutions: deployed million of times every single day.
Vendor neutral and community driven projects
Widely Adopted by the Industry: trusted by all major companies like AWS, Microsoft, Google Cloud and hundreds of others.

Both projects share a lot of similarities, Fluent Bit is fully designed and built on top of the best ideas of Fluentd architecture and general design. Choosing which one to use depends on the end-user needs.

The following table describes a comparison of different areas of the projects:

Fluentd

Fluent Bit

Both Fluentd and Fluent Bit can work as Aggregators or Forwarders, they both can complement each other or use them as standalone solutions. In the recent years, Cloud Providers switched from Fluentd to Fluent Bit for performance and compatibility reasons. Fluent Bit is now considered the next generation solution.

License

Strong Commitment to the Openness and Collaboration

, including it core, plugins and tools are distributed under the terms of the :

Concepts

Key Concepts

There are a few key concepts that are really important to understand how Fluent Bit operates.

Before diving into it’s good to get acquainted with some of the key concepts of the service. This document provides a gentle introduction to those concepts and common terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our log and stream processor.

Event or Record
Filtering
Tag
Timestamp
Match
Structured Message

Event or Record

Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record.

As an example consider the following content of a Syslog file:

It contains four lines and all of them represents four independent Events.

Internally, an Event always has two components (in an array form):

Filtering

In some cases it is required to perform modifications on the Events content, the process to alter, enrich or drop Events is called Filtering.

There are many use cases when Filtering is required like:

Append specific information to the Event like an IP address or metadata.
Select a specific piece of the Event content.
Drop Events that matches certain pattern.

Tag

Every Event that gets into Fluent Bit gets assigned a Tag. This tag is an internal string that is used in a later stage by the Router to decide which Filter or Output phase it must go through.

Most of the tags are assigned manually in the configuration. If a tag is not specified, Fluent Bit will assign the name of the Input plugin instance from where that Event was generated from.

Timestamp

The Timestamp represents the time when an Event was created. Every Event contains a Timestamp associated. The Timestamp is a numeric fractional integer in the format:

Seconds

It is the number of seconds that have elapsed since the Unix epoch.

Nanoseconds

Fractional second or one thousand-millionth of a second.

A timestamp always exists, either set by the Input plugin or discovered through a data parsing process.

Match

Fluent Bit allows to deliver your collected and processed Events to one or multiple destinations, this is done through a routing phase. A Match represent a simple rule to select Events where it Tags matches a defined rule.

Structured Messages

Source events can have or not have a structure. A structure defines a set of keys and values inside the Event message. As an example consider the following two messages:

No structured message

Structured Message

At a low level both are just an array of bytes, but the Structured message defines keys and values, having a structure helps to implement faster operations on data modifications.

Buffering

Performance and Data Safety

When processes data, it uses the system memory (heap) as a primary and temporary place to store the record logs before they get delivered, in this private memory area the records are processed.

Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. Buffering in memory is the fastest mechanism, but there are certain scenarios where it requires special strategies to deal with , data safety or reduce memory consumption by the service in constrained environments.

Network failures or latency on third party service is pretty common, and on scenarios where we cannot deliver data fast enough as we receive new data to process, we likely will face backpressure.

Our buffering strategies are designed to solve problems associated with backpressure and general delivery failures.

Fluent Bit as buffering strategies go, offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can accommodate any use case safely and keep a high performance while processing your data.

Both mechanisms are not mutually exclusive and when the data is ready to be processed or delivered it will always be in memory, while other data in the queue might be in the file system until is ready to be processed and moved up to memory.

To learn more about the buffering configuration in Fluent Bit, please jump to the section.

Data Pipeline

Input

The way to gather data from your sources

Fluent Bit provides different Input Plugins to gather information from different sources, some of them just collect data from log files while others can gather metrics information from the operating system. There are many plugins for different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own and independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.

For more details, please refer to the Input Plugins section.

Parser

Convert Unstructured to Structured messages

Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally we want to set a structure to the incoming data by the Input Plugins as soon as they are collected:

The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:

192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above log line is a raw string without format, ideally we would like to give it a structure that can be processed later easily. If the proper configuration is used, the log entry could be converted to:

{
  "host":    "192.168.2.20",
  "user":    "-",
  "method":  "GET",
  "path":    "/cgi-bin/try/",
  "code":    "200",
  "size":    "3395",
  "referer": "",
  "agent":   ""
 }

Parsers are fully configurable and are independently and optionally handled by each input plugin, for more details please refer to the Parsers section.

Filter

Modify, Enrich or Drop your records

In production environments we want to have full control of the data we are collecting, filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each filter available could be used to match, exclude or enrich your logs with some specific metadata.

We support many filters, A common use case for filtering is Kubernetes deployments. Every Pod log needs to get the proper metadata associated

Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.

For more details about the Filters available and their usage, please refer to the Filters section.

Buffer

Data processing with reliability

Previously defined in the Buffering concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, either using the primary in-memory model or using the filesystem based mode.

The buffer phase already contains the data in an immutable state, meaning, no other filter can be applied.

Note that buffered data is not raw text, it's in Fluent Bit's internal binary representation.

Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.

Router

Create flexible routing rules

Routing is a core feature that allows to route your data through Filters and finally to one or multiple destinations. The router relies on the concept of Tags and Matching rules

There are two important concepts in Routing:

Tag
Match

When the data is generated by the input plugins, it comes with a Tag (most of the time the Tag is configured manually), the Tag is a human-readable indicator that helps to identify the data source.

In order to define where the data should be routed, a Match rule must be specified in the output configuration.

Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   es
    Match  my_cpu

[OUTPUT]
    Name   stdout
    Match  my_mem

Note: the above is a simple example demonstrating how Routing is configured.

Routing works automatically reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match upon routing time, the data is deleted.

Routing with Wildcard

Routing is flexible enough to support wildcard in the Match pattern. The below example defines a common destination for both sources of data:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   stdout
    Match  my_*

The match rule is set to my_* which means it will match any Tag that starts with my_.

Output

Destinations for your data: databases, cloud services and more!

The output interface allows us to define destinations for the data. Common destinations are remote services, local file system or standard interface with others. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.

Installation

Getting Started with Fluent Bit

The following serves as a guide on how to install/deploy/upgrade Fluent Bit

Container Deployment

Deployment Type

Instructions

Install on Linux (Packages)

Operating System

Installation Instructions

Install on Windows (Packages)

Operating System

Installation Instructions

Install on macOS (Packages)

Compile from Source (Linux, Windows, FreeBSD, macOS)

Sandbox Environment

If you are interested in learning about Fluent Bit you can try out the sandbox environment

Enterprise Packages

Upgrade Notes

The following article cover the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of.

For more details about changes on each release please refer to the .

Note: release notes will be prepared in advance of a Git tag for a release so an official release should provide both a tag and a release note together to allow users to verify and understand the release contents.

The tag drives the overall binary release process so release binaries (containers/packages) will appear after a tag and its associated release note. This allows users to expect the new release binary to appear and allow/deny/update it as appropriate in their infrastructure.

Fluent Bit v1.9.9

The td-agent-bit package is no longer provided after this release. Users should switch to the fluent-bit package.

Fluent Bit v1.6

If you are migrating from previous version of Fluent Bit please review the following important changes:

Tail Input Plugin

Now by default the plugin follows a file from the end once the service starts (old behavior was always read from the beginning). For every file found at start, its followed from it last position, for new files discovered at runtime or rotated, they are read from the beginning.

If you desire to keep the old behavior you can set the option read_from_head to true.

Stackdriver Output Plugin

The project_id of in sent to Google Cloud Logging would be set to the project ID rather than the project number. To learn the difference between Project ID and project number, see for more details.

If you have any existing queries based on the resource's project_id, please update your query accordingly.

Fluent Bit v1.5

The migration from v1.4 to v1.5 is pretty straightforward.

Fluent Bit v1.4

If you are migrating from Fluent Bit v1.3, there are no breaking changes. Just new exciting features to enjoy :)

Fluent Bit v1.3

If you are migrating from Fluent Bit v1.2 to v1.3, there are no breaking changes. If you are upgrading from an older version please review the incremental changes below.

Fluent Bit v1.2

Docker, JSON, Parsers and Decoders

On Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding, for hence when parsing Docker logs is no longer necessary to use decoders. The new Docker parser looks like this:

Note: again, do not use decoders.

Kubernetes Filter

We have done improvements also on how Kubernetes Filter handle the stringified log message. If the option Merge_Log is enabled, it will try to handle the log content as a JSON map, if so, it will add the keys to the root map.

In addition, we have fixed and improved the option called Merge_Log_Key. If a merge log succeed, all new keys will be packaged under the key specified by this option, a suggested configuration is as follows:

As an example, if the original log content is the following map:

the final record will be composed as follows:

Fluent Bit v1.1

If you are upgrading from Fluent Bit <= 1.0.x you should take in consideration the following relevant changes when switching to Fluent Bit v1.1 series:

Kubernetes Filter

We introduced a new configuration property called Kube_Tag_Prefix to help Tag prefix resolution and address an unexpected behavior that landed in previous versions.

During 1.0.x release cycle, a commit in Tail input plugin changed the default behavior on how the Tag was composed when using the wildcard for expansion generating breaking compatibility with other services. Consider the following configuration example:

The expected behavior is that Tag will be expanded to:

but the change introduced in 1.0 series switched from absolute path to the base file name only:

On Fluent Bit v1.1 release we restored to our default behavior and now the Tag is composed using the absolute path of the monitored file.

Having absolute path in the Tag is relevant for routing and flexible configuration where it also helps to keep compatibility with Fluentd behavior.

This behavior switch in Tail input plugin affects how Filter Kubernetes operates. As you know when the filter is used it needs to perform local metadata lookup that comes from the file names when using Tail as a source. Now with the new Kube_Tag_Prefix option you can specify what's the prefix used in Tail input plugin, for the configuration example above the new configuration will look as follows:

So the proper for Kube_Tag_Prefix value must be composed by Tag prefix set in Tail input plugin plus the converted monitored directory replacing slashes with dots.

Supported Platforms

The following operating systems and architectures are supported in Fluent Bit.

Operating System

Distribution

Architectures

From an architecture support perspective, Fluent Bit is fully functional on x86_64, Arm64v8 and Arm32v7 based processors.

Requirements

uses very low CPU and Memory consumption, it's compatible with most of x86, x86_64, arm32v7 and arm64v8 based platforms. In order to build it you need the following components in your system for the build process:

Compiler: GCC or clang
CMake
Flex & Bison: only if you enable the Stream Processor or Record Accessor feature (both enabled by default)
Libyaml development headers and libraries

In the core there are not other dependencies, For certain features that depends on third party components like output plugins with special backend libraries (e.g: kafka), those are included in the main source code repository.

Sources

Download Source Code

Stable

For production systems, we strongly suggest that you always get the latest stable release of the source code in either zip or tarball format from Github using the following link pattern:

https://github.com/fluent/fluent-bit/archive/refs/tags/v<release version>.tar.gz https://github.com/fluent/fluent-bit/archive/refs/tags/v<release version>.zip

For example for version 1.8.12 the link is the following:

Development

For anyone who aims to contribute to the project by testing or extending the code base, you can get the development version from our GIT repository:

Note that our master branch is where the development of Fluent Bit happens. Since it's a development version, expect issues when compiling or at run time.

We encourage everybody to help us testing every development version, at the end this is what will become stable.

Build and Install

Fluent Bit uses CMake as its build system. The suggested procedure to prepare the build system consists of the following steps:

Requirements

CMake >= 3.0
Flex
Bison
YAML headers
OpenSSL headers

Prepare environment

In the following steps you can find exact commands to build and install the project with the default options. If you already know how CMake works you can skip this part and look at the build options available. Note that Fluent Bit requires CMake 3.x. You may need to use cmake3 instead of cmake to complete the following steps on your system.

Change to the build/ directory inside the Fluent Bit sources:

$ cd build/

Let CMake configure the project specifying where the root path is located:

$ cmake ../
-- The C compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- The CXX compiler identification is GNU 4.9.2
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
...
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE)
-- Looking for accept4
-- Looking for accept4 - not found
-- Configuring done
-- Generating done
-- Build files have been written to: /home/edsiper/coding/fluent-bit/build

Now you are ready to start the compilation process through the simple make command:

$ make
Scanning dependencies of target msgpack
[  2%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/unpack.c.o
[  4%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/objectc.c.o
[  7%] Building C object lib/msgpack-1.1.0/CMakeFiles/msgpack.dir/src/version.c.o
...
[ 19%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_file.c.o
[ 21%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_rconf.c.o
[ 23%] Building C object lib/monkey/mk_core/CMakeFiles/mk_core.dir/mk_string.c.o
...
Scanning dependencies of target fluent-bit-static
[ 66%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_pack.c.o
[ 69%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_input.c.o
[ 71%] Building C object src/CMakeFiles/fluent-bit-static.dir/flb_output.c.o
...
Linking C executable ../bin/fluent-bit
[100%] Built target fluent-bit-bin

to continue installing the binary on the system just do:

$ make install

it's likely you may need root privileges so you can try to prefixing the command with sudo.

Build Options

Fluent Bit provides certain options to CMake that can be enabled or disabled when configuring, please refer to the following tables under the General Options, Development Options, Input Plugins and _Output Plugins sections.

General Options

Development Options

Optimization Options

Input Plugins

The input plugins provides certain features to gather information from a specific source type which can be a network interface, some built-in metric or through a specific input device, the following input plugins are available:

Filter Plugins

The filter plugins allows to modify, enrich or drop records. The following table describes the filters available on this version:

Output Plugins

The output plugins gives the capacity to flush the information to some external interface, service or terminal, the following table describes the output plugins available as of this version:

Build with Static Configuration

Fluent Bit in normal operation mode allows to be configurable through text files or using specific arguments in the command line, while this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.

Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.

Getting Started

Requirements

The following steps assumes you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the Build and Install section.

Configuration Directory

In your file system prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. It is mandatory that this directory contain as a minimum one configuration file called fluent-bit.conf containing the required SERVICE, INPUT and OUTPUT sections. As an example create a new fluent-bit.conf file with the following content:

[SERVICE]
    Flush     1
    Daemon    off
    Log_Level info

[INPUT]
    Name      cpu

[OUTPUT]
    Name      stdout
    Match     *

the configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.

Build with Custom Configuration

Inside Fluent Bit source code, get into the build/ directory and run CMake appending the FLB_STATIC_CONF option pointing the configuration directory recently created, e.g:

$ cd fluent-bit/build/
$ cmake -DFLB_STATIC_CONF=/path/to/my/confdir/

then build it:

$ make

At this point the fluent-bit binary generated is ready to run without necessity of further configuration:

$ bin/fluent-bit 
Fluent-Bit v0.15.0
Copyright (C) Treasure Data

[2018/10/19 15:32:31] [ info] [engine] started (pid=15186)
[0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

Linux Packages

The most secure option is to create the repositories acccording to the instructions for your specific OS.

A simple installation script is provided to be used for most Linux targets. This will by default install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use.

GPG key updates

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Migration to Fluent Bit

From version 1.9, td-agent-bit is a deprecated package and is removed after 1.9.9. The correct package name to use now is fluent-bit.

Amazon Linux

Install on Amazon Linux

Fluent Bit is distributed as fluent-bit package and is available for the latest Amazon Linux 2 and Amazon Linux 2022. The following architectures are supported

x86_64
aarch64 / arm64v8

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Amazon Linux 2022

For Amazon Linux 2022, until it is GA, we need to force it to use the 2022 releasever in Yum but only for the Fluent Bit repository.

export FLUENT_BIT_INSTALL_COMMAND_PREFIX="sed -i 's|\$releasever/|2022/|g' /etc/yum.repos.d/fluent-bit.repo"
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

Configure Yum

We provide fluent-bit through a Yum repository. In order to add the repository reference to your system, please add a new file called fluent-bit.repo in /etc/yum.repos.d/ with the following content:

Amazon Linux 2

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Amazon Linux 2022

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2022/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Note: we encourage you always enable the gpgcheck for security reasons. All our packages are signed.

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Install

Once your repository is configured, run the following command to install it:

yum install fluent-bit

Now the following step is to instruct systemd to enable the service:

sudo service fluent-bit start

If you do a status check, you should see a similar output like this:

$ service fluent-bit status
Redirecting to /bin/systemctl status  fluent-bit.service
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/usr/lib/systemd/system/fluent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (fluent-bit)
   CGroup: /system.slice/fluent-bit.service
           └─3820 /opt/fluent-bit/bin/fluent-bit -c etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/messages file.

Redhat / CentOS

Install on Redhat / CentOS

Fluent Bit is distributed as fluent-bit package and is available for the latest stable CentOS system.

The following architectures are supported

x86_64
aarch64 / arm64v8

For CentOS 9+ we use CentOS Stream as the canonical base system.

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

CentOS 8

CentOS 8 is now EOL so the default Yum repositories are unavailable.

Make sure to configure to use an appropriate mirror, for example:

$ sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-* && \
  sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*

An alternative is to use Rocky or Alma Linux which should be equivalent.

Configure Yum

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/$releasever/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
repo_gpgcheck=1
enabled=1

It is best practice to always enable the gpgcheck and repo_gpgcheck for security reasons. We sign our repository metadata as well as all of our packages.

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Install

Once your repository is configured, run the following command to install it:

yum install fluent-bit

Now the following step is to instruct Systemd to enable the service:

sudo service fluent-bit start

If you do a status check, you should see a similar output like this:

$ service fluent-bit status
Redirecting to /bin/systemctl status  fluent-bit.service
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/usr/lib/systemd/system/fluent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (fluent-bit)
   CGroup: /system.slice/fluent-bit.service
           └─3820 /opt/fluent-bit/bin/fluent-bit -c etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/messages file.

FAQ

Yum install fails with a "404 - Page not found" error for the package mirror

The fluent-bit.repo file for the latest installations of Fluent-Bit uses a $releasever variable to determine the correct version of the package to install to your system:

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/$releasever/$basearch/
...

Depending on your Red Hat distribution version, this variable may return a value other than the OS major release version (e.g., RHEL7 Server distributions return "7Server" instead of just "7"). The Fluent-Bit package url uses just the major OS release version, so any other value here will cause a 404.

In order to resolve this issue, you can replace the $releasever variable with your system's OS major release version. For example:

[fluent-bit]
name = Fluent Bit
baseurl = https://packages.fluentbit.io/centos/7/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
repo_gpgcheck=1
enabled=1

Debian

Fluent Bit is distributed as fluent-bit package and is available for the latest (and legacy) stable Debian systems: Bookworm and Bullseye. The following architectures are supported

x86_64
aarch64 / arm64v8

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages. Follow the official Debian wiki guidance: https://wiki.debian.org/DebianRepository/UseThirdParty#OpenPGP\_Key\_distribution

curl https://packages.fluentbit.io/fluentbit.key | gpg --dearmor > /usr/share/keyrings/fluentbit-keyring.gpg

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at https://packages.fluentbit.io/fluentbit.key so ensure this new one is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Update your sources lists

On Debian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file - ensure to set CODENAME to your specific Debian release name (e.g. bookworm for Debian 12):

deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/debian/${CODENAME} ${CODENAME} main

Update your repositories database

Now let your system update the apt database:

sudo apt-get update

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

Install Fluent Bit

Using the following apt-get command you are able now to install the latest fluent-bit:

sudo apt-get install fluent-bit

Now the following step is to instruct systemd to enable the service:

sudo systemctl start fluent-bit

If you do a status check, you should see a similar output like this:

sudo service fluent-bit status
● fluent-bit.service - Fluent Bit
   Loaded: loaded (/lib/systemd/system/fluent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (fluent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/fluent-bit.service
           └─6739 /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
...

The default configuration of fluent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Ubuntu

Fluent Bit is distributed as fluent-bit package and is available for the latest stable Ubuntu system: Jammy Jellyfish.

Single line install

A simple installation script is provided to be used for most Linux targets. This will always install the most recent version released.

This is purely a convenience helper and should always be validated prior to use. The recommended secure deployment approach is to follow the instructions below.

Server GPG key

The first step is to add our server GPG key to your keyring to ensure you can get our signed packages. Follow the official Debian wiki guidance:

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at so ensure this new one is added.

The GPG Key fingerprint of the new key is:

The previous key is still available at and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

Update your sources lists

Update your repositories database

Now let your system update the apt database:

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

If you have the following error "Certificate verification failed", you might want to check if the package ca-certificates is properly installed (sudo apt-get install ca-certificates).

## Install Fluent Bit

Using the following apt-get command you are able now to install the latest fluent-bit:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of fluent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Raspbian / Raspberry Pi

Fluent Bit is distributed as fluent-bit package and is available for the Raspberry, specifically for distribution, the following versions are supported:

Raspbian Bullseye (11)
Raspbian Buster (10)

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

Updated key from March 2022

From the 1.9.0 and 1.8.15 releases please note that the GPG key has been updated at so ensure this new one is added.

The GPG Key fingerprint of the new key is:

The previous key is still available at and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

Refer to the to see which platforms are supported in each release.

Update your sources lists

On Debian and derivative systems such as Raspbian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file.

Raspbian 11 (Bullseye)

Raspbian 10 (Buster)

Update your repositories database

Now let your system update the apt database:

We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

Install Fluent Bit

Using the following apt-get command you are able now to install the latest fluent-bit:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of fluent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Docker

Fluent Bit container images are available on Docker Hub ready for production usage. Current available images can be deployed in multiple architectures.

Quick Start

Get started by simply typing the following command:

Tags and Versions

The following table describes the Linux container tags that are available on Docker Hub repository:

Tag(s)

Manifest Architectures

Description

It is strongly suggested that you always use the latest image of Fluent Bit.

Windows container images are provided from v2.0.6 for Windows Server 2019 and Windows Server 2022. These can be found as tags on the same Docker Hub registry above.

Multi Architecture Images

From a deployment perspective, there is no need to specify an architecture, the container client tool that pulls the image gets the proper layer for the running architecture.

Verify signed container images

Note: replace cosign above with the binary installed if it has a different name (e.g. cosign-linux-amd64).

Keyless signing is also provided but this is still experimental:

Getting Started

Download the last stable image from 2.0 series:

Once the image is in place, now run the following (useless) test which makes Fluent Bit measure CPU usage by the container:

That command will let Fluent Bit measure CPU usage every second and flush the results to the standard output, e.g:

F.A.Q

Why there is no Fluent Bit Docker image based on Alpine Linux ?

Alpine Linux uses Musl C library instead of Glibc. Musl is not fully compatible with Glibc which generated many issues in the following areas when used with Fluent Bit:

Memory Allocator: to run Fluent Bit properly in high-load environments, we use Jemalloc as a default memory allocator which reduce fragmentation and provides better performance for our needs. Jemalloc cannot run smoothly with Musl and requires extra work.
Alpine Linux Musl functions bootstrap have a compatibility issue when loading Golang shared libraries, this generate problems when trying to load Golang output plugins in Fluent Bit.
Alpine Linux Musl Time format parser does not support Glibc extensions
Maintainers preference in terms of base image due to security and maintenance reasons are Distroless and Debian.

Why use distroless containers ?

Only include what you need, reduce the attack surface available.
Reduces size so improves perfomance as well.
Reduces false positives on scans (and reduces resources required for scanning).
Reduces supply chain security requirements to just what you need.
Helps prevent unauthorised processes or users interacting with the container.
Less need to harden the container (and container runtime, K8S, etc.).
Faster CICD processes.

With any choice of course there are downsides:

No shell or package manager to update/add things.
- Generally though dynamic updating is a bad idea in containers as the time it is done affects the outcome: two containers started at different times using the same base image may perform differently or get different dependencies, etc.
- A better approach is to rebuild a new image version but then you can do this with Distroless, however it is harder requiring multistage builds or similar to provide the new dependencies.
Debugging can be harder.
- More specifically you need applications set up to properly expose information for debugging rather than rely on traditional debug approaches of connecting to processes or dumping memory. This can be an upfront cost vs a runtime cost but does shift left in the development process so hopefully is a reduction overall.
Assumption that Distroless is secure: nothing is secure (just more or less secure) and there are still exploits so it does not remove the need for securing your system.
Sometimes you need to use a common base image, e.g. with audit/security/health/etc. hooks integrated, or common base tooling (this could still be Distroless though).

One other important thing to note is that exec'ing into a container will potentially impact resource limits.

This can be a quite different container from the one you want to investigate (e.g. lots of extra tools or even a different base).
No resource limits applied to this container - can be good or bad.
Runs in pod namespaces, just another container that can access everything the others can.
May need architecture of the pod to share volumes, etc.
Requires more recent versions of K8S and the container runtime plus RBAC allowing it.

Containers on AWS

AWS maintains a distribution of Fluent Bit combining the latest official release with a set of Go Plugins for sending logs to AWS services. AWS and Fluent Bit are working together to rewrite their plugins for inclusion in the official Fluent Bit distribution.

Plugins

Currently, the image contains Go Plugins for:

Fluent Bit includes Amazon CloudWatch Logs plugin named cloudwatch_logs, Amazon Kinesis Firehose plugin named kinesis_firehose and Amazon Kinesis Data Streams plugin named kinesis_streams which are higher performance than Go plugins.

Also, Fluent Bit includes S3 output plugin named s3.

Versions and Regional Repositories

AWS vends their container image via , and a set of highly available regional Amazon ECR repositories. For more information, see the .

The AWS for Fluent Bit image uses a custom versioning scheme because it contains multiple projects. To see what each release contains, check out the .

SSM Public Parameters

AWS vends SSM Public Parameters with the regional repository link for each image. These parameters can be queried by any AWS account.

To see a list of available version tags in a given region, run the following command:

To see the ECR repository URI for a given image tag in a given region, run the following:

You can use these SSM public parameters as parameters in your CloudFormation templates:

Amazon EC2

Learn how to .

Administration

Classic mode

Local Testing

Data Pipeline

Inputs

Windows

Fluent Bit is distributed as fluent-bit package for Windows and as a Windows container on Docker Hub. Fluent Bit has two flavours of Windows installers: a ZIP archive (for quick testing) and an EXE installer (for system installation).

Configuration

Make sure to provide a valid Windows configuration with the installation, a sample one is shown below:

[SERVICE]
    # Flush
    # =====
    # set an interval of seconds before to flush records to a destination
    flush        5

    # Daemon
    # ======
    # instruct Fluent Bit to run in foreground or background mode.
    daemon       Off

    # Log_Level
    # =========
    # Set the verbosity level of the service, values can be:
    #
    # - error
    # - warning
    # - info
    # - debug
    # - trace
    #
    # by default 'info' is set, that means it includes 'error' and 'warning'.
    log_level    info

    # Parsers File
    # ============
    # specify an optional 'Parsers' configuration file
    parsers_file parsers.conf

    # Plugins File
    # ============
    # specify an optional 'Plugins' configuration file to load external plugins.
    plugins_file plugins.conf

    # HTTP Server
    # ===========
    # Enable/Disable the built-in HTTP Server for metrics
    http_server  Off
    http_listen  0.0.0.0
    http_port    2020

    # Storage
    # =======
    # Fluent Bit can use memory and filesystem buffering based mechanisms
    #
    # - https://docs.fluentbit.io/manual/administration/buffering-and-storage
    #
    # storage metrics
    # ---------------
    # publish storage pipeline metrics in '/api/v1/storage'. The metrics are
    # exported only if the 'http_server' option is enabled.
    #
    storage.metrics on

[INPUT]
    Name         winlog
    Channels     Setup,Windows PowerShell
    Interval_Sec 1

[OUTPUT]
    name  stdout
    match *

Migration to Fluent Bit

From version 1.9, td-agent-bit is a deprecated package and was removed after 1.9.9. The correct package name to use now is fluent-bit.

Installation Packages

The latest stable version is 2.0.9, each version is available on the Github release as well as at https://releases.fluentbit.io/<Major Version>/fluent-bit-<Full Version>-win[32|64].[exe|zip]:

To check the integrity, use Get-FileHash cmdlet on PowerShell.

PS> Get-FileHash fluent-bit-2.0.9-win32.exe

Installing from ZIP archive

Download a ZIP archive from above. There are installers for 32-bit and 64-bit environments, so choose one suitable for your environment.

Then you need to expand the ZIP archive. You can do this by clicking "Extract All" on Explorer, or if you're using PowerShell, you can use Expand-Archive cmdlet.

PS> Expand-Archive fluent-bit-2.0.9-win64.zip

The ZIP package contains the following set of files.

fluent-bit
├── bin
│   ├── fluent-bit.dll
│   └── fluent-bit.exe
│   └── fluent-bit.pdb
├── conf
│   ├── fluent-bit.conf
│   ├── parsers.conf
│   └── plugins.conf
└── include
    │   ├── flb_api.h
    │   ├── ...
    │   └── flb_worker.h
    └── fluent-bit.h

Now, launch cmd.exe or PowerShell on your machine, and execute fluent-bit.exe as follows.

PS> .\bin\fluent-bit.exe -i dummy -o stdout

If you see the following output, it's working fine!

PS> .\bin\fluent-bit.exe  -i dummy -o stdout
Fluent Bit v2.0.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2019/06/28 10:13:04] [ info] [storage] initializing...
[2019/06/28 10:13:04] [ info] [storage] in-memory
[2019/06/28 10:13:04] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2019/06/28 10:13:04] [ info] [engine] started (pid=10324)
[2019/06/28 10:13:04] [ info] [sp] stream processor started
[0] dummy.0: [1561684385.443823800, {"message"=>"dummy"}]
[1] dummy.0: [1561684386.428399000, {"message"=>"dummy"}]
[2] dummy.0: [1561684387.443641900, {"message"=>"dummy"}]
[3] dummy.0: [1561684388.441405800, {"message"=>"dummy"}]

To halt the process, press CTRL-C in the terminal.

Installing from EXE installer

Download an EXE installer from the download page. It has both 32-bit and 64-bit builds. Choose one which is suitable for you.

Double-click the EXE installer you've downloaded. The installation wizard will automatically start.

Click Next and proceed. By default, Fluent Bit is installed into C:\Program Files\fluent-bit\, so you should be able to launch fluent-bit as follows after installation.

PS> C:\Program Files\fluent-bit\bin\fluent-bit.exe -i dummy -o stdout

Installer options

The Windows installer is built by [CPack using NSIS(https://cmake.org/cmake/help/latest/cpack_gen/nsis.html) and so supports the default options that all NSIS installers do for silent installation and the directory to install to.

To silently install to C:\fluent-bit directory here is an example:

PS> <installer exe> /S /D=C:\fluent-bit

The uninstaller automatically provided also supports a silent un-install using the same /S flag. This may be useful for provisioning with automation like Ansible, Puppet, etc.

Windows Service Support

Windows services are equivalent to "daemons" in UNIX (i.e. long-running background processes). Since v1.5.0, Fluent Bit has the native support for Windows Service.

Suppose you have the following installation layout:

C:\fluent-bit\
├── conf
│   ├── fluent-bit.conf
│   └── parsers.conf
│   └── plugins.conf
└── bin
    ├── fluent-bit.dll
    └── fluent-bit.exe
    └── fluent-bit.pdb

To register Fluent Bit as a Windows service, you need to execute the following command on Command Prompt. Please be careful that a single space is required after binpath=.

% sc.exe create fluent-bit binpath= "\fluent-bit\bin\fluent-bit.exe -c \fluent-bit\conf\fluent-bit.conf"

Now Fluent Bit can be started and managed as a normal Windows service.

% sc.exe start fluent-bit
% sc.exe query fluent-bit
SERVICE_NAME: fluent-bit
    TYPE               : 10  WIN32_OWN_PROCESS
    STATE              : 4 Running
    ...

To halt the Fluent Bit service, just execute the "stop" command.

% sc.exe stop fluent-bit

To start Fluent Bit automatically on boot, execute the following:

% sc.exe config fluent-bit start= auto

[FAQ] Fluent Bit fails to start up when installed under `C:\Program Files`

Quotations are required if file paths contain spaces. Here is an example:

% sc.exe create fluent-bit binpath= "\"C:\Program Files\fluent-bit\bin\fluent-bit.exe\" -c \"C:\Program Files\fluent-bit\conf\fluent-bit.conf\""

[FAQ] How can I manage Fluent Bit service via PowerShell?

Instead of sc.exe, PowerShell can be used to manage Windows services.

Create a Fluent Bit service:

PS> New-Service fluent-bit -BinaryPathName "C:\fluent-bit\bin\fluent-bit.exe -c C:\fluent-bit\conf\fluent-bit.conf" -StartupType Automatic

Start the service:

PS> Start-Service fluent-bit

Query the service status:

PS> get-Service fluent-bit | format-list
Name                : fluent-bit
DisplayName         : fluent-bit
Status              : Running
DependentServices   : {}
ServicesDependedOn  : {}
CanPauseAndContinue : False
CanShutdown         : False
CanStop             : True
ServiceType         : Win32OwnProcess

Stop the service:

PS> Stop-Service fluent-bit

Remove the service (requires PowerShell 6.0 or later)

PS> Remove-Service fluent-bit

Compile from Source

If you need to create a custom executable, you can use the following procedure to compile Fluent Bit by yourself.

Preparation

First, you need Microsoft Visual C++ to compile Fluent Bit. You can install the minimum toolkit by the following command:

PS> wget -o vs.exe https://aka.ms/vs/16/release/vs_buildtools.exe
PS> start vs.exe

When asked which packages to install, choose "C++ Build Tools" (make sure that "C++ CMake tools for Windows" is selected too) and wait until the process finishes.

Also you need to install flex and bison. One way to install them on Windows is to use winflexbison.

PS> wget -o winflexbison.zip https://github.com/lexxmark/winflexbison/releases/download/v2.5.22/win_flex_bison-2.5.22.zip
PS> Expand-Archive winflexbison.zip -Destination C:\WinFlexBison
PS> cp -Path C:\WinFlexBison\win_bison.exe C:\WinFlexBison\bison.exe
PS> cp -Path C:\WinFlexBison\win_flex.exe C:\WinFlexBison\flex.exe

Add the path C:\WinFlexBison to your systems environment variable "Path". Here's how to do that.

Also you need to install git to pull the source code from the repository.

PS> wget -o git.exe https://github.com/git-for-windows/git/releases/download/v2.28.0.windows.1/Git-2.28.0-64-bit.exe
PS> start git.exe

Compilation

Open the start menu on Windows and type "Developer Command Prompt".

Clone the source code of Fluent Bit.

% git clone https://github.com/fluent/fluent-bit
% cd fluent-bit/build

Compile the source code.

% cmake .. -G "NMake Makefiles"
% cmake --build .

Now you should be able to run Fluent Bit:

% .\bin\debug\fluent-bit.exe -i dummy -o stdout

Packaging

To create a ZIP package, call cpack as follows:

% cpack -G ZIP

Troubleshooting

Tap Functionality: generate events or records
Dump Internals Signal

Tap Functionality

Tap can be used to generate events or records detailing what messages pass through Fluent Bit, at what time and what filters affect them.

Simple example

First, we will make sure that the container image we are going to use actually supports Fluent Bit Tap (available in Fluent Bit 2.0+):

$ docker run --rm -ti fluent/fluent-bit:latest --help | grep -Z
  -Z, --enable-chunk-trace     enable chunk tracing. activating it requires using the HTTP Server API.

If the --enable-chunk-trace option is present it means Fluent Bit has support for Fluent Bit Tap but it is disabled by default, so remember to enable it with this option.

Tap support is enabled and disabled via the embedded web server, so enable it like so (or the equivalent option in the configuration file):

$ docker run --rm -ti -p 2020:2020 fluent/fluent-bit:latest -Z -H -i dummy -p alias=input_dummy -o stdout -f 1
Fluent Bit v2.0.0
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/10/21 10:03:16] [ info] [fluent bit] version=2.0.0, commit=3000f699f2, pid=1
[2022/10/21 10:03:16] [ info] [output:stdout:stdout.0] worker #0 started
[2022/10/21 10:03:16] [ info] [storage] ver=1.3.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2022/10/21 10:03:16] [ info] [cmetrics] version=0.5.2
[2022/10/21 10:03:16] [ info] [input:dummy:input_dummy] initializing
[2022/10/21 10:03:16] [ info] [input:dummy:input_dummy] storage_strategy='memory' (memory only)
[2022/10/21 10:03:16] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2022/10/21 10:03:16] [ info] [sp] stream processor started
[0] dummy.0: [1666346597.203307010, {"message"=>"dummy"}]
[0] dummy.0: [1666346598.204103793, {"message"=>"dummy"}]
...

In another terminal we can activate Tap by either using the instance id of the input; dummy.0 or its alias.

Since the alias is more predictable that is what we will use:

$ curl 127.0.0.1:2020/api/v1/trace/input_dummy
{"status":"ok"}

This response means we have activated Tap, the terminal with Fluent Bit running should now look like this:

[0] dummy.0: [1666346615.203253156, {"message"=>"dummy"}]
[2022/10/21 10:03:36] [ info] [fluent bit] version=2.0.0, commit=3000f699f2, pid=1
[2022/10/21 10:03:36] [ info] [storage] ver=1.3.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2022/10/21 10:03:36] [ info] [cmetrics] version=0.5.2
[2022/10/21 10:03:36] [ info] [input:emitter:trace-emitter] initializing
[2022/10/21 10:03:36] [ info] [input:emitter:trace-emitter] storage_strategy='memory' (memory only)
[2022/10/21 10:03:36] [ info] [sp] stream processor started
[2022/10/21 10:03:36] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy.0: [1666346616.203551736, {"message"=>"dummy"}]
[0] trace: [1666346617.205221952, {"type"=>1, "trace_id"=>"trace.0", "plugin_instance"=>"dummy.0", "plugin_alias"=>"input_dummy", "records"=>[{"timestamp"=>1666346617, "record"=>{"message"=>"dummy"}}], "start_time"=>1666346617, "end_time"=>1666346617}]
[0] dummy.0: [1666346617.205131790, {"message"=>"dummy"}]
[0] trace: [1666346617.205419358, {"type"=>3, "trace_id"=>"trace.0", "plugin_instance"=>"dummy.0", "plugin_alias"=>"input_dummy", "records"=>[{"timestamp"=>1666346617, "record"=>{"message"=>"dummy"}}], "start_time"=>1666346617, "end_time"=>1666346617}]
[0] trace: [1666346618.204110867, {"type"=>1, "trace_id"=>"trace.1", "plugin_instance"=>"dummy.0", "plugin_alias"=>"input_dummy", "records"=>[{"timestamp"=>1666346618, "record"=>{[0] dummy.0: [1666346618.204049246, {"message"=>"dummy"}]
"message"=>"dummy"}}], "start_time"=>1666346618, "end_time"=>1666346618}]
[0] trace: [1666346618.204198654, {"type"=>3, "trace_id"=>"trace.1", "plugin_instance"=>"dummy.0", "plugin_alias"=>"input_dummy", "records"=>[{"timestamp"=>1666346618, "record"=>{"message"=>"dummy"}}], "start_time"=>1666346618, "end_time"=>1666346618}]

All the records that now appear are those emitted by the activities of the dummy plugin.

Complex example

This example takes the same steps but demonstrates the same mechanism works with more complicated configurations. In this example we will follow a single input of many which passes through several filters.

$ docker run --rm -ti -p 2020:2020 \
	fluent/fluent-bit:latest \
	-Z -H \
		-i dummy -p alias=dummy_0 -p \
			dummy='{"dummy": "dummy_0", "key_name": "foo", "key_cnt": "1"}' \
		-i dummy -p alias=dummy_1 -p dummy='{"dummy": "dummy_1"}' \
		-i dummy -p alias=dummy_2 -p dummy='{"dummy": "dummy_2"}' \
		-F record_modifier -m 'dummy.0' -p record="powered_by fluent" \
		-F record_modifier -m 'dummy.1' -p record="powered_by fluent-bit" \
		-F nest -m 'dummy.0' \
			-p operation=nest -p wildcard='key_*' -p nest_under=data \
		-o null -m '*' -f 1

To make sure the window is not cluttered by the actual records generated by the input plugins we send all of it to null.

We activate with the following 'curl' command:

$ curl 127.0.0.1:2020/api/v1/trace/dummy_0
{"status":"ok"}

Now we should start seeing output similar to the following:

[0] trace: [1666349359.325597543, {"type"=>1, "trace_id"=>"trace.0", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349359, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1"}}], "start_time"=>1666349359, "end_time"=>1666349359}]
[0] trace: [1666349359.325723747, {"type"=>2, "start_time"=>1666349359, "end_time"=>1666349359, "trace_id"=>"trace.0", "plugin_instance"=>"record_modifier.0", "records"=>[{"timestamp"=>1666349359, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1", "powered_by"=>"fluent"}}]}]
[0] trace: [1666349359.325783954, {"type"=>2, "start_time"=>1666349359, "end_time"=>1666349359, "trace_id"=>"trace.0", "plugin_instance"=>"nest.2", "records"=>[{"timestamp"=>1666349359, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}]}]
[0] trace: [1666349359.325913783, {"type"=>3, "trace_id"=>"trace.0", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349359, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}], "start_time"=>1666349359, "end_time"=>1666349359}]
[0] trace: [1666349360.323826619, {"type"=>1, "trace_id"=>"trace.1", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349360, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1"}}], "start_time"=>1666349360, "end_time"=>1666349360}]
[0] trace: [1666349360.323859618, {"type"=>2, "start_time"=>1666349360, "end_time"=>1666349360, "trace_id"=>"trace.1", "plugin_instance"=>"record_modifier.0", "records"=>[{"timestamp"=>1666349360, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1", "powered_by"=>"fluent"}}]}]
[0] trace: [1666349360.323900784, {"type"=>2, "start_time"=>1666349360, "end_time"=>1666349360, "trace_id"=>"trace.1", "plugin_instance"=>"nest.2", "records"=>[{"timestamp"=>1666349360, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}]}]
[0] trace: [1666349360.323926366, {"type"=>3, "trace_id"=>"trace.1", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349360, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}], "start_time"=>1666349360, "end_time"=>1666349360}]
[0] trace: [1666349361.324223752, {"type"=>1, "trace_id"=>"trace.2", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349361, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1"}}], "start_time"=>1666349361, "end_time"=>1666349361}]
[0] trace: [1666349361.324263959, {"type"=>2, "start_time"=>1666349361, "end_time"=>1666349361, "trace_id"=>"trace.2", "plugin_instance"=>"record_modifier.0", "records"=>[{"timestamp"=>1666349361, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1", "powered_by"=>"fluent"}}]}]
[0] trace: [1666349361.324283250, {"type"=>2, "start_time"=>1666349361, "end_time"=>1666349361, "trace_id"=>"trace.2", "plugin_instance"=>"nest.2", "records"=>[{"timestamp"=>1666349361, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}]}]
[0] trace: [1666349361.324294291, {"type"=>3, "trace_id"=>"trace.2", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349361, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}], "start_time"=>1666349361, "end_time"=>1666349361}]
^C[2022/10/21 10:49:23] [engine] caught signal (SIGINT)
[2022/10/21 10:49:23] [ warn] [engine] service will shutdown in max 5 seconds
[2022/10/21 10:49:23] [ info] [input] pausing dummy_0
[2022/10/21 10:49:23] [ info] [input] pausing dummy_1
[2022/10/21 10:49:23] [ info] [input] pausing dummy_2
[2022/10/21 10:49:23] [ info] [engine] service has stopped (0 pending tasks)
[2022/10/21 10:49:23] [ info] [input] pausing dummy_0
[2022/10/21 10:49:23] [ info] [input] pausing dummy_1
[2022/10/21 10:49:23] [ info] [input] pausing dummy_2
[0] trace: [1666349362.323272011, {"type"=>1, "trace_id"=>"trace.3", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349362, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1"}}], "start_time"=>1666349362, "end_time"=>1666349362}]
[0] trace: [1666349362.323306843, {"type"=>2, "start_time"=>1666349362, "end_time"=>1666349362, "trace_id"=>"trace.3", "plugin_instance"=>"record_modifier.0", "records"=>[{"timestamp"=>1666349362, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1", "powered_by"=>"fluent"}}]}]
[0] trace: [1666349362.323323884, {"type"=>2, "start_time"=>1666349362, "end_time"=>1666349362, "trace_id"=>"trace.3", "plugin_instance"=>"nest.2", "records"=>[{"timestamp"=>1666349362, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}]}]
[0] trace: [1666349362.323334509, {"type"=>3, "trace_id"=>"trace.3", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349362, "record"=>{"dummy"=>"dummy_0", "powered_by"=>"fluent", "data"=>{"key_name"=>"foo", "key_cnt"=>"1"}}}], "start_time"=>1666349362, "end_time"=>1666349362}]
[2022/10/21 10:49:24] [ warn] [engine] service will shutdown in max 1 seconds
[2022/10/21 10:49:25] [ info] [engine] service has stopped (0 pending tasks)
[2022/10/21 10:49:25] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2022/10/21 10:49:25] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2022/10/21 10:49:25] [ info] [output:null:null.0] thread worker #0 stopping...
[2022/10/21 10:49:25] [ info] [output:null:null.0] thread worker #0 stopped

Parameters for the output in Tap

When activating Tap, any plugin parameter can be given. These can be used to modify, for example, the output format, the name of the time key, the format of the date, etc.

In the next example we will use the parameter "format": "json" to demonstrate how in Tap, stdout can be shown in Json format.

First, run Fluent Bit enabling Tap:

$ docker run --rm -ti -p 2020:2020 fluent/fluent-bit:latest -Z -H -i dummy -p alias=input_dummy -o stdout -f 1
Fluent Bit v2.0.8
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/01/27 07:44:25] [ info] [fluent bit] version=2.0.8, commit=9444fdc5ee, pid=1
[2023/01/27 07:44:25] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/01/27 07:44:25] [ info] [cmetrics] version=0.5.8
[2023/01/27 07:44:25] [ info] [ctraces ] version=0.2.7
[2023/01/27 07:44:25] [ info] [input:dummy:input_dummy] initializing
[2023/01/27 07:44:25] [ info] [input:dummy:input_dummy] storage_strategy='memory' (memory only)
[2023/01/27 07:44:25] [ info] [output:stdout:stdout.0] worker #0 started
[2023/01/27 07:44:25] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2023/01/27 07:44:25] [ info] [sp] stream processor started
[0] dummy.0: [1674805465.976012761, {"message"=>"dummy"}]
[0] dummy.0: [1674805466.973669512, {"message"=>"dummy"}]
...

Next, in another terminal, we activate Tap including the output, in this case stdout, and the parameters wanted, in this case "format": "json":

$ curl 127.0.0.1:2020/api/v1/trace/input_dummy -d '{"output":"stdout", "params": {"format": "json"}}'
{"status":"ok"}

In the first terminal, we should be seeing the output similar to the following:

[0] dummy.0: [1674805635.972373840, {"message"=>"dummy"}]
[{"date":1674805634.974457,"type":1,"trace_id":"0","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805634,"record":{"message":"dummy"}}],"start_time":1674805634,"end_time":1674805634},{"date":1674805634.974605,"type":3,"trace_id":"0","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805634,"record":{"message":"dummy"}}],"start_time":1674805634,"end_time":1674805634},{"date":1674805635.972398,"type":1,"trace_id":"1","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805635,"record":{"message":"dummy"}}],"start_time":1674805635,"end_time":1674805635},{"date":1674805635.972413,"type":3,"trace_id":"1","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805635,"record":{"message":"dummy"}}],"start_time":1674805635,"end_time":1674805635}]
[0] dummy.0: [1674805636.973970215, {"message"=>"dummy"}]
[{"date":1674805636.974008,"type":1,"trace_id":"2","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805636,"record":{"message":"dummy"}}],"start_time":1674805636,"end_time":1674805636},{"date":1674805636.974034,"type":3,"trace_id":"2","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805636,"record":{"message":"dummy"}}],"start_time":1674805636,"end_time":1674805636}]

This parameter shows stdout in Json format, however, as mentioned before, parameters can be passed to any plugin.

Please visit the following link for more information on other output plugins: https://docs.fluentbit.io/manual/pipeline/outputs

Analysis of a single Tap record

Here we analyze a single record from a filter event to explain the meaning of each field in detail. We chose a filter record since it includes the most details of all the record types.

{
	"type": 2,
	"start_time": 1666349231,
	"end_time": 1666349231,
	"trace_id": "trace.1",
	"plugin_instance": "nest.2", 
	"records": [{
		"timestamp": 1666349231,
		"record": {
			"dummy": "dummy_0",
			"powered_by": "fluent",
			"data": {
				"key_name": "foo", 
				"key_cnt": "1"
			}
		}
	}]
}

type

The type defines at what stage the event is generated:

type=1: input record
- this is the unadulterated input record
type=2: filtered record
- this is a record once it has been filtered. One record is generated per filter.
type=3: pre-output record
- this is the record right before it is sent for output.

Since this is a record generated by the manipulation of a record by a filter is has the type 2.

start_time and end_time

This records the start and end of an event, it is a bit different for each event type:

type 1: when the input is received, both the start and end time.
type 2: the time when filtering is matched until it has finished processing.
type 3: the time when the input is received and when it is finally slated for output.

trace_id

This is a string composed of a prefix and a number which is incremented with each record received by the input during the Tap session.

plugin_instance

This is the plugin instance name as it is generated by Fluent Bit at runtime.

plugin_alias

If an alias is set this field will contain the alias set for a plugin.

records

This is an array of all the records being sent. Since Fluent Bit handles records in chunks of multiple records and chunks are indivisible the same is done in the Tap output. Each record consists of its timestamp followed by the actual data which is a composite type of keys and values.

Dump Internals / Signal

When the service is running we can export metrics to see the overall status of the data flow of the service. But there are other use cases where we would like to know the current status of the internals of the service, specifically to answer questions like what's the current status of the internal buffers ? , the Dump Internals feature is the answer.

Fluent Bit v1.4 introduces the Dump Internals feature that can be triggered easily from the command line triggering the CONT Unix signal.

note: this feature is only available on Linux and BSD family operating systems

Usage

Run the following kill command to signal Fluent Bit:

kill -CONT `pidof fluent-bit`

The command pidof aims to lookup the Process ID of Fluent Bit. You can replace the

Fluent Bit will dump the following information to the standard output interface (stdout):

[engine] caught signal (SIGCONT)
[2020/03/23 17:39:02] Fluent Bit Dump

===== Input =====
syslog_debug (syslog)
│
├─ status
│  └─ overlimit     : no
│     ├─ mem size   : 60.8M (63752145 bytes)
│     └─ mem limit  : 61.0M (64000000 bytes)
│
├─ tasks
│  ├─ total tasks   : 92
│  ├─ new           : 0
│  ├─ running       : 92
│  └─ size          : 171.1M (179391504 bytes)
│
└─ chunks
   └─ total chunks  : 92
      ├─ up chunks  : 35
      ├─ down chunks: 57
      └─ busy chunks: 92
         ├─ size    : 60.8M (63752145 bytes)
         └─ size err: 0

===== Storage Layer =====
total chunks     : 92
├─ mem chunks    : 0
└─ fs chunks     : 92
   ├─ up         : 35
   └─ down       : 57

Input Plugins Dump

The dump provides insights for every input instance configured.

Status

Overall ingestion status of the plugin.

Tasks

When an input plugin ingest data into the engine, a Chunk is created. A Chunk can contains multiple records. Upon flush time, the engine creates a Task that contains the routes for the Chunk associated in question.

The Task dump describes the tasks associated to the input plugin:

Chunks

The Chunks dump tells more details about all the chunks that the input plugin has generated and are still being processed.

Depending of the buffering strategy and limits imposed by configuration, some Chunks might be up (in memory) or down (filesystem).

Storage Layer Dump

Fluent Bit relies on a custom storage layer interface designed for hybrid buffering. The Storage Layer entry contains a total summary of Chunks registered by Fluent Bit:

Tail

The tail input plugin allows to monitor one or several text files. It has a similar behavior like tail -f shell command.

The plugin reads every matched file in the Path pattern and for every new line found (separated by a newline character (\n) ), it generates a new record. Optionally a database file can be used so the plugin can have a history of tracked files and a state of offsets, this is very useful to resume a state if the service is restarted.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

Note that if the database parameter DB is not specified, by default the plugin will start reading each target file from the beginning. This also might cause some unwanted behavior, for example when a line is bigger that Buffer_Chunk_Size and Skip_Long_Lines is not turned on, the file will be read from the beginning of each Refresh_Interval until the file is rotated.

Multiline Support

Starting from Fluent Bit v1.8 we have introduced a new Multiline core functionality. For Tail input plugin, it means that now it supports the old configuration mechanism but also the new one. In order to avoid breaking changes, we will keep both but encourage our users to use the latest one. We will call the two mechanisms as:

Multiline Core
Old Multiline

Multiline Core (v1.8)

The new multiline core is exposed by the following configuration:

As stated in the Multiline Parser documentation, now we provide built-in configuration modes. Note that when using a new multiline.parser definition, you must disable the old configuration from your tail section like:

parser
parser_firstline
parser_N
multiline
multiline_flush
docker_mode

Multiline and Containers (v1.8)

If you are running Fluent Bit to process logs coming from containers like Docker or CRI, you can use the new built-in modes for such purposes. This will help to reassembly multiline messages originally split by Docker or CRI:

[INPUT]
    name              tail
    path              /var/log/containers/*.log
    multiline.parser  docker, cri

pipeline:
  inputs:
    - tail:
      path: /var/log/containers/*.log
      multiline.parser: docker, cri

The two options separated by a comma means multi-format: try docker and cri multiline formats.

We are still working on extending support to do multiline for nested stack traces and such. Over the Fluent Bit v1.8.x release cycle we will be updating the documentation.

Old Multiline Configuration Parameters

For the old multiline configuration, the following options exist to configure the handling of multilines logs:

Old Docker Mode Configuration Parameters

Docker mode exists to recombine JSON log lines split by the Docker daemon due to its line length limit. To use this feature, configure the tail plugin with the corresponding parser and then enable Docker mode:

Getting Started

In order to tail text or log files, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit parse text files with the following options:

$ fluent-bit -i tail -p path=/var/log/syslog -o stdout

Configuration File

In your main configuration file append the following Input & Output sections. An example visualization can be found here

[INPUT]
    Name        tail
    Path        /var/log/syslog

[OUTPUT]
    Name   stdout
    Match  *

pipeline:
  inputs:
    - tail:
      path: /var/log/syslog
      
  outputs:
    - stdout:
      match: *

Old Multi-line example

When using multi-line configuration you need to first specify Multiline On in the configuration and use the Parser_Firstline and additional parser parameters Parser_N if needed. If we are trying to read the following Java Stacktrace as a single event

Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)

We need to specify a Parser_Firstline parameter that matches the first line of a multi-line event. Once a match is made Fluent Bit will read all future lines until another match with Parser_Firstline is made .

In the case above we can use the following parser, that extracts the Time as time and the remaining portion of the multiline as log

[PARSER]
    Name multiline
    Format regex
    Regex /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
    Time_Key  time
    Time_Format %b %d %H:%M:%S

If we want to further parse the entire event we can add additional parsers with Parser_N where N is an integer. The final Fluent Bit configuration looks like the following:

# Note this is generally added to parsers.conf and referenced in [SERVICE]
[PARSER]
    Name multiline
    Format regex
    Regex /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
    Time_Key  time
    Time_Format %b %d %H:%M:%S

[INPUT]
    Name             tail
    Multiline        On
    Parser_Firstline multiline
    Path             /var/log/java.log

[OUTPUT]
    Name             stdout
    Match            *

Our output will be as follows.

[0] tail.0: [1607928428.466041977, {"message"=>"Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)", "message"=>"at com.myproject.module.MyProject.main(MyProject.java:6)"}]

Tailing files keeping state

The tail input plugin a feature to save the state of the tracked files, is strongly suggested you enabled this. For this purpose the db property is available, e.g:

$ fluent-bit -i tail -p path=/var/log/syslog -p db=/path/to/logs.db -o stdout

When running, the database file /path/to/logs.db will be created, this database is backed by SQLite3 so if you are interested into explore the content, you can open it with the SQLite client tool, e.g:

$ sqlite3 tail.db
-- Loading resources from /home/edsiper/.sqliterc

SQLite version 3.14.1 2016-08-11 18:53:32
Enter ".help" for usage hints.
sqlite> SELECT * FROM in_tail_files;
id     name                              offset        inode         created
-----  --------------------------------  ------------  ------------  ----------
1      /var/log/syslog                   73453145      23462108      1480371857
sqlite>

Make sure to explore when Fluent Bit is not hard working on the database file, otherwise you will see some Error: database is locked messages.

Formatting SQLite

By default SQLite client tool do not format the columns in a human read-way, so to explore in_tail_files table you can create a config file in ~/.sqliterc with the following content:

.headers on
.mode column
.width 5 32 12 12 10

SQLite and Write Ahead Logging

Fluent Bit keep the state or checkpoint of each file through using a SQLite database file, so if the service is restarted, it can continue consuming files from it last checkpoint position (offset). The default options set are enabled for high performance and corruption-safe.

The SQLite journaling mode enabled is Write Ahead Log or WAL. This allows to improve performance of read and write operations to disk. When enabled, you will see in your file system additional files being created, consider the following configuration statement:

[INPUT]
    name    tail
    path    /var/log/containers/*.log
    db      test.db

The above configuration enables a database file called test.db and in the same path for that file SQLite will create two additional files:

test.db-shm
test.db-wal

Those two files aims to support the WAL mechanism that helps to improve performance and reduce the number system calls required. The -wal file refers to the file that stores the new changes to be committed, at some point the WAL file transactions are moved back to the real database file. The -shm file is a shared-memory type to allow concurrent-users to the WAL file.

WAL and Memory Usage

The WAL mechanism give us higher performance but also might increase the memory usage by Fluent Bit. Most of this usage comes from the memory mapped and cached pages. In some cases you might see that memory usage keeps a bit high giving the impression of a memory leak, but actually is not relevant unless you want your memory metrics back to normal. Starting from Fluent Bit v1.7.3 we introduced the new option db.journal_mode mode that sets the journal mode for databases, by default it will be WAL (Write-Ahead Logging), currently allowed configurations for db.journal_mode are DELETE | TRUNCATE | PERSIST | MEMORY | WAL | OFF .

File Rotation

File rotation is properly handled, including logrotate's copytruncate mode.

Note that the Path patterns cannot match the rotated files. Otherwise, the rotated file would be read again and lead to duplicate records.

Buffering & Storage

The end-goal of Fluent Bit is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases and one of the critical pieces is the ability to do buffering : a mechanism to place processed data into a temporary location until is ready to be shipped.

By default when Fluent Bit processes data, it uses Memory as a primary and temporary place to store the records, but there are certain scenarios where it would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities.

Choosing the right configuration is critical and the behavior of the service can be conditioned based in the backpressure settings. Before we jump into the configuration let's make sure we understand the relationship between Chunks, Memory, Filesystem and Backpressure.

Chunks, Memory, Filesystem and Backpressure

Understanding the chunks, buffering and backpressure concepts is critical for a proper configuration. Let's do a recap of the meaning of these concepts.

Chunks

When an input plugin (source) emits records, the engine groups the records together in a Chunk. A Chunk size usually is around 2MB. By configuration, the engine decides where to place this Chunk, the default is that all chunks are created only in memory.

Buffering and Memory

As mentioned above, the Chunks generated by the engine are placed in memory but this is configurable.

If memory is the only mechanism set for the input plugin, it will just store data as much as it can there (memory). This is the fastest mechanism with the least system overhead, but if the service is not able to deliver the records fast enough because of a slow network or an unresponsive remote service, Fluent Bit memory usage will increase since it will accumulate more data than it can deliver.

In a high load environment with backpressure the risks of having high memory usage is the chance of getting killed by the Kernel (OOM Killer). A workaround for this backpressure scenario is to limit the amount of memory in records that an input plugin can register, this configuration property is called mem_buf_limit. If a plugin has enqueued more than the mem_buf_limit, it won't be able to ingest more until that data can be delivered or flushed properly. In this scenario the input plugin in question is paused. When the input is paused, records will not be ingested until it is resumed. For some inputs, such as TCP and tail, pausing the input will almost certainly lead to log loss. For the tail input, Fluent Bit can save its current offset in the current file it is reading, and pick back up when the input is resumed.

Look for messages in the Fluent Bit log output like:

[input] tail.1 paused (mem buf overlimit)
[input] tail.1 resume (mem buf overlimit)

The workaround of mem_buf_limit is good for certain scenarios and environments, it helps to control the memory usage of the service, but at the costs that if a file gets rotated while paused, you might lose that data since it won't be able to register new records. This can happen with any input source plugin. The goal of mem_buf_limit is memory control and survival of the service.

For full data safety guarantee, use filesystem buffering.

Here is an example input definition:

[INPUT]
    Name          tcp
    Listen        0.0.0.0
    Port          5170
    Format        none
    Tag           tcp-logs
    Mem_Buf_Limit 50MB

If this input uses more than 50MB memory to buffer logs, you will get a warning like this in the Fluent Bit logs:

[input] tcp.1 paused (mem buf overlimit)

Please note that Mem_Buf_Limit only applies when storage.type is set to the default value of memory. The section below explains the limits that apply when you enable storage.type filesystem.

Filesystem buffering to the rescue

Filesystem buffering enabled helps with backpressure and overall memory control.

Behind the scenes, Memory and Filesystem buffering mechanisms are not mutually exclusive. Indeed when enabling filesystem buffering for your input plugin (source) you are getting the best of the two worlds: performance and data safety.

When Filesystem buffering is enabled, the behavior of the engine is different. Upon Chunk creation, the engine stores the content in memory and also maps a copy on disk (through mmap(2)). The newly created Chunk is (1) active in memory, (2) backed up on disk, and (3) is called to be up which means "the chunk content is up in memory".

How does the Filesystem buffering mechanism deal with high memory usage and backpressure? Fluent Bit controls the number of Chunks that are up in memory.

By default, the engine allows us to have 128 Chunks up in memory in total (considering all Chunks), this value is controlled by service property storage.max_chunks_up. The active Chunks that are up are ready for delivery and the ones that are still receiving records. Any other remaining Chunk is in a down state, which means that it is only in the filesystem and won't be up in memory unless it is ready to be delivered. Remember, chunks are never much larger than 2 MB, thus, with the default storage.max_chunks_up value of 128, each input is limited to roughly 256 MB of memory.

If the input plugin has enabled storage.type as filesystem, when reaching the storage.max_chunks_up threshold, instead of the plugin being paused, all new data will go to Chunks that are down in the filesystem. This allows us to control the memory usage by the service and also provides a guarantee that the service won't lose any data. By default, the enforcement of the storage.max_chunks_up limit is best-effort. Fluent Bit can only append new data to chunks that are up; when the limit is reached chunks will be temporarily brought up in memory to ingest new data, and then put to a down state afterwards. In general, Fluent Bit will work to keep the total number of up chunks at or below storage.max_chunks_up.

If storage.pause_on_chunks_overlimit is enabled (default is off), the input plugin will be paused upon exceeding storage.max_chunks_up. Thus, with this option, storage.max_chunks_up becomes a hard limit for the input. When the input is paused, records will not be ingested until it is resumed. For some inputs, such as TCP and tail, pausing the input will almost certainly lead to log loss. For the tail input, Fluent Bit can save its current offset in the current file it is reading, and pick back up when the input is resumed.

Look for messages in the Fluent Bit log output like:

[input] tail.1 paused (storage buf overlimit
[input] tail.1 resume (storage buf overlimit

Limiting Filesystem space for Chunks

Fluent Bit implements the concept of logical queues: based on its Tag, a Chunk can be routed to multiple destinations. Thus, we keep an internal reference from where a Chunk was created and where it needs to go.

It's common to find cases where if we have multiple destinations for a Chunk, one of the destinations might be slower than the other, or maybe one is generating backpressure and not all of them. In this scenario, how do we limit the amount of filesystem Chunks that we are logically queueing?

Starting from Fluent Bit v1.6, we introduced the new configuration property for output plugins called storage.total_limit_size which limits the number of Chunks that exist in the filesystem for a certain logical output destination. If one of the destinations reaches the storage.total_limit_size, the oldest Chunk from its queue for that logical output destination will be discarded.

Configuration

The storage layer configuration takes place in three areas:

Service Section
Input Section
Output Section

The known Service section configures a global environment for the storage layer, the Input sections define which buffering mechanism to use and the output the limits for the logical filesystem queues.

Service Section Configuration

The Service section refers to the section defined in the main configuration file:

a Service section will look like this:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M

that configuration sets an optional buffering mechanism where the route to the data is /var/log/flb-storage/, it will use normal synchronization mode, without running a checksum and up to a maximum of 5MB of memory when processing backlog data.

Input Section Configuration

Optionally, any Input plugin can configure their storage preference, the following table describes the options available:

The following example configures a service that offers filesystem buffering capabilities and two Input plugins being the first based in filesystem and the second with memory only.

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M

[INPUT]
    name          cpu
    storage.type  filesystem

[INPUT]
    name          mem
    storage.type  memory

Output Section Configuration

If certain chunks are filesystem storage.type based, it's possible to control the size of the logical queue for an output plugin. The following table describes the options available:

The following example create records with CPU usage samples in the filesystem and then they are delivered to Google Stackdriver service limiting the logical queue (buffering) to 5M:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M

[INPUT]
    name                      cpu
    storage.type              filesystem 

[OUTPUT]
    name                      stackdriver
    match                     *
    storage.total_limit_size  5M

If for some reason Fluent Bit gets offline because of a network issue, it will continue buffering CPU samples but just keep a maximum of 5M of the newest data.