1.5 About

What is Fluent Bit?

Fluent Bit is a CNCF sub-project under the umbrella of Fluentd.

Fluent Bit is an open source, multi-platform log processor that aims to be a generic Swiss Army knife for log processing and distribution.

Nowadays the number of sources of information in our environments is ever increasing. Handling data collection at scale is complex, and collecting and aggregating diverse data requires a specialized tool that can deal with:

Different sources of information
Different data formats
Data Reliability
Security
Flexible Routing
Multiple destinations

Fluent Bit has been designed with performance and low resource consumption in mind.

A Brief History of Fluent Bit

Every project has a story

In 2014, the Fluentd team at Treasure Data forecasted the need for a lightweight log processor for constrained environments like Embedded Linux and Gateways. The project aimed to be part of the Fluentd Ecosystem and we called it Fluent Bit, fully open source and available under the terms of the Apache License v2.0.

After the project had been around for some time, it got some traction in the Embedded market, but we also started getting requests for several features from the Cloud community, like more inputs, filters, and outputs. Not long after that, Fluent Bit became one of the preferred solutions to solve the logging challenges in Cloud environments.

Fluentd & Fluent Bit

The Production Grade Ecosystem

Logging and data processing in general can be complex, and substantially more so at scale. That's why Fluentd was born. Now, Fluentd is more than a simple tool. It's a full ecosystem that contains SDKs for different languages and sub-projects like Fluent Bit.

On this page, we will describe the relationship between the Fluentd and Fluent Bit open source projects. As a summary, we can say both are:

Licensed under the terms of Apache License v2.0
Hosted projects by the Cloud Native Computing Foundation (CNCF)
Production Grade solutions: deployed thousands of times every single day, millions per month.
Community driven projects
Widely Adopted by the Industry: trusted by major companies like AWS, Microsoft, Google Cloud and hundreds of others.
Originally created by Treasure Data.

Both projects have a lot of similarities. Fluent Bit is fully designed and built on top of the best ideas of the Fluentd architecture and general design. Choosing which one to use depends on the end-user needs.

The following table describes a comparison in different areas of the projects:

              Fluentd                                                      Fluent Bit
Scope         Containers / Servers                                         Embedded Linux / Containers / Servers
Language      C & Ruby                                                     C
Memory        ~40MB                                                        ~650KB
Performance   High Performance                                             High Performance
Dependencies  Built as a Ruby Gem, it requires a certain number of gems.   Zero dependencies, unless some special plugin requires them.
Plugins       More than 1000 plugins available                             Around 70 plugins available
License       Apache License v2.0                                          Apache License v2.0

Both Fluentd and Fluent Bit can work as Aggregators or Forwarders. They can both complement each other or be used as standalone solutions.

Concepts

Key Concepts

There are a few key concepts that are really important to understand how Fluent Bit operates.

Before diving into Fluent Bit it’s good to get acquainted with some of the key concepts of the service. This document provides a gentle introduction to those concepts and common Fluent Bit terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our log and stream processor.

Event or Record
Filtering
Tag
Timestamp
Match
Structured Message

Event or Record

Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record.

As an example consider the following content of a Syslog file:

Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server
Jan 18 12:52:16 flb dbus-daemon[2243]: [session uid=1000 pid=2243] Successfully activated service 'org.gnome.Terminal'
Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.
Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)

It contains four lines and each of them represents an independent Event, for a total of four Events.

Internally, an Event always has two components (in an array form):

[TIMESTAMP, MESSAGE]

Filtering

Some cases require modifying the content of Events. Altering, enriching or dropping Events is called Filtering.

There are many use cases where Filtering is required, for example:

Append specific information to the Event, like an IP address or metadata.
Select a specific piece of the Event content.
Drop Events that match a certain pattern.

Tag

Every Event that gets into Fluent Bit gets assigned a Tag. This tag is an internal string that is used in a later stage by the Router to decide which Filter or Output phase it must go through.

Most tags are assigned manually in the configuration. If a tag is not specified, Fluent Bit will assign the name of the Input plugin instance that generated the Event.

The only input plugin that doesn't assign Tags is Forward input. This plugin speaks the Fluentd wire protocol called Forward where every Event already comes with a Tag associated. Fluent Bit will always use the incoming Tag set by the client.

A Tagged record must always have a Matching rule. To learn more about Tags and Matches check the Routing section.
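
As a rough sketch of the two cases described above, the following configuration sets a Tag explicitly on one input and leaves it out on a Forward input. The tag name app.logs and the log path are illustrative assumptions; Listen and Port are properties of the Forward input, shown here with its usual listening address and port:

```text
# Explicit Tag: every Event read from this file carries the Tag app.logs
[INPUT]
    Name  tail
    Path  /var/log/app.log
    Tag   app.logs

# Forward input: no Tag is set here, the Tag sent by the client is used
[INPUT]
    Name    forward
    Listen  0.0.0.0
    Port    24224
```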

Timestamp

The Timestamp represents the time when an Event was created. Every Event has an associated Timestamp. The Timestamp is a numeric value with a fractional part, in the format:

SECONDS.NANOSECONDS

Seconds

It is the number of seconds that have elapsed since the Unix epoch.

Nanoseconds

Fractional part of the second, expressed in nanoseconds (one thousand-millionth of a second).

A timestamp always exists, either set by the Input plugin or discovered through a data parsing process.

Match

Fluent Bit allows you to deliver your collected and processed Events to one or multiple destinations. This is done through a routing phase. A Match represents a simple rule to select the Events whose Tags match a defined pattern.

To learn more about Tags and Matches check the Routing section.

Structured Messages

Source events may or may not have a structure. A structure defines a set of keys and values inside the Event message. As an example, consider the following two messages:

Unstructured message

"Project Fluent Bit created on 1398289291"

Structured Message

{"project": "Fluent Bit", "created": 1398289291}

At a low level both are just an array of bytes, but the Structured message defines keys and values. Having a structure helps to implement faster operations on data modifications.

Fluent Bit always handles every Event message as a structured message. For performance reasons, we use a binary serialization data format called MessagePack.

Consider MessagePack as a binary version of JSON on steroids.

Buffering

Performance and Data Safety

When Fluent Bit processes data, it uses the system memory (heap) as a primary and temporary place to store the records before they get delivered. The records are processed in this private memory area.

Buffering refers to the ability to store the records somewhere and, while they are processed and delivered, still be able to store more. Buffering in memory is the fastest mechanism, but there are certain scenarios where the mechanism requires special strategies to deal with backpressure, data safety or reduced memory consumption by the service in constrained environments.

Network failures or latency on third party services are pretty common, and in scenarios where we cannot deliver data as fast as we receive new data to process, we will likely face backpressure.

Our buffering strategies are designed to solve problems associated with backpressure and general delivery failures.

As buffering strategies, Fluent Bit offers a primary buffering mechanism in memory and an optional secondary one using the file system. With this hybrid solution you can safely adjust to any use case and keep high performance while processing your data.

The two mechanisms are not mutually exclusive: when data is ready to be processed or delivered it will always be in memory, while other data in the queue might be in the file system until it is ready to be processed and moved up to memory.

To learn more about the buffering configuration in Fluent Bit, please jump to the Buffering & Storage section.
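
A minimal sketch of the hybrid setup, assuming the storage.path and storage.type properties covered in the Buffering & Storage section. The directory and the log path below are placeholders:

```text
[SERVICE]
    Flush         1
    # Directory where filesystem buffers are kept
    storage.path  /var/lib/fluent-bit/storage/

[INPUT]
    Name          tail
    Path          /var/log/app.log
    # Use the file system as secondary buffer for this input
    storage.type  filesystem

[OUTPUT]
    Name   stdout
    Match  *
```

Inputs that do not set storage.type keep buffering in memory only.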

Data Pipeline

Input

The way to gather data from your sources

Fluent Bit provides different Input Plugins to gather information from different sources. Some of them just collect data from log files, while others can gather metrics information from the operating system. There are many plugins for different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.

For more details, please refer to the Input Plugins section.
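
For illustration, the sketch below loads two input plugins, each creating its own instance with independent properties. The tail path and the tag names are assumptions chosen for the example:

```text
# Instance 1: collect CPU usage metrics from the operating system
[INPUT]
    Name  cpu
    Tag   metrics.cpu

# Instance 2: follow a log file and emit one Event per line
[INPUT]
    Name  tail
    Path  /var/log/syslog
    Tag   logs.syslog
```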

Parser

Convert Unstructured to Structured messages

Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally, we want the Input Plugins to give a structure to the incoming data as soon as it is collected.

The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:

192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above log line is a raw string without format. Ideally, we would like to give it a structure that can be processed easily later. With the proper configuration, the log entry could be converted to:

{
  "host":    "192.168.2.20",
  "user":    "-",
  "method":  "GET",
  "path":    "/cgi-bin/try/",
  "code":    "200",
  "size":    "3395",
  "referer": "",
  "agent":   ""
}

Parsers are fully configurable and are independently and optionally handled by each input plugin. For more details, please refer to the Parsers section.
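
A parser definition for this kind of entry could be sketched as below. The regular expression is modeled on the apache2 parser commonly shipped in Fluent Bit's parsers.conf; treat it as an illustration and check it against the Parsers section before relying on it:

```text
[PARSER]
    Name         apache2
    Format       regex
    Regex        ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>.*)")?$
    # The captured "time" field is used as the Event Timestamp
    Time_Key     time
    Time_Format  %d/%b/%Y:%H:%M:%S %z
```

An input such as tail would then reference the parser by name through its Parser property, with the file holding the definition loaded through Parsers_File in the SERVICE section.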

Filter

Modify, Enrich or Drop your records

In production environments we want to have full control of the data we are collecting. Filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each available filter can be used to match, exclude or enrich your logs with some specific metadata.

We support many filters. A common use case for filtering is Kubernetes deployments, where every Pod log needs to get the proper metadata associated with it.

Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.

For more details about the Filters available and their usage, please refer to the Filters section.
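
As a hedged illustration of dropping and enriching records, the sketch below chains two filter instances. It assumes the grep and record_modifier filters; the tag pattern, key names and values are made up for the example:

```text
# Drop every Event whose "log" field matches the pattern "healthcheck"
[FILTER]
    Name     grep
    Match    app.*
    Exclude  log healthcheck

# Append a static key/value pair to the remaining Events
[FILTER]
    Name    record_modifier
    Match   app.*
    Record  hostname web-01
```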

Buffer

Data processing with reliability

Previously defined in the Buffering concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, either using the primary in-memory model or using the filesystem based mode.

The buffer phase already contains the data in an immutable state, meaning, no other filter can be applied.

Note that buffered data is not raw text, it's in Fluent Bit's internal binary representation.

Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.

Router

Create flexible routing rules

Routing is a core feature that allows you to route your data through Filters and finally to one or multiple destinations. The router relies on the concept of Tags and Matching rules.

There are two important concepts in Routing:

Tag
Match

When the data is generated by the input plugins, it comes with a Tag (most of the time the Tag is configured manually). The Tag is a human-readable indicator that helps to identify the data source.

In order to define where the data should be routed, a Match rule must be specified in the output configuration.

Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:
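
A minimal configuration along these lines might look as follows. The tag names my_cpu and my_mem are illustrative assumptions, and a real es output would typically also set connection properties such as Host and Port:

```text
[INPUT]
    Name  cpu
    Tag   my_cpu

[INPUT]
    Name  mem
    Tag   my_mem

# CPU metrics go to Elasticsearch
[OUTPUT]
    Name   es
    Match  my_cpu

# Memory metrics go to the standard output
[OUTPUT]
    Name   stdout
    Match  my_mem
```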

Note: the above is a simple example demonstrating how Routing is configured.

Routing works automatically by reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match any rule at routing time, the data is deleted.

Routing with Wildcard

Routing is flexible enough to support wildcards in the Match pattern. The example below defines a common destination for both sources of data:
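
A sketch of such a configuration, assuming two inputs tagged my_cpu and my_mem (illustrative names):

```text
[INPUT]
    Name  cpu
    Tag   my_cpu

[INPUT]
    Name  mem
    Tag   my_mem

# A single output receives the Events from both inputs
[OUTPUT]
    Name   stdout
    Match  my_*
```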

The match rule is set to my_* which means it will match any Tag that starts with my_.

Output

Destinations for your data: databases, cloud services and more!

The output interface allows us to define destinations for the data. Common destinations are remote services, the local file system, or a standard interface shared with other programs. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.

For more details, please refer to the Output Plugins section.
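
As an example of instance properties, the sketch below configures an Elasticsearch output. Host, Port and Index are properties of the es plugin; the values shown are placeholders for a local test setup:

```text
[OUTPUT]
    Name   es
    Match  *
    Host   127.0.0.1
    Port   9200
    Index  fluent-bit
```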

Installation

Upgrade Notes

This article covers the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of.

For more details about changes on each release please refer to the Official Release Notes.

Fluent Bit v1.5

The migration from v1.4 to v1.5 is pretty straightforward.

If you enabled keepalive mode in your configuration, note that this configuration property has been renamed to net.keepalive. Keepalive is now enabled by default for all network I/O. To learn more about this and other associated configuration properties, read the Networking Administration section.
If you use the Elasticsearch output plugin, note that the default value of type changed from flb_type to _doc. Many versions of Elasticsearch will tolerate this, but ES v5.6 through v6.1 require a type without a leading underscore. See the Elasticsearch output plugin documentation FAQ entry for more.
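
To make both notes concrete, the sketch below shows an Elasticsearch output adjusted for v1.5. The host and port are placeholders, and the Type line is only needed for the Elasticsearch versions mentioned above:

```text
[OUTPUT]
    Name           es
    Match          *
    Host           127.0.0.1
    Port           9200
    # Renamed from keepalive in v1.5; enabled by default, shown for clarity
    net.keepalive  on
    # For ES v5.6 through v6.1, set a type without a leading underscore
    Type           flb_type
```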

Fluent Bit v1.4

If you are migrating from Fluent Bit v1.3, there are no breaking changes. Just new exciting features to enjoy :)

Fluent Bit v1.3

If you are migrating from Fluent Bit v1.2 to v1.3, there are no breaking changes. If you are upgrading from an older version please review the incremental changes below.

Fluent Bit v1.2

Docker, JSON, Parsers and Decoders

In Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding; hence, when parsing Docker logs it is no longer necessary to use decoders. The new Docker parser looks like this:

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On

Note: again, do not use decoders.

Kubernetes Filter

We have also improved how the Kubernetes Filter handles the stringified log message. If the option Merge_Log is enabled, it will try to handle the log content as a JSON map and, if that succeeds, it will add the keys to the root map.

In addition, we have fixed and improved the option called Merge_Log_Key. If a log merge succeeds, all new keys will be packaged under the key specified by this option. A suggested configuration is as follows:

[FILTER]
    Name             kubernetes
    Match            kube.*
    Kube_Tag_Prefix  kube.var.log.containers.
    Merge_Log        On
    Merge_Log_Key    log_processed

As an example, if the original log content is the following map:

{"key1": "val1", "key2": "val2"}

the final record will be composed as follows:

{
    "log": "{\"key1\": \"val1\", \"key2\": \"val2\"}",
    "log_processed": {
        "key1": "val1",
        "key2": "val2"
    }
}

Fluent Bit v1.1

If you are upgrading from Fluent Bit <= 1.0.x, you should take into consideration the following relevant changes when switching to the Fluent Bit v1.1 series:

Kubernetes Filter

We introduced a new configuration property called Kube_Tag_Prefix to help Tag prefix resolution and address an unexpected behavior that landed in previous versions.

During the 1.0.x release cycle, a commit in the Tail input plugin changed the default behavior of how the Tag was composed when using wildcard expansion, breaking compatibility with other services. Consider the following configuration example:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

The expected behavior is that Tag will be expanded to:

kube.var.log.containers.apache.log

but the change introduced in the 1.0 series switched from the absolute path to the base file name only:

kube.apache.log

In the Fluent Bit v1.1 release we restored the original default behavior, and now the Tag is composed using the absolute path of the monitored file.

Having the absolute path in the Tag is relevant for routing and flexible configuration, and it also helps to keep compatibility with Fluentd behavior.

This behavior switch in the Tail input plugin affects how the Kubernetes Filter operates. When the filter is used, it needs to perform a local metadata lookup based on the file names when Tail is the source. With the new Kube_Tag_Prefix option you can specify the prefix used in the Tail input plugin. For the configuration example above, the new configuration will look as follows:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

[FILTER]
    Name             kubernetes
    Match            *
    Kube_Tag_Prefix  kube.var.log.containers.

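So the proper value for Kube_Tag_Prefix must be composed of the Tag prefix set in the Tail input plugin plus the monitored directory path, with slashes replaced by dots. In the example above, the prefix kube. plus the directory /var/log/containers/ yields kube.var.log.containers.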

Supported Platforms

The following operating systems and architectures are supported in Fluent Bit.

From an architecture support perspective, Fluent Bit is fully functional on x86_64, Arm64v8 and Arm32v7 based processors.

Fluent Bit can also work on OSX and *BSD systems, but not all plugins will be available on all platforms. Official support will be expanding based on community demand.

Requirements

Fluent Bit uses very little CPU and memory. It is compatible with most x86, x86_64, arm32v7 and arm64v8 based platforms. In order to build it, you need the following components in your system:

Compiler: GCC or clang
CMake
Flex & Bison: only if you enable the Stream Processor or Record Accessor feature (both enabled by default)

There are no other dependencies in the core. For certain features that depend on third party components, like output plugins with special backend libraries (e.g. kafka), those components are included in the main source code repository.

Sources

Download Source Code

Stable

For production systems, we strongly suggest that you always get the latest stable release from our web site, where the official tarballs (.tar.gz) are available.

Development

People who aim to contribute to the project by testing or extending the code base can get the development version from our Git repository.

Note that our master branch is where the development of Fluent Bit happens. Since it's a development version, expect issues when compiling or at run time.

We encourage everybody to help us test every development version; in the end, this is what will become stable.

Build with Static Configuration

In normal operation mode, Fluent Bit can be configured through text files or using specific arguments in the command line. While this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.

Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.

Getting Started

Requirements

The following steps assume you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the Build and Install section.

Configuration Directory

In your file system, prepare a specific directory that will be used as an entry point for the build system to look up and parse the configuration files. This directory must contain at least one configuration file called fluent-bit.conf containing the required SERVICE, INPUT and OUTPUT sections. As an example, create a new fluent-bit.conf file with the following content:

[SERVICE]
    Flush     1
    Daemon    off
    Log_Level info

[INPUT]
    Name      cpu

[OUTPUT]
    Name      stdout
    Match     *

The configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.

Build with Custom Configuration

Inside the Fluent Bit source code, get into the build/ directory and run CMake, appending the FLB_STATIC_CONF option pointing to the configuration directory just created, e.g.:

$ cd fluent-bit/build/
$ cmake -DFLB_STATIC_CONF=/path/to/my/confdir/ ../

then build it:

$ make

At this point the generated fluent-bit binary is ready to run without any further configuration:

$ bin/fluent-bit 
Fluent-Bit v0.15.0
Copyright (C) Treasure Data

[2018/10/19 15:32:31] [ info] [engine] started (pid=15186)
[0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

Linux Packages

Amazon Linux

Install on Amazon Linux 2

Fluent Bit is distributed as the td-agent-bit package and is available for the latest Amazon Linux 2. The following architectures are supported:

x86_64
aarch64 / arm64v8

Configure Yum

We provide td-agent-bit through a Yum repository. In order to add the repository reference to your system, please add a new file called td-agent-bit.repo in /etc/yum.repos.d/ with the following content:

[td-agent-bit]
name = TD Agent Bit
baseurl = https://packages.fluentbit.io/amazonlinux/2/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Note: we encourage you to always enable gpgcheck for security reasons. All our packages are signed.

The GPG Key fingerprint is F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Install

Once your repository is configured, run the following command to install it:

$ yum install td-agent-bit

The next step is to instruct systemd to start the service:

$ sudo service td-agent-bit start

If you do a status check, you should see output similar to this:

$ service td-agent-bit status
Redirecting to /bin/systemctl status  td-agent-bit.service
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/usr/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (td-agent-bit)
   CGroup: /system.slice/td-agent-bit.service
           └─3820 /opt/td-agent-bit/bin/td-agent-bit -c etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/messages file.

Redhat / CentOS

Install on Redhat / CentOS

Fluent Bit is distributed as the td-agent-bit package and is available for the latest stable CentOS system. The following architectures are supported:

x86_64
aarch64 / arm64v8

Configure Yum

We provide td-agent-bit through a Yum repository. In order to add the repository reference to your system, please add a new file called td-agent-bit.repo in /etc/yum.repos.d/ with the following content:

[td-agent-bit]
name = TD Agent Bit
baseurl = https://packages.fluentbit.io/centos/7/$basearch/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1

Note: we encourage you to always enable gpgcheck for security reasons. All our packages are signed.

The GPG Key fingerprint is F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Install

Once your repository is configured, run the following command to install it:

$ yum install td-agent-bit

The next step is to instruct systemd to start the service:

$ sudo service td-agent-bit start

If you do a status check, you should see output similar to this:

$ service td-agent-bit status
Redirecting to /bin/systemctl status  td-agent-bit.service
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/usr/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-07-07 02:08:01 BST; 9s ago
 Main PID: 3820 (td-agent-bit)
   CGroup: /system.slice/td-agent-bit.service
           └─3820 /opt/td-agent-bit/bin/td-agent-bit -c etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/messages file.

Debian

Fluent Bit is distributed as td-agent-bit package and is available for the latest (and old) stable Debian systems: Buster, Stretch and Jessie.

Server GPG key

The first step is to add our server GPG key to your keyring so that you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Debian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:

Debian 10 (Buster)

deb https://packages.fluentbit.io/debian/buster buster main

Debian 9 (Stretch)

deb https://packages.fluentbit.io/debian/stretch stretch main

Debian 8 (Jessie)

deb https://packages.fluentbit.io/debian/jessie jessie main

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD Agent Bit

Using the following apt-get command you can now install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

The next step is to instruct systemd to start the service:

$ sudo service td-agent-bit start

If you do a status check, you should see output similar to this:

$ sudo service td-agent-bit status
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (td-agent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/td-agent-bit.service
           └─6739 /opt/td-agent-bit/bin/td-agent-bit -c /etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Ubuntu

Fluent Bit is distributed as td-agent-bit package and is available for the latest stable Ubuntu system: Focal Fossa.

Server GPG key

The first step is to add our server GPG key to your keyring so that you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Ubuntu, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:

Ubuntu 20.04 LTS (Focal Fossa)

Ubuntu 18.04 LTS (Bionic Beaver)

Ubuntu 16.04 LTS (Xenial Xerus)

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD-Agent Bit

Using the following apt-get command you can now install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

The next step is to instruct systemd to start the service:

$ sudo service td-agent-bit start

If you do a status check, the service should be reported as active (running).

The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Raspbian / Raspberry Pi

Fluent Bit is distributed as the td-agent-bit package and is available for the Raspberry Pi, specifically for the Raspbian distribution. The following versions are supported:

Raspbian Buster (10)
Raspbian Stretch (9)
Raspbian Jessie (8)

Server GPG key

The first step is to add our server GPG key to your keyring so that you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Debian and derivative systems such as Raspbian, you need to add our APT server entry to your sources lists. Please add the following content at the bottom of your /etc/apt/sources.list file:

Raspbian 10 (Buster)

Raspbian 9 (Stretch)

Raspbian 8 (Jessie)

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD-Agent Bit

Using the following apt-get command you can now install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

The next step is to instruct systemd to start the service:

$ sudo service td-agent-bit start

If you do a status check, the service should be reported as active (running).

The default configuration of td-agent-bit collects CPU usage metrics and sends the records to the standard output. You can see the outgoing data in your /var/log/syslog file.

Containers on AWS

AWS maintains a distribution of Fluent Bit combining the latest official release with a set of Go Plugins for sending logs to AWS services. AWS and Fluent Bit are working together to rewrite their plugins for inclusion in the official Fluent Bit distribution.

Plugins

Currently, the AWS for Fluent Bit image contains Go Plugins for:

Versions and Regional Repositories

AWS vends their container image via Docker Hub, and a set of highly available regional Amazon ECR repositories. For more information, see the AWS for Fluent Bit GitHub repo.

The AWS for Fluent Bit image uses a custom versioning scheme because it contains multiple projects. To see what each release contains, check out the release notes on GitHub.

SSM Public Parameters

AWS vends SSM Public Parameters with the regional repository link for each image. These parameters can be queried by any AWS account.

To see a list of available version tags in a given region, run the following command:

$ aws ssm get-parameters-by-path --region eu-central-1 --path /aws/service/aws-for-fluent-bit/ --query 'Parameters[*].Name'

To see the ECR repository URI for a given image tag in a given region, run the following:

$ aws ssm get-parameter --region ap-northeast-1 --name /aws/service/aws-for-fluent-bit/2.0.0

You can use these SSM public parameters as parameters in your CloudFormation templates:

Parameters:
  FireLensImage:
    Description: Fluent Bit image for the FireLens Container
    Type: AWS::SSM::Parameter::Value<String>
    Default: /aws/service/aws-for-fluent-bit/latest

Amazon EC2

Learn how to install Fluent Bit and the AWS output plugins on Amazon Linux 2 via AWS Systems Manager.

Kubernetes

Kubernetes Production Grade Log Processor

Fluent Bit is a lightweight and extensible Log Processor that comes with full support for Kubernetes:

Process Kubernetes container logs from the file system or Systemd/Journald.
Enrich logs with Kubernetes Metadata.
Centralize your logs in third party storage services like Elasticsearch, InfluxDB, HTTP, etc.

Concepts

Before getting started it is important to understand how Fluent Bit will be deployed. Kubernetes manages a cluster of nodes, so our log agent tool will need to run on every node to collect logs from every Pod. Hence, Fluent Bit is deployed as a DaemonSet (a Pod that runs on every node of the cluster).

When Fluent Bit runs, it will read, parse and filter the logs of every Pod and will enrich each entry with the following information (metadata):

Pod Name
Pod ID
Container Name
Container ID
Labels
Annotations

To obtain these information, a built-in filter plugin called kubernetes talks to the Kubernetes API Server to retrieve relevant information such as the pod_id, labels and annotations, other fields such as pod_name, container_id and container_name are retrieved locally from the log file names. All of this is handled automatically, no intervention is required from a configuration aspect.

Our Kubernetes Filter plugin is inspired by the Fluentd Kubernetes Metadata Filter written by Jimmi Dyson.

Installation

Fluent Bit must be deployed as a DaemonSet so that it is available on every node of your Kubernetes cluster. To get started, run the following commands to create the namespace, service account and role setup:

$ kubectl create namespace logging
$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role.yaml
$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-binding.yaml

The next step is to create a ConfigMap that will be used by our Fluent Bit DaemonSet:

$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-configmap.yaml

Note for Kubernetes < v1.16

For Kubernetes versions older than v1.16, the DaemonSet resource is not available on apps/v1; the resource is available on apiVersion: extensions/v1beta1. Our current DaemonSet YAML files use the new apiVersion.

If you are using an older Kubernetes version, manually grab a copy of your DaemonSet YAML file and replace the value of apiVersion from:

apiVersion: apps/v1

to:

apiVersion: extensions/v1beta1

You can read more about this deprecation in the Kubernetes v1.14 Changelog:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#deprecations

Fluent Bit to Elasticsearch

Fluent Bit DaemonSet ready to be used with Elasticsearch on a normal Kubernetes Cluster:

$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds.yaml

Fluent Bit to Elasticsearch on Minikube

If you are using Minikube for testing purposes, use the following alternative DaemonSet manifest:

$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds-minikube.yaml

Details

The default configuration of Fluent Bit makes sure of the following:

Consume all container logs from the running Node.
The Tail input plugin will not append more than 5MB into the engine until the data is flushed to the Elasticsearch backend. This limit aims to provide a workaround for backpressure scenarios.
The Kubernetes filter will enrich the logs with Kubernetes metadata, specifically labels and annotations. The filter only queries the API Server when it cannot find the cached info; otherwise it uses the cache.
The default backend in the configuration is Elasticsearch, set by the Elasticsearch Output Plugin. It uses the Logstash format to ingest the logs. If you need a different Index and Type, please refer to the plugin options and make your own adjustments.
There is an option called Retry_Limit set to False, which means that if Fluent Bit cannot flush the records to Elasticsearch it will retry indefinitely until it succeeds.

Yocto / Embedded Linux

The Fluent Bit source code provides Bitbake recipes to configure, build and package the software for a Yocto based image. Note that the specific steps for using these recipes in your Yocto environment (Poky) are out of the scope of this documentation.

We distribute two main recipes, one for testing/dev purposes and another with the latest stable release.

Version  Recipe  Description
devel            Build Fluent Bit from GIT master. This recipe aims to be used for development and testing purposes only.
v1.5.7           Build latest stable version of Fluent Bit.

It's strongly recommended to always use the stable release of Fluent Bit recipe and not the one from GIT master for production deployments.

Fluent Bit and other architectures

Fluent Bit >= v1.1.x fully supports x86_64, x86, arm32v7 and arm64v8.

Windows

Fluent Bit is distributed as td-agent-bit package for Windows. Fluent Bit has two flavours of Windows installers: a ZIP archive (for quick testing) and an EXE installer (for system installation).

Installation Packages

The latest stable version is 1.5.7:

INSTALLERS

SHA256 CHECKSUMS

907514b34ea8c8a59209f70d7d5ec8b0ad09cfa3e7cc850bc64dcbac992b89c6

ba388f89a8519b221f6ea23151df7070cc95088486d5ed037b33a36b51bc95ee

2d48534ed3dca1ec6dd97cc4b1bed4bb226c3aa5e8240b29be3cdc0cd7e9cec8

3fab0f852a079861b946cd8785706b650d1f6ada4389f85ff3e50f98cb4f62d3

To check the integrity, use the Get-FileHash cmdlet in PowerShell and compare the result with the checksum listed above.

PS> Get-FileHash td-agent-bit-1.5.7-win32.exe
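On machines where PowerShell is not at hand, the same check can be sketched in Python with the standard hashlib module. The helper names below are hypothetical, and the expected checksum must be taken from the list above for the exact file you downloaded.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare case-insensitively, since tools differ in the letter case they print."""
    return sha256_of(path).lower() == expected_hex.strip().lower()
```

A call such as matches_checksum("td-agent-bit-1.5.7-win32.exe", published_checksum) returns True only when the download is intact.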

Installing from ZIP archive

Download a ZIP archive from the download page. There are installers for 32-bit and 64-bit environments, so choose one suitable for your environment.

Then you need to expand the ZIP archive. You can do this by clicking "Extract All" in Explorer, or if you're using PowerShell, you can use the Expand-Archive cmdlet.

PS> Expand-Archive td-agent-bit-1.5.7-win64.zip

The ZIP package contains the following set of files.

td-agent-bit
├── bin
│   ├── fluent-bit.dll
│   └── fluent-bit.exe
├── conf
│   ├── fluent-bit.conf
│   ├── parsers.conf
│   └── plugins.conf
└── include
    ├── fluent-bit
    │   ├── flb_api.h
    │   ├── ...
    │   └── flb_worker.h
    └── fluent-bit.h

Now, launch cmd.exe or PowerShell on your machine, and execute fluent-bit.exe as follows.

PS> .\bin\fluent-bit.exe -i dummy -o stdout

If you see the following output, it's working fine!

PS> .\bin\fluent-bit.exe  -i dummy -o stdout
Fluent Bit v1.5.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2019/06/28 10:13:04] [ info] [storage] initializing...
[2019/06/28 10:13:04] [ info] [storage] in-memory
[2019/06/28 10:13:04] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2019/06/28 10:13:04] [ info] [engine] started (pid=10324)
[2019/06/28 10:13:04] [ info] [sp] stream processor started
[0] dummy.0: [1561684385.443823800, {"message"=>"dummy"}]
[1] dummy.0: [1561684386.428399000, {"message"=>"dummy"}]
[2] dummy.0: [1561684387.443641900, {"message"=>"dummy"}]
[3] dummy.0: [1561684388.441405800, {"message"=>"dummy"}]

To halt the process, press CTRL-C in the terminal.

Installing from EXE installer

Download an EXE installer from the download page. It has both 32-bit and 64-bit builds. Choose the one that is suitable for you.

Then, double-click the EXE installer you've downloaded. The installation wizard will start automatically.

Click Next and proceed. By default, Fluent Bit is installed into C:\Program Files\td-agent-bit\, so you should be able to launch fluent-bit as follows after installation. The path contains a space, so it must be quoted and invoked with the call operator (&) in PowerShell.

PS> & "C:\Program Files\td-agent-bit\bin\fluent-bit.exe" -i dummy -o stdout

Windows Service Support

Windows services are equivalent to "daemons" in UNIX (i.e. long-running background processes). Since v1.5.0, Fluent Bit has native support for Windows Service.

Suppose you have the following installation layout:

C:\fluent-bit\
├── conf
│   ├── fluent-bit.conf
│   └── parsers.conf
└── bin
    ├── fluent-bit.dll
    └── fluent-bit.exe

To register Fluent Bit as a Windows service, execute the following command in Command Prompt. Note that a single space is required after binpath=.

% sc.exe create fluent-bit binpath= "\fluent-bit\bin\fluent-bit.exe -c \fluent-bit\conf\fluent-bit.conf"

Now Fluent Bit can be started and managed as a normal Windows service.

% sc.exe start fluent-bit
% sc.exe query fluent-bit
SERVICE_NAME: fluent-bit
    TYPE               : 10  WIN32_OWN_PROCESS
    STATE              : 4 Running
    ...

To halt the Fluent Bit service, just execute the "stop" command.

% sc.exe stop fluent-bit

Administration

Configuring Fluent Bit

Variables

Fluent Bit supports the usage of environment variables in any value associated to a key when using a configuration file.

The variables are case sensitive and can be used in the following format:

${MY_VARIABLE}

When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve its value.
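The substitution step can be pictured with a short Python sketch. This is an illustration of the idea only, not the configuration reader's actual code; in particular, leaving unknown variables untouched is an assumption of this example.

```python
import re

# Matches ${NAME}; names are compared exactly, so they are case sensitive.
VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve_variables(text, environment):
    """Replace every ${NAME} in text with environment[NAME]."""
    def lookup(match):
        name = match.group(1)
        # Assumption for this sketch: unknown variables are left as they are.
        return environment.get(name, match.group(0))
    return VARIABLE.sub(lookup, text)


print(resolve_variables("Name ${MY_OUTPUT}", {"MY_OUTPUT": "stdout"}))  # prints "Name stdout"
```

In a real deployment the second argument would be the process environment (os.environ in Python).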

Example

Create the following configuration file (fluent-bit.conf):

Open a terminal and set the environment variable:

$ export MY_OUTPUT=stdout

The above command sets the variable MY_OUTPUT to the value 'stdout'.

Run Fluent Bit with the recently created configuration file:

As you can see the service worked properly as the configuration was valid.

Commands

Configuration files must be flexible enough for any deployment need, but they must keep a clean and readable format.

Fluent Bit Commands extend a configuration file with specific built-in features. The commands available as of the Fluent Bit 0.12 series are:

Command   Prototype      Description
@INCLUDE  @INCLUDE FILE  Include a configuration file
@SET      @SET KEY=VAL   Set a configuration variable

@INCLUDE Command

Configuring a logging pipeline might lead to an extensive configuration file. In order to maintain a human-readable configuration, it's suggested to split the configuration into multiple files.

The @INCLUDE command allows the configuration reader to include an external configuration file, e.g.:

[SERVICE]
    Flush 1

@INCLUDE inputs.conf
@INCLUDE outputs.conf

The above example defines the main service configuration file and also includes two files to continue the configuration:

inputs.conf

[INPUT]
    Name cpu
    Tag  mycpu

[INPUT]
    Name tail
    Path /var/log/*.log
    Tag  varlog.*

outputs.conf

[OUTPUT]
    Name   stdout
    Match  mycpu

[OUTPUT]
    Name            es
    Match           varlog.*
    Host            127.0.0.1
    Port            9200
    Logstash_Format On

Note that regardless of the order of inclusion, Fluent Bit will ALWAYS respect the following order:

Service
Inputs
Filters
Outputs
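To make the ordering rule concrete, here is a small Python sketch that rearranges section names in the fixed order listed above, however the included files were ordered. The function and constant names are hypothetical, and the sketch models only the ordering, not the real configuration loader.

```python
# Fixed processing order taken from the list above.
SECTION_ORDER = {"SERVICE": 0, "INPUT": 1, "FILTER": 2, "OUTPUT": 3}


def processing_order(sections):
    """Sort section names by type; sorted() is stable, so sections of the
    same type keep the relative order in which they were read."""
    return sorted(sections, key=lambda name: SECTION_ORDER[name])


# Sections as they would be read if outputs.conf were included before inputs.conf.
as_read = ["SERVICE", "OUTPUT", "OUTPUT", "INPUT", "INPUT"]
print(processing_order(as_read))
```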

@SET Command

Fluent Bit supports configuration variables. One way to expose these variables to Fluent Bit is by setting a shell environment variable; the other is through the @SET command.

The @SET command can only be used at the root level of each line, meaning it cannot be used inside a section, e.g.:

@SET my_input=cpu
@SET my_output=stdout

[SERVICE]
    Flush 1

[INPUT]
    Name ${my_input}

[OUTPUT]
    Name ${my_output}

Upstream Servers

Fluent Bit commonly connects to external services to deliver logs over the network. Connecting to a single node (host) is enough for most use cases, but there are other scenarios where balancing across different nodes is required. The Upstream feature provides this capability.

An Upstream defines a set of nodes that will be targeted by an output plugin. By the nature of the implementation, an output plugin must explicitly support the Upstream feature. The following plugin(s) have Upstream support:

Forward

The current balancing mode implemented is round-robin.

Configuration

To define an Upstream, create a specific configuration file that contains an UPSTREAM section and one or multiple NODE sections. The following table describes the properties associated with each section. Note that all of them are mandatory:

Nodes and specific plugin configuration

A Node might contain additional configuration keys required by the plugin, which gives the output plugin enough flexibility. A common use case is the Forward output, which requires a shared key if TLS is enabled (more details in the example below).

Nodes and TLS (Transport Layer Security)

In addition to the properties defined in the table above, the network operations against a defined node can optionally be done over TLS for encryption and certificate handling.

The available TLS options are described in the TLS documentation and can be added to any Node section.

Configuration File Example

The following example defines an Upstream called forward-balancing, intended to be used by the Forward output plugin. It registers three Nodes:

node-1: connects to 127.0.0.1:43000
node-2: connects to 127.0.0.1:44000
node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called shared_key.

Note that every Upstream definition must exist in its own configuration file in the file system. Adding multiple Upstreams in the same file or different files is not allowed.
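The round-robin balancing mode mentioned earlier can be sketched in a few lines of Python over the three nodes listed above. This only illustrates the selection order; the names are hypothetical and it does not reflect how Fluent Bit stores or connects to nodes.

```python
from itertools import cycle

# The three nodes from the example above: (name, host, port).
NODES = [
    ("node-1", "127.0.0.1", 43000),
    ("node-2", "127.0.0.1", 44000),
    ("node-3", "127.0.0.1", 45000),
]


def round_robin(nodes):
    """Return an endless iterator that visits the nodes in order, then wraps around."""
    return cycle(nodes)


picker = round_robin(NODES)
first_five = [next(picker)[0] for _ in range(5)]
print(first_five)  # ['node-1', 'node-2', 'node-3', 'node-1', 'node-2']
```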

Unit Sizes

Certain configuration directives in Fluent Bit refer to unit sizes, such as when defining the size of a buffer or specific limits. We can find these in plugins like Tail Input, Forward Input or in generic properties like Mem_Buf_Limit.

Starting from Fluent Bit v0.11.10, all unit sizes have been standardized across the core and plugins. The following table describes the options that can be used and what they mean:

Suffix        Description                                                                 Example
(none)        When a suffix is not specified, the value given is assumed to be in bytes.  Specifying a value of 32000 means 32000 bytes.
k, K, KB, kb  Kilobyte: a unit of memory equal to 1,000 bytes.                            32k means 32000 bytes.
m, M, MB, mb  Megabyte: a unit of memory equal to 1,000,000 bytes.                        1M means 1000000 bytes.
g, G, GB, gb  Gigabyte: a unit of memory equal to 1,000,000,000 bytes.                    1G means 1000000000 bytes.
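The table translates directly into a small conversion routine. The Python sketch below follows the multipliers given in the table; the function name is hypothetical, and lower-casing the suffix (which also accepts mixed-case spellings such as "Kb") is a simplification of this example rather than documented behaviour.

```python
import re

# Multipliers as given in the table above (decimal units).
MULTIPLIERS = {
    "": 1,
    "k": 1000, "kb": 1000,
    "m": 1000 ** 2, "mb": 1000 ** 2,
    "g": 1000 ** 3, "gb": 1000 ** 3,
}
SIZE = re.compile(r"^\s*(\d+)\s*([A-Za-z]*)\s*$")


def size_to_bytes(value):
    """Convert a size string such as '32k' or '1M' to a number of bytes."""
    match = SIZE.match(value)
    if match is None:
        raise ValueError(f"not a valid size: {value!r}")
    number, suffix = match.groups()
    try:
        return int(number) * MULTIPLIERS[suffix.lower()]
    except KeyError:
        raise ValueError(f"unknown unit suffix: {suffix!r}") from None


print(size_to_bytes("32k"))  # 32000
```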

Buffering & Storage

The end-goal of Fluent Bit is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases, and one of the critical pieces is the ability to do buffering: a mechanism to place processed data into a temporary location until it is ready to be shipped.

By default, when Fluent Bit processes data it uses memory as the primary, temporary place to store the records, but there are certain scenarios where it would be ideal to have a persistent buffering mechanism based on the filesystem to provide aggregation and data safety capabilities.

Starting with Fluent Bit v1.0, we introduced a new storage layer that can work either in memory or in the file system. Input plugins can be configured to use one or the other on demand at start time.

Configuration

The storage layer configuration takes place in two areas:

Service Section
Input Section

The Service section configures a global environment for the storage layer, and each Input section then defines which mechanism to use.

Service Section Configuration

The Service section refers to the section defined in the main configuration file:

A Service section will look like this:

That configuration sets up an optional buffering mechanism whose root path for data is /var/log/flb-storage/. It uses normal synchronization mode, without checksum, and up to a maximum of 5MB of memory when processing backlog data.

Input Section Configuration

Optionally, any Input plugin can configure its storage preference. The following table describes the options available:

The following example configures a service that offers filesystem buffering capabilities and two Input plugins, the first based on the filesystem and the second using memory only.

Backpressure

In certain environments it is common to see logs or data being ingested faster than they can be flushed to some destinations. The common case is reading from big log files and dispatching the logs to a backend over the network which takes some time to respond. This generates backpressure, leading to high memory consumption in the service.

In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restricts the amount of data that an input plugin can ingest. This is done through the configuration parameter Mem_Buf_Limit.

As described in the Buffering concepts section, Fluent Bit offers a hybrid mode for data handling: in-memory and filesystem (optional).

In-memory buffering is always available and can be restricted with Mem_Buf_Limit. If your plugin gets restricted because of the configuration and you are under a backpressure scenario, you won't be able to ingest more data until the data chunks that are in memory can be flushed.

Depending on the input plugin type in use, this might lead to discarding incoming data (e.g. TCP input plugin), but you can rely on the secondary filesystem buffering to be safe.

If, in addition to Mem_Buf_Limit, the input plugin defines a storage.type of filesystem (as described in Buffering & Storage), all new data will be stored safely in the file system when the limit is reached.

Mem_Buf_Limit

This option is disabled by default and can be applied to all input plugins. Let's explain its behavior using the following scenario:

Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine routes the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB

At this exact point, the engine will allow those 500KB of data to be appended: in total we have 1.2MB. The option works in a permissive mode before the limit is reached, but once the limit is exceeded the following actions are taken:

block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback

The engine will protect itself and will not append more data coming from the input plugin in question. Note that it is the plugin's responsibility to keep its state and decide what to do while paused.

After some seconds, if the scheduler was able to flush the initial 700KB of data or it gave up after retrying, that amount of memory is released and internally the following actions happen:

Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data
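The whole scenario can be replayed with a small Python model. It is a sketch of the accounting described above, with hypothetical names and decimal units (1MB taken as 1,000,000 bytes); it does not reproduce the engine's internal data structures. Pausing when usage reaches the limit is chosen here as the complement of the "500KB is < 1MB" resume check.

```python
class InputBuffer:
    """Toy model of the per-input memory accounting described above."""

    def __init__(self, mem_buf_limit):
        self.limit = mem_buf_limit
        self.used = 0
        self.paused = False

    def append(self, size):
        """Try to append size bytes; return False if the input is paused."""
        if self.paused:
            return False
        # Permissive mode: the append that crosses the limit is still accepted.
        self.used += size
        if self.used >= self.limit:
            self.paused = True  # stands in for the pause callback
        return True

    def release(self, size):
        """Called when flushed (or discarded) data frees memory."""
        self.used -= size
        if self.paused and self.used < self.limit:
            self.paused = False  # stands in for the resume callback


buffer = InputBuffer(mem_buf_limit=1_000_000)
buffer.append(700_000)   # accepted, 700KB in use
buffer.append(500_000)   # accepted, 1.2MB in use, input is now paused
buffer.release(700_000)  # 500KB in use, input is resumed
```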

About pause and resume Callbacks

Each plugin is independent and not all of them implement the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.

One plugin that implements them and keeps a good state is the Tail Input plugin. When the pause callback is triggered, it stops its collectors and stops appending data. Upon resume, it re-enables the collectors.

Scheduling and Retries

Fluent Bit has an Engine that helps to coordinate the data ingestion from input plugins and calls the Scheduler to decide when it is time to flush the data through one or multiple output plugins. The Scheduler flushes new data at a fixed interval of seconds and schedules retries when asked.

Once an output plugin gets called to flush some data, after processing that data it can notify the Engine of three possible return statuses:

OK
Retry
Error

If the return status was OK, the plugin was able to process and flush the data successfully. If it returned an Error status, an unrecoverable error happened and the engine should not try to flush that data again. If a Retry was requested, the Engine will ask the Scheduler to retry flushing that data, and the Scheduler will decide how many seconds to wait before that happens.
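The decision taken for each return status can be summarised in a short Python sketch. The names are hypothetical, the waiting time chosen by the Scheduler is deliberately left out, and the retry_limit parameter stands in for a per-output limit such as the Retry_Limit option, with None meaning unlimited retries.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    RETRY = "retry"
    ERROR = "error"


def next_action(status, retries_done, retry_limit=None):
    """Decide what happens to a chunk after an output plugin returns.

    retry_limit=None models unlimited retries (an assumption of this sketch).
    """
    if status is Status.OK:
        return "done"      # data was flushed successfully
    if status is Status.ERROR:
        return "discard"   # unrecoverable, never flushed again
    if retry_limit is not None and retries_done >= retry_limit:
        return "discard"   # limit reached, data is dropped
    return "retry"         # ask the Scheduler to try again later


print(next_action(Status.RETRY, retries_done=5, retry_limit=5))  # discard
```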

Configuring Retries

The Scheduler provides a simple configuration option called Retry_Limit which can be set independently on each output section. This option allows you to disable retries or to impose a limit of N attempts, after which the data is discarded:

Example

The following example configures two outputs, where the HTTP plugin has an unlimited number of retries and the Elasticsearch plugin has a limit of 5 retries:

Networking

Fluent Bit implements a unified networking interface that is exposed to components like plugins. This interface abstracts all the complexity of general I/O and is fully configurable.

A common use case is when a component or plugin needs to connect to a service to send and receive data. Although this sounds easy to deal with, many factors can make things hard, such as unresponsive services, networking latency or any kind of connectivity error. The networking interface aims to abstract and simplify the network I/O handling, minimize risks and optimize performance.

Concepts

TCP Connect Timeout

Most of the time, creating a new TCP connection to a remote server is straightforward and takes a few milliseconds. But there are cases where DNS resolution, a slow network or incomplete TLS handshakes might create long delays or incomplete connection statuses.

The net.connect_timeout property configures the maximum time to wait for a connection to be established. Note that this value already includes the TLS handshake process.

TCP Source Address

In environments with multiple network interfaces, it might be desirable to choose which interface to use for the data that will flow through the network.

The net.source_address property specifies which network address must be used for a TCP connection and data flow.

TCP Keepalive

TCP is a connection-oriented channel; to deliver and receive data from a remote end-point, in most cases we use a TCP connection. This TCP connection can be created and destroyed once it is no longer needed. This approach has pros and cons; here we will refer to the opposite case: keeping the connection open.

The concept of TCP Keepalive refers to the ability of the client (Fluent Bit in this case) to keep the TCP connection open in a persistent way: once the connection is created and used, instead of being closed it can be recycled. This feature offers many benefits in terms of performance since communication channels are already established beforehand.

Any component that uses TCP channels, such as HTTP, can take advantage of this feature. For configuration purposes use the net.keepalive property.

TCP Keepalive Idle Timeout

If a TCP connection is keepalive enabled, there might be scenarios where the connection is unused for long periods of time. Keeping an idle keepalive connection open is not helpful; it is recommended to keep connections alive only while they are in use.

In order to control how long a keepalive connection can be idle, we expose the configuration property called net.keepalive_idle_timeout.

Configuration Options

For plugins that rely on networking I/O, the following section describes the network configuration properties available and how they can be used to optimize performance or adjust to different configuration needs:

Example

As an example, we will send 5 random messages through a TCP output connection. On the remote side we will use the nc (netcat) utility to see the data.

Put the following configuration snippet in a file called fluent-bit.conf:

In another terminal, start nc and make it listen for messages on TCP port 9090:

Now start Fluent Bit with the configuration file written above and you will see the data flowing to netcat:

If the net.keepalive option is not enabled, Fluent Bit will close the TCP connection and netcat will quit. Here we can see how the keepalive connection works.

After the 5 records arrive, the connection will stay idle, and after 10 seconds it will be closed due to net.keepalive_idle_timeout.

Memory Management

In certain scenarios it would be ideal to estimate how much memory Fluent Bit could be using. This is very useful for containerized environments where memory limits are a must.

In order to make the estimate, we will assume that the input plugins have set the Mem_Buf_Limit option (you can learn more about it in the Backpressure section).

Estimating

Input plugins append data independently, so in order to make an estimate a limit should be imposed through the Mem_Buf_Limit option. If the limit is set to 10MB, we need to estimate that in the worst case the output plugin could use 20MB.

Fluent Bit has an internal binary representation for the data being processed, but when this data reaches an output plugin, the plugin will likely create its own representation in a new memory buffer for processing. Good examples are output plugins that need to convert the binary representation to their own custom JSON formats before talking to their backend servers.

So, if we impose a limit of 10MB for the input plugins and consider the worst case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.
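The arithmetic above can be wrapped in a small helper for trying other limits. This Python sketch simply restates the formula from the text; the function and parameter names are hypothetical, and the factor of 2 for the output plugin and the 1.2 margin are the worst-case figures used in this section, not measured values.

```python
def estimate_memory_mb(mem_buf_limit_mb, output_factor=2.0, safety_margin=1.2):
    """Estimate the minimum memory (in MB) for one input/output pair.

    output_factor: worst-case output usage relative to the input limit.
    safety_margin: the 1.2 multiplier applied in the text.
    """
    output_mb = mem_buf_limit_mb * output_factor
    return (mem_buf_limit_mb + output_mb) * safety_margin


# With Mem_Buf_Limit at 10MB this gives roughly 36MB, as in the text.
print(round(estimate_memory_mb(10), 1))
```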

Glibc and Memory Fragmentation

It is well known that in intensive environments, where memory allocations happen at a very high rate, the default memory allocator provided by Glibc can lead to high fragmentation, causing the service to report high memory usage.

It's strongly suggested that in any production environment, Fluent Bit should be built with Jemalloc enabled (e.g. -DFLB_JEMALLOC=On). Jemalloc is an alternative memory allocator that can reduce fragmentation (among other things), resulting in better performance.

You can check if Fluent Bit has been built with Jemalloc using the following command:

The output should look like:

If the FLB_HAVE_JEMALLOC option is listed in Build Flags, everything will be fine.

Local Testing

Running a Logging Pipeline Locally

You may wish to test a logging pipeline locally to observe how it deals with log messages. The following is a walk-through for running Fluent Bit and Elasticsearch locally with Docker Compose, which can serve as an example for testing other plugins locally.

Create a Configuration File

Refer to the configuration file documentation to create a configuration to test.

fluent-bit.conf:

Docker Compose

Use Docker Compose to run Fluent Bit (with the configuration file mounted) and Elasticsearch.

docker-compose.yaml:

View indexed logs

To view indexed logs run:

To "start fresh", delete the index by running:

Data Pipeline

Inputs

Collectd

The collectd input plugin allows you to receive datagrams from the collectd service.

Configuration Parameters

The plugin supports the following configuration parameters:

Configuration Examples

Here is a basic configuration example.

With this configuration, Fluent Bit listens to 0.0.0.0:25826, and outputs incoming datagram packets to stdout.

You must set the same types.db files that your collectd server uses. Otherwise, Fluent Bit may not be able to interpret the payload properly.

CPU Metrics

The cpu input plugin measures the CPU usage of a process or, by default, of the whole system (considering each CPU core). It reports values as percentages for every interval of time set. At the moment this plugin is only available for Linux.

The following tables describe the information generated by the plugin. The keys below represent the data used by the overall system; all values associated with the keys are percentages (0 to 100%):

In addition to the keys reported in the above table, a similar content is created per CPU core. The cores are listed from 0 to N as the Kernel reports:

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to get the statistics of the CPU usage of your system, you can run the plugin from the command line or through the configuration file:

Command Line

As described above, the CPU input plugin gathers the overall usage every second and flushes the information to the output every fifth second. In this example we used the stdout plugin to demonstrate the output records. In a real use case you may want to flush this information to a central aggregator.

Configuration File

In your main configuration file append the following Input & Output sections:

Disk I/O Metrics

The disk input plugin gathers information about the disk throughput of the running system at a regular interval and reports it.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to get disk usage from your system, you can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).

e.g. 1.5s = 1s + 500000000ns
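The same calculation as a one-line Python helper, in case you want to check a combination of values before putting it in a configuration file. The function name is hypothetical; the parameters mirror the Interval_Sec and Interval_NSec keys.

```python
def total_interval_seconds(interval_sec, interval_nsec):
    """Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000)."""
    return interval_sec + interval_nsec / 1_000_000_000


print(total_interval_seconds(1, 500_000_000))  # 1.5
```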

Docker Events

The docker events input plugin uses the docker API to capture server events. A complete list of possible events returned by this plugin can be found in the Docker documentation.

Configuration Parameters

This plugin supports the following configuration parameters:

Key          Description                                                                                                Default
Unix_Path    The docker socket unix path                                                                                /var/run/docker.sock
Buffer_Size  The size of the buffer used to read docker events (in bytes)                                               8192
Parser       Specify the name of a parser to interpret the entry as a structured message.                               None
Key          When a message is unstructured (no parser applied), it's appended as a string under the key name message.  message

Command Line

$ fluent-bit -i docker_events -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name   docker_events

[OUTPUT]
    Name   stdout
    Match  *

Dummy

The dummy input plugin generates dummy events. It is useful for testing, debugging, benchmarking and getting started with Fluent Bit.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Health

The Health input plugin allows you to check how healthy a TCP server is. It performs the check by opening a TCP connection at a regular interval.
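The idea behind such a check can be sketched in Python with the standard socket module: try to open a TCP connection and report whether it succeeded. This is an illustration of the technique only; the function name and the timeout default are assumptions, and the plugin's own implementation and record format are not shown here.

```python
import socket


def tcp_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # The connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling tcp_alive periodically, for example once per second, gives the same kind of signal that the plugin reports on each interval.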

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the checks with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see some random values in the output interface similar to this:

Kernel Logs

The kmsg input plugin reads the Linux Kernel log buffer from the beginning. It gets every record and parses its fields as priority, sequence, seconds, useconds, and message.
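To give an idea of what this parsing involves, the Python sketch below splits one record into those fields. It assumes the /dev/kmsg record layout "prefix,sequence,timestamp_usec,flags;message" and assumes that the priority is the low three bits of the prefix, following the syslog convention; both are assumptions of this example and not a description of the plugin's code.

```python
def parse_kmsg_record(line):
    """Split one kernel log record into the fields named above."""
    header, _, message = line.partition(";")
    fields = header.split(",")
    prefix = int(fields[0])
    timestamp_usec = int(fields[2])
    seconds, useconds = divmod(timestamp_usec, 1_000_000)
    return {
        "priority": prefix & 0x07,  # assumed: syslog level in the low 3 bits
        "sequence": int(fields[1]),
        "seconds": seconds,
        "useconds": useconds,
        "message": message.rstrip("\n"),
    }


# Hypothetical record used only for demonstration.
print(parse_kmsg_record("6,339,5140900,-;NET: Registered protocol family 10"))
```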

Getting Started

In order to start getting the Linux Kernel messages, you can run the plugin from the command line or through the configuration file:

Command Line

As described above, the plugin processed all messages that the Linux Kernel reported; the output has been truncated for clarity.

Configuration File

In your main configuration file append the following Input & Output sections:

Memory Metrics

The mem input plugin gathers information about the memory and swap usage of the running system at a regular interval and reports the total amount of memory and the amount of free memory available.

Getting Started

In order to get memory and swap usage from your system, you can run the plugin from the command line or through the configuration file:

Command Line

$ fluent-bit -i mem -t memory -o stdout -m '*'
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/03/03 21:12:35] [ info] [engine] started
[0] memory: [1488543156, {"Mem.total"=>1016044, "Mem.used"=>841388, "Mem.free"=>174656, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
[1] memory: [1488543157, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
[2] memory: [1488543158, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
[3] memory: [1488543159, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name   mem
    Tag    memory

[OUTPUT]
    Name   stdout
    Match  *

MQTT

The MQTT input plugin allows you to retrieve messages/data from MQTT control packets over a TCP connection. The incoming data must be a JSON map.

Configuration Parameters

The plugin supports the following configuration parameters:

Key     Description
Listen  Listener network interface. Default: 0.0.0.0
Port    TCP port to listen on for connections. Default: 1883

Getting Started

In order to start listening for MQTT messages, you can run the plugin from the command line or through the configuration file:

Command Line

Since the MQTT input plugin lets Fluent Bit behave as a server, we need to dispatch some messages using an MQTT client. The following example uses the mosquitto tool for that purpose:

$ fluent-bit -i mqtt -t data -o stdout -m '*'
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2016/05/20 14:22:52] [ info] starting engine
[0] data: [1463775773, {"topic"=>"some/topic", "key1"=>123, "key2"=>456}]

The following command line will send a message to the MQTT input plugin:

$ mosquitto_pub  -m '{"key1": 123, "key2": 456}' -t some/topic

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name   mqtt
    Tag    data
    Listen 0.0.0.0
    Port   1883

[OUTPUT]
    Name   stdout
    Match  *

Network I/O Metrics

The netif input plugin gathers network traffic information from the running system at a regular interval and reports it.

Configuration Parameters

The plugin supports the following configuration parameters:

Key            Description
Interface      Specify the network interface to monitor, e.g. eth0.
Interval_Sec   Polling interval (seconds). Default: 1
Interval_NSec  Polling interval (nanoseconds). Default: 0
Verbose        If true, gather metrics precisely. Default: false

Getting Started

In order to monitor network traffic from your system, you can run the plugin from the command line or through the configuration file:

Command Line

$ bin/fluent-bit -i netif -p interface=eth0 -o stdout
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/07/08 23:34:18] [ info] [engine] started
[0] netif.0: [1499524459.001698260, {"eth0.rx.bytes"=>89769869, "eth0.rx.packets"=>73357, "eth0.rx.errors"=>0, "eth0.tx.bytes"=>4256474, "eth0.tx.packets"=>24293, "eth0.tx.errors"=>0}]
[1] netif.0: [1499524460.002541885, {"eth0.rx.bytes"=>98, "eth0.rx.packets"=>1, "eth0.rx.errors"=>0, "eth0.tx.bytes"=>98, "eth0.tx.packets"=>1, "eth0.tx.errors"=>0}]
[2] netif.0: [1499524461.001142161, {"eth0.rx.bytes"=>98, "eth0.rx.packets"=>1, "eth0.rx.errors"=>0, "eth0.tx.bytes"=>98, "eth0.tx.packets"=>1, "eth0.tx.errors"=>0}]
[3] netif.0: [1499524462.002612971, {"eth0.rx.bytes"=>98, "eth0.rx.packets"=>1, "eth0.rx.errors"=>0, "eth0.tx.bytes"=>98, "eth0.tx.packets"=>1, "eth0.tx.errors"=>0}]

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          netif
    Tag           netif
    Interval_Sec  1
    Interval_NSec 0
    Interface     eth0

[OUTPUT]
    Name   stdout
    Match  *

Note: Total interval (sec) = Interval_Sec + (Interval_NSec / 1000000000).

e.g. 1.5s = 1s + 500000000ns
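
The same arithmetic can be checked with a short Python sketch. The helper name total_interval_seconds is hypothetical and not part of Fluent Bit; it only mirrors the formula in the note above.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def total_interval_seconds(interval_sec: int, interval_nsec: int) -> float:
    """Combine Interval_Sec and Interval_NSec into a total interval in seconds.

    Hypothetical helper for illustration only; it is not part of Fluent Bit.
    """
    return interval_sec + interval_nsec / NANOSECONDS_PER_SECOND


# Interval_Sec 1 and Interval_NSec 500000000 give a polling interval of 1.5 s.
print(total_interval_seconds(1, 500_000_000))  # prints 1.5
```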

Process

The Process input plugin allows you to check how healthy a process is. It performs the check periodically, at a configurable interval.

Configuration Parameters

The plugin supports the following configuration parameters:

Key            Description
Proc_Name      Name of the target process to check.
Interval_Sec   Interval in seconds between the service checks. Default value is 1.
Interval_NSec  Nanoseconds part of the interval between service checks; it works in conjunction with the Interval_Sec configuration key. Default value is 0.
Alert          If enabled, messages are generated only when the target process is down. By default this option is disabled.
Fd             If enabled, the number of file descriptors (fd) is appended to each record. Default value is true.
Mem            If enabled, the memory usage of the process is appended to each record. Default value is true.

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

The following example will check the health of the crond process:

$ fluent-bit -i proc -p proc_name=crond -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          proc
    Proc_Name     crond
    Interval_Sec  1
    Interval_NSec 0
    Fd            true
    Mem           true

[OUTPUT]
    Name   stdout
    Match  *

Testing

Once Fluent Bit is running, you will see the health of the process:

$ fluent-bit -i proc -p proc_name=fluent-bit -o stdout
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/01/30 21:44:56] [ info] [engine] started
[0] proc.0: [1485780297, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1120000, "mem.VmRSS"=>1120000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[1] proc.0: [1485780298, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1148000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[2] proc.0: [1485780299, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[3] proc.0: [1485780300, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]

Random

The Random input plugin generates very simple random value samples using the device interface /dev/urandom; if that is not available, it uses a Unix timestamp as the value.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to start generating random samples, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the samples with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

Serial Interface

The serial input plugin allows you to retrieve messages/data from a serial interface.

Configuration Parameters

Getting Started

In order to retrieve messages over the Serial interface, you can run the plugin from the command line or through the configuration file:

Command Line

The following example loads the serial input plugin, sets a Bitrate of 9600, listens on the /dev/tnt0 interface, and uses the custom tag data to route the messages.

The above interface (/dev/tnt0) is an emulation of a serial interface (more details at the bottom). For demonstration purposes we will write a message to the other end of the interface, in this case /dev/tnt1, e.g.:

In Fluent Bit you should see an output like this:

Now, using the Separator configuration, we can send multiple messages at once (run this command after starting Fluent Bit):

Configuration File

In your main configuration file append the following Input & Output sections:

Emulating Serial Interface on Linux

The following is some extra information that allows you to emulate a serial interface on your Linux system, so you can test the Serial input plugin locally if you don't have such an interface on your computer. The procedure has been tested on Ubuntu 15.04 running Linux kernel 4.0.

Build and install the tty0tty module

Download the sources

Unpack and compile

Copy the new kernel module into the kernel modules directory

Load the module

You should see new serial ports in /dev/ (ls /dev/tnt*). Give appropriate permissions to the new serial ports:

When the module is loaded, it will interconnect the following virtual interfaces:

Standard Input

The stdin plugin allows you to retrieve valid JSON text messages over the standard input interface (stdin). To use it, specify the plugin name as the input, e.g.:

As input data, the stdin plugin recognizes the following JSON data formats:

A better way to demonstrate how it works is through a Bash script that generates messages and writes them to Fluent Bit. Write the following content in a file named test.sh:

Give the script execution permission:

Now let's start the script and Fluent Bit in the following way:

Systemd

The Systemd input plugin allows you to collect log messages from the Journald daemon in Linux environments.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to receive Systemd messages, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit listen for Systemd messages with the following options:

In the example above we are collecting all messages coming from the Docker service.

Configuration File

In your main configuration file append the following Input & Output sections:

TCP

The tcp input plugin allows you to retrieve structured JSON or raw messages over a TCP network interface (TCP port).

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to receive JSON messages over TCP, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit listen for JSON messages with the following options:

By default the service will listen on all interfaces (0.0.0.0) through TCP port 5170. Optionally you can change this directly, e.g.:

In the example, the JSON messages will only arrive through the network interface with address 192.168.3.2 and TCP port 9090.

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you can send some messages using netcat:

In Fluent Bit we should see the following output:

Performance Considerations

Receiving payloads in JSON format carries a high performance penalty. Parsing JSON is a very expensive task, so you can expect CPU usage to increase in high-load environments.

To get faster data ingestion, consider using the option Format none to avoid JSON parsing when it is not needed.

Thermal

The thermal input plugin reports system temperatures periodically, each second by default. Currently this plugin is only available for Linux.

The following table describes the information generated by the plugin.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to get temperature(s) of your system, you can run the plugin from the command line or through the configuration file:

Command Line

Some systems provide multiple thermal zones. This example monitors only thermal_zone0 by name, once per minute.

Configuration File

In your main configuration file append the following Input & Output sections:

Windows Event Log

The winlog input plugin allows you to read the Windows Event Log.

Configuration Parameters

The plugin supports the following configuration parameters:

Note that if you do not set db, the plugin will read channels from the beginning on each startup.

Configuration Examples

Configuration File

Here is a minimum configuration example.

Note that some Windows Event Log channels (like Security) require administrator privileges for reading. In this case, you need to run fluent-bit as an administrator.

Command Line

If you want to do a quick test, you can run this plugin from the command line.

Parsers

JSON

The JSON parser is the simplest option: if the original log source is a JSON map string, it will take its structure and convert it directly to the internal binary representation.

A simple configuration that can be found in the default parsers configuration file is the entry to parse Docker log files (when the tail input plugin is used):

The following log entry is valid content for the parser defined above:

After processing, its internal representation will be:

The time has been converted to a Unix timestamp (UTC) and the map reduced to each component of the original message.

Regular Expression

The regex parser allows you to define a custom Ruby regular expression that uses named captures to define which content belongs to which key name.

Fluent Bit uses the Onigmo regular expression library in Ruby mode. For testing purposes you can use the following web editor to test your expressions:

http://rubular.com/

Important: do not attempt to add multiline support in your regular expressions if you are using the Tail input plugin, since each line is handled as a separate entity. Instead, use the Tail Multiline support configuration feature.

Security Warning: Onigmo is a backtracking regex engine. You need to be careful not to use expensive regex patterns, or Onigmo can take a very long time to perform pattern matching. For details, please read the article "ReDoS" on OWASP.

Note: understanding how regular expressions work is out of the scope of this content.

From a configuration perspective, when the format is set to regex, a Regex configuration key is mandatory.

The following parser configuration example aims to provide rules that can be applied to an Apache HTTP Server log entry:

[PARSER]
    Name   apache
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

As an example, take the following Apache HTTP Server log entry:

192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above content does not provide a defined structure for Fluent Bit, but by enabling the proper parser we can produce a structured representation of it:

[1438176430, {"host"=>"192.168.2.20",
              "user"=>"-",
              "method"=>"GET",
              "path"=>"/cgi-bin/try/",
              "code"=>"200",
              "size"=>"3395",
              "referer"=>"",
              "agent"=>""
              }
]
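
To experiment with the rule outside Fluent Bit, the following Python sketch applies the same expression to the log entry above. It is only an approximation: Python's re module is not Onigmo, so the named groups are rewritten from (?<name>...) to (?P<name>...), and the function name parse_apache is hypothetical. The handling of the time field mirrors the structured output above, where "time" no longer appears in the record.

```python
import re
from datetime import datetime

# Regex from the [PARSER] section above, written in Ruby/Onigmo syntax.
RUBY_REGEX = r'^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$'

# Python's re module spells named groups (?P<name>...) instead of (?<name>...).
PYTHON_REGEX = re.sub(r"\(\?<([A-Za-z_][A-Za-z0-9_]*)>", r"(?P<\1>", RUBY_REGEX)

LOG_LINE = '192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395'


def parse_apache(line):
    """Return (unix_timestamp, record) for one access-log line, or None."""
    match = re.match(PYTHON_REGEX, line)
    if match is None:
        return None
    # Optional groups that did not take part in the match become "".
    record = match.groupdict(default="")
    # Time_Key: take the "time" field out of the record and parse it with
    # the Time_Format string to obtain the record timestamp.
    parsed = datetime.strptime(record.pop("time"), "%d/%b/%Y:%H:%M:%S %z")
    return int(parsed.timestamp()), record


timestamp, record = parse_apache(LOG_LINE)
print(timestamp, record)
```

Because the Time_Format ends in %z, the -0300 offset is taken into account when the timestamp is converted to seconds since the Unix epoch.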

A common pitfall is that you cannot use characters other than letters, numbers and underscores in group names. For example, a group name like (?<user-name>.*) will cause an error because it contains an invalid character (-).

In order to understand, learn and test regular expressions like the example above, we suggest you try the following Ruby Regular Expression Editor: http://rubular.com/r/X7BH0M4Ivm

LTSV

The ltsv parser allows you to parse LTSV formatted texts.

Labeled Tab-separated Values (LTSV) format is a variant of Tab-separated Values (TSV). Each record in an LTSV file is represented as a single line. Each field is separated by a TAB and has a label and a value. The label and the value are separated by ':'.
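
As a rough illustration of the format, the following Python sketch splits one LTSV line into labels and values. It is not Fluent Bit's parser; the function name parse_ltsv and the sample line are hypothetical.

```python
def parse_ltsv(line: str) -> dict:
    """Parse one LTSV record into a dict of label -> value."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        if not field:
            continue  # skip empty fields, e.g. from a trailing TAB
        # Split on the first ':' only, so values may contain ':' themselves.
        label, _, value = field.partition(":")
        record[label] = value
    return record


# Hypothetical sample line, not taken from the Fluent Bit documentation.
sample = "host:127.0.0.1\tmethod:GET\tpath:/index.html\ttime:10:27:10"
print(parse_ltsv(sample))
```

Splitting on the first ':' only is what lets a value such as a time of day keep its own colons.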

Here is an example of how to use this format for the Apache access log.

Configure this in httpd.conf:

The parser.conf:

The following log entry is valid content for the parser defined above:

After processing, its internal representation will be:

The time has been converted to a Unix timestamp (UTC).

Logfmt

The logfmt parser allows you to parse the logfmt format described in https://brandur.org/logfmt . A more formal description is in https://godoc.org/github.com/kr/logfmt .

Here is an example configuration:

[PARSER]
    Name        logfmt
    Format      logfmt

The following log entry is valid content for the parser defined above:

key1=val1 key2=val2

After processing, its internal representation will be:

[1540936693, {"key1"=>"val1",
              "key2"=>"val2"}]

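The key/value splitting in the logfmt example above can be approximated in a few lines of Python. This is a minimal sketch, not Fluent Bit's implementation: it relies on shlex, which follows shell quoting rules rather than the logfmt grammar, and the treatment of a bare key (stored as True) is an assumption of this sketch. The function name parse_logfmt is hypothetical.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a logfmt-style line into a dict (approximation, see lead-in)."""
    record = {}
    # shlex.split keeps quoted values such as msg="hello world" together.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        # Assumption of this sketch: a bare key without '=' maps to True.
        record[key] = value if sep else True
    return record


print(parse_logfmt("key1=val1 key2=val2"))
```
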
Decoders

There are certain cases where the log messages being parsed contain encoded data. A typical use case can be found in containerized environments with Docker: an application logs its data in JSON format, but that data becomes an escaped string. Consider the following example.

Original message generated by the application:

Then the Docker log message becomes encapsulated as follows:

As you can see, the original message is handled as an escaped string. Ideally, in Fluent Bit we would like to keep the original structured message and not a string.

Getting Started

Decoders are a built-in feature available through the Parsers file. Each Parser definition can optionally set one or multiple decoders. There are two types of decoders:

Decode_Field: if the content can be decoded into a structured message, append that structured message (keys and values) to the original log message.
Decode_Field_As: any decoded content (unstructured or structured) replaces the value of the same key; no extra keys are added.
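
The difference between the two decoder types can be sketched in Python. This is a simplified model built only from the two descriptions above, not Fluent Bit's implementation; the function names decode_field and decode_field_as and the sample record are hypothetical, and JSON is the only decoder modeled.

```python
import json


def decode_field(record: dict, key: str) -> dict:
    """Decode_Field-style: append the decoded keys and values to the record."""
    result = dict(record)
    try:
        decoded = json.loads(record[key])
    except (KeyError, TypeError, ValueError):
        return result  # decoding failed: leave the record untouched
    if isinstance(decoded, dict):
        result.update(decoded)
    return result


def decode_field_as(record: dict, key: str) -> dict:
    """Decode_Field_As-style: replace the value stored under the same key."""
    result = dict(record)
    try:
        result[key] = json.loads(record[key])
    except (KeyError, TypeError, ValueError):
        pass  # decoding failed: keep the original string
    return result


# Hypothetical Docker-style line whose "log" value is an escaped JSON string.
docker_line = '{"log": "{\\"status\\": \\"up\\", \\"code\\": 200}", "stream": "stdout"}'
record = json.loads(docker_line)

print(decode_field(record, "log"))     # extra keys "status" and "code" appear
print(decode_field_as(record, "log"))  # "log" now holds a map, no extra keys
```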

Our pre-defined Docker Parser has the following definition:

Each line in the parser with a key Decode_Field instructs the parser to apply a specific decoder on a given field. Optionally, it offers the option to take an extra action if the decoder cannot succeed.

Decoders

Optional Actions

If a decoder fails to decode the field, or if you want to try a next decoder, it is possible to define an optional action. Available actions are:

Note that actions are affected by some restrictions:

on Decode_Field_As, if succeeded, another decoder of the same type in the same field can be applied only if the data continues being an unstructured message (raw text).
on Decode_Field, if succeeded, can only be applied once for the same field. By nature Decode_Field aims to decode a structured message.

Examples

escaped_utf8

Example input (from /path/to/log.log in configuration below)

Example output

Configuration file

The fluent-bit-parsers.conf file:

Filters

AWS Metadata

The AWS Filter enriches logs with AWS metadata. Currently the plugin adds the EC2 instance ID and availability zone to log records. To use this plugin, you must be running in EC2 and have the instance metadata service enabled.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

imds_version

Specify which version of the instance metadata service to use. Valid values are 'v1' or 'v2'.

Note: If you run Fluent Bit in a container, you may have to use instance metadata v1. The plugin behaves the same regardless of which version is used.

Usage

Metadata Fields

Currently, the plugin only adds the instance ID and availability zone. AWS plans to expand this plugin in the future.

Key              Value
az               The availability zone; for example, "us-east-1a".
ec2_instance_id  The EC2 instance ID.

Command Line

$ bin/fluent-bit -i dummy -F aws -m '*' -o stdout

[2020/01/17 07:57:17] [ info] [engine] started (pid=32744)
[0] dummy.0: [1579247838.000171227, {"message"=>"dummy", "az"=>"us-west-2b", "ec2_instance_id"=>"i-06bc83dbc2ac2fdf8"}]
[1] dummy.0: [1579247839.000125097, {"message"=>"dummy", "az"=>"us-west-2b", "ec2_instance_id"=>"i-06bc87dbc2ac3fdf8"}]

Configuration File

[INPUT]
    Name dummy
    Tag dummy

[FILTER]
    Name aws
    Match *
    imds_version v1

[OUTPUT]
    Name stdout
    Match *

Expect

Made for testing: make sure that your records contain the expected keys and values

The expect filter plugin allows you to validate that records match certain criteria in their structure, like validating that a key exists or that it has a specific value.

This page only describes the available configuration properties. For a detailed explanation of usage and use cases, please refer to the following page:

Validating your Data and Structure

Configuration Parameters

The plugin supports the following configuration parameters:

Property             Description
key_exists           Check if a key with a given name exists in the record.
key_not_exists       Check if a key does not exist in the record.
key_val_is_null      Check that the value of the key is NULL.
key_val_is_not_null  Check that the value of the key is NOT NULL.
key_val_eq           Check that the value of the key equals the given value in the configuration.
action               Action to take when a rule does not match. The available options are warn or exit. On warn, a warning message is sent to the logging layer when a mismatch of the rules above is found; using exit makes Fluent Bit abort with status code 255.
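
The rules above can be pictured with a small Python model. It is a sketch of the semantics as described in the table, not the filter's actual code: the names check_record and apply_action, the warning text, and the choice to treat a missing key as a failed key_val_* check are all assumptions.

```python
def check_record(record, rules):
    """Return a list of failed rules; an empty list means the record passes."""
    mismatches = []
    for rule in rules:
        name, key = rule[0], rule[1]
        if name == "key_exists":
            ok = key in record
        elif name == "key_not_exists":
            ok = key not in record
        elif name == "key_val_is_null":
            ok = key in record and record[key] is None
        elif name == "key_val_is_not_null":
            ok = key in record and record[key] is not None
        elif name == "key_val_eq":
            ok = key in record and record[key] == rule[2]
        else:
            raise ValueError(f"unknown rule: {name}")
        if not ok:
            mismatches.append(f"{name} {key}")
    return mismatches


def apply_action(mismatches, action="warn"):
    """Warn about each mismatch; with action "exit", stop with status 255."""
    for mismatch in mismatches:
        print(f"[warn] expect rule failed: {mismatch}")  # hypothetical message
    if mismatches and action == "exit":
        raise SystemExit(255)


# Hypothetical record and rule set.
record = {"color": "blue", "label": None}
rules = [
    ("key_exists", "color"),
    ("key_not_exists", "price"),
    ("key_val_is_null", "label"),
    ("key_val_eq", "color", "red"),
]
failed = check_record(record, rules)
apply_action(failed, action="warn")
```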

Getting Started

As mentioned above, refer to the following page for specific details on the usage of this filter:

Validating your Data and Structure

Grep

Select or exclude records per patterns

The Grep Filter plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.

Configuration Parameters

The plugin supports the following configuration parameters:

Key      Value Format  Description
Regex    KEY REGEX     Keep records in which the content of KEY matches the regular expression.
Exclude  KEY REGEX     Exclude records in which the content of KEY matches the regular expression.

Record Accessor Enabled

This plugin enables the Record Accessor feature to specify the KEY. Using the record accessor is suggested if you want to match values against nested values.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called lines.txt with the following content:

{"log": "aaa"}
{"log": "aab"}
{"log": "bbb"}
{"log": "ccc"}
{"log": "ddd"}
{"log": "eee"}
{"log": "fff"}
{"log": "ggg"}

Command Line

Note: the command line mode needs special attention to quote the regular expressions properly. Using a configuration file is suggested.

The following command will load the tail plugin and read the content of the lines.txt file. Then the grep filter will apply a regular expression rule over the log field (created by the tail plugin) and only pass the records whose field value matches the pattern aa (in this file, aaa and aab):

$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout

Configuration File

[INPUT]
    name   tail
    path   lines.txt
    parser json

[FILTER]
    name   grep
    match  *
    regex  log aa

[OUTPUT]
    name   stdout
    match  *

The filter allows you to use multiple rules, which are applied in order; you can have as many Regex and Exclude entries as required.

Nested fields example

If you want to match or exclude records based on nested values, you can use a Record Accessor format as the KEY name. Consider the following record example:

{
    "log": "something",
    "kubernetes": {
        "pod_name": "myapp-0",
        "namespace_name": "default",
        "pod_id": "216cd7ae-1c7e-11e8-bb40-000c298df552",
        "labels": {
            "app": "myapp"
        },
        "host": "minikube",
        "container_name": "myapp",
        "docker_id": "370face382c7603fdd309d8c6aaaf434fd98b92421ce"
    }
}

If you want to exclude records that match a given nested field (for example kubernetes.labels.app), you can use the following rule:

[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['labels']['app'] myapp
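
The keep/exclude logic, including nested keys, can be modeled in Python as follows. This is only an illustration under stated assumptions: it uses Python's re module instead of Onigmo, a plain list of keys stands in for the Record Accessor syntax, and the names lookup and grep are hypothetical.

```python
import re


def lookup(record, path):
    """Follow a list of keys into nested maps; return None if any is missing."""
    value = record
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def grep(records, rules):
    """Apply (kind, path, pattern) rules in order; kind is "regex" or "exclude"."""
    kept = []
    for record in records:
        keep = True
        for kind, path, pattern in rules:
            value = lookup(record, path)
            matched = value is not None and re.search(pattern, str(value)) is not None
            # Regex drops records that do not match; Exclude drops those that do.
            if (kind == "regex" and not matched) or (kind == "exclude" and matched):
                keep = False
                break
        if keep:
            kept.append(record)
    return kept


# The contents of lines.txt from the Getting Started example.
lines = [{"log": v} for v in ["aaa", "aab", "bbb", "ccc", "ddd", "eee", "fff", "ggg"]]
print(grep(lines, [("regex", ["log"], "aa")]))

# A nested record shaped like the Kubernetes example above.
nested = {"log": "something", "kubernetes": {"labels": {"app": "myapp"}}}
print(grep([nested], [("exclude", ["kubernetes", "labels", "app"], "myapp")]))
```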

Record Modifier

The Record Modifier Filter plugin allows you to append fields or to exclude specific fields.

Configuration Parameters

The plugin supports the following configuration parameters. Remove_key and Whitelist_key are mutually exclusive.

Key            Description
Record         Append fields. This parameter needs a key and value pair.
Remove_key     If the key is matched, that field is removed.
Whitelist_key  If the key is not matched, that field is removed.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

This is a sample in_mem record to filter.

{"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724}

Append fields

The following configuration file appends the product name and hostname (via environment variable) to each record.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Record hostname ${HOSTNAME}
    Record product Awesome_Tool

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F record_modifier -p 'Record=hostname ${HOSTNAME}' -p 'Record=product Awesome_Tool' -m '*'

The output will be

[0] mem.local: [1492436882.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724, "hostname"=>"localhost.localdomain", "product"=>"Awesome_Tool"}]

Remove fields with Remove_key

The following configuration file removes the 'Swap.*' fields.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Remove_key Swap.total
    Remove_key Swap.used
    Remove_key Swap.free

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Remove_key=Swap.total' -p 'Remove_key=Swap.free' -p 'Remove_key=Swap.used' -m '*'

The output will be

[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]

Remove fields with Whitelist_key

The following configuration file keeps only the 'Mem.*' fields.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Whitelist_key Mem.total
    Whitelist_key Mem.used
    Whitelist_key Mem.free

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Whitelist_key=Mem.total' -p 'Whitelist_key=Mem.free' -p 'Whitelist_key=Mem.used' -m '*'

The output will be

[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]
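
The three operations can be summarized with a small Python model applied to the sample in_mem record. It is a sketch of the behavior described in this section, not the plugin's code; the function name record_modifier and its parameter names are hypothetical, and applying the append step after the key filtering is an assumption.

```python
def record_modifier(record, append=None, remove_keys=None, whitelist_keys=None):
    """Model of the Record, Remove_key and Whitelist_key parameters."""
    if remove_keys and whitelist_keys:
        raise ValueError("Remove_key and Whitelist_key are mutually exclusive")
    if whitelist_keys:
        # Whitelist_key: drop every field whose key is not listed.
        result = {k: v for k, v in record.items() if k in whitelist_keys}
    elif remove_keys:
        # Remove_key: drop every field whose key is listed.
        result = {k: v for k, v in record.items() if k not in remove_keys}
    else:
        result = dict(record)
    # Record: append the given key/value pairs.
    result.update(append or {})
    return result


sample = {"Mem.total": 1016024, "Mem.used": 716672, "Mem.free": 299352,
          "Swap.total": 2064380, "Swap.used": 32656, "Swap.free": 2031724}

print(record_modifier(sample, append={"product": "Awesome_Tool"}))
print(record_modifier(sample, remove_keys={"Swap.total", "Swap.used", "Swap.free"}))
print(record_modifier(sample, whitelist_keys={"Mem.total", "Mem.used", "Mem.free"}))
```

With this sample, the Remove_key and Whitelist_key calls produce the same three Mem.* fields, which is why the two configurations above print identical output.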

Standard Output

The stdout output plugin prints the data received through the input plugin to the standard output. Its usage is very simple, as follows:

Configuration Parameters

Key               Description                                                                                                          Default
Format            Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream.              msgpack
json_date_key     Specify the name of the date field in the output.                                                                    date
json_date_format  Specify the format of the date. Supported formats are double, iso8601 (e.g. 2018-05-30T09:39:52.000681Z) and epoch.  double

Command Line

$ bin/fluent-bit -i cpu -o stdout -v

Here we gather CPU usage metrics and print them to the standard output in a human-readable way:

$ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2016/10/07 21:52:01] [ info] [engine] started
[0] cpu.0: [1475898721, {"cpu_p"=>0.500000, "user_p"=>0.250000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>1.000000}]
[1] cpu.0: [1475898722, {"cpu_p"=>0.250000, "user_p"=>0.250000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>1.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
[2] cpu.0: [1475898723, {"cpu_p"=>0.750000, "user_p"=>0.250000, "system_p"=>0.500000, "cpu0.p_cpu"=>2.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>1.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
[3] cpu.0: [1475898724, {"cpu_p"=>1.000000, "user_p"=>0.750000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>2.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>1.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>1.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

No more, no less, it just works.

Throttle

The Throttle Filter plugin sets the average Rate of messages per Interval, based on the leaky bucket and sliding window algorithms. In case of overflow, it leaks messages at a certain rate.

Configuration Parameters

The plugin supports the following configuration parameters:

Functional description

Let's imagine we have configured:

Suppose we received 1 message in the first second, 3 messages in the second, and 5 in the third. As you can see, even though Window is actually 5, we use a "slow" start to prevent flooding during startup.

But as soon as we reach Window size * Interval, we will have a true sliding window with aggregation over the complete window.

When the average over the window is more than Rate, we will start dropping messages, so that

will become:

As you can see, the last pane of the window was overwritten and 1 message was dropped.

Interval vs Window size

You might have noticed the possibility to configure the Interval of the Window shift. It is counterintuitive, but there is a difference between the following two examples:

and

Even though both examples allow a maximum Rate of 60 messages per minute, the first example may get all 60 messages within the first second, and will drop all the rest for the entire minute:

The second example, in contrast, will not allow more than 1 message per second, making the output rate smoother:

It may drop some data if the rate is ragged. We recommend using a bigger Interval and Rate for streams of rare but important events, while keeping Window bigger and Interval small for constantly intensive inputs.
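
The sliding-window behavior described in this section can be modeled with a short Python sketch. It is a simplified model under stated assumptions, not Fluent Bit's implementation: the class name Throttle, the helper run, and the per-interval arrival counts are hypothetical, and a message is dropped whenever accepting it would push the average over the filled panes above Rate.

```python
from collections import deque


class Throttle:
    """Simplified sliding-window throttle: one counter ("pane") per Interval."""

    def __init__(self, rate, window):
        self.rate = rate
        # "Slow" start: begin with one pane and grow up to `window` panes.
        self.panes = deque([0], maxlen=window)

    def tick(self):
        """Call once per Interval: open a new pane; once the window is full,
        the oldest pane falls out automatically."""
        self.panes.append(0)

    def allow(self):
        """Return True to pass one message, False to drop it."""
        if (sum(self.panes) + 1) / len(self.panes) > self.rate:
            return False
        self.panes[-1] += 1
        return True


def run(throttle, arrivals):
    """Feed arrivals[i] messages during interval i; return how many passed."""
    passed = []
    for i, count in enumerate(arrivals):
        if i > 0:
            throttle.tick()
        passed.append(sum(1 for _ in range(count) if throttle.allow()))
    return passed


# Rate 5 and Window 5 with illustrative arrival counts per interval.
print(run(Throttle(rate=5, window=5), [1, 3, 5, 7, 3, 4, 7]))
# prints [1, 3, 5, 7, 3, 4, 6]
```

In the last interval the window holds 5, 7, 3, 4 and the new pane, so only 6 of the 7 messages fit before the average would exceed 5, and one message is dropped.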

Command Line

Note: It's suggested to use a configuration file.

The following command will load the tail plugin and read the content of the lines.txt file. Then the throttle filter will apply a rate limit and only pass the records that are read below the configured rate:

Configuration File

The example above will pass 1000 messages per second on average over 300 seconds.