1 of 100

1.3 Introduction

Fluent Bit is a Fast and Lightweight Log Processor and Forwarder for Linux, OSX and BSD family operating systems. It has been made with a strong focus on performance to allow the collection of events from different sources without complexity.

Fluent Bit is part of the Fluentd project ecosystem, it's licensed under the terms of the Apache License v2.0. This project is made and sponsored by Treasure Data.

About

Fluent Bit is an open source and multi-platform log forwarder tool which aims to be a generic Swiss knife for log collection and distribution.

We, Treasure Data, as a Big Data company, provide an analytics infrastructure in the Cloud where we provide an end-to-end solution to collect, store and do analytics over the data. Fluent Bit is an integral part of this pipeline where it solves the log collection needs.

Being an open source project, it has been widely adopted to solve logging needs in Cloud Native environments where Docker and Kubernetes are key components; Fluent Bit is a natural fit.

Why ?

Data collection and log forwarding is hard.

Nowadays the number of sources of information in our environments is ever increasing. Handling data collection at scale is complex, and collecting and aggregating diverse data requires a specialized tool that can deal with:

Different sources of information.
Different data formats.
Multiple destinations.

Fluent Bit was born to address the need for a high performance and optimized tool that can collect data from any input source, unify that data and deliver it to multiple destinations.

Fluentd & Fluent Bit

Data collection matters and nowadays the scenarios from where the information can come from are very variable. For hence to be more flexible in certain markets needs, we may need different options. On this page, we will describe the relationship between the and open source projects.

and projects are both created and sponsored by and they aim to solve the collection, processing, and delivery of Logs.

Both projects share a lot of similarities, Fluent Bit is fully based on the design and experience of Fluentd architecture and general design. Choosing which one to use depends on the final needs, from an architecture perspective we can consider:

Fluentd is a log collector, processor, and aggregator.
Fluent Bit is a log collector and processor (it doesn't have strong aggregation features like Fluentd).

The following table describes a comparison in different areas of the projects:

Consider Fluentd mainly as an Aggregator and Fluent Bit as a Log Forwarder, we can see both projects complement each other providing a full reliable solution.

Installation

The following section will guide you to the step to download, build and install Fluent Bit from sources and specific instructions for the installation of binaries that we already distribute for Debian/Ubuntu/Redhat/CentOS and Raspberry Pi.

If you find some problem on a certain step, don't hesitate to report the problem on our bug tracker:

Supported Platforms

The following operating systems and architectures are supported in Fluent Bit.

Operating System

Distribution

Architecture

Linux

Centos 7

x86_64

Debian 8 (Jessie)

x86_64

Debian 9 (Stretch)

x86_64

Raspbian 8 (Debian Jessie)

AArch32

Raspbian 9 (Debian Stretch)

AArch32

Ubuntu 16.04 (Xenial Xerus)

x86_64

Ubuntu 18.04 (Bionic Beaver)

x86_64

From an architecture support perspective, Fluent Bit is fully functional on x86, x86_64, AArch32 and AArch64 based processors.

Fluent Bit can work also on OSX and *BSD systems, but not all plugins will be available on all platforms. Official support will be expanding based on community demand.

Requirements

Fluent Bit uses very low CPU and Memory consumption, it's compatible with most of x86, x86_64, AArch32 and AArch64 based platforms. In order to build it you need the following components in your system:

Compiler: GCC or clang
CMake
Flex (only if Stream Processor is enabled)
Bison (only if Stream Processor is enabled)

There are not other dependencies besides libc and pthreads in the most basic mode. For certain features that depends on third party components, those are included in the main source code repository.

Download Sources

Stable

For production systems, we strongly suggest that you always get the latest stable release from our web site, you can get the official tarballs (.tar.gz) from the following link:

Development

For people who aims to contribute to the project testing or extending the code base, can get the development version from our GIT repository:

Note that our master branch is where the development of Fluent Bit happens. Since it's a development version, expect issues when compiling or at run time.

We encourage everybody to help us testing every development version, at the end this is what will become stable.

Upgrade Notes

The following article cover the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of.

For more details about changes on each release please refer to the Official Release Notes.

Fluent Bit v1.3

If you are migrating from Fluent Bit v1.2 to v1.3, there are not breaking changes. If you are upgrading from an older version please review the incremental changes below.

Fluent Bit v1.2

Docker, JSON, Parsers and Decoders

On Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding, for hence when parsing Docker logs is not longer necessary to use decoders. The new Docker parser looks like this:

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On

Note: again, do not use decoders.

Kubernetes Filter

We have done improvements also on how Kubernetes Filter handle the stringified log message. If the option Merge_Log is enabled, it will try to handle the log content as a JSON map, if so, it will add the keys to the root map.

In addition, we have fixed and improved the option called Merge_Log_Key. If a merge log succeed, all new keys will be packaged under the key specified by this option, a suggested configuration is as follows:

[FILTER]
    Name             Kubernetes
    Match            kube.*
    Kube_Tag_Prefix  kube.var.log.containers.
    Merge_Log        On
    Merge_Log_Key    log_processed

As an example, if the original log content is the following map:

{"key1": "val1", "key2": "val2"}

the final record will be composed as follows:

{
    "log": "{\"key1\": \"val1\", \"key2\": \"val2\"}",
    "log_processed": {
        "key1": "val1",
        "key2": "val2"
    }
}

Fluent Bit v1.1

If you are upgrading from Fluent Bit <= 1.0.x you should take in consideration the following relevant changes when switching to Fluent Bit v1.1 series:

Kubernetes Filter

We introduced a new configuration property called Kube_Tag_Prefix to help Tag prefix resolution and address an unexpected behavior that landed in previous versions.

Duing 1.0.x release cycle, a commit in Tail input plugin changed the default behavior on how the Tag was composed when using the wildcard for expansion generating breaking compatibility with other services. Consider the following configuration example:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

The expected behavior is that Tag will be expanded to:

kube.var.log.containers.apache.log

but the change introduced in 1.0 series switched from absolute path to the base file name only:

kube.apache.log

On Fluent Bit v1.1 release we restored to our default behavior and now the Tag is composed using the absolute path of the monitored file.

Having absolute path in the Tag is relevant for routing and flexible configuration where it also helps to keep compatibility with Fluentd behavior.

This behavior switch in Tail input plugin affects how Filter Kubernetes operates. As you know when the filter is used it needs to perform local metadata lookup that comes from the file names when using Tail as a source. Now with the new Kube_Tag_Prefix option you can specify what's the prefix used in Tail input plugin, for the configuration example above the new configuration will look as follows:

[INPUT]
    Name  tail
    Path  /var/log/containers/*.log
    Tag   kube.*

[FILTER]
    Name             kubernetes
    Match            *
    Kube_Tag_Prefix  kube.var.log.containers.

So the proper for Kube_Tag_Prefix value must be composed by Tag prefix set in Tail input plugin plus the converted monitored directory replacing slashes with dots.

Build with Static Configuration

Fluent Bit in normal operation mode allows to be configurable through text files or using specific arguments in the command line, while this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode.

Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime.

Getting Started

Requirements

The following steps assumes you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the Build and Install section.

Configuration Directory

In your file system prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. It is mandatory that this directory contain as a minimum one configuration file called fluent-bit.conf containing the required SERVICE, INPUT and OUTPUT sections. As an example create a new fluent-bit.conf file with the following content:

[SERVICE]
    Flush     1
    Daemon    off
    Log_Level info

[INPUT]
    Name      cpu

[OUTPUT]
    Name      stdout
    Match     *

the configuration provided above will calculate CPU metrics from the running system and print them to the standard output interface.

Build with Custom Configuration

Inside Fluent Bit source code, get into the build/ directory and run CMake appending the FLB_STATIC_CONF option pointing the configuration directory recently created, e.g:

$ cd fluent-bit/build/
$ cmake -DFLB_STATIC_CONF=/path/to/my/confdir/

then build it:

$ make

At this point the fluent-bit binary generated is ready to run without necessity of further configuration:

$ bin/fluent-bit 
Fluent-Bit v0.15.0
Copyright (C) Treasure Data

[2018/10/19 15:32:31] [ info] [engine] started (pid=15186)
[0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

TD Agent Bit

We distribute Fluent Bit as packages for specific Enterprise Linux distributions under the name of td-agent-bit. These packages are maintained by .

The following distributions are supported:

Debian Packages

Fluent Bit is distributed as td-agent-bit package and is available for the latest (and old) stable Debian systems: Buster, Stretch and Jessie. This stable Fluent Bit distribution package is maintained by .

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

Update your sources lists

On Debian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Debian 10 (Buster)

Debian 9 (Stretch)

Debian 8 (Jessie)

Update your repositories database

Now let your system update the apt database:

Install TD Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Ubuntu Packages

Fluent Bit is distributed as td-agent-bit package and is available for the latest stable Ubuntu system: Xenial Xerus. This stable Fluent Bit distribution package is maintained by .

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

Update your sources lists

On Ubuntu, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Ubuntu 18.04 LTS (Bionic Beaver)

Ubuntu 16.04 LTS (Xenial Xerus)

Update your repositories database

Now let your system update the apt database:

Install TD-Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

Now the following step is to instruct systemd to enable the service:

If you do a status check, you should see a similar output like this:

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Raspberry Pi

Fluent Bit is distributed as td-agent-bit package and is available for the Raspberry, specifically for Raspbian 8. This stable Fluent Bit distribution package is maintained by Treasure Data, Inc.

Server GPG key

The first step is to add our server GPG key to your keyring, on that way you can get our signed packages:

$ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -

Update your sources lists

On Debian and derivated systems such as Raspbian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your /etc/apt/sources.list file:

Raspbian 9 (Stretch)

deb https://packages.fluentbit.io/raspbian/stretch stretch main

Raspbian 8 (Jessie)

deb https://packages.fluentbit.io/raspbian/jessie jessie main

Update your repositories database

Now let your system update the apt database:

$ sudo apt-get update

Install TD-Agent Bit

Using the following apt-get command you are able now to install the latest td-agent-bit:

$ sudo apt-get install td-agent-bit

Now the following step is to instruct systemd to enable the service:

$ sudo service td-agent-bit start

If you do a status check, you should see a similar output like this:

sudo service td-agent-bit status
● td-agent-bit.service - TD Agent Bit
   Loaded: loaded (/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: enabled)
   Active: active (running) since mié 2016-07-06 16:58:25 CST; 2h 45min ago
 Main PID: 6739 (td-agent-bit)
    Tasks: 1
   Memory: 656.0K
      CPU: 1.393s
   CGroup: /system.slice/td-agent-bit.service
           └─6739 /opt/td-agent-bit/bin/td-agent-bit -c /etc/td-agent-bit/td-agent-bit.conf
...

The default configuration of td-agent-bit is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your /var/log/syslog file.

Yocto Project

source code provides Bitbake recipes to configure, build and package the software for a Yocto based image. Note that specific steps of usage of these recipes in your Yocto environment (Poky) is out of the scope of this documentation.

We distribute two main recipes, one for testing/dev purposes and other with the latest stable release.

It's strongly recommended to always use the stable release of Fluent Bit recipe and not the one from GIT master for production deployments.

Fluent Bit v1.1 and native ARMv8 (aarch64) support

Fluent Bit >= v1.1.x already integrates native AArch64 support where stack switches for co-routines are done through native ASM calls, on this scenario there is no issues as the one faced in previous series.

Getting Started

is a straightforward tool and to get started with it we need to understand it basic workflow. Consider the following diagram a global overview of it:

Service

Fluent Bit has a 'Service' which runs the filter chain from input to output. Global configuration here includes whether to daemonise, diagnostic logging, flush interval, etc.

For more details, please refer to the Service section.

Input

provides different Input Plugins to gather information from different sources, some of them just collect data from log files while others can gather metrics information from the operating system. There are many plugins for different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own and independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.

For more details, please refer to the section.

Parser

Dealing with raw strings is a constant pain; having a structure is highly desired. Ideally we want to set a structure to the incoming data by the Input Plugins as soon as they are collected:

The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:

192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above log line is a raw string without format, ideally we would like to give it a structure that can be processed later easily. If the proper configuration is used, the log entry could be converted to:

{
  "host":    "192.168.2.20",
  "user":    "-",
  "method":  "GET",
  "path":    "/cgi-bin/try/",
  "code":    "200",
  "size":    "3395",
  "referer": "",
  "agent":   ""
 }

Parsers are fully configurable and are independently and optionally handled by each input plugin, for more details please refer to the Parsers section.

Filter

In production environments we want to have full control of the data we are collecting, filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each filter available could be used to match, exclude or enrich your logs with some specific metadata.

Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.

For more details about the Filters available and their usage, please refer to the Filters section.

Routing

Routing is a core feature that allows to route your data through Filters and finally to one or multiple destinations.

There are two important concepts in Routing:

Tag
Match

When the data is generated by the input plugins, it comes with a Tag (most of the time the Tag is configured manually), the Tag is a human-readable indicator that helps to identify the data source.

In order to define where the data should be routed, a Match rule must be specified in the output configuration.

Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   es
    Match  my_cpu

[OUTPUT]
    Name   stdout
    Match  my_mem

Note: the above is a simple example demonstrating how Routing is configured.

Routing works automatically reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match upon routing time, the data is deleted.

Routing with Wildcard

Routing is flexible enough to support wildcard in the Match pattern. The below example defines a common destination for both sources of data:

[INPUT]
    Name cpu
    Tag  my_cpu

[INPUT]
    Name mem
    Tag  my_mem

[OUTPUT]
    Name   stdout
    Match  my_*

The match rule is set to my_* which means it will match any Tag that starts with my_.

Output

The output interface allows us to define destinations for the data. Common destinations are remote services, local file system or standard interface with others. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.

For more details, please refer to the Output Plugins section.

Configuration

Fluent Bit is flexible enough to be configured either from the command line or through a configuration file. For production environments, we strongly recommend to use the configuration file approach.

Note that all configuration files use a specific fixed and strict schema, please proceed to the following sections for a better understanding:

(must read)

Configuration Schema

Fluent Bit may optionally use a configuration file to define how the service will behave, and before proceeding we need to understand how the configuration schema works. The schema is defined by three concepts:

Sections
Entries: Key/Value
Indented Configuration Mode

A simple example of a configuration file is as follows:

[SERVICE]
    # This is a commented line
    Daemon    off
    log_level debug

Sections

A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using [SERVICE] definition. Section rules:

All section content must be indented (4 spaces ideally).
Multiple sections can exist on the same file.
A section is expected to have comments and entries, it cannot be empty.
Any commented line under a section, must be indented too.

Entries: Key/Value

A section may contain Entries, an entry is defined by a line of text that contains a Key and a Value, using the above example, the [SERVICE] section contains two entries, one is the key Daemon with value off and the other is the key Log_Level with the value debug. Entries rules:

An entry is defined by a key and a value.
A key must be indented.
A key must contain a value which ends in the breakline.
Multiple keys with the same name can exist.

Also commented lines are set prefixing the # character, those lines are not processed but they must be indented too.

Indented Configuration Mode

Fluent Bit configuration files are based in a strict Indented Mode, that means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example:

[FIRST_SECTION]
    # This is a commented line
    Key1  some value
    Key2  another value
    # more comments

[SECOND_SECTION]
    KeyN  3.14

As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented.

Configuration Variables

Fluent Bit support the usage of environment variables in any value associated to a key when using a configuration file.

The variables are case sensitive and can be used in the following format:

${MY_VARIABLE}

When Fluent Bit starts, the configuration reader will detect any request for ${MY_VARIABLE} and will try to resolve it value.

Example

Create the following configuration file (fluent-bit.conf):

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info

[INPUT]
    Name cpu
    Tag  cpu.local

[OUTPUT]
    Name  ${MY_OUTPUT}
    Match *

Open a terminal and set the environment variable:

$ export MY_OUTPUT=stdout

The above command set the 'stdout' value to the variable MY_OUTPUT.

Run Fluent Bit with the recently created configuration file:

$ bin/fluent-bit -c fluent-bit.conf
Fluent-Bit v0.11.0
Copyright (C) Treasure Data

[2017/04/03 12:25:25] [ info] [engine] started
[0] cpu.local: [1491243925, {"cpu_p"=>1.750000, "user_p"=>1.750000, "system_p"=>0.000000, "cpu0.p_cpu"=>3.000000, "cpu0.p_user"=>2.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>4.000000, "cpu2.p_user"=>4.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

As you can see the service worked properly as the configuration was valid.

Unit Sizes

Certain configuration directives in Fluent Bit refer to unit sizes such as when defining the size of a buffer or specific limits, we can find these in plugins like , or in generic properties like .

Starting from v0.11.10, all unit sizes have been standardized across the core and plugins, the following table describes the options that can be used and what they mean:

Backpressure

In certain environments is common to see that logs or data being ingested is faster than the ability to flush it to some destinations. The common case is reading from big log files and dispatching the logs to a backend over the network which takes some time to respond, this generate backpressure leading to a high memory consumption in the service.

In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restrict the amount of data than an input plugin can ingest, this is done through the configuration parameter Mem_Buf_Limit.

Mem_Buf_Limit

This option is disabled by default and can be applied to all input plugins. Let's explain it behavior using the following scenario:

Mem_Buf_Limit is set to 1MB (one megabyte)
input plugin tries to append 700KB
engine route the data to an output plugin
output plugin backend (HTTP Server) is down
engine scheduler will retry the flush after 10 seconds
input plugin tries to append 500KB

At this exact point, the engine will allow to append those 500KB of data into the engine: in total we have 1.2MB. The options works in a permissive mode before to reach the limit, but the limit is exceeded the following actions are taken:

block local buffers for the input plugin (cannot append more data)
notify the input plugin invoking a pause callback

The engine will protect it self and will not append more data coming from the input plugin in question; Note that is the plugin responsibility to keep their state and take some decisions about what to do on that paused state.

After some seconds if the scheduler was able to flush the initial 700KB of data or it gave up after retrying, that amount memory is released and internally the following actions happens:

Upon data buffer release (700KB), the internal counters get updated
Counters now are set at 500KB
Since 500KB is < 1MB it checks the input plugin state
If the plugin is paused, it invokes a resume callback
input plugin can continue appending more data

About pause and resume Callbacks

Each plugin is independent and not all of them implements the pause and resume callbacks. As said, these callbacks are just a notification mechanism for the plugin.

The plugin who implements and keep a good state is the Tail Input plugin. When the pause callback is triggered, it stop their collectors and stop appending data. Upon resume, it re-enable the collectors.

Scheduler

has an Engine that helps to coordinate the data ingestion from input plugins and call the Scheduler to decide when is time to flush the data through one or multiple output plugins. The Scheduler flush new data every a fixed time of seconds and Schedule retries when asked.

Once an output plugin gets call to flush some data, after processing that data it can notify the Engine three possible return statuses:

OK
Retry
Error

If the return status was OK, it means it was successfully able to process and flush the data, if it returned an Error status, means that an unrecoverable error happened and the engine should not try to flush that data again. If a Retry was requested, the Engine will ask the Scheduler to retry to flush that data, the Scheduler will decide how many seconds to wait before that happen.

Configuring Retries

The Scheduler provides a simple configuration option called Retry_Limit which can be set independently on each output section. This option allows to disable retries or impose a limit to try N times and then discard the data after reaching that limit:

Example

The following example configure two outputs where the HTTP plugin have an unlimited number of retries and the Elasticsearch plugin have a limit of 5 times:

Collectd

The collectd input plugin allows you to receive datagrams from collectd.

Content:

Configuration Parameters

The plugin supports the following configuration parameters:

Configuration Examples

Here is a basic configuration example.

With this configuration, fluent-bit listens to 0.0.0.0:25826, and outputs incoming datagram packets to stdout.

You must set the same types.db files that your collectd server uses. Otherwise, fluent-bit may not be able to interpret the payload properly.

Disk Usage

The disk input plugin, gathers the information about the disk throughput of the running system every certain interval of time and reports them.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to get disk usage from your system, you can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).

e.g. 1.5s = 1s + 500000000ns

Dummy

The dummy input plugin, generates dummy events. It is useful for testing, debugging, benchmarking and getting started with Fluent Bit.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Exec

The exec input plugin, allows to execute external program and collects event logs.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

The following example will read events from the output of ls.

Configuration File

In your main configuration file append the following Input & Output sections:

Health

Health input plugin allows you to check how healthy a TCP server is. It does the check by issuing a TCP connection every a certain interval of time.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Host

Name of the target host or IP address to check.

Port

TCP port where to perform the connection check.

Interval_Sec

Interval in seconds between the service checks. Default value is 1.

Internal_Nsec

Specify a nanoseconds interval for service checks, it works in conjuntion with the Interval_Sec configuration key. Default value is 0.

Alert

If enabled, it will only generate messages if the target TCP service is down. By default this option is disabled.

Add_Host

If enabled, hostname is appended to each records. Default value is false.

Add_Port

If enabled, port number is appended to each records. Default value is false.

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the checks with the following options:

$ fluent-bit -i health://127.0.0.1:80 -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          health
    Host          127.0.0.1
    Port          80
    Interval_Sec  1
    Interval_NSec 0

[OUTPUT]
    Name   stdout
    Match  *

Testing

Once Fluent Bit is running, you will see some random values in the output interface similar to this:

$ fluent-bit -i health://127.0.0.1:80 -o stdout
Fluent-Bit v0.9.0
Copyright (C) Treasure Data

[2016/10/07 21:37:51] [ info] [engine] started
[0] health.0: [1475897871, {"alive"=>true}]
[1] health.0: [1475897872, {"alive"=>true}]
[2] health.0: [1475897873, {"alive"=>true}]
[3] health.0: [1475897874, {"alive"=>true}]

Kernel Log Buffer

The kmsg input plugin reads the Linux Kernel log buffer since the beginning, it gets every record and parse it field as priority, sequence, seconds, useconds, and message.

Getting Started

In order to start getting the Linux Kernel messages, you can run the plugin from the command line or through the configuration file:

Command Line

As described above, the plugin processed all messages that the Linux Kernel reported, the output has been truncated for clarification.

Configuration File

In your main configuration file append the following Input & Output sections:

Memory Usage

The mem input plugin, gathers the information about the memory and swap usage of the running system every certain interval of time and reports the total amount of memory and the amount of free available.

Getting Started

In order to get memory and swap usage from your system, you can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

MQTT

The MQTT input plugin, allows to retrieve messages/data from MQTT control packets over a TCP connection. The incoming data to receive must be a JSON map.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Listen

Listener network interface, default: 0.0.0.0

Port

TCP port where listening for connections, default: 1883

Getting Started

In order to start listening for MQTT messages, you can run the plugin from the command line or through the configuration file:

Command Line

Since the MQTT input plugin let Fluent Bit behave as a server, we need to dispatch some messages using some MQTT client, in the following example mosquitto tool is being used for the purpose:

$ fluent-bit -i mqtt -t data -o stdout -m '*'
Fluent-Bit v0.8.0
Copyright (C) Treasure Data

[2016/05/20 14:22:52] [ info] starting engine
[0] data: [1463775773, {"topic"=>"some/topic", "key1"=>123, "key2"=>456}]

The following command line will send a message to the MQTT input plugin:

$ mosquitto_pub  -m '{"key1": 123, "key2": 456}' -t some/topic

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name   mqtt
    Tag    data
    Listen 0.0.0.0
    Port   1883

[OUTPUT]
    Name   stdout
    Match  *

Network Traffic

The netif input plugin gathers network traffic information of the running system every certain interval of time, and reports them.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

In order to monitor network traffic from your system, you can run the plugin from the command line or through the configuration file:

Command Line

Configuration File

In your main configuration file append the following Input & Output sections:

Note: Total interval (sec) = Interval_Sec + (Interval_Nsec / 1000000000).

e.g. 1.5s = 1s + 500000000ns

Process

Process input plugin allows you to check how health a process is. It does the check by issuing a process every a certain interval of time.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Proc_Name

Name of the target Process to check.

Interval_Sec

Interval in seconds between the service checks. Default value is 1.

Internal_Nsec

Specify a nanoseconds interval for service checks, it works in conjuntion with the Interval_Sec configuration key. Default value is 0.

Alert

If enabled, it will only generate messages if the target process is down. By default this option is disabled.

If enabled, a number of fd is appended to each records. Default value is true.

Mem

If enabled, memory usage of the process is appended to each records. Default value is true.

Getting Started

In order to start performing the checks, you can run the plugin from the command line or through the configuration file:

The following example will check the health of crond process.

$ fluent-bit -i proc -p proc_name=crond -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          proc
    Proc_Name     crond
    Interval_Sec  1
    Interval_NSec 0
    Fd            true
    Mem           true

[OUTPUT]
    Name   stdout
    Match  *

Testing

Once Fluent Bit is running, you will see the health of process:

$ fluent-bit -i proc -p proc_name=fluent-bit -o stdout
Fluent-Bit v0.11.0
Copyright (C) Treasure Data

[2017/01/30 21:44:56] [ info] [engine] started
[0] proc.0: [1485780297, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1120000, "mem.VmRSS"=>1120000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[1] proc.0: [1485780298, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1148000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[2] proc.0: [1485780299, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[3] proc.0: [1485780300, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]

Random

Random input plugin generate very simple random value samples using the device interface /dev/urandom, if not available it will use a unix timestamp as value.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Samples

If set, it will only generate a specific number of samples. By default this value is set to -1, which will generate unlimited samples.

Interval_Sec

Interval in seconds between samples generation. Default value is 1.

Internal_Nsec

Specify a nanoseconds interval for samples generation, it works in conjuntion with the Interval_Sec configuration key. Default value is 0.

Getting Started

In order to start generating random samples, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit generate the samples with the following options:

$ fluent-bit -i random -o stdout

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name          random
    Samples      -1
    Interval_Sec  1
    Interval_NSec 0

[OUTPUT]
    Name   stdout
    Match  *

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

$ fluent-bit -i random -o stdout
Fluent-Bit v0.9.0
Copyright (C) Treasure Data

[2016/10/07 20:27:34] [ info] [engine] started
[0] random.0: [1475893654, {"rand_value"=>1863375102915681408}]
[1] random.0: [1475893655, {"rand_value"=>425675645790600970}]
[2] random.0: [1475893656, {"rand_value"=>7580417447354808203}]
[3] random.0: [1475893657, {"rand_value"=>1501010137543905482}]
[4] random.0: [1475893658, {"rand_value"=>16238242822364375212}]

Standard Input

The stdin plugin allows to retrieve valid JSON text messages over the standard input interface (stdin). In order to use it, specify the plugin name as the input, e.g:

As input data the stdin plugin recognize the following JSON data formats:

A better example to demonstrate how it works will be through a Bash script that generates messages and writes them to . Write the following content in a file named test.sh:

Give the script execution permission:

Now lets start the script and in the following way:

Thermal

The thermal input plugin reports system temperatures periodically -- each second by default. Currently this plugin is only available for Linux.

The following tables describes the information generated by the plugin.

key

description

name

The name of the thermal zone, such as thermal_zone0

type

The type of the thermal zone, such as x86_pkg_temp

temp

Current temperature in celcius

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Interval_Sec

Polling interval (seconds). default: 1

Interval_NSec

Polling interval (nanoseconds). default: 0

name_regex

Optional name filter regex. default: None

type_regex

Optional type filter regex. default: None

Getting Started

In order to get temperature(s) of your system, you can run the plugin from the command line or through the configuration file:

Command Line

$ bin/fluent-bit -i thermal -t my_thermal -o stdout -m '*'
Fluent Bit v1.3.0
Copyright (C) Treasure Data

[2019/08/18 13:39:43] [ info] [storage] initializing...
...
[0] my_thermal: [1566099584.000085820, {"name"=>"thermal_zone0", "type"=>"x86_pkg_temp", "temp"=>60.000000}]
[1] my_thermal: [1566099585.000136466, {"name"=>"thermal_zone0", "type"=>"x86_pkg_temp", "temp"=>59.000000}]
[2] my_thermal: [1566099586.000083156, {"name"=>"thermal_zone0", "type"=>"x86_pkg_temp", "temp"=>59.000000}]

Some systems provide multiple thermal zones. In this example monitor only thermal_zone0 by name, once per minute.

$ bin/fluent-bit -i thermal -t my_thermal -p "interval_sec=60" -p "name_regex=thermal_zone0" -o stdout -m '*'
Fluent Bit v1.3.0
Copyright (C) Treasure Data

[2019/08/18 13:39:43] [ info] [storage] initializing...
...
[0] my_temp: [1565759542.001053749, {"name"=>"thermal_zone0", "type"=>"pch_skylake", "temp"=>48.500000}]
[0] my_temp: [1565759602.001661061, {"name"=>"thermal_zone0", "type"=>"pch_skylake", "temp"=>48.500000}]

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name thermal
    Tag  my_thermal

[OUTPUT]
    Name  stdout
    Match *

Windows Event Log

The winlog input plugin allows you to read Windows Event Log.

Content:

Configuration Parameters
Configuration Examples

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

Channels

A comma-separated list of channels to read from.

Interval_Sec

Set the polling interval for each channel. (optional)

Set the path to save the read offsets. (optional)

Note that if you do not set db, the plugin will read channels from the beginning on each startup.

Configuration Examples

Configuration File

Here is a minimum configuration example.

[INPUT]
    Name         winlog
    Channels     Setup,Windows PowerShell
    Interval_Sec 1
    DB           winlog.sqlite

[OUTPUT]
    Name   stdout
    Match  *

Note that some Windows Event Log channels (like Security) requires an admin privilege for reading. In this case, you need to run fluent-bit as an administrator.

Command Line

If you want to do a quick test, you can run this plugin from the command line.

$ fluent-bit -i winlog -p 'channels=Setup' -o stdout

LTSV Parser

The ltsv parser allows to parse LTSV formatted texts.

Labeled Tab-separated Values (LTSV format is a variant of Tab-separated Values (TSV). Each record in a LTSV file is represented as a single line. Each field is separated by TAB and has a label and a value. The label and the value have been separated by ':'.

Here is an example how to use this format in the apache access log.

Config this in httpd.conf:

LogFormat "host:%h\tident:%l\tuser:%u\ttime:%t\treq:%r\tstatus:%>s\tsize:%b\treferer:%{Referer}i\tua:%{User-Agent}i" combined_ltsv
CustomLog "logs/access_log" combined_ltsv

The parser.conf:

[PARSER]
    Name        access_log_ltsv
    Format      ltsv
    Time_Key    time
    Time_Format [%d/%b/%Y:%H:%M:%S %z]
    Types       status:integer size:integer

The following log entry is a valid content for the parser defined above:

host:127.0.0.1  ident:- user:-  time:[10/Jul/2018:13:27:05 +0200]       req:GET / HTTP/1.1      status:200      size:16218      referer:http://127.0.0.1/       ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1  ident:- user:-  time:[10/Jul/2018:13:27:05 +0200]       req:GET /assets/plugins/bootstrap/css/bootstrap.min.css HTTP/1.1        status:200      size:121200     referer:http://127.0.0.1/       ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1  ident:- user:-  time:[10/Jul/2018:13:27:05 +0200]       req:GET /assets/css/headers/header-v6.css HTTP/1.1      status:200      size:37706      referer:http://127.0.0.1/       ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
host:127.0.0.1  ident:- user:-  time:[10/Jul/2018:13:27:05 +0200]       req:GET /assets/css/style.css HTTP/1.1  status:200      size:1279       referer:http://127.0.0.1/       ua:Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0

After processing, it internal representation will be:

[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET / HTTP/1.1", "status"=>200, "size"=>16218, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/plugins/bootstrap/css/bootstrap.min.css HTTP/1.1", "status"=>200, "size"=>121200, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/css/headers/header-v6.css HTTP/1.1", "status"=>200, "size"=>37706, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]
[1531222025.000000000, {"host"=>"127.0.0.1", "ident"=>"-", "user"=>"-", "req"=>"GET /assets/css/style.css HTTP/1.1", "status"=>200, "size"=>1279, "referer"=>"http://127.0.0.1/", "ua"=>"Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"}]

The time has been converted to Unix timestamp (UTC).

Logfmt Parser

The logfmt parser allows to parse the logfmt format described in . A more formal description is in .

Here is an example configuration:

The following log entry is a valid content for the parser defined above:

After processing, it internal representation will be:

Record Modifier

The Record Modifier Filter plugin allows to append fields or to exclude specific fields.

Configuration Parameters

The plugin supports the following configuration parameters: Remove_key and Whitelist_key are exclusive.

Key

Description

Record

Append fields. This parameter needs key and value pair.

Remove_key

If the key is matched, that field is removed.

Whitelist_key

If the key is not matched, that field is removed.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

This is a sample in_mem record to filter.

{"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724}

Append fields

The following configuration file is to append product name and hostname (via environment variable) to record.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Record hostname ${HOSTNAME}
    Record product Awesome_Tool

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F record_modifier -p 'Record=hostname ${HOSTNAME}' -p 'Record=product Awesome_Tool' -m '*'

The output will be

[0] mem.local: [1492436882.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724, "hostname"=>"localhost.localdomain", "product"=>"Awesome_Tool"}]

Remove fields with Remove_key

The following configuration file is to remove 'Swap.*' fields.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Remove_key Swap.total
    Remove_key Swap.used
    Remove_key Swap.free

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Remove_key=Swap.total' -p 'Remove_key=Swap.free' -p 'Remove_key=Swap.used' -m '*'

The output will be

[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]

Remove fields with Whitelist_key

The following configuration file is to remain 'Mem.*' fields.

[INPUT]
    Name mem
    Tag  mem.local

[OUTPUT]
    Name  stdout
    Match *

[FILTER]
    Name record_modifier
    Match *
    Whitelist_key Mem.total
    Whitelist_key Mem.used
    Whitelist_key Mem.free

You can also run the filter from command line.

$ fluent-bit -i mem -o stdout -F  record_modifier -p 'Whitelist_key=Mem.total' -p 'Whitelist_key=Mem.free' -p 'Whitelist_key=Mem.used' -m '*'

The output will be

[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]

Standard Output

The Standard Output Filter plugin allows to print to the standard output the data received through the input plugin.

Configuration Parameters

There are no parameters.

Getting Started

In order to start filtering records, you can run the filter from the command line or through the configuration file.

Command Line

Configuration File

In your main configuration file append the following FILTER sections:

Counter

Counter is a very simple plugin that counts how many records it's getting upon flush time. Plugin output is as follows:

[TIMESTAMP, NUMBER_OF_RECORDS_NOW] (total = RECORDS_SINCE_IT_STARTED)

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count up a data with the following options:

$ fluent-bit -i cpu -o counter

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  counter
    Match *

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

$ bin/fluent-bit -i cpu -o counter -f 1
Fluent-Bit v0.12.0
Copyright (C) Treasure Data

[2017/07/19 11:19:02] [ info] [engine] started
1500484743,1 (total = 1)
1500484744,1 (total = 2)
1500484745,1 (total = 3)
1500484746,1 (total = 4)
1500484747,1 (total = 5)

Kafka REST Proxy

The kafka-rest output plugin, allows to flush your records into a Kafka REST Proxy server. The following instructions assumes that you have a fully operational Kafka REST Proxy and Kafka services running in your environment.

Configuration Parameters

Key

Description

default

Host

IP address or hostname of the target Kafka REST Proxy server

127.0.0.1

Port

TCP port of the target Kafka REST Proxy server

8082

Topic

Set the Kafka topic

fluent-bit

Partition

Set the partition number (optional)

Message_Key

Set a message key (optional)

Time_Key

The Time_Key property defines the name of the field that holds the record timestamp.

@timestamp

Time_Key_Format

Defines the format of the timestamp.

%Y-%m-%dT%H:%M:%S

Include_Tag_Key

Append the Tag name to the final record.

Off

Tag_Key

If Include_Tag_Key is enabled, this property defines the key name for the tag.

_flb-key

TLS / SSL

Kafka REST Proxy output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into a Kafka REST Proxy service, you can run the plugin from the command line or through the configuration file:

Command Line

The kafka-rest plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

$ fluent-bit -i cpu -t cpu -o kafka-rest -p host=127.0.0.1 -p port=8082 -m '*'

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name        kafka-rest
    Match       *
    Host        127.0.0.1
    Port        8082
    Topic       fluent-bit
    Message_Key my_key

NATS

The nats output plugin, allows to flush your records into a NATS Server end point. The following instructions assumes that you have a fully operational NATS Server in place.

In order to flush records, the nats plugin requires to know two parameters:

parameter

description

default

host

IP address or hostname of the NATS Server

127.0.0.1

port

TCP port of the target NATS Server

4222

In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g:

nats://host:port

Running

Fluent Bit only requires to know that it needs to use the nats output plugin, if no extra information is given, it will use the default values specified in the above table.

$ bin/fluent-bit -i cpu -o nats -V -f 5
Fluent-Bit v0.7.0
Copyright (C) Treasure Data

[2016/03/04 10:17:33] [ info] Configuration
flush time     : 5 seconds
input plugins  : cpu
collectors     :
[2016/03/04 10:17:33] [ info] starting engine
cpu[all] all=3.250000 user=2.500000 system=0.750000
cpu[i=0] all=3.000000 user=1.000000 system=2.000000
cpu[i=1] all=3.000000 user=2.000000 system=1.000000
cpu[i=2] all=2.000000 user=2.000000 system=0.000000
cpu[i=3] all=6.000000 user=5.000000 system=1.000000
[2016/03/04 10:17:33] [debug] [in_cpu] CPU 3.25%
...

As described above, the target service and storage point can be changed, e.g:

Data format

For every set of records flushed to a NATS Server, Fluent Bit uses the following JSON format:

[
  [UNIX_TIMESTAMP, JSON_MAP_1],
  [UNIX_TIMESTAMP, JSON_MAP_2],
  [UNIX_TIMESTAMP, JSON_MAP_N],
]

Each record is an individual entity represented in a JSON array that contains a UNIX_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin will looks as this:

[
  [1457108504,{"tag":"fluentbit","cpu_p":1.500000,"user_p":1,"system_p":0.500000}],
  [1457108505,{"tag":"fluentbit","cpu_p":4.500000,"user_p":3,"system_p":1.500000}],
  [1457108506,{"tag":"fluentbit","cpu_p":6.500000,"user_p":4.500000,"system_p":2}]
]

Null

The null output plugin just throws away events.

Configuration Parameters

The plugin doesn't support configuration parameters.

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit throws away events with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Forward

Forward is the protocol used by Fluentd to route messages between peers. The forward output plugin allows to provide interoperability between Fluent Bit and Fluentd. There are not configuration steps required besides to specify where Fluentd is located, it can be in the local host or a in a remote machine.

This plugin offers two different transports and modes:

Forward (TCP): It uses a plain TCP connection.
Secure Forward (TLS): when TLS is enabled, the plugin switch to Secure Forward mode.

Configuration Parameters

The following parameters are mandatory for either Forward for Secure Forward modes:

Key

Description

Default

Host

Target host where Fluent-Bit or Fluentd are listening for Forward messages.

127.0.0.1

Port

TCP Port of the target service.

24224

Time_as_Integer

Set timestamps in integer format, it enable compatibility mode for Fluentd v0.12 series.

False

Upstream

If Forward will connect to an Upstream instead of a simple host, this property defines the absolute path for the Upstream configuration file, for more details about this refer to the documentation section.

Send_options

Always send options (with "size"=count of messages)

False

Require_ack_response

Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. (Requires Fluentd v0.14.0+ server)

False

Secure Forward Mode Configuration Parameters

When using Secure Forward mode, the TLS mode requires to be enabled. The following additional configuration parameters are available:

Key

Description

Default

Shared_Key

A key string known by the remote Fluentd used for authorization.

Empty_Shared_Key

Use this option to connect to Fluentd with a zero-length secret.

False

Username

Specify the username to present to a Fluentd server that enables user_auth.

Password

Specify the password corresponding to the username.

Self_Hostname

Default value of the auto-generated certificate common name (CN).

tls

Enable or disable TLS support

Off

tls.verify

Force certificate validation

tls.debug

Set TLS debug verbosity level. It accept the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 Verbose

tls.ca_file

Absolute path to CA certificate file

tls.crt_file

Absolute path to Certificate file.

tls.key_file

Absolute path to private Key file.

tls.key_passwd

Optional password for tls.key_file file.

Forward Setup

Before proceeding, make sure that Fluentd is installed in your system, if it's not the case please refer to the following Fluentd Installation document and go ahead with that.

Once Fluentd is installed, create the following configuration file example that will allow us to stream data into it:

<source>
  type forward
  bind 0.0.0.0
  port 24224
</source>

<match fluent_bit>
  type stdout
</match>

That configuration file specifies that it will listen for TCP connections on the port 24224 through the forward input type. Then for every message with a fluent_bit TAG, will print the message to the standard output.

In one terminal launch Fluentd specifying the new configuration file created (in_fluent-bit.conf):

$ fluentd -c test.conf
2017-03-23 11:50:43 -0600 [info]: reading config file path="test.conf"
2017-03-23 11:50:43 -0600 [info]: starting fluentd-0.12.33
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-docker' version '0.1.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flatten-hash' version '0.2.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-influxdb' version '0.2.8'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-json-in-json' version '0.1.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-out-http' version '0.1.3'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-parser' version '0.6.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-record-reformer' version '0.7.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-stdin' version '0.1.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-td' version '0.10.27'
2017-03-23 11:50:43 -0600 [info]: adding match pattern="fluent_bit" type="stdout"
2017-03-23 11:50:43 -0600 [info]: adding source type="forward"
2017-03-23 11:50:43 -0600 [info]: using configuration file: <ROOT>
  <source>
    type forward
    bind 0.0.0.0
    port 24224
  </source>
  <match fluent_bit>
    type stdout
  </match>
</ROOT>
2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224

Fluent Bit + Forward Setup

Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:

bin/fluent-bit -i INPUT -o forward://HOST:PORT

If the TAG parameter is not set, the plugin will set the tag as fluent_bit. Keep in mind that TAG is important for routing rules inside Fluentd.

Using the CPU input plugin as an example we will flush CPU metrics to Fluentd:

$ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224

Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:

2017-03-23 11:53:06 -0600 fluent_bit: {"cpu_p":0.0,"user_p":0.0,"system_p":0.0,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":0.0,"cpu1.p_user":0.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 11:53:07 -0600 fluent_bit: {"cpu_p":2.25,"user_p":2.0,"system_p":0.25,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":3.0,"cpu3.p_user":2.0,"cpu3.p_system":1.0}
2017-03-23 11:53:08 -0600 fluent_bit: {"cpu_p":1.75,"user_p":1.0,"system_p":0.75,"cpu0.p_cpu":2.0,"cpu0.p_user":1.0,"cpu0.p_system":1.0,"cpu1.p_cpu":3.0,"cpu1.p_user":1.0,"cpu1.p_system":2.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0}

So we gathered CPU metrics and flushed them out to Fluentd properly.

Fluent Bit + Secure Forward Setup

DISCLAIMER: the following example do not consider the generation of certificates for a proper usage of production environments.

Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS. Above there is a minimalist configuration for testing purposes.

Fluent Bit

Paste this content in a file called flb.conf:

[SERVICE]
    Flush      5
    Daemon     off
    Log_Level  info

[INPUT]
    Name       cpu
    Tag        cpu_usage

[OUTPUT]
    Name          forward
    Match         *
    Host          127.0.0.1
    Port          24284
    Shared_Key    secret
    Self_Hostname flb.local
    tls           on
    tls.verify    off

Fluentd

Paste this content in a file called fld.conf:

<source>
  @type         secure_forward
  self_hostname myserver.local
  shared_key    secret
  secure no
</source>

<match **>
 @type stdout
</match>

If you're using Fluentd v1, set up it as below:

<source>
  @type forward
  <transport tls>
    cert_path /etc/td-agent/certs/fluentd.crt
    private_key_path /etc/td-agent/certs/fluentd.key
    private_key_passphrase password
  </transport>
  <security>
    self_hostname myserver.local
    shared_key secret
  </security>
</source>

<match **>
 @type stdout
</match>

Test Communication

Start Fluentd:

$ fluentd -c fld.conf

Start Fluent Bit:

$ fluent-bit -c flb.conf

After five seconds, Fluent Bit will write the records to Fluentd. In Fluentd output you will see a message like this:

2017-03-23 13:34:40 -0600 [info]: using configuration file: <ROOT>
  <source>
    @type secure_forward
    self_hostname myserver.local
    shared_key xxxxxx
    secure no
  </source>
  <match **>
    @type stdout
  </match>
</ROOT>
2017-03-23 13:34:41 -0600 cpu_usage: {"cpu_p":1.0,"user_p":0.75,"system_p":0.25,"cpu0.p_cpu":1.0,"cpu0.p_user":1.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":1.0,"cpu1.p_system":1.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 13:34:42 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.75,"system_p":0.0,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 13:34:43 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.25,"system_p":0.5,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0}
2017-03-23 13:34:44 -0600 cpu_usage: {"cpu_p":5.0,"user_p":3.25,"system_p":1.75,"cpu0.p_cpu":4.0,"cpu0.p_user":2.0,"cpu0.p_system":2.0,"cpu1.p_cpu":8.0,"cpu1.p_user":5.0,"cpu1.p_system":3.0,"cpu2.p_cpu":4.0,"cpu2.p_user":3.0,"cpu2.p_system":1.0,"cpu3.p_cpu":4.0,"cpu3.p_user":2.0,"cpu3.p_system":2.0}

Kubernetes

Fluent Bit Kubernetes Filter allows to enrich your log files with Kubernetes metadata.

When Fluent Bit is deployed in Kubernetes as a DaemonSet and configured to read the log files from the containers (using tail or systemd input plugins), this filter aims to perform the following operations:

Analyze the Tag and extract the following metadata:
- Pod Name
- Namespace
- Container Name
- Container ID
Query Kubernetes API Server to obtain extra metadata for the POD in question:
- Pod ID
- Labels
- Annotations

The data is cached locally in memory and appended to each record.

Configuration Parameters

The plugin supports the following configuration parameters:

Key

Description

Default

Buffer_Size

Set the buffer size for HTTP client when reading responses from Kubernetes API server. The value must be according to the specification.

32k

Kube_URL

API Server end-point

Kube_CA_File

CA certificate file

/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Kube_CA_Path

Absolute path to scan for certificate files

Kube_Token_File

Token file

/var/run/secrets/kubernetes.io/serviceaccount/token

Kube_Tag_Prefix

When the source records comes from Tail input plugin, this option allows to specify what's the prefix used in Tail configuration.

kube.var.log.containers.

Merge_Log

When enabled, it checks if the log field content is a JSON string map, if so, it append the map fields as part of the log structure.

Off

Merge_Log_Key

When Merge_Log is enabled, the filter tries to assume the log field from the incoming message is a JSON string message and make a structured representation of it at the same level of the log field in the map. Now if Merge_Log_Key is set (a string name), all the new structured fields taken from the original log content are inserted under the new key.

Merge_Log_Trim

When Merge_Log is enabled, trim (remove possible \n or \r) field values.

Merge_Parser

Optional parser name to specify how to parse the data contained in the log key. Recommended use is for developers or testing only.

Keep_Log

When Keep_Log is disabled, the log field is removed from the incoming message once it has been successfully merged (Merge_Log must be enabled as well).

tls.debug

Debug level between 0 (nothing) and 4 (every detail).

-1

tls.verify

When enabled, turns on certificate validation when connecting to the Kubernetes API server.

Use_Journal

When enabled, the filter reads logs coming in Journald format.

Off

Regex_Parser

Set an alternative Parser to process record Tag and extract pod_name, namespace_name, container_name and docker_id. The parser must be registered in a (refer to parser filter-kube-test as an example).

K8S-Logging.Parser

Allow Kubernetes Pods to suggest a pre-defined Parser (read more about it in Kubernetes Annotations section)

Off

K8S-Logging.Exclude

Allow Kubernetes Pods to exclude their logs from the log processor (read more about it in Kubernetes Annotations section).

Off

Labels

Include Kubernetes resource labels in the extra metadata.

Annotations

Include Kubernetes resource annotations in the extra metadata.

Kube_meta_preload_cache_dir

If set, Kubernetes meta-data can be cached/pre-loaded from files in JSON format in this directory, named as namespace-pod.meta

Dummy_Meta

If set, use dummy-meta data (for test/dev purposes)

Off

Processing the 'log' value

Kubernetes Filter aims to provide several ways to process the data contained in the log key. The following explanation of the workflow assumes that your original Docker parser defined in parsers.conf is as follows:

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On

Since Fluent Bit v1.2 we are not suggesting the use of decoders (Decode_Field_As) if you are using Elasticsearch database in the output to avoid data type conflicts.

To perform processing of the log key, it's mandatory to enable the Merge_Log configuration property in this filter, then the following processing order will be done:

If a Pod suggest a parser, the filter will use that parser to process the content of log.
If the option Merge_Parser was set and the Pod did not suggest a parser, process the log content using the suggested parser in the configuration.
If no Pod was suggested and no Merge_Parser is set, try to handle the content as JSON.

If log value processing fails, the value is untouched. The order above is not chained, meaning it's exclusive and the filter will try only one of the options above, not all of them.

Kubernetes Annotations

A flexible feature of Fluent Bit Kubernetes filter is that allow Kubernetes Pods to suggest certain behaviors for the log processor pipeline when processing the records. At the moment it support:

Suggest a pre-defined parser
Request to exclude logs

The following annotations are available:

Annotation

Description

Default

fluentbit.io/parser[_stream][-container]

Suggest a pre-defined parser. The parser must be registered already by Fluent Bit. This option will only be processed if Fluent Bit configuration (Kubernetes Filter) have enabled the option K8S-Logging.Parser. If present, the stream (stdout or stderr) will restrict that specific stream. If present, the container can override a specific container in a Pod.

fluentbit.io/exclude[_stream][-container]

Request to Fluent Bit to exclude or not the logs generated by the Pod. This option will only be processed if Fluent Bit configuration (Kubernetes Filter) have enabled the option K8S-Logging.Exclude.

False

Annotation Examples in Pod definition

Suggest a parser

The following Pod definition runs a Pod that emits Apache logs to the standard output, in the Annotations it suggest that the data should be processed using the pre-defined parser called apache:

apiVersion: v1
kind: Pod
metadata:
  name: apache-logs
  labels:
    app: apache-logs
  annotations:
    fluentbit.io/parser: apache
spec:
  containers:
  - name: apache
    image: edsiper/apache_logs

Request to exclude logs

There are certain situations where the user would like to request that the log processor simply skip the logs from the Pod in question:

apiVersion: v1
kind: Pod
metadata:
  name: apache-logs
  labels:
    app: apache-logs
  annotations:
    fluentbit.io/exclude: "true"
spec:
  containers:
  - name: apache
    image: edsiper/apache_logs

Note that the annotation value is boolean which can take a true or false and must be quoted.

Workflow of Tail + Kubernetes Filter

Kubernetes Filter depends on either Tail or Systemd input plugins to process and enrich records with Kubernetes metadata. Here we will explain the workflow of Tail and how it configuration is correlated with Kubernetes filter. Consider the following configuration example (just for demo purposes, not production):

[INPUT]
    Name    tail
    Tag     kube.*
    Path    /var/log/containers/*.log
    Parser  docker

[FILTER]
    Name             kubernetes
    Match            kube.*
    Kube_URL         https://kubernetes.default.svc:443
    Kube_CA_File     /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File  /var/run/secrets/kubernetes.io/serviceaccount/token
    Kube_Tag_Prefix  kube.var.log.containers.
    Merge_Log        On
    Merge_Log_Key    log_processed

In the input section, the Tail plugin will monitor all files ending in .log in path /var/log/containers/. For every file it will read every line and apply the docker parser. Then the records are emitted to the next step with an expanded tag.

Tail support Tags expansion, which means that if a tag have a star character (*), it will replace the value with the absolute path of the monitored file, so if you file name and path is:

/var/log/containers/apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

then the Tag for every record of that file becomes:

kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

note that slashes are replaced with dots.

When Kubernetes Filter runs, it will try to match all records that starts with kube. (note the ending dot), so records from the file mentioned above will hit the matching rule and the filter will try to enrich the records

Kubernetes Filter do not care from where the logs comes from, but it cares about the absolute name of the monitored file, because that information contains the pod name and namespace name that are used to retrieve associated metadata to the running Pod from the Kubernetes Master/API Server.

If the configuration property Kube_Tag_Prefix was configured (available on Fluent Bit >= 1.1.x), it will use that value to remove the prefix that was appended to the Tag in the previous Input section. Note that the configuration property defaults to _kube._var.logs.containers. , so the previous Tag content will be transformed from:

kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

to:

apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

the transformation above do not modify the original Tag, just creates a new representation for the filter to perform metadata lookup.

that new value is used by the filter to lookup the pod name and namespace, for that purpose it uses an internal Regular expression:

(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$

If you want to know more details, check the source code of that definition here.

You can see on Rublar.com web site how this operation is performed, check the following demo link:

https://rubular.com/r/HZz3tYAahj6JCd

Custom Regex

Under certain and not common conditions, a user would want to alter that hard-coded regular expression, for that purpose the option Regex_Parser can be used (documented on top).

Final Comments

So at this point the filter is able to gather the values of pod_name and namespace, with that information it will check in the local cache (internal hash table) if some metadata for that key pair exists, if so, it will enrich the record with the metadata value, otherwise it will connect to the Kubernetes Master/API Server and retrieve that information.

Helm Charts and Yaml

Fluent Bit configuration reader expects that every line in the configuration files, ends with a \n (LF or 0x10). When composing Yaml files for a Helm chart always enable the multiline mode, example:

apiVersion: v1
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush           2
        Daemon          off
        Log_Level       info
        Parsers_File    parsers.conf
        Plugins_File    plugins.conf
        HTTP_Server     On
        HTTP_Listen     0.0.0.0
        HTTP_Port       2020