Outputs

FlowCounter

FlowCounter is a protocol to count records. The flowcounter output plugin counts records and their size.

Configuration Parameters

The plugin supports the following configuration parameters:

Unit
    The unit of duration (second/minute/hour/day). Default: minute.

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count records with the following options:

$ fluent-bit -i cpu -o flowcounter

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name flowcounter
    Match *
    Unit second

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

$ fluent-bit -i cpu -o flowcounter  
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2016/12/23 11:01:20] [ info] [engine] started
[out_flowcounter] cpu.0:[1482458540, {"counts":60, "bytes":7560, "counts/minute":1, "bytes/minute":126 }]

NULL

The null output plugin just throws away events.

Configuration Parameters

The plugin doesn't support configuration parameters.

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit throw away events with the following options:

$ fluent-bit -i cpu -o null

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name null
    Match *

Slack

The Slack output plugin delivers records or messages to your preferred Slack channel. It formats the outgoing content in JSON format for readability.

This connector uses the Slack Incoming Webhooks feature to post messages to Slack channels. Using this plugin in conjunction with the Stream Processor is a good combination for alerting.

Slack Webhook

Before configuring this plugin, make sure to set up your Incoming Webhook. For detailed step-by-step instructions, review the following official documentation:

  • https://api.slack.com/messaging/webhooks#getting_started

Once you have obtained the Webhook address you can place it in the configuration below.

Configuration Parameters

webhook
    Absolute address of the Webhook provided by Slack.

Configuration File

Get started quickly with this configuration file:

[OUTPUT]
    name                 slack
    match                *
    webhook              https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

Azure Log Analytics

Send logs, metrics to Azure Log Analytics

The azure output plugin allows you to ingest your records into the Azure Log Analytics service.

To get more details about how to set up Azure Log Analytics, please refer to the official Azure documentation.

Configuration Parameters

Key
Description
default

Getting Started

In order to insert records into an Azure Log Analytics instance, you can run the plugin from the command line or through the configuration file:

Command Line

The azure plugin can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

Counter

Counter is a very simple plugin that counts how many records it receives at flush time. Plugin output is as follows:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count records with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

[TIMESTAMP, NUMBER_OF_RECORDS_NOW] (total = RECORDS_SINCE_IT_STARTED)
$ fluent-bit -i cpu -o counter
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  counter
    Match *
$ bin/fluent-bit -i cpu -o counter -f 1
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/07/19 11:19:02] [ info] [engine] started
1500484743,1 (total = 1)
1500484744,1 (total = 2)
1500484745,1 (total = 3)
1500484746,1 (total = 4)
1500484747,1 (total = 5)

Prometheus Exporter

An output plugin to expose Prometheus Metrics

The prometheus exporter allows you to take metrics from Fluent Bit and expose them such that a Prometheus instance can scrape them.

Important Note: The prometheus exporter only works with metric plugins, such as Node Exporter Metrics

host
    This is the address Fluent Bit will bind to when hosting prometheus metrics. Note: the listen parameter is deprecated from v1.9.0. Default: 0.0.0.0.

port
    This is the port Fluent Bit will bind to when hosting prometheus metrics. Default: 2021.

add_label
    This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields.

Getting Started

The Prometheus exporter only works with metrics captured from metric plugins. In the following example, host metrics are captured by the node exporter metrics plugin and then routed to the prometheus exporter. Within the output plugin, two labels are added: app="fluent-bit" and color="blue".

# Node Exporter Metrics + Prometheus Exporter
# -------------------------------------------
# The following example collect host metrics on Linux and expose
# them through a Prometheus HTTP end-point.
#
# After starting the service try it with:
#
# $ curl http://127.0.0.1:2021/metrics
#
[SERVICE]
    flush           1
    log_level       info

[INPUT]
    name            node_exporter_metrics
    tag             node_metrics
    scrape_interval 2

[OUTPUT]
    name            prometheus_exporter
    match           node_metrics
    host            0.0.0.0
    port            2021
    # add user-defined labels
    add_label       app fluent-bit
    add_label       color blue

Customer_ID
    Customer ID or WorkspaceID string.

Shared_Key
    The primary or the secondary Connected Sources client authentication key.

Log_Type
    The name of the event type. Default: fluentbit.

$ fluent-bit -i cpu -o azure -p customer_id=abc -p shared_key=def -m '*' -f 1
[INPUT]
    Name  cpu

[OUTPUT]
    Name        azure
    Match       *
    Customer_ID abc
    Shared_Key  def

Treasure Data

The td output plugin allows you to flush your records into the Treasure Data cloud service.

Configuration Parameters

The plugin supports the following configuration parameters:

API
    The API key. To obtain it, please log into the Treasure Data Console and, in the API keys box, copy the API key hash.

Database
    Specify the name of your target database.

Table
    Specify the name of your target table where the records will be stored.

Region
    Set the service region. Available values: US and JP. Default: US.

Getting Started

In order to start inserting records into Treasure Data, you can run the plugin from the command line or through the configuration file:

Command Line:

$ fluent-bit -i cpu -o td -p API="abc" -p Database="fluentbit" -p Table="cpu_samples"

Ideally you don't want to expose your API key on the command line; using a configuration file is highly recommended.

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name cpu
    Tag  my_cpu

[OUTPUT]
    Name     td
    Match    *
    API      5713/e75be23caee19f8041dfa635ddfbd0dcd8c8d981
    Database fluentbit
    Table    cpu_samples

File

The file output plugin allows you to write the data received through the input plugin to a file.

Configuration Parameters

The plugin supports the following configuration parameters:

Key
Description
Default

Format

out_file format

Output time, tag and JSON records. There are no configuration parameters for the out_file format.

plain format

Output the records as JSON (without additional tag and timestamp attributes). There are no configuration parameters for the plain format.

csv format

Output the records as CSV. CSV supports an additional configuration parameter.

Key
Description

ltsv format

Output the records as LTSV. LTSV supports an additional configuration parameter.

Key
Description

template format

Output the records using a custom format template.

Key
Description

This accepts a formatting template and fills placeholders using corresponding values in a record.

For example, if you set up the configuration as below:

You will get the following output:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit write records to a file with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Datadog

Send logs to Datadog

The Datadog output plugin allows you to ingest your logs into Datadog.

Before you begin, you need a Datadog account, a Datadog API key, and you need to activate Datadog Logs Management.

Configuration Parameters

Key
Description
Default

Configuration File

Get started quickly with this configuration file:

Troubleshooting

403 Forbidden

If you get a 403 Forbidden error response, double check that you have a valid Datadog API key and that you have activated Datadog Logs Management.

Google Cloud BigQuery

The BigQuery output plugin is an experimental plugin that allows you to stream records into the Google Cloud BigQuery service. The implementation does not support the following, which would be expected in a full production version:

  • Application Default Credentials.

  • Data deduplication using insertId.

  • Template tables using templateSuffix.

Google Cloud Configuration

Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Therefore, before using the BigQuery output plugin, you must create a service account, create a BigQuery dataset and table, authorize the service account to write to the table, and provide the service account credentials to Fluent Bit.

Creating a Service Account

To stream data into BigQuery, the first step is to create a Google Cloud service account for Fluent Bit.

Creating a BigQuery Dataset and Table

Fluent Bit does not create datasets or tables for your data, so you must create these ahead of time. You must also grant the service account WRITER permission on the dataset.

Within the dataset you will need to create a table for the data to reside in. You can follow these instructions for creating your table. Pay close attention to the schema: it must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs (e.g., mem or cpu).

Retrieving Service Account Credentials

The Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication. Download the credentials file by following these instructions.

Configuration Parameters

Key
Description
default

See Google's official documentation for further details.

Configuration File

If you are using a Google Cloud Credentials File, the following configuration is enough to get you started:

NATS

The nats output plugin allows you to flush your records into a NATS Server end point. The following instructions assume that you have a fully operational NATS Server in place.

In order to flush records, the nats plugin requires to know two parameters:

parameter
description
default

In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g:

Running

Fluent Bit only requires to know that it needs to use the nats output plugin; if no extra information is given, it will use the default values specified in the above table.

As described above, the target service and storage point can be changed, e.g:
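For example, based on the nats://host:port address format described above, a command that overrides the default host and port might look like the following (the address 192.168.1.3:4222 is a placeholder):

$ fluent-bit -i cpu -o nats://192.168.1.3:4222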

Data format

For every set of records flushed to a NATS Server, Fluent Bit uses the following JSON format:

Each record is an individual entity represented in a JSON array that contains a UNIX_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin looks like this:

Path
    Directory path to store files. If not set, Fluent Bit will write the files in its own positioned directory. Note: this option was added in Fluent Bit v1.4.6.

File
    Set the file name to store the records. If not set, the file name will be the tag associated with the records.

Format
    The format of the file content. See also the Format section. Default: out_file.

Mkdir
    Recursively create the output directory if it does not exist. Permissions are set to 0755.

Workers
    Enables dedicated thread(s) for this output. The default value of 1 is set since version 1.8.13; for previous versions it is 0.

tag: [time, {"key1":"value1", "key2":"value2", "key3":"value3"}]
{"key1":"value1", "key2":"value2", "key3":"value3"}

Delimiter

The character to separate each data. Default: ','

time[delimiter]"value1"[delimiter]"value2"[delimiter]"value3"

Delimiter

The character to separate each pair. Default: '\t'(TAB)

Label_Delimiter

The character to separate label and the value. Default: ':'

field1[label_delimiter]value1[delimiter]field2[label_delimiter]value2

Template

The format string. Default: '{time} {message}'

[INPUT]
  Name mem

[OUTPUT]
  Name file
  Format template
  Template {time} used={Mem.used} free={Mem.free} total={Mem.total}
1564462620.000254 used=1045448 free=31760160 total=32805608
$ fluent-bit -i cpu -o file -p path=output.txt
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name file
    Match *
    Path output_dir

Host
    Required - The Datadog server where you are sending your logs. Default: http-intake.logs.datadoghq.com.

TLS
    Required - End-to-end communications security protocol. Datadog recommends setting this to on. Default: off.

compress
    Recommended - Compresses the payload in GZIP format. Datadog supports and recommends setting this to gzip.

apikey
    Required - Your Datadog API key.

Proxy
    Optional - Specify an HTTP Proxy. The expected format of this value is http://host:port. Note that https is not supported yet.

provider
    To activate the remapping, specify configuration flag provider with value ecs.

json_date_key
    Date key name for output. Default: timestamp.

include_tag_key
    If enabled, a tag is appended to output. The key name is set by the tag_key property. Default: false.

tag_key
    The key name of the tag. If include_tag_key is false, this property is ignored. Default: tagkey.

dd_service
    Recommended - The human readable name for your service generating the logs, such as the name of your application or database.

dd_source
    Recommended - A human readable name for the underlying technology of your service. For example, postgres or nginx.

dd_tags
    Optional - The tags you want to assign to your logs in Datadog.

dd_message_key
    By default, the plugin searches for the key 'log' and remaps the value to the key 'message'. If this property is set, the plugin will search for the specified key name instead.

[OUTPUT]
    Name        datadog
    Match       *
    Host        http-intake.logs.datadoghq.com
    TLS         on
    compress    gzip
    apikey      <my-datadog-api-key>
    dd_service  <my-app-service>
    dd_source   <my-app-source>
    dd_tags     team:logs,foo:bar

google_service_credentials
    Absolute path to a Google Cloud credentials JSON file. Default: value of the environment variable $GOOGLE_SERVICE_CREDENTIALS.

project_id
    The project id containing the BigQuery dataset to stream into. Default: the value of the project_id in the credentials file.

dataset_id
    The dataset id of the BigQuery dataset to write into. This dataset must exist in your project.

table_id
    The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output.

skip_invalid_rows
    Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. Default: Off.

ignore_unknown_values
    Accept rows that contain values that do not match the schema. The unknown values are ignored. The default is false, which treats unknown values as errors. Default: Off.

[INPUT]
    Name  dummy
    Tag   dummy

[OUTPUT]
    Name       bigquery
    Match      *
    dataset_id my_dataset
    table_id   dummy_table

host
    IP address or hostname of the NATS Server. Default: 127.0.0.1.

port
    TCP port of the target NATS Server. Default: 4222.

nats://host:port
$ bin/fluent-bit -i cpu -o nats -V -f 5
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2016/03/04 10:17:33] [ info] Configuration
flush time     : 5 seconds
input plugins  : cpu
collectors     :
[2016/03/04 10:17:33] [ info] starting engine
cpu[all] all=3.250000 user=2.500000 system=0.750000
cpu[i=0] all=3.000000 user=1.000000 system=2.000000
cpu[i=1] all=3.000000 user=2.000000 system=1.000000
cpu[i=2] all=2.000000 user=2.000000 system=0.000000
cpu[i=3] all=6.000000 user=5.000000 system=1.000000
[2016/03/04 10:17:33] [debug] [in_cpu] CPU 3.25%
...
[
  [UNIX_TIMESTAMP, JSON_MAP_1],
  [UNIX_TIMESTAMP, JSON_MAP_2],
  [UNIX_TIMESTAMP, JSON_MAP_N],
]
[
  [1457108504,{"tag":"fluentbit","cpu_p":1.500000,"user_p":1,"system_p":0.500000}],
  [1457108505,{"tag":"fluentbit","cpu_p":4.500000,"user_p":3,"system_p":1.500000}],
  [1457108506,{"tag":"fluentbit","cpu_p":6.500000,"user_p":4.500000,"system_p":2}]
]

Standard Output

The stdout output plugin prints the data received through the input plugin to the standard output. Its usage is very simple:

Configuration Parameters

Format
    Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream. Default: msgpack.

json_date_key
    Specify the name of the time key in the output record. To disable the time key just set the value to false. Default: date.

json_date_format
    Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681). Default: double.

Workers
    Enables dedicated thread(s) for this output. The default value of 1 is set since version 1.8.13; for previous versions it is 0.

Command Line

$ bin/fluent-bit -i cpu -o stdout -v

We have specified to gather CPU usage metrics and print them out to the standard output in a human readable way:

$ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2016/10/07 21:52:01] [ info] [engine] started
[0] cpu.0: [1475898721, {"cpu_p"=>0.500000, "user_p"=>0.250000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>1.000000}]
[1] cpu.0: [1475898722, {"cpu_p"=>0.250000, "user_p"=>0.250000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>1.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
[2] cpu.0: [1475898723, {"cpu_p"=>0.750000, "user_p"=>0.250000, "system_p"=>0.500000, "cpu0.p_cpu"=>2.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>1.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
[3] cpu.0: [1475898724, {"cpu_p"=>1.000000, "user_p"=>0.750000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>2.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>1.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>1.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]

No more, no less, it just works.

Amazon Kinesis Data Streams

Send logs to Amazon Kinesis Streams

The Amazon Kinesis Data Streams output plugin allows you to ingest your records into the Kinesis Data Streams service.

This is the documentation for the core Fluent Bit Kinesis plugin written in C. It has all the core features of the Golang Fluent Bit plugin released in 2019. The Golang plugin was named kinesis; this new high performance and highly efficient kinesis plugin is called kinesis_streams to prevent conflicts/confusion.

See here for details on how AWS credentials are fetched.

Configuration Parameters

Key
Description

Getting Started

In order to send records into Amazon Kinesis Data Streams, you can run the plugin from the command line or through the configuration file:

Command Line

The kinesis_streams plugin can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Output section:

Worker support

Fluent Bit 1.7 adds a new feature called workers which enables outputs to have dedicated threads. This kinesis_streams plugin fully supports workers.

Example:

If you enable a single worker, you are enabling a dedicated thread for your Kinesis output. We recommend starting without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient.

AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and these plugins.

GitHub

Amazon ECR Public Gallery

Our images are available in the Amazon ECR Public Gallery. You can download images with different tags using the following command:

For example, you can pull the image with the latest version by:

If you see errors for image pull limits, try logging into public ECR with your AWS credentials:

You can check the Amazon ECR Public official doc for more details.

Docker Hub

Amazon ECR

You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

For more see the AWS for Fluent Bit github repo.

New Relic

New Relic is a data management platform that gives you real-time insights into your data for developers, operations and management teams.

The Fluent Bit nrlogs output plugin allows you to send your logs to the New Relic service.

Before getting started with the plugin configuration, make sure to obtain the proper account to get access to the service. You can register and start with a free trial at the New Relic Sign Up page.

Configuration Parameters

The following configuration example will emit a dummy example record and ingest it on New Relic. Copy and paste the following content in a file called newrelic.conf:

Run Fluent Bit with the new configuration file:

Fluent Bit output:

Kafka REST Proxy

The kafka-rest output plugin allows you to flush your records into a Kafka REST Proxy server. The following instructions assume that you have a fully operational Kafka REST Proxy and Kafka services running in your environment.

Configuration Parameters

Key
Description
default

TLS / SSL

The Kafka REST Proxy output plugin supports TLS/SSL. For more details about the properties available and general configuration, please refer to the TLS/SSL section.
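As a sketch, a kafka-rest output combined with Fluent Bit's generic TLS options (documented under TLS Configuration Parameters in the TCP & TLS section) might look like the following; the host and CA file path are placeholders:

[OUTPUT]
    Name        kafka-rest
    Match       *
    Host        kafka-rest.example.com
    Port        8082
    Topic       fluent-bit
    tls         on
    tls.verify  on
    tls.ca_file /etc/ssl/certs/ca-bundle.crt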

Getting Started

In order to insert records into a Kafka REST Proxy service, you can run the plugin from the command line or through the configuration file:

Command Line

The kafka-rest plugin can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

base_uri
    Full address of the New Relic API end-point. By default the value points to the US end-point. If you want to use the EU end-point you can set this key to the following value: https://log-api.eu.newrelic.com/log/v1. Default: https://log-api.newrelic.com/log/v1.

api_key
    Your key for data ingestion. The API key is also called the ingestion key; you can get more details on how to generate it in the official documentation. From a configuration perspective either an api_key or a license_key is required. New Relic suggests using the api_key primarily.

license_key
    Optional authentication parameter for data ingestion. Note that New Relic suggests using the api_key instead. You can read more about the License Key in the official documentation.

compress
    Set the compression mechanism for the payload. This option allows two values: gzip (enabled by default) or false to disable compression. Default: gzip.

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      dummy
    dummy     {"message":"a simple message", "temp": "0.74", "extra": "false"}
    samples   1

[OUTPUT]
    name      nrlogs
    match     *
    api_key   YOUR_API_KEY_HERE
$ fluent-bit -c newrelic.conf
Fluent Bit v1.5.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/04/10 10:58:32] [ info] [storage] version=1.0.3, initializing...
[2020/04/10 10:58:32] [ info] [storage] in-memory
[2020/04/10 10:58:32] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/04/10 10:58:32] [ info] [engine] started (pid=2772591)
[2020/04/10 10:58:32] [ info] [output:newrelic:newrelic.0] configured, hostname=log-api.newrelic.com:443
[2020/04/10 10:58:32] [ info] [sp] stream processor started
[2020/04/10 10:58:35] [ info] [output:nrlogs:nrlogs.0] log-api.newrelic.com:443, HTTP status=202
{"requestId":"feb312fe-004e-b000-0000-0171650764ac"}

Host
    IP address or hostname of the target Kafka REST Proxy server. Default: 127.0.0.1.

Port
    TCP port of the target Kafka REST Proxy server. Default: 8082.

Topic
    Set the Kafka topic. Default: fluent-bit.

Partition
    Set the partition number (optional).

Message_Key
    Set a message key (optional).

Time_Key
    The Time_Key property defines the name of the field that holds the record timestamp. Default: @timestamp.

Time_Key_Format
    Defines the format of the timestamp. Default: %Y-%m-%dT%H:%M:%S.

Include_Tag_Key
    Append the Tag name to the final record. Default: Off.

Tag_Key
    If Include_Tag_Key is enabled, this property defines the key name for the tag. Default: _flb-key.

$ fluent-bit -i cpu -t cpu -o kafka-rest -p host=127.0.0.1 -p port=8082 -m '*'
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name        kafka-rest
    Match       *
    Host        127.0.0.1
    Port        8082
    Topic       fluent-bit
    Message_Key my_key

TCP & TLS

The tcp output plugin allows you to send records to a remote TCP server. The payload can be formatted in different ways as required.

Configuration Parameters

Host
    Target host where Fluent Bit or Fluentd are listening for Forward messages. Default: 127.0.0.1.

Port
    TCP port of the target service. Default: 5170.

Format
    Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream. Default: msgpack.

json_date_key
    Specify the name of the time key in the output record. To disable the time key just set the value to false. Default: date.

json_date_format
    Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681). Default: double.

Workers
    Enables dedicated thread(s) for this output. The default value of 2 is set since version 1.8.13; for previous versions it is 0.

TLS Configuration Parameters

The following parameters are available to configure a secure channel connection through TLS:

tls
    Enable or disable TLS support. Default: Off.

tls.verify
    Force certificate validation. Default: On.

tls.debug
    Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose). Default: 1.

tls.ca_file
    Absolute path to CA certificate file.

tls.crt_file
    Absolute path to Certificate file.

tls.key_file
    Absolute path to private Key file.

tls.key_passwd
    Optional password for tls.key_file file.

Command Line

$ bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v

We have specified to gather CPU usage metrics and send them in JSON lines mode to a remote end-point using the netcat service, e.g:

Start the TCP listener

Run the following in a separate terminal, netcat will start listening for messages on TCP port 5170

$ nc -l 5170

Start Fluent Bit

$ bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v

In the netcat terminal you will see the CPU metric records arriving as JSON lines, one record per line.

region

The AWS region.

stream

The name of the Kinesis data stream that you want log records sent to.

time_key

Add the timestamp to the record under this key. By default the timestamp from Fluent Bit will not be added to records sent to Kinesis.

time_key_format

strftime compliant format string for the timestamp; for example, the default is '%Y-%m-%dT%H:%M:%S'. This option is used with time_key.

log_key

By default, the whole log record will be sent to Kinesis. If you specify a key name with this option, then only the value of that key will be sent to Kinesis. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to Kinesis.

role_arn

ARN of an IAM role to assume (for cross account access).

endpoint

Specify a custom endpoint for the Kinesis API.

sts_endpoint

Custom endpoint for the STS API.

auto_retry_requests

Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues.

$ fluent-bit -i cpu -o kinesis_streams -p stream=my-stream -p region=us-west-2 -m '*' -f 1
[OUTPUT]
    Name  kinesis_streams
    Match *
    region us-east-1
    stream my-stream
[OUTPUT]
    Name  kinesis_streams
    Match *
    region us-east-1
    stream my-stream
    workers 2
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/

Amazon Kinesis Data Firehose

Send logs to Amazon Kinesis Firehose

The Amazon Kinesis Data Firehose output plugin allows you to ingest your records into the Firehose service.

This is the documentation for the core Fluent Bit Firehose plugin written in C. It can replace the aws/amazon-kinesis-firehose-for-fluent-bit Golang Fluent Bit plugin released last year. The Golang plugin was named firehose; this new high performance and highly efficient firehose plugin is called kinesis_firehose to prevent conflicts/confusion.

See here for details on how AWS credentials are fetched.

Configuration Parameters

Key
Description

region

The AWS region.

delivery_stream

The name of the Kinesis Firehose Delivery stream that you want log records sent to.

time_key

Add the timestamp to the record under this key. By default the timestamp from Fluent Bit will not be added to records sent to Kinesis.

time_key_format

strftime compliant format string for the timestamp; for example, the default is '%Y-%m-%dT%H:%M:%S'. This option is used with time_key.

log_key

By default, the whole log record will be sent to Firehose. If you specify a key name with this option, then only the value of that key will be sent to Firehose. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to Firehose.

role_arn

ARN of an IAM role to assume (for cross account access).

endpoint

Specify a custom endpoint for the Firehose API.

sts_endpoint

Custom endpoint for the STS API.

auto_retry_requests

Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues.

Getting Started

In order to send records into Amazon Kinesis Data Firehose, you can run the plugin from the command line or through the configuration file:

Command Line

The firehose plugin can read the parameters from the command line through the -p argument (property), e.g:

$ fluent-bit -i cpu -o kinesis_firehose -p delivery_stream=my-stream -p region=us-west-2 -m '*' -f 1

Configuration File

In your main configuration file append the following Output section:

[OUTPUT]
    Name  kinesis_firehose
    Match *
    region us-east-1
    delivery_stream my-stream

Worker support

Fluent Bit 1.7 adds a new feature called workers which enables outputs to have dedicated threads. This kinesis_firehose plugin fully supports workers.

Example:

[OUTPUT]
    Name  kinesis_firehose
    Match *
    region us-east-1
    delivery_stream my-stream
    workers 2

If you enable a single worker, you are enabling a dedicated thread for your Firehose output. We recommend starting without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient.

AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and these plugins.

GitHub

github.com/aws/aws-for-fluent-bit

Amazon ECR Public Gallery

aws-for-fluent-bit

Our images are available in the Amazon ECR Public Gallery. You can download images with different tags using the following command:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>

For example, you can pull the image with the latest version by:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest

If you see errors for image pull limits, try logging into public ECR with your AWS credentials:

aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

You can check the Amazon ECR Public official doc for more details.

Docker Hub

amazon/aws-for-fluent-bit

Amazon ECR

You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/

For more information, see the AWS for Fluent Bit github repo.

Prometheus Remote Write

An output plugin to submit Prometheus Metrics using the remote write protocol

The prometheus remote write plugin allows you to take metrics from Fluent Bit and submit them to a Prometheus server through the remote write mechanism.

Important Note: The prometheus remote write plugin only works with metric plugins, such as Node Exporter Metrics.

host
    IP address or hostname of the target HTTP Server. Default: 127.0.0.1.

http_user
    Basic Auth Username.

http_passwd
    Basic Auth Password. Requires http_user to be set.

port
    TCP port of the target HTTP Server. Default: 80.

proxy
    Specify an HTTP Proxy. The expected format of this value is http://host:port. Note that https is not supported yet. Please consider not setting this and using the HTTP_PROXY environment variable instead, which supports both http and https.

uri
    Specify an optional HTTP URI for the target web server, e.g: /something. Default: /.

header
    Add a HTTP header key/value pair. Multiple headers can be set.

log_response_payload
    Log the response payload within the Fluent Bit log. Default: false.

add_label
    This allows you to add custom labels to all metrics submitted through this plugin. You may have multiple of these fields.

Workers
    Enables dedicated thread(s) for this output. The default value of 2 is set since version 1.8.13; for previous versions it is 0.

Getting Started

The Prometheus remote write plugin only works with metrics collected by one of the metric input plugins. In the following example, host metrics are collected by the node exporter metrics plugin and then delivered by the prometheus remote write output plugin.

# Node Exporter Metrics + Prometheus remote write output plugin
# -------------------------------------------
# The following example collects host metrics on Linux and delivers
# them through the Prometheus remote write plugin to new relic :
#
[SERVICE]
    Flush                1
    Log_level            info

[INPUT]
    Name                 node_exporter_metrics
    Tag                  node_metrics
    Scrape_interval      2

[OUTPUT]
    Name                 prometheus_remote_write
    Match                node_metrics
    Host                 metric-api.newrelic.com
    Port                 443
    Uri                  /prometheus/v1/write?prometheus_server=YOUR_DATA_SOURCE_NAME
    Header               Authorization Bearer YOUR_LICENSE_KEY
    Log_response_payload True
    Tls                  On
    Tls.verify           On
    # add user-defined labels
    add_label            app fluent-bit
    add_label            color blue

# Note : it would be necessary to replace both YOUR_DATA_SOURCE_NAME and YOUR_LICENSE_KEY
# with real values for this example to work.

Examples

The following are examples of using Prometheus remote write with the hosted services below.

Grafana Cloud

With Grafana Cloud hosted metrics you will need to use the specific host that is mentioned as well as specify the HTTP username and password given within the Grafana Cloud page.

[OUTPUT]
    name prometheus_remote_write
    host prometheus-us-central1.grafana.net
    match *
    uri /api/prom/push
    port 443
    tls on
    tls.verify on
    http_user <GRAFANA Username>
    http_passwd <GRAFANA Password>

Logz.io Infrastructure Monitoring

With Logz.io hosted prometheus you will need to make use of the header option and add the Authorization Bearer with the proper key. The host and port may also differ within your specific hosted instance.

[OUTPUT]
    name prometheus_remote_write
    host listener.logz.io
    port 8053 
    match *
    header Authorization Bearer <LOGZIO Key>
    tls on
    tls.verify on
    log_response_payload true

Coralogix

With Coralogix Metrics you may need to customize the URI. Additionally, you will make use of the header key with your Coralogix private key.

[OUTPUT]
    name prometheus_remote_write
    host metrics-api.coralogix.com
    uri prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity 
    match *
    port 443
    tls on
    tls.verify on
    header Authorization Bearer <CORALOGIX Key>

Azure Blob

Official and Microsoft Certified Azure Storage Blob connector

The Azure Blob output plugin allows ingesting your records into the Azure Blob Storage service. This connector is designed to use the Append Blob and Block Blob API.

Our plugin works with the official Azure Service and also can be configured to be used with a service emulator such as Azurite.

Azure Storage Account

Before getting started, make sure you already have an Azure Storage account. As a reference, the official Azure documentation explains step-by-step how to set up your account.

Configuration Parameters

We expose different configuration properties. The following table lists all the options available, and the next section has specific configuration details for the official service or the emulator.

Key
Description
default

Getting Started

As mentioned above, you can either deliver records to the official service or an emulator. Below we have an example for each use case.

Configuration for Azure Storage Service

The following configuration example generates a random message with a custom tag:
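A minimal sketch, assuming the output plugin name azure_blob and using only the parameters documented in this section; the account name, shared key and container are placeholders, and the dummy input generates the random message with a custom tag:

[INPUT]
    name    dummy
    dummy   {"name": "Fluent Bit", "year": 2020}
    samples 1
    tag     var.log.containers.app-default-96cbdef2340.log

[OUTPUT]
    name           azure_blob
    match          *
    account_name   YOUR_ACCOUNT_NAME
    shared_key     YOUR_SHARED_KEY
    container_name logtest
    blob_type      appendblob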

After you run the configuration file above, you will be able to query the data using the Azure Storage Explorer.

Configuring and using Azure Emulator: Azurite

Install and run Azurite

The quickest way to get started is to install Azurite using npm:
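For example, installing it globally:

$ npm install -g azurite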

then run the service:
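Running the azurite command with no arguments starts all of its services; the Blob service listens on port 10000 by default:

$ azurite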

Configuring Fluent Bit for Azurite

Azurite comes with a default account_name and shared_key, so make sure to use the specific values provided in the example below (do an exact copy/paste):
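A minimal sketch using Azurite's well-known development storage account and key (these are the public defaults shipped with the emulator); note that the emulator_mode and endpoint options used here are assumptions and are not part of the parameter table in this section:

[INPUT]
    name    dummy
    dummy   {"name": "Fluent Bit", "year": 2020}

[OUTPUT]
    name           azure_blob
    match          *
    account_name   devstoreaccount1
    shared_key     Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
    container_name logtest
    blob_type      appendblob
    # assumed emulator-specific options, not listed in the table above
    emulator_mode  on
    endpoint       http://127.0.0.1:10000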

After running that Fluent Bit configuration, you will see the data flowing into Azurite.

Kafka

The kafka output plugin allows you to ingest your records into an Apache Kafka service. This plugin uses the official librdkafka C library (as a built-in dependency).

Configuration Parameters

Key
Description
default

Setting rdkafka.log.connection.close to false and rdkafka.request.required.acks to 1 are examples of recommended settings of librdkafka properties.
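For example, a minimal output section applying those two properties (the broker address is a placeholder):

[OUTPUT]
    Name    kafka
    Match   *
    Brokers 192.168.1.3:9092
    Topics  test
    rdkafka.log.connection.close false
    rdkafka.request.required.acks 1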

Getting Started

In order to insert records into Apache Kafka, you can run the plugin from the command line or through the configuration file:

Command Line

The kafka plugin can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

Avro Support

Fluent Bit comes with support for Avro encoding for the out_kafka plugin. Avro support is optional and must be activated at build time by using a build definition with cmake: -DFLB_AVRO_ENCODER=On, such as in the following example which activates:

  • out_kafka with avro encoding

  • fluent-bit's prometheus

  • metrics via an embedded http endpoint

  • debugging support

  • builds the test suites

Kafka Configuration File with Avro Encoding

This example fluent-bit config tails Kubernetes logs, decorates the log lines with Kubernetes metadata via the kubernetes filter, and then sends the fully decorated log lines to a Kafka broker encoded with a specific Avro schema.

LogDNA

LogDNA is an intuitive cloud-based log management system that provides you an easy interface to query your logs once they are stored.

The Fluent Bit logdna output plugin allows you to send your logs or events to a LogDNA compliant service.

Before getting started with the plugin configuration, make sure to obtain the proper account to get access to the service. You can start with a free trial on the LogDNA site.

Configuration Parameters

Key
Description
Default

Auto Enrichment & Data Discovery

One of the features of Fluent Bit + LogDNA integration is the ability to auto enrich each record with further context.

When the plugin processes each record (or log), it tries to look up specific key names that might contain context for the record in question. The following table describes the keys and the discovery logic:

Key
Description

Getting Started

The following configuration example will emit a dummy example record and ingest it on LogDNA. Copy and paste the following content in a file called logdna.conf:
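A minimal sketch of such a file, assuming the plugin exposes api_key, hostname, app and tags properties; the API key value is a placeholder, and the tags match the aa and bb tags referenced later in this section:

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name    dummy
    dummy   {"log": "a simple log message", "context": "something else"}
    samples 1

[OUTPUT]
    name     logdna
    match    *
    api_key  YOUR_INGESTION_KEY_HERE
    hostname my-hostname
    app      my-app
    tags     aa, bb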

Run Fluent Bit with the new configuration file:
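Assuming the file name used above:

$ fluent-bit -c logdna.conf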

Fluent Bit output:

Your record will be available and visible in your LogDNA dashboard after a few seconds.

Query your Data in LogDNA

In your LogDNA dashboard, go to the top filters and mark the Tags aa and bb, then you will be able to see your records as the example below:

Syslog

The Syslog output plugin allows you to deliver messages to Syslog servers. It supports RFC3164 and RFC5424 formats through different transports such as UDP, TCP or TLS.

As of Fluent Bit v1.5.3 the configuration is very strict. You must be aware of the structure of your original record so you can configure the plugin to use specific keys to compose your outgoing Syslog message.

Future versions of Fluent Bit are expanding this plugin feature set to support better handling of keys and message composing.

Configuration Parameters

Key
Description
Default

Examples

Configuration File

Get started quickly with this configuration file:

Structured Data

The following is an example of how to configure the syslog_sd_key to send Structured Data to the remote Syslog server.

Example log:

Example configuration file:

Example output:

host
    Domain or IP address of the remote Syslog server. Default: 127.0.0.1.

port
    TCP or UDP port of the remote Syslog server. Default: 514.

mode
    Desired transport type. Available options are tcp, tls and udp. Default: udp.

syslog_format
    The Syslog protocol format to use. Available options are rfc3164 and rfc5424. Default: rfc5424.

syslog_maxsize
    The maximum size allowed per message. The value must be an integer representing the number of bytes allowed. If no value is provided, the default size is set depending on the protocol version specified by syslog_format: rfc3164 sets the max size to 1024 bytes, rfc5424 sets it to 2048 bytes.

syslog_severity_key
    The key name from the original record that contains the Syslog severity number. This configuration is optional.

syslog_facility_key
    The key name from the original record that contains the Syslog facility number. This configuration is optional.

syslog_hostname_key
    The key name from the original record that contains the hostname that generated the message. This configuration is optional.

syslog_appname_key
    The key name from the original record that contains the application name that generated the message. This configuration is optional.

syslog_procid_key
    The key name from the original record that contains the Process ID that generated the message. This configuration is optional.

syslog_msgid_key
    The key name from the original record that contains the Message ID associated to the message. This configuration is optional.

syslog_sd_key
    The key name from the original record that contains the Structured Data (SD) content. This configuration is optional.

syslog_message_key
    The key name from the original record that contains the message to deliver. Note that this property is mandatory, otherwise the message will be empty.

[OUTPUT]
    name                 syslog
    match                *
    host                 syslog.yourserver.com
    port                 514
    mode                 udp
    syslog_format        rfc5424
    syslog_maxsize       2048
    syslog_severity_key  severity
    syslog_facility_key  facility
    syslog_hostname_key  hostname
    syslog_appname_key   appname
    syslog_procid_key    procid
    syslog_msgid_key     msgid
    syslog_sd_key        sd
    syslog_message_key   message
{
    "hostname": "myhost",
    "appname": "myapp",
    "procid": "1234",
    "msgid": "ID98",
    "uls@0": {
        "logtype": "access",
        "clustername": "mycluster",
        "namespace": "mynamespace"
    },
    "log": "Sample app log message."
}
[OUTPUT]
    name                 syslog
    match                *
    host                 syslog.yourserver.com
    port                 514
    mode                 udp
    syslog_format        rfc5424
    syslog_maxsize       2048
    syslog_hostname_key  hostname
    syslog_appname_key   appname
    syslog_procid_key    procid
    syslog_msgid_key     msgid    
    syslog_sd_key        uls@0
    syslog_message_key   log
<14>1 2021-07-12T14:37:35.569848Z myhost myapp 1234 ID98 [uls@0 logtype="access" clustername="mycluster" namespace="mynamespace"] Sample app log message.

format
    Specify data format. Options available: json, msgpack. Default: json.

message_key
    Optional key to store the message.

message_key_field
    If set, the value of Message_Key_Field in the record will indicate the message key. If not set nor found in the record, Message_Key will be used (if set).

timestamp_key
    Set the key to store the record timestamp. Default: @timestamp.

timestamp_format
    'iso8601' or 'double'. Default: double.

brokers
    Single or multiple list of Kafka Brokers, e.g: 192.168.1.3:9092, 192.168.1.4:9092.

topics
    Single entry or list of topics separated by comma (,) that Fluent Bit will use to send messages to Kafka. If only one topic is set, that one will be used for all records. Instead, if multiple topics exist, the one set in the record by Topic_Key will be used. Default: fluent-bit.

topic_key
    If multiple Topics exist, the value of Topic_Key in the record will indicate the topic to use. E.g: if Topic_Key is router and the record is {"key1": 123, "router": "route_2"}, Fluent Bit will use topic route_2. Note that if the value of Topic_Key is not present in Topics, then by default the first topic in the Topics list will indicate the topic to be used.

dynamic_topic
    Adds unknown topics (found in Topic_Key) to Topics, so in Topics only a default topic needs to be configured. Default: Off.

queue_full_retries
    Fluent Bit queues data into the rdkafka library; if for some reason the underlying library cannot flush the records, the queue might fill up, blocking the addition of new records. The queue_full_retries option sets the number of local retries to enqueue the data. The default value is 10 times, and the interval between each retry is 1 second. Setting queue_full_retries to 0 sets an unlimited number of retries. Default: 10.

rdkafka.{property}
    {property} can be any librdkafka property.

$ fluent-bit -i cpu -o kafka -p brokers=192.168.1.3:9092 -p topics=test
[INPUT]
    Name  cpu

[OUTPUT]
    Name        kafka
    Match       *
    Brokers     192.168.1.3:9092
    Topics      test
cmake -DFLB_DEV=On -DFLB_OUT_KAFKA=On -DFLB_TLS=On -DFLB_TESTS_RUNTIME=On -DFLB_TESTS_INTERNAL=On -DCMAKE_BUILD_TYPE=Debug -DFLB_HTTP_SERVER=true -DFLB_AVRO_ENCODER=On ../
[INPUT]
    Name              tail
    Tag               kube.*
    Alias             some-alias
    Path              /logdir/*.log
    DB                /dbdir/some.db
    Skip_Long_Lines   On
    Refresh_Interval  10
    Parser some-parser

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://some_kube_api:443
    Kube_CA_File        /certs/ca.crt
    Kube_Token_File     /tokens/token
    Kube_Tag_Prefix     kube.var.log.containers.
    Merge_Log           On
    Merge_Log_Key       log_processed

[OUTPUT]
    Name        kafka
    Match       *
    Brokers     192.168.1.3:9092
    Topics      test
    Schema_str  {"name":"avro_logging","type":"record","fields":[{"name":"timestamp","type":"string"},{"name":"stream","type":"string"},{"name":"log","type":"string"},{"name":"kubernetes","type":{"name":"krec","type":"record","fields":[{"name":"pod_name","type":"string"},{"name":"namespace_name","type":"string"},{"name":"pod_id","type":"string"},{"name":"labels","type":{"type":"map","values":"string"}},{"name":"annotations","type":{"type":"map","values":"string"}},{"name":"host","type":"string"},{"name":"container_name","type":"string"},{"name":"docker_id","type":"string"},{"name":"container_hash","type":"string"},{"name":"container_image","type":"string"}]}},{"name":"cluster_name","type":"string"},{"name":"fabric","type":"string"}]}
    Schema_id some_schema_id
    rdkafka.client.id some_client_id
    rdkafka.debug All
    rdkafka.enable.ssl.certificate.verification true

    rdkafka.ssl.certificate.location /certs/some.cert
    rdkafka.ssl.key.location /certs/some.key
    rdkafka.ssl.ca.location /certs/some-bundle.crt
    rdkafka.security.protocol ssl
    rdkafka.request.required.acks 1
    rdkafka.log.connection.close false

    Format avro
    rdkafka.log_level 7
    rdkafka.metadata.broker.list 192.168.1.3:9092

WebSocket

The websocket output plugin allows you to flush your records into a WebSocket endpoint. For now the functionality is pretty basic: it issues an HTTP GET request to do the handshake, and then uses TCP connections to send the data records in either JSON or MessagePack format.

Configuration Parameters

Host
    IP address or hostname of the target WebSocket Server. Default: 127.0.0.1.

Port
    TCP port of the target WebSocket Server. Default: 80.

URI
    Specify an optional HTTP URI for the target websocket server, e.g: /something. Default: /.

Format
    Specify the data format to be used in the HTTP request body. By default it uses msgpack; other supported formats are json, json_stream, json_lines and gelf. Default: msgpack.

json_date_key
    Specify the name of the date field in output. Default: date.

json_date_format
    Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681). Default: double.

Getting Started

In order to insert records into a WebSocket server, you can run the plugin from the command line or through the configuration file:

Command Line

The websocket plugin can read the parameters from the command line in two ways: through the -p argument (property) or setting them directly through the service URI. The URI format is the following:

websocket://host:port/something

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o websocket://192.168.2.3:80/something -m '*'

Configuration File

In your main configuration file, append the following Input & Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  websocket
    Match *
    Host  192.168.2.3
    Port  80
    URI   /something
    Format json

The websocket plugin works with TCP keepalive mode; please refer to the networking section for details. Since websocket is a stateful plugin, it decides when to send the handshake to the server side, for example when the plugin just begins to work or after the connection with the server has been dropped. In general, the interval to initiate a new websocket handshake is less than the keepalive interval. With that strategy, it can detect and resume websocket connections.

Testing

Configuration File

[INPUT]
    Name        tcp
    Listen      0.0.0.0
    Port        5170
    Format      json
[OUTPUT]
    Name           websocket
    Match          *
    Host           127.0.0.1
    Port           8080
    URI            /
    Format         json
    workers	   4
    net.keepalive               on
    net.keepalive_idle_timeout  30

Once Fluent Bit is running, you can send some messages using the netcat:

$ echo '{"key 1": 123456789, "key 2": "abcdefg"}' | nc 127.0.0.1 5170; sleep 35; echo '{"key 1": 123456789, "key 2": "abcdefg"}' | nc 127.0.0.1 5170

In Fluent Bit we should see the following output:

bin/fluent-bit   -c ../conf/out_ws.conf
Fluent Bit v1.7.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/02/05 22:17:09] [ info] [engine] started (pid=6056)
[2021/02/05 22:17:09] [ info] [storage] version=1.1.0, initializing...
[2021/02/05 22:17:09] [ info] [storage] in-memory
[2021/02/05 22:17:09] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/02/05 22:17:09] [ info] [input:tcp:tcp.0] listening on 0.0.0.0:5170
[2021/02/05 22:17:09] [ info] [out_ws] we have following parameter /, 127.0.0.1, 8080, 25
[2021/02/05 22:17:09] [ info] [output:websocket:websocket.0] worker #1 started
[2021/02/05 22:17:09] [ info] [output:websocket:websocket.0] worker #0 started
[2021/02/05 22:17:09] [ info] [sp] stream processor started
[2021/02/05 22:17:09] [ info] [output:websocket:websocket.0] worker #3 started
[2021/02/05 22:17:09] [ info] [output:websocket:websocket.0] worker #2 started
[2021/02/05 22:17:33] [ info] [out_ws] handshake for ws
[2021/02/05 22:18:08] [ warn] [engine] failed to flush chunk '6056-1612534687.673438119.flb', retry in 7 seconds: task_id=0, input=tcp.0 > output=websocket.0 (out_id=0)
[2021/02/05 22:18:15] [ info] [out_ws] handshake for ws
^C[2021/02/05 22:18:23] [engine] caught signal (SIGINT)
[2021/02/05 22:18:23] [ warn] [engine] service will stop in 5 seconds
[2021/02/05 22:18:27] [ info] [engine] service stopped
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #0 stopping...
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #0 stopped
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #1 stopping...
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #1 stopped
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #2 stopping...
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #2 stopped
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #3 stopping...
[2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #3 stopped
[2021/02/05 22:18:27] [ info] [out_ws] flb_ws_conf_destroy

Scenario Description

From the Fluent Bit log output, we can see that once data has been ingested into Fluent Bit, the plugin performs the handshake. If no data or traffic flows for a while, the TCP connection is aborted. When another piece of data arrives, a retry is triggered for the websocket plugin, with another handshake and data flush.

There is another scenario: if the websocket server flaps in a short time (it goes down and comes back up quickly), Fluent Bit resumes the TCP connection immediately. In that case, however, the websocket output plugin is left in a malfunctioning state, and Fluent Bit needs to be restarted to get back to work.

account_name

Azure Storage account name. This configuration property is mandatory.

shared_key

Specify the Azure Storage Shared Key to authenticate against the service. This configuration property is mandatory.

container_name

Name of the container that will contain the blobs. This configuration property is mandatory.

blob_type

Specify the desired blob type. Fluent Bit supports appendblob and blockblob.

appendblob

auto_create_container

If container_name does not exist in the remote service, enabling this option will handle the exception and auto-create the container.

on

path

Optional path to store your blobs. If your blob name is myblob, you can specify sub-directories where to store it using path, so setting path to /logs/kubernetes will store your blob in /logs/kubernetes/myblob.

emulator_mode

If you want to send data to an Azure emulator service like Azurite, enable this option so the plugin will format the requests to the expected format.

off

endpoint

If you are using an emulator, this option allows you to specify the absolute HTTP address of such service. e.g: http://127.0.0.1:10000.

tls

Enable or disable TLS encryption. Note that Azure service requires this to be turned on.

off

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      dummy
    dummy     {"name": "Fluent Bit", "year": 2020}
    samples   1
    tag       var.log.containers.app-default-96cbdef2340.log

[OUTPUT]
    name                  azure_blob
    match                 *
    account_name          YOUR_ACCOUNT_NAME
    shared_key            YOUR_SHARED_KEY
    path                  kubernetes
    container_name        logs
    auto_create_container on
    tls                   on
$ npm install -g azurite
$ azurite
Azurite Blob service is starting at http://127.0.0.1:10000
Azurite Blob service is successfully listening at http://127.0.0.1:10000
Azurite Queue service is starting at http://127.0.0.1:10001
Azurite Queue service is successfully listening at http://127.0.0.1:10001
[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      dummy
    dummy     {"name": "Fluent Bit", "year": 2020}
    samples   1
    tag       var.log.containers.app-default-96cbdef2340.log

[OUTPUT]
    name                  azure_blob
    match                 *
    account_name          devstoreaccount1
    shared_key            Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
    path                  kubernetes
    container_name        logs
    auto_create_container on
    tls                   off
    emulator_mode         on
    endpoint              http://127.0.0.1:10000
$ azurite
Azurite Blob service is starting at http://127.0.0.1:10000
Azurite Blob service is successfully listening at http://127.0.0.1:10000
Azurite Queue service is starting at http://127.0.0.1:10001
Azurite Queue service is successfully listening at http://127.0.0.1:10001
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "GET /devstoreaccount1/logs?restype=container HTTP/1.1" 404 -
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs?restype=container HTTP/1.1" 201 -
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 404 -
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log HTTP/1.1" 201 -
127.0.0.1 - - [03/Sep/2020:17:40:04 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 201 -

logdna_host

LogDNA API host address

logs.logdna.com

logdna_port

LogDNA TCP Port

443

api_key

API key to get access to the service. This property is mandatory.

hostname

Name of the local machine or device where Fluent Bit is running.

When this value is not set, Fluent Bit looks up the hostname and auto-populates the value. If it cannot be found, the value unknown is set instead.

mac

MAC address. This value is optional.

ip

IP address of the local hostname. This value is optional.

tags

A list of comma separated strings to group records in LogDNA and simplify the query with filters.

file

Optional name of a file being monitored. Note that this value is only set if the record does not contain a reference to it.

app

Name of the application. This value is auto-discovered on each record; if not found, the default value is used.

Fluent Bit

level

If the record contains a key called level or severity, it will populate the context level key with that value. If not found, the context key is not set.

file

If the record contains a key called file, it will populate the context file with the value found; otherwise, if the plugin configuration provided a file property, that value will be used instead (see table above).

app

If the record contains a key called app, it will populate the context app with the value found, otherwise it will use the value set for app in the configuration property (see table above).

meta

If the record contains a key called meta, it will populate the context meta with the value found.

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      dummy
    dummy     {"log":"a simple log message", "severity": "INFO", "meta": {"s1": 12345, "s2": true}, "app": "Fluent Bit"}
    samples   1

[OUTPUT]
    name      logdna
    match     *
    api_key   YOUR_API_KEY_HERE
    hostname  my-hostname
    ip        192.168.1.2
    mac       aa:bb:cc:dd:ee:ff
    tags      aa, bb
$ fluent-bit -c logdna.conf
Fluent Bit v1.5.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/04/07 17:44:37] [ info] [storage] version=1.0.3, initializing...
[2020/04/07 17:44:37] [ info] [storage] in-memory
[2020/04/07 17:44:37] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/04/07 17:44:37] [ info] [engine] started (pid=2157706)
[2020/04/07 17:44:37] [ info] [output:logdna:logdna.0] configured, hostname=monox-fluent-bit-2
[2020/04/07 17:44:37] [ info] [sp] stream processor started
[2020/04/07 17:44:38] [ info] [output:logdna:logdna.0] logs.logdna.com:443, HTTP status=200
{"status":"ok","batchID":"f95849a8-ec6c-4775-9d52-30763604df9b:40710:ld72"}

InfluxDB

The influxdb output plugin allows you to flush your records into an InfluxDB time series database. The following instructions assume that you have a fully operational InfluxDB service running in your system.

Configuration Parameters

Key
Description
default

Host

IP address or hostname of the target InfluxDB service

127.0.0.1

Port

TCP port of the target InfluxDB service

8086

Database

InfluxDB database name where records will be inserted

fluentbit

Bucket

InfluxDB bucket name where records will be inserted - if specified, database is ignored and v2 of API is used

Org

InfluxDB organization name where the bucket is (v2 only)

fluent

Sequence_Tag

The name of the tag whose value is incremented for the consecutive simultaneous events.

_seq

HTTP_User

Optional username for HTTP Basic Authentication

HTTP_Passwd

Password for user defined in HTTP_User

HTTP_Token

Authentication token used with InfluxDB v2 - if specified, both HTTP_User and HTTP_Passwd are ignored

Tag_Keys

Space separated list of keys that need to be tagged

Auto_Tags

Automatically tag keys where value is string. This option takes a boolean value: True/False, On/Off.

Off

Tags_List_Enabled

Dynamically tag keys which are in the string array at Tags_List_Key key. This option takes a boolean value: True/False, On/Off.

Off

Tags_List_Key

Key of the string array optionally contained within each log record that contains tag keys for that record

tags

TLS / SSL

The InfluxDB output plugin supports TLS/SSL; for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to start inserting records into an InfluxDB service, you can run the plugin from the command line or through the configuration file:

Command Line

The influxdb plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

influxdb://host:port

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o influxdb://127.0.0.1:8086 -m '*'

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name          influxdb
    Match         *
    Host          127.0.0.1
    Port          8086
    Database      fluentbit
    Sequence_Tag  _seq

Tagging

Basic example of Tag_Keys usage:

[INPUT]
    Name            tail
    Tag             apache.access
    parser          apache2
    path            /var/log/apache2/access.log

[OUTPUT]
    Name          influxdb
    Match         *
    Host          127.0.0.1
    Port          8086
    Database      fluentbit
    Sequence_Tag  _seq
    # make tags from method and path fields
    Tag_Keys      method path

Using Auto_Tags=On in this example would cause an error, because every parsed field value is a string. This option is best used with metrics-like records, where one or more field values are not string typed.
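As a rough illustration of where Auto_Tags fits better, the following sketch assumes the cpu input, whose field values are numeric, so only genuinely string-typed values would become tags:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name      influxdb
    Match     *
    Host      127.0.0.1
    Port      8086
    Database  fluentbit
    # numeric fields remain values; any string-typed field becomes a tag
    Auto_Tags On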

Basic example of Tags_List_Key usage:

[INPUT]
    Name              dummy
    # tagged fields: level, ID, businessObjectID, status
    Dummy             {"msg": "Transfer completed", "level": "info", "ID": "1234", "businessObjectID": "qwerty", "status": "OK", "tags": ["ID", "businessObjectID"]}

[OUTPUT]
    Name          influxdb
    Match         *
    Host          127.0.0.1
    Port          8086
    Bucket        My_Bucket
    Org           My_Org
    Sequence_Tag  _seq
    HTTP_Token    My_Token
    # tag all fields inside tags string array
    Tags_List_Enabled True
    Tags_List_Key tags
    # tag level, status fields
    Tag_Keys level status

Testing

Before starting Fluent Bit, make sure the target database exists on InfluxDB. Using the above example, we will insert the data into a fluentbit database.

1. Create database

Log into InfluxDB console:

$ influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 1.1.0
InfluxDB shell version: 1.1.0
>

Create the database:

> create database fluentbit
>

Check the database exists:

> show databases
name: databases
name
----
_internal
fluentbit

>

2. Run Fluent Bit

The following command will gather CPU metrics from the system and send the data to the InfluxDB database every five seconds:

$ bin/fluent-bit -i cpu -t cpu -o influxdb -m '*'

Note that all records coming from the cpu input plugin have the tag cpu; this tag is used to generate the measurement in InfluxDB.

3. Query the data

From InfluxDB console, choose your database:

> use fluentbit
Using database fluentbit

Now query some specific fields:

> SELECT cpu_p, system_p, user_p FROM cpu
name: cpu
time                  cpu_p   system_p    user_p
----                  -----   --------    ------
1481132860000000000   2.75        0.5      2.25
1481132861000000000   2           0.5      1.5
1481132862000000000   4.75        1.5      3.25
1481132863000000000   6.75        1.25     5.5
1481132864000000000   11.25       3.75     7.5

The CPU input plugin gathers more metrics per CPU core; in the above example we just selected three specific metrics. The following query will give a full result:

> SELECT * FROM cpu

4. View tags

Query tagged keys:

> SHOW TAG KEYS ON fluentbit FROM "apache.access"
name: apache.access
tagKey
------
_seq
method
path

And now query method key values:

> SHOW TAG VALUES ON fluentbit FROM "apache.access" WITH KEY = "method"
name: apache.access
key    value
---    -----
method "MATCH"
method "POST"

Stackdriver

The Stackdriver output plugin allows you to ingest your records into the Google Cloud Stackdriver Logging service.

Before getting started with the plugin configuration, make sure to obtain the proper credentials to access the service. We strongly recommend using a common JSON credentials file, reference link:

  • Creating a Google Service Account for Stackdriver

Your goal is to obtain a credentials JSON file that will be used later by Fluent Bit Stackdriver output plugin.

Configuration Parameters

Key
Description
default

google_service_credentials

Absolute path to a Google Cloud credentials JSON file

Value of environment variable $GOOGLE_SERVICE_CREDENTIALS

service_account_email

Account email associated with the service. Only available if no credentials file has been provided.

Value of environment variable $SERVICE_ACCOUNT_EMAIL

service_account_secret

Private key content associated with the service account. Only available if no credentials file has been provided.

Value of environment variable $SERVICE_ACCOUNT_SECRET

metadata_server

Prefix for a metadata server. Can also be set with the environment variable $METADATA_SERVER.

http://metadata.google.internal

location

The GCP or AWS region in which to store data about the resource. If the resource type is one of the generic_node or generic_task, then this field is required.

namespace

A namespace identifier, such as a cluster name or environment. If the resource type is one of the generic_node or generic_task, then this field is required.

node_id

A unique identifier for the node within the namespace, such as hostname or IP address. If the resource type is generic_node, then this field is required.

job

An identifier for a grouping of related tasks, such as the name of a microservice or distributed batch. If the resource type is generic_task, then this field is required.

task_id

A unique identifier for the task within the namespace and job, such as a replica index identifying the task within the job. If the resource type is generic_task, then this field is required.

export_to_project_id

The GCP project that should receive these logs.

Defaults to the project ID of the google_service_credentials file, or the project_id from Google's metadata.google.internal server.

resource

Set resource type of data. Supported resource types: k8s_container, k8s_node, k8s_pod, global, generic_node, generic_task, and gce_instance.

global, gce_instance

k8s_cluster_name

The name of the cluster that the container (node or pod based on the resource type) is running in. If the resource type is one of the k8s_container, k8s_node or k8s_pod, then this field is required.

k8s_cluster_location

The physical location of the cluster that contains (node or pod based on the resource type) the container. If the resource type is one of the k8s_container, k8s_node or k8s_pod, then this field is required.

labels_key

The value of this field is used by the Stackdriver output plugin to find the related labels from jsonPayload and then extract the value of it to set the LogEntry Labels.

logging.googleapis.com/labels

tag_prefix

Set the tag_prefix used to validate the tag of logs with k8s resource type. Without this option, the tag of the log must be in format of k8s_container(pod/node).* in order to use the k8s_container resource type. Now the tag prefix is configurable by this option (note the ending dot).

k8s_container., k8s_pod., k8s_node.

severity_key

Specify the name of the key from the original record that contains the severity information.

autoformat_stackdriver_trace

Rewrite the trace field to include the projectID and format it for use with Cloud Trace. When this flag is enabled, the user can get the correct result by printing only the traceID (usually 32 characters).

false

Workers

Enables dedicated thread(s) for this output. The default value of 2 has applied since version 1.8.13; for previous versions it is 0.

2

Configuration File

If you are using a Google Cloud Credentials File, the following configuration is enough to get started:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name        stackdriver
    Match       *

Example configuration file for k8s resource type:

local_resource_id is used by the stackdriver output plugin to set the labels field for different k8s resource types. The Stackdriver plugin will try to find the local_resource_id field in the log entry. If there is no logging.googleapis.com/local_resource_id field in the log, the plugin will then construct it by using the tag value of the log.

The local_resource_id should be in format:

  • k8s_container.<namespace_name>.<pod_name>.<container_name>

  • k8s_node.<node_name>

  • k8s_pod.<namespace_name>.<pod_name>

This implies that if there is no local_resource_id in the log entry, then the tag of the logs should match this format. Note that because of the tag_prefix option, it is not mandatory to use k8s_container (node/pod) as the tag prefix.

[INPUT]
    Name               tail
    Tag_Regex          var.log.containers.(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
    Tag                custom_tag.<namespace_name>.<pod_name>.<container_name>
    Path               /var/log/containers/*.log
    Parser             docker
    DB                 /var/log/fluent-bit-k8s-container.db

[OUTPUT]
    Name        stackdriver
    Match       custom_tag.*
    Resource    k8s_container
    k8s_cluster_name test_cluster_name
    k8s_cluster_location  test_cluster_location
    tag_prefix  custom_tag.

Troubleshooting Notes

Upstream connection error

Github reference: #761

An upstream connection error means Fluent Bit was not able to reach Google services, the error looks like this:

[2019/01/07 23:24:09] [error] [oauth2] could not get an upstream connection

This indicates a network issue in the environment where Fluent Bit is running; make sure that from the Host, Container or Pod you can reach the following Google endpoints:

  • https://www.googleapis.com

  • https://logging.googleapis.com

Fail to process local_resource_id

The error looks like this:

[2020/08/04 14:43:03] [error] [output:stackdriver:stackdriver.0] fail to process local_resource_id from log entry for k8s_container

Perform the following checks:

  • If the log entry does not contain the local_resource_id field, does the tag of the log match the expected format?

  • If tag_prefix is configured, does the prefix of the tag specified in the input plugin match the tag_prefix?

Other implementations

Stackdriver officially supports a logging agent based on Fluentd.

We plan to support some special fields in structured payloads. Use cases for special fields are described here.

GELF

GELF is Graylog Extended Log Format. The GELF output plugin allows you to send logs in GELF format directly to a Graylog input using TLS, TCP or UDP protocols.

The following instructions assume that you have a fully operational Graylog server running in your environment.

Configuration Parameters

According to the GELF Payload Specification, there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are set with the Gelf_*_Key options in this plugin.

Key
Description
default

Match

Pattern of tags to match for logs to be emitted by this plugin

Host

IP address or hostname of the target Graylog server

127.0.0.1

Port

The port that your Graylog GELF input is listening on

12201

Mode

The protocol to use (tls, tcp or udp)

udp

Gelf_Short_Message_Key

A short descriptive message (MUST be set in GELF)

short_message

Gelf_Timestamp_Key

Your log timestamp (SHOULD be set in GELF)

timestamp

Gelf_Host_Key

Key whose value is used as the name of the host, source or application that sent this message. (MUST be set in GELF)

host

Gelf_Full_Message_Key

Key to use as the long message that can, for example, contain a backtrace. (Optional in GELF)

full_message

Gelf_Level_Key

Key to be used as the log level. Its value must be a standard syslog level (between 0 and 7). (Optional in GELF)

level

Packet_Size

If transport protocol is udp, you can set the size of packets to be sent.

1420

Compress

If transport protocol is udp, you can set this if you want your UDP packets to be compressed.

true

TLS / SSL

GELF output plugin supports TLS/SSL, for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Notes

  • If you're using Fluent Bit to collect Docker logs, note that Docker places your log in JSON under key log. So you can set log as your Gelf_Short_Message_Key to send everything in Docker logs to Graylog. In this case, you need your log value to be a string; so don't parse it using JSON parser.

  • The order of looking up the timestamp in this plugin is as follows:

    1. Value of Gelf_Timestamp_Key provided in configuration

    2. Value of timestamp key

    3. If you're using the Docker JSON parser, this parser can parse the time and use it as the timestamp of the message. If all of the above fail, Fluent Bit tries to get the timestamp extracted by your parser.

    4. If no timestamp is set by Fluent Bit, your Graylog server will set it to the current timestamp (now).

  • Your log timestamp has to be in UNIX Epoch Timestamp format. If the Gelf_Timestamp_Key value of your log is not in this format, your Graylog server will ignore it.

  • If you're using Fluent Bit in Kubernetes and you're using the Kubernetes Filter Plugin, this plugin adds the host value to your log by default, and you don't need to add it on your own.

  • The version of GELF message is also mandatory and Fluent Bit sets it to 1.1 which is the current latest version of GELF.

  • If you use udp as transport protocol and set Compress to true, Fluent Bit compresses your packets in GZIP format, which is the default compression that Graylog offers. This can be used to trade more CPU load for saving network bandwidth.

Configuration File Example

If you're using Fluent Bit for shipping Kubernetes logs, you can use something like this as your configuration file:

[INPUT]
    Name                    tail
    Tag                     kube.*
    Path                    /var/log/containers/*.log
    Parser                  docker
    DB                      /var/log/flb_kube.db
    Mem_Buf_Limit           5MB
    Refresh_Interval        10

[FILTER]
    Name                    kubernetes
    Match                   kube.*
    Merge_Log_Key           log
    Merge_Log               On
    Keep_Log                Off
    Annotations             Off
    Labels                  Off

[FILTER]
    Name                    nest
    Match                   *
    Operation               lift
    Nested_under            log

[OUTPUT]
    Name                    gelf
    Match                   kube.*
    Host                    <your-graylog-server>
    Port                    12201
    Mode                    tcp
    Gelf_Short_Message_Key  data

[PARSER]
    Name                    docker
    Format                  json
    Time_Key                time
    Time_Format             %Y-%m-%dT%H:%M:%S.%L
    Time_Keep               Off

By default, GELF tcp uses port 12201 and Docker places your logs in the /var/log/containers directory. The logs are placed in the value of the log key. For example, this is a log saved by Docker:

{"log":"{\"data\": \"This is an example.\"}","stream":"stderr","time":"2019-07-21T12:45:11.273315023Z"}

If you use the Tail input and a parser like the docker parser shown above, it decodes your message and extracts the data field (and any other fields present). This is how this log looks in stdout after decoding:

[0] kube.log: [1565770310.000198491, {"log"=>{"data"=>"This is an example."}, "stream"=>"stderr", "time"=>"2019-07-21T12:45:11.273315023Z"}]

Now, this is what happens to this log:

  1. Fluent Bit GELF plugin adds "version": "1.1" to it.

  2. The Nest filter unnests fields inside the log key. In our example, it puts data alongside stream and time.

  3. We used this data key as Gelf_Short_Message_Key, so the GELF plugin changes it to short_message.

  4. The Kubernetes filter adds the host name.

  5. The timestamp is generated.

  6. Any custom field (not present in the GELF Payload Specification) is prefixed by an underscore.

Finally, this is what our Graylog server input sees:

{"version":"1.1", "short_message":"This is an example.", "host": "<Your Node Name>", "_stream":"stderr", "timestamp":1565770310.000199}

Amazon CloudWatch

Send logs and metrics to Amazon CloudWatch

The Amazon CloudWatch output plugin allows you to ingest your records into the CloudWatch Logs service. Support for CloudWatch Metrics is also provided via EMF.

This is the documentation for the core Fluent Bit CloudWatch plugin written in C. It can replace the aws/amazon-cloudwatch-logs-for-fluent-bit Golang Fluent Bit plugin released last year. The Golang plugin was named cloudwatch; this new high performance CloudWatch plugin is called cloudwatch_logs to prevent conflicts/confusion. Check the amazon repo for the Golang plugin for details on the deprecation/migration plan for the original plugin.

See here for details on how AWS credentials are fetched.

Configuration Parameters

Key
Description

region

The AWS region.

log_group_name

The name of the CloudWatch Log Group that you want log records sent to.

log_stream_name

The name of the CloudWatch Log Stream that you want log records sent to.

log_stream_prefix

Prefix for the Log Stream name. The tag is appended to the prefix to construct the full log stream name. Not compatible with the log_stream_name option.

log_key

By default, the whole log record will be sent to CloudWatch. If you specify a key name with this option, then only the value of that key will be sent to CloudWatch. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to CloudWatch.

log_format

An optional parameter that can be used to tell CloudWatch the format of the data. A value of json/emf enables CloudWatch to extract custom metrics embedded in a JSON payload. See the Embedded Metric Format documentation.

role_arn

ARN of an IAM role to assume (for cross account access).

auto_create_group

Automatically create the log group. Valid values are "true" or "false" (case insensitive). Defaults to false.

log_retention_days

If set to a number greater than zero, any newly created log group's retention policy is set to this many days. Valid values are: [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 3653]

endpoint

Specify a custom endpoint for the CloudWatch Logs API.

metric_namespace

An optional string representing the CloudWatch namespace for the metrics. See Metrics Tutorial section below for a full configuration.

metric_dimensions

A list of lists containing the dimension keys that will be applied to all metrics. The values within a dimension set MUST also be members of the root node. For more information about dimensions, see Dimension and Dimensions in the CloudWatch documentation. In the Fluent Bit config, metric_dimensions is a comma and semicolon separated string. If you have only one list of dimensions, put the values as a comma separated string. If you want to provide a list of lists, separate the lists with semicolons. For example, if you set the value to 'dimension_1,dimension_2;dimension_3', it will be converted to [[dimension_1, dimension_2],[dimension_3]].

sts_endpoint

Specify a custom STS endpoint for the AWS STS API.

auto_retry_requests

Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues.

Getting Started

In order to send records into Amazon CloudWatch, you can run the plugin from the command line or through the configuration file:

Command Line

The cloudwatch plugin can read the parameters from the command line through the -p argument (property), e.g:

$ fluent-bit -i cpu -o cloudwatch_logs -p log_group_name=group -p log_stream_name=stream -p region=us-west-2 -m '*' -f 1

Configuration File

In your main configuration file append the following Output section:

[OUTPUT]
    Name cloudwatch_logs
    Match   *
    region us-east-1
    log_group_name fluent-bit-cloudwatch
    log_stream_prefix from-fluent-bit-
    auto_create_group On

Worker support

Fluent Bit 1.7 adds a new feature called workers which enables outputs to have dedicated threads. This cloudwatch_logs plugin has partial support for workers. The plugin can support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.

Example:

[OUTPUT]
    Name cloudwatch_logs
    Match   *
    region us-east-1
    log_group_name fluent-bit-cloudwatch
    log_stream_prefix from-fluent-bit-
    auto_create_group On
    workers 1

If you enable a single worker, you are enabling a dedicated thread for your CloudWatch output. We recommend starting without workers, evaluating the performance, and then enabling a worker if needed. For most users, the plugin can provide sufficient throughput without workers.

Metrics Tutorial

Fluent Bit has different input plugins (cpu, mem, disk, netif) to collect host resource usage metrics. The cloudwatch_logs output plugin can be used to send these host metrics to CloudWatch in Embedded Metric Format (EMF). If data comes from any of the above mentioned input plugins, the cloudwatch_logs output plugin will convert it to EMF format and send it to CloudWatch as a JSON log. Additionally, if we set json/emf as the value of the log_format config option, CloudWatch will extract custom metrics from the embedded JSON payload.

Note: Right now, only cpu and mem metrics can be sent to CloudWatch.

To use the mem input plugin and send memory usage metrics to CloudWatch, we can consider the following example config file. Here, we use the aws filter which adds ec2_instance_id and az (availability zone) to the log records. Later, in the output config section, we set ec2_instance_id as our metric dimension.

[SERVICE]
    Log_Level info

[INPUT]
    Name mem
    Tag mem

[FILTER]
    Name aws
    Match *

[OUTPUT]
    Name cloudwatch_logs
    Match *
    log_stream_name fluent-bit-cloudwatch
    log_group_name fluent-bit-cloudwatch
    region us-west-2
    log_format json/emf
    metric_namespace fluent-bit-metrics
    metric_dimensions ec2_instance_id
    auto_create_group true

The following config will set two dimensions for all of our metrics: ec2_instance_id and az.

[FILTER]
    Name aws
    Match *

[OUTPUT]
    Name cloudwatch_logs
    Match *
    log_stream_name fluent-bit-cloudwatch
    log_group_name fluent-bit-cloudwatch
    region us-west-2
    log_format json/emf
    metric_namespace fluent-bit-metrics
    metric_dimensions ec2_instance_id,az
    auto_create_group true

AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and these plugins.

GitHub

github.com/aws/aws-for-fluent-bit

Amazon ECR Public Gallery

aws-for-fluent-bit

Our images are available in Amazon ECR Public Gallery. You can download images with different tags using the following command:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>

For example, you can pull the image with the latest version by running:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest

If you see errors due to image pull limits, try logging into public ECR with your AWS credentials:

aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

You can check the Amazon ECR Public official documentation for more details.

Docker Hub

amazon/aws-for-fluent-bit

Amazon ECR

You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/

For more information, see the AWS for Fluent Bit GitHub repo.

HTTP

The http output plugin allows you to flush your records into an HTTP endpoint. For now the functionality is pretty basic: it issues a POST request with the data records in MessagePack (or JSON) format.

Configuration Parameters

Key
Description
default

TLS / SSL

The HTTP output plugin supports TLS/SSL; for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into an HTTP server, you can run the plugin from the command line or through the configuration file:

Command Line

The http plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

Using the format specified, you could start Fluent Bit through:

Configuration File

In your main configuration file, append the following Input & Output sections:

By default, the URI becomes the tag of the message and the original tag is ignored. To retain the tag, multiple configuration sections have to be defined, each flushing to a different URI.
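As a minimal sketch of that approach (the host and URIs below are illustrative), one [OUTPUT] section is defined per tag, each flushing to its own URI:

# one output section per tag, each flushing to a different URI
[OUTPUT]
    Name  http
    Match cpu
    Host  192.168.2.3
    Port  80
    URI   /cpu

[OUTPUT]
    Name  http
    Match mem
    Host  192.168.2.3
    Port  80
    URI   /mem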

Another approach we also support is sending the original message tag in a configurable header. It's up to the receiver to do what it wants with that header field: for example, parse it and use it as the tag.

To configure this behaviour, add this config:

Provided you are using Fluentd as the data receiver, you can combine in_http and out_rewrite_tag_filter to make use of this HTTP header.

Notice how we override the tag, which comes from the URI path, with our custom header:

Example : Add a header

Example : Sumo Logic HTTP Collector

Suggested configuration for Sumo Logic using json_lines with iso8601 timestamps. The PrivateKey is specific to a configured HTTP collector.

A sample Sumo Logic query for the input. (Requires json_lines format with iso8601 date format for the timestamp field).


host

IP address or hostname of the target HTTP Server

127.0.0.1

http_User

Basic Auth Username

http_Passwd

Basic Auth Password. Requires HTTP_User to be set

port

TCP port of the target HTTP Server

80

Proxy

Specify an HTTP Proxy. The expected format of this value is http://HOST:PORT. Note that HTTPS is not currently supported. It is recommended not to set this and to configure the HTTP proxy environment variables instead as they support both HTTP and HTTPS.

uri

Specify an optional HTTP URI for the target web server, e.g: /something

/

compress

Set payload compression mechanism. Option available is 'gzip'

format

Specify the data format to be used in the HTTP request body, by default it uses msgpack. Other supported formats are json, json_stream, json_lines and gelf.

msgpack

allow_duplicated_headers

Specify if duplicated headers are allowed. If a duplicated header is found, the latest key/value set is preserved.

true

log_response_payload

Specify if the response payload should be logged or not.

true

header_tag

Specify an optional HTTP header field for the original message tag.

header

Add an HTTP header key/value pair. Multiple headers can be set.

json_date_key

Specify the name of the time key in the output record. To disable the time key just set the value to false.

date

json_date_format

Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681)

double

gelf_timestamp_key

Specify the key to use for timestamp in gelf format

gelf_host_key

Specify the key to use for the host in gelf format

gelf_short_message_key

Specify the key to use as the short message in gelf format

gelf_full_message_key

Specify the key to use for the full message in gelf format

gelf_level_key

Specify the key to use for the level in gelf format

successful_response_code

Specify what a successful HTTP response code is, in case you need to accept other HTTP codes as success (e.g. 204).

Workers

Enables dedicated thread(s) for this output. The default value of 2 has applied since version 1.8.13; for previous versions it is 0.

2

http://host:port/something
$ fluent-bit -i cpu -t cpu -o http://192.168.2.3:80/something -m '*'
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  http
    Match *
    Host  192.168.2.3
    Port  80
    URI   /something
[OUTPUT]
    Name  http
    Match *
    Host  192.168.2.3
    Port  80
    URI   /something
    Format json
    header_tag  FLUENT-TAG
<source>
  @type http
  add_http_headers true
</source>

<match something>
  @type rewrite_tag_filter
  <rule>
    key HTTP_FLUENT_TAG
    pattern /^(.*)$/
    tag $1
  </rule>
</match>
[OUTPUT]
    Name           http
    Match          *
    Host           127.0.0.1
    Port           9000
    Header         X-Key-A Value_A
    Header         X-Key-B Value_B
    URI            /something
[OUTPUT]
    Name             http
    Match            *
    Host             collectors.au.sumologic.com
    Port             443
    URI              /receiver/v1/http/[PrivateKey]
    Format           json_lines
    Json_date_key    timestamp
    Json_date_format iso8601
_sourcecategory="my_fluent_bit"
| json "cpu_p" as cpu
| timeslice 1m
| max(cpu) as cpu group by _timeslice

Loki

Loki is a multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate.

The Fluent Bit loki built-in output plugin allows you to send your log or events to a Loki service. It supports data enrichment with Kubernetes labels, custom label keys and Tenant ID within others.

Be aware there is a separate Golang output plugin provided by Grafana with different configuration options.

Configuration Parameters

Key
Description
Default

host

Loki hostname or IP address. Do not include the subpath, i.e. loki/api/v1/push, but just the base hostname/URL.

127.0.0.1

port

Loki TCP port

3100

http_user

Set HTTP basic authentication user name

http_passwd

Set HTTP basic authentication password

tenant_id

Tenant ID used by default to push logs to Loki. If omitted or empty it assumes Loki is running in single-tenant mode and no X-Scope-OrgID header is sent.

labels

Stream labels for the API request. It can be multiple comma-separated strings specifying key=value pairs. In addition to fixed parameters, it also allows you to add custom record keys (similar to the label_keys property). More details in the Labels section.

job=fluentbit

label_keys

Optional list of record keys that will be placed as stream labels. This configuration property is for records key only. More details in the Labels section.

remove_keys

Optional list of keys to remove.

drop_single_key

If set to true and after extracting labels only a single key remains, the log line sent to Loki will be the value of that key in line_format.

off

line_format

Format to use when flattening the record to a log line. Valid values are json or key_value. If set to json, the log line sent to Loki will be the Fluent Bit record dumped as JSON. If set to key_value, the log line will be each item in the record concatenated together (separated by a single space) in the format key=value.

json

auto_kubernetes_labels

If set to true, it will add all Kubernetes labels to the Stream labels

off

tenant_id_key

Specify the name of the key from the original record that contains the Tenant ID. The value of the key is sent as the X-Scope-OrgID HTTP header. It is useful to set the Tenant ID dynamically.
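As a hedged illustration, assuming your records carry the Tenant ID under a key called tenant (a hypothetical name), a configuration using tenant_id_key could look like:

[OUTPUT]
    name          loki
    match         *
    host          127.0.0.1
    port          3100
    labels        job=fluentbit
    # take the X-Scope-OrgID value from the 'tenant' key of each record
    tenant_id_key tenant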

Labels

Loki stores the record logs inside streams; a stream is defined by a set of labels, and at least one label is required.

Fluent Bit implements a flexible mechanism to set labels by using fixed key/value pairs of text, but it also allows you to set as labels certain keys that exist as part of the records being processed. Consider the following JSON record (pretty printed for readability):

{
    "key": 1,
    "sub": {
        "stream": "stdout",
        "id": "some id"
    },
    "kubernetes": {
        "labels": {
            "team": "Santiago Wanderers"
        }
    }
}

If you decide that your Loki stream will be composed of two labels, job and the value of the record key called stream, your labels configuration properties might look as follows:

[OUTPUT]
    name   loki
    match  *
    labels job=fluentbit, $sub['stream']

As you can see, the label job has the value fluentbit and the second label is configured to access the nested map called sub, targeting the value of the key stream. Note that the second label name must start with a $, which means it is a Record Accessor pattern and provides the ability to retrieve values from nested maps by using the key names.

When processing the above configuration, the resulting labels for the stream in question become:

job="fluentbit", stream="stdout"

Another feature of labels management is the ability to provide custom key names; using the same record accessor pattern, we can specify the key name manually and let the value be populated automatically at runtime, e.g.:

[OUTPUT]
    name   loki
    match  *
    labels job=fluentbit, mystream=$sub['stream']

When processing that new configuration, the internal labels will be:

job="fluentbit", mystream="stdout"

Using the label_keys property

The additional configuration property label_keys allows you to specify multiple record keys that need to be placed as part of the outgoing stream labels. This is a feature similar to the labels property explained above. Consider it another way to set a record key in the stream, with the limitation that you cannot use a custom name for the key value.

The following configuration examples generate the same Stream Labels:

[OUTPUT]
    name       loki
    match      *
    labels     job=fluentbit
    label_keys $sub['stream']

The above configuration accomplishes the same as this one:

[OUTPUT]
    name   loki
    match  *
    labels job=fluentbit, $sub['stream']

Both will generate the following stream labels:

job="fluentbit", stream="stdout"

Kubernetes & Labels

Note that if you are running in a Kubernetes environment, you might want to enable the option auto_kubernetes_labels which will auto-populate the streams with the Pod labels for you. Consider the following configuration:

[OUTPUT]
    name                   loki
    match                  *
    labels                 job=fluentbit
    auto_kubernetes_labels on

Based on the JSON example provided above, the internal stream labels will be:

job="fluentbit", team="Santiago Wanderers"

Networking and TLS Configuration

This plugin inherits core Fluent Bit features to customize the network behavior and optionally enable TLS in the communication channel. For more details about the specific options available refer to the following articles:

  • Networking Setup: timeouts, keepalive and source address

  • Security & TLS: all about TLS configuration and certificates

Note that all options mentioned in the articles above must be enabled in the plugin configuration in question.

Fluent Bit + Grafana Cloud

Fluent Bit supports sending logs (and metrics) to Grafana Cloud by providing the appropriate URL and ensuring TLS is enabled.

An example configuration - make sure to set the credentials and ensure the host URL matches the correct one for your deployment:

    [OUTPUT]
        Name        loki
        Match       *
        Host        logs-prod-eu-west-0.grafana.net
        port        443
        tls         on
        tls.verify  on
        http_user   XXX
        http_passwd XXX

Getting Started

The following configuration example will emit a dummy example record and ingest it into Loki. Copy and paste the following content into a file called out_loki.conf:

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name      dummy
    dummy     {"key": 1, "sub": {"stream": "stdout", "id": "some id"}, "kubernetes": {"labels": {"team": "Santiago Wanderers"}}}
    samples   1

[OUTPUT]
    name                   loki
    match                  *
    host                   127.0.0.1
    port                   3100
    labels                 job=fluentbit
    label_keys             $sub['stream']
    auto_kubernetes_labels on

Run Fluent Bit with the new configuration file:

$ fluent-bit -c out_loki.conf

Fluent Bit output:

Fluent Bit v1.7.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/10/14 20:57:45] [ info] [engine] started (pid=809736)
[2020/10/14 20:57:45] [ info] [storage] version=1.0.6, initializing...
[2020/10/14 20:57:45] [ info] [storage] in-memory
[2020/10/14 20:57:45] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/10/14 20:57:45] [ info] [output:loki:loki.0] configured, hostname=127.0.0.1:3100
[2020/10/14 20:57:45] [ info] [sp] stream processor started
[2020/10/14 20:57:46] [debug] [http] request payload (272 bytes)
[2020/10/14 20:57:46] [ info] [output:loki:loki.0] 127.0.0.1:3100, HTTP status=204

PostgreSQL

PostgreSQL is a very popular and versatile open source database management system that supports the SQL language and that is capable of storing both structured and unstructured data, such as JSON objects.

Given that Fluent Bit is designed to work with JSON objects, the pgsql output plugin allows users to send their data to a PostgreSQL database and store it using the JSONB type.

PostgreSQL 9.4 or higher is required.

Preliminary steps

According to the parameters you have set in the configuration file, the plugin will create the table defined by the table option in the database defined by the database option hosted on the server defined by the host option. It will use the PostgreSQL user defined by the user option, which needs to have the right privileges to create such a table in that database.

NOTE: If you are not familiar with how PostgreSQL's users and grants system works, you might find useful reading the recommended links in the "References" section at the bottom.

A typical installation normally consists of a self-contained database for Fluent Bit in which you can store the output of one or more pipelines. Ultimately, it is your choice to store them in the same table, or in separate tables, or even in separate databases based on several factors, including workload, scalability, data protection and security.

In this example, for the sake of simplicity, we use a single table called fluentbit in a database called fluentbit that is owned by the user fluentbit. Feel free to use different names. Preferably, for security reasons, do not use the postgres user (which has SUPERUSER privileges).

Create the fluentbit user

Generate a robust random password (e.g. pwgen 20 1) and store it safely. Then, as postgres system user on the server where PostgreSQL is installed, execute:

createuser -P fluentbit

At the prompt, please provide the password that you previously generated.

As a result, the user fluentbit without superuser privileges will be created.

If you prefer, instead of the createuser application, you can directly use the SQL command CREATE USER.
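For reference, a roughly equivalent SQL statement would be the following; the password shown is only a placeholder:

-- equivalent to "createuser -P fluentbit"; replace the placeholder password
CREATE USER fluentbit WITH PASSWORD 'your-robust-random-password';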

Create the fluentbit database

As postgres system user, please run:

createdb -O fluentbit fluentbit

This will create a database called fluentbit owned by the fluentbit user. As a result, the fluentbit user will be able to safely create the data table.

Alternatively, you can use the SQL command CREATE DATABASE.
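For reference, a roughly equivalent SQL statement would be:

-- equivalent to "createdb -O fluentbit fluentbit"
CREATE DATABASE fluentbit OWNER fluentbit;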

Connection

Make sure that the fluentbit user can connect to the fluentbit database on the specified target host. This might require you to properly configure the pg_hba.conf file.
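As a sketch only, since the address range and authentication method depend on your environment, a pg_hba.conf entry allowing the fluentbit user to reach the fluentbit database from a Fluent Bit host could look like:

# TYPE  DATABASE   USER       ADDRESS          METHOD
host    fluentbit  fluentbit  192.168.0.10/32  scram-sha-256

On PostgreSQL versions older than 10 you would typically use md5 as the method. Remember to reload PostgreSQL after editing pg_hba.conf.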

Configuration Parameters

Key
Description
Default

Host

Hostname/IP address of the PostgreSQL instance

- (127.0.0.1)

Port

PostgreSQL port

- (5432)

User

PostgreSQL username

- (current user)

Password

Password of PostgreSQL username

-

Database

Database name to connect to

- (current user)

Table

Table name where to store data

-

Timestamp_Key

Key in the JSON object containing the record timestamp

date

Async

Define if we will use async or sync connections

false

min_pool_size

Minimum number of connections in async mode

1

max_pool_size

Maximum amount of connections in async mode

4

cockroachdb

Set to true if you will connect the plugin with a CockroachDB

false

Libpq

Fluent Bit relies on libpq, the PostgreSQL native client API, written in the C language. For this reason, default values might be affected by environment variables and compilation settings. The above table, in brackets, lists the most common default values for each connection option.

For security reasons, it is advised to follow the directives included in the password file section.

Configuration Example

In your main configuration file add the following section:

[OUTPUT]
    Name          pgsql
    Match         *
    Host          172.17.0.2
    Port          5432
    User          fluentbit
    Password      YourCrazySecurePassword
    Database      fluentbit
    Table         fluentbit
    Timestamp_Key ts

The output table

The output plugin automatically creates a table with the name specified by the table configuration option and made up of the following fields:

  • tag TEXT

  • time TIMESTAMP WITHOUT TIME ZONE

  • data JSONB
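For reference, the table created by the plugin is roughly equivalent to the following definition (a sketch only; the plugin creates it for you):

CREATE TABLE fluentbit (
    tag  TEXT,
    time TIMESTAMP WITHOUT TIME ZONE,
    data JSONB
);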

As you can see, the timestamp does not contain any information about the time zone; it therefore refers to the time zone used by the connection to PostgreSQL (the timezone setting).

For more information on the JSONB data type in PostgreSQL, please refer to the JSON types page in the official documentation, where you can find instructions on how to index or query the objects (including jsonpath introduced in PostgreSQL 12).

Scalability

PostgreSQL 10 introduces support for declarative partitioning. In order to improve vertical scalability of the database, you can decide to partition your tables on time ranges (for example on a monthly basis). PostgreSQL also supports sub-partitions, allowing you to even partition your records by hash (version 11+), as well as default partitions (version 11+).

For more information on horizontal partitioning in PostgreSQL, please refer to the Table partitioning page in the official documentation.
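As a sketch only, assuming PostgreSQL 10 or later and that you create the partitioned landing table yourself before pointing Fluent Bit at it, monthly range partitioning on the time column could look like:

-- parent table partitioned by month on the time column
CREATE TABLE fluentbit (
    tag  TEXT,
    time TIMESTAMP WITHOUT TIME ZONE,
    data JSONB
) PARTITION BY RANGE (time);

-- one partition per month; add more as needed
CREATE TABLE fluentbit_2021_01 PARTITION OF fluentbit
    FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');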

If you are starting now, our recommendation at the moment is to choose the latest major version of PostgreSQL.

There's more ...

PostgreSQL is a really powerful and extensible database engine. More expert users can indeed take advantage of BEFORE INSERT triggers on the main table and re-route records to normalised tables, depending on tags and the content of the actual JSON objects.

For example, you can use Fluent Bit to send HTTP log records to the landing table defined in the configuration file. This table contains a BEFORE INSERT trigger (a function in the plpgsql language) that normalises the content of the JSON object and inserts the record in another table (with its own structure and partitioning model). This kind of trigger allows you to discard the record from the landing table by returning NULL.
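As a sketch only, assuming a hypothetical normalised http_log table and PostgreSQL 11 or later (older versions use EXECUTE PROCEDURE instead of EXECUTE FUNCTION), such a trigger could look like:

-- hypothetical normalised target table
CREATE TABLE http_log (
    ts     TIMESTAMP WITHOUT TIME ZONE,
    method TEXT,
    path   TEXT
);

-- normalise the JSON payload and discard the row from the landing table
CREATE FUNCTION route_http_log() RETURNS trigger AS $$
BEGIN
    INSERT INTO http_log (ts, method, path)
    VALUES (NEW.time, NEW.data->>'method', NEW.data->>'path');
    RETURN NULL;  -- returning NULL discards the record from the landing table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER route_http_log_before_insert
    BEFORE INSERT ON fluentbit
    FOR EACH ROW EXECUTE FUNCTION route_http_log();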

References

Here follows a list of useful resources from the PostgreSQL documentation:

  • Database Roles

  • GRANT

  • CREATE USER

  • CREATE DATABASE

  • The pg_hba.conf file

  • JSON types

  • Date/Time functions and operators

  • Table partitioning

  • libpq - C API for PostgreSQL

  • libpq - Environment variables

  • libpq - password file

  • Trigger functions


Splunk

Send logs to Splunk HTTP Event Collector

The Splunk output plugin allows you to ingest your records into a Splunk service through the HTTP Event Collector (HEC) interface.

To get more details about how to set up the HEC in Splunk, please refer to the following documentation:

Configuration Parameters

Connectivity, transport and authentication configuration properties:

Key
Description
default

Content and Splunk metadata (fields) handling configuration properties:

Key
Description
default

TLS / SSL

The Splunk output plugin supports TLS/SSL; for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into a Splunk service, you can run the plugin from the command line or through the configuration file:

Command Line

The splunk plugin can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

Data format

By default, the Splunk output plugin nests the record under the event key in the payload sent to the HEC. It will also append the time of the record to a top level time key.

If you would like to customize any of the Splunk event metadata, such as the host or target index, you can set Splunk_Send_Raw On in the plugin configuration, and add the metadata as keys/values in the record. Note: with Splunk_Send_Raw enabled, you are responsible for creating and populating the event section of the payload.

For example, to add a custom index and hostname:

This will create a payload that looks like:

For more information on the Splunk HEC payload format and all event metadata Splunk accepts, see here:

Sending Raw Events

If the option splunk_send_raw has been enabled, the user must take care to put all log details in the event field, and only specify fields known to Splunk in the top level event; if there is a mismatch, Splunk will return an HTTP 400 error.

Consider the following example:

splunk_send_raw off

splunk_send_raw on

For up to date information about the valid keys in the top level object, refer to the Splunk documentation:

Splunk Metric Index

With Splunk version 8.0 and above, you can also use the Fluent Bit Splunk output plugin to send data to metric indices. This allows you to perform visualizations, metric queries, and analysis with other metrics you may be collecting. This is based on Splunk 8.0 support for multiple metrics via a single JSON payload; more details can be found in Splunk's documentation page.

Sending to a Splunk metric index requires the splunk_send_raw option to be enabled and the message to be formatted properly. This involves three specific operations:

  • Nest metric events under a "fields" property

  • Add metric_name: to all metrics

  • Add index, source, sourcetype as fields in the message

Example Configuration

The following configuration gathers CPU metrics, nests the appropriate fields, adds the required identifiers, and then sends the data to Splunk.

host

IP address or hostname of the target Splunk service.

127.0.0.1

port

TCP port of the target Splunk service.

8088

splunk_token

Specify the Authentication Token for the HTTP Event Collector interface.

http_user

Optional username for Basic Authentication on HEC

http_passwd

Password for user defined in HTTP_User

http_buffer_size

Buffer size used to receive Splunk HTTP responses

2M

compress

Set payload compression mechanism. The only available option is gzip.

channel

Specify X-Splunk-Request-Channel Header for the HTTP Event Collector interface.

Workers

Enables dedicated thread(s) for this output. This default value has applied since version 1.8.13; for previous versions the default is 0.

2

splunk_send_raw

When enabled, the record keys and values are set in the top level of the map instead of under the event key. Refer to the Sending Raw Events section from the docs for more details to make this option work properly.

off

event_key

Specify the key name that will be used to send a single value as part of the record.

event_host

Specify the key name that contains the host value. This option allows a record accessor pattern.

event_source

Set the source value to assign to the event data.

event_sourcetype

Set the sourcetype value to assign to the event data.

event_sourcetype_key

Set a record key that will populate 'sourcetype'. If the key is found, it will have precedence over the value set in event_sourcetype.

event_index

The name of the index by which the event data is to be indexed.

event_index_key

Set a record key that will populate the index field. If the key is found, it will have precedence over the value set in event_index.

event_field

Set event fields for the record. This option can be set multiple times and the format is key_name record_accessor_pattern.

$ fluent-bit -i cpu -t cpu -o splunk -p host=127.0.0.1 -p port=8088 \
  -p tls=on -p tls.verify=off -m '*'
[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name        splunk
    Match       *
    Host        127.0.0.1
    Port        8088
    TLS         On
    TLS.Verify  Off
[INPUT]
    Name  cpu
    Tag   cpu

# nest the record under the 'event' key
[FILTER]
    Name nest
    Match *
    Operation nest
    Wildcard *
    Nest_under event

# add event metadata
[FILTER]
    Name      modify
    Match     *
    Add index my-splunk-index
    Add host  my-host

[OUTPUT]
    Name        splunk
    Match       *
    Host        127.0.0.1
    Splunk_Token xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx
    Splunk_Send_Raw On
{
    "time": "1535995058.003385189",
    "index": "my-splunk-index",
    "host": "my-host",
    "event": {
        "cpu_p":0.000000,
        "user_p":0.000000,
        "system_p":0.000000
    }
}
{"time": ..., "event": {"k1": "foo", "k2": "bar", "index": "applogs"}}
{"time": .., "k1": "foo", "k2": "bar", "index": "applogs"}
[INPUT]
    name cpu
    tag cpu

# Move CPU metrics to be nested under "fields" and 
# add the prefix "metric_name:" to all metrics
# NOTE: you can change Wildcard field to only select metric fields    
[FILTER]
    Name nest
    Match cpu
    Wildcard *
    Operation nest
    Nest_under fields
    Add_Prefix metric_name:

# Add index, source, sourcetype
[FILTER]
    Name    modify
    Match   cpu
    Set index cpu-metrics 
    Set source fluent-bit
    Set sourcetype custom

# ensure splunk_send_raw is on
[OUTPUT]
    name splunk 
    match *
    host <HOST>
    port 8088
    splunk_send_raw on
    splunk_token f9bd5bdb-c0b2-4a83-bcff-9625e5e908db 
    tls on
    tls.verify off

Amazon S3

Send logs, data, metrics to Amazon S3

The Amazon S3 output plugin allows you to ingest your records into the S3 cloud object store.

The plugin can upload data to S3 using the multipart upload API or using S3 PutObject. Multipart is the default and is recommended; Fluent Bit will stream data in a series of 'parts'. This limits the amount of data it has to buffer on disk at any point in time. By default, every time 5 MiB of data have been received, a new 'part' will be uploaded. The plugin can create files up to gigabytes in size from many small chunks/parts using the multipart API. All aspects of the upload process are configurable using the configuration options.

The plugin allows you to specify a maximum file size and a timeout for uploads. A file will be created in S3 when the maximum size is reached or the timeout is reached, whichever comes first.

Records are stored in files in S3 as newline delimited JSON.
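
As an illustrative sketch, a file delivered to S3 from the CPU input would contain one JSON object per line, using the default json_date_key (date) and json_date_format (iso8601); the values below are made up:

{"date":"2020-01-01T00:00:00.000000Z","cpu_p":0.0,"user_p":0.0,"system_p":0.0}
{"date":"2020-01-01T00:00:01.000000Z","cpu_p":1.0,"user_p":0.5,"system_p":0.5}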

See here for details on how AWS credentials are fetched.

Configuration Parameters

Key
Description
Default

region

The AWS region of your S3 bucket

us-east-1

bucket

S3 Bucket name

None

json_date_key

Specify the name of the time key in the output record. To disable the time key just set the value to false.

date

json_date_format

Specify the format of the date. Supported formats are double, epoch, iso8601 (eg: 2018-05-30T09:39:52.000681Z) and java_sql_timestamp (eg: 2018-05-30 09:39:52.000681)

iso8601

total_file_size

Specifies the size of files in S3. Maximum size is 50G, minimum is 1M.

100M

upload_chunk_size

The size of each 'part' for multipart uploads. Max: 50M

5,242,880 bytes

upload_timeout

Whenever this amount of time has elapsed, Fluent Bit will complete an upload and create a new file in S3. For example, set this value to 60m and you will get a new file every hour.

10m

store_dir

Directory to locally buffer data before sending. When multipart uploads are used, data will only be buffered until the upload_chunk_size is reached.

/tmp/fluent-bit/s3

s3_key_format

Format string for keys in S3. This option supports a UUID, strftime time formatters, and a syntax for selecting parts of the Fluent log tag, inspired by the rewrite_tag filter. Add $UUID in the format string to insert a random string. Add $TAG in the format string to insert the full log tag; add $TAG[0] to insert the first part of the tag in the S3 key. The tag is split into "parts" using the characters specified with the s3_key_format_tag_delimiters option. Add an extension directly after the last piece of the format string to insert a key suffix. If you want to specify a key suffix while in use_put_object mode, you must specify $UUID as well; more explanation can be found in the use_put_object option. See the in-depth examples and tutorial in the documentation.

/fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S

s3_key_format_tag_delimiters

A series of characters which will be used to split the tag into 'parts' for use with the s3_key_format option. See the in depth examples and tutorial in the documentation.

.

use_put_object

Use the S3 PutObject API, instead of the multipart upload API. When this option is on, key extension is only available when $UUID is specified in s3_key_format. If $UUID is not included, a random string will be appended at the end of the format string and the key extension cannot be customized in this case.

false

role_arn

ARN of an IAM role to assume (ex. for cross account access).

None

endpoint

Custom endpoint for the S3 API.

None

sts_endpoint

Custom endpoint for the STS API.

None

canned_acl

Predefined Canned ACL policy for S3 objects.

None

compression

Compression type for S3 objects. 'gzip' is currently the only supported value. The Content-Encoding HTTP Header will be set to 'gzip'. Compression can be enabled when use_put_object is on.

None

content_type

A standard MIME type for the S3 object; this will be set as the Content-Type HTTP header. This option can be enabled when use_put_object is on.

None

send_content_md5

Send the Content-MD5 header with PutObject and UploadPart requests, as is required when Object Lock is enabled.

false

auto_retry_requests

Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues.

false

Permissions

The plugin requires s3:PutObject permission.
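
As a minimal sketch, an IAM policy granting this permission on a hypothetical bucket named my-bucket could look like the following (the bucket name is illustrative; additional permissions may be required depending on your setup, for example when assuming a role):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}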

S3 Key Format and Tag Delimiters

In Fluent Bit, all logs have an associated tag. The s3_key_format option lets you inject the tag into the s3 key using the following syntax:

  • $TAG => the full tag

  • $TAG[n] => the nth part of the tag (index starting at zero). This syntax is copied from the rewrite tag filter. By default, “parts” of the tag are separated with dots, but you can change this with s3_key_format_tag_delimiters.

In the example below, assume the date is January 1st, 2020 00:00:00 and the tag associated with the logs in question is my_app_name-logs.prod.

[OUTPUT]
    Name                         s3
    Match                        *
    bucket                       my-bucket
    region                       us-west-2
    total_file_size              250M
    s3_key_format                /$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S/$UUID.gz
    s3_key_format_tag_delimiters .-

With the delimiters as . and -, the tag will be split into parts as follows:

  • $TAG[0] = my_app_name

  • $TAG[1] = logs

  • $TAG[2] = prod

So the key in S3 will be /prod/my_app_name/2020/01/01/00/00/00/bgdHN1NM.gz.

Reliability

The store_dir is used to temporarily store data before it is uploaded. If Fluent Bit is stopped suddenly, it will try to send all data and complete all uploads before it shuts down. If it cannot send some data, on restart it will look in the store_dir for existing data and will try to send it.

Multipart uploads are ideal for most use cases because they allow the plugin to upload data in small chunks over time. For example, a 1 GB file can be created from 200 chunks of 5 MB each. While the file size in S3 will be 1 GB, only 5 MB will be buffered on disk at any one point in time.

There is one minor drawback to multipart uploads: the file and data will not be visible in S3 until the upload is completed with a CompleteMultipartUpload call. The plugin will attempt to make this call whenever Fluent Bit is shut down to ensure your data is available in S3. It will also store metadata about each upload in the store_dir, ensuring that uploads can be completed when Fluent Bit restarts (assuming it has access to persistent disk and the store_dir files will still be present on restart).

Using S3 without persisted disk

If you run Fluent Bit in an environment without persistent disk, or without the ability to restart Fluent Bit and give it access to the data stored in the store_dir from previous executions, some considerations apply. This might occur if you run Fluent Bit on AWS Fargate.

In these situations, we recommend using the PutObject API, and sending data frequently, to avoid local buffering as much as possible. This will limit data loss in the event Fluent Bit is killed unexpectedly.

The following settings are recommended for this use case:

[OUTPUT]
     Name s3
     Match *
     bucket your-bucket
     region us-east-1
     total_file_size 1M
     upload_timeout 1m
     use_put_object On

Worker support

Fluent Bit 1.7 adds a new feature called workers, which enables outputs to have dedicated threads. The s3 plugin has partial support for workers: it can only support a single worker, and enabling multiple workers will lead to errors or indeterminate behavior.

Example:

[OUTPUT]
     Name s3
     Match *
     bucket your-bucket
     region us-east-1
     total_file_size 1M
     upload_timeout 1m
     use_put_object On
     workers 1

If you enable a single worker, you are enabling a dedicated thread for your S3 output. We recommend starting without workers, evaluating the performance, and then enabling a worker if needed. For most users, the plugin can provide sufficient throughput without workers.

Getting Started

In order to send records into Amazon S3, you can run the plugin from the command line or through the configuration file.

Command Line

The s3 plugin can read the parameters from the command line through the -p argument (property), e.g.:

$ fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -m '*' -f 1

Configuration File

In your main configuration file append the following Output section:

[OUTPUT]
     Name s3
     Match *
     bucket your-bucket
     region us-east-1
     store_dir /home/ec2-user/buffer
     total_file_size 50M
     upload_timeout 10m

An example that uses PutObject instead of multipart uploads:

[OUTPUT]
     Name s3
     Match *
     bucket your-bucket
     region us-east-1
     store_dir /home/ec2-user/buffer
     use_put_object On
     total_file_size 10M
     upload_timeout 10m

AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and these plugins.

GitHub

github.com/aws/aws-for-fluent-bit

Amazon ECR Public Gallery

aws-for-fluent-bit

Our images are available in the Amazon ECR Public Gallery. You can download images with different tags by running the following command:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>

For example, you can pull the image with the latest version by running:

docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest

If you see errors for image pull limits, try logging into public ECR with your AWS credentials:

aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

You can check the Amazon ECR Public official doc for more details.

Docker Hub

amazon/aws-for-fluent-bit

Amazon ECR

You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/

For more information, see the AWS for Fluent Bit GitHub repo.

Forward

Forward is the protocol used by Fluentd to route messages between peers. The forward output plugin provides interoperability between Fluent Bit and Fluentd. There are no configuration steps required besides specifying where Fluentd is located, which can be a local or a remote destination.

This plugin offers two different transports and modes:

  • Forward (TCP): It uses a plain TCP connection.

  • Secure Forward (TLS): when TLS is enabled, the plugin switches to Secure Forward mode.

Configuration Parameters

The following parameters are mandatory for either Forward or Secure Forward modes:

Key
Description
Default

Secure Forward Mode Configuration Parameters

When using Secure Forward mode, TLS must be enabled. The following additional configuration parameters are available:

Key
Description
Default

Forward Setup

Before proceeding, make sure that Fluentd is installed. If it is not, please refer to the Fluentd Installation document and go ahead with that first.

Once Fluentd is installed, create the following configuration file example that will allow us to stream data into it:

That configuration file specifies that it will listen for TCP connections on port 24224 through the forward input type. Then, for every message with a fluent_bit TAG, Fluentd will print the message to the standard output.

In one terminal, launch Fluentd specifying the new configuration file created:

Fluent Bit + Forward Setup

Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:

If the TAG parameter is not set, the plugin will retain the tag. Keep in mind that TAG is important for routing rules inside Fluentd.
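
As a configuration file sketch equivalent to the command line form shown later in this section (the host, port and tag values are illustrative):

[OUTPUT]
    Name  forward
    Match *
    Host  127.0.0.1
    Port  24224
    Tag   fluent_bit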

Using the CPU input plugin as an example, we will flush CPU metrics to Fluentd with the tag fluent_bit:

Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:

So we gathered CPU metrics and flushed them out to Fluentd properly.

Fluent Bit + Secure Forward Setup

DISCLAIMER: the following example does not cover certificate generation best practices for production environments.

Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS.

Fluent Bit

Paste this content in a file called flb.conf:

Fluentd

Paste this content in a file called fld.conf:

If you're using Fluentd v1, set it up as below:

Test Communication

Start Fluentd:

Start Fluent Bit:

After five seconds, Fluent Bit will write records to Fluentd. In Fluentd output you will see a message like this:

Host

Target host where Fluent-Bit or Fluentd are listening for Forward messages.

127.0.0.1

Port

TCP Port of the target service.

24224

Time_as_Integer

Set timestamps in integer format; this enables compatibility mode for the Fluentd v0.12 series.

False

Upstream

If Forward will connect to an Upstream instead of a simple host, this property defines the absolute path for the Upstream configuration file. For more details about this, refer to the Upstream Servers documentation section.

Tag

Overwrite the tag as we transmit. This allows the receiving pipeline to start fresh, or to attribute the source.

Send_options

Always send options (with "size"=count of messages)

False

Require_ack_response

Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. (Requires Fluentd v0.14.0+ server)

False

Compress

Set to "gzip" to enable gzip compression. Incompatible with Time_as_Integer=True and tags set dynamically using the Rewrite Tag filter. (Requires Fluentd v0.14.7+ server)

Workers

Enables dedicated thread(s) for this output. This default value has applied since version 1.8.13; for previous versions the default is 0.

2

Shared_Key

A key string known by the remote Fluentd used for authorization.

Empty_Shared_Key

Use this option to connect to Fluentd with a zero-length secret.

False

Username

Specify the username to present to a Fluentd server that enables user_auth.

Password

Specify the password corresponding to the username.

Self_Hostname

Default value of the auto-generated certificate common name (CN).

localhost

tls

Enable or disable TLS support

Off

tls.verify

Force certificate validation

On

tls.debug

Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose).

1

tls.ca_file

Absolute path to CA certificate file

tls.crt_file

Absolute path to Certificate file.

tls.key_file

Absolute path to private Key file.

tls.key_passwd

Optional password for tls.key_file file.

<source>
  type forward
  bind 0.0.0.0
  port 24224
</source>

<match fluent_bit>
  type stdout
</match>
$ fluentd -c test.conf
2017-03-23 11:50:43 -0600 [info]: reading config file path="test.conf"
2017-03-23 11:50:43 -0600 [info]: starting fluentd-0.12.33
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-docker' version '0.1.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flatten-hash' version '0.2.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-influxdb' version '0.2.8'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-json-in-json' version '0.1.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-out-http' version '0.1.3'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-parser' version '0.6.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-record-reformer' version '0.7.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-stdin' version '0.1.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-td' version '0.10.27'
2017-03-23 11:50:43 -0600 [info]: adding match pattern="fluent_bit" type="stdout"
2017-03-23 11:50:43 -0600 [info]: adding source type="forward"
2017-03-23 11:50:43 -0600 [info]: using configuration file: <ROOT>
  <source>
    type forward
    bind 0.0.0.0
    port 24224
  </source>
  <match fluent_bit>
    type stdout
  </match>
</ROOT>
2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224
bin/fluent-bit -i INPUT -o forward://HOST:PORT
$ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224
2017-03-23 11:53:06 -0600 fluent_bit: {"cpu_p":0.0,"user_p":0.0,"system_p":0.0,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":0.0,"cpu1.p_user":0.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 11:53:07 -0600 fluent_bit: {"cpu_p":2.25,"user_p":2.0,"system_p":0.25,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":3.0,"cpu3.p_user":2.0,"cpu3.p_system":1.0}
2017-03-23 11:53:08 -0600 fluent_bit: {"cpu_p":1.75,"user_p":1.0,"system_p":0.75,"cpu0.p_cpu":2.0,"cpu0.p_user":1.0,"cpu0.p_system":1.0,"cpu1.p_cpu":3.0,"cpu1.p_user":1.0,"cpu1.p_system":2.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0}
[SERVICE]
    Flush      5
    Daemon     off
    Log_Level  info

[INPUT]
    Name       cpu
    Tag        cpu_usage

[OUTPUT]
    Name          forward
    Match         *
    Host          127.0.0.1
    Port          24284
    Shared_Key    secret
    Self_Hostname flb.local
    tls           on
    tls.verify    off
<source>
  @type         secure_forward
  self_hostname myserver.local
  shared_key    secret
  secure no
</source>

<match **>
 @type stdout
</match>
<source>
  @type forward
  <transport tls>
    cert_path /etc/td-agent/certs/fluentd.crt
    private_key_path /etc/td-agent/certs/fluentd.key
    private_key_passphrase password
  </transport>
  <security>
    self_hostname myserver.local
    shared_key secret
  </security>
</source>

<match **>
 @type stdout
</match>
$ fluentd -c fld.conf
$ fluent-bit -c flb.conf
2017-03-23 13:34:40 -0600 [info]: using configuration file: <ROOT>
  <source>
    @type secure_forward
    self_hostname myserver.local
    shared_key xxxxxx
    secure no
  </source>
  <match **>
    @type stdout
  </match>
</ROOT>
2017-03-23 13:34:41 -0600 cpu_usage: {"cpu_p":1.0,"user_p":0.75,"system_p":0.25,"cpu0.p_cpu":1.0,"cpu0.p_user":1.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":1.0,"cpu1.p_system":1.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 13:34:42 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.75,"system_p":0.0,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 13:34:43 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.25,"system_p":0.5,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0}
2017-03-23 13:34:44 -0600 cpu_usage: {"cpu_p":5.0,"user_p":3.25,"system_p":1.75,"cpu0.p_cpu":4.0,"cpu0.p_user":2.0,"cpu0.p_system":2.0,"cpu1.p_cpu":8.0,"cpu1.p_user":5.0,"cpu1.p_system":3.0,"cpu2.p_cpu":4.0,"cpu2.p_user":3.0,"cpu2.p_system":1.0,"cpu3.p_cpu":4.0,"cpu3.p_user":2.0,"cpu3.p_system":2.0}

Elasticsearch

Send logs to Elasticsearch (including Amazon OpenSearch Service)

The es output plugin allows you to ingest your records into an Elasticsearch database. The following instructions assume that you have a fully operational Elasticsearch service running in your environment.

Configuration Parameters

Key
Description
default

Host

IP address or hostname of the target Elasticsearch instance

127.0.0.1

Port

TCP port of the target Elasticsearch instance

9200

Path

Elasticsearch accepts new data on the HTTP query path "/_bulk". But it is also possible to serve Elasticsearch behind a reverse proxy on a subpath. This option defines such a path on the fluent-bit side. It simply adds a path prefix in the indexing HTTP POST URI.

Empty string

Buffer_Size

Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where it is required to read full responses; note that the response size grows depending on the number of records inserted. To use an unlimited amount of memory, set this value to False; otherwise the value must conform to the Unit Size specification.

4KB

Pipeline

Newer versions of Elasticsearch allow setting up filters called pipelines. This option defines which pipeline the database should use. For performance reasons, it is strongly suggested to do parsing and filtering on the Fluent Bit side and avoid pipelines.

AWS_Auth

Enable AWS Sigv4 Authentication for Amazon OpenSearch Service

Off

AWS_Region

Specify the AWS region for Amazon OpenSearch Service

AWS_STS_Endpoint

Specify the custom sts endpoint to be used with STS API for Amazon OpenSearch Service

AWS_Role_ARN

AWS IAM Role to assume to put records to your Amazon cluster

AWS_External_ID

External ID for the AWS IAM Role specified with aws_role_arn

Cloud_ID

If you are using Elastic's Elasticsearch Service, you can specify the cloud_id of the cluster that is running.

Cloud_Auth

Specify the credentials to use to connect to Elastic's Elasticsearch Service running on Elastic Cloud

HTTP_User

Optional username credential for Elastic X-Pack access

HTTP_Passwd

Password for user defined in HTTP_User

Index

Index name

fluent-bit

Type

Type name

_doc

Logstash_Format

Enable Logstash format compatibility. This option takes a boolean value: True/False, On/Off

Off

Logstash_Prefix

When Logstash_Format is enabled, the Index name is composed using a prefix and the date, e.g., if Logstash_Prefix is equal to 'mydata', your index will become 'mydata-YYYY.MM.DD'. The last string appended belongs to the date when the data is being generated.

logstash

Logstash_DateFormat

Time format (based on strftime) to generate the second part of the Index name.

%Y.%m.%d

Time_Key

When Logstash_Format is enabled, each record will get a new timestamp field. The Time_Key property defines the name of that field.

@timestamp

Time_Key_Format

When Logstash_Format is enabled, this property defines the format of the timestamp.

%Y-%m-%dT%H:%M:%S

Time_Key_Nanos

When Logstash_Format is enabled, enabling this property sends nanosecond precision timestamps.

Off

Include_Tag_Key

When enabled, it appends the Tag name to the record.

Off

Tag_Key

When Include_Tag_Key is enabled, this property defines the key name for the tag.

_flb-key

Generate_ID

When enabled, generate _id for outgoing records. This prevents duplicate records when retrying ES.

Off

Id_Key

If set, _id will be the value of the key from the incoming record, and the Generate_ID option is ignored.

Replace_Dots

When enabled, replace field name dots with underscores; this is required by Elasticsearch 2.0-2.3.

Off

Trace_Output

When enabled, print the Elasticsearch API calls to stdout (for diagnostics only).

Off

Trace_Error

When enabled, print the Elasticsearch API calls to stdout when Elasticsearch returns an error (for diagnostics only).

Off

Current_Time_Index

Use current time for index generation instead of message record

Off

Logstash_Prefix_Key

When included, the value of this key in the record will be looked up and will overwrite the Logstash_Prefix for index generation. If the key/value is not found in the record, the Logstash_Prefix option will act as a fallback. Nested keys are not supported (if desired, you can use the nest filter plugin to remove nesting).

Suppress_Type_Name

When enabled, the mapping type is removed and the Type option is ignored. Types are deprecated in APIs in v7.0. This option is for v7.0 or later.

Off

Workers

Enables dedicated thread(s) for this output. This default value has applied since version 1.8.13; for previous versions the default is 0.

2

The parameters index and type can be confusing if you are new to Elastic; if you have used a common relational database before, they can be compared to the database and table concepts. Also see the FAQ below.

TLS / SSL

The Elasticsearch output plugin supports TLS/SSL. For more details about the available properties and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into an Elasticsearch service, you can run the plugin from the command line or through the configuration file:

Command Line

The es plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

es://host:port/index/type

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
    -o stdout -m '*'

which is similar to doing:

$ fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 \
    -p Index=my_index -p Type=my_type -o stdout -m '*'

Configuration File

In your main configuration file append the following Input & Output sections. You can visualize this configuration here

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  es
    Match *
    Host  192.168.2.3
    Port  9200
    Index my_index
    Type  my_type
example configuration visualization from config.calyptia.com

About Elasticsearch field names

Some input plugins may generate messages where the field names contain dots. Since Elasticsearch 2.0 this is no longer allowed, so the current es plugin replaces them with an underscore, e.g.:

{"cpu0.p_cpu"=>17.000000}

becomes

{"cpu0_p_cpu"=>17.000000}

FAQ

Elasticsearch rejects requests saying "the final mapping would have more than 1 type"

Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.

[OUTPUT]
    Name  es
    Match foo.*
    Index search
    Type  type1

[OUTPUT]
    Name  es
    Match bar.*
    Index search
    Type  type2

If you see an error message like below, you'll need to fix your configuration to use a single type on each index.

Rejecting mapping update to [search] as the final mapping would have more than 1 type

For details, please read the official blog post on that issue.

Elasticsearch rejects requests saying "Document mapping type name can't start with '_'"

Fluent Bit v1.5 changed the default mapping type from flb_type to _doc, which matches the recommendation from Elasticsearch from version 6.2 forwards (see commit with rationale). This doesn't work in Elasticsearch versions 5.6 through 6.1 (see Elasticsearch discussion and fix). Ensure you set an explicit type (such as doc or flb_type) in the configuration, as seen on the last line:

[OUTPUT]
    Name  es
    Match *
    Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
    Port  443
    Index my_index
    AWS_Auth On
    AWS_Region us-west-2
    tls   On
    Type  doc

Fluent Bit + Amazon OpenSearch Service

The Amazon OpenSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon OpenSearch Service with IAM Authentication.

See here for details on how AWS credentials are fetched.

Example configuration:

[OUTPUT]
    Name  es
    Match *
    Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
    Port  443
    Index my_index
    Type  my_type
    AWS_Auth On
    AWS_Region us-west-2
    tls     On

Notice that the Port is set to 443, tls is enabled, and AWS_Region is set.

Fluent Bit + Elastic Cloud

Fluent Bit supports connecting to Elastic Cloud by providing just the cloud_id and the cloud_auth settings.

Example configuration:

[OUTPUT]
    Name es
    Include_Tag_Key true
    Tag_Key tags
    tls On
    tls.verify Off
    cloud_id elastic-obs-deployment:ZXVybxxxxxxxxxxxg==
    cloud_auth elastic:2vxxxxxxxxYV

Validation Failed: 1: an id must be provided if version type or value are set

Since v1.8.2, Fluent Bit started using the create method (instead of index) for data submission. This makes Fluent Bit compatible with Datastreams, introduced in Elasticsearch 7.9.

If you see action_request_validation_exception errors on your pipeline with Fluent Bit >= v1.8.2, you can fix it up by turning on Generate_ID as follows:

[OUTPUT]
    Name es
    Match *
    Host  192.168.12.1
    Generate_ID on