
Outputs

Azure Log Analytics

Send logs, metrics to Azure Log Analytics

The Azure output plugin allows you to ingest your records into the Azure Log Analytics service.

To get more details about how to set up Azure Log Analytics, please refer to the following documentation: Azure Log Analytics

hashtag
Configuration Parameters

hashtag
Getting Started

In order to insert records into an Azure Log Analytics instance, you can run the plugin from the command line or through the configuration file:

hashtag
Command Line

The azure plugin can read the parameters from the command line through the -p argument (property), e.g.:

hashtag
Configuration File

In your main configuration file append the following Input & Output sections:

Counter

Counter is a very simple plugin that counts how many records it's getting upon flush time. Plugin output is as follows:

[TIMESTAMP, NUMBER_OF_RECORDS_NOW] (total = RECORDS_SINCE_IT_STARTED)

hashtag
Getting Started

You can run the plugin from the command line or through the configuration file:

hashtag
Command Line

From the command line you can let Fluent Bit count records with the following options:

hashtag
Configuration File

In your main configuration file append the following Input & Output sections:

hashtag
Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

$ fluent-bit -i cpu -o counter
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name  counter
    Match *
$ bin/fluent-bit -i cpu -o counter -f 1
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2017/07/19 11:19:02] [ info] [engine] started
1500484743,1 (total = 1)
1500484744,1 (total = 2)
1500484745,1 (total = 3)
1500484746,1 (total = 4)
1500484747,1 (total = 5)

Key

Description

default

Customer_ID

Customer ID or WorkspaceID string.

Shared_Key

The primary or the secondary Connected Sources client authentication key.

Log_Type

The name of the event type.

fluentbit

Google Cloud BigQuery

BigQuery output plugin is an experimental plugin that allows you to stream records into the Google Cloud BigQuery service. The implementation does not support the following, which would be expected in a full production version:

  • Application Default Credentials.

  • Data deduplication using insertId.

  • Template tables using templateSuffix.

hashtag
Google Cloud Configuration

Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Therefore, before using the BigQuery output plugin, you must create a service account, create a BigQuery dataset and table, authorize the service account to write to the table, and provide the service account credentials to Fluent Bit.

hashtag
Creating a Service Account

To stream data into BigQuery, the first step is to create a Google Cloud service account for Fluent Bit: Creating a Google Cloud Service Account

hashtag
Creating a BigQuery Dataset and Table

Fluent Bit does not create datasets or tables for your data, so you must create these ahead of time. You must also grant the service account WRITER permission on the dataset: Creating and using datasets

Within the dataset you will need to create a table for the data to reside in. You can follow the Creating and using tables instructions for creating your table. Pay close attention to the schema: it must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to rename the fields for many of the standard inputs (e.g., mem or cpu), as in the sketch below.
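A minimal sketch of such a rename, assuming the mem input and the modify filter; the underscore-based key names are illustrative, not required by BigQuery beyond avoiding dots:

[INPUT]
    Name mem
    Tag  mem

[FILTER]
    Name   modify
    Match  mem
    # Replace dotted keys produced by the mem input with BigQuery-safe names
    Rename Mem.total mem_total
    Rename Mem.used  mem_used
    Rename Mem.free  mem_free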

hashtag
Retrieving Service Account Credentials

Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication credentials. Download the credentials file by following these instructions: Creating and Managing Service Account Keys

hashtag
Configuration Parameters

hashtag
Configuration File

If you are using a Google Cloud Credentials File, the following configuration is enough to get you started:

Amazon Kinesis Data Firehose

Send logs to Amazon Kinesis Firehose

The Amazon Kinesis Data Firehose output plugin allows you to ingest your records into the Firehose service.

This is the documentation for the core Fluent Bit Firehose plugin written in C. It can replace the aws/amazon-kinesis-firehose-for-fluent-bit Golang Fluent Bit plugin released last year. The Golang plugin was named firehose; this new high performance and highly efficient Firehose plugin is called kinesis_firehose to prevent conflicts/confusion.

FlowCounter

FlowCounter is a simple protocol for counting records. The flowcounter output plugin counts records and their size.

hashtag
Configuration Parameters

The plugin supports the following configuration parameters:

Slack

The Slack output plugin delivers records or messages to your preferred Slack channel. It formats the outgoing content in JSON format for readability.

This connector uses the Slack Incoming Webhooks feature to post messages to Slack channels. Using this plugin in conjunction with the Stream Processor is a good combination for alerting.

hashtag
Slack Webhook

Before configuring this plugin, make sure to set up your Incoming Webhook. For detailed step-by-step instructions, review the following official document:

$ fluent-bit -i cpu -o azure -p customer_id=abc -p shared_key=def -m '*' -f 1
[INPUT]
    Name  cpu

[OUTPUT]
    Name        azure
    Match       *
    Customer_ID abc
    Shared_Key  def

Key

Description

default

google_service_credentials

Absolute path to a Google Cloud credentials JSON file

Value of the environment variable $GOOGLE_SERVICE_CREDENTIALS

project_id

The project id containing the BigQuery dataset to stream into.

The value of the project_id in the credentials file

dataset_id

The dataset id of the BigQuery dataset to write into. This dataset must exist in your project.

table_id

The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output.

Description

Default

Unit

The unit of duration. (second/minute/hour/day)

minute

hashtag
Getting Started

You can run the plugin from the command line or through the configuration file:

hashtag
Command Line

From the command line you can let Fluent Bit count records with the following options:

hashtag
Configuration File

In your main configuration file append the following Input & Output sections:

hashtag
Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

Key

  • https://api.slack.com/messaging/webhooks#getting_started

  • Once you have obtained the Webhook address you can place it in the configuration below.

    hashtag
    Configuration Parameters

    Key

    Description

    Default

    webhook

    Absolute address of the Webhook provided by Slack

    hashtag
    Configuration File

    Get started quickly with this configuration file:

    [INPUT]
        Name  dummy
        Tag   dummy
    
    [OUTPUT]
        Name       bigquery
        Match      *
        dataset_id my_dataset
        table_id   dummy_table
    $ fluent-bit -i cpu -o flowcounter
    [INPUT]
        Name cpu
        Tag  cpu
    
    [OUTPUT]
        Name flowcounter
        Match *
        Unit second
    $ fluent-bit -i cpu -o flowcounter  
    Fluent Bit v1.x.x
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2016/12/23 11:01:20] [ info] [engine] started
    [out_flowcounter] cpu.0:[1482458540, {"counts":60, "bytes":7560, "counts/minute":1, "bytes/minute":126 }]
    [OUTPUT]
        name                 slack
        match                *
        webhook              https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
    hashtag
    Configuration Parameters
    Key
    Description

    region

    The AWS region.

    delivery_stream

    The name of the Kinesis Firehose Delivery stream that you want log records sent to.

    time_key

    Add the timestamp to the record under this key. By default the timestamp from Fluent Bit will not be added to records sent to Kinesis.

    time_key_format

    strftime compliant format string for the timestamp; for example, the default is '%Y-%m-%dT%H:%M:%S'. This option is used with time_key.

    log_key

    By default, the whole log record will be sent to Firehose. If you specify a key name with this option, then only the value of that key will be sent to Firehose. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to Firehose.

    hashtag
    Getting Started

    In order to send records into Amazon Kinesis Data Firehose, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The firehose plugin, can read the parameters from the command line through the -p argument (property), e.g:

    hashtag
    Configuration File

    In your main configuration file append the following Output section:

    hashtag
    AWS for Fluent Bit

    Amazon distributes a container image with Fluent Bit and these plugins.

    hashtag
    GitHub

    github.com/aws/aws-for-fluent-bit

    hashtag
    Amazon ECR Public Gallery

    aws-for-fluent-bit

    Our images are available in the Amazon ECR Public Gallery. You can download images with different tags by using the following command:

    For example, you can pull the image with the latest version by running:

    If you see errors for image pull limits, try logging into public ECR with your AWS credentials:

    You can check the Amazon ECR Public official doc for more details.

    hashtag
    Docker Hub

    amazon/aws-for-fluent-bit

    hashtag
    Amazon ECR

    You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

    For more see the AWS for Fluent Bit github repo.

    Datadog

    Send logs to Datadog

    The Datadog output plugin allows you to ingest your logs into Datadog.

    Before you begin, you need a Datadog account, a Datadog API key, and you need to activate Datadog Logs Management.

    hashtag
    Configuration Parameters

    hashtag
    Configuration File

    Get started quickly with this configuration file:

    hashtag
    Troubleshooting

    hashtag
    403 Forbidden

    If you get a 403 Forbidden error response, double check that you have a valid Datadog API key and that you have activated Datadog Logs Management.

    Amazon CloudWatch

    Send logs and metrics to Amazon CloudWatch

    The Amazon CloudWatch output plugin allows you to ingest your records into the CloudWatch Logs service. Support for CloudWatch Metrics is also provided via EMF.

    This is the documentation for the core Fluent Bit CloudWatch plugin written in C. It can replace the aws/amazon-cloudwatch-logs-for-fluent-bit Golang Fluent Bit plugin released last year. The Golang plugin was named cloudwatch; this new high performance CloudWatch plugin is called cloudwatch_logs to prevent conflicts/confusion. Check the Amazon repo for the Golang plugin for details on the deprecation/migration plan for the original plugin.

    hashtag
    Configuration Parameters

    hashtag
    Getting Started

    In order to send records into Amazon Cloudwatch, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The cloudwatch plugin, can read the parameters from the command line through the -p argument (property), e.g:

    hashtag
    Configuration File

    In your main configuration file append the following Output section:

    hashtag
    Metrics Tutorial

    Fluent Bit has different input plugins (cpu, mem, disk, netif) to collect host resource usage metrics. The cloudwatch_logs output plugin can be used to send these host metrics to CloudWatch in Embedded Metric Format (EMF). If data comes from any of the above-mentioned input plugins, the cloudwatch_logs output plugin will convert them to EMF format and send them to CloudWatch as JSON logs. Additionally, if we set json/emf as the value of the log_format config option, CloudWatch will extract custom metrics from the embedded JSON payload.

    Note: Right now, only cpu and mem metrics can be sent to CloudWatch.

    For using the mem input plugin and sending memory usage metrics to CloudWatch, we can consider the following example config file. Here, we use the aws filter which adds ec2_instance_id and az (availability zone) to the log records. Later, in the output config section, we set ec2_instance_id as our metric dimension.

    The following config will set two dimensions to all of our metrics- ec2_instance_id and az.

    hashtag
    AWS for Fluent Bit

    Amazon distributes a container image with Fluent Bit and these plugins.

    hashtag
    GitHub

    hashtag
    Amazon ECR Public Gallery

    Our images are available in the Amazon ECR Public Gallery. You can download images with different tags by using the following command:

    For example, you can pull the image with the latest version by running:

    If you see errors for image pull limits, try logging into public ECR with your AWS credentials:

    You can check the Amazon ECR Public official doc for more details.

    hashtag
    Docker Hub

    hashtag
    Amazon ECR

    You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

    For more see the AWS for Fluent Bit github repo.

    NULL

    The null output plugin just throws away events.

    hashtag
    Configuration Parameters

    The plugin doesn't support configuration parameters.

    hashtag
    Getting Started

    You can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    From the command line you can let Fluent Bit throw away events with the following options:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    Kafka REST Proxy

    The kafka-rest output plugin allows you to flush your records into a Kafka REST Proxy server. The following instructions assume that you have a fully operational Kafka REST Proxy and Kafka services running in your environment.

    hashtag
    Configuration Parameters

    File

    The file output plugin allows you to write the data received through the input plugin to a file.

    hashtag
    Configuration Parameters

    The plugin supports the following configuration parameters:

    Kafka

    The Kafka output plugin allows you to ingest your records into an Apache Kafka service. This plugin uses the official librdkafka C library (built-in dependency).

    hashtag
    Configuration Parameters

    $ fluent-bit -i cpu -o kinesis_firehose -p delivery_stream=my-stream -p region=us-west-2 -m '*' -f 1
    [OUTPUT]
        Name  kinesis_firehose
        Match *
        region us-east-1
        delivery_stream my-stream
    docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>
    docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest
    aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
    aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/

    Key

    Description

    Default

    Host

    Required - The Datadog server where you are sending your logs.

    http-intake.logs.datadoghq.com

    TLS

    Required - End-to-end communications security protocol. Datadog recommends setting this to on.

    off

    compress

    Recommended - Compresses the payload in GZIP format. Datadog supports and recommends setting this to gzip.

    apikey

    Required - Your Datadog API key.

    dd_service

    Recommended - The human readable name for your service generating the logs - the name of your application or database.

    dd_source

    Recommended - A human readable name for the underlying technology of your service. For example, postgres or nginx.

    dd_tags

    Optional - The tags you want to assign to your logs in Datadog.

    $ fluent-bit -i cpu -o null

    Host

    IP address or hostname of the target Kafka REST Proxy server

    127.0.0.1

    Port

    TCP port of the target Kafka REST Proxy server

    8082

    Topic

    Set the Kafka topic

    fluent-bit

    Partition

    Set the partition number (optional)

    Message_Key

    Set a message key (optional)

    Time_Key

    The Time_Key property defines the name of the field that holds the record timestamp.

    @timestamp

    Time_Key_Format

    Defines the format of the timestamp.

    %Y-%m-%dT%H:%M:%S

    Include_Tag_Key

    Append the Tag name to the final record.

    Off

    Tag_Key

    If Include_Tag_Key is enabled, this property defines the key name for the tag.

    _flb-key

    hashtag
    TLS / SSL

    The Kafka REST Proxy output plugin supports TLS/SSL. For more details about the properties available and general configuration, please refer to the TLS/SSL section.

    hashtag
    Getting Started

    In order to insert records into a Kafka REST Proxy service, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The kafka-rest plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    Key

    Description

    default

    Path

    Absolute directory path to store files. If not set, Fluent Bit will write the files in its own working directory. Note: this option was added in Fluent Bit v1.4.6.

    File

    Set file name to store the records. If not set, the file name will be the tag associated with the records.

    Format

    The format of the file content. See also Format section. Default: out_file.

    hashtag
    Format

    hashtag
    out_file format

    Outputs time, tag and JSON records. There are no configuration parameters for out_file.

    hashtag
    plain format

    Outputs the records as JSON (without additional tag and timestamp attributes). There are no configuration parameters for the plain format.

    hashtag
    csv format

    Outputs the records as CSV. CSV supports an additional configuration parameter.

    Key

    Description

    Delimiter

    The character to separate each data. Default: ','

    hashtag
    ltsv format

    Output the records as LTSV. LTSV supports an additional configuration parameter.

    Key

    Description

    Delimiter

    The character to separate each pair. Default: '\t'(TAB)

    Label_Delimiter

    The character to separate label and the value. Default: ':'

    hashtag
    template format

    Output the records using a custom format template.

    Key

    Description

    Template

    The format string. Default: '{time} {message}'

    This accepts a formatting template and fills placeholders using corresponding values in a record.

    For example, if you set up the configuration as below:

    You will get the following output:

    hashtag
    Getting Started

    You can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    From the command line you can let Fluent Bit write records to a file with the following options:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    Key

    Description

    [OUTPUT]
        Name        datadog
        Match       *
        Host        http-intake.logs.datadoghq.com
        TLS         on
        compress    gzip
        apikey      <my-datadog-api-key>
        dd_service  <my-app-service>
        dd_source   <my-app-source>
        dd_tags     team:logs,foo:bar
    [INPUT]
        Name cpu
        Tag  cpu
    
    [OUTPUT]
        Name null
        Match *
    $ fluent-bit -i cpu -t cpu -o kafka-rest -p host=127.0.0.1 -p port=8082 -m '*'
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name        kafka-rest
        Match       *
        Host        127.0.0.1
        Port        8082
        Topic       fluent-bit
        Message_Key my_key
    tag: [time, {"key1":"value1", "key2":"value2", "key3":"value3"}]
    {"key1":"value1", "key2":"value2", "key3":"value3"}
    time[delimiter]"value1"[delimiter]"value2"[delimiter]"value3"
    field1[label_delimiter]value1[delimiter]field2[label_delimiter]value2\n
    [INPUT]
      Name mem
    
    [OUTPUT]
      Name file
      Format template
      Template {time} used={Mem.used} free={Mem.free} total={Mem.total}
    1564462620.000254 used=1045448 free=31760160 total=32805608
    $ fluent-bit -i cpu -o file -p path=output.txt
    [INPUT]
        Name cpu
        Tag  cpu
    
    [OUTPUT]
        Name file
        Match *
        Path output.txt

    An optional parameter that can be used to tell CloudWatch the format of the data. A value of json/emf enables CloudWatch to extract custom metrics embedded in a JSON payload. See the Embedded Metric Format documentation.

    role_arn

    ARN of an IAM role to assume (for cross account access).

    auto_create_group

    Automatically create the log group. Valid values are "true" or "false" (case insensitive). Defaults to false.

    endpoint

    Specify a custom endpoint for the CloudWatch Logs API.

    metric_namespace

    An optional string representing the CloudWatch namespace for the metrics. See Metrics Tutorial section below for a full configuration.

    metric_dimensions

    A list of lists containing the dimension keys that will be applied to all metrics. The values within a dimension set MUST also be members on the root-node. For more information about dimensions, see Dimension and Dimensions. In the Fluent Bit config, metric_dimensions is a comma and semicolon separated string. If you have only one list of dimensions, put the values as a comma separated string. If you want to put a list of lists, use semicolons to separate the lists. For example, if you set the value as 'dimension_1,dimension_2;dimension_3', it will be converted to [[dimension_1, dimension_2],[dimension_3]].

    sts_endpoint

    Specify a custom STS endpoint for the AWS STS API.

    Key

    Description

    region

    The AWS region.

    log_group_name

    The name of the CloudWatch Log Group that you want log records sent to.

    log_stream_name

    The name of the CloudWatch Log Stream that you want log records sent to.

    log_stream_prefix

    Prefix for the Log Stream name. The tag is appended to the prefix to construct the full log stream name. Not compatible with the log_stream_name option.

    log_key

    By default, the whole log record will be sent to CloudWatch. If you specify a key name with this option, then only the value of that key will be sent to CloudWatch. For example, if you are using the Fluentd Docker log driver, you can specify log_key log and only the log message will be sent to CloudWatch.

    log_format

    format

    Specify data format, options available: json, msgpack.

    json

    message_key

    Optional key to store the message

    message_key_field

    If set, the value of Message_Key_Field in the record will indicate the message key. If not set nor found in the record, Message_Key will be used (if set).

    timestamp_key

    Set the key to store the record timestamp

    @timestamp

    timestamp_format

    'iso8601' or 'double'

    double

    brokers

    Single or multiple list of Kafka Brokers, e.g.: 192.168.1.3:9092, 192.168.1.4:9092.

    topics

    Single entry or list of topics separated by comma (,) that Fluent Bit will use to send messages to Kafka. If only one topic is set, that one will be used for all records. If multiple topics exist, the one set in the record by Topic_Key will be used.

    fluent-bit

    topic_key

    If multiple Topics exists, the value of Topic_Key in the record will indicate the topic to use. E.g: if Topic_Key is router and the record is {"key1": 123, "router": "route_2"}, Fluent Bit will use topic route_2. Note that if the value of Topic_Key is not present in Topics, then by default the first topic in the Topics list will indicate the topic to be used.

    dynamic_topic

    Adds unknown topics (found in Topic_Key) to Topics, so only a default topic needs to be configured in Topics.

    Off

    queue_full_retries

    Fluent Bit queues data into the rdkafka library; if for some reason the underlying library cannot flush the records, the queue might fill up, blocking new additions of records. The queue_full_retries option sets the number of local retries to enqueue the data. The default value is 10 times, and the interval between each retry is 1 second. Setting queue_full_retries to 0 sets an unlimited number of retries.

    10

    rdkafka.{property}

    {property} can be any librdkafka property.

    Setting rdkafka.log.connection.close to false and rdkafka.request.required.acks to 1 are examples of recommended settings of librdkafka properties.

    hashtag
    Getting Started

    In order to insert records into Apache Kafka, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The kafka plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    Key

    Description

    default

    role_arn

    ARN of an IAM role to assume (for cross account access).

    endpoint

    Specify a custom endpoint for the Firehose API.

    sts_endpoint

    Custom endpoint for the STS API.

    Azure Blob

    Official and Microsoft Certified Azure Storage Blob connector

    The Azure Blob output plugin allows ingesting your records into the Azure Blob Storage service. This connector is designed to use the Append Blob and Block Blob API.

    Our plugin works with the official Azure Service and can also be configured to be used with a service emulator such as Azurite.

    hashtag
    Azure Storage Account

    Before getting started, make sure you already have an Azure Storage account. As a reference, the following link explains step-by-step how to set up your account:

    Standard Output

    The stdout output plugin prints the data received through the input plugin to the standard output. Its usage is very simple, as follows:

    hashtag
    Configuration Parameters

    $ fluent-bit -i cpu -o cloudwatch_logs -p log_group_name=group -p log_stream_name=stream -p region=us-west-2 -m '*' -f 1
    [OUTPUT]
        Name cloudwatch_logs
        Match   *
        region us-east-1
        log_group_name fluent-bit-cloudwatch
        log_stream_prefix from-fluent-bit-
        auto_create_group On
    [SERVICE]
        Log_Level info
    
    [INPUT]
        Name mem
        Tag mem
    
    [FILTER]
        Name aws
        Match *
    
    [OUTPUT]
        Name cloudwatch_logs
        Match *
        log_stream_name fluent-bit-cloudwatch
        log_group_name fluent-bit-cloudwatch
        region us-west-2
        log_format json/emf
        metric_namespace fluent-bit-metrics
        metric_dimensions ec2_instance_id
        auto_create_group true
    [FILTER]
        Name aws
        Match *
    
    [OUTPUT]
        Name cloudwatch_logs
        Match *
        log_stream_name fluent-bit-cloudwatch
        log_group_name fluent-bit-cloudwatch
        region us-west-2
        log_format json/emf
        metric_namespace fluent-bit-metrics
        metric_dimensions ec2_instance_id,az
        auto_create_group true
    docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>
    docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest
    aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
    aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
    $ fluent-bit -i cpu -o kafka -p brokers=192.168.1.3:9092 -p topics=test
    [INPUT]
        Name  cpu
    
    [OUTPUT]
        Name        kafka
        Match       *
        Brokers     192.168.1.3:9092
        Topics      test
    librdkafka propertiesarrow-up-right
    Embedded Metric Formatarrow-up-right
    Dimensionarrow-up-right
    Dimensionsarrow-up-right

    Format

    Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream.

    msgpack

    json_date_key

    Specify the name of the time key in the output record. To disable the time key just set the value to false.

    date

    json_date_format

    Specify the format of the date. Supported formats are double, epoch and iso8601 (eg: 2018-05-30T09:39:52.000681Z)

    double

    hashtag
    Command Line

    We have specified to gather CPU usage metrics and print them out to the standard output in a human readable way:

    No more, no less, it just works.

    Key

    Description

    default

    $ bin/fluent-bit -i cpu -o stdout -v
    $ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v
    Fluent Bit v1.x.x
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2016/10/07 21:52:01] [ info] [engine] started
    [0] cpu.0: [1475898721, {"cpu_p"=>0.500000, "user_p"=>0.250000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>1.000000}]
    [1] cpu.0: [1475898722, {"cpu_p"=>0.250000, "user_p"=>0.250000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>1.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
    [2] cpu.0: [1475898723, {"cpu_p"=>0.750000, "user_p"=>0.250000, "system_p"=>0.500000, "cpu0.p_cpu"=>2.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>1.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
    [3] cpu.0: [1475898724, {"cpu_p"=>1.000000, "user_p"=>0.750000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>2.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>1.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>1.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]
    • Azure Blob Storage Tutorial (Video)

    hashtag
    Configuration Parameters

    We expose different configuration properties. The following table lists all the options available, and the next section has specific configuration details for the official service or the emulator.

    Key

    Description

    default

    account_name

    Azure Storage account name. This configuration property is mandatory

    shared_key

    Specify the Azure Storage Shared Key to authenticate against the service. This configuration property is mandatory.

    container_name

    Name of the container that will contain the blobs. This configuration property is mandatory

    blob_type

    hashtag
    Getting Started

    As mentioned above, you can either deliver records to the official service or an emulator. Below we have an example for each use case.

    hashtag
    Configuration for Azure Storage Service

    The following configuration example generates a random message with a custom tag:

    After you run the configuration file above, you will be able to query the data using the Azure Storage Explorer. The example above will generate the following content in the explorer:

    hashtag
    Configuring and using Azure Emulator: Azurite

    hashtag
    Install and run Azurite

    The quickest way to get started is to install Azurite using npm:

    then run the service:

    hashtag
    Configuring Fluent Bit for Azurite

    Azurite comes with a default account_name and shared_key, so make sure to use the specific values provided in the example below (do an exact copy/paste):

    after running that Fluent Bit configuration you will see the data flowing into Azurite:

    LogDNA

    LogDNA is an intuitive cloud-based log management system that provides an easy interface to query your logs once they are stored.

    The Fluent Bit logdna output plugin allows you to send your logs or events to a LogDNA compliant service like:

    • LogDNA

    • IBM Log Analysis

    Before getting started with the plugin configuration, make sure to obtain the proper account to get access to the service. You can start with a free trial at the following link: LogDNA Sign Up

    hashtag
    Configuration Parameters

    hashtag
    Auto Enrichment & Data Discovery

    One of the features of Fluent Bit + LogDNA integration is the ability to auto enrich each record with further context.

    When the plugin processes each record (or log), it tries to look up specific key names that might contain context for the record in question. The following table describes the keys and the discovery logic:

    hashtag
    Getting Started

    The following configuration example will emit a dummy example record and ingest it into LogDNA. Copy and paste the following content into a file called logdna.conf:
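    A minimal sketch of such a file, assuming the dummy input and your own LogDNA API key (the key, hostname and tags shown are placeholders):

    [SERVICE]
        flush     1
        log_level info

    [INPUT]
        name    dummy
        dummy   {"log": "a dummy log message"}
        samples 1

    [OUTPUT]
        name     logdna
        match    *
        api_key  YOUR_LOGDNA_API_KEY
        hostname my-hostname
        tags     aa,bb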

    run Fluent Bit with the new configuration file:
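    For example, assuming the file above is in the current directory:

    $ fluent-bit -c logdna.conf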

    Fluent Bit output:

    Your record will be available and visible in your LogDNA dashboard after a few seconds.

    hashtag
    Query your Data in LogDNA

    In your LogDNA dashboard, go to the top filters and mark the Tags aa and bb, then you will be able to see your records as the example below:

    NATS

    The nats output plugin allows you to flush your records into a NATS Server end point. The following instructions assume that you have a fully operational NATS Server in place.

    In order to flush records, the nats plugin needs to know two parameters:

    parameter

    description

    default

    host

    IP address or hostname of the NATS Server

    127.0.0.1

    In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g:

    hashtag
    Running

    Fluent Bit only needs to know that it must use the nats output plugin; if no extra information is given, it will use the default values specified in the above table.

    As described above, the target service and storage point can be changed, e.g:

    hashtag
    Data format

    For every set of records flushed to a NATS Server, Fluent Bit uses the following JSON format:

    Each record is an individual entity represented in a JSON array that contains a UNIX_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin will look like this:

    Treasure Data

    The td output plugin allows you to flush your records into the Treasure Data cloud service.

    hashtag
    Configuration Parameters

    The plugin supports the following configuration parameters:

    Key

    hashtag
    Getting Started

    In order to start inserting records into Treasure Data, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line:

    Ideally you don't want to expose your API key on the command line; using a configuration file is highly desired.

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    Amazon S3

    Send logs, data, metrics to Amazon S3

    The Amazon S3 output plugin allows you to ingest your records into the S3 cloud object store.

    The plugin can upload data to S3 using the multipart upload API or using S3 PutObject. Multipart is the default and is recommended; Fluent Bit will stream data in a series of 'parts'. This limits the amount of data it has to buffer on disk at any point in time. By default, every time 5 MiB of data have been received, a new 'part' will be uploaded. The plugin can create files up to gigabytes in size from many small chunks/parts using the multipart API. All aspects of the upload process are configurable using the configuration options.

    The plugin allows you to specify a maximum file size, and a timeout for uploads. A file will be created in S3 when the max size is reached, or the timeout is reached, whichever comes first.

    Records are stored in files in S3 as newline delimited JSON.

    Syslog

    The Syslog output plugin allows you to deliver messages to Syslog servers. It supports RFC3164 and RFC5424 formats through different transports such as UDP, TCP or TLS.

    As of Fluent Bit v1.5.3, the configuration is strict in the sense that you must be aware of the structure of your original record, so you can configure the plugin to use specific keys to compose your outgoing Syslog message.

    Future versions of Fluent Bit are expanding this plugin feature set to support better handling of keys and message composing.

    hashtag

    [SERVICE]
        flush     1
        log_level info
    
    [INPUT]
        name      dummy
        dummy     {"name": "Fluent Bit", "year": 2020}
        samples   1
        tag       var.log.containers.app-default-96cbdef2340.log
    
    [OUTPUT]
        name                  azure_blob
        match                 *
        account_name          YOUR_ACCOUNT_NAME
        shared_key            YOUR_SHARED_KEY
        path                  kubernetes
        container_name        logs
        auto_create_container on
        tls                   on
    $ npm install -g azurite
    $ azurite
    Azurite Blob service is starting at http://127.0.0.1:10000
    Azurite Blob service is successfully listening at http://127.0.0.1:10000
    Azurite Queue service is starting at http://127.0.0.1:10001
    Azurite Queue service is successfully listening at http://127.0.0.1:10001
    [SERVICE]
        flush     1
        log_level info
    
    [INPUT]
        name      dummy
        dummy     {"name": "Fluent Bit", "year": 2020}
        samples   1
        tag       var.log.containers.app-default-96cbdef2340.log
    
    [OUTPUT]
        name                  azure_blob
        match                 *
        account_name          devstoreaccount1
        shared_key            Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
        path                  kubernetes
        container_name        logs
        auto_create_container on
        tls                   off
        emulator_mode         on
        endpoint              http://127.0.0.1:10000
    $ azurite
    Azurite Blob service is starting at http://127.0.0.1:10000
    Azurite Blob service is successfully listening at http://127.0.0.1:10000
    Azurite Queue service is starting at http://127.0.0.1:10001
    Azurite Queue service is successfully listening at http://127.0.0.1:10001
    127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "GET /devstoreaccount1/logs?restype=container HTTP/1.1" 404 -
    127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs?restype=container HTTP/1.1" 201 -
    127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 404 -
    127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log HTTP/1.1" 201 -
    127.0.0.1 - - [03/Sep/2020:17:40:04 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 201 -

    port

    TCP port of the target NATS Server

    4222

    Description

    Default

    API

    The Treasure Data API key. To obtain it please log into the Console and in the API keys box, copy the API key hash.

    Database

    Specify the name of your target database.

    Table

    Specify the name of your target table where the records will be stored.

    Region

    Set the service region, available values: US and JP

    US

    nats://host:port
    $ bin/fluent-bit -i cpu -o nats -V -f 5
    Fluent Bit v1.x.x
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2016/03/04 10:17:33] [ info] Configuration
    flush time     : 5 seconds
    input plugins  : cpu
    collectors     :
    [2016/03/04 10:17:33] [ info] starting engine
    cpu[all] all=3.250000 user=2.500000 system=0.750000
    cpu[i=0] all=3.000000 user=1.000000 system=2.000000
    cpu[i=1] all=3.000000 user=2.000000 system=1.000000
    cpu[i=2] all=2.000000 user=2.000000 system=0.000000
    cpu[i=3] all=6.000000 user=5.000000 system=1.000000
    [2016/03/04 10:17:33] [debug] [in_cpu] CPU 3.25%
    ...
    [
      [UNIX_TIMESTAMP, JSON_MAP_1],
      [UNIX_TIMESTAMP, JSON_MAP_2],
      [UNIX_TIMESTAMP, JSON_MAP_N],
    ]
    [
      [1457108504,{"tag":"fluentbit","cpu_p":1.500000,"user_p":1,"system_p":0.500000}],
      [1457108505,{"tag":"fluentbit","cpu_p":4.500000,"user_p":3,"system_p":1.500000}],
      [1457108506,{"tag":"fluentbit","cpu_p":6.500000,"user_p":4.500000,"system_p":2}]
    ]
    $ fluent-bit -i cpu -o td -p API="abc" -p Database="fluentbit" -p Table="cpu_samples"
    [INPUT]
        Name cpu
        Tag  my_cpu
    
    [OUTPUT]
        Name     td
        Match    *
        API      5713/e75be23caee19f8041dfa635ddfbd0dcd8c8d981
        Database fluentbit
        Table    cpu_samples

    mac

    Mac address. This value is optional.

    ip

    IP address of the local hostname. This value is optional.

    tags

    A list of comma separated strings to group records in LogDNA and simplify the query with filters.

    file

    Optional name of a file being monitored. Note that this value is only set if the record does not contain a reference to it.

    app

    Name of the application. This value is auto discovered on each record, if not found, the default value is used.

    Fluent Bit

    Key

    Description

    Default

    logdna_host

    LogDNA API host address

    logs.logdna.com

    logdna_port

    LogDNA TCP Port

    443

    api_key

    API key to get access to the service. This property is mandatory.

    hostname

    Key

    Description

    level

    If the record contains a key called level or severity, it will populate the context level key with that value. If not found, the context key is not set.

    file

    If the record contains a key called file, it will populate the context file with the value found; otherwise, if the plugin configuration provided a file property, that value will be used instead (see table above).

    app

    If the record contains a key called app, it will populate the context app with the value found, otherwise it will use the value set for app in the configuration property (see table above).

    meta

    if the record contains a key called meta, it will populate the context meta with the value found.

    Name of the local machine or device where Fluent Bit is running.

    When this value is not set, Fluent Bit looks up the hostname and auto-populates the value. If it cannot be found, an unknown value will be set instead.

    hashtag
    Configuration Parameters

    Key

    Description

    Default

    region

    The AWS region of your S3 bucket

    us-east-1

    bucket

    S3 Bucket name

    None

    json_date_format

    Specifies the format of the date. Supported formats are double, iso8601 and epoch.

    iso8601

    total_file_size

    hashtag
    S3 Key Format and Tag Delimiters

    In Fluent Bit, all logs have an associated tag. The s3_key_format option lets you inject the tag into the s3 key using the following syntax:

    • $TAG => the full tag

    • $TAG[n] => the nth part of the tag (index starting at zero). This syntax is copied from the rewrite tag filter. By default, “parts” of the tag are separated with dots, but you can change this with s3_key_format_tag_delimiters.

    In the example below, assume the date is January 1st, 2020 00:00:00 and the tag associated with the logs in question is my_app_name-logs.prod.

    With the delimiters as . and -, the tag will be split into parts as follows:

    • $TAG[0] = my_app_name

    • $TAG[1] = logs

    • $TAG[2] = prod

    So the key in S3 will be prod/my_app_name/2020/01/01/00/00/00.
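    A sketch of an OUTPUT section that would produce that key, assuming the delimiters above (the bucket name is a placeholder):

    [OUTPUT]
        Name                         s3
        Match                        *
        bucket                       my-bucket
        region                       us-east-1
        s3_key_format                /$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S
        s3_key_format_tag_delimiters .-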

    hashtag
    Reliability

    The store_dir is used to temporarily store data before it is uploaded. If Fluent Bit is stopped suddenly it will try to send all data and complete all uploads before it shuts down. If it can not send some data, on restart it will look in the store_dir for existing data and will try to send it.

    Multipart uploads are ideal for most use cases because they allow the plugin to upload data in small chunks over time. For example, 1 GB file can be created from 200 5MB chunks. While the file size in S3 will be 1 GB, only 5 MB will be buffered on disk at any one point in time.

    There is one minor drawback to multipart uploads: the file and data will not be visible in S3 until the upload is completed with a CompleteMultipartUpload call. The plugin will attempt to make this call whenever Fluent Bit is shut down to ensure your data is available in S3. It will also store metadata about each upload in the store_dir, ensuring that uploads can be completed when Fluent Bit restarts (assuming it has access to persistent disk and the store_dir files will still be present on restart).

    hashtag
    Using S3 without persisted disk

    If you run Fluent Bit in an environment without persistent disk, or without the ability to restart Fluent Bit and give it access to the data stored in the store_dir from previous executions, some considerations apply. This might occur if you run Fluent Bit on AWS Fargate.

    In these situations, we recommend using the PutObject API, and sending data frequently, to avoid local buffering as much as possible. This will limit data loss in the event Fluent Bit is killed unexpectedly.

    The following settings are recommended for this use case:
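    A sketch of such settings; the keys exist in the plugin, but the values here are only illustrative and should be tuned to your throughput:

    [OUTPUT]
        Name            s3
        Match           *
        bucket          my-bucket
        region          us-east-1
        use_put_object  On
        total_file_size 1M
        upload_timeout  1m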

    hashtag
    Getting Started

    In order to send records into Amazon S3, you can run the plugin from the command line or through the configuration file.

    hashtag
    Command Line

    The s3 plugin can read the parameters from the command line through the -p argument (property), e.g.:
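    For example (bucket and region are placeholders):

    $ fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -p total_file_size=1M -p upload_timeout=1m -m '*' -f 1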

    hashtag
    Configuration File

    In your main configuration file append the following Output section:
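    A minimal sketch using multipart uploads (the default); bucket, region and store_dir are placeholders:

    [OUTPUT]
        Name            s3
        Match           *
        bucket          my-bucket
        region          us-east-1
        store_dir       /var/log/fluent-bit/s3
        total_file_size 50M
        upload_timeout  10m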

    An example that uses PutObject instead of multipart:
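    A sketch of that variant, again with placeholder values:

    [OUTPUT]
        Name            s3
        Match           *
        bucket          my-bucket
        region          us-east-1
        store_dir       /var/log/fluent-bit/s3
        use_put_object  On
        total_file_size 10M
        upload_timeout  10m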

    hashtag
    AWS for Fluent Bit

    Amazon distributes a container image with Fluent Bit and these plugins.

    hashtag
    GitHub

    github.com/aws/aws-for-fluent-bit

    hashtag
    Docker Hub

    amazon/aws-for-fluent-bit

    hashtag
    Amazon ECR

    You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

    For more see the AWS for Fluent Bit github repo.

    Configuration Parameters

    Key

    Description

    Default

    host

    Domain or IP address of the remote Syslog server.

    127.0.0.1

    port

    TCP or UDP port of the remote Syslog server.

    514

    mode

    Set the desired transport type, the available options are tcp, tls and udp.

    udp

    syslog_format

    hashtag
    Configuration File

    Get started quickly with this configuration file:
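    A minimal sketch, assuming a remote syslog server reachable over UDP and records that already carry a message key (host and key names are placeholders):

    [OUTPUT]
        Name               syslog
        Match              *
        Host               syslog.example.com
        Port               514
        Mode               udp
        Syslog_Format      rfc5424
        Syslog_Message_Key message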

    Specify the desired blob type. Fluent Bit supports appendblob and blockblob.

    appendblob

    auto_create_container

    If container_name does not exist in the remote service, enabling this option will handle the exception and auto-create the container.

    on

    path

    Optional path to store your blobs. If your blob name is myblob, you can specify sub-directories where to store it using path, so setting path to /logs/kubernetes will store your blob in /logs/kubernetes/myblob.

    emulator_mode

    If you want to send data to an Azure emulator service like Azurite, enable this option so the plugin will format the requests to the expected format.

    off

    endpoint

    If you are using an emulator, this option allows you to specify the absolute HTTP address of such service, e.g.: http://127.0.0.1:10000.

    tls

    Enable or disable TLS encryption. Note that Azure service requires this to be turned on.

    off

    InfluxDB

    The influxdb output plugin allows you to flush your records into an InfluxDB time series database. The following instructions assume that you have a fully operational InfluxDB service running in your system.

    hashtag
    Configuration Parameters

    Key

    Description

    hashtag
    TLS / SSL

    The InfluxDB output plugin supports TLS/SSL. For more details about the properties available and general configuration, please refer to the TLS/SSL section.

    hashtag
    Getting Started

    In order to start inserting records into an InfluxDB service, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The influxdb plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

    Using the format specified, you could start Fluent Bit through:
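    A sketch of both pieces, assuming a local InfluxDB instance on its default port (all values are illustrative):

    influxdb://host:port
    $ fluent-bit -i cpu -t cpu -o influxdb://127.0.0.1:8086 -m '*'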

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    hashtag
    Tagging

    Basic example of Tag_Keys usage:
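    A sketch of Tag_Keys, assuming records that carry method and path fields (all names and values are illustrative):

    [OUTPUT]
        Name     influxdb
        Match    *
        Host     127.0.0.1
        Port     8086
        Database fluentbit
        Tag_Keys method path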

    Using Auto_Tags=On in this example would cause an error, because every parsed field value type is a string. This option is best used with metrics-like records where one or more field values are not string typed.

    hashtag
    Testing

    Before starting Fluent Bit, make sure the target database exists on InfluxDB. Using the above example, we will insert the data into a fluentbit database.

    hashtag
    1. Create database

    Log into InfluxDB console:

    Create the database:

    Check the database exists:
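    A sketch of these three steps using the InfluxDB 1.x CLI (adapt to your installation):

    $ influx
    > CREATE DATABASE fluentbit
    > SHOW DATABASES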

    hashtag
    2. Run Fluent Bit

    The following command will gather CPU metrics from the system and send the data to InfluxDB database every five seconds:
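    For example, assuming the local instance and database created above:

    $ fluent-bit -i cpu -t cpu -o influxdb -p host=127.0.0.1 -p port=8086 -p database=fluentbit -m '*' -f 5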

    Note that all records coming from the cpu input plugin have a tag cpu; this tag is used to generate the measurement in InfluxDB.

    hashtag
    3. Query the data

    From InfluxDB console, choose your database:

    Now query some specific fields:

    The CPU input plugin gather more metrics per CPU core, in the above example we just selected three specific metrics. The following query will give a full result:

    hashtag
    4. View tags

    Query tagged keys:

    And now query method key values:

    GELF

    GELF is the Graylog Extended Log Format. The GELF output plugin allows you to send logs in GELF format directly to a Graylog input using TLS, TCP or UDP protocols.

    The following instructions assume that you have a fully operational Graylog server running in your environment.

    hashtag
    Configuration Parameters

    According to the GELF Payload Specification, there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are set with the Gelf_*_Key keys in this plugin.

    hashtag
    TLS / SSL

    The GELF output plugin supports TLS/SSL. For more details about the properties available and general configuration, please refer to the TLS/SSL section.

    hashtag
    Notes

    • If you're using Fluent Bit to collect Docker logs, note that Docker places your log in JSON under key log. So you can set log as your Gelf_Short_Message_Key to send everything in Docker logs to Graylog. In this case, you need your log value to be a string; so don't parse it using JSON parser.

    • The order of looking up the timestamp in this plugin is as follows:

    hashtag
    Configuration File Example

    If you're using Fluent Bit for shipping Kubernetes logs, you can use something like this as your configuration file:
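    A minimal sketch of such a file, assuming the tail input with the standard docker parser, the kubernetes filter with Merge_Log enabled, and a local Graylog GELF TCP input on the default port (paths and host are placeholders):

    [INPUT]
        Name   tail
        Tag    kube.*
        Path   /var/log/containers/*.log
        Parser docker

    [FILTER]
        Name      kubernetes
        Match     kube.*
        Merge_Log On

    [OUTPUT]
        Name                   gelf
        Match                  kube.*
        Host                   127.0.0.1
        Port                   12201
        Mode                   tcp
        Gelf_Short_Message_Key log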

    By default, GELF tcp uses port 12201 and Docker places your logs in /var/log/containers directory. The logs are placed in value of the log key. For example, this is a log saved by Docker:

    If you use a Parser like the docker parser shown above, it decodes your message and extracts the data field (and any others present). This is how this log looks after decoding:

    Now, this is what happens to this log:

    1. Fluent Bit GELF plugin adds "version": "1.1" to it.

    2. The Kubernetes filter, through its Merge_Log feature, unnests fields inside the log key. In our example, it puts data alongside stream and time.

    Finally, this is what our Graylog server input sees:

    Splunk

    Send logs to Splunk HTTP Event Collector

    The Splunk output plugin allows you to ingest your records into a Splunk Enterprise service through the HTTP Event Collector (HEC) interface.

    To get more details about how to set up the HEC in Splunk, please refer to the following documentation: Splunk / Use the HTTP Event Collector

    hashtag
    Configuration Parameters

    hashtag
    TLS / SSL

    The Splunk output plugin supports TLS/SSL. For more details about the properties available and general configuration, please refer to the TLS/SSL section.

    hashtag
    Getting Started

    In order to insert records into a Splunk service, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

    The splunk plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections:

    hashtag
    Data format

    By default, the Splunk output plugin nests the record under the event key in the payload sent to the HEC. It will also append the time of the record to a top level time key.

    If you would like to customize any of the Splunk event metadata, such as the host or target index, you can set Splunk_Send_Raw On in the plugin configuration, and add the metadata as keys/values in the record. Note: with Splunk_Send_Raw enabled, you are responsible for creating and populating the event section of the payload.

    For example, to add a custom index and hostname:
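    A hedged sketch of one way to do that, assuming the dummy input so the record already carries the metadata and the event section (all names and values are placeholders):

    [INPUT]
        Name  dummy
        Dummy {"index": "my-splunk-index", "host": "my-hostname", "event": {"log": "hello world"}}

    [OUTPUT]
        Name            splunk
        Match           *
        Host            splunk.example.com
        Port            8088
        Splunk_Token    YOUR_HEC_TOKEN
        Splunk_Send_Raw On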

    This will create a payload that looks like:
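    Roughly, the record above would be delivered to the HEC as-is (values are the placeholders from the sketch):

    {"index": "my-splunk-index", "host": "my-hostname", "event": {"log": "hello world"}}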

    For more information on the Splunk HEC payload format and all event metadata Splunk accepts, see here:

    hashtag
    Sending Raw Events

    If the option splunk_send_raw has been enabled, the user must take care to put all log details in the event field and only specify fields known to Splunk at the top level of the event; if there is a mismatch, Splunk will return an HTTP 400 error.

    Consider the following example:

    splunk_send_raw off

    splunk_send_raw on

    For up to date information about the valid keys in the top level object, refer to the Splunk documentation:

    New Relic

New Relicarrow-up-right is a data management platform that gives you real-time insights into your data for developers, operations and management teams.

The Fluent Bit nrlogs output plugin allows you to send your logs to the New Relic service.

Before getting started with the plugin configuration, make sure you have an account with access to the service. You can register and start with a free trial at the following link:

    • New Relic Sign Uparrow-up-right

    hashtag
    Configuration Parameters

The following configuration example will emit a dummy record and ingest it into New Relic. Copy and paste the following content into a file called newrelic.conf:

Run Fluent Bit with the new configuration file:

    Fluent Bit output:

    TCP & TLS

The tcp output plugin allows you to send records to a remote TCP server. The payload can be formatted in different ways as required.

    hashtag
    Configuration Parameters

    Key

    Description

    default

    hashtag
    TLS Configuration Parameters

    The following parameters are available to configure a secure channel connection through TLS:

    hashtag
    Command Line

We have specified to gather CPU usage metrics and send them in JSON lines format to a remote endpoint using the netcat service, e.g.:

    hashtag
    Start the TCP listener

Run the following in a separate terminal; netcat will start listening for messages on TCP port 5170:

hashtag
Start Fluent Bit

    No more, no less, it just works.

    Stackdriver

The Stackdriver output plugin allows you to ingest your records into the Google Cloud Stackdriver Loggingarrow-up-right service.

Before getting started with the plugin configuration, make sure to obtain the proper credentials to access the service. We strongly recommend using a common JSON credentials file; reference link:

    • Creating a Google Service Account for Stackdriverarrow-up-right

    Your goal is to obtain a credentials JSON file that will be used later by Fluent Bit Stackdriver output plugin.

    hashtag
    Configuration Parameters

    hashtag
    Configuration File

    If you are using a Google Cloud Credentials File, the following configuration is enough to get started:

    Example configuration file for k8s resource type:

local_resource_id is used by the stackdriver output plugin to set the labels field for different k8s resource types. The Stackdriver plugin will try to find the local_resource_id field in the log entry. If there is no logging.googleapis.com/local_resource_id field in the log, the plugin will construct it by using the tag value of the log.

The local_resource_id should be in the format:

    • k8s_container.<namespace_name>.<pod_name>.<container_name>

    • k8s_node.<node_name>

    • k8s_pod.<namespace_name>.<pod_name>

This implies that if there is no local_resource_id in the log entry, then the tag of the log should match this format. Note that the tag_prefix option means it is not mandatory to use k8s_container (or k8s_node / k8s_pod) as the tag prefix; see the sketch below.
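For illustration only, here is a minimal sketch where the namespace, pod and container names are hypothetical and the tag already follows the expected k8s_container format, so the plugin can derive the local_resource_id directly from it:

[INPUT]
    Name  dummy
    # Hypothetical names: namespace "default", pod "myapp-7f9c", container "myapp".
    # The tag follows k8s_container.<namespace_name>.<pod_name>.<container_name>
    Tag   k8s_container.default.myapp-7f9c.myapp

[OUTPUT]
    Name                  stackdriver
    Match                 k8s_container.*
    Resource              k8s_container
    k8s_cluster_name      my-cluster
    k8s_cluster_location  us-central1-a

With this tag, and no logging.googleapis.com/local_resource_id field in the record, the plugin would construct the local_resource_id as k8s_container.default.myapp-7f9c.myapp.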

    hashtag
    Troubleshooting Notes

    hashtag
    Upstream connection error

    Github reference:

An upstream connection error means Fluent Bit was not able to reach Google services; the error looks like this:

This is caused by a network issue in the environment where Fluent Bit is running; make sure that from the host, container or Pod you can reach the following Google endpoints:

    hashtag
    Fail to process local_resource_id

    The error looks like this:

Perform the following checks:

• If the log entry does not contain the local_resource_id field, does the tag of the log match the format above?

    • If tag_prefix is configured, does the prefix of tag specified in the input plugin match the tag_prefix?

hashtag
Other implementations

Stackdriver officially supports a logging agent based on Fluentd.

We plan to support some special fields in structured payloads. Use cases of special fields are documented here.

    HTTP

The http output plugin allows you to flush your records into an HTTP endpoint. For now the functionality is pretty basic: it issues a POST request with the data records in MessagePack (or JSON) format.

    hashtag
    Configuration Parameters

    [SERVICE]
        flush     1
        log_level info
    
    [INPUT]
        name      dummy
        dummy     {"log":"a simple log message", "severity": "INFO", "meta": {"s1": 12345, "s2": true}, "app": "Fluent Bit"}
        samples   1
    
    [OUTPUT]
        name      logdna
        match     *
        api_key   YOUR_API_KEY_HERE
        hostname  my-hostname
        ip        192.168.1.2
        mac       aa:bb:cc:dd:ee:ff
        tags      aa, bb
    $ fluent-bit -c logdna.conf
    Fluent Bit v1.5.0
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2020/04/07 17:44:37] [ info] [storage] version=1.0.3, initializing...
    [2020/04/07 17:44:37] [ info] [storage] in-memory
    [2020/04/07 17:44:37] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
    [2020/04/07 17:44:37] [ info] [engine] started (pid=2157706)
    [2020/04/07 17:44:37] [ info] [output:logdna:logdna.0] configured, hostname=monox-fluent-bit-2
    [2020/04/07 17:44:37] [ info] [sp] stream processor started
    [2020/04/07 17:44:38] [ info] [output:logdna:logdna.0] logs.logdna.com:443, HTTP status=200
    {"status":"ok","batchID":"f95849a8-ec6c-4775-9d52-30763604df9b:40710:ld72"}
    [OUTPUT]
        Name                         s3
        Match                        *
        bucket                       my-bucket
        region                       us-west-2
        total_file_size              250M
        s3_key_format                $TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S
        s3_key_format_tag_delimiters .-_
    [OUTPUT]
         Name s3
         Match *
         bucket your-bucket
         region us-east-1
         total_file_size 1M
         upload_timeout 1m
$ fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -m '*' -f 1
    [OUTPUT]
         Name s3
         Match *
         bucket your-bucket
         region us-east-1
         store_dir /home/ec2-user/buffer
         total_file_size 50M
         upload_timeout 10m
    [OUTPUT]
         Name s3
         Match *
         bucket your-bucket
         region us-east-1
         store_dir /home/ec2-user/buffer
         use_put_object On
         total_file_size 10M
         upload_timeout 10m
    aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
    [OUTPUT]
        name                 syslog
        match                *
        host                 syslog.yourserver.com
        port                 514
        mode                 udp
        syslog_format        rfc5424
        syslog_maxsize       2048
        syslog_severity_key  severity
        syslog_facility_key  facility
        syslog_hostname_key  hostname
        syslog_appname_key   appname
        syslog_procid_key    procid
        syslog_msgid_key     msgid
        syslog_sd_key        sd
        syslog_message_key   message

syslog_format

Specify the Syslog protocol format to use. The available options are rfc3164 and rfc5424.

    rfc5424

    syslog_maxsize

Set the maximum size allowed per message. The value must be an integer representing the number of bytes allowed. If no value is provided, the default size is set depending on the protocol version specified by syslog_format: rfc3164 sets the maximum size to 1024 bytes, while rfc5424 sets it to 2048 bytes.

    syslog_severity_key

    Specify the name of the key from the original record that contains the Syslog severity number. This configuration is optional.

    syslog_facility_key

    Specify the name of the key from the original record that contains the Syslog facility number. This configuration is optional.

    syslog_hostname_key

    Specify the key name from the original record that contains the hostname that generated the message. This configuration is optional.

    syslog_appname_key

    Specify the key name from the original record that contains the application name that generated the message. This configuration is optional.

    syslog_procid_key

    Specify the key name from the original record that contains the Process ID that generated the message. This configuration is optional.

    syslog_msgid_key

    Specify the key name from the original record that contains the Message ID associated to the message. This configuration is optional.

    syslog_sd_key

    Specify the key name from the original record that contains the Structured Data (SD) content. This configuration is optional.

    syslog_message_key

Specify the key name that contains the message to deliver. Note that this property is mandatory; otherwise the message will be empty.

    default

    Host

    IP address or hostname of the target InfluxDB service

    127.0.0.1

    Port

    TCP port of the target InfluxDB service

    8086

    Database

    InfluxDB database name where records will be inserted

    fluentbit

    Sequence_Tag

    The name of the tag whose value is incremented for the consecutive simultaneous events.

    _seq

    HTTP_User

    Optional username for HTTP Basic Authentication

    HTTP_Passwd

    Password for user defined in HTTP_User

    Tag_Keys

Space-separated list of keys that need to be tagged

    Auto_Tags

    Automatically tag keys where value is string. This option takes a boolean value: True/False, On/Off.

    Off

    TLS/SSLarrow-up-right

    Key

    Description

    default

    Host

    IP address or hostname of the target Splunk service.

    127.0.0.1

    Port

    TCP port of the target Splunk service.

    8088

    Splunk_Token

    Specify the Authentication Tokenarrow-up-right for the HTTP Event Collector interface.

    Splunk_Send_Raw

    When enabled, the record keys and values are set in the top level of the map instead of under the event key.

    note: refer to the Sending Raw Events section below for more details to make this option work properly.

    Off

    HTTP_User

    Optional username for Basic Authentication on HEC

    HTTP_Passwd

    Password for user defined in HTTP_User

    TLS/SSLarrow-up-right
    http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHECarrow-up-right
    http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHECarrow-up-right

    Key

    Description

    Default

    base_uri

    Full address of New Relic API end-point. By default the value points to the US end-point.

    If you want to use the EU end-point you can set this key to the following value: https://log-api.eu.newrelic.com/log/v1arrow-up-right

    https://log-api.newrelic.com/log/v1arrow-up-right

    api_key

Your key for data ingestion. The API key is also called the ingestion key; you can get more details on how to generate it in the official documentation herearrow-up-right.

From a configuration perspective, either an api_key or a license_key is required. New Relic suggests primarily using the api_key.

    license_key

    Optional authentication parameter for data ingestion.

Note that New Relic suggests using the api_key instead. You can read more about the License Key herearrow-up-right.

    compress

    Set the compression mechanism for the payload. This option allows two values: gzip (enabled by default) or false to disable compression.

    gzip

    tls.crt_file

    Absolute path to Certificate file.

    tls.key_file

    Absolute path to private Key file.

    tls.key_passwd

    Optional password for tls.key_file file.

    Host

    Target host where Fluent-Bit or Fluentd are listening for Forward messages.

    127.0.0.1

    Port

    TCP Port of the target service.

    5170

    Format

Specify the data format to be printed. Supported formats are msgpack, json, json_lines and json_stream.

    msgpack

    json_date_key

    Specify the name of the time key in the output record. To disable the time key just set the value to false.

    date

    json_date_format

    Specify the format of the date. Supported formats are double, epoch and iso8601 (eg: 2018-05-30T09:39:52.000681Z)

    double

    Key

    Description

    Default

    tls

    Enable or disable TLS support

    Off

    tls.verify

    Force certificate validation

    On

    tls.debug

Set the TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose)

    1

    tls.ca_file

Absolute path to CA certificate file

CPUarrow-up-right

    global, gce_instance

    k8s_cluster_name

    The name of the cluster that the container (node or pod based on the resource type) is running in. If the resource type is one of the k8s_container, k8s_node or k8s_pod, then this field is required.

    k8s_cluster_location

The physical location of the cluster that contains the container (node or pod based on the resource type). If the resource type is one of k8s_container, k8s_node or k8s_pod, then this field is required.

    labels_key

    The value of this field is used by the Stackdriver output plugin to find the related labels from jsonPayload and then extract the value of it to set the LogEntry Labels.

    logging.googleapis.com/labels

    tag_prefix

    Set the tag_prefix used to validate the tag of logs with k8s resource type. Without this option, the tag of the log must be in format of k8s_container(pod/node).* in order to use the k8s_container resource type. Now the tag prefix is configurable by this option (note the ending dot).

    k8s_container., k8s_pod., k8s_node.

    severity_key

    Specify the name of the key from the original record that contains the severity information.


    Key

    Description

    default

    google_service_credentials

    Absolute path to a Google Cloud credentials JSON file

    Value of environment variable $GOOGLE_SERVICE_CREDENTIALS

    service_account_email

    Account email associated to the service. Only available if no credentials file has been provided.

    Value of environment variable $SERVICE_ACCOUNT_EMAIL

    service_account_secret

    Private key content associated with the service account. Only available if no credentials file has been provided.

    Value of environment variable $SERVICE_ACCOUNT_SECRET

    resource

    #761arrow-up-right
    https://www.googleapis.comarrow-up-right
    https://logging.googleapis.comarrow-up-right
    logging agent based on Fluentdarrow-up-right
    special fields in structured payloadsarrow-up-right
    herearrow-up-right

    Set resource type of data. Supported resource types: k8s_container, k8s_node, k8s_pod, global and gce_instance.

    influxdb://host:port
    $ fluent-bit -i cpu -t cpu -o influxdb://127.0.0.1:8086 -m '*'
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name          influxdb
        Match         *
        Host          127.0.0.1
        Port          8086
        Database      fluentbit
        Sequence_Tag  _seq
    [INPUT]
        Name            tail
        Tag             apache.access
        parser          apache2
        path            /var/log/apache2/access.log
    
    [OUTPUT]
        Name          influxdb
        Match         *
        Host          127.0.0.1
        Port          8086
        Database      fluentbit
        Sequence_Tag  _seq
        # make tags from method and path fields
        Tag_Keys      method path
    $ influx
    Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
    Connected to http://localhost:8086 version 1.1.0
    InfluxDB shell version: 1.1.0
    >
    > create database fluentbit
    >
    > show databases
    name: databases
    name
    ----
    _internal
    fluentbit
    
    >
    $ bin/fluent-bit -i cpu -t cpu -o influxdb -m '*'
    > use fluentbit
    Using database fluentbit
    > SELECT cpu_p, system_p, user_p FROM cpu
    name: cpu
    time                  cpu_p   system_p    user_p
    ----                  -----   --------    ------
    1481132860000000000   2.75        0.5      2.25
    1481132861000000000   2           0.5      1.5
    1481132862000000000   4.75        1.5      3.25
    1481132863000000000   6.75        1.25     5.5
    1481132864000000000   11.25       3.75     7.5
    > SELECT * FROM cpu
    > SHOW TAG KEYS ON fluentbit FROM "apache.access"
    name: apache.access
    tagKey
    ------
    _seq
    method
    path
    > SHOW TAG VALUES ON fluentbit FROM "apache.access" WITH KEY = "method"
    name: apache.access
    key    value
    ---    -----
    method "MATCH"
    method "POST"
    $ fluent-bit -i cpu -t cpu -o splunk -p host=127.0.0.1 -p port=8088 \
      -p tls=on -p tls.verify=off -m '*'
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name        splunk
        Match       *
        Host        127.0.0.1
        Port        8088
        TLS         On
        TLS.Verify  Off
        Message_Key my_key
    [INPUT]
        Name  cpu
        Tag   cpu
    
    # nest the record under the 'event' key
    [FILTER]
        Name nest
        Match *
        Operation nest
        Wildcard *
        Nest_under event
    
    # add event metadata
    [FILTER]
        Name      modify
        Match     *
        Add index my-splunk-index
        Add host  my-host
    
    [OUTPUT]
        Name        splunk
        Match       *
        Host        127.0.0.1
        Splunk_Token xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx
        Splunk_Send_Raw On
    {
        "time": "1535995058.003385189",
        "index": "my-splunk-index",
        "host": "my-host",
        "event": {
            "cpu_p":0.000000,
            "user_p":0.000000,
            "system_p":0.000000
        }
    }
    {"time": ..., "event": {"k1": "foo", "k2": "bar", "index": "applogs"}}
    {"time": .., "k1": "foo", "k2": "bar", "index": "applogs"}
    [SERVICE]
        flush     1
        log_level info
    
    [INPUT]
        name      dummy
        dummy     {"message":"a simple message", "temp": "0.74", "extra": "false"}
        samples   1
    
    [OUTPUT]
        name      nrlogs
        match     *
        api_key   YOUR_API_KEY_HERE
    $ fluent-bit -c newrelic.conf
    Fluent Bit v1.5.0
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2020/04/10 10:58:32] [ info] [storage] version=1.0.3, initializing...
    [2020/04/10 10:58:32] [ info] [storage] in-memory
    [2020/04/10 10:58:32] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
    [2020/04/10 10:58:32] [ info] [engine] started (pid=2772591)
    [2020/04/10 10:58:32] [ info] [output:newrelic:newrelic.0] configured, hostname=log-api.newrelic.com:443
    [2020/04/10 10:58:32] [ info] [sp] stream processor started
    [2020/04/10 10:58:35] [ info] [output:nrlogs:nrlogs.0] log-api.newrelic.com:443, HTTP status=202
    {"requestId":"feb312fe-004e-b000-0000-0171650764ac"}
    $ bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v
    $ nc -l 5170
    $ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v
    Fluent Bit v1.x.x
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2016/10/07 21:52:01] [ info] [engine] started
    [0] cpu.0: [1475898721, {"cpu_p"=>0.500000, "user_p"=>0.250000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>1.000000}]
    [1] cpu.0: [1475898722, {"cpu_p"=>0.250000, "user_p"=>0.250000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>1.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
    [2] cpu.0: [1475898723, {"cpu_p"=>0.750000, "user_p"=>0.250000, "system_p"=>0.500000, "cpu0.p_cpu"=>2.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>1.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}]
    [3] cpu.0: [1475898724, {"cpu_p"=>1.000000, "user_p"=>0.750000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>2.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>1.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>1.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}]
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name        stackdriver
        Match       *
    [INPUT]
        Name               tail
        Tag_Regex          var.log.containers.(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
        Tag                custom_tag.<namespace_name>.<pod_name>.<container_name>
        Path               /var/log/containers/*.log
        Parser             docker
        DB                 /var/log/fluent-bit-k8s-container.db
    
    [OUTPUT]
        Name        stackdriver
        Match       custom_tag.*
        Resource    k8s_container
        k8s_cluster_name test_cluster_name
        k8s_cluster_location  test_cluster_location
        tag_prefix  custom_tag.
    [2019/01/07 23:24:09] [error] [oauth2] could not get an upstream connection
    [2020/08/04 14:43:03] [error] [output:stackdriver:stackdriver.0] fail to process local_resource_id from log entry for k8s_container

    udp

    Gelf_Short_Message_Key

    A short descriptive message (MUST be set in GELF)

    short_message

    Gelf_Timestamp_Key

    Your log timestamp (SHOULD be set in GELF)

    timestamp

    Gelf_Host_Key

Key whose value is used as the name of the host, source or application that sent this message. (MUST be set in GELF)

    host

    Gelf_Full_Message_Key

Key to use as the long message that can, for example, contain a backtrace. (Optional in GELF)

    full_message

    Gelf_Level_Key

Key to be used as the log level. Its value must be a standard syslog level (between 0 and 7). (Optional in GELF)

    level

    Packet_Size

If the transport protocol is udp, you can set the size of the packets to be sent.

    1420

    Compress

If the transport protocol is udp, you can enable this if you want your UDP packets to be compressed.

    true

  • Value of Gelf_Timestamp_Key provided in configuration

  • Value of timestamp key

• If you're using the Docker JSON parserarrow-up-right, this parser can parse time and use it as the timestamp of the message. If all of the above fail, Fluent Bit tries to get the timestamp extracted by your parser.

• If no timestamp is set by Fluent Bit, your Graylog server will set it to the current timestamp (now).

  • Your log timestamp has to be in UNIX Epoch Timestamparrow-up-right format. If the Gelf_Timestamp_Key value of your log is not in this format, your Graylog server will ignore it.

• If you're using Fluent Bit in Kubernetes and you're using the Kubernetes Filter Pluginarrow-up-right, this plugin adds the host value to your log by default, and you don't need to add it on your own.

• The version of the GELF message is also mandatory and Fluent Bit sets it to 1.1, which is the current latest version of GELF.

• If you use udp as the transport protocol and set Compress to true, Fluent Bit compresses your packets in GZIP format, which is the default compression that Graylog offers. This can be used to trade extra CPU load for savings in network bandwidth (see the sketch after these notes).

• We used the data key as Gelf_Short_Message_Key, so the GELF plugin renames it to short_message.

  • Kubernetes Filterarrow-up-right adds host name.

  • Timestamp is generated.

  • Any custom field (not present in GELF Payload Specificationarrow-up-right) is prefixed by an underline.
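As a minimal sketch of that UDP compression trade-off (the Graylog host placeholder and the choice of Gelf_Short_Message_Key are assumptions, not taken from the example above):

[OUTPUT]
    Name                    gelf
    Match                   *
    Host                    <your-graylog-server>
    Port                    12201
    # UDP transport with GZIP compression: less bandwidth, more CPU
    Mode                    udp
    Packet_Size             1420
    Compress                true
    Gelf_Short_Message_Key  log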

  • Key

    Description

    default

    Match

Pattern to match the tags of records to be sent by this plugin

    Host

    IP address or hostname of the target Graylog server

    127.0.0.1

    Port

    The port that your Graylog GELF input is listening on

    12201

    Mode

    TLS/SSLarrow-up-right
    Tail Inputarrow-up-right
    stdoutarrow-up-right
    Nest Filterarrow-up-right

    The protocol to use (tls, tcp or udp)

    host

    IP address or hostname of the target HTTP Server

    127.0.0.1

    http_User

    Basic Auth Username

    http_Passwd

    Basic Auth Password. Requires HTTP_User to be set

    port

    TCP port of the target HTTP Server

    80

    Proxy

Specify an HTTP proxy. The expected format of this value is http://host:port. Note that https is not supported yet. Please consider not setting this and using the HTTP_PROXY environment variable instead, which supports both http and https.

    uri

    Specify an optional HTTP URI for the target web server, e.g: /something

    /

    compress

    Set payload compression mechanism. Option available is 'gzip'

    format

Specify the data format to be used in the HTTP request body; by default it uses msgpack. Other supported formats are json, json_stream, json_lines and gelf.

    msgpack

    allow_duplicated_headers

    Specify if duplicated headers are allowed. If a duplicated header is found, the latest key/value set is preserved.

    true

    header_tag

    Specify an optional HTTP header field for the original message tag.

    header

Add an HTTP header key/value pair. Multiple headers can be set.

    json_date_key

    Specify the name of the time key in the output record. To disable the time key just set the value to false.

    date

    json_date_format

    Specify the format of the date. Supported formats are double, epoch and iso8601 (eg: 2018-05-30T09:39:52.000681Z)

    double

    gelf_timestamp_key

    Specify the key to use for timestamp in gelf format

    gelf_host_key

    Specify the key to use for the host in gelf format

gelf_short_message_key

    Specify the key to use as the short message in gelf format

    gelf_full_message_key

    Specify the key to use for the full message in gelf format

    gelf_level_key

    Specify the key to use for the level in gelf format

    hashtag
    TLS / SSL

The HTTP output plugin supports TLS/SSL. For more details about the available properties and general configuration, please refer to the TLS/SSLarrow-up-right section.

    hashtag
    Getting Started

    In order to insert records into a HTTP server, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

The http plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

    Using the format specified, you could start Fluent Bit through:

    hashtag
    Configuration File

    In your main configuration file, append the following Input & Output sections:

By default, the URI becomes the tag of the message and the original tag is ignored. To retain the tag, multiple output sections have to be defined, each matching different tags and flushing to different URIs.

Another approach we also support is sending the original message tag in a configurable header. It's up to the receiver to do what it wants with that header field: parse it and use it as the tag, for example.

    To configure this behaviour, add this config:

    Provided you are using Fluentd as data receiver, you can combine in_http and out_rewrite_tag_filter to make use of this HTTP header.

Notice how we override the tag, which comes from the URI path, with our custom header:

    hashtag
    Example : Add a header

    hashtag
    Example : Sumo Logic HTTP Collector

    Suggested configuration for Sumo Logic using json_lines with iso8601 timestamps. The PrivateKey is specific to a configured HTTP collector.

    A sample Sumo Logic query for the CPUarrow-up-right input. (Requires json_lines format with iso8601 date format for the timestamp field).

    Key

    Description

    MessagePackarrow-up-right

    default

total_file_size

Specifies the size of files in S3. Maximum size is 50G, minimum is 1M.

    100M

    upload_chunk_size

    The size of each 'part' for multipart uploads. Max: 50M

    5,242,880 bytes

    upload_timeout

    Whenever this amount of time has elapsed, Fluent Bit will complete an upload and create a new file in S3. For example, set this value to 60m and you will get a new file every hour.

    10m

    store_dir

    Directory to locally buffer data before sending. When multipart uploads are used, data will only be buffered until the upload_chunk_size is reached.

    /tmp/fluent-bit/s3

    s3_key_format

    Format string for keys in S3. This option supports strftime time formatters and a syntax for selecting parts of the Fluent log tag using a syntax inspired by the rewrite_tag filter. Add $TAG in the format string to insert the full log tag; add $TAG[0] to insert the first part of the tag in the s3 key. The tag is split into “parts” using the characters specified with the s3_key_format_tag_delimiters option. See the in depth examples and tutorial in the documentation.

    /fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S

    s3_key_format_tag_delimiters

    A series of characters which will be used to split the tag into 'parts' for use with the s3_key_format option. See the in depth examples and tutorial in the documentation.

    .

    use_put_object

    Use the S3 PutObject API, instead of the multipart upload API.

    false

    role_arn

    ARN of an IAM role to assume (ex. for cross account access).

    None

    endpoint

    Custom endpoint for the S3 API.

    None

    sts_endpoint

    Custom endpoint for the STS API.

    None

    Loki

Lokiarrow-up-right is a multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate.

The Fluent Bit loki built-in output plugin allows you to send your logs or events to a Loki service. It supports data enrichment with Kubernetes labels, custom label keys and Tenant ID, among others.

    hashtag
    Configuration Parameters

    hashtag
    Labels

Loki stores the record logs inside Streams. A stream is defined by a set of labels, and at least one label is required.

Fluent Bit implements a flexible mechanism to set labels by using fixed key/value pairs of text, but it also allows you to set as labels certain keys that exist as part of the records being processed. Consider the following JSON record (pretty printed for readability):

If you decide that your Loki Stream will be composed of two labels, a fixed one called job plus the value of the record key called stream, your labels configuration property might look as follows:

As you can see, the label job has the value fluentbit, and the second label is configured to access the nested map called sub, targeting the value of the key stream. Note that the second label name must start with a $: it is a record accessor pattern, which gives you the ability to retrieve values from nested maps by using key names.

When processing the above configuration, the resulting labels for the stream in question become:

Another feature of label management is the ability to provide custom key names: using the same record accessor pattern, we can specify the key name manually and let the value be populated automatically at runtime, e.g.:

    When processing that new configuration, the internal labels will be:

    hashtag
    Using the label_keys property

The additional configuration property called label_keys allows you to specify multiple record keys that need to be placed as part of the outgoing Stream Labels. This is a feature similar to the one explained above in the labels property. Consider it another way to set a record key in the Stream, but with the limitation that you cannot use a custom name for the key value.

    The following configuration examples generate the same Stream Labels:

The above configuration accomplishes the same as this one:

Both will generate the following Stream labels:

    hashtag
    Kubernetes & Labels

Note that if you are running in a Kubernetes environment, you might want to enable the option auto_kubernetes_labels, which will auto-populate the streams with the Pod labels for you. Consider the following configuration:

Based on the JSON example provided above, the internal stream labels will be:

    hashtag
    Networking and TLS Configuration

This plugin inherits core Fluent Bit features to customize the network behavior and optionally enable TLS in the communication channel. For more details about the specific options available, refer to the following articles:

• Networking Setup: timeouts, keepalive and source address

• Security & TLS: all about TLS configuration and certificates

Note that all options mentioned in the articles above must be set in the configuration of the plugin in question; a rough sketch follows.
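As a sketch only (the host value is a placeholder, and the net.* keys shown are the generic Fluent Bit networking options, assumed here for illustration rather than Loki-specific settings):

[OUTPUT]
    name                 loki
    match                *
    host                 loki.example.com
    port                 3100
    labels               job=fluentbit
    # Generic networking options (assumed values)
    net.connect_timeout  10
    net.keepalive        on
    # TLS options for the communication channel
    tls                  on
    tls.verify           on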

    hashtag
    Getting Started

The following configuration example will emit a dummy example record and ingest it into Loki. Copy and paste the following content into a file called out_loki.conf:

Run Fluent Bit with the new configuration file:

    Fluent Bit output:

    [INPUT]
        Name                    tail
        Tag                     kube.*
        Path                    /var/log/containers/*.log
        Parser                  docker
        DB                      /var/log/flb_kube.db
        Mem_Buf_Limit           5MB
        Refresh_Interval        10
    
    [FILTER]
        Name                    kubernetes
        Match                   kube.*
        Merge_Log_Key           log
        Merge_Log               On
        Keep_Log                Off
        Annotations             Off
        Labels                  Off
    
    [FILTER]
        Name                    nest
        Match                   *
        Operation               lift
        Nested_under            log
    
    [OUTPUT]
        Name                    gelf
        Match                   kube.*
        Host                    <your-graylog-server>
        Port                    12201
        Mode                    tcp
        Gelf_Short_Message_Key  data
    
    [PARSER]
        Name                    docker
        Format                  json
        Time_Key                time
        Time_Format             %Y-%m-%dT%H:%M:%S.%L
        Time_Keep               Off
    {"log":"{\"data\": \"This is an example.\"}","stream":"stderr","time":"2019-07-21T12:45:11.273315023Z"}
    [0] kube.log: [1565770310.000198491, {"log"=>{"data"=>"This is an example."}, "stream"=>"stderr", "time"=>"2019-07-21T12:45:11.273315023Z"}]
    {"version":"1.1", "short_message":"This is an example.", "host": "<Your Node Name>", "_stream":"stderr", "timestamp":1565770310.000199}
    http://host:port/something
    $ fluent-bit -i cpu -t cpu -o http://192.168.2.3:80/something -m '*'
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name  http
        Match *
        Host  192.168.2.3
        Port  80
        URI   /something
    [OUTPUT]
        Name  http
        Match *
        Host  192.168.2.3
        Port  80
        URI   /something
        Format json
        header_tag  FLUENT-TAG
    <source>
      @type http
      add_http_headers true
    </source>
    
    <match something>
      @type rewrite_tag_filter
      <rule>
        key HTTP_FLUENT_TAG
        pattern /^(.*)$/
        tag $1
      </rule>
    </match>
    [OUTPUT]
        Name           http
        Match          *
        Host           127.0.0.1
        Port           9000
        Header         X-Key-A Value_A
        Header         X-Key-B Value_B
        URI            /something
    [OUTPUT]
        Name             http
        Match            *
        Host             collectors.au.sumologic.com
        Port             443
        URI              /receiver/v1/http/[PrivateKey]
        Format           json_lines
        Json_date_key    timestamp
        Json_date_format iso8601
    _sourcecategory="my_fluent_bit"
    | json "cpu_p" as cpu
    | timeslice 1m
    | max(cpu) as cpu group by _timeslice
    standard syslog levelsarrow-up-right
    http://host:portarrow-up-right

    Key

    Description

    Default

    host

    Loki hostname or IP address

    127.0.0.1

    port

    Loki TCP port

    3100

    http_user

    Set HTTP basic authentication user name

    http_passwd

    Set HTTP basic authentication password

    tenant_id

    Tenant ID used by default to push logs to Loki. If omitted or empty it assumes Loki is running in single-tenant mode and no X-Scope-OrgID header is sent.

    labels

Stream labels for the API request. It can accept multiple comma-separated strings specifying key=value pairs. In addition to fixed parameters, it also allows adding custom record keys (similar to the label_keys property). More details in the Labels section.

    job=fluentbit

    label_keys

Optional list of record keys that will be placed as stream labels. This configuration property is for record keys only. More details in the Labels section.

    line_format

Format to use when flattening the record to a log line. Valid values are json or key_value. If set to json, the log line sent to Loki will be the Fluent Bit record dumped as JSON. If set to key_value, the log line will be each item in the record concatenated together (separated by a single space) in key=value format.

    json

    auto_kubernetes_labels

    If set to true, it will add all Kubernetes labels to the Stream labels

    off

    Record Accessorarrow-up-right
    Networking Setup
    Security & TLS
    {
        "key": 1,
        "sub": {
            "stream": "stdout",
            "id": "some id"
        },
        "kubernetes": {
            "labels": {
                "team": "Santiago Wanderers"
            }
        }
    }
    [OUTPUT]
        name   loki
        match  *
        labels job=fluentbit, $sub['stream']
    job="fluentbit", stream="stdout"
    [OUTPUT]
        name   loki
        match  *
        labels job=fluentbit, mystream=$sub['stream']
    job="fluentbit", mystream="stdout"
    [OUTPUT]
        name       loki
        match      *
        labels     job=fluentbit
        label_keys $sub['stream']
    [OUTPUT]
        name   loki
        match  *
        labels job=fluentbit, $sub['stream']
    job="fluentbit", stream="stdout"
    [OUTPUT]
        name                   loki
        match                  *
        labels                 job=fluentbit
        auto_kubernetes_labels on
    job="fluentbit", team="Santiago Wanderers"
    [SERVICE]
        flush     1
        log_level info
    
    [INPUT]
        name      dummy
        dummy     {"key": 1, "sub": {"stream": "stdout", "id": "some id"}, "kubernetes": {"labels": {"team": "Santiago Wanderers"}}}
        samples   1
    
    [OUTPUT]
        name                   loki
        match                  *
        host                   127.0.0.1
        port                   3100
        labels                 job=fluentbit    
        label_keys             $sub['stream']
        auto_kubernetes_labels on
    $ fluent-bit -c out_loki.conf
    Fluent Bit v1.7.0
    * Copyright (C) 2019-2020 The Fluent Bit Authors
    * Copyright (C) 2015-2018 Treasure Data
    * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
    * https://fluentbit.io
    
    [2020/10/14 20:57:45] [ info] [engine] started (pid=809736)
    [2020/10/14 20:57:45] [ info] [storage] version=1.0.6, initializing...
    [2020/10/14 20:57:45] [ info] [storage] in-memory
    [2020/10/14 20:57:45] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
    [2020/10/14 20:57:45] [ info] [output:loki:loki.0] configured, hostname=127.0.0.1:3100
    [2020/10/14 20:57:45] [ info] [sp] stream processor started
    [2020/10/14 20:57:46] [debug] [http] request payload (272 bytes)
    [2020/10/14 20:57:46] [ info] [output:loki:loki.0] 127.0.0.1:3100, HTTP status=204

    PostgreSQL

    PostgreSQLarrow-up-right is a very popular and versatile open source database management system that supports the SQL language and that is capable of storing both structured and unstructured data, such as JSON objects.

    Given that Fluent Bit is designed to work with JSON objects, the pgsql output plugin allows users to send their data to a PostgreSQL database and store it using the JSONB type.

    PostgreSQL 9.4 or higher is required.

    hashtag
    Preliminary steps

    According to the parameters you have set in the configuration file, the plugin will create the table defined by the table option in the database defined by the database option hosted on the server defined by the host option. It will use the PostgreSQL user defined by the user option, which needs to have the right privileges to create such a table in that database.

NOTE: If you are not familiar with how PostgreSQL's users and grants system works, you might find it useful to read the recommended links in the "References" section at the bottom.

A typical installation normally consists of a self-contained database for Fluent Bit in which you can store the output of one or more pipelines. Ultimately, it is your choice whether to store them in the same table, in separate tables, or even in separate databases, based on several factors including workload, scalability, data protection and security.

    In this example, for the sake of simplicity, we use a single table called fluentbit in a database called fluentbit that is owned by the user fluentbit. Feel free to use different names. Preferably, for security reasons, do not use the postgres user (which has SUPERUSER privileges).

    hashtag
    Create the fluentbit user

    Generate a robust random password (e.g. pwgen 20 1) and store it safely. Then, as postgres system user on the server where PostgreSQL is installed, execute:

    At the prompt, please provide the password that you previously generated.

    As a result, the user fluentbit without superuser privileges will be created.

If you prefer, instead of the createuser application, you can directly use the CREATE USER SQL command.

    hashtag
    Create the fluentbit database

    As postgres system user, please run:

    This will create a database called fluentbit owned by the fluentbit user. As a result, the fluentbit user will be able to safely create the data table.

Alternatively, you can use the CREATE DATABASE SQL command.

    hashtag
    Connection

Make sure that the fluentbit user can connect to the fluentbit database on the specified target host. This might require you to properly configure the pg_hba.conf file.

    hashtag
    Configuration Parameters

    hashtag
    Libpq

Fluent Bit relies on libpq, the PostgreSQL native client API, written in C. For this reason, default values might be affected by environment variables and compilation settings. The table above lists, in brackets, the most common default values for each connection option.

For security reasons, it is advised to follow the directives included in the password file section.

    hashtag
    Configuration Example

    In your main configuration file add the following section:

    hashtag
    The output table

    The output plugin automatically creates a table with the name specified by the table configuration option and made up of the following fields:

    • tag TEXT

• time TIMESTAMP WITHOUT TIME ZONE

    • data JSONB

As you can see, the timestamp does not contain any information about the time zone; it is therefore interpreted in the time zone used by the connection to PostgreSQL (the timezone setting).

For more information on the JSONB data type in PostgreSQL, please refer to the JSON types page in the official documentation, where you can find instructions on how to index or query the objects (including jsonpath, introduced in PostgreSQL 12).
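As a quick, hypothetical example (it assumes your records contain a message key), the data column can be queried with the usual JSONB operators from psql:

$ psql -U fluentbit -d fluentbit \
    -c "SELECT time, data->>'message' AS message FROM fluentbit WHERE data ? 'message' ORDER BY time DESC LIMIT 5;"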

    hashtag
    Scalability

PostgreSQL 10 introduces support for declarative partitioning. In order to improve the vertical scalability of the database, you can decide to partition your tables on time ranges (for example on a monthly basis). PostgreSQL also supports subpartitions, allowing you to even partition your records by hash (version 11+), as well as default partitions (version 11+).

For more information on horizontal partitioning in PostgreSQL, please refer to the Table partitioning page in the official documentation.

    If you are starting now, our recommendation at the moment is to choose the latest major version of PostgreSQL.
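For illustration only, here is a rough sketch of a monthly range-partitioned variant of the output table on PostgreSQL 10+; the table and partition names are hypothetical, and Fluent Bit itself will not create or manage partitions for you:

$ psql -U fluentbit -d fluentbit <<'SQL'
-- Parent table, partitioned by the record timestamp
CREATE TABLE fluentbit_part (
    tag  TEXT,
    time TIMESTAMP WITHOUT TIME ZONE,
    data JSONB
) PARTITION BY RANGE (time);

-- One partition per month; new partitions must be created ahead of time
CREATE TABLE fluentbit_part_2021_01 PARTITION OF fluentbit_part
    FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');
SQL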

    hashtag
    There's more ...

PostgreSQL is a really powerful and extensible database engine. More expert users can indeed take advantage of BEFORE INSERT triggers on the main table and re-route records to normalised tables, depending on the tags and content of the actual JSON objects.

For example, you can use Fluent Bit to send HTTP log records to the landing table defined in the configuration file. This table contains a BEFORE INSERT trigger (a function in the plpgsql language) that normalises the content of the JSON object and inserts the record into another table (with its own structure and partitioning model). This kind of trigger allows you to discard the record from the landing table by returning NULL; a rough sketch follows.
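The following is only a sketch of that idea; the target table, column names and the assumption that the JSON object carries method and path keys are all hypothetical, and nothing here is created by the plugin itself:

$ psql -U fluentbit -d fluentbit <<'SQL'
-- Hypothetical normalised target table
CREATE TABLE http_logs (
    time   TIMESTAMP WITHOUT TIME ZONE,
    method TEXT,
    path   TEXT
);

-- Trigger function: copy selected JSON fields into the normalised table
-- and discard the landing record by returning NULL
CREATE FUNCTION route_http_log() RETURNS trigger AS $$
BEGIN
    INSERT INTO http_logs (time, method, path)
    VALUES (NEW.time, NEW.data->>'method', NEW.data->>'path');
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- PostgreSQL 11+; on older versions use EXECUTE PROCEDURE
CREATE TRIGGER route_http_log BEFORE INSERT ON fluentbit
    FOR EACH ROW EXECUTE FUNCTION route_http_log();
SQL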

    hashtag
    References

    Here follows a list of useful resources from the PostgreSQL documentation:

    Elasticsearch

    Send logs to Elasticsearch (including Amazon Elasticsearch Service)

The es output plugin allows you to ingest your records into an Elasticsearch database. The following instructions assume that you have a fully operational Elasticsearch service running in your environment.

    hashtag
    Configuration Parameters

    -

    Database

    Database name to connect to

    - (current user)

    Table

    Table name where to store data

    -

    Timestamp_Key

    Key in the JSON object containing the record timestamp

    date

    Async

    Define if we will use async or sync connections

    false

    min_pool_size

    Minimum number of connection in async mode

    1

    max_pool_size

    Maximum amount of connections in async mode

    4

    cockroachdb

    Set to true if you will connect the plugin with a CockroachDB

    false

    CREATE DATABASEarrow-up-right
  • The pg_hba.conf filearrow-up-right

  • JSON typesarrow-up-right

  • Date/Time functions and operatorsarrow-up-right

  • Table partitioningarrow-up-right

  • libpq - C API for PostgreSQLarrow-up-right

  • libpq - Environment variablesarrow-up-right

  • libpq - password filearrow-up-right

  • Trigger functionsarrow-up-right

  • Key

    Description

    Default

    Host

    Hostname/IP address of the PostgreSQL instance

    - (127.0.0.1)

    Port

    PostgreSQL port

    - (5432)

    User

    PostgreSQL username

    - (current user)

    Password

    CREATE USERarrow-up-right
    CREATE DATABASEarrow-up-right
    pg_hba.confarrow-up-right
    libpqarrow-up-right
    environment variablesarrow-up-right
    password filearrow-up-right
    JSON typesarrow-up-right
    Table partitioningarrow-up-right
    Database Rolesarrow-up-right
    GRANTarrow-up-right
    CREATE USERarrow-up-right

    Password of PostgreSQL username

    createuser -P fluentbit
    createdb -O fluentbit fluentbit
    [OUTPUT]
        Name          pgsql
        Match         *
        Host          172.17.0.2
        Port          5432
        User          fluentbit
        Password      YourCrazySecurePassword
        Database      fluentbit
        Table         fluentbit
        Timestamp_Key ts

    Host

    IP address or hostname of the target Elasticsearch instance

    127.0.0.1

    Port

    TCP port of the target Elasticsearch instance

    9200

    Path

    Elasticsearch accepts new data on HTTP query path "/_bulk". But it is also possible to serve Elasticsearch behind a reverse proxy on a subpath. This option defines such path on the fluent-bit side. It simply adds a path prefix in the indexing HTTP POST URI.

    Empty string

    Buffer_Size

Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where it is required to read full responses; note that the response size grows depending on the number of records inserted. To allow an unlimited amount of memory, set this value to False; otherwise the value must follow the unit size specification.

    4KB

    Pipeline

Newer versions of Elasticsearch allow setting up filters called pipelines. This option allows defining which pipeline the database should use. For performance reasons it is strongly suggested to do parsing and filtering on the Fluent Bit side and avoid pipelines.

    AWS_Auth

    Enable AWS Sigv4 Authentication for Amazon ElasticSearch Service

    Off

    AWS_Region

    Specify the AWS region for Amazon ElasticSearch Service

    AWS_STS_Endpoint

    Specify the custom sts endpoint to be used with STS API for Amazon ElasticSearch Service

    AWS_Role_ARN

    AWS IAM Role to assume to put records to your Amazon ES cluster

    AWS_External_ID

    External ID for the AWS IAM Role specified with aws_role_arn

    HTTP_User

    Optional username credential for Elastic X-Pack access

    HTTP_Passwd

    Password for user defined in HTTP_User

    Index

    Index name

    fluent-bit

    Type

    Type name

    _doc

    Logstash_Format

    Enable Logstash format compatibility. This option takes a boolean value: True/False, On/Off

    Off

    Logstash_Prefix

When Logstash_Format is enabled, the Index name is composed using a prefix and the date, e.g.: if Logstash_Prefix is equal to 'mydata', your index will become 'mydata-YYYY.MM.DD'. The last string appended corresponds to the date when the data is being generated.

    logstash

    Logstash_DateFormat

Time format (based on strftime) used to generate the second part of the Index name.

    %Y.%m.%d

    Time_Key

    When Logstash_Format is enabled, each record will get a new timestamp field. The Time_Key property defines the name of that field.

    @timestamp

    Time_Key_Format

    When Logstash_Format is enabled, this property defines the format of the timestamp.

    %Y-%m-%dT%H:%M:%S

    Time_Key_Nanos

    When Logstash_Format is enabled, enabling this property sends nanosecond precision timestamps.

    Off

    Include_Tag_Key

When enabled, it appends the Tag name to the record.

    Off

    Tag_Key

    When Include_Tag_Key is enabled, this property defines the key name for the tag.

    _flb-key

    Generate_ID

    When enabled, generate _id for outgoing records. This prevents duplicate records when retrying ES.

    Off

    Replace_Dots

When enabled, replaces field name dots with underscores, as required by Elasticsearch 2.0-2.3.

    Off

    Trace_Output

    When enabled print the elasticsearch API calls to stdout (for diag only)

    Off

    Trace_Error

    When enabled print the elasticsearch API calls to stdout when elasticsearch returns an error (for diag only)

    Off

    Current_Time_Index

    Use current time for index generation instead of message record

    Off

    Logstash_Prefix_Key

When included, the value of this key in the record will be looked up and will override the Logstash_Prefix for index generation. If the key/value is not found in the record, the Logstash_Prefix option will act as a fallback. Nested keys are not supported (if desired, you can use the nest filter plugin to remove nesting)

The parameters index and type can be confusing if you are new to Elastic; if you have used a common relational database before, they can be compared to the database and table concepts. Also see the FAQ below.

    hashtag
    TLS / SSL

The Elasticsearch output plugin supports TLS/SSL. For more details about the available properties and general configuration, please refer to the TLS/SSL section.

    hashtag
    Getting Started

In order to insert records into an Elasticsearch service, you can run the plugin from the command line or through the configuration file:

    hashtag
    Command Line

The es plugin can read the parameters from the command line in two ways: through the -p argument (property) or by setting them directly through the service URI. The URI format is the following:

    Using the format specified, you could start Fluent Bit through:

which is similar to doing:

    hashtag
    Configuration File

    In your main configuration file append the following Input & Output sections. You can visualize this configuration herearrow-up-right

    example configuration visualization from config.calyptia.com

    hashtag
    About Elasticsearch field names

Some input plugins may generate messages where the field names contain dots. Since Elasticsearch 2.0 this is no longer allowed, so the current es plugin replaces them with an underscore, e.g.:

    becomes

    hashtag
    FAQ

    hashtag
    Elasticsearch rejects requests saying "the final mapping would have more than 1 type"

    Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.

    If you see an error message like below, you'll need to fix your configuration to use a single type on each index.

    Rejecting mapping update to [search] as the final mapping would have more than 1 type

    For details, please read the official blog post on that issuearrow-up-right.

    hashtag
    Elasticsearch rejects requests saying "Document mapping type name can't start with '_'"

    Fluent Bit v1.5 changed the default mapping type from flb_type to _doc, which matches the recommendation from Elasticsearch from version 6.2 forwards (see commit with rationalearrow-up-right). This doesn't work in Elasticsearch versions 5.6 through 6.1 (see Elasticsearch discussion and fixarrow-up-right). Ensure you set an explicit map (such as doc or flb_type) in the configuration, as seen on the last line:

    hashtag
    Fluent Bit + Amazon Elasticsearch Service

    The Amazon ElasticSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon ElasticSearch Service with IAM Authentication.

    Fluent Bit supports sourcing AWS credentials from any of the standard sources (for example, an Amazon EKS IAM Role for a Service Accountarrow-up-right).

    Example configuration:

    Notice that the Port is set to 443, tls is enabled, and AWS_Region is set.

    Key

    Description

    Elasticsearcharrow-up-right

    default

    Forward

Forward is the protocol used by Fluentdarrow-up-right to route messages between peers. The forward output plugin provides interoperability between Fluent Bitarrow-up-right and Fluentdarrow-up-right. There are no configuration steps required besides specifying where Fluentdarrow-up-right is located, which can be on the local host or in a remote machine.

    This plugin offers two different transports and modes:

    • Forward (TCP): It uses a plain TCP connection.

• Secure Forward (TLS): when TLS is enabled, the plugin switches to Secure Forward mode.

    hashtag
    Configuration Parameters

The following parameters are mandatory for either Forward or Secure Forward mode:

    hashtag
    Secure Forward Mode Configuration Parameters

When using Secure Forward mode, TLS is required to be enabled. The following additional configuration parameters are available:

    hashtag
    Forward Setup

Before proceeding, make sure that Fluentd is installed on your system. If it is not, please refer to the Fluentd Installation document before continuing.

Once Fluentd is installed, create the following example configuration file that will allow us to stream data into it:

That configuration file specifies that Fluentd will listen for TCP connections on port 24224 through the forward input type. Then, for every message with a fluent_bit tag, it will print the message to the standard output.

In one terminal, launch Fluentd specifying the new configuration file created (in_fluent-bit.conf):

    hashtag
    Fluent Bit + Forward Setup

Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:

If the TAG parameter is not set, the plugin will set the tag as fluent_bit. Keep in mind that TAG is important for routing rules inside Fluentd.

Using the CPU input plugin as an example, we will flush CPU metrics to Fluentd:

Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:

So we gathered CPU metrics and flushed them out to Fluentd properly.
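The same pipeline can also be expressed through a configuration file instead of the command line; the following is an equivalent sketch using the host, port, and tag from the example above:

[INPUT]
    Name cpu
    Tag  fluent_bit

[OUTPUT]
    Name  forward
    Match *
    Host  127.0.0.1
    Port  24224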

    hashtag
    Fluent Bit + Secure Forward Setup

DISCLAIMER: the following example does not cover the generation of certificates required for proper usage in production environments.

Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS. Below is a minimalist configuration for testing purposes.

    hashtag
    Fluent Bit

    Paste this content in a file called flb.conf:

    hashtag
    Fluentd

    Paste this content in a file called fld.conf:

If you're using Fluentd v1, set it up as below:

    hashtag
    Test Communication

    Start Fluentd:

    Start Fluent Bit:

After five seconds, Fluent Bit will write the records to Fluentd. In the Fluentd output you will see messages like this:

    es://host:port/index/type
    $ fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
        -o stdout -m '*'
    $ fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 \
        -p Index=my_index -p Type=my_type -o stdout -m '*'
    [INPUT]
        Name  cpu
        Tag   cpu
    
    [OUTPUT]
        Name  es
        Match *
        Host  192.168.2.3
        Port  9200
        Index my_index
        Type  my_type
    {"cpu0.p_cpu"=>17.000000}
    {"cpu0_p_cpu"=>17.000000}
    [OUTPUT]
        Name  es
        Match foo.*
        Index search
        Type  type1
    
    [OUTPUT]
        Name  es
        Match bar.*
        Index search
        Type  type2
    [OUTPUT]
        Name  es
        Match *
        Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
        Port  443
        Index my_index
        AWS_Auth On
        AWS_Region us-west-2
        tls   On
        Type  doc
    [OUTPUT]
        Name  es
        Match *
        Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
        Port  443
        Index my_index
        Type  my_type
        AWS_Auth On
        AWS_Region us-west-2
        tls     On

The general configuration parameters for the forward output plugin (see Configuration Parameters above) are:

| Key | Description | Default |
| --- | --- | --- |
| Host | Target host where Fluent Bit or Fluentd are listening for Forward messages. | 127.0.0.1 |
| Port | TCP port of the target service. | 24224 |
| Time_as_Integer | Set timestamps in integer format; this enables compatibility mode for the Fluentd v0.12 series. | False |
| Upstream | If Forward will connect to an Upstream instead of a simple host, this property defines the absolute path for the Upstream configuration file. For more details, refer to the Upstream Servers documentation section. | |
| Tag | Overwrite the tag as we transmit. This allows the receiving pipeline to start fresh, or to attribute the source. | |
| Send_options | Always send options (with "size"=count of messages). | False |
| Require_ack_response | Send the "chunk" option and wait for an "ack" response from the server. Enables at-least-once delivery, and the receiving server can control the rate of traffic. (Requires a Fluentd v0.14.0+ server.) | False |

The additional parameters for Secure Forward mode (see Secure Forward Mode Configuration Parameters above) are:

| Key | Description | Default |
| --- | --- | --- |
| Shared_Key | A key string known by the remote Fluentd, used for authorization. | |
| Empty_Shared_Key | Use this option to connect to Fluentd with a zero-length secret. | False |
| Username | Specify the username to present to a Fluentd server that enables user_auth. | |
| Password | Specify the password corresponding to the username. | |
| Self_Hostname | Default value of the auto-generated certificate common name (CN). | localhost |
| tls | Enable or disable TLS support. | Off |
| tls.verify | Force certificate validation. | On |
| tls.debug | Set TLS debug verbosity level. It accepts the following values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational) and 4 (Verbose). | 1 |
| tls.ca_file | Absolute path to CA certificate file. | |
| tls.crt_file | Absolute path to certificate file. | |
| tls.key_file | Absolute path to private key file. | |
| tls.key_passwd | Optional password for the tls.key_file file. | |
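As a rough sketch of how the Upstream property might be used (the file name, node names, and ports are illustrative assumptions based on the Upstream Servers section, not values from this page), the [UPSTREAM] and [NODE] definitions live in a separate file, here assumed to be ./upstream.conf, which is then referenced from the output section:

[UPSTREAM]
    name forward-backends

[NODE]
    name fluentd-1
    host 127.0.0.1
    port 24224

[NODE]
    name fluentd-2
    host 127.0.0.1
    port 25225

[OUTPUT]
    Name     forward
    Match    *
    Upstream ./upstream.conf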

    <source>
      type forward
      bind 0.0.0.0
      port 24224
    </source>
    
    <match fluent_bit>
      type stdout
    </match>
    $ fluentd -c test.conf
    2017-03-23 11:50:43 -0600 [info]: reading config file path="test.conf"
    2017-03-23 11:50:43 -0600 [info]: starting fluentd-0.12.33
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.1'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-docker' version '0.1.0'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flatten-hash' version '0.2.0'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-influxdb' version '0.2.8'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-json-in-json' version '0.1.4'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-out-http' version '0.1.3'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-parser' version '0.6.0'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-record-reformer' version '0.7.0'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-stdin' version '0.1.1'
    2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-td' version '0.10.27'
    2017-03-23 11:50:43 -0600 [info]: adding match pattern="fluent_bit" type="stdout"
    2017-03-23 11:50:43 -0600 [info]: adding source type="forward"
    2017-03-23 11:50:43 -0600 [info]: using configuration file: <ROOT>
      <source>
        type forward
        bind 0.0.0.0
        port 24224
      </source>
      <match fluent_bit>
        type stdout
      </match>
    </ROOT>
    2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224
    bin/fluent-bit -i INPUT -o forward://HOST:PORT
    $ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224
    2017-03-23 11:53:06 -0600 fluent_bit: {"cpu_p":0.0,"user_p":0.0,"system_p":0.0,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":0.0,"cpu1.p_user":0.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
    2017-03-23 11:53:07 -0600 fluent_bit: {"cpu_p":2.25,"user_p":2.0,"system_p":0.25,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":3.0,"cpu3.p_user":2.0,"cpu3.p_system":1.0}
    2017-03-23 11:53:08 -0600 fluent_bit: {"cpu_p":1.75,"user_p":1.0,"system_p":0.75,"cpu0.p_cpu":2.0,"cpu0.p_user":1.0,"cpu0.p_system":1.0,"cpu1.p_cpu":3.0,"cpu1.p_user":1.0,"cpu1.p_system":2.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
    2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0}
    [SERVICE]
        Flush      5
        Daemon     off
        Log_Level  info
    
    [INPUT]
        Name       cpu
        Tag        cpu_usage
    
    [OUTPUT]
        Name          forward
        Match         *
        Host          127.0.0.1
        Port          24284
        Shared_Key    secret
        Self_Hostname flb.local
        tls           on
        tls.verify    off
    <source>
      @type         secure_forward
      self_hostname myserver.local
      shared_key    secret
      secure no
    </source>
    
    <match **>
     @type stdout
    </match>
    <source>
      @type forward
      <transport tls>
        cert_path /etc/td-agent/certs/fluentd.crt
        private_key_path /etc/td-agent/certs/fluentd.key
        private_key_passphrase password
      </transport>
      <security>
        self_hostname myserver.local
        shared_key secret
      </security>
    </source>
    
    <match **>
     @type stdout
    </match>
    $ fluentd -c fld.conf
    $ fluent-bit -c flb.conf
    2017-03-23 13:34:40 -0600 [info]: using configuration file: <ROOT>
      <source>
        @type secure_forward
        self_hostname myserver.local
        shared_key xxxxxx
        secure no
      </source>
      <match **>
        @type stdout
      </match>
    </ROOT>
    2017-03-23 13:34:41 -0600 cpu_usage: {"cpu_p":1.0,"user_p":0.75,"system_p":0.25,"cpu0.p_cpu":1.0,"cpu0.p_user":1.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":1.0,"cpu1.p_system":1.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
    2017-03-23 13:34:42 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.75,"system_p":0.0,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
    2017-03-23 13:34:43 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.25,"system_p":0.5,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0}
    2017-03-23 13:34:44 -0600 cpu_usage: {"cpu_p":5.0,"user_p":3.25,"system_p":1.75,"cpu0.p_cpu":4.0,"cpu0.p_user":2.0,"cpu0.p_system":2.0,"cpu1.p_cpu":8.0,"cpu1.p_user":5.0,"cpu1.p_system":3.0,"cpu2.p_cpu":4.0,"cpu2.p_user":3.0,"cpu2.p_system":1.0,"cpu3.p_cpu":4.0,"cpu3.p_user":2.0,"cpu3.p_system":2.0}