1 of 26

Outputs

Amazon CloudWatch

The Amazon CloudWatch output plugin allows to ingest your records into the service. Support for CloudWatch Metrics is also provided via .

This is the documentation for the core Fluent Bit CloudWatch plugin written in C. It can replace the Golang Fluent Bit plugin released last year. The Golang plugin was named cloudwatch; this new high performance CloudWatch plugin is called cloudwatch_logs to prevent conflicts/confusion. Check the amazon repo for the Golang plugin for details on the deprecation/migration plan for the original plugin.

Configuration Parameters

Getting Started

In order to send records into Amazon Cloudwatch, you can run the plugin from the command line or through the configuration file:

Command Line

The cloudwatch plugin, can read the parameters from the command line through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Output section:

AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and these plugins.

GitHub

Docker Hub

Amazon ECR

You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:

Azure

Azure output plugin allows to ingest your records into Azure Log Analytics service.

To get more details about how to setup Azure Log Analytics, please refer to the following documentation: Azure Log Analytics

Configuration Parameters

Getting Started

In order to insert records into an Azure Log Analytics instance, you can run the plugin from the command line or through the configuration file:

Command Line

The azure plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

$ fluent-bit -i cpu -o azure -p customer_id=abc -p shared_key=def -m '*' -f 1

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name  cpu

[OUTPUT]
    Name        azure
    Match       *
    Customer_ID abc
    Shared_Key  def

BigQuery

BigQuery output plugin is an experimental plugin that allows you to stream records into Google Cloud BigQuery service. The implementation does not support the following, which would be expected in a full production version:

Application Default Credentials.
Data deduplication using insertId.
Template tables using templateSuffix.

Google Cloud Configuration

Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Therefore, before using the BigQuery output plugin, you must create a service account, create a BigQuery dataset and table, authorize the service account to write to the table, and provide the service account credentials to Fluent Bit.

Creating a Service Account

To stream data into BigQuery, the first step is to create a Google Cloud service account for Fluent Bit:

Creating a Google Cloud Service Account

Creating a BigQuery Dataset and Table

Fluent Bit does not create datasets or tables for your data, so you must create these ahead of time. You must also grant the service account WRITER permission on the dataset:

Creating and using datasets

Within the dataset you will need to create a table for the data to reside in. You can follow the following instructions for creating your table. Pay close attention to the schema. It must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs (e.g, mem or cpu).

Creating and using tables

Retrieving Service Account Credentials

Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication credentials. Download the credentials file by following these instructions:

Creating and Managing Service Account Keys

Configurations Parameters

Configuration File

If you are using a Google Cloud Credentials File, the following configuration is enough to get you started:

[INPUT]
    Name  dummy
    Tag   dummy

[OUTPUT]
    Name       bigquery
    Match      *
    dataset_id my_dataset
    table_id   dummy_table

Counter

Counter is a very simple plugin that counts how many records it's getting upon flush time. Plugin output is as follows:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count up a data with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

Datadog

The Datadog output plugin allows to ingest your logs into Datadog.

Before you begin, you need a Datadog account, a Datadog API key, and you need to activate Datadog Logs Management.

Configuration Parameters

Configuration File

Get started quickly with this configuration file:

[OUTPUT]
    Name        datadog
    Match       *
    Host        http-intake.logs.datadoghq.com
    TLS         on
    compress    gzip
    apikey      <my-datadog-api-key>
    dd_service  <my-app-service>
    dd_source   <my-app-source>
    dd_tags     team:logs,foo:bar

Troubleshooting

403 Forbidden

If you get a 403 Forbidden error response, double check that you have a valid Datadog API key and that you have activated Datadog Logs Management.

Elasticsearch

The es output plugin, allows to ingest your records into a Elasticsearch database. The following instructions assumes that you have a fully operational Elasticsearch service running in your environment.

Configuration Parameters

The parameters index and type can be confusing if you are new to Elastic, if you have used a common relational database before, they can be compared to the database and table concepts. Also see the FAQ below

TLS / SSL

Elasticsearch output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into a Elasticsearch service, you can run the plugin from the command line or through the configuration file:

Command Line

The es plugin, can read the parameters from the command line in two ways, through the -p argument (property) or setting them directly through the service URI. The URI format is the following:

es://host:port/index/type

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
    -o stdout -m '*'

which is similar to do:

$ fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 \
    -p Index=my_index -p Type=my_type -o stdout -m '*'

Configuration File

In your main configuration file append the following Input & Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  es
    Match *
    Host  192.168.2.3
    Port  9200
    Index my_index
    Type  my_type

About Elasticsearch field names

Some input plugins may generate messages where the field names contains dots, since Elasticsearch 2.0 this is not longer allowed, so the current es plugin replaces them with an underscore, e.g:

{"cpu0.p_cpu"=>17.000000}

becomes

{"cpu0_p_cpu"=>17.000000}

FAQ

Elasticsearch rejects requests saying "the final mapping would have more than 1 type"

Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.

[OUTPUT]
    Name  es
    Match foo.*
    Index search
    Type  type1

[OUTPUT]
    Name  es
    Match bar.*
    Index search
    Type  type2

If you see an error message like below, you'll need to fix your configuration to use a single type on each index.

Rejecting mapping update to [search] as the final mapping would have more than 1 type

For details, please read the official blog post on that issue.

Elasticsearch rejects requests saying "Document mapping type name can't start with '_'"

Fluent Bit v1.5 changed the default mapping type from flb_type to _doc, which matches the recommendation from Elasticsearch from version 6.2 forwards (see commit with rationale). This doesn't work in Elasticsearch versions 5.6 through 6.1 (see Elasticsearch discussion and fix). Ensure you set an explicit map (such as doc or flb_type) in the configuration, as seen on the last line:

[OUTPUT]
    Name  es
    Match *
    Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
    Port  443
    Index my_index
    AWS_Auth On
    AWS_Region us-west-2
    tls   On
    Type  doc

Fluent Bit + Amazon Elasticsearch Service

The Amazon ElasticSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon ElasticSearch Service with IAM Authentication.

Fluent Bit supports sourcing AWS credentials from any of the standard sources (for example, an Amazon EKS IAM Role for a Service Account).

Example configuration:

[OUTPUT]
    Name  es
    Match *
    Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
    Port  443
    Index my_index
    Type  my_type
    AWS_Auth On
    AWS_Region us-west-2
    tls     On

Notice that the Port is set to 443, tls is enabled, and AWS_Region is set.

File

The file output plugin allows to write the data received through the input plugin to file.

Configuration Parameters

The plugin supports the following configuration parameters:

Format

out_file format

Output time, tag and json records. There is no configuration parameters for out_file.

plain format

Output the records as JSON (without additional tag and timestamp attributes). There is no configuration parameters for plain format.

csv format

Output the records as csv. Csv supports an additional configuration parameter.

ltsv format

Output the records as LTSV. LTSV supports an additional configuration parameter.

template format

Output the records using a custom format template.

This accepts a formatting template and fills placeholders using corresponding values in a record.

For example, if you set up the configuration as below:

You will get the following output:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count up a data with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

FlowCounter

FlowCounter is the protocol to count records. The flowcounter output plugin allows to count up records and its size.

Configuration Parameters

The plugin supports the following configuration parameters:

Getting Started

You can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit count up a data with the following options:

Configuration File

In your main configuration file append the following Input & Output sections:

Testing

Once Fluent Bit is running, you will see the reports in the output interface similar to this:

Forward

Forward is the protocol used by Fluentd to route messages between peers. The forward output plugin allows to provide interoperability between Fluent Bit and Fluentd. There are not configuration steps required besides to specify where Fluentd is located, it can be in the local host or a in a remote machine.

This plugin offers two different transports and modes:

Forward (TCP): It uses a plain TCP connection.
Secure Forward (TLS): when TLS is enabled, the plugin switch to Secure Forward mode.

Configuration Parameters

The following parameters are mandatory for either Forward for Secure Forward modes:

Secure Forward Mode Configuration Parameters

When using Secure Forward mode, the TLS mode requires to be enabled. The following additional configuration parameters are available:

Forward Setup

Before proceeding, make sure that Fluentd is installed in your system, if it's not the case please refer to the following Fluentd Installation document and go ahead with that.

Once Fluentd is installed, create the following configuration file example that will allow us to stream data into it:

<source>
  type forward
  bind 0.0.0.0
  port 24224
</source>

<match fluent_bit>
  type stdout
</match>

That configuration file specifies that it will listen for TCP connections on the port 24224 through the forward input type. Then for every message with a fluent_bit TAG, will print the message to the standard output.

In one terminal launch Fluentd specifying the new configuration file created (in_fluent-bit.conf):

$ fluentd -c test.conf
2017-03-23 11:50:43 -0600 [info]: reading config file path="test.conf"
2017-03-23 11:50:43 -0600 [info]: starting fluentd-0.12.33
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-docker' version '0.1.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flatten-hash' version '0.2.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-influxdb' version '0.2.8'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-json-in-json' version '0.1.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-out-http' version '0.1.3'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-parser' version '0.6.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-record-reformer' version '0.7.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-stdin' version '0.1.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-td' version '0.10.27'
2017-03-23 11:50:43 -0600 [info]: adding match pattern="fluent_bit" type="stdout"
2017-03-23 11:50:43 -0600 [info]: adding source type="forward"
2017-03-23 11:50:43 -0600 [info]: using configuration file: <ROOT>
  <source>
    type forward
    bind 0.0.0.0
    port 24224
  </source>
  <match fluent_bit>
    type stdout
  </match>
</ROOT>
2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224

Fluent Bit + Forward Setup

Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:

bin/fluent-bit -i INPUT -o forward://HOST:PORT

If the TAG parameter is not set, the plugin will set the tag as fluent_bit. Keep in mind that TAG is important for routing rules inside Fluentd.

Using the CPU input plugin as an example we will flush CPU metrics to Fluentd:

$ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224

Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:

2017-03-23 11:53:06 -0600 fluent_bit: {"cpu_p":0.0,"user_p":0.0,"system_p":0.0,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":0.0,"cpu1.p_user":0.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 11:53:07 -0600 fluent_bit: {"cpu_p":2.25,"user_p":2.0,"system_p":0.25,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":3.0,"cpu3.p_user":2.0,"cpu3.p_system":1.0}
2017-03-23 11:53:08 -0600 fluent_bit: {"cpu_p":1.75,"user_p":1.0,"system_p":0.75,"cpu0.p_cpu":2.0,"cpu0.p_user":1.0,"cpu0.p_system":1.0,"cpu1.p_cpu":3.0,"cpu1.p_user":1.0,"cpu1.p_system":2.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0}

So we gathered CPU metrics and flushed them out to Fluentd properly.

Fluent Bit + Secure Forward Setup

DISCLAIMER: the following example do not consider the generation of certificates for a proper usage of production environments.

Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS. Above there is a minimalist configuration for testing purposes.

Fluent Bit

Paste this content in a file called flb.conf:

[SERVICE]
    Flush      5
    Daemon     off
    Log_Level  info

[INPUT]
    Name       cpu
    Tag        cpu_usage

[OUTPUT]
    Name          forward
    Match         *
    Host          127.0.0.1
    Port          24284
    Shared_Key    secret
    Self_Hostname flb.local
    tls           on
    tls.verify    off

Fluentd

Paste this content in a file called fld.conf:

<source>
  @type         secure_forward
  self_hostname myserver.local
  shared_key    secret
  secure no
</source>

<match **>
 @type stdout
</match>

If you're using Fluentd v1, set up it as below:

<source>
  @type forward
  <transport tls>
    cert_path /etc/td-agent/certs/fluentd.crt
    private_key_path /etc/td-agent/certs/fluentd.key
    private_key_passphrase password
  </transport>
  <security>
    self_hostname myserver.local
    shared_key secret
  </security>
</source>

<match **>
 @type stdout
</match>

Test Communication

Start Fluentd:

$ fluentd -c fld.conf

Start Fluent Bit:

$ fluent-bit -c flb.conf

After five seconds, Fluent Bit will write the records to Fluentd. In Fluentd output you will see a message like this:

2017-03-23 13:34:40 -0600 [info]: using configuration file: <ROOT>
  <source>
    @type secure_forward
    self_hostname myserver.local
    shared_key xxxxxx
    secure no
  </source>
  <match **>
    @type stdout
  </match>
</ROOT>
2017-03-23 13:34:41 -0600 cpu_usage: {"cpu_p":1.0,"user_p":0.75,"system_p":0.25,"cpu0.p_cpu":1.0,"cpu0.p_user":1.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":1.0,"cpu1.p_system":1.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 13:34:42 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.75,"system_p":0.0,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 13:34:43 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.25,"system_p":0.5,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0}
2017-03-23 13:34:44 -0600 cpu_usage: {"cpu_p":5.0,"user_p":3.25,"system_p":1.75,"cpu0.p_cpu":4.0,"cpu0.p_user":2.0,"cpu0.p_system":2.0,"cpu1.p_cpu":8.0,"cpu1.p_user":5.0,"cpu1.p_system":3.0,"cpu2.p_cpu":4.0,"cpu2.p_user":3.0,"cpu2.p_system":1.0,"cpu3.p_cpu":4.0,"cpu3.p_user":2.0,"cpu3.p_system":2.0}

GELF

GELF is Extended Log Format. The GELF output plugin allows to send logs in GELF format directly to a Graylog input using TLS, TCP or UDP protocols.

The following instructions assumes that you have a fully operational Graylog server running in your environment.

Configuration Parameters

According to , there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are determined with Gelf\*_Key_ key in this plugin.

TLS / SSL

GELF output plugin supports TLS/SSL, for more details about the properties available and general configuration, please refer to the section.

Notes

If you're using Fluent Bit to collect Docker logs, note that Docker places your log in JSON under key log. So you can set log as your Gelf_Short_Message_Key to send everything in Docker logs to Graylog. In this case, you need your log value to be a string; so don't parse it using JSON parser.
The order of looking up the timestamp in this plugin is as follows:
1. Value of Gelf_Timestamp_Key provided in configuration
2. Value of timestamp key
4. Timestamp does not set by Fluent Bit. In this case, your Graylog server will set it to the current timestamp (now).
The version of GELF message is also mandatory and Fluent Bit sets it to 1.1 which is the current latest version of GELF.
If you use udp as transport protocol and set Compress to true, Fluent Bit compresses your packets in GZIP format, which is the default compression that Graylog offers. This can be used to trade more CPU load for saving network bandwidth.

Configuration File Example

If you're using Fluent Bit for shipping Kubernetes logs, you can use something like this as your configuration file:

By default, GELF tcp uses port 12201 and Docker places your logs in /var/log/containers directory. The logs are placed in value of the log key. For example, this is a log saved by Docker:

Now, this is what happens to this log:

Fluent Bit GELF plugin adds "version": "1.1" to it.
We used this data key as Gelf_Short_Message_Key; so GELF plugin changes it to short_message.
Timestamp is generated.

Finally, this is what our Graylog server input sees:

HTTP

The http output plugin allows to flush your records into a HTTP endpoint. For now the functionality is pretty basic and it issues a POST request with the data records in MessagePack (or JSON) format.

Configuration Parameters

TLS / SSL

HTTP output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the TLS/SSL section.

Getting Started

In order to insert records into a HTTP server, you can run the plugin from the command line or through the configuration file:

Command Line

The http plugin, can read the parameters from the command line in two ways, through the -p argument (property) or setting them directly through the service URI. The URI format is the following:

http://host:port/something

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o http://192.168.2.3:80/something -m '*'

Configuration File

In your main configuration file, append the following Input & Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  http
    Match *
    Host  192.168.2.3
    Port  80
    URI   /something

By default, the URI becomes tag of the message, the original tag is ignored. To retain the tag, multiple configuration sections have to be made based and flush to different URIs.

Another approach we also support is the sending the original message tag in a configurable header. It's up to the receiver to do what it wants with that header field: parse it and use it as the tag for example.

To configure this behaviour, add this config:

[OUTPUT]
    Name  http
    Match *
    Host  192.168.2.3
    Port  80
    URI   /something
    Format json
    header_tag  FLUENT-TAG

Provided you are using Fluentd as data receiver, you can combine in_http and out_rewrite_tag_filter to make use of this HTTP header.

<source>
  @type http
  add_http_headers true
</source>

<match something>
  @type rewrite_tag_filter
  <rule>
    key HTTP_FLUENT_TAG
    pattern /^(.*)$/
    tag $1
  </rule>
</match>

Notice how we override the tag, which is from URI path, with our custom header

Example : Add a header

[OUTPUT]
    Name           http
    Match          *
    Host           127.0.0.1
    Port           9000
    Header         X-Key-A Value_A
    Header         X-Key-B Value_B
    URI            /something

Example : Sumo Logic HTTP Collector

Suggested configuration for Sumo Logic using json_lines with iso8601 timestamps. The PrivateKey is specific to a configured HTTP collector.

[OUTPUT]
    Name             http
    Match            *
    Host             collectors.au.sumologic.com
    Port             443
    URI              /receiver/v1/http/[PrivateKey]
    Format           json_lines
    Json_date_key    timestamp
    Json_date_format iso8601

A sample Sumo Logic query for the CPU input. (Requires json_lines format with iso8601 date format for the timestamp field).

_sourcecategory="my_fluent_bit"
| json "cpu_p" as cpu
| timeslice 1m
| max(cpu) as cpu group by _timeslice

Kafka

Kafka output plugin allows to ingest your records into an service. This plugin use the official (built-in dependency)

Configuration Parameters

Setting rdkafka.log.connection.close to false and rdkafka.request.required.acks to 1 are examples of recommended settings of librdfkafka properties.

Getting Started

In order to insert records into Apache Kafka, you can run the plugin from the command line or through the configuration file:

Command Line

The kafka plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

Kafka REST Proxy

The kafka-rest output plugin, allows to flush your records into a server. The following instructions assumes that you have a fully operational Kafka REST Proxy and Kafka services running in your environment.

Configuration Parameters

TLS / SSL

Kafka REST Proxy output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the section.

Getting Started

In order to insert records into a Kafka REST Proxy service, you can run the plugin from the command line or through the configuration file:

Command Line

The kafka-rest plugin, can read the parameters from the command line in two ways, through the -p argument (property), e.g:

Configuration File

In your main configuration file append the following Input & Output sections:

Forward

This plugin offers two different transports and modes:

Forward (TCP): It uses a plain TCP connection.
Secure Forward (TLS): when TLS is enabled, the plugin switch to Secure Forward mode.

Configuration Parameters

The following parameters are mandatory for either Forward for Secure Forward modes:

Secure Forward Mode Configuration Parameters

When using Secure Forward mode, the TLS mode requires to be enabled. The following additional configuration parameters are available:

Forward Setup

Before proceeding, make sure that Fluentd is installed in your system, if it's not the case please refer to the following Fluentd Installation document and go ahead with that.

Once Fluentd is installed, create the following configuration file example that will allow us to stream data into it:

<source>
  type forward
  bind 0.0.0.0
  port 24224
</source>

<match fluent_bit>
  type stdout
</match>

In one terminal launch Fluentd specifying the new configuration file created (in_fluent-bit.conf):

$ fluentd -c test.conf
2017-03-23 11:50:43 -0600 [info]: reading config file path="test.conf"
2017-03-23 11:50:43 -0600 [info]: starting fluentd-0.12.33
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-docker' version '0.1.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flatten-hash' version '0.2.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-influxdb' version '0.2.8'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-json-in-json' version '0.1.4'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-out-http' version '0.1.3'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-parser' version '0.6.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-record-reformer' version '0.7.0'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-stdin' version '0.1.1'
2017-03-23 11:50:43 -0600 [info]: gem 'fluent-plugin-td' version '0.10.27'
2017-03-23 11:50:43 -0600 [info]: adding match pattern="fluent_bit" type="stdout"
2017-03-23 11:50:43 -0600 [info]: adding source type="forward"
2017-03-23 11:50:43 -0600 [info]: using configuration file: <ROOT>
  <source>
    type forward
    bind 0.0.0.0
    port 24224
  </source>
  <match fluent_bit>
    type stdout
  </match>
</ROOT>
2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224

Fluent Bit + Forward Setup

Now that Fluentd is ready to receive messages, we need to specify where the forward output plugin will flush the information using the following format:

bin/fluent-bit -i INPUT -o forward://HOST:PORT

If the TAG parameter is not set, the plugin will set the tag as fluent_bit. Keep in mind that TAG is important for routing rules inside Fluentd.

Using the CPU input plugin as an example we will flush CPU metrics to Fluentd:

$ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224

Now on the Fluentd side, you will see the CPU metrics gathered in the last seconds:

2017-03-23 11:53:06 -0600 fluent_bit: {"cpu_p":0.0,"user_p":0.0,"system_p":0.0,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":0.0,"cpu1.p_user":0.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 11:53:07 -0600 fluent_bit: {"cpu_p":2.25,"user_p":2.0,"system_p":0.25,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":3.0,"cpu3.p_user":2.0,"cpu3.p_system":1.0}
2017-03-23 11:53:08 -0600 fluent_bit: {"cpu_p":1.75,"user_p":1.0,"system_p":0.75,"cpu0.p_cpu":2.0,"cpu0.p_user":1.0,"cpu0.p_system":1.0,"cpu1.p_cpu":3.0,"cpu1.p_user":1.0,"cpu1.p_system":2.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0}

So we gathered CPU metrics and flushed them out to Fluentd properly.

Fluent Bit + Secure Forward Setup

DISCLAIMER: the following example do not consider the generation of certificates for a proper usage of production environments.

Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using TLS. Above there is a minimalist configuration for testing purposes.

Fluent Bit

Paste this content in a file called flb.conf:

[SERVICE]
    Flush      5
    Daemon     off
    Log_Level  info

[INPUT]
    Name       cpu
    Tag        cpu_usage

[OUTPUT]
    Name          forward
    Match         *
    Host          127.0.0.1
    Port          24284
    Shared_Key    secret
    Self_Hostname flb.local
    tls           on
    tls.verify    off

Fluentd

Paste this content in a file called fld.conf:

<source>
  @type         secure_forward
  self_hostname myserver.local
  shared_key    secret
  secure no
</source>

<match **>
 @type stdout
</match>

If you're using Fluentd v1, set up it as below:

<source>
  @type forward
  <transport tls>
    cert_path /etc/td-agent/certs/fluentd.crt
    private_key_path /etc/td-agent/certs/fluentd.key
    private_key_passphrase password
  </transport>
  <security>
    self_hostname myserver.local
    shared_key secret
  </security>
</source>

<match **>
 @type stdout
</match>

Test Communication

Start Fluentd:

$ fluentd -c fld.conf

Start Fluent Bit:

$ fluent-bit -c flb.conf

After five seconds, Fluent Bit will write the records to Fluentd. In Fluentd output you will see a message like this:

2017-03-23 13:34:40 -0600 [info]: using configuration file: <ROOT>
  <source>
    @type secure_forward
    self_hostname myserver.local
    shared_key xxxxxx
    secure no
  </source>
  <match **>
    @type stdout
  </match>
</ROOT>
2017-03-23 13:34:41 -0600 cpu_usage: {"cpu_p":1.0,"user_p":0.75,"system_p":0.25,"cpu0.p_cpu":1.0,"cpu0.p_user":1.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":1.0,"cpu1.p_system":1.0,"cpu2.p_cpu":1.0,"cpu2.p_user":1.0,"cpu2.p_system":0.0,"cpu3.p_cpu":2.0,"cpu3.p_user":1.0,"cpu3.p_system":1.0}
2017-03-23 13:34:42 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.75,"system_p":0.0,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0}
2017-03-23 13:34:43 -0600 cpu_usage: {"cpu_p":1.75,"user_p":1.25,"system_p":0.5,"cpu0.p_cpu":3.0,"cpu0.p_user":3.0,"cpu0.p_system":0.0,"cpu1.p_cpu":2.0,"cpu1.p_user":2.0,"cpu1.p_system":0.0,"cpu2.p_cpu":0.0,"cpu2.p_user":0.0,"cpu2.p_system":0.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0}
2017-03-23 13:34:44 -0600 cpu_usage: {"cpu_p":5.0,"user_p":3.25,"system_p":1.75,"cpu0.p_cpu":4.0,"cpu0.p_user":2.0,"cpu0.p_system":2.0,"cpu1.p_cpu":8.0,"cpu1.p_user":5.0,"cpu1.p_system":3.0,"cpu2.p_cpu":4.0,"cpu2.p_user":3.0,"cpu2.p_system":1.0,"cpu3.p_cpu":4.0,"cpu3.p_user":2.0,"cpu3.p_system":2.0}

PostgreSQL

PostgreSQL is a very popular and versatile open source database management system that supports the SQL language and that is capable of storing both structured and unstructured data, such as JSON objects.

Given that Fluent Bit is designed to work with JSON objects, the pgsql output plugin allows users to send their data to a PostgreSQL database and store it using the JSONB type.

PostgreSQL 9.4 or higher is required.

Preliminary steps

According to the parameters you have set in the configuration file, the plugin will create the table defined by the table option in the database defined by the database option hosted on the server defined by the host option. It will use the PostgreSQL user defined by the user option, which needs to have the right privileges to create such a table in that database.

NOTE: If you are not familiar with how PostgreSQL's users and grants system works, you might find useful reading the recommended links in the "References" section at the bottom.

A typical installation normally consists of a self-contained database for Fluent Bit in which you can store the output of one or more pipelines. Ultimately, it is your choice to to store them in the same table, or in separate tables, or even in separate databases based on several factors, including workload, scalability, data protection and security.

In this example, for the sake of simplicity, we use a single table called fluentbit in a database called fluentbit that is owned by the user fluentbit. Feel free to use different names. Preferably, for security reasons, do not use the postgres user (which has SUPERUSER privileges).

Create the `fluentbit` user

Generate a robust random password (e.g. pwgen 20 1) and store it safely. Then, as postgres system user on the server where PostgreSQL is installed, execute:

createuser -P fluentbit

At the prompt, please provide the password that you previously generated.

As a result, the user fluentbit without superuser privileges will be created.

If you prefer, instead of the createuser application, you can directly use the SQL command CREATE USER.

Create the `fluentbit` database

As postgres system user, please run:

createdb -O fluentbit fluentbit

This will create a database called fluentbit owned by the fluentbit user. As a result, the fluentbit user will be able to safely create the data table.

Alternatively, you can use the SQL command CREATE DATABASE.

Connection

Make sure that the fluentbit user can connect to the fluentbit database on the specified target host. This might require you to properly configure the pg_hba.conf file.

Configuration Parameters

Libpq

Fluent Bit relies on libpq, the PostgreSQL native client API, written in C language. For this reason, default values might be affected by environment variables and compilation settings. The above table, in brackets, list the most common default values for each connection option.

For security reasons, it is advised to follow the directives included in the password file section.

Configuration Example

In your main configuration file add the following section:

[OUTPUT]
    Name          pgsql
    Match         *
    Host          172.17.0.2
    Port          5432
    User          fluentbit
    Password      YourCrazySecurePassword
    Database      fluentbit
    Table         fluentbit
    Timestamp_Key ts

The output table

The output plugin automatically creates a table with the name specified by the table configuration option and made up of the following fields:

tag TEXT
time TIMESTAMP WITHOUT TIMEZONE
data JSONB

As you can see, the timestamp does not contain any information about the time zone and it is therefore referred to the time zone used by the connection to PostgreSQL (timezone setting).

For more information on the JSONB data type in PostgreSQL, please refer to the JSON types page in the official documentation, where you can find instructions on how to index or query the objects (including jsonpath introduced in PostgreSQL 12).

Scalability

PostgreSQL 10 introduces support for declarative partitioning. In order to improve vertical scalability of the database, you can decide to partition your tables on time ranges (for example on a monthly basis). PostgreSQL supports also subpartitions, allowing you to even partition by hash your records (version 11+), and default partitions (version 11+).

For more information on horizontal partitioning in PostgreSQL, please refer to the Table partitioning page in the official documentation.

If you are starting now, our recommendation at the moment is to choose the latest major version of PostgreSQL.

There's more ...

PostgreSQL is a really powerful and extensible database engine. More expert users can indeed take advantage of BEFORE INSERT triggers on the main table and re-route records on normalised tables, depending on tags and content of the actual JSON objects.

For example, you can use Fluent Bit to send HTTP log records to the landing table defined in the configuration file. This table contains a BEFORE INSERT trigger (a function in plpgsql language) that normalises the content of the JSON object and that inserts the record in another table (with its own structure and partitioning model). This kind of triggers allow you to discard the record from the landing table by returning NULL.

References

Here follows a list of useful resources from the PostgreSQL documentation: