Data Pipeline

Input

The way to gather data from your sources

Fluent Bit provides Input Plugins to gather information from different sources. Some of them collect data from log files, while others gather metrics from the operating system. There are many plugins to suit different needs.

When an input plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every input plugin has its own documentation section specifying how it can be used and what properties are available.
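As a minimal sketch of two input instances with independent properties, the fragment below uses the classic configuration format. It assumes the `tail` and `cpu` input plugins; the log path and tag names are hypothetical values chosen for illustration.

```text
# Instance 1: follow log files (path is a hypothetical example).
[INPUT]
    Name  tail
    Path  /var/log/app/*.log
    Tag   app.logs

# Instance 2: collect CPU metrics every 5 seconds.
[INPUT]
    Name          cpu
    Tag           metrics.cpu
    Interval_Sec  5
```

Each `[INPUT]` section creates its own instance, so changing a property in one section does not affect the other.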

For more details, please refer to the Input Plugins section.

Parser

Convert Unstructured to Structured messages

Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desirable. Ideally, we want to give structure to the data gathered by the Input Plugins as soon as it is collected.

The Parser converts unstructured data into structured data. As an example, consider the following Apache (HTTP Server) log entry:
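For illustration, an entry in the Apache common log format could look like this (the address, timestamp, and request are made-up sample values):

```text
192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
```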

The above log line is a raw string without format. Ideally, we would like to give it a structure that can be processed easily later. If the proper configuration is used, the log entry could be converted to:
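The fragment below is a sketch of one possible parser definition, not the parser shipped with Fluent Bit. It assumes a regex-based parser placed in a parsers file; the parser name `apache_sketch` is hypothetical, and the regular expression is a simplified pattern written for the sample entry shown in the comments.

```text
# Sample input line (made-up values):
#   192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

[PARSER]
    Name         apache_sketch
    Format       regex
    Regex        ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>\S+) \S+" (?<code>[^ ]*) (?<size>[^ ]*)$
    Time_Key     time
    Time_Format  %d/%b/%Y:%H:%M:%S %z

# Record expected for the sample line; the "time" field is used as the
# record timestamp instead of being kept as a key:
#   {"host": "192.168.2.20", "user": "-", "method": "GET",
#    "path": "/cgi-bin/try/", "code": "200", "size": "3395"}
```

Each named capture group in the expression becomes a key in the structured record.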

Filter

Modify, Enrich or Drop your records

In production environments we want to have full control of the data we are collecting. Filtering is an important feature that allows us to alter the data before delivering it to some destination.

Filtering is implemented through plugins, so each available filter can be used to match, exclude, or enrich your logs with specific metadata.

We support many filters. A common use case for filtering is Kubernetes deployments, where every Pod log needs to be associated with the proper metadata.

Like the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called properties.
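As a sketch of enriching and dropping records, the fragment below defines two filter instances. It assumes the `kubernetes` and `grep` filter plugins and records tagged with a `kube.` prefix; the `healthcheck` pattern is a hypothetical example.

```text
# Enrich container logs with Pod metadata.
[FILTER]
    Name   kubernetes
    Match  kube.*

# Drop records whose "log" field matches "healthcheck".
[FILTER]
    Name     grep
    Match    kube.*
    Exclude  log healthcheck
```

Each `[FILTER]` section is a separate instance with its own `Match` rule, so the two filters could also be applied to different tags.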

For more details about the Filters available and their usage, please refer to the Filters section.

Buffer

Data processing with reliability

As described in the Buffering concept section, the buffer phase in the pipeline aims to provide a unified and persistent mechanism to store your data, using either the primary in-memory mode or the filesystem-based mode.

By the time data reaches the buffer phase it is in an immutable state, meaning that no further filters can be applied.

Note that buffered data is not raw text; it's in Fluent Bit's internal binary representation.

Fluent Bit offers a buffering mechanism in the file system that acts as a backup system to avoid data loss in case of system failures.
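A minimal sketch of enabling filesystem buffering for one input follows. It assumes the `storage.path` service property and the `storage.type` input property; the directory and tag are hypothetical values.

```text
[SERVICE]
    # Directory where buffered data is stored (hypothetical path).
    storage.path  /var/lib/fluent-bit/storage

[INPUT]
    Name          cpu
    Tag           my_cpu
    # Back this input's buffered data with the filesystem
    # instead of keeping it in memory only.
    storage.type  filesystem
```

Inputs that do not set `storage.type` keep using the in-memory mode.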

Router

Create flexible routing rules

Routing is a core feature that allows you to route your data through Filters and finally to one or multiple destinations. The router relies on the concept of Tags and Matching rules.

There are two important concepts in Routing:

Tag
Match

When the data is generated by the input plugins, it comes with a Tag (most of the time the Tag is configured manually). The Tag is a human-readable indicator that helps to identify the data source.

In order to define where the data should be routed, a Match rule must be specified in the output configuration.

Consider the following configuration example that aims to deliver CPU metrics to an Elasticsearch database and Memory metrics to the standard output interface:
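A minimal sketch of such a configuration is shown here. It assumes the `cpu` and `mem` input plugins and the `es` and `stdout` output plugins; the tag names `my_cpu` and `my_mem` are arbitrary.

```text
[INPUT]
    Name  cpu
    Tag   my_cpu

[INPUT]
    Name  mem
    Tag   my_mem

# CPU metrics go to Elasticsearch.
[OUTPUT]
    Name   es
    Match  my_cpu

# Memory metrics go to standard output.
[OUTPUT]
    Name   stdout
    Match  my_mem
```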

Note: the above is a simple example demonstrating how Routing is configured.

Routing works automatically by reading the Input Tags and the Output Match rules. If some data has a Tag that doesn't match any rule at routing time, the data is deleted.

Routing with Wildcard

Routing is flexible enough to support wildcards in the Match pattern. The example below defines a common destination for both sources of data:
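A sketch of a wildcard rule, assuming two inputs tagged `my_cpu` and `my_mem` (arbitrary names) and the `stdout` output plugin:

```text
[INPUT]
    Name  cpu
    Tag   my_cpu

[INPUT]
    Name  mem
    Tag   my_mem

# A single output receives the records from both inputs.
[OUTPUT]
    Name   stdout
    Match  my_*
```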

The Match rule is set to my_*, which means it will match any Tag that starts with my_.

Output

Destinations for your data: databases, cloud services and more!

The output interface allows us to define destinations for the data. Common destinations are remote services, the local file system, or a standard interface with other programs. Outputs are implemented as plugins and there are many available.

When an output plugin is loaded, an internal instance is created. Every instance has its own independent configuration. Configuration keys are often called properties.

Every output plugin has its own documentation section specifying how it can be used and what properties are available.
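As a sketch of two output instances with their own properties, the fragment below assumes the `es` and `file` output plugins. The host, port, index, directory, and tag values are hypothetical.

```text
# Remote service: Elasticsearch.
[OUTPUT]
    Name   es
    Match  my_cpu
    Host   192.168.2.3
    Port   9200
    Index  my_index

# Local file system: write matching records to a file.
[OUTPUT]
    Name   file
    Match  my_mem
    Path   /tmp/fluent-bit
    File   mem.log
```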

For more details, please refer to the Output Plugins section.