Buffering and Storage

Fluent Bit collects, parses, filters, and ships logs to a central place. A critical piece of this workflow is the ability to do buffering: a mechanism to place processed data into a temporary location until it's ready to be shipped.

By default, when Fluent Bit processes data it uses memory as the primary and temporary place to store the records. There are scenarios where it's ideal to have a persistent buffering mechanism based on the filesystem to provide aggregation and data safety capabilities.

Choosing the right configuration is critical, and the behavior of the service can change based on the backpressure settings. Before jumping into the configuration, it helps to understand the relationship between chunks, memory, filesystem, and backpressure.

Chunks, memory, filesystem, and backpressure

Understanding chunks, buffering, and backpressure is critical for a proper configuration.

Backpressure

See Backpressure for a full explanation.

Chunks

When an input plugin source emits records, the engine groups the records together in a chunk. A chunk's size is usually around 2 MB. By configuration, the engine decides where to place this chunk. By default, all chunks are created only in memory.

Irrecoverable chunks

There are two scenarios where Fluent Bit marks chunks as irrecoverable:

  • When Fluent Bit encounters a bad layout in a chunk. A bad layout is a chunk that doesn't conform to the expected format.

  • When Fluent Bit encounters an incorrect or invalid chunk header size.

In both scenarios Fluent Bit logs an error message and then discards the irrecoverable chunks.

Buffering and memory

As mentioned previously, chunks generated by the engine are placed in memory by default, but this is configurable.

If memory is the only mechanism set for the input plugin, it will store as much data as possible in memory. This is the fastest mechanism with the least system overhead. However, if the service isn't able to deliver the records fast enough, Fluent Bit memory usage increases as it accumulates more data than it can deliver.

In a high-load environment with backpressure, high memory usage risks the process getting killed by the kernel's OOM Killer. To work around this backpressure scenario, limit the amount of memory in records that an input plugin can register using the mem_buf_limit property. If a plugin has queued more than the mem_buf_limit, it won't be able to ingest more data until that data can be delivered or flushed properly. In this scenario the input plugin in question is paused. When the input is paused, records won't be ingested until the plugin resumes. For some inputs, such as TCP and tail, pausing the input will almost certainly lead to log loss. For the tail input, Fluent Bit can save its current offset in the current file it's reading, and pick back up when the input resumes.

Look for messages in the Fluent Bit log output like:

[input] tail.1 paused (mem buf overlimit)
[input] tail.1 resume (mem buf overlimit)

Using mem_buf_limit is good for certain scenarios and environments. It helps to control the memory usage of the service. However, if a file rotates while the plugin is paused, data can be lost, since the plugin won't be able to register new records. This can happen with any input source plugin. The goal of mem_buf_limit is memory control and survival of the service.

For a full data safety guarantee, use filesystem buffering.

Here is an example input definition:

[INPUT]
    Name          tcp
    Listen        0.0.0.0
    Port          5170
    Format        none
    Tag           tcp-logs
    Mem_Buf_Limit 50MB

If this input uses more than 50 MB of memory to buffer logs, you'll get a warning like this in the Fluent Bit logs:

[input] tcp.1 paused (mem buf overlimit)

mem_buf_limit applies only when storage.type is set to the default value of memory.

Filesystem buffering

Filesystem buffering helps with backpressure and overall memory control. Enable it using storage.type filesystem.
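
A minimal sketch of enabling filesystem buffering for an input. The storage path, the tail input, and the file pattern shown here are illustrative; adjust them for your environment:

[SERVICE]
    # illustrative storage location
    storage.path  /var/log/flb-storage/

[INPUT]
    # illustrative input; any input plugin can set storage.type
    name          tail
    path          /var/log/app/*.log
    storage.type  filesystem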

Memory and filesystem buffering mechanisms aren't mutually exclusive. Enabling filesystem buffering for your input plugin source can improve both performance and data safety.

Enabling filesystem buffering changes the behavior of the engine. Upon chunk creation, the engine stores the content in memory and also maps a copy on disk through mmap(2). The newly created chunk is active in memory, backed up on disk, and called to be up, which means the chunk content is up in memory.

Fluent Bit controls the number of chunks that are up in memory by using the filesystem buffering mechanism to deal with high memory usage and backpressure.

By default, the engine allows a total of 128 chunks up in memory, considering all chunks. This value is controlled by the service property storage.max_chunks_up. The active chunks that are up are ready for delivery and are still receiving records. Any other remaining chunk is in a down state, which means it's only in the filesystem and won't be up in memory unless it's ready to be delivered. Chunks are never much larger than 2 MB, so with the default storage.max_chunks_up value of 128, each input is limited to roughly 256 MB of memory.

If the input plugin has enabled storage.type as filesystem, when reaching the storage.max_chunks_up threshold, instead of the plugin being paused, all new data goes to chunks that are down in the filesystem. This lets you control the memory usage of the service and also guarantees that the service won't lose any data. By default, the enforcement of the storage.max_chunks_up limit is best-effort. Fluent Bit can only append new data to chunks that are up, so when the limit is reached, chunks are temporarily brought up in memory to ingest new data and then put back into a down state. In general, Fluent Bit works to keep the total number of up chunks at or below storage.max_chunks_up.

If storage.pause_on_chunks_overlimit is enabled (default is off), the input plugin pauses upon exceeding storage.max_chunks_up. With this option, storage.max_chunks_up becomes a hard limit for the input. When the input is paused, records won't be ingested until the plugin resumes. For some inputs, such as TCP and tail, pausing the input will almost certainly lead to log loss. For the tail input, Fluent Bit can save its current offset in the current file it's reading, and pick back up when the input is resumed.
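
The following sketch makes storage.max_chunks_up a hard limit for an input. The service-level value of 128 is the default, and the tail input and its path are illustrative:

[SERVICE]
    storage.path           /var/log/flb-storage/
    storage.max_chunks_up  128

[INPUT]
    # illustrative input and path
    name                               tail
    path                               /var/log/app/*.log
    storage.type                       filesystem
    storage.pause_on_chunks_overlimit  on

With this configuration, once the input reaches the limit it pauses instead of writing additional chunks down to the filesystem.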

Look for messages in the Fluent Bit log output like:

[input] tail.1 paused (storage buf overlimit)
[input] tail.1 resume (storage buf overlimit)

Limiting filesystem space for chunks

Fluent Bit implements the concept of logical queues. Based on its tag, a chunk can be routed to multiple destinations. Fluent Bit keeps an internal reference from where a chunk was created and where it needs to go.

It's common to find cases where multiple destinations with different response times exist for a chunk, or one of the destinations is generating backpressure.

To limit the amount of filesystem chunks logically queued, Fluent Bit v1.6 and later includes the storage.total_limit_size configuration property for output plugins. This property limits the total size in bytes of chunks that can exist in the filesystem for a given logical output destination. If one of the destinations reaches its configured storage.total_limit_size, the oldest chunk from its queue for that logical output destination is discarded to make room for new data.
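
As a sketch of how this limit applies per logical destination, consider two outputs matching the same records. The destination names, hosts, and match pattern are hypothetical, and each output keeps its own queue with its own limit:

[OUTPUT]
    # hypothetical fast destination
    name                      forward
    match                     app.*
    host                      aggregator.internal
    storage.total_limit_size  50M

[OUTPUT]
    # hypothetical slow destination generating backpressure
    name                      http
    match                     app.*
    host                      slow-backend.internal
    port                      8080
    storage.total_limit_size  10M

If the slow destination falls behind, only its queue grows toward 10M and starts discarding its oldest chunks; the forward destination keeps its own independent queue.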

Configuration

The storage layer configuration takes place in three sections:

  • Service

  • Input

  • Output

The Service section configures a global environment for the storage layer, the Input sections define which buffering mechanism to use, and the Output sections define limits for the logical filesystem queues.

Service section configuration

The Service section refers to the section defined in the main configuration file:

storage.path
Set an optional location in the file system to store streams and chunks of data. If this parameter isn't set, Input plugins can only use in-memory buffering.
Default: none

storage.sync
Configure the synchronization mode used to store the data in the file system. Using full increases the reliability of the filesystem buffer and ensures that data is synced to the filesystem even if Fluent Bit crashes. On Linux, full corresponds with the MAP_SYNC option for memory mapped files. Accepted values: normal, full.
Default: normal

storage.checksum
Enable the data integrity check when writing and reading data from the filesystem. The storage layer uses the CRC32 algorithm. Accepted values: Off, On.
Default: Off

storage.max_chunks_up
If the input plugin has enabled filesystem storage type, this property sets the maximum number of chunks that can be up in memory. Use this setting to control memory usage when you enable storage.type filesystem.
Default: 128

storage.backlog.mem_limit
If storage.path is set, Fluent Bit looks for data chunks that weren't delivered and are still in the storage layer. These are called backlog data. Backlog chunks are filesystem chunks left over from a previous Fluent Bit run: chunks that couldn't be sent before exit and that Fluent Bit will pick up when restarted. Fluent Bit checks the storage.backlog.mem_limit value against the current memory usage from all up chunks for the input. If the up chunks currently consume less memory than the limit, it brings the backlog chunks up into memory so they can be sent by outputs.
Default: 5M

storage.metrics
If the http_server option is enabled in the main [SERVICE] section, this option registers a new endpoint where internal metrics of the storage layer can be consumed. For more details refer to the Monitoring section.
Default: off

storage.delete_irrecoverable_chunks
When enabled, irrecoverable chunks will be deleted during runtime, and any other irrecoverable chunk located in the configured storage path directory will be deleted when Fluent Bit starts. Accepted values: Off, On.
Default: Off

A Service section will look like this:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.backlog.mem_limit 5M

This configuration sets an optional filesystem buffering mechanism that stores data under /var/log/flb-storage/. It uses normal synchronization mode, doesn't run a checksum, and uses up to a maximum of 5 MB of memory when processing backlog data.
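
To also expose the storage-layer metrics mentioned in the table above, the built-in HTTP server has to be enabled together with storage.metrics. A minimal sketch, assuming the commonly documented listener address and port:

[SERVICE]
    http_server     on
    http_listen     0.0.0.0
    http_port       2020
    storage.path    /var/log/flb-storage/
    storage.metrics on

The metrics themselves are served by the HTTP server endpoints described in the Monitoring section.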

Input section configuration

Optionally, any Input plugin can configure its storage preference. The following table describes the available options:

storage.type
Specifies the buffering mechanism to use. Accepted values: memory, filesystem.
Default: memory

storage.pause_on_chunks_overlimit
Specifies if the input plugin should pause (stop ingesting new data) when the storage.max_chunks_up value is reached.
Default: off

The following example configures a service with filesystem buffering capabilities and two input plugins: the first using filesystem buffering and the second memory only.

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M

[INPUT]
    name          cpu
    storage.type  filesystem

[INPUT]
    name          mem
    storage.type  memory

Output section configuration

If certain chunks are filesystem based (storage.type filesystem), it's possible to control the size of the logical queue for an output plugin. The following table describes the available options:

storage.total_limit_size
Limit the maximum disk space size in bytes for buffering chunks in the filesystem for the current output logical destination.
Default: none

The following example creates records with CPU usage samples in the filesystem, which are delivered to the Google Stackdriver service while limiting the logical queue (buffering) to 5M:

[SERVICE]
    flush                     1
    log_Level                 info
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M

[INPUT]
    name                      cpu
    storage.type              filesystem

[OUTPUT]
    name                      stackdriver
    match                     *
    storage.total_limit_size  5M

If Fluent Bit is offline because of a network issue, it will continue buffering CPU samples, keeping a maximum of 5 MB of the newest data.
