Blob

The Blob input plugin monitors a directory and processes binary (blob) files. It scans the specified path at regular intervals, reads binary files, and forwards them as records through the Fluent Bit pipeline. Use it to collect binary log files, artifacts, or any other binary data that needs to be forwarded to outputs.

Configuration parameters

The plugin supports the following configuration parameters:

| Key | Description | Default |
| --- | --- | --- |
| `alias` | Sets an alias for multiple instances of the same input plugin. This helps when you need to run multiple blob input instances with different configurations. | none |
| `database_file` | Specify a database file to keep track of processed files and their state. This enables the plugin to resume processing from the last known position if Fluent Bit is restarted. | none |
| `exclude_pattern` | Set one or more comma-separated shell patterns to exclude matching files. For example, `exclude_pattern *.tmp,*.bak` excludes temporary and backup files from processing. | none |
| `log_level` | Specifies the log level for this input plugin. If not set here, the plugin uses the global log level specified in the `service` section. Valid values: `off`, `error`, `warn`, `info`, `debug`, `trace`. | `info` |
| `log_suppress_interval` | Suppresses log messages from this input plugin that appear similar within the specified time interval, in seconds. Set to `0` to disable suppression. This helps reduce log noise when the same error or warning occurs repeatedly. | `0` |
| `mem_buf_limit` | Set a memory buffer limit for the input plugin instance, specified according to the Unit Size specification (for example, `50M`). If the limit is reached, the plugin pauses until the buffer is drained. Set to `0` to disable the limit. If the plugin has filesystem buffering enabled, this limit doesn't apply. | `0` |
| `path` | Path to scan for blob (binary) files. Supports wildcards and glob patterns, such as `/var/log/binaries/*.bin` or `/data/artifacts/**/*.dat`. This parameter is required. | none |
| `routable` | If `true`, the data generated by the plugin can be forwarded to other plugins or outputs. If `false`, the data is discarded. Setting this to `false` is useful for testing or when you want to process data without forwarding it. | `true` |
| `scan_refresh_interval` | Set the interval at which the plugin scans the specified path for new or modified files, specified with time units (for example, `2s`, `30m`, `1h`). | `2s` |
| `storage.pause_on_chunks_overlimit` | Enable pausing the input when it reaches its chunks limit. When enabled, the plugin pauses processing if the number of chunks exceeds the limit, preventing memory issues during backpressure scenarios. | `false` |
| `storage.type` | Sets the storage type for this input: `filesystem` (persists data to disk), `memory` (stores data in memory only), or `memrb` (memory ring buffer). For production environments with high data volumes, consider `filesystem` to prevent data loss during restarts. | `memory` |
| `tag` | Set a tag for the events generated by this input plugin. Tags are used for routing records to specific outputs. Supports tag expansion with wildcards. | none |
| `threaded` | Indicates whether to run this input in its own thread. When enabled, the plugin runs in a separate thread, which can improve performance for I/O-bound operations. | `false` |
| `threaded.ring_buffer.capacity` | Set a custom ring buffer capacity when the input runs in threaded mode. This determines how many records can be buffered in the ring buffer before blocking. | `1024` |
| `threaded.ring_buffer.window` | Set a custom ring buffer window percentage for threaded inputs. This controls when the ring buffer is considered full and triggers backpressure handling. | `5` |
| `upload_success_action` | Action to perform on a file after successful upload: `delete` (delete the file), `add_suffix` (rename the file by appending a suffix), or `emit_log` (emit a log record with a custom message). With `add_suffix`, set `upload_success_suffix`; with `emit_log`, set `upload_success_message`. | none |
| `upload_success_suffix` | Suffix to append to the filename after successful upload. Only used when `upload_success_action` is set to `add_suffix`. For example, if set to `.processed`, a file named `data.bin` is renamed to `data.bin.processed`. | none |
| `upload_success_message` | Message to emit as a log record after successful upload. Only used when `upload_success_action` is set to `emit_log`. Useful for debugging or monitoring. | none |
| `upload_failure_action` | Action to perform on a file after upload failure: `delete`, `add_suffix`, or `emit_log`, with the same behavior as `upload_success_action`. With `add_suffix`, set `upload_failure_suffix`; with `emit_log`, set `upload_failure_message`. | none |
| `upload_failure_suffix` | Suffix to append to the filename after upload failure. Only used when `upload_failure_action` is set to `add_suffix`. For example, if set to `.failed`, a file named `data.bin` is renamed to `data.bin.failed`. | none |
| `upload_failure_message` | Message to emit as a log record after upload failure. Only used when `upload_failure_action` is set to `emit_log`. Useful for debugging or monitoring. | none |
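
For reference, the following minimal configuration runs the input in threaded mode and tunes the ring buffer parameters described above. The path, tag, and tuning values are illustrative, not recommendations:

pipeline:
  inputs:
    - name: blob
      path: /var/log/binaries/*.bin
      threaded: true
      threaded.ring_buffer.capacity: 2048
      threaded.ring_buffer.window: 10
      tag: blob.threaded

  outputs:
    - name: stdout
      match: '*'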

How it works

The Blob input plugin periodically scans the specified directory path for binary files. When a new or modified file is detected, the plugin reads the file content and creates records that are forwarded through the Fluent Bit pipeline. The plugin can track processed files using a database file, allowing it to resume from the last known position after a restart.

Binary file content is typically included in the output records, and the exact format depends on the output plugin configuration. The plugin generates one or more records per file, depending on the file size and configuration.

Database file

The database file enables the plugin to track which files have been processed and maintain state across Fluent Bit restarts. This is similar to how the Tail input plugin uses a database file.

When a database file is specified:

  • The plugin stores information about processed files, including file paths and processing status

  • On restart, the plugin can skip files that were already processed

  • The database is backed by SQLite3 and will create additional files (.db-shm and .db-wal) when using write-ahead logging mode

It's recommended to use a unique database file for each blob input instance to avoid conflicts. For example:

pipeline:
  inputs:
    - name: blob
      path: /var/log/binaries/*.bin
      database_file: /var/lib/fluent-bit/blob.db
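
With write-ahead logging active, the directory containing the database will hold the SQLite3 companion files alongside the main file, similar to this illustrative listing:

$ ls /var/lib/fluent-bit/
blob.db  blob.db-shm  blob.db-wal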

Use cases

Common use cases for the Blob input plugin include:

  • Binary log files: Processing binary-formatted log files that can't be read as text

  • Artifact collection: Collecting binary artifacts or build outputs for analysis or archival

  • File monitoring: Monitoring directories for new binary files and forwarding them to storage or analysis systems

  • Data pipeline integration: Integrating binary data sources into your Fluent Bit data pipeline

Get started

You can run the plugin from the command line or through a configuration file.

Command line

Run the plugin from the command line using the following command:

fluent-bit -i blob --prop "path=[SOME_PATH_TO_BINARY_FILES]" -o stdout

which returns results like the following:

...
[2025/11/05 17:39:32.818356000] [ info] [input:blob:blob.0] initializing
[2025/11/05 17:39:32.818362000] [ info] [input:blob:blob.0] storage_strategy='memory' (memory only)
...
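
You can pass additional parameters by repeating the --prop flag. For example, the following command (the paths and tag are illustrative) enables database tracking and sets a custom tag:

fluent-bit -i blob \
  --prop "path=/var/log/binaries/*.bin" \
  --prop "database_file=/var/lib/fluent-bit/blob.db" \
  --prop "tag=blob.files" \
  -o stdout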

Configuration file

In your main configuration file, append the following:

pipeline:
  inputs:
    - name: blob
      path: '/path/to/binary/files/*.bin'

  outputs:
    - name: stdout
      match: '*'

Examples

Basic configuration with database tracking

This example shows how to configure the blob plugin with a database file to track processed files:

pipeline:
  inputs:
    - name: blob
      path: /var/log/binaries/*.bin
      database_file: /var/lib/fluent-bit/blob.db
      scan_refresh_interval: 10s
      tag: blob.files

  outputs:
    - name: stdout
      match: '*'

Configuration with file exclusion and storage

This example excludes certain file patterns and uses filesystem storage for better reliability:

pipeline:
  inputs:
    - name: blob
      path: /data/artifacts/**/*
      exclude_pattern: '*.tmp,*.bak,*.old'
      storage.type: filesystem
      storage.pause_on_chunks_overlimit: true
      mem_buf_limit: 50M
      tag: artifacts

  outputs:
    - name: stdout
      match: '*'
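
Combining filesystem storage with storage.pause_on_chunks_overlimit trades some throughput for durability: buffered chunks persist across restarts, and the input pauses rather than growing memory without bound when outputs fall behind.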

Configuration with file actions after upload

This example renames files after successful upload and handles failures:

pipeline:
  inputs:
    - name: blob
      path: /var/log/binaries/*.bin
      database_file: /var/lib/fluent-bit/blob.db
      upload_success_action: add_suffix
      upload_success_suffix: .processed
      upload_failure_action: add_suffix
      upload_failure_suffix: .failed
      tag: blob.data

  outputs:
    - name: stdout
      match: '*'
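
Configuration with log emission after upload

If you prefer an audit trail in the log stream instead of renaming files, the emit_log action emits a record with a custom message. The following sketch, in which the messages are illustrative, reports both outcomes:

pipeline:
  inputs:
    - name: blob
      path: /var/log/binaries/*.bin
      database_file: /var/lib/fluent-bit/blob.db
      upload_success_action: emit_log
      upload_success_message: 'blob file uploaded successfully'
      upload_failure_action: emit_log
      upload_failure_message: 'blob file upload failed'
      tag: blob.data

  outputs:
    - name: stdout
      match: '*'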
