Fluent Bit: Official Manual
Validating your Data and Structure

Last updated 8 months ago

Fluent Bit is a powerful log processing tool that supports multiple sources and formats. In addition, it provides filters that can be used to perform custom modifications. As your pipeline grows, it's important to validate your data and structure.

Fluent Bit users are encouraged to integrate data validation in their continuous integration (CI) systems.

In a normal production environment, inputs, filters, and outputs are defined in the configuration. Fluent Bit provides the Expect filter, which can be used to validate keys and values from your records and take action when an exception is found.

A simplified view of the data processing pipeline is as follows:

[Figure: simplified view of the data processing pipeline]

Understand structure and configuration

Consider the following pipeline, where your source of data is a file with JSON content and two filters:

  • grep to exclude certain records

  • record_modifier to alter the record content by adding and removing specific keys.

Add data validation between each step to ensure your data structure is correct.

This example uses the expect filter.

Expect filters define rules that validate criteria such as:

  • Does the record contain a key A?

  • Does the record not contain key A?

  • Does the record key A value equal NULL?

  • Is the record key A value not NULL?

  • Does the record key A value equal B?
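
As a rough illustration, each question above corresponds to one rule property of the expect filter. The fragment below is a sketch only: the key names (color, debug, meta, label) are illustrative placeholders, and the rules are listed together to show the syntax, not because a single record is expected to satisfy all of them. The warn action is used here so that a failed rule is reported without stopping Fluent Bit.

```
# Illustrative mapping of validation questions to expect rules.
# Key names are placeholders; 'debug' is a hypothetical key.
#   key_exists          -> does the record contain the key?
#   key_not_exists      -> is the key absent from the record?
#   key_val_is_null     -> is the key's value NULL?
#   key_val_is_not_null -> is the key's value not NULL?
#   key_val_eq          -> does the key's value equal the given value?
[FILTER]
    name                 expect
    match                *
    key_exists           color
    key_not_exists       debug
    key_val_is_null      meta
    key_val_is_not_null  $label['name']
    key_val_eq           color blue
    action               warn
```

Nested keys are addressed with record accessor syntax, as in $label['name'].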

Test the configuration

Consider a JSON file data.log with the following content:

{"color": "blue", "label": {"name": null}}
{"color": "red", "label": {"name": "abc"}, "meta": "data"}
{"color": "green", "label": {"name": "abc"}, "meta": null}

The following Fluent Bit configuration file configures a pipeline to consume the log, while applying an expect filter to validate that the key color and the nested key name under label exist:

[SERVICE]
    flush        1
    log_level    info
    parsers_file parsers.conf

[INPUT]
    name        tail
    path        ./data.log
    parser      json
    exit_on_eof on

# First 'expect' filter to validate that our data was structured properly
[FILTER]
    name        expect
    match       *
    key_exists  color
    key_exists  $label['name']
    action      exit

[OUTPUT]
    name        stdout
    match       *

If the JSON parser fails or is missing in the tail input (parser json), the records do not contain the expected keys, so the expect filter triggers the exit action.
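
The configuration above loads parser definitions from parsers.conf through parsers_file. Fluent Bit distributions normally ship a fuller parsers.conf; the fragment below is only a minimal sketch of what that file needs for this example, assuming the sole requirement is a parser named json and that no timestamp handling is required.

```
# parsers.conf (minimal sketch; assumes no time key needs to be parsed)
[PARSER]
    name   json
    format json
```

The parser name must match the value given to the parser property of the tail input.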

To extend the pipeline, add a grep filter that keeps only records whose label map contains a key called name with the value abc, and add an expect filter to re-validate that condition:

[SERVICE]
    flush        1
    log_level    info
    parsers_file parsers.conf

[INPUT]
    name         tail
    path         ./data.log
    parser       json
    exit_on_eof  on

# First 'expect' filter to validate that our data was structured properly
[FILTER]
    name       expect
    match      *
    key_exists color
    key_exists label
    action     exit

# Keep only records where map 'label' has key 'name' = 'abc'
[FILTER]
    name       grep
    match      *
    regex      $label['name'] ^abc$

# Check that every remaining record has 'name' under 'label' equal to 'abc'
[FILTER]
    name       expect
    match      *
    key_val_eq $label['name'] abc
    action     exit

# Append a new key to the record using an environment variable
[FILTER]
    name       record_modifier
    match      *
    record     hostname ${HOSTNAME}

# Check that every record contains 'hostname' key
[FILTER]
    name       expect
    match      *
    key_exists hostname
    action     exit

[OUTPUT]
    name       stdout
    match      *

Production deployment

When deploying in production, consider removing the expect filters from your configuration. These filters are unnecessary unless you need 100% coverage of checks at runtime.
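
If some checks do need to stay in place at runtime, one option is to report failures instead of stopping the service. The fragment below is a sketch under that assumption: it reuses the hostname check from the test configuration and sets the action to warn, so that a failed rule produces a warning message and Fluent Bit keeps running.

```
# Runtime check that reports failures without exiting (sketch)
[FILTER]
    name       expect
    match      *
    key_exists hostname
    action     warn
```

Whether this trade-off is appropriate depends on how much per-record overhead your deployment can tolerate.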

Every expect filter configuration exposes rules to validate the content of your records using configuration properties.
