Validating your Data and Structure
Fluent Bit is a powerful log processing tool that supports multiple sources and formats. In addition, it provides filters that can be used to perform custom modifications. As your pipeline grows, it's important to validate your data and structure.
Fluent Bit users are encouraged to integrate data validation in their continuous integration (CI) systems.
In a normal production environment, inputs, filters, and outputs are defined in the configuration. Fluent Bit provides the Expect filter, which can be used to validate keys and values from your records and take action when an exception is found.
A simplified view of the data processing pipeline is as follows:
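Input -> Parser -> Filter -> Buffer -> Router -> Output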
Consider the following pipeline, where your source of data is a file with JSON content and two filters:
grep to exclude certain records
record_modifier to alter the record content by adding and removing specific keys.
Add data validation between each step to ensure your data structure is correct.
This example uses the expect filter.
Expect filters set rules that validate criteria such as:
Does the record contain a key A?
Does the record not contain key A?
Does the record key A value equal NULL?
Is the record key A value not NULL?
Does the record key A value equal B?
Every expect filter configuration exposes rules to validate the content of your records using configuration properties.
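As a sketch, each rule above maps to an expect configuration property. The key names used here (color, error, discarded) are purely illustrative:

```
[FILTER]
    name  expect
    match *
    # The record contains the key 'color'
    key_exists          color
    # The record does not contain the key 'error'
    key_not_exists      error
    # The value of 'discarded' is null
    key_val_is_null     discarded
    # The value of 'color' is not null
    key_val_is_not_null color
    # The value of 'color' equals 'blue'
    key_val_eq          color blue
    # What to do when a rule fails: 'warn' or 'exit'
    action              warn
```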
Consider a JSON file data.log with the following content:
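A sample along those lines, using illustrative values that exercise the color and label keys referenced below:

```
{"color": "blue",  "label": {"name": null}}
{"color": "red",   "label": {"name": "abc"}, "meta": "data"}
{"color": "green", "label": {"name": "abc"}, "meta": null}
```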
The following Fluent Bit configuration defines a pipeline that consumes the log and applies an expect filter to validate that the keys color and label exist:
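A minimal classic-mode configuration along those lines; the path ./data.log, the exit_on_eof setting, and the stock parsers.conf are assumptions made for this sketch:

```
[SERVICE]
    flush        1
    parsers_file parsers.conf

[INPUT]
    name        tail
    path        ./data.log
    parser      json
    exit_on_eof on

# Validate that every record carries the 'color' and 'label' keys
[FILTER]
    name       expect
    match      *
    key_exists color
    key_exists label
    action     exit

[OUTPUT]
    name  stdout
    match *
```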
If the JSON parser fails or is missing in the tail input (parser json), the expect filter triggers the exit action.
To extend the pipeline, add a grep filter to match records whose label map contains a key called name with the value abc, and add an expect filter to re-validate that condition:
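Sketched as additional filter sections; the record accessor $label['name'] and the regex ^abc$ are assumptions for this example:

```
# Keep only records whose 'label' map has 'name' matching 'abc'
[FILTER]
    name  grep
    match *
    regex $label['name'] ^abc$

# Re-validate the same condition and stop Fluent Bit if it ever fails
[FILTER]
    name       expect
    match      *
    key_val_eq $label['name'] abc
    action     exit
```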
When deploying in production, consider removing the expect filters from your configuration. These filters are unnecessary unless you need 100% coverage of checks at runtime.