Fluent Bit: Official Manual
SlackGitHubCommunity MeetingsSandbox and LabsWebinars
1.5
1.5
  • Fluent Bit v1.5 Documentation
  • About
    • What is Fluent Bit ?
    • A Brief History of Fluent Bit
    • Fluentd & Fluent Bit
    • License
  • Concepts
    • Key Concepts
    • Buffering
    • Data Pipeline
      • Input
      • Parser
      • Filter
      • Buffer
      • Router
      • Output
  • Installation
    • Upgrade Notes
    • Supported Platforms
    • Requirements
    • Sources
      • Download Source Code
      • Build and Install
      • Build with Static Configuration
    • Linux Packages
      • Amazon Linux
      • Redhat / CentOS
      • Debian
      • Ubuntu
      • Raspbian / Raspberry Pi
    • Docker
    • Containers on AWS
    • Amazon EC2
    • Kubernetes
    • Yocto / Embedded Linux
    • Windows
  • Administration
    • Configuring Fluent Bit
      • Format and Schema
      • Configuration File
      • Variables
      • Commands
      • Upstream Servers
      • Unit Sizes
      • Record Accessor
    • Security
    • Buffering & Storage
    • Backpressure
    • Scheduling and Retries
    • Networking
    • Memory Management
    • Monitoring
    • Dump Internals / Signal
  • Local Testing
    • Validating your Data and Structure
    • Running a Logging Pipeline Locally
  • Data Pipeline
    • Inputs
      • Collectd
      • CPU Metrics
      • Disk I/O Metrics
      • Docker Events
      • Dummy
      • Exec
      • Forward
      • Head
      • Health
      • Kernel Logs
      • Memory Metrics
      • MQTT
      • Network I/O Metrics
      • Process
      • Random
      • Serial Interface
      • Standard Input
      • Syslog
      • Systemd
      • Tail
      • TCP
      • Thermal
      • Windows Event Log
    • Parsers
      • JSON
      • Regular Expression
      • LTSV
      • Logfmt
      • Decoders
    • Filters
      • AWS Metadata
      • Expect
      • Grep
      • Kubernetes
      • Lua
      • Parser
      • Record Modifier
      • Rewrite Tag
      • Standard Output
      • Throttle
      • Nest
      • Modify
    • Outputs
      • Amazon CloudWatch
      • Azure
      • BigQuery
      • Counter
      • Datadog
      • Elasticsearch
      • File
      • FlowCounter
      • Forward
      • GELF
      • HTTP
      • InfluxDB
      • Kafka
      • Kafka REST Proxy
      • LogDNA
      • NATS
      • New Relic
      • NULL
      • PostgreSQL
      • Stackdriver
      • Standard Output
      • Splunk
      • Syslog
      • TCP & TLS
      • Treasure Data
  • Stream Processing
    • Introduction to Stream Processing
    • Overview
    • Changelog
    • Getting Started
      • Fluent Bit + SQL
      • Check Keys and NULL values
      • Hands On! 101
  • Fluent Bit for Developers
    • C Library API
    • Ingest Records Manually
    • Golang Output Plugins
    • Developer guide for beginners on contributing to Fluent Bit
Powered by GitBook
On this page

Was this helpful?

Export as PDF
  1. Data Pipeline
  2. Parsers

Regular Expression

Last updated 4 years ago

Was this helpful?

The regex parser allows to define a custom Ruby Regular Expression that will use a named capture feature to define which content belongs to which key name.

Fluent Bit uses regular expression library on Ruby mode, for testing purposes you can use the following web editor to test your expressions:

Important: do not attempt to add multiline support in your regular expressions if you are using input plugin since each line is handled as a separated entity. Instead use Tail support configuration feature.

Security Warning: Onigmo is a backtracking regex engine. You need to be careful not to use expensive regex patterns, or Onigmo can take very long time to perform pattern matching. For details, please read the article on OWASP.

Note: understanding how regular expressions works is out of the scope of this content.

From a configuration perspective, when the format is set to regex, is mandatory and expected that a Regex configuration key exists.

The following parser configuration example aims to provide rules that can be applied to an Apache HTTP Server log entry:

[PARSER]
    Name   apache
    Format regex
    Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

As an example, takes the following Apache HTTP Server log entry:

192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

The above content do not provide a defined structure for Fluent Bit, but enabling the proper parser we can help to make a structured representation of it:

[1154104030, {"host"=>"192.168.2.20",
              "user"=>"-",
              "method"=>"GET",
              "path"=>"/cgi-bin/try/",
              "code"=>"200",
              "size"=>"3395",
              "referer"=>"",
              "agent"=>""
              }
]

A common pitfall is that you cannot use characters other than alphabets, numbers and underscore in group names. For example, a group name like (?<user-name>.*) will cause an error due to containing an invalid character (-).

In order to understand, learn and test regular expressions like the example above, we suggest you try the following Ruby Regular Expression Editor:

Onigmo
http://rubular.com/
Tail
Multiline
"ReDoS"
http://rubular.com/r/X7BH0M4Ivm