Regular expression format

Use the regular expression parser format to create custom parsers with Ruby regular expressions. Named capture groups in the expression define which part of the input belongs to which key name.

The Tail input plugin treats each line as a separate entity. Use Tail multiline when you need a regular expression to match across multiple lines read by Tail.

This parser uses Onigmo, which is a backtracking regular expression engine. With complex patterns, Onigmo can take a long time to perform pattern matching, which can cause a regular expression denial of service (ReDoS).
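
As a rough illustration of the kind of pattern that invites heavy backtracking, the following Ruby sketch compares a nested quantifier with an equivalent single quantifier. Both patterns are made up for this example and aren't taken from any Fluent Bit parser. The inputs are kept short on purpose, and actual matching time depends on the engine version, so treat this as a sketch rather than a benchmark.

```ruby
# Two patterns that accept exactly the same strings: one or more "a" characters.
risky = /^(a+)+$/ # nested quantifiers: many ways to split the same run of "a"
safer = /^a+$/    # single quantifier: only one way to match

inputs = ["aaaa", "aaab", ""]

# Both patterns agree on every input. The difference is how much work a
# backtracking engine may do before rejecting a near miss such as "aaab".
results = inputs.map { |s| [s, risky.match?(s), safer.match?(s)] }
results.each { |s, r, f| puts "#{s.inspect}: risky=#{r} safer=#{f}" }
```

Preferring a single quantifier over a negated character class, such as `[^ ]*`, instead of nested or overlapping quantifiers keeps the number of possible match paths small.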

Setting the format to regex requires a regex configuration key.

For available configuration parameters, see Configuring custom parsers.

Configuration parameters

The regex parser supports the following format-specific configuration parameter:

| Key | Description | Default |
| --- | --- | --- |
| skip_empty_values | If enabled, the parser ignores empty values of the record. | true |
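
To make the effect of skip_empty_values more concrete, here is a minimal Ruby sketch of the idea. The `drop_empty_values` helper and the `key=value` pattern are hypothetical and exist only for this illustration. The sketch assumes that "empty" covers both empty strings and groups that didn't take part in the match, and it isn't Fluent Bit's implementation.

```ruby
# Hypothetical helper, not Fluent Bit code: drops captures that are nil or "".
def drop_empty_values(captures)
  captures.reject { |_key, value| value.nil? || value.empty? }
end

# Illustrative pattern: "key=value" with an optional trailing note.
pattern = /^(?<key>[^=]*)=(?<value>[^ ]*)(?: (?<note>.*))?$/

# "value" matches the empty string and "note" does not participate at all.
captures = pattern.match('color=').named_captures
kept = drop_empty_values(captures)

puts "all captures: #{captures}"
puts "kept:         #{kept}"
```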

Fluent Bit uses the Onigmo regular expression library in Ruby mode.

You can use only alphanumeric characters and underscores in group names. For example, a group name like (?<user-name>.*) causes an error due to the invalid dash (-) character. Use the Rubular web editor to test your expressions.

The following parser configuration example provides rules that can be applied to an Apache HTTP Server log entry:

parsers:
  - name: apache
    format: regex
    regex: '^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$'
    time_key: time
    time_format: '%d/%b/%Y:%H:%M:%S %z'
    types: code:integer size:integer

[PARSER]
  Name   apache
  Format regex
  Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
  Time_Key time
  Time_Format %d/%b/%Y:%H:%M:%S %z
  Types code:integer size:integer

As an example, review the following Apache HTTP Server log entry:

192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395

On its own, this log entry is a plain string with no defined structure for Fluent Bit. Applying the proper parser produces a structured representation of the entry:

[1154104030, {"host"=>"192.168.2.20",
              "user"=>"-",
              "method"=>"GET",
              "path"=>"/cgi-bin/try/",
              "code"=>"200",
              "size"=>"3395",
              "referer"=>"",
              "agent"=>""
              }
]
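
To experiment with the expression outside Fluent Bit, you can run it in Ruby against the same log entry. This is only a sketch: the integer conversion loosely approximates the types setting, the variable names are illustrative, and Ruby reports the optional referer and agent groups as nil when they don't match, which differs from the representation shown in the preceding output.

```ruby
# The same expression as in the parser configuration, split for readability.
apache_regex = Regexp.new(
  '^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] ' \
  '"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" ' \
  '(?<code>[^ ]*) (?<size>[^ ]*)' \
  '(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$'
)

line = '192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] ' \
       '"GET /cgi-bin/try/ HTTP/1.0" 200 3395'

# named_captures returns a hash keyed by group name.
record = apache_regex.match(line).named_captures

# Rough stand-in for "types: code:integer size:integer".
%w[code size].each { |key| record[key] = record[key].to_i }

record.each { |key, value| puts "#{key}: #{value.inspect}" }
```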
