# Multiline parsing

![](https://static.scarf.sh/a.png?x-pxid=e19a4c14-a9e4-4163-8f3a-52196eb9a585)

In an ideal world, applications might log their messages within a single line, but in reality applications generate multiple log messages that sometimes belong to the same context. Processing this information can be complex, like in application stack traces, which always have multiple log lines.

Fluent Bit v1.8 implemented a unified Multiline core capability to solve corner cases.

## Concepts

The Multiline parser engine exposes two ways to configure and use the feature:

* Built-in multiline parser
* Configurable multiline parser

### Built-in multiline parsers

Fluent Bit exposes certain pre-configured parsers (built-in) to solve specific multiline parser cases. For example:

| Parser   | Description                                                                                                                                                                                                                                                                                                                                                                     |
| -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `docker` | Process a log entry generated by a Docker container engine. This parser supports the concatenation of large log entries split by Docker. If you use this parser, and you also want to concatenate loglines like stacktraces, you can add the [multiline filter](https://docs.fluentbit.io/manual/4.1/data-pipeline/filters/multiline-stacktrace) to specify additional parsers. |
| `cri`    | Process a log entry generated by CRI-O container engine. Like the `docker` parser, it supports concatenation of log entries.                                                                                                                                                                                                                                                    |
| `go`     | Process log entries generated by a Go based language application and perform concatenation if multiline messages are detected.                                                                                                                                                                                                                                                  |
| `python` | Process log entries generated by a Python based language application and perform concatenation if multiline messages are detected.                                                                                                                                                                                                                                              |
| `java`   | Process log entries generated by a Google Cloud Java language application and perform concatenation if multiline messages are detected.                                                                                                                                                                                                                                         |

### Configurable multiline parsers

You can define your own Multiline parsers with their own rules, using a configuration file.

A multiline parser is defined in a `parsers configuration file` by using a `[MULTILINE_PARSER]` section definition. The multiline parser must have a unique name and a type, plus other configured properties associated with each type.

To understand which multiline parser type is required for your use case you have to know the conditions in the content that determine the beginning of a multiline message, and the continuation of subsequent lines. Fluent Bit provides a regular expression-based configuration that supports states to handle from the most cases.

| Property        | Description                                                                                                                                                                                                                                                                                                                                                                                | Default |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------- |
| `name`          | Specify a unique name for the multiline parser definition. A good practice is to prefix the name with the word `multiline_` to avoid confusion with normal parser definitions.                                                                                                                                                                                                             | *none*  |
| `type`          | Set the multiline mode. Fluent Bit supports the type `regex`.                                                                                                                                                                                                                                                                                                                              | *none*  |
| `parser`        | Name of a pre-defined parser that must be applied to the incoming content before applying the regular expression rule. If no parser is defined, it's assumed that's a raw text and not a structured message. When a parser is applied to a raw text, the regular expression is applied against a specific key of the structured message by using the `key_content` configuration property. | *none*  |
| `key_content`   | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated.                                                                                                                                                                                                                                   | *none*  |
| `flush_timeout` | Timeout in milliseconds to flush a non-terminated multiline buffer.                                                                                                                                                                                                                                                                                                                        | `5s`    |
| `rule`          | Configure a rule to match a multiline pattern. The rule has a [specific format](#rules-definition). Multiple rules can be defined.                                                                                                                                                                                                                                                         | *none*  |

#### Lines and states

Before configuring your parser you need to know the answer to the following questions:

1. What's the regular expression (`regex`) that matches the first line of a multiline message?
2. What are the regular expressions (`regex`) that match the continuation lines of a multiline message?

When matching a regular expression, you must to define `states`. Some states define the start of a multiline message while others are states for the continuation of multiline messages. You can have multiple `continuation states` definitions to solve complex cases.

The first regular expression that matches the start of a multiline message is called `start_state`. Other regular expression continuation lines can have different state names.

#### Rules definition

A rule specifies how to match a multiline pattern and perform the concatenation. A rule is defined by 3 specific components:

* state name
* regular expression pattern
* next state

A rule might be defined as follows (comments added to simplify the definition) in the corresponding YAML and classic configuration examples:

{% tabs %}
{% tab title="parsers\_multiline.yaml" %}

```yaml
# rules |   state name  | regex pattern                  | next state
# ------|---------------|--------------------------------------------
rules:
  - state: start_state
    regex: '/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/'
    next_state:  cont
  - state: cont
    regex: '/^\s+at.*/'
    next_state: cont
```

{% endtab %}

{% tab title="parsers\_multiline.conf" %}

```
# rules   |   state name   | regex pattern                   | next state
# --------|----------------|---------------------------------------------
rule         "start_state"   "/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/"   "cont"
rule         "cont"          "/^\s+at.*/"                      "cont"
```

{% endtab %}
{% endtabs %}

This example defines two rules. Each rule has its own state name, regular expression patterns, and the next state name. Every field that composes a rule must be inside double quotes.

The first rule of a state name must be `start_state`. The regular expression pattern must match the first line of a multiline message, and a next state must be set to specify what the possible continuation lines look like.

{% hint style="info" %}
To simplify the configuration of regular expressions, you can use the [Rubular](https://rubular.com/r/NDuyKwlTGOvq2g) web site. This link uses the regular expression described in the previous example, plus a log line that matches the pattern:
{% endhint %}

#### Configuration example

The following example provides a full Fluent Bit configuration file for multiline parsing by using the definition explained previously. It's provided in following YAML and classic configuration examples:

{% tabs %}
{% tab title="fluent-bit.yaml" %}
This is the primary Fluent Bit YAML configuration file. It includes the `parsers_multiline.yaml` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. Then it sends the processing to the standard output.

```yaml
service:
  flush: 1
  log_level: info
  parsers_file: parsers_multiline.yaml

pipeline:
  inputs:
    - name: tail
      path: test.log
      read_from_head: true
      multiline.parser: multiline-regex-test

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}
This is the primary Fluent Bit classic configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. Then it sends the processing to the standard output.

```
[SERVICE]
  flush        1
  log_level    info
  parsers_file parsers_multiline.conf

[INPUT]
  name             tail
  path             test.log
  read_from_head   true
  multiline.parser multiline-regex-test

[OUTPUT]
  name             stdout
  match            *
```

{% endtab %}

{% tab title="parsers\_multiline.yaml" %}
This file defines a multiline parser for the YAML configuration example.

```yaml
multiline_parsers:
  - name: multiline-regex-test
    type: regex
    flush_timeout: 1000
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                  | next state
    # ------|---------------|--------------------------------------------
    rules:
      - state: start_state
        regex: '/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/'
        next_state: cont
      - state: cont
        regex: '/^\s+at.*/'
        next_state: cont
```

{% endtab %}

{% tab title="parsers\_multiline.conf" %}
This second file defines a multiline parser for the classic configuration example.

```
[MULTILINE_PARSER]
  name          multiline-regex-test
  type          regex
  flush_timeout 1000
  #
  # Regex rules for multiline parsing
  # ---------------------------------
  #
  # configuration hints:
  #
  #  - first state always has the name: start_state
  #  - every field in the rule must be inside double quotes
  #
  # rules |   state name  | regex pattern                  | next state
  # ------|---------------|--------------------------------------------
  rule      "start_state"   "/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
  rule      "cont"          "/^\s+at.*/"                     "cont"
```

{% endtab %}

{% tab title="test.log" %}
The example log file with multiline content:

```
single line...
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
another line...

```

{% endtab %}
{% endtabs %}

By running Fluent Bit with the corresponding configuration file you will obtain the following output:

```shell
# For YAML configuration.
$ ./fluent-bit --config fluent-bit.yaml

# For classic configuration.
$ ./fluent-bit --config fluent-bit.conf

...
[0] tail.0: [[1750332967.679671000, {}], {"log"=>"single line...
"}]
[1] tail.0: [[1750332967.679677000, {}], {"log"=>"Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
"}]
[2] tail.0: [[1750332967.679677000, {}], {"log"=>"another line...
"}]
```

The lines that didn't match a pattern aren't considered as part of the multiline message, while the ones that matched the rules were concatenated properly.

## Limitations

The multiline parser is a very powerful feature, but it has some limitations that you should be aware of:

* The multiline parser isn't affected by the `buffer_max_size` configuration option, allowing the composed log record to grow beyond this size. The `skip_long_lines` option won't be applied to multiline messages.
* It's not possible to get the time key from the body of the multiline message. However, it can be extracted and set as a new key by using a filter.

### Additional configuration for buffer limit of multiline

Fluent Bit now supports a configuration key to **limit the memory used during multiline concatenation**.

| Property                 | Description                                                                                                                                                                                                                                                                                                                               | Default |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `multiline_buffer_limit` | Sets the maximum size of the in-memory buffer used while assembling a multiline message. When the accumulated size exceeds this limit, the message is truncated and a `multiline_truncated: true` metadata field is attached to the emitted record. A value of `0` disables the limit. Accepts unit suffixes like `KiB`, `MiB`, or `GiB`. | `2MiB`  |

```
service:
  multiline_buffer_limit  2MiB   # default; limit concatenated multiline message size
```

If the limit is reached, Fluent Bit flushes the partial record with `multiline_truncated: true` metadata immediately to prevent Out-Of-Memory (OOM) conditions. This option ensures predictable memory usage in Kubernetes or other constrained environments, particularly when using built-in parsers such as `cri` or `docker`.

## Get structured data from multiline message

Fluent-bit supports the `/pat/m` option. It allows `.` matches a new line, which can be used to parse multiline logs.

The following example retrieves `date` and `message` from concatenated logs.

Example files content:

{% tabs %}
{% tab title="fluent-bit.yaml" %}
This is the primary Fluent Bit YAML configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. It also parses concatenated log by applying parser `named-capture-test`. Then it sends the processing to the standard output.

```yaml
service:
  flush: 1
  log_level: info
  parsers_file: parsers_multiline.yaml

pipeline:
  inputs:
    - name: tail
      path: test.log
      read_from_head: true
      multiline.parser: multiline-regex-test

  filters:
    - name: parser
      match: '*'
      key_name: log
      parser: named-capture-test

  outputs:
    - name: stdout
      match: '*'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}
This is the primary Fluent Bit classic configuration file. It includes the `parsers_multiline.conf` and tails the file `test.log` by applying the multiline parser `multiline-regex-test`. It also parses concatenated log by applying parser `named-capture-test`. Then it sends the processing to the standard output.

```
[SERVICE]
  flush        1
  log_level    info
  parsers_file parsers_multiline.conf

[INPUT]
  name             tail
  path             test.log
  read_from_head   true
  multiline.parser multiline-regex-test

[FILTER]
  name             parser
  match            *
  key_name         log
  parser           named-capture-test

[OUTPUT]
  name             stdout
  match            *
```

{% endtab %}

{% tab title="parsers\_multiline.yaml" %}
This file defines a multiline parser for the YAML example.

```yaml
multiline_parsers:
  - name: multiline-regex-test
    type: regex
    flush_timeout: 1000
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                  | next state
    # ------|---------------|--------------------------------------------
    rules:
      - state: start_state
        regex: '/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/'
        next_state:  cont

      - state: cont
        regex: '/^\s+at.*/'
        next_state: cont

parsers:
  - name: named-capture-test
    format: regex
    regex: '/^(?<date>[a-zA-Z]+ \d+ \d+\:\d+\:\d+) (?<message>.*)/m'
```

{% endtab %}

{% tab title="parsers\_multiline.conf" %}
This file defines a multiline parser for the classic example.

```
[MULTILINE_PARSER]
  name          multiline-regex-test
  type          regex
  flush_timeout 1000
  #
  # Regex rules for multiline parsing
  # ---------------------------------
  #
  # configuration hints:
  #
  #  - first state always has the name: start_state
  #  - every field in the rule must be inside double quotes
  #
  # rules |   state name  | regex pattern                  | next state
  # ------|---------------|--------------------------------------------
  rule      "start_state"   "/([a-zA-Z]+ \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
  rule      "cont"          "/^\s+at.*/"                     "cont"

[PARSER]
  Name named-capture-test
  Format regex
  Regex /^(?<date>[a-zA-Z]+ \d+ \d+\:\d+\:\d+) (?<message>.*)/m
```

{% endtab %}

{% tab title="test.log" %}
The example log file with multiline content:

```
single line...
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
another line...
```

{% endtab %}
{% endtabs %}

By running Fluent Bit with the corresponding configuration file you will obtain:

```shell
# For YAML configuration.
$ ./fluent-bit --config fluent-bit.yaml

# For classic configuration
$ ./fluent-bit --config fluent-bit.conf

[0] tail.0: [[1750333602.460984000, {}], {"log"=>"single line...
"}]
[1] tail.0: [[1750333602.460998000, {}], {"date"=>"Dec 14 06:41:08", "message"=>"Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
"}]
[2] tail.0: [[1750333602.460998000, {}], {"log"=>"another line...
"}]
```
