OpenSearch

Send logs to Amazon OpenSearch Service

The opensearch output plugin lets you ingest your records into an OpenSearch database. The following instructions assume that you have a fully operational OpenSearch service running in your environment.

Configuration Parameters

The index and type parameters can be confusing if you are new to OpenSearch. If you have used a common relational database before, they can be compared to the database and table concepts. Also see the FAQ below.

TLS / SSL

The OpenSearch output plugin supports TLS/SSL. For more details about the available properties and general configuration, refer to the TLS/SSL section.
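
As a minimal sketch of what enabling TLS on this output might look like, the snippet below uses the generic Fluent Bit tls, tls.verify and tls.ca_file properties. The host and the CA bundle path are placeholders, not values from this guide, so adjust them to your environment:

```text
[OUTPUT]
    Name        opensearch
    Match       *
    Host        192.168.2.3
    Port        9200
    Index       my_index
    # Encrypt the connection and verify the server certificate
    tls         On
    tls.verify  On
    # Hypothetical path: point this at your own CA bundle
    tls.ca_file /etc/ssl/certs/my-ca.pem
```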

write_operation

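The write_operation can be any of:

  • create (default): adds new data. If the data already exists (based on its ID), the operation is skipped.

  • index: adds new data, and replaces (reindexes) existing data based on its ID.

  • update: updates existing data based on its ID. If no data is found, the operation is skipped.

  • upsert: inserts the data if it does not exist, and updates it if it does (based on its ID).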

Note that Id_Key or Generate_ID is required for the update and upsert operations.
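
The requirement above could be met as in the following sketch, which pairs Write_Operation upsert with Id_Key. The field name request_id is a hypothetical example; use whichever key uniquely identifies a document in your records:

```text
[OUTPUT]
    Name            opensearch
    Match           *
    Host            192.168.2.3
    Port            9200
    Index           my_index
    Write_Operation upsert
    # Hypothetical field name: a record key that uniquely identifies the document
    Id_Key          request_id
```

If your records carry no suitable key, setting Generate_ID On instead of Id_Key is the other way to satisfy the requirement.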

Getting Started

In order to insert records into an OpenSearch service, you can run the plugin from the command line or through the configuration file:

Command Line

The opensearch plugin can read its parameters from the command line in two ways: through the -p argument (property), or by setting them directly through the service URI. The URI format is the following:

es://host:port/index/type

Using the format specified, you could start Fluent Bit through:

$ fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
    -o stdout -m '*'

which is similar to running:

$ fluent-bit -i cpu -t cpu -o opensearch -p Host=192.168.2.3 -p Port=9200 \
    -p Index=my_index -p Type=my_type -o stdout -m '*'

Configuration File

In your main configuration file, append the following Input and Output sections:

[INPUT]
    Name  cpu
    Tag   cpu

[OUTPUT]
    Name  opensearch
    Match *
    Host  192.168.2.3
    Port  9200
    Index my_index
    Type  my_type

About OpenSearch field names

Some input plugins may generate messages where the field names contain dots. The opensearch plugin replaces them with an underscore, e.g.:

{"cpu0.p_cpu"=>17.000000}

becomes

{"cpu0_p_cpu"=>17.000000}
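
If you want to state this behavior explicitly rather than rely on a default, the sketch below assumes that the Replace_Dots option controls the replacement in your Fluent Bit version. Check the parameter reference for your release before relying on it:

```text
[OUTPUT]
    Name         opensearch
    Match        *
    Host         192.168.2.3
    Port         9200
    Index        my_index
    # Assumption: Replace_Dots governs the dot-to-underscore replacement
    Replace_Dots On
```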

FAQ

Logstash_Prefix_Key

The following snippet demonstrates using the namespace name, as extracted by the kubernetes filter, as the Logstash prefix:

[OUTPUT]
    Name opensearch
    Match *
    # ...
    Logstash_Prefix logstash
    Logstash_Prefix_Key $kubernetes['namespace_name']
    # ...

For records that do not have the field kubernetes.namespace_name, the default prefix, logstash, is used.
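
For reference, a fuller version of that output might look like the sketch below. It assumes the kubernetes filter has already enriched the records, and it enables Logstash_Format, since the Logstash prefix options only apply to Logstash-style, date-suffixed index names. The host and port are placeholders:

```text
[OUTPUT]
    Name                opensearch
    Match               *
    Host                192.168.2.3
    Port                9200
    # Logstash-style index naming must be on for the prefix options to apply
    Logstash_Format     On
    # Fallback prefix for records without the referenced key
    Logstash_Prefix     logstash
    Logstash_Prefix_Key $kubernetes['namespace_name']
```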

Fluent Bit + Amazon OpenSearch Service

The Amazon OpenSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. This plugin supports Amazon OpenSearch Service with IAM Authentication.

See the Fluent Bit AWS credentials documentation for details on how AWS credentials are fetched.

Example configuration:

[OUTPUT]
    Name  opensearch
    Match *
    Host  vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
    Port  443
    Index my_index
    Type  my_type
    AWS_Auth On
    AWS_Region us-west-2
    tls     On

Notice that the Port is set to 443, tls is enabled, and AWS_Region is set.

Action/metadata contains an unknown parameter type

As with Elastic Cloud, OpenSearch version 2.0 and above requires the type option to be removed, which you do by setting Suppress_Type_Name On.

Without this you will see errors like:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"}],"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"},"status":400}
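
A minimal sketch of an output section with the type suppressed, reusing the placeholder host and index from the earlier examples:

```text
[OUTPUT]
    Name               opensearch
    Match              *
    Host               192.168.2.3
    Port               9200
    Index              my_index
    # Do not send the _type parameter in bulk requests
    Suppress_Type_Name On
```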

Fluent Bit + Amazon OpenSearch Serverless

Amazon OpenSearch Serverless is an offering that eliminates your need to manage OpenSearch clusters. All existing Fluent Bit OpenSearch output plugin options work with OpenSearch Serverless. For Fluent Bit, the only difference is that you must specify the service name as aoss (Amazon OpenSearch Serverless) when you enable AWS_Auth:

AWS_Auth On
AWS_Region <aws-region>
AWS_Service_Name aoss
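
Placed in a complete output section, those settings might look like the following sketch. The Host value is a placeholder for your collection endpoint, and the port and TLS settings are assumed to mirror the Amazon OpenSearch Service example earlier on this page:

```text
[OUTPUT]
    Name             opensearch
    Match            *
    # Placeholder: replace with your serverless collection endpoint
    Host             <collection-id>.<aws-region>.aoss.amazonaws.com
    Port             443
    Index            my_index
    AWS_Auth         On
    AWS_Region       <aws-region>
    AWS_Service_Name aoss
    tls              On
```

If requests are rejected because of an unknown _type parameter, the Suppress_Type_Name On setting described in the previous section may also be needed.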

Data Access Permissions

When sending logs to OpenSearch Serverless, your AWS IAM entity needs OpenSearch Serverless data access permissions. Give your IAM entity the following data access permissions to your serverless collection:

aoss:CreateIndex
aoss:UpdateIndex
aoss:WriteDocument

With data access permissions, IAM policies are not needed to access the collection.

Issues with the OpenSearch cluster

Occasionally the Fluent Bit service may generate errors without any additional detail in the logs to explain the source of the issue, even with the service's log_level attribute set to Debug.

For example, in this scenario the logs show that a connection was successfully established with the OpenSearch domain, and yet an error is still returned:

[2023/07/10 19:26:00] [debug] [http_client] not using http_proxy for header
[2023/07/10 19:26:00] [debug] [output:opensearch:opensearch.5] Signing request with AWS Sigv4
[2023/07/10 19:26:00] [debug] [aws_credentials] Requesting credentials from the EC2 provider..
[2023/07/10 19:26:00] [debug] [output:opensearch:opensearch.5] HTTP Status=200 URI=/_bulk
[2023/07/10 19:26:00] [debug] [upstream] KA connection #137 to [MY_OPENSEARCH_DOMAIN]:443 is now available
[2023/07/10 19:26:00] [debug] [out flush] cb_destroy coro_id=1746
[2023/07/10 19:26:00] [debug] [task] task_id=2 reached retry-attempts limit 5/5
[2023/07/10 19:26:00] [error] [engine] chunk '7578-1689017013.184552017.flb' cannot be retried: task_id=2, input=tail.6 > output=opensearch.5
[2023/07/10 19:26:00] [debug] [task] destroy task=0x7fd1cc4d5ad0 (task_id=2)

This behavior could be indicative of a hard-to-detect issue with index shard usage in the OpenSearch domain.

While OpenSearch index shards and disk space are related, they are not directly tied to one another.

OpenSearch domains are limited to 1000 index shards per data node, regardless of the size of the nodes. And, importantly, shard usage is not proportional to disk usage: an individual index shard can hold anywhere from a few kilobytes to dozens of gigabytes of data.

In other words, depending on the way index creation and shard allocation are configured in the OpenSearch domain, all of the available index shards could be used long before the data nodes run out of disk space and begin exhibiting disk-related performance issues (e.g. nodes crashing, data corruption, or the dashboard going offline).

The primary issue that arises when a domain is out of available index shards is that new indexes can no longer be created (though logs can still be added to existing indexes).

When that happens, the Fluent Bit OpenSearch output may begin showing confusing behavior. For example:

  • Errors suddenly appear (outputs were previously working and there were no changes to the Fluent Bit configuration when the errors began)

  • Errors are not consistently occurring (some logs are still reaching the OpenSearch domain)

  • The Fluent Bit service logs show errors, but without any detail as to the root cause

If any of those symptoms are present, consider using the OpenSearch domain's API endpoints to troubleshoot possible shard issues.

Running this command will show both the shard count and disk usage on all of the nodes in the domain.

GET _cat/allocation?v

Index creation issues will begin to appear if any hot data nodes have around 1000 shards OR if the total number of shards spread across hot and ultrawarm data nodes in the cluster is greater than 1000 times the total number of nodes (e.g., in a cluster with 6 nodes, the maximum shard count would be 6000).

Alternatively, running this command to manually create a new index will return an explicit error related to shard count if the maximum has been exceeded.

PUT <index-name>

There are multiple ways to resolve excessive shard usage in an OpenSearch domain such as deleting or combining indexes, adding more data nodes to the cluster, or updating the domain's index creation and sharding strategy. Consult the OpenSearch documentation for more information on how to use these strategies.
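
Because the debug log shown earlier does not include the response body, one way to get more detail from the Fluent Bit side is the plugin's Trace_Error option, which prints the OpenSearch API request and response when an error is returned. The sketch below is intended for diagnostics only, and the host and index are placeholders:

```text
[OUTPUT]
    Name        opensearch
    Match       *
    Host        192.168.2.3
    Port        9200
    Index       my_index
    # Diagnostics only: print the API request and response on errors
    Trace_Error On
```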
