# Amazon S3

![AWS logo](https://913035255-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fs0q292yqb4jDBrN2Li9x%2Fuploads%2Fgit-blob-cb97215e83d07d55b4ad66dcfdd9b0e83eb3a9d8%2Fimage%20\(9\).png?alt=media)

The *Amazon S3* output plugin lets you ingest records into the [S3](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html) cloud object store.

The plugin can upload data to S3 using the [multipart upload API](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html) or [`PutObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html). Multipart is the default and is recommended. Fluent Bit will stream data in a series of *parts*. This limits the amount of data buffered on disk at any point in time. By default, every time 5 MiB of data have been received, a new part will be uploaded. The plugin can create files up to gigabytes in size from many small chunks or parts using the multipart API. All aspects of the upload process are configurable.

The plugin lets you specify a maximum file size, and a timeout for uploads. A file will be created in S3 when the maximum size or the timeout is reached, whichever comes first.

Records are stored in files in S3 as newline delimited JSON.

See [AWS Credentials](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md) for details about fetching AWS credentials.

{% hint style="warning" %}
The [Prometheus success/retry/error metrics values](https://docs.fluentbit.io/manual/4.1/administration/monitoring) output by the built-in HTTP server in Fluent Bit are meaningless for S3 output. S3 has its own buffering and retry mechanisms. The Fluent Bit AWS S3 maintainers acknowledge this feature gap, and you can [track issue progress on GitHub](https://github.com/fluent/fluent-bit/issues/6141).
{% endhint %}

## Configuration parameters

| Key                                   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | Default                                   |
| ------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------- |
| `alias`                               | Sets an alias, use for multiple instances of the same output plugin.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | *none*                                    |
| `authorization_endpoint_bearer_token` | Authorization endpoint bearer token.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | *none*                                    |
| `authorization_endpoint_password`     | Authorization endpoint basic authentication password.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | *none*                                    |
| `authorization_endpoint_url`          | Authorization endpoint URL.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | *none*                                    |
| `authorization_endpoint_username`     | Authorization endpoint basic authentication username.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | *none*                                    |
| `auto_retry_requests`                 | Immediately retry failed requests to AWS services once. This option doesn't affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which can help improve throughput during transient network issues.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | `true`                                    |
| `blob_database_file`                  | Absolute path to a database file to be used to store blob files contexts.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | *none*                                    |
| `bucket`                              | S3 Bucket name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | *none*                                    |
| `canned_acl`                          | [Predefined Canned ACL policy](https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl) for S3 objects.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | *none*                                    |
| `compression`                         | Compression type for S3 objects. `gzip`, `arrow`, `parquet` and `zstd` are the supported values, `arrow` and `parquet` are only available if Apache Arrow was enabled at compile time. Defaults to no compression.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | *none*                                    |
| `content_type`                        | A standard MIME type for the S3 object, set as the Content-Type HTTP header.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | *none*                                    |
| `endpoint`                            | Custom endpoint for the S3 API. Endpoints can contain scheme and port.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | *none*                                    |
| `external_id`                         | Specify an external ID for the STS API. Can be used with the `role_arn` parameter if your role requires an external ID.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | *none*                                    |
| `file_delivery_attempt_limit`         | File delivery attempt limit.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | `1`                                       |
| `host`                                | IP address or hostname of the target HTTP server.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `127.0.0.1`                               |
| `json_date_format`                    | Specify the format of the date. Accepted values: `double`, `epoch`, `epoch_ms`, `iso8601` (2018-05-30T09:39:52.000681Z), `_java_sql_timestamp_` (2018-05-30 09:39:52.000681).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | *none*                                    |
| `json_date_key`                       | Specify the name of the date key in the output record. To disable the time key, set the value to `false`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | `date`                                    |
| `log_level`                           | Specifies the log level for output plugin. If not set here, plugin uses global log level in `service` section.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | `info`                                    |
| `log_supress_interval`                | Suppresses log messages from output plugin that appear similar within a specified time interval. `0` no suppression.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `0`                                       |
| `match`                               | Set a tag pattern to match records that output should process. Exact matches or wildcards (for example `*`).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | *none*                                    |
| `match_regex`                         | Set a regular expression to match tags for output routing. This allows more flexible matching compared to wildcards.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | *none*                                    |
| `net.connect_timeout`                 | Set maximum time allowed to establish a connection, this time includes the TLS handshake.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | `10s`                                     |
| `net.connect_timeout_log_error`       | On connection timeout, specify if it should log an error. When disabled, the timeout is logged as a debug message.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | `true`                                    |
| `net.keepalive_max_recycle`           | Set maximum number of times a keepalive connection can be used before it retries.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `2000`                                    |
| `net.dns.mode`                        | Select the primary DNS connection type (TCP or UDP).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | *none*                                    |
| `net.dns.prefer_ipv4`                 | Select the primary DNS resolver type (LEGACY or ASYNC).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | *none*                                    |
| `net.dns.prefer_ipv6`                 | Prioritize IPv6 DNS results when trying to establish a connection.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | *none*                                    |
| `net.io_timeout`                      | Set maximum time a connection can stay idle while assigned.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | `0s`                                      |
| `net.max_worker_connections`          | Set the maximum number of active TCP connections that can be used per worker thread.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `0`                                       |
| `net.proxy_env_ignore`                | Ignore the environment variables `HTTP_PROXY`, `HTTPS_PROXY` and `NO_PROXY` when set.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | `false`                                   |
| `net.source_address`                  | Specify network address to bind for data traffic.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | *none*                                    |
| `net.tcp_keepalive`                   | Enable or disable Keepalive support.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `off`                                     |
| `net.tcp_keepalive_time`              | Interval between the last data packet sent and the first TCP keepalive probe.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | `-1`                                      |
| `net.tcp_keepalive_interval`          | Interval between TCP keepalive probes when no response is received on a `keepidle` probe.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | `-1`                                      |
| `net.tcp_keepalive_probes`            | Number of unacknowledged probes to consider a connection dead.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | `-1`                                      |
| `log_key`                             | By default, the whole log record will be sent to S3. When specifying a key name with this option, only the value of that key sends to S3. For example, when using Docker you can specify `log_key log` and only the log message sends to S3.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | *none*                                    |
| `part_delivery_attempt_limit`         | File part delivery attempt limit.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `1`                                       |
| `part_size`                           | Size of each part when uploading blob files.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | `25M`                                     |
| `port`                                | TCP port of the target HTTP server.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | `80`                                      |
| `preserve_data_ordering`              | When an upload request fails, the last received chunk might swap with a later chunk, resulting in data shuffling. This feature prevents shuffling by using a queue logic for uploads.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | `true`                                    |
| `profile`                             | Option to specify an AWS Profile for credentials.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | *none*                                    |
| `region`                              | The AWS region of your S3 bucket.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `us-east-1`                               |
| `retry_limit`                         | Set retry limit for output plugin when delivery fails. Integer, `no_limits`, `false`, or `off` to disable, or `no_retries` to disable retries entirely.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | `1`                                       |
| `role_arn`                            | ARN of an IAM role to assume (for example, for cross account access.)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | *none*                                    |
| `s3_key_format`                       | Format string for keys in S3. This option supports a UUID, strftime time formatters, a syntax for selecting parts of the Fluent log tag using a syntax inspired by the `rewrite_tag` filter. Add `$UUID` in the format string to insert a random string. Add `$INDEX` in the format string to insert an integer that increments each upload. The `$INDEX` value saves in the `store_dir`. Add `$TAG` in the format string to insert the full log tag. Add `$TAG[0]` to insert the first part of the tag in the S3 key. The tag is split into parts using the characters specified with the `s3_key_format_tag_delimiters` option. Add the extension directly after the last piece of the format string to insert a key suffix. To specify a key suffix in `use_put_object` mode, you must specify `$UUID`. See [S3 Key Format](#s3-key-format-and-tag-delimiters). Time in `s3_key` is the timestamp of the first record in the S3 file. | `/fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S` |
| `s3_key_format_tag_delimiters`        | A series of characters which will be used to split the tag into `parts` for use with the s3\_key\_format option.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | `.`                                       |
| `send_content_md5`                    | Send the Content-MD5 header with `PutObject` and UploadPart requests, as is required when Object Lock is enabled.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `false`                                   |
| `static_file_path`                    | Disables behavior where UUID string appends to the end of the S3 key name when `$UUID` isn't provided in `s3_key_format`. `$UUID`, time formatters, `$TAG`, and other dynamic key formatters all work as expected while this feature is set to true.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `false`                                   |
| `store_dir`                           | Directory to locally buffer data before sending. Plugin uses the S3 Multipart upload API to send data in chunks of 5 MB at a time.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | `/tmp/fluent-bit/s3`                      |
| `store_dir_limit_size`                | S3 plugin has its own buffering system with files in the `store_dir`. Use the `store_dir_limit_size` to limit the amount of data S3 buffers in the `store_dir` to limit disk usage. If the limit is reached, data will be discarded. Default is 0 which means unlimited.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | `0`                                       |
| `storage_class`                       | Specify the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#AmazonS3-PutObject-request-header-StorageClass) for S3 objects. If this option isn't specified, objects store with the default `STANDARD` storage class.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | *none*                                    |
| `sts_endpoint`                        | Custom endpoint for the STS API.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | *none*                                    |
| `tls`                                 | Enable or disable TLS/SSL support.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | `off`                                     |
| `tls.ca_file`                         | Absolute path to CA certificate file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | *none*                                    |
| `tls.ca_path`                         | Absolute path to scan for certificate files.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | *none*                                    |
| `tls.crt_file`                        | Absolute path to Certificate file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | *none*                                    |
| `tls.ciphers`                         | Specify TLS ciphers up to TLSv1.2.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | *none*                                    |
| `tls.debug`                           | Set TLS debug level. Accepts `0` (No debug), `1`(Error), `2` (State change), `3` (Informational) and `4` (Verbose).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | `1`                                       |
| `tls.key_file`                        | Absolute path to private Key file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | *none*                                    |
| `tls.key_passwd`                      | Optional password for tls.key\_file file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | *none*                                    |
| `tls.max_version`                     | Specify the maximum version of TLS.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | *none*                                    |
| `tls.min_version`                     | Specify the minimum version of TLS.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | *none*                                    |
| `tls.verify_hostname`                 | Enable or disable to verify hostname.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | `off`                                     |
| `tls.vhost`                           | Hostname to be used for TLS SNI extension.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | *none*                                    |
| `tls.verify`                          | Force certificate validation.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | `on`                                      |
| `tls.windows.certstore_name`          | Sets the `certstore` name on an output (Windows).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | *none*                                    |
| `tls.windows.use_enterprise_store`    | Sets whether using enterprise `certstore` or not on an output (Windows).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | *none*                                    |
| `total_file_size`                     | Specify file size in S3. Minimum size is `1M`. With `use_put_object On` the maximum size is `1G`. With multipart uploads, the maximum size is `50G`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `100000000`                               |
| `upload_chunk_size`                   | The size of each part for multipart uploads. Default: 5M, Max: 50M, Min: 5M.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | `5242880`                                 |
| `upload_part_freshness_timeout`       | Maximum lifespan of an uncommitted file part.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | `6D`                                      |
| `upload_parts_timeout`                | Timeout to upload parts of a blob file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | `10m`                                     |
| `upload_timeout`                      | When this amount of time elapses, Fluent Bit uploads and creates a new file in S3. Set to `60m` to upload a new file every hour.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | `10m`                                     |
| `use_put_object`                      | Use the S3 `PutObject` API instead of the multipart upload API. When enabled, the key extension is only available when `$UUID` is specified in `s3_key_format`. If `$UUID` isn't included, a random string appends format string and the key extension can't be customized.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | `false`                                   |

## TLS / SSL

To skip TLS verification, set `tls.verify` as `false`. For more details about the properties available and general configuration, refer to [TLS/SSL](https://docs.fluentbit.io/manual/4.1/administration/transport-security).

## Permissions

The plugin requires the following AWS IAM permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:PutObject"
    ],
    "Resource": "*"
  }]
}
```

## Differences between S3 and other Fluent Bit outputs

The S3 output plugin is used to upload large files to an Amazon S3 bucket, while most other outputs which send many requests to upload data in batches of a few megabytes or less.

When Fluent Bit receives logs, it stores them in chunks, either in memory or the filesystem depending on your settings. Chunks are usually around 2 MB in size. Fluent Bit sends chunks, in order, to each output that matches their tag. Most outputs then send the chunk immediately to their destination. A chunk is sent to the output's `flush` callback function, which must return one of `FLB_OK`, `FLB_RETRY`, or `FLB_ERROR`. Fluent Bit keeps count of the return values from each output's `flush` callback function. These counters are the data source for Fluent Bit error, retry, and success metrics available in Prometheus format through its monitoring interface.

The S3 output plugin conforms to the Fluent Bit output plugin specification. Because S3's use case is to upload large files (over 2 MB), its behavior is different. S3's `flush` callback function buffers the incoming chunk to the filesystem, and returns an `FLB_OK`. This means Prometheus metrics available from the Fluent Bit HTTP server are meaningless for S3. In addition, the `storage.total_limit_size` parameter isn't meaningful for S3 since it has its own buffering system in the `store_dir`. Instead, use `store_dir_limit_size`. S3 requires a writeable filesystem. Running Fluent Bit on a read-only filesystem won't work with the S3 output.

S3 uploads primarily initiate using the S3 [`timer`](https://docs.aws.amazon.com/iotevents/latest/apireference/API_iotevents-data_Timer.html) callback function, which runs separately from its `flush`.

S3 has its own buffering system and its own callback to upload data, so the normal sequential data ordering of chunks provided by the Fluent Bit engine can be compromised. S3 has the `presevere_data_ordering` option which ensures data is uploaded in the original order it was collected by Fluent Bit.

### Summary: uniqueness in the Amazon S3 plugin

* The HTTP Monitoring interface output metrics aren't meaningful for S3. AWS understands that this is non-ideal. See the [open issue and design](https://github.com/fluent/fluent-bit/issues/6141) to allow S3 to manage its own output metrics.
* You must use `store_dir_limit_size` to limit the space on disk used by S3 buffer files.
* The original ordering of data inputted to Fluent Bit might not be preserved unless you enable `preserve_data_ordering On`.

## S3 key format and tag delimiters

In Fluent Bit, all logs have an associated tag. The `s3_key_format` option lets you inject the tag into the S3 key using the following syntax:

* `$TAG`: The full tag.
* `$TAG[n]`: The nth part of the tag (index starting at zero). This syntax is copied from the rewrite tag filter. By default, tag parts are separated with dots, but you can change this with `s3_key_format_tag_delimiters`.

In the following example, assume the date is `January 1st, 2020 00:00:00` and the tag associated with the logs in question is `my_app_name-logs.prod`:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: my-bucket
      region: us-west-2
      total_file_size: 250M
      s3_key_format: '/$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S/$UUID.gz'
      s3_key_format_tag_delimiters: '.-'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name                         s3
  Match                        *
  bucket                       my-bucket
  region                       us-west-2
  total_file_size              250M
  s3_key_format                /$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S/$UUID.gz
  s3_key_format_tag_delimiters .-
```

{% endtab %}
{% endtabs %}

With the delimiters as `.` and `-`, the tag splits into parts as follows:

* `$TAG[0]` = `my_app_name`
* `$TAG[1]` = `logs`
* `$TAG[2]` = `prod`

The key in S3 will be `/prod/my_app_name/2020/01/01/00/00/00/bgdHN1NM.gz`.

### Allowing a file extension in the Amazon S3 key format with `$UUID`

The Fluent Bit S3 output was designed to ensure that previous uploads will never be overwritten by a subsequent upload. The `s3_key_format` supports time formatters, `$UUID`, and `$INDEX`. `$INDEX` is special because it's saved in the `store_dir`. If you restart Fluent Bit with the same disk, it can continue incrementing the index from its last value in the previous run.

For files uploaded with the `PutObject` API, the S3 output requires that a unique random string be present in the S3 key. Many of the use cases for `PutObject` uploads involve a short time period between uploads, so a timestamp in the S3 key might not be unique enough between uploads. For example, if you only specify minute granularity timestamps in the S3 key, with a small upload size, it's possible to have two uploads that have timestamps set in the same minute. This requirement can be disabled with `static_file_path On`.

The `PutObject` API is used in these cases:

* When you explicitly set `use_put_object On`.
* On startup when the S3 output finds old buffer files in the `store_dir` from a previous run and attempts to send all of them at once.
* On shutdown. To prevent data loss the S3 output attempts to send all currently buffered data at once.

You should always specify `$UUID` somewhere in your S3 key format. Otherwise, if the `PutObject` API is used, S3 appends a random eight-character UUID to the end of your S3 key. This means that a file extension set at the end of an S3 key will have the random UUID appended to it. Disable this with `static_file_path On`.

This example attempts to set a `.gz` extension without specifying `$UUID`:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: my-bucket
      region: us-west-2
      total_file_size: 50M
      use_put_object: off
      compression: gzip
      s3_key_format: '/$TAG/%Y/%m/%d/%H_%M_%S.gz'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name              s3
  Match             *
  bucket            my-bucket
  region            us-west-2
  total_file_size   50M
  use_put_object    Off
  compression       gzip
  s3_key_format     /$TAG/%Y/%m/%d/%H_%M_%S.gz
```

{% endtab %}
{% endtabs %}

In the case where pending data is uploaded on shutdown, if the tag was `app`, the S3 key in the S3 bucket might be:

```
/app/2022/12/25/00_00_00.gz-apwgylqg
```

The S3 output appended a random string to the file extension, since this upload on shutdown used the `PutObject` API.

To disable this behavior, use one of the following methods:

* Use `static_file_path`:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: my-bucket
      region: us-west-2
      total_file_size: 50M
      use_put_object: off
      compression: gzip
      s3_key_format: '/$TAG/%Y/%m/%d/%H_%M_%S.gz'
      static_file_path: on
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name               s3
  Match              *
  bucket             my-bucket
  region             us-west-2
  total_file_size    50M
  use_put_object     Off
  compression        gzip
  s3_key_format      /$TAG/%Y/%m/%d/%H_%M_%S.gz
  static_file_path   On
```

{% endtab %}
{% endtabs %}

* Explicitly define where the random UUID will go in the S3 key format:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: my-bucket
      region: us-west-2
      total_file_size: 50M
      use_put_object: off
      compression: gzip
      s3_key_format: '/$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz'
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name                 s3
  Match                *
  bucket               my-bucket
  region               us-west-2
  total_file_size      50M
  use_put_object       Off
  compression          gzip
  s3_key_format        /$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz
```

{% endtab %}
{% endtabs %}

## Reliability

The `store_dir` is used to temporarily store data before upload. If Fluent Bit stops suddenly, it will try to send all data and complete all uploads before it shuts down. If it can not send some data, on restart it will look in the `store_dir` for existing data and try to send it.

Multipart uploads are ideal for most use cases because they allow the plugin to upload data in small chunks over time. For example, 1 GB file can be created from 200 5 MB chunks. Although the file size in S3 will be 1 GB, only 5 MB will be buffered on disk at any one point in time.

One drawback to multipart uploads is that the file and data aren't visible in S3 until the upload is completed with a [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) call. The plugin attempts to make this call whenever Fluent Bit is shut down to ensure your data is available in S3. It also stores metadata about each upload in the `store_dir`, ensuring that uploads can be completed when Fluent Bit restarts (assuming it has access to persistent disk and the `store_dir` files will still be present on restart).

### Using S3 without persisted disk

If you run Fluent Bit in an environment without persistent disk, or without the ability to restart Fluent Bit and give it access to the data stored in the `store_dir` from previous executions, some considerations apply. This might occur if you run Fluent Bit on [AWS Fargate](https://aws.amazon.com/fargate/).

In these situations, Fluent Bits recommend using the `PutObject` API and sending data frequently, to avoid local buffering as much as possible. This will limit data loss in the event Fluent Bit is killed unexpectedly. The following settings are recommended for this use case:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: your-bucket
      region: us-east-1
      total_file_size: 1M
      upload_timeout: 1m
      use_put_object: on
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name               s3
  Match              *
  bucket             your-bucket
  region             us-east-1
  total_file_size    1M
  upload_timeout     1m
  use_put_object     On
```

{% endtab %}
{% endtabs %}

## S3 multipart uploads

With `use_put_object Off` (default), S3 will attempt to send files using multipart uploads. For each file, S3 first calls [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html), then a series of calls to [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) for each fragment (targeted to be `upload_chunk_size` bytes), and finally [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) to create the final file in S3.

### Fallback to `PutObject`

S3 [requires](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html) each [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) fragment to be at least 5,242,880 bytes, otherwise the upload is rejected.

The S3 output must sometimes fall back to the [`PutObject` API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_%60PutObject%60.html).

Uploads are triggered by these settings:

* `total_file_size` and `upload_chunk_size`: When S3 has buffered data in the `store_dir` that meets the desired `total_file_size` (for `use_put_object On`) or the `upload_chunk_size` (for Multipart), it will trigger an upload operation.
* `upload_timeout`: Whenever locally buffered data has been present on the filesystem in the `store_dir` longer than the configured `upload_timeout`, it will be sent even when the desired byte size hasn't been reached. If you configure a small `upload_timeout`, your files can be smaller than the `total_file_size`. The timeout is evaluated against the time at which S3 started buffering data for each unique tag (the time when new data was buffered for the unique tag after the last upload). The timeout is also evaluated against the [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) time, so a multipart upload will be completed after `upload_timeout` has elapsed, even if the desired size hasn't yet been reached.

If your `upload_timeout` triggers an upload before the pending buffered data reaches the `upload_chunk_size`, it might be too small for a multipart upload. S3 will fallback to use the [`PutObject` API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).

When you enable compression, S3 applies the compression algorithm at send time. The size settings trigger uploads based on the size of buffered data, not the final compressed size. It's possible that after compression, buffered data no longer meets the required minimum S3 [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) size. If this occurs, you will see a log message like:

```
[ info] [output:s3:s3.0] Pre-compression upload_chunk_size= 5630650, After compression, chunk is only 1063320 bytes, the chunk was too small, using PutObject to upload
```

If you encounter this frequently, use the numbers in the messages to guess your compression factor. In this example, the buffered data was reduced from 5,630,650 bytes to 1,063,320 bytes. The compressed size is one-fifth the actual data size. Configuring `upload_chunk_size 30M` should ensure each part is large enough after compression to be over the minimum required part size of 5,242,880 bytes.

The S3 API allows the last part in an upload to be less than the 5,242,880 byte minimum. If a part is too small for an existing upload, the S3 output will upload that part and then complete the upload.

### `upload_timeout` constrains total multipart upload time for a single file

The `upload_timeout` evaluated against the [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) time. A multipart upload will be completed after `upload_timeout` elapses, even if the desired size hasn't yet been reached.

### Completing uploads

When [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) is called, an `UploadID` is returned. S3 stores these IDs for active uploads in the `store_dir`. Until [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) is called, the uploaded data isn't visible in S3.

On shutdown, S3 output attempts to complete all pending uploads. If an upload fails to complete, the ID remains buffered in the `store_dir` in a directory called `multipart_upload_metadata`. If you restart the S3 output with the same `store_dir` it will discover the old UploadIDs and complete the pending uploads. The [S3 documentation](https://aws.amazon.com/blogs/aws-cloud-financial-management/discovering-and-deleting-incomplete-multipart-uploads-to-lower-amazon-s3-costs/) has suggestions on discovering and deleting or completing dangling uploads in your buckets.

## Usage with MinIO

[MinIO](https://min.io/) is a high-performance, S3 compatible object storage, and you can build your app with S3 capability without S3.

The following example runs a MinIO server at `localhost:9000`, and creates a bucket of `your-bucket`.

Example:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: your-bucket
      endpoint: http://localhost:9000
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name     s3
  Match    *
  bucket   your-bucket
  endpoint http://localhost:9000
```

{% endtab %}
{% endtabs %}

The records store in the MinIO server.

## Usage with Google Cloud

You can send your S3 output to Google. You must generate HMAC keys on GCS and use those keys for `access-key` and `access-secret`.

Example:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: your-bucket
      endpoint: https://storage.googleapis.com
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name     s3
  Match    *
  bucket   your-bucket
  endpoint https://storage.googleapis.com
```

{% endtab %}
{% endtabs %}

## Get started

To send records into Amazon S3, run the plugin from the command line or through the configuration file.

### Command line

The S3 plugin reads parameters from the command line through the `-p` argument:

```shell
fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -p -m '*' -f 1
```

### Configuration file

In your main configuration file append the following section:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: your-bucket
      region: us-east-1
      store_dir: /home/ec2-user/buffer
      total_file_size: 50M
      upload_timeout: 10m
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name            s3
  Match           *
  bucket          your-bucket
  region          us-east-1
  store_dir       /home/ec2-user/buffer
  total_file_size 50M
  upload_timeout  10m
```

{% endtab %}
{% endtabs %}

An example using `PutObject` instead of multipart:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:

  outputs:
    - name: s3
      match: '*'
      bucket: your-bucket
      region: us-east-1
      store_dir: /home/ec2-user/buffer
      use_put_object: on
      total_file_size: 10M
      upload_timeout: 10m
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[OUTPUT]
  Name            s3
  Match           *
  bucket          your-bucket
  region          us-east-1
  store_dir       /home/ec2-user/buffer
  use_put_object  On
  total_file_size 10M
  upload_timeout  10m
```

{% endtab %}
{% endtabs %}

## AWS for Fluent Bit

Amazon distributes a container image with Fluent Bit and plugins.

### GitHub

[github.com/aws/aws-for-fluent-bit](https://github.com/aws/aws-for-fluent-bit)

### Amazon ECR Public Gallery

Images are available in the Amazon ECR Public Gallery as [aws-for-fluent-bit](https://gallery.ecr.aws/aws-observability/aws-for-fluent-bit).

You can download images with different tags using the following command:

```shell
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:<tag>
```

For example, you can pull the image with latest version with:

```shell
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest
```

If you see errors for image pull limits, try signing in to public ECR with your AWS credentials:

```shell
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
```

See the [Amazon ECR Public official documentation](https://docs.aws.amazon.com/AmazonECR/latest/public/get-set-up-for-amazon-ecr.html) for more details.

### Docker Hub

[amazon/aws-for-fluent-bit](https://hub.docker.com/r/amazon/aws-for-fluent-bit/tags) is also available from the Docker Hub.

### Amazon ECR

Use Fluent Bit SSM Public Parameters to find the Amazon ECR image URI in your region:

```shell
aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```

For more information, see the [AWS for Fluent Bit GitHub repository](https://github.com/aws/aws-for-fluent-bit#public-images).

## Advanced usage

### Use Apache Arrow for in-memory data processing

With Fluent Bit v1.8 or later, the Amazon S3 plugin includes the support for [Apache Arrow](https://arrow.apache.org/). Support isn't enabled by default, and has a dependency on a shared version of `libarrow`.

To use this feature, `FLB_ARROW` must be turned on at compile time. Use the following commands:

```shell
cd build/
cmake -DFLB_ARROW=On ..
cmake --build .
```

After being compiled, Fluent Bit can upload incoming data to S3 in Apache Arrow format.

For example:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
  inputs:
    - name: cpu

  outputs:
    - name: s3
      bucket: your-bucket-name
      total_file_size: 1M
      use_put_object: on
      upload_timeout: 60s
      compression: arrow
```

{% endtab %}

{% tab title="fluent-bit.conf" %}

```
[INPUT]
  Name cpu

[OUTPUT]
  Name s3
  Bucket your-bucket-name
  total_file_size 1M
  use_put_object On
  upload_timeout 60s
  Compression arrow
```

{% endtab %}
{% endtabs %}

Setting `Compression` to `arrow` makes Fluent Bit convert payload into Apache Arrow format.

Load, analyze, and process stored data using popular data processing tools such as Python pandas, Apache Spark and Tensorflow.

The following example uses `pyarrow` to analyze the uploaded data:

```
>>> import pyarrow.feather as feather
>>> import pyarrow.fs as fs
>>>
>>> s3 = fs.S3FileSystem()
>>> file = s3.open_input_file("my-bucket/fluent-bit-logs/cpu.0/2021/04/27/09/36/15-object969o67ZF")
>>> df = feather.read_feather(file)
>>> print(df.head())
   date  cpu_p  user_p  system_p  cpu0.p_cpu  cpu0.p_user  cpu0.p_system
0  2021-04-27T09:33:53.539346Z    1.0     1.0       0.0         1.0          1.0            0.0
1  2021-04-27T09:33:54.539330Z    0.0     0.0       0.0         0.0          0.0            0.0
2  2021-04-27T09:33:55.539305Z    1.0     0.0       1.0         1.0          0.0            1.0
3  2021-04-27T09:33:56.539430Z    0.0     0.0       0.0         0.0          0.0            0.0
4  2021-04-27T09:33:57.539803Z    0.0     0.0       0.0         0.0          0.0            0.0
```
