AWS S3 Rehydration Receiver
Rehydrates OTLP data from AWS S3 that was stored using the AWS S3 Exporter (awss3exporter).
Important Note
This is not a traditional receiver that continually produces data. Instead, it rehydrates all objects found within a specified time range and stops producing data once every object in that range has been rehydrated. To prevent unnecessary API calls, the receiver stops polling for new objects after it detects three consecutive empty polls.
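The stop condition in the note above can be sketched as a small loop. This is only an illustration, not the receiver's implementation: `run_polls` and the `poll_once` callback are hypothetical names, and the sketch assumes that any non-empty poll resets the count of empty polls. The wait of `poll_interval` between polls is left out.

```python
from typing import Callable, List

MAX_EMPTY_POLLS = 3  # "three consecutive empty polls" from the note above


def run_polls(poll_once: Callable[[], List[str]]) -> int:
    """Call poll_once until three polls in a row return nothing.

    poll_once is a hypothetical callback that returns the object keys
    rehydrated by a single poll. Returns the total number of polls made.
    """
    empty_streak = 0
    polls = 0
    while empty_streak < MAX_EMPTY_POLLS:
        keys = poll_once()
        polls += 1
        if keys:
            empty_streak = 0  # assumption: a non-empty poll resets the streak
        else:
            empty_streak += 1
    return polls
```

For example, a sequence of polls that returns data, nothing, data, and then nothing three times ends after the sixth poll.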
Minimum Agent Versions
Supported Pipelines
How it works
1. The receiver polls S3 for all objects in the specified bucket.
2. The receiver will parse each object's path to determine if it matches a path created by the AWS S3 Exporter.
3. If the object path is from the exporter, the receiver will parse the timestamp represented by the path.
4. If the timestamp is within the configured range the receiver will download the object and parse its contents into OTLP data.
   a. The receiver will process both uncompressed JSON objects and objects compressed with gzip.
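The path matching, time-range check, and decoding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the receiver's actual code: the key layout is taken from the example paths in this document, both ends of the range are treated as inclusive, and gzip is detected by its magic bytes. The names `parse_key`, `in_range`, and `decode_body` are hypothetical.

```python
import gzip
import json
import re
from datetime import datetime, timezone
from typing import Optional, Tuple

# Assumed layout, based on the example paths in this document:
# year=YYYY/month=MM/day=DD/hour=HH/minute=mm/<type>_<id>...
KEY_PATTERN = re.compile(
    r"^year=(\d{4})/month=(\d{2})/day=(\d{2})/hour=(\d{2})/minute=(\d{2})/"
    r"(metrics|logs|traces)_[^/]+$"
)


def parse_key(key: str) -> Optional[Tuple[datetime, str]]:
    """Return (UTC timestamp, telemetry type), or None if the key does not match."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    year, month, day, hour, minute = (int(g) for g in match.groups()[:5])
    try:
        ts = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    except ValueError:  # digits matched but are not a real date, e.g. month=13
        return None
    return ts, match.group(6)


def in_range(ts: datetime, start: datetime, end: datetime) -> bool:
    """Assumption: both ends of the configured range are inclusive."""
    return start <= ts <= end


def decode_body(body: bytes) -> dict:
    """Parse a JSON object body, decompressing first if it looks like gzip."""
    if body[:2] == b"\x1f\x8b":  # gzip magic bytes (detection method is an assumption)
        body = gzip.decompress(body)
    return json.loads(body)
```

Keys that do not match the pattern return `None` and would simply be skipped, which mirrors the second step of the list.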
Configuration
| Field | Type | Default | Required | Description |
|---|---|---|---|---|
| region | string | | true | The AWS Region of the bucket to rehydrate from. |
| s3_bucket | string | | true | The name of the bucket to rehydrate from. |
| s3_prefix | string | | false | The prefix for the S3 key (root directory inside bucket). Should match the s3_prefix value of the AWS S3 Exporter. |
| starting_time | string | | true | The UTC start time that represents the start of the time range to rehydrate from. Must be in the form YYYY-MM-DDTHH:MM. |
| ending_time | string | | true | The UTC end time that represents the end of the time range to rehydrate from. Must be in the form YYYY-MM-DDTHH:MM. |
| delete_on_read | bool | false | false | If true, the object will be deleted after being rehydrated. |
| role_arn | string | | false | The Role ARN to be assumed. If specified, this will be used over credentials. |
| poll_interval | string | 1m | false | How often to read a new set of objects. This value mainly controls how often the object API is called, so that the receiver isn't making too many API calls once rehydration is done. |
| poll_timeout | string | 30s | false | The timeout used when reading objects from AWS. |
| storage | string | | false | The component ID of a storage extension. The storage extension prevents duplication of data after a collector restart by remembering which objects were previously rehydrated. |
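As an illustration of the YYYY-MM-DDTHH:MM form required by starting_time and ending_time, the sketch below parses both values as UTC. The helper names `parse_utc` and `validate_range` are hypothetical, and the rule that ending_time must be later than starting_time is an assumption rather than something this document states.

```python
from datetime import datetime, timezone
from typing import Tuple

TIME_FORMAT = "%Y-%m-%dT%H:%M"  # corresponds to YYYY-MM-DDTHH:MM


def parse_utc(value: str) -> datetime:
    """Parse a configured time and mark it as UTC; raises ValueError on mismatch."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def validate_range(starting_time: str, ending_time: str) -> Tuple[datetime, datetime]:
    """Parse both ends of the range and check their order (assumed rule)."""
    start = parse_utc(starting_time)
    end = parse_utc(ending_time)
    if end <= start:
        raise ValueError("ending_time must be after starting_time")
    return start, end
```

A value that carries seconds, such as 2023-10-01T13:00:00, does not match this form and raises `ValueError` in the sketch.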
AWS Credential Configuration
Credentials are not configured in the receiver but rather in the environment.
Follow the guidelines for the credential configuration.
Example Configuration
Basic Configuration
This configuration specifies a region, s3_bucket, starting_time, and ending_time.
This will rehydrate all objects in the bucket my-bucket whose paths indicate they were created between 1:00 PM and 2:30 PM UTC on October 1, 2023.
Such a path could look like the following:
year=2023/month=10/day=01/hour=13/minute=30/metrics_12345.json
year=2023/month=10/day=01/hour=13/minute=30/logs_12345.json
year=2023/month=10/day=01/hour=13/minute=30/traces_12345.json
awss3rehydration:
  region: "us-east-2"
  s3_bucket: "my-bucket"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
Using Storage Extension Configuration
This configuration uses a storage extension to track rehydration progress across agent restarts. The storage field is set to the component ID of the storage extension.
extensions:
  file_storage:
    directory: $OIQ_OTEL_COLLECTOR_HOME/storage
receivers:
  awss3rehydration:
    region: "us-east-2"
    s3_bucket: "my-bucket"
    starting_time: 2023-10-01T13:00
    ending_time: 2023-10-01T14:30
    storage: "file_storage"
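To show why remembering rehydrated objects prevents duplicates after a restart, here is a rough sketch of the idea. It is not how the storage extension or the receiver stores its state: the `Checkpoint` class, its JSON file format, and its method names are all hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


class Checkpoint:
    """Remembers rehydrated object keys in a JSON file (hypothetical format)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.seen: Set[str] = set()
        if path.exists():
            # Reload the keys recorded before the restart.
            self.seen = set(json.loads(path.read_text()))

    def pending(self, keys: Iterable[str]) -> List[str]:
        """Return only the keys that have not been rehydrated yet."""
        return [k for k in keys if k not in self.seen]

    def mark_done(self, key: str) -> None:
        """Record a key as rehydrated and persist the full set."""
        self.seen.add(key)
        self.path.write_text(json.dumps(sorted(self.seen)))
```

A new `Checkpoint` created on the same file after a restart skips every key that was marked done earlier, so those objects are not emitted a second time.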
Root Folder Configuration
This configuration specifies an additional field, s3_prefix, to match the s3_prefix value of the AWS S3 Exporter.
The exporter's s3_prefix value prefixes each object path with the root folder, so the rehydration receiver must account for it.
Such a path could look like the following:
root/year=2023/month=10/day=01/hour=13/minute=30/metrics_12345.json
root/year=2023/month=10/day=01/hour=13/minute=30/logs_12345.json
root/year=2023/month=10/day=01/hour=13/minute=30/traces_12345.json
awss3rehydration:
  region: "us-east-2"
  s3_bucket: "my-bucket"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
  s3_prefix: "root"
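One way to picture how the prefix is accounted for is to strip it from each key before the timestamp portion of the path is examined. The sketch below is only an illustration. `strip_prefix` is a hypothetical helper, and treating the prefix as a whole folder name followed by a slash is an assumption.

```python
from typing import Optional


def strip_prefix(key: str, s3_prefix: str) -> Optional[str]:
    """Return the key without its root folder, or None if the prefix is absent."""
    if not s3_prefix:
        return key  # no prefix configured, the key is used as-is
    head = s3_prefix.rstrip("/") + "/"
    if not key.startswith(head):
        return None  # object lives outside the configured root folder
    return key[len(head):]
```

With s3_prefix set to root, the key root/year=2023/month=10/day=01/hour=13/minute=30/logs_12345.json is reduced to the same form as the paths in the basic configuration, while a key without that root folder is ignored.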
Delete on Read Configuration
This configuration enables the delete_on_read functionality, which deletes an object from AWS after it has been successfully rehydrated into OTLP data and sent on to the next component in the pipeline.
awss3rehydration:
  region: "us-east-2"
  s3_bucket: "my-bucket"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
  delete_on_read: true
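The ordering described for delete_on_read, where deletion happens only after the data has been sent to the next component, can be sketched like this. The function `rehydrate_object` and its `download`, `send`, and `delete` callbacks are hypothetical stand-ins, not the receiver's API.

```python
from typing import Callable


def rehydrate_object(
    key: str,
    download: Callable[[str], bytes],
    send: Callable[[bytes], None],
    delete: Callable[[str], None],
    delete_on_read: bool = False,
) -> bool:
    """Download and forward one object; delete it only after a successful send.

    Returns True if the object was sent, False if download or send failed.
    """
    try:
        send(download(key))
    except Exception:
        return False  # leave the object in place so it can be retried
    if delete_on_read:
        delete(key)
    return True
```

Because the delete call sits after the send, a failure earlier in the sequence leaves the object in the bucket.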