S3 File Source
Emits lines of flat files stored in S3-compatible object storage as CloudEvents.


About
The S3 File Source is a Kubernetes Job that is dynamically created by the Job Trigger Knative Service (ksvc).
Configuration
The following configuration options are available for the file source. These values can be set statically
as environment variables or passed as query parameters to the Job Trigger Knative Service (a sample Secret supplying them is sketched after this list).
- S3_URL
- S3_BUCKET
- S3_REGION
- S3_ACCESS_KEY
- S3_SECRET_KEY
- S3_FILE_NAME
- CHUNK_SIZE - Number of bytes to download from S3 in each request (Default - 50 MB)
- SINK_DUMP_COUNT - Number of lines to send to the sink in one request (Default - 100)
- SINK_RETRY_COUNT - Number of times to retry on failure (Default - 3)
- SINK_RETRY_INTERVAL - Seconds to wait before the next retry (Default - 1)
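
Since these options are plain environment variables, one way to supply them is through a Kubernetes Secret that the job consumes. Below is a minimal, hypothetical sketch: the secret name s3-access matches the one referenced by the sample service further down, the keys are assumed to map one-to-one onto the variable names above, and every value is a placeholder.

apiVersion: v1
kind: Secret
metadata:
  name: s3-access
  namespace: demo
type: Opaque
stringData:
  S3_URL: "https://s3.example.com"    # placeholder endpoint
  S3_BUCKET: "my-bucket"              # placeholder bucket name
  S3_REGION: "us-east-1"
  S3_ACCESS_KEY: "<access-key>"
  S3_SECRET_KEY: "<secret-key>"
  S3_FILE_NAME: "data/events.log"     # placeholder object key
  CHUNK_SIZE: "52428800"              # optional override, 50 MB in bytes
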
Example
Please check out the example page for how to use the S3 File Source.
Sample S3 File Source service:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: s3-file-source-svc
  namespace: demo
spec:
  template:
    spec:
      containers:
        - env:
            - name: JOB_SPEC
              value: "{\"Image\": \"murugappans/s3-file-source:v1\",\"Name\": \"s3-file-source-job\", \"EnvFromSecretorCM\": [{\"Name\": \"s3-access\",\"Type\": \"Secret\"}]}"
          image: murugappans/job-trigger:v1
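
For readability, the JOB_SPEC value above expands to the JSON below. The field meanings are inferred from their names: Image is the S3 File Source container image to run, Name is the name given to the created job, and EnvFromSecretorCM lists the Secrets or ConfigMaps (here the s3-access secret) whose entries are injected into the job's environment.

{
  "Image": "murugappans/s3-file-source:v1",
  "Name": "s3-file-source-job",
  "EnvFromSecretorCM": [
    { "Name": "s3-access", "Type": "Secret" }
  ]
}
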
Limitation
Currently, file processing is sequential; I am working on making it parallel.