Documentation
Overview
An AppEngine service that copies files from one CloudStorage location to another and publishes the names of the files (in their new location) to a PubSub topic.
This could be used as the first step of a data pipeline that brings log files into GCP. A service external to GCP would copy the logs into a well-known location on CloudStorage. An AppEngine cron job would then call this service with the right parameters, and the service would move the files to a staging area and publish their names to a queue to be consumed by other workers.
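The flow described above (list the source files, copy each one to the staging location, publish the new name) could be sketched roughly as follows. This is only an illustration under stated assumptions: the Store and Publisher interfaces, the in-memory fakes, and the choice to keep only the base file name under dst_path are all hypothetical and are not taken from the service's source or from the real CloudStorage and PubSub client libraries. The sketch copies files and leaves the originals in place.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Store and Publisher are hypothetical stand-ins for CloudStorage and
// PubSub clients. They are NOT the real client library APIs.
type Store interface {
	List(bucket, prefix string) ([]string, error)
	Copy(srcBucket, srcName, dstBucket, dstName string) error
}

type Publisher interface {
	Publish(topic, message string) error
}

// stageAndPublish copies every object matching srcPrefix into dstPath and
// publishes each new object name to topic. It returns the staged names.
func stageAndPublish(s Store, p Publisher, topic, srcBucket, srcPrefix, dstBucket, dstPath string) ([]string, error) {
	names, err := s.List(srcBucket, srcPrefix)
	if err != nil {
		return nil, fmt.Errorf("list %s/%s: %w", srcBucket, srcPrefix, err)
	}
	var staged []string
	for _, name := range names {
		// Assumption: only the base file name is kept under dstPath.
		dstName := path.Join(dstPath, path.Base(name))
		if err := s.Copy(srcBucket, name, dstBucket, dstName); err != nil {
			return staged, fmt.Errorf("copy %s: %w", name, err)
		}
		if err := p.Publish(topic, dstName); err != nil {
			return staged, fmt.Errorf("publish %s: %w", dstName, err)
		}
		staged = append(staged, dstName)
	}
	return staged, nil
}

// memStore is an in-memory fake mapping bucket names to object names.
type memStore struct{ objects map[string][]string }

func (m *memStore) List(bucket, prefix string) ([]string, error) {
	var out []string
	for _, n := range m.objects[bucket] {
		if strings.HasPrefix(n, prefix) {
			out = append(out, n)
		}
	}
	return out, nil
}

func (m *memStore) Copy(srcBucket, srcName, dstBucket, dstName string) error {
	m.objects[dstBucket] = append(m.objects[dstBucket], dstName)
	return nil
}

// memPublisher is an in-memory fake that records published messages.
type memPublisher struct{ sent []string }

func (m *memPublisher) Publish(topic, message string) error {
	m.sent = append(m.sent, message)
	return nil
}

func main() {
	store := &memStore{objects: map[string][]string{
		"MY_BUCKET": {"inbound/a.log", "inbound/b.log"},
	}}
	pub := &memPublisher{}
	staged, err := stageAndPublish(store, pub, "MY_TOPIC", "MY_BUCKET", "inbound", "MY_BUCKET", "staged")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("staged:", staged)
	fmt.Println("published:", pub.sent)
}
```

Keeping storage and messaging behind small interfaces lets the copy-then-publish ordering be exercised without any GCP credentials.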
The handler will listen at:
http://PROJECT_ID.appspot.com/tasks/filepublisher
The parameters required by the handler defined in this service are:
topic - name of the pubsub topic to which names of staged files should be published
dst_bucket - CloudStorage bucket name where files should be staged
dst_path - path in the CloudStorage bucket where files should be staged
src_bucket - CloudStorage bucket where files can be found
src_prefix - prefix used to identify files in the source bucket
dry_run - if true, will only show what action would've been taken (TBD)
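As a rough sketch of how a handler might read and validate these parameters, the following uses only the standard library. The Params struct and the parseParams function are hypothetical names introduced for illustration. Treating dry_run as optional with a default of false is also an assumption, since the option is marked TBD above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// Params holds the query parameters listed above.
// The struct and its field names are illustrative only.
type Params struct {
	Topic     string
	DstBucket string
	DstPath   string
	SrcBucket string
	SrcPrefix string
	DryRun    bool
}

// parseParams parses a raw query string and reports the first
// required parameter that is missing or empty.
func parseParams(rawQuery string) (Params, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return Params{}, fmt.Errorf("parse query: %w", err)
	}
	required := []string{"topic", "dst_bucket", "dst_path", "src_bucket", "src_prefix"}
	for _, name := range required {
		if q.Get(name) == "" {
			return Params{}, fmt.Errorf("missing required parameter %q", name)
		}
	}
	p := Params{
		Topic:     q.Get("topic"),
		DstBucket: q.Get("dst_bucket"),
		DstPath:   q.Get("dst_path"),
		SrcBucket: q.Get("src_bucket"),
		SrcPrefix: q.Get("src_prefix"),
	}
	// Assumption: dry_run is optional and defaults to false.
	if v := q.Get("dry_run"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return Params{}, fmt.Errorf("invalid dry_run value %q: %w", v, err)
		}
		p.DryRun = b
	}
	return p, nil
}

func main() {
	p, err := parseParams("topic=MY_TOPIC&dst_bucket=MY_BUCKET&dst_path=staged&src_bucket=MY_BUCKET&src_prefix=inbound")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

In an HTTP handler, the same function could be given r.URL.RawQuery from the incoming *http.Request.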
An example call could look like:
http://PROJECT_ID.appspot.com/tasks/filepublisher?topic=MY_TOPIC&dst_bucket=MY_BUCKET&dst_path=staged&src_bucket=MY_BUCKET&src_prefix=inbound
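A caller that builds this URL programmatically may want to let net/url handle the escaping. The sketch below shows one way to do that. The buildTaskURL helper is a hypothetical name, not part of the service. Note that url.Values.Encode sorts parameters by key, so the order differs from the example above, which should not matter to a handler that reads parameters by name.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildTaskURL assembles a request URL for the filepublisher handler.
// It is an illustrative helper; the service itself does not define it.
func buildTaskURL(projectID, topic, dstBucket, dstPath, srcBucket, srcPrefix string) string {
	q := url.Values{}
	q.Set("topic", topic)
	q.Set("dst_bucket", dstBucket)
	q.Set("dst_path", dstPath)
	q.Set("src_bucket", srcBucket)
	q.Set("src_prefix", srcPrefix)

	u := url.URL{
		Scheme:   "http",
		Host:     projectID + ".appspot.com",
		Path:     "/tasks/filepublisher",
		RawQuery: q.Encode(), // keys are emitted in sorted order
	}
	return u.String()
}

func main() {
	fmt.Println(buildTaskURL("PROJECT_ID", "MY_TOPIC", "MY_BUCKET", "staged", "MY_BUCKET", "inbound"))
}
```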
To run the service locally for development, specify the project ID in the environment and use 'go run':
GCLOUD_PROJECT=<PROJECT_ID> go run filepublisher.go
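For illustration, a program could read the project ID from the environment and fail early when it is missing, along the lines of the sketch below. The projectIDFrom function is a hypothetical helper, and how filepublisher.go actually reads GCLOUD_PROJECT is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// projectIDFrom looks up GCLOUD_PROJECT using the supplied lookup
// function (normally os.Getenv) and rejects an empty value.
// This helper is illustrative and not taken from the service.
func projectIDFrom(getenv func(string) string) (string, error) {
	id := getenv("GCLOUD_PROJECT")
	if id == "" {
		return "", errors.New("GCLOUD_PROJECT is not set")
	}
	return id, nil
}

func main() {
	id, err := projectIDFrom(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("using project", id)
}
```

Passing the lookup function as an argument keeps the check easy to test without modifying the real environment.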