Documentation ¶
Overview ¶
Package dataflowlib translates a Beam pipeline model to the Dataflow API job model, for submission to Google Cloud Dataflow.
Index ¶
- func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, ...) (string, error)
- func Fixup(p *pipepb.Pipeline) (*pipepb.Pipeline, error)
- func NewClient(ctx context.Context, endpoint string) (*df.Service, error)
- func PrintJob(ctx context.Context, job *df.Job)
- func StageFile(ctx context.Context, project, url, filename string) error
- func StageModel(ctx context.Context, project, modelURL string, model []byte) error
- func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)
- func Translate(p *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL string) (*df.Job, error)
- func WaitForCompletion(ctx context.Context, client *df.Service, project, region, jobID string) error
- type JobOptions
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Execute ¶
func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL, endpoint string, async bool) (string, error)
Execute submits a pipeline as a Dataflow job.
func Fixup ¶
Fixup proto pipeline with Dataflow quirks.
func NewClient ¶
NewClient creates a new dataflow client with default application credentials and CloudPlatformScope. The Dataflow endpoint is optionally overridden.
func StageFile ¶
StageFile uploads a file to GCS.
func StageModel ¶
StageModel uploads the pipeline model to GCS as a unique object.
func Submit ¶
func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)
Submit submits a prepared job to Cloud Dataflow.
Types ¶
type JobOptions ¶
type JobOptions struct { // Name is the job name. Name string // Experiments are additional experiments. Experiments []string // Pipeline options Options runtime.RawOptions Project string Region string Zone string Network string Subnetwork string NoUsePublicIPs bool NumWorkers int64 MachineType string Labels map[string]string ServiceAccountEmail string // Autoscaling settings Algorithm string MaxNumWorkers int64 TempLocation string // Worker is the worker binary override. Worker string // WorkerJar is a custom worker jar. WorkerJar string TeardownPolicy string }
JobOptions capture the various options for submitting jobs to Dataflow.
Click to show internal directories.
Click to hide internal directories.