pipeline

package
v0.8.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 15, 2015 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

package pipeline implements a system for running data pipelines on top of the filesystem

Index

Constants

This section is empty.

Variables

View Source
var ArgCount = errors.New("Illegal argument count.")
View Source
var Cancelled = errors.New("cancelled")

Functions

func RunPipelines

func RunPipelines(pipelineDir, inRepo, outRepo, commit, branch, shard string) error

RunPipelines lets you easily run the Pipelines in one line if you don't care about cancelling them.

Types

type Pipeline

type Pipeline struct {
	// contains filtered or unexported fields
}

func NewPipeline

func NewPipeline(name, dataRepo, outRepo, commit, branch, shard string) *Pipeline

func (*Pipeline) Cancel

func (p *Pipeline) Cancel() error

Cancel stops a pipeline by force before it's finished

func (*Pipeline) Finish

func (p *Pipeline) Finish() error

Finish makes the final commit for the pipeline

func (*Pipeline) Image

func (p *Pipeline) Image(image string) error

Image sets the image that is being used for computations.

func (*Pipeline) Input

func (p *Pipeline) Input(name string) error

Import makes a dataset available for computations in the container.

func (*Pipeline) Run

func (p *Pipeline) Run(cmd []string) error

Run runs a command in the container, it assumes that `branch` has already been created. Notice that any failure in this function leads to the branch having uncommitted dirty changes. This state needs to be cleaned up before the pipeline is rerun. The reason we don't do it here is that even if we try our best the process crashing at the wrong time could still leave it in an inconsistent state.

func (*Pipeline) RunPachFile

func (p *Pipeline) RunPachFile(r io.Reader) error

func (*Pipeline) Shuffle

func (p *Pipeline) Shuffle(dir string) error

func (*Pipeline) Start

func (p *Pipeline) Start() error

Start gets an outRepo ready to be used. This is where clean up of dirty state from a crash happens.

type Runner

type Runner struct {
	// contains filtered or unexported fields
}

func NewRunner

func NewRunner(pipelineDir, inRepo, outPrefix, commit, branch, shard string) *Runner

func (*Runner) Cancel

func (r *Runner) Cancel() error

func (*Runner) Run

func (r *Runner) Run() error

Run runs all of the pipelines it finds in pipelineDir. Returns the first error it encounters.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL