cmd

package
v0.5.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 10, 2022 License: MIT Imports: 10 Imported by: 0

README

cmd

Contains applications (apps) used in Substation deployments. Apps are organized by either the infrastructure they are deployed to (e.g., AWS) or the source of the data (e.g., file, http).

Any app that implements the ingest, transform, load (ITL) functionality of Substation is named substation and shares the same configuration file format (see build/config/ for more information).

app.go

Contains the core Substation application code. This code can be used to create new Substation applications.

design

Substation operates using a system of goroutines and channels:

  • data ingest, transform, and load are handled by unique goroutines
  • data streams between goroutines using a pipeline pattern
  • errors in any goroutine interrupt the application

This execution model was chosen for its ability to support horizontal scaling, high-latency data processing, and efficient delivery of data.

aws/lambda

Contains Substation apps deployed as AWS Lambda functions. More information is available in cmd/aws/lambda/README.md.

file/substation

Reads and processes data stored in a local file. The app can be deployed anywhere, including non-container infrastructure, and is recommended for local testing and development.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetScanMethod added in v0.5.0

func GetScanMethod() string

GetScanMethod retrieves a scan method from the SUBSTATION_SCAN_METHOD environment variable. This impacts the behavior of bufio scanners that are used throughout the application to read files. The options for this variable are:

- "bytes" (https://pkg.go.dev/bufio#Scanner.Bytes)

- "text" (https://pkg.go.dev/bufio#Scanner.Text)

If the environment variable is missing, then the default method is "text".

Types

type Channels

type Channels struct {
	Done      chan struct{}
	Transform *config.Channel
	Sink      *config.Channel
}

Channels contains channels used by the app for managing state and sending encapsulated data between goroutines:

- Done: signals that all data processing (ingest, transform, load) is complete; this is always invoked by the Sink goroutine

- Transform: sends encapsulated data from the source application to the Transform goroutines

- Sink: sends encapsulated data from the Transform goroutines to the Sink goroutine

type Config added in v0.5.0

type Config struct {
	Transform config.Config
	Sink      config.Config
}

Config is the shared application configuration for all apps.

type Substation

type Substation struct {
	Channels Channels
	Config   Config
	// contains filtered or unexported fields
}

Substation is the application core that manages all data processing and flow control.

func New added in v0.5.0

func New() *Substation

New returns an initialized Substation app. If an error occurs during initialization, then this function will panic.

Concurrency is controlled using the SUBSTATION_CONCURRENCY environment variable and defaults to the number of CPUs on the host. In native Substation applications, this value determines the number of transform goroutines; if set to 1, then multi-core processing is not enabled.

func (*Substation) Block

func (sub *Substation) Block(ctx context.Context, group *errgroup.Group) error

Block blocks the handler from returning until one of these conditions is met:

- a data processing error occurs

- the request times out (or is otherwise cancelled)

- all data processing is successful

This is usually the final call made by main() in a cmd invoking the app.

func (*Substation) Concurrency added in v0.5.0

func (sub *Substation) Concurrency() int

Concurrency returns the concurrency setting of the app.

func (*Substation) Send added in v0.5.0

func (sub *Substation) Send(capsule config.Capsule)

Send writes encapsulated data into the Transform channel.

func (*Substation) Sink

func (sub *Substation) Sink(ctx context.Context, wg *sync.WaitGroup) error

Sink is the data sink method for the app. Data is input on the Sink channel and sent to the configured sink. The Sink goroutine completes when the Sink channel is closed and all data is flushed.

func (*Substation) Transform

func (sub *Substation) Transform(ctx context.Context, wg *sync.WaitGroup) error

Transform is the data transformation method for the app. Data is input on the Transform channel, transformed by a Transform interface (see: internal/transform), and output on the Sink channel. All Transform goroutines complete when the Transform channel is closed and all data is flushed.

func (*Substation) WaitSink added in v0.5.0

func (sub *Substation) WaitSink(wg *sync.WaitGroup)

WaitSink closes the sink channel and blocks until data load is complete.

func (*Substation) WaitTransform added in v0.5.0

func (sub *Substation) WaitTransform(wg *sync.WaitGroup)

WaitTransform closes the transform channel and blocks until data processing is complete.

Directories

Path Synopsis
aws
file

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL