internal

package
v3.0.0-...-7ba4d6b Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 17, 2024 License: Apache-2.0, BSD-3-Clause, MIT Imports: 35 Imported by: 0

README

Prism internal packages

Go has a mechanism for "internal" packages to prevent use of implementation details outside of their intended use.

This mechanism is used thoroughly for Prism to ensure we can make changes to the runner's internals without worrying about the exposed surface changes breaking non-compliant users.

Structure

Here's a loose description of the current structure of the runner. Leaf packages should not depend on other parts of the runner. Runner packages can and do depend on other parts of the SDK, such as for Coder handling.

config contains configuration parsing and handling. Leaf package. Handler configurations are registered by dependant packages.

urns contains beam URN strings pulled from the protos. Leaf package.

engine contains the core manager for handling elements, watermarks, and windowing strategies. Determines bundle readiness, and stages to execute. Leaf package.

jobservices contains GRPC service handlers for job management and submission. Should only depend on the config and urns packages.

worker contains interactions with FnAPI services to communicate with worker SDKs. Leaf package except for dependency on engine.TentativeData which will likely be removed at some point.

internal AKA the package in this directory root. Contains fhe job execution flow. Jobs are sent to it from jobservices, and those jobs are then executed by coordinating with the engine and worker packages, and handlers urn. Most configurable behavior is determined here.

web contains a web server to visualize aspects of the runner, and should never be depended on by the other packages for the runner. It will primarily abstract access to pipeline data through a JobManagement API client. If no other option exists, it may use direct access to exported data from the other packages.

Testing

The sub packages should have reasonable Unit Test coverage in their own directories, but most features will be exercised via executing pipelines in this package.

For the time being test DoFns should be added to standard build in order to validate execution coverage, in particular for Combine and Splittable DoFns.

Eventually these behaviors should be covered by using Prism in the main SDK tests.

Documentation

Overview

Package internal is where the less separable parts of the runner are put together in order to execute pipelines, and validate that beam features are implemented, and configured appropriately for the variant a pipeline is using.

Importantly, it is the package that contains unit test pipelines to exercise beam features in a pipeline context, rather than simpler mechanical unit tests as in the sub packages.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Combine

func Combine(config any) *combine

func ParDo

func ParDo(config any) *pardo

func RunPipeline

func RunPipeline(j *jobservices.Job)

RunPipeline starts the main thread fo executing this job. It's analoguous to the manager side process for a distributed pipeline. It will begin "workers"

func Runner

func Runner(config any) *runner

Types

type CombineCharacteristic

type CombineCharacteristic struct {
	EnableLifting bool // Sets whether a combine composite does combiner lifting or not.
}

CombineCharacteristic holds the configuration for Combines.

type ParDoCharacteristic

type ParDoCharacteristic struct {
	DisableSDF bool // Sets whether a pardo supports SDFs or not.
}

ParDoCharacteristic holds the configuration for ParDos.

type RunnerCharacteristic

type RunnerCharacteristic struct {
	SDKFlatten   bool // Sets whether we should force an SDK side flatten.
	SDKGBK       bool // Sets whether the GBK should be handled by the SDK, if possible by the SDK.
	SDKReshuffle bool // Sets whether we should use the SDK backup implementation to handle a Reshuffle.
}

RunnerCharacteristic holds the configuration for Runner based transforms, such as GBKs, Flattens.

Directories

Path Synopsis
Package config defines and handles the parsing and provision of configurations for the runner.
Package config defines and handles the parsing and provision of configurations for the runner.
Package engine handles the operational components of a runner, to track elements, watermarks, timers, triggers etc
Package engine handles the operational components of a runner, to track elements, watermarks, timers, triggers etc
Package jobservices handles services necessary WRT handling jobs from SDKs.
Package jobservices handles services necessary WRT handling jobs from SDKs.
Package urns handles extracting urns from all the protos.
Package urns handles extracting urns from all the protos.
Package web serves a web UI for Prism.
Package web serves a web UI for Prism.
Package worker handles interactions with SDK side workers, representing the worker services, communicating with those services, and SDK environments.
Package worker handles interactions with SDK side workers, representing the worker services, communicating with those services, and SDK environments.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL