synthetic

package
v3.0.0-...-7ba4d6b Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 17, 2024 License: Apache-2.0, BSD-3-Clause, MIT Imports: 9 Imported by: 0

Documentation

Overview

Package synthetic contains transforms for creating synthetic pipelines. Synthetic pipelines are pipelines that simulate the behavior of possible pipelines in order to test performance, splitting, liquid sharding, and various other infrastructure used for running pipelines. This category of tests is not concerned with the correctness of the elements themselves, but needs to simulate transforms that output many elements throughout varying pipeline shapes.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Source

func Source(s beam.Scope, col beam.PCollection) beam.PCollection

Source creates a synthetic source transform that emits randomly generated KV<[]byte, []byte> elements.

This transform accepts a PCollection of SourceConfig, where each SourceConfig determines the synthetic source's behavior for producing a batch of elements. This allows multiple batches of elements to be produced with different behavior, in order to simulate a source transform that reads from multiple differently behaving sources, such as a file read that received small files and large files.

The recommended way to create SourceConfigs is via the SourceConfigBuilder. Usage example:

cfgs := beam.Create(s,
    synthetic.DefaultSourceConfig().NumElements(1000).Build(),
    synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build())
src := synthetic.Source(s, cfgs)

func SourceSingle

func SourceSingle(s beam.Scope, cfg SourceConfig) beam.PCollection

SourceSingle creates a synthetic source transform that emits randomly generated KV<[]byte, []byte> elements.

This transform is a version of Source for when only one SourceConfig is needed. This transform accepts one SourceConfig which determines the synthetic source's behavior.

The recommended way to create SourceConfigs are via the SourceConfigBuilder. Usage example:

src := synthetic.SourceSingle(s,
    synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build())

func Step

Step creates a synthetic step transform that receives KV<[]byte, []byte> elements from other synthetic transforms, and outputs KV<[]byte, []byte> elements based on its inputs.

This function accepts a StepConfig to configure the behavior of the synthetic step, including whether that step is implemented as a splittable or non-splittable DoFn.

The recommended way to create StepConfigs is via the StepConfigBuilder. Usage example:

cfg := synthetic.DefaultStepConfig().OutputPerInput(10).FilterRatio(0.5).Build()
step := synthetic.Step(s, cfg, input)

Types

type SourceConfig

type SourceConfig struct {
	NumElements    int64   `json:"num_records" beam:"num_records"`
	InitialSplits  int64   `json:"initial_splits" beam:"initial_splits"`
	KeySize        int64   `json:"key_size" beam:"key_size"`
	ValueSize      int64   `json:"value_size" beam:"value_size"`
	NumHotKeys     int64   `json:"num_hot_keys" beam:"num_hot_keys"`
	HotKeyFraction float64 `json:"hot_key_fraction" beam:"hot_key_fraction"`
}

SourceConfig is a struct containing all the configuration options for a synthetic source. It should be created via a SourceConfigBuilder, not by directly initializing it (the fields are public to allow encoding).

type SourceConfigBuilder

type SourceConfigBuilder struct {
	// contains filtered or unexported fields
}

SourceConfigBuilder is used to initialize SourceConfigs. See SourceConfigBuilder's methods for descriptions of the fields in a SourceConfig and how they can be set. The intended approach for using this builder is to begin by calling the DefaultSourceConfig function, followed by calling setters, followed by calling Build.

Usage example:

cfg := synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build()

func DefaultSourceConfig

func DefaultSourceConfig() *SourceConfigBuilder

DefaultSourceConfig creates a SourceConfigBuilder set with intended defaults for the SourceConfig fields. This function is the intended starting point for initializing a SourceConfig and should always be used to create SourceConfigBuilders.

To see descriptions of the various SourceConfig fields and their defaults, see the methods to SourceConfigBuilder.

func (*SourceConfigBuilder) Build

func (b *SourceConfigBuilder) Build() SourceConfig

Build constructs the SourceConfig initialized by this builder. It also performs error checking on the fields, and panics if any have been set to invalid values.

func (*SourceConfigBuilder) BuildFromJSON

func (b *SourceConfigBuilder) BuildFromJSON(jsonData []byte) SourceConfig

BuildFromJSON constructs the SourceConfig by populating it with the parsed JSON. Panics if there is an error in the syntax of the JSON or if the input contains unknown object keys.

An example of valid JSON object:

{
	 "num_records": 5,
	 "key_size": 5,
	 "value_size": 5,
	 "num_hot_keys": 5,
}

func (*SourceConfigBuilder) HotKeyFraction

func (b *SourceConfigBuilder) HotKeyFraction(val float64) *SourceConfigBuilder

HotKeyFraction determines the value of hot key fraction.

Valid values are floating point numbers from 0 to 1.

func (*SourceConfigBuilder) InitialSplits

func (b *SourceConfigBuilder) InitialSplits(val int) *SourceConfigBuilder

InitialSplits determines the number of initial splits to perform in the source's SplitRestriction method. Restrictions in synthetic sources represent the number of elements being emitted, and this split is performed evenly across that number of elements.

Each resulting restriction will have at least 1 element in it, and each element being emitted will be contained in exactly one restriction. That means that if the desired number of splits is greater than the number of elements N, then N initial restrictions will be created, each containing 1 element.

Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they would result in dropping elements that are expected to be emitted.

func (*SourceConfigBuilder) KeySize

func (b *SourceConfigBuilder) KeySize(val int) *SourceConfigBuilder

KeySize determines the size of the key of elements for the source to generate.

Valid values are in the range of [1, ...] and the default value is 8.

func (*SourceConfigBuilder) NumElements

func (b *SourceConfigBuilder) NumElements(val int) *SourceConfigBuilder

NumElements is the number of elements for the source to generate and emit.

Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they result in sources that emit no elements.

func (*SourceConfigBuilder) NumHotKeys

func (b *SourceConfigBuilder) NumHotKeys(val int) *SourceConfigBuilder

NumHotKeys determines the number of keys with the same value among generated keys.

Valid values are in the range of [0, ...] and the default value is 0.

func (*SourceConfigBuilder) ValueSize

func (b *SourceConfigBuilder) ValueSize(val int) *SourceConfigBuilder

ValueSize determines the size of the value of elements for the source to generate.

Valid values are in the range of [1, ...] and the default value is 8.

type StepConfig

type StepConfig struct {
	OutputPerInput int
	FilterRatio    float64
	Splittable     bool
	InitialSplits  int
}

StepConfig is a struct containing all the configuration options for a synthetic step. It should be created via a StepConfigBuilder, not by directly initializing it (the fields are public to allow encoding).

type StepConfigBuilder

type StepConfigBuilder struct {
	// contains filtered or unexported fields
}

StepConfigBuilder is used to initialize StepConfigs. See StepConfigBuilder's methods for descriptions of the fields in a StepConfig and how they can be set. The intended approach for using this builder is to begin by calling the DefaultStepConfig function, followed by calling setters, followed by calling Build.

Usage example:

cfg := synthetic.DefaultStepConfig().OutputPerInput(10).FilterRatio(0.5).Build()

func DefaultStepConfig

func DefaultStepConfig() *StepConfigBuilder

DefaultStepConfig creates a StepConfig with intended defaults for the StepConfig fields. This function is the intended starting point for initializing a StepConfig and should always be used to create StepConfigBuilders.

To see descriptions of the various StepConfig fields and their defaults, see the methods to StepConfigBuilder.

func (*StepConfigBuilder) Build

func (b *StepConfigBuilder) Build() StepConfig

Build constructs the StepConfig initialized by this builder. It also performs error checking on the fields, and panics if any have been set to invalid values.

func (*StepConfigBuilder) FilterRatio

func (b *StepConfigBuilder) FilterRatio(val float64) *StepConfigBuilder

FilterRatio indicates the random chance that an input will be filtered out, meaning that no outputs will get emitted for it. For example, a FilterRatio of 0.25 means that 25% of inputs will be filtered out, a FilterRatio of 0 means no elements are filtered, and a FilterRatio of 1.0 means every element is filtered.

In a non-splittable step, this is performed on each input element, meaning all outputs for that element would be filtered. In a splittable step, this is performed on each input restriction instead of the entire element, meaning that some outputs for an element may be filtered and others kept.

Note that even when elements are filtered out, the work associated with processing those elements is still performed, which differs from setting an OutputPerInput of 0. Also note that if a

Valid values are in the range if [0.0, 1.0], and the default value is 0. In order to avoid precision errors, invalid values do not cause errors. Instead, values below 0 are functionally equivalent to 0, and values above 1 are functionally equivalent to 1.

func (*StepConfigBuilder) InitialSplits

func (b *StepConfigBuilder) InitialSplits(val int) *StepConfigBuilder

InitialSplits is only applicable if Splittable is set to true, and determines the number of initial splits to perform in the step's SplitRestriction method. Restrictions in synthetic steps represent the number of elements to emit for each input element, as defined by the OutputPerInput config field, and this split is performed evenly across that number of elements.

Each resulting restriction will have at least 1 element in it, and each element being emitted will be contained in exactly one restriction. That means that if the desired number of splits is greater than the OutputPerInput N, then N initial restrictions will be created, each containing 1 element.

Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they would result in dropping elements that are expected to be emitted.

func (*StepConfigBuilder) OutputPerInput

func (b *StepConfigBuilder) OutputPerInput(val int) *StepConfigBuilder

OutputPerInput is the number of outputs to emit per input received. Each output is identical to the original input. A value of 0 drops all inputs and produces no output.

Valid values are in the range of [0, ...] and the default value is 1. Values below 0 are invalid as they have no logical meaning for this field.

func (*StepConfigBuilder) Splittable

func (b *StepConfigBuilder) Splittable(val bool) *StepConfigBuilder

Splittable indicates whether the step should use the splittable DoFn or non-splittable DoFn implementation.

Splittable steps will split along restrictions representing the number of OutputPerInput for each element, so it is most useful for steps with a high OutputPerInput. Conversely, if OutputPerInput is 1, then there is no way to split restrictions further, so making the step splittable will do nothing.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL