Documentation ¶
Overview ¶
Package synthetic contains transforms for creating synthetic pipelines. Synthetic pipelines are pipelines that simulate the behavior of possible pipelines in order to test performance, splitting, liquid sharding, and various other infrastructure used for running pipelines. This category of tests is not concerned with the correctness of the elements themselves, but needs to simulate transforms that output many elements throughout varying pipeline shapes.
Index ¶
- func Source(s beam.Scope, col beam.PCollection) beam.PCollection
- func SourceSingle(s beam.Scope, cfg SourceConfig) beam.PCollection
- func Step(s beam.Scope, cfg StepConfig, col beam.PCollection) beam.PCollection
- type SourceConfig
- type SourceConfigBuilder
- func (b *SourceConfigBuilder) Build() SourceConfig
- func (b *SourceConfigBuilder) BuildFromJSON(jsonData []byte) SourceConfig
- func (b *SourceConfigBuilder) HotKeyFraction(val float64) *SourceConfigBuilder
- func (b *SourceConfigBuilder) InitialSplits(val int) *SourceConfigBuilder
- func (b *SourceConfigBuilder) KeySize(val int) *SourceConfigBuilder
- func (b *SourceConfigBuilder) NumElements(val int) *SourceConfigBuilder
- func (b *SourceConfigBuilder) NumHotKeys(val int) *SourceConfigBuilder
- func (b *SourceConfigBuilder) ValueSize(val int) *SourceConfigBuilder
- type StepConfig
- type StepConfigBuilder
- func (b *StepConfigBuilder) Build() StepConfig
- func (b *StepConfigBuilder) FilterRatio(val float64) *StepConfigBuilder
- func (b *StepConfigBuilder) InitialSplits(val int) *StepConfigBuilder
- func (b *StepConfigBuilder) OutputPerInput(val int) *StepConfigBuilder
- func (b *StepConfigBuilder) Splittable(val bool) *StepConfigBuilder
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Source ¶
func Source(s beam.Scope, col beam.PCollection) beam.PCollection
Source creates a synthetic source transform that emits randomly generated KV<[]byte, []byte> elements.
This transform accepts a PCollection of SourceConfig, where each SourceConfig determines the synthetic source's behavior for producing a batch of elements. This allows multiple batches of elements to be produced with different behavior, in order to simulate a source transform that reads from multiple differently behaving sources, such as a file read that received small files and large files.
The recommended way to create SourceConfigs is via the SourceConfigBuilder. Usage example:
cfgs := beam.Create(s, synthetic.DefaultSourceConfig().NumElements(1000).Build(), synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build()) src := synthetic.Source(s, cfgs)
func SourceSingle ¶
func SourceSingle(s beam.Scope, cfg SourceConfig) beam.PCollection
SourceSingle creates a synthetic source transform that emits randomly generated KV<[]byte, []byte> elements.
This transform is a version of Source for when only one SourceConfig is needed. This transform accepts one SourceConfig which determines the synthetic source's behavior.
The recommended way to create SourceConfigs are via the SourceConfigBuilder. Usage example:
src := synthetic.SourceSingle(s, synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build())
func Step ¶
func Step(s beam.Scope, cfg StepConfig, col beam.PCollection) beam.PCollection
Step creates a synthetic step transform that receives KV<[]byte, []byte> elements from other synthetic transforms, and outputs KV<[]byte, []byte> elements based on its inputs.
This function accepts a StepConfig to configure the behavior of the synthetic step, including whether that step is implemented as a splittable or non-splittable DoFn.
The recommended way to create StepConfigs is via the StepConfigBuilder. Usage example:
cfg := synthetic.DefaultStepConfig().OutputPerInput(10).FilterRatio(0.5).Build() step := synthetic.Step(s, cfg, input)
Types ¶
type SourceConfig ¶
type SourceConfig struct { NumElements int64 `json:"num_records" beam:"num_records"` InitialSplits int64 `json:"initial_splits" beam:"initial_splits"` KeySize int64 `json:"key_size" beam:"key_size"` ValueSize int64 `json:"value_size" beam:"value_size"` NumHotKeys int64 `json:"num_hot_keys" beam:"num_hot_keys"` HotKeyFraction float64 `json:"hot_key_fraction" beam:"hot_key_fraction"` }
SourceConfig is a struct containing all the configuration options for a synthetic source. It should be created via a SourceConfigBuilder, not by directly initializing it (the fields are public to allow encoding).
type SourceConfigBuilder ¶
type SourceConfigBuilder struct {
// contains filtered or unexported fields
}
SourceConfigBuilder is used to initialize SourceConfigs. See SourceConfigBuilder's methods for descriptions of the fields in a SourceConfig and how they can be set. The intended approach for using this builder is to begin by calling the DefaultSourceConfig function, followed by calling setters, followed by calling Build.
Usage example:
cfg := synthetic.DefaultSourceConfig().NumElements(5000).InitialSplits(2).Build()
func DefaultSourceConfig ¶
func DefaultSourceConfig() *SourceConfigBuilder
DefaultSourceConfig creates a SourceConfigBuilder set with intended defaults for the SourceConfig fields. This function is the intended starting point for initializing a SourceConfig and should always be used to create SourceConfigBuilders.
To see descriptions of the various SourceConfig fields and their defaults, see the methods to SourceConfigBuilder.
func (*SourceConfigBuilder) Build ¶
func (b *SourceConfigBuilder) Build() SourceConfig
Build constructs the SourceConfig initialized by this builder. It also performs error checking on the fields, and panics if any have been set to invalid values.
func (*SourceConfigBuilder) BuildFromJSON ¶
func (b *SourceConfigBuilder) BuildFromJSON(jsonData []byte) SourceConfig
BuildFromJSON constructs the SourceConfig by populating it with the parsed JSON. Panics if there is an error in the syntax of the JSON or if the input contains unknown object keys.
An example of valid JSON object:
{ "num_records": 5, "key_size": 5, "value_size": 5, "num_hot_keys": 5, }
func (*SourceConfigBuilder) HotKeyFraction ¶
func (b *SourceConfigBuilder) HotKeyFraction(val float64) *SourceConfigBuilder
HotKeyFraction determines the value of hot key fraction.
Valid values are floating point numbers from 0 to 1.
func (*SourceConfigBuilder) InitialSplits ¶
func (b *SourceConfigBuilder) InitialSplits(val int) *SourceConfigBuilder
InitialSplits determines the number of initial splits to perform in the source's SplitRestriction method. Restrictions in synthetic sources represent the number of elements being emitted, and this split is performed evenly across that number of elements.
Each resulting restriction will have at least 1 element in it, and each element being emitted will be contained in exactly one restriction. That means that if the desired number of splits is greater than the number of elements N, then N initial restrictions will be created, each containing 1 element.
Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they would result in dropping elements that are expected to be emitted.
func (*SourceConfigBuilder) KeySize ¶
func (b *SourceConfigBuilder) KeySize(val int) *SourceConfigBuilder
KeySize determines the size of the key of elements for the source to generate.
Valid values are in the range of [1, ...] and the default value is 8.
func (*SourceConfigBuilder) NumElements ¶
func (b *SourceConfigBuilder) NumElements(val int) *SourceConfigBuilder
NumElements is the number of elements for the source to generate and emit.
Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they result in sources that emit no elements.
func (*SourceConfigBuilder) NumHotKeys ¶
func (b *SourceConfigBuilder) NumHotKeys(val int) *SourceConfigBuilder
NumHotKeys determines the number of keys with the same value among generated keys.
Valid values are in the range of [0, ...] and the default value is 0.
func (*SourceConfigBuilder) ValueSize ¶
func (b *SourceConfigBuilder) ValueSize(val int) *SourceConfigBuilder
ValueSize determines the size of the value of elements for the source to generate.
Valid values are in the range of [1, ...] and the default value is 8.
type StepConfig ¶
StepConfig is a struct containing all the configuration options for a synthetic step. It should be created via a StepConfigBuilder, not by directly initializing it (the fields are public to allow encoding).
type StepConfigBuilder ¶
type StepConfigBuilder struct {
// contains filtered or unexported fields
}
StepConfigBuilder is used to initialize StepConfigs. See StepConfigBuilder's methods for descriptions of the fields in a StepConfig and how they can be set. The intended approach for using this builder is to begin by calling the DefaultStepConfig function, followed by calling setters, followed by calling Build.
Usage example:
cfg := synthetic.DefaultStepConfig().OutputPerInput(10).FilterRatio(0.5).Build()
func DefaultStepConfig ¶
func DefaultStepConfig() *StepConfigBuilder
DefaultStepConfig creates a StepConfig with intended defaults for the StepConfig fields. This function is the intended starting point for initializing a StepConfig and should always be used to create StepConfigBuilders.
To see descriptions of the various StepConfig fields and their defaults, see the methods to StepConfigBuilder.
func (*StepConfigBuilder) Build ¶
func (b *StepConfigBuilder) Build() StepConfig
Build constructs the StepConfig initialized by this builder. It also performs error checking on the fields, and panics if any have been set to invalid values.
func (*StepConfigBuilder) FilterRatio ¶
func (b *StepConfigBuilder) FilterRatio(val float64) *StepConfigBuilder
FilterRatio indicates the random chance that an input will be filtered out, meaning that no outputs will get emitted for it. For example, a FilterRatio of 0.25 means that 25% of inputs will be filtered out, a FilterRatio of 0 means no elements are filtered, and a FilterRatio of 1.0 means every element is filtered.
In a non-splittable step, this is performed on each input element, meaning all outputs for that element would be filtered. In a splittable step, this is performed on each input restriction instead of the entire element, meaning that some outputs for an element may be filtered and others kept.
Note that even when elements are filtered out, the work associated with processing those elements is still performed, which differs from setting an OutputPerInput of 0. Also note that if a
Valid values are in the range if [0.0, 1.0], and the default value is 0. In order to avoid precision errors, invalid values do not cause errors. Instead, values below 0 are functionally equivalent to 0, and values above 1 are functionally equivalent to 1.
func (*StepConfigBuilder) InitialSplits ¶
func (b *StepConfigBuilder) InitialSplits(val int) *StepConfigBuilder
InitialSplits is only applicable if Splittable is set to true, and determines the number of initial splits to perform in the step's SplitRestriction method. Restrictions in synthetic steps represent the number of elements to emit for each input element, as defined by the OutputPerInput config field, and this split is performed evenly across that number of elements.
Each resulting restriction will have at least 1 element in it, and each element being emitted will be contained in exactly one restriction. That means that if the desired number of splits is greater than the OutputPerInput N, then N initial restrictions will be created, each containing 1 element.
Valid values are in the range of [1, ...] and the default value is 1. Values of 0 (and below) are invalid as they would result in dropping elements that are expected to be emitted.
func (*StepConfigBuilder) OutputPerInput ¶
func (b *StepConfigBuilder) OutputPerInput(val int) *StepConfigBuilder
OutputPerInput is the number of outputs to emit per input received. Each output is identical to the original input. A value of 0 drops all inputs and produces no output.
Valid values are in the range of [0, ...] and the default value is 1. Values below 0 are invalid as they have no logical meaning for this field.
func (*StepConfigBuilder) Splittable ¶
func (b *StepConfigBuilder) Splittable(val bool) *StepConfigBuilder
Splittable indicates whether the step should use the splittable DoFn or non-splittable DoFn implementation.
Splittable steps will split along restrictions representing the number of OutputPerInput for each element, so it is most useful for steps with a high OutputPerInput. Conversely, if OutputPerInput is 1, then there is no way to split restrictions further, so making the step splittable will do nothing.