Documentation
Overview ¶
Package util provides utility Operations for DataFrames.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Accumulate ¶
func Accumulate(accumulatorFactory sif.AccumulatorFactory) *sif.DataFrameOperation
Accumulate is an alternative reduction technique that siphons data from Partitions into a custom data structure. The result is itself an Accumulator rather than a series of Partitions, which ends the job (no more operations may be performed against the data). The advantage, however, is full control over the reduction technique, which can yield substantial performance benefits. Because reduction is performed locally on each worker and the worker results are then all reduced on the Coordinator, Accumulators are best suited to smaller results. Distributed reductions via Reduce() are more efficient when the reduction result is large (e.g. a large number of buckets).
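The two-phase reduction described above (local accumulation on each worker, then a merge on the Coordinator) can be sketched in plain Go. This is an illustration only: the wordCounts type, its Accumulate and Merge methods, and the reduceLocally and reduceOnCoordinator helpers are hypothetical names invented for this sketch and are not the sif.Accumulator or sif.AccumulatorFactory API. Workers are modelled as slices of partitions, and rows as strings.

```go
package main

import "fmt"

// wordCounts is a hypothetical accumulator (not sif's interface): a custom
// structure that absorbs rows one at a time and can be merged with another.
type wordCounts map[string]int

// Accumulate folds a single row into the accumulator.
func (w wordCounts) Accumulate(row string) { w[row]++ }

// Merge folds another accumulator's state into this one.
func (w wordCounts) Merge(other wordCounts) {
	for key, n := range other {
		w[key] += n
	}
}

// reduceLocally models one worker: it siphons every row of every local
// partition into a fresh accumulator.
func reduceLocally(partitions [][]string) wordCounts {
	acc := wordCounts{}
	for _, p := range partitions {
		for _, row := range p {
			acc.Accumulate(row)
		}
	}
	return acc
}

// reduceOnCoordinator models the final step: one accumulator per worker is
// merged into a single result.
func reduceOnCoordinator(workers [][][]string) wordCounts {
	final := wordCounts{}
	for _, partitions := range workers {
		final.Merge(reduceLocally(partitions))
	}
	return final
}

func main() {
	// Two simulated workers; the first holds two partitions, the second one.
	workers := [][][]string{
		{{"a", "b"}, {"a"}},
		{{"b", "c"}},
	}
	result := reduceOnCoordinator(workers)
	fmt.Println(result["a"], result["b"], result["c"]) // prints "2 2 1"
}
```

In this sketch every worker's whole accumulator is handed to a single merge step, which is why the approach suits small results: the merged structure has to fit comfortably in one place.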
func Collect ¶
func Collect(collectionLimit int) *sif.DataFrameOperation
Collect declares that data should be shuffled to the Coordinator upon completion of the previous stage. This also signals the end of a DataFrame's tasks.
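As a rough mental model, collection can be pictured as the Coordinator gathering partitions from every worker once the previous stage has finished. The sketch below is plain Go and does not use the sif API: the partition type and collect function are hypothetical. The documentation above does not say what collectionLimit counts, so the sketch assumes it caps the number of partitions gathered; treat that as an assumption, not as documented behaviour.

```go
package main

import "fmt"

// partition is a hypothetical stand-in for a block of rows held by a worker.
type partition []string

// collect models the Coordinator gathering partitions from all workers.
// ASSUMPTION: limit caps the number of partitions gathered. The source does
// not define what collectionLimit measures.
func collect(workers [][]partition, limit int) []partition {
	var collected []partition
	for _, parts := range workers {
		for _, p := range parts {
			if len(collected) >= limit {
				return collected // limit reached; remaining partitions are skipped
			}
			collected = append(collected, p)
		}
	}
	return collected
}

func main() {
	// Two simulated workers holding three partitions in total.
	workers := [][]partition{
		{{"r1", "r2"}, {"r3"}},
		{{"r4"}},
	}
	fmt.Println(len(collect(workers, 2)))  // prints "2"
	fmt.Println(len(collect(workers, 10))) // prints "3"
}
```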
Types ¶
This section is empty.