Documentation ¶
Overview ¶
Package meduce implements an interface to run MapReduce tasks on a single machine.
Index ¶
- func Link[KeyOld, ValueOld, KeyIn, ValueIn, KeyOut, ValueOut any](prevProcess *Process[KeyOld, ValueOld, KeyIn, ValueIn], ...)
- func LinkWithBufferSize[KeyOld, ValueOld, KeyIn, ValueIn, KeyOut, ValueOut any](prevProcess *Process[KeyOld, ValueOld, KeyIn, ValueIn], ...)
- type Collector
- type Config
- type Emitter
- type Filter
- type Finalizer
- type Mapper
- type Process
- func NewDefaultProcess[KeyIn, ValueIn any, KeyOut cmp.Ordered, ValueOut any](config Config[KeyIn, ValueIn, KeyOut, ValueOut]) *Process[KeyIn, ValueIn, KeyOut, ValueOut]
- func NewProcess[KeyIn, ValueIn, KeyOut, ValueOut any](config Config[KeyIn, ValueIn, KeyOut, ValueOut]) *Process[KeyIn, ValueIn, KeyOut, ValueOut]
- type Reducer
- type Source
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Link ¶ added in v0.4.0
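func Link[KeyOld, ValueOld, KeyIn, ValueIn, KeyOut, ValueOut any](
	prevProcess *Process[KeyOld, ValueOld, KeyIn, ValueIn],
	nextProcess *Process[KeyIn, ValueIn, KeyOut, ValueOut],
)
Link links two processes together. As the type parameters show, the output key and value types of prevProcess must match the input key and value types of nextProcess.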
func LinkWithBufferSize ¶ added in v0.4.0
func LinkWithBufferSize[KeyOld, ValueOld, KeyIn, ValueIn, KeyOut, ValueOut any](
	prevProcess *Process[KeyOld, ValueOld, KeyIn, ValueIn],
	nextProcess *Process[KeyIn, ValueIn, KeyOut, ValueOut],
	bufferSize int,
)
LinkWithBufferSize links two processes together with a buffer of the given size.
bufferSize is the size of the buffer that will be created to link the processes.
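The effect of a buffer between two linked stages can be pictured with a buffered Go channel. The sketch below is only an analogy: this listing does not say how meduce implements the link, and the names pair and runLinked are hypothetical.

```go
package main

import "fmt"

// pair is a hypothetical key-value record passed between two stages.
type pair struct {
	Key   string
	Value int
}

// runLinked feeds the output of a first stage into a second stage through
// a channel with the given buffer size. This is an analogy for two linked
// processes, not meduce's actual implementation.
func runLinked(input []pair, bufferSize int) int {
	link := make(chan pair, bufferSize)

	// First stage: doubles every value and hands it to the link.
	go func() {
		defer close(link)
		for _, p := range input {
			link <- pair{Key: p.Key, Value: p.Value * 2}
		}
	}()

	// Second stage: consumes from the link and sums the values.
	total := 0
	for p := range link {
		total += p.Value
	}
	return total
}

func main() {
	input := []pair{{"a", 1}, {"b", 2}, {"c", 3}}
	fmt.Println(runLinked(input, 2)) // prints 12
}
```

In this analogy a larger buffer lets the first stage run further ahead of the second before it has to wait; the result is the same for any buffer size.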
Types ¶
type Collector ¶ added in v0.2.0
type Collector[KeyOut, ValueOut any] interface {
	Init()                              // Init is called just before collecting starts
	Collect(key KeyOut, value ValueOut) // Collect is called for each processed key-value pair
	Finalize()                          // Finalize is called after all key-value pairs were processed
}
A Collector is supplied by the user and is used to collect processed key-value pairs.
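As an illustration, a Collector could simply store the results in a map. The sketch below redeclares the interface locally so that it compiles without the package (whose import path is not shown in this listing); mapCollector and its fields are hypothetical names.

```go
package main

import "fmt"

// Collector mirrors the interface documented above so that this sketch
// compiles on its own.
type Collector[KeyOut, ValueOut any] interface {
	Init()
	Collect(key KeyOut, value ValueOut)
	Finalize()
}

// mapCollector is a hypothetical Collector that stores results in a map.
type mapCollector[K comparable, V any] struct {
	Results map[K]V
	Done    bool
}

// Init prepares an empty result map before collecting starts.
func (c *mapCollector[K, V]) Init() {
	c.Results = make(map[K]V)
	c.Done = false
}

// Collect stores one processed pair; a later pair with the same key
// overwrites the earlier one.
func (c *mapCollector[K, V]) Collect(key K, value V) { c.Results[key] = value }

// Finalize marks the collection as complete.
func (c *mapCollector[K, V]) Finalize() { c.Done = true }

// Compile-time check that mapCollector satisfies the interface.
var _ Collector[string, int] = (*mapCollector[string, int])(nil)

func main() {
	c := &mapCollector[string, int]{}
	c.Init()
	c.Collect("go", 2)
	c.Collect("map", 1)
	c.Finalize()
	fmt.Println(len(c.Results), c.Results["go"], c.Done) // prints 2 2 true
}
```

This listing does not state whether Collect may be called from several goroutines at once; if it can, the map would need to be guarded by a mutex.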
type Config ¶ added in v0.3.0
type Config[KeyIn, ValueIn, KeyOut, ValueOut any] struct {
	// KeyComparator and ValueComparator are used to sort key-value pairs
	// before they are passed to the Reducer.
	// KeyComparator is used as primary comparator,
	// and ValueComparator is used as secondary.
	KeyComparator   comparison.Comparator[KeyOut]
	ValueComparator comparison.Comparator[ValueOut]

	Mapper    Mapper[KeyIn, ValueIn, KeyOut, ValueOut]
	Reducer   Reducer[KeyOut, ValueOut]
	Finalizer Finalizer[KeyOut, ValueOut]
	Filter    Filter[KeyOut, ValueOut]
	Source    Source[KeyIn, ValueIn]
	Collector Collector[KeyOut, ValueOut]
	Logger    *log.Logger
}
A Config is a configuration for a single MapReduce task.
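The primary/secondary ordering described in the Config comments can be sketched with the standard library. The comparison.Comparator type is not shown in this listing, so the example assumes an ordinary three-way comparison (negative, zero, positive) and uses cmp.Compare instead; pair and sortPairs are hypothetical names.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// pair is a hypothetical emitted key-value pair.
type pair struct {
	Key   string
	Value int
}

// sortPairs orders pairs by key first and, for equal keys, by value.
// This mirrors the "primary" and "secondary" comparator roles described
// above, using standard-library comparisons as a stand-in.
func sortPairs(pairs []pair) {
	slices.SortFunc(pairs, func(a, b pair) int {
		if c := cmp.Compare(a.Key, b.Key); c != 0 {
			return c // keys differ: the primary comparison decides
		}
		return cmp.Compare(a.Value, b.Value) // same key: compare values
	})
}

func main() {
	pairs := []pair{{"b", 2}, {"a", 3}, {"b", 1}, {"a", 1}}
	sortPairs(pairs)
	fmt.Println(pairs) // prints [{a 1} {a 3} {b 1} {b 2}]
}
```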
type Emitter ¶
type Emitter[KeyOut, ValueOut any] func(key KeyOut, value ValueOut)
An Emitter is a function supplied by the library.
It is passed to the user's Mapper function, which calls it to emit key-value pairs.
type Filter ¶ added in v0.3.0
A Filter is a function created by the user to filter processed key-value pairs.
It is called after a key-value pair has been finalized.
It should return true if the final key-value pair should be collected, or false if it should be discarded.
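The declaration of Filter is not shown in this listing, so the following sketch uses a hypothetical local type, pairFilter, that takes a key and a value and returns a bool. The real signature may differ (for example, it might receive the value by pointer); only the true/false contract comes from the description above.

```go
package main

import "fmt"

// pairFilter is a hypothetical stand-in for Filter; the real declaration
// is not shown in this listing and may differ.
type pairFilter[K, V any] func(key K, value V) bool

// atLeast returns a filter that keeps pairs whose value is at least
// threshold and discards the rest.
func atLeast(threshold int) pairFilter[string, int] {
	return func(_ string, value int) bool { return value >= threshold }
}

func main() {
	keep := atLeast(2)
	fmt.Println(keep("go", 3), keep("map", 1)) // prints true false
}
```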
type Finalizer ¶
type Finalizer[KeyOut, ValueOut any] func(key KeyOut, valueRef *ValueOut)
A Finalizer is a function created by the user to finalize key-value pairs.
It is called after all values for a key have been reduced to a single value.
It should process the reduced value in place before it is passed to the collector.
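A typical use of the in-place contract is to derive a final figure from an accumulated value. The sketch below redeclares Finalizer with the signature documented above; the stats type and fillMean are hypothetical.

```go
package main

import "fmt"

// Finalizer mirrors the signature documented above.
type Finalizer[KeyOut, ValueOut any] func(key KeyOut, valueRef *ValueOut)

// stats is a hypothetical reduced value: a running sum and count,
// plus a Mean field that is only filled in when finalizing.
type stats struct {
	Sum   float64
	Count int
	Mean  float64
}

// fillMean computes the mean in place through the value pointer.
// A zero Count leaves Mean untouched to avoid dividing by zero.
var fillMean Finalizer[string, stats] = func(_ string, s *stats) {
	if s.Count > 0 {
		s.Mean = s.Sum / float64(s.Count)
	}
}

func main() {
	s := stats{Sum: 9, Count: 3}
	fillMean("latency", &s)
	fmt.Println(s.Mean) // prints 3
}
```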
type Mapper ¶
type Mapper[KeyIn, ValueIn, KeyOut, ValueOut any] func(key KeyIn, value ValueIn, emit Emitter[KeyOut, ValueOut])
A Mapper is a function created by the user to map input data to key-value pairs.
It is called for each key-value pair of the input data.
The function may call emit any number of times, emitting one key-value pair per call.
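A word-count Mapper is a common illustration. The sketch below redeclares Emitter and Mapper with the signatures documented above so that it compiles on its own; wordMapper is a hypothetical name, and the input is assumed to be (line number, line text) pairs.

```go
package main

import (
	"fmt"
	"strings"
)

// Emitter and Mapper mirror the signatures documented above.
type Emitter[KeyOut, ValueOut any] func(key KeyOut, value ValueOut)
type Mapper[KeyIn, ValueIn, KeyOut, ValueOut any] func(key KeyIn, value ValueIn, emit Emitter[KeyOut, ValueOut])

// wordMapper receives a line number and the line text, and emits
// (lowercased word, 1) once for every word on the line.
var wordMapper Mapper[int, string, string, int] = func(_ int, line string, emit Emitter[string, int]) {
	for _, word := range strings.Fields(line) {
		emit(strings.ToLower(word), 1)
	}
}

func main() {
	// In the library the emit function is supplied for you; here a
	// stand-in records what the mapper emits.
	var emitted []string
	wordMapper(1, "Go maps go", func(key string, value int) {
		emitted = append(emitted, fmt.Sprintf("%s=%d", key, value))
	})
	fmt.Println(emitted) // prints [go=1 maps=1 go=1]
}
```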
type Process ¶
type Process[KeyIn, ValueIn, KeyOut, ValueOut any] struct {
	Config[KeyIn, ValueIn, KeyOut, ValueOut]
	// contains filtered or unexported fields
}
A Process is an instance of a single MapReduce task.
The zero value of Process has no configuration set and has an invalid uid.
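The order of the stages described in this documentation (map, sort, reduce, finalize, filter, collect) can be sketched by hand for a small word-share task. The code below is a sequential illustration only: it does not use meduce, and the names result and wordShares are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// result is a hypothetical collected key-value pair.
type result struct {
	Word    string
	Percent int
}

// wordShares walks through the stages in the documented order and returns
// each word's share of all words, keeping only shares >= minPercent.
func wordShares(lines []string, minPercent int) []result {
	// Map: emit (word, 1) for every word; emitted values are grouped by key.
	grouped := make(map[string][]int)
	totalWords := 0
	for _, line := range lines {
		for _, word := range strings.Fields(line) {
			grouped[word] = append(grouped[word], 1)
			totalWords++
		}
	}

	// Sort: order the keys before they reach the reduce step.
	keys := make([]string, 0, len(grouped))
	for key := range grouped {
		keys = append(keys, key)
	}
	sort.Strings(keys)

	var collected []result
	for _, key := range keys {
		// Reduce: collapse all values of the key into a single value.
		value := 0
		for _, v := range grouped[key] {
			value += v
		}
		// Finalize: turn the count into a percentage (integer division).
		// totalWords is non-zero whenever there is at least one key.
		value = value * 100 / totalWords
		// Filter: discard pairs below the threshold.
		if value < minPercent {
			continue
		}
		// Collect: keep the final pair.
		collected = append(collected, result{Word: key, Percent: value})
	}
	return collected
}

func main() {
	lines := []string{"go map go", "reduce go"}
	fmt.Println(wordShares(lines, 20)) // prints [{go 60} {map 20} {reduce 20}]
	fmt.Println(wordShares(lines, 50)) // prints [{go 60}]
}
```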
func NewDefaultProcess ¶ added in v0.3.0
func NewDefaultProcess[KeyIn, ValueIn any, KeyOut cmp.Ordered, ValueOut any](
	config Config[KeyIn, ValueIn, KeyOut, ValueOut],
) *Process[KeyIn, ValueIn, KeyOut, ValueOut]
NewDefaultProcess creates a new Process that uses the default key comparator for ordered keys.
func NewProcess ¶
func NewProcess[KeyIn, ValueIn, KeyOut, ValueOut any](config Config[KeyIn, ValueIn, KeyOut, ValueOut]) *Process[KeyIn, ValueIn, KeyOut, ValueOut]
NewProcess creates a new Process with the given configuration.
func (*Process[KeyIn, ValueIn, KeyOut, ValueOut]) Run ¶
func (process *Process[KeyIn, ValueIn, KeyOut, ValueOut]) Run()
Run starts the MapReduce task and blocks until it is finished.
If a Logger is set in the configuration, it is used to log progress.
func (*Process[KeyIn, ValueIn, KeyOut, ValueOut]) WaitToFinish ¶
func (process *Process[KeyIn, ValueIn, KeyOut, ValueOut]) WaitToFinish() Collector[KeyOut, ValueOut]
WaitToFinish blocks until the MapReduce task is finished and returns a Collector.
type Reducer ¶
type Reducer[KeyOut, ValueOut any] func(key KeyOut, values []ValueOut) ValueOut
A Reducer is a function created by the user to reduce a list of values to a single value.
It may be called multiple times for each key, until all values for that key are reduced to one.
It should be idempotent and have no side effects.
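Because a Reducer may run more than once for the same key, a reasonable reading (an assumption, not something this listing states) is that later calls can receive results of earlier calls. A reducer whose result does not depend on how the values are grouped, such as a sum, stays correct in that case. The sketch redeclares Reducer with the signature documented above; sumReducer is a hypothetical name.

```go
package main

import "fmt"

// Reducer mirrors the signature documented above.
type Reducer[KeyOut, ValueOut any] func(key KeyOut, values []ValueOut) ValueOut

// sumReducer adds up all values for a key. It reads its input only and
// touches no outside state, so repeated calls are safe.
var sumReducer Reducer[string, int] = func(_ string, values []int) int {
	total := 0
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	values := []int{1, 2, 3, 4}

	// Reducing everything in one call.
	direct := sumReducer("k", values)

	// Reducing in two rounds: first each half, then the partial results.
	left := sumReducer("k", values[:2])
	right := sumReducer("k", values[2:])
	staged := sumReducer("k", []int{left, right})

	fmt.Println(direct, staged) // prints 10 10
}
```

A reducer such as an arithmetic mean would not pass this check, since the mean of partial means generally differs from the mean of all values; carrying a sum and a count and dividing in the Finalizer avoids the problem.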