rivo

Version: v0.4.0
Published: Mar 6, 2025 License: MIT Imports: 6 Imported by: 0

README

rivo

rivo is a library for stream processing in Go.

NOTE: THIS LIBRARY IS STILL IN ACTIVE DEVELOPMENT AND IS NOT YET PRODUCTION READY.

About

rivo had two major inspirations:

  1. The book "Concurrency in Go";
  2. ReactiveX, in particular the Go and JS libraries;

Compared to these sources, rivo aims to provide better type safety (both "Concurrency in Go" and RxGo were written in the pre-generics era and make heavy use of interface{}) and a more intuitive API and developer experience (Rx is very powerful, but can be overwhelming for newcomers).

Getting started

Prerequisites

rivo requires Go 1.23 or later.

Installation
  go get github.com/agiac/rivo
Basic concepts

rivo has 3 main types, which are the building blocks of the library: Item, Stream and Pipeline.

Item is a struct which contains a value and an optional error. Just as errors are returned next to the result of a function in synchronous code, they should be passed along in asynchronous code and handled where most appropriate.

type Item[T any] struct {
	Val T
	Err error
}

Stream is a read-only channel of items. As the name suggests, it represents a stream of data.

type Stream[T any] <-chan Item[T]

Pipeline is a function that takes a context.Context and a Stream of one type and returns a Stream of the same or a different type. They represent the operations that can be performed on streams. Pipelines can be composed together to create more complex operations.

type Pipeline[T, U any] func(ctx context.Context, stream Stream[T]) Stream[U]

If a pipeline generates values without depending on an input stream, it is called a generator. If it consumes values without generating a new stream, it is called a sink. If it transforms values, it is called a transformer.
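To make the three roles concrete, the following is a minimal standalone sketch. It redeclares Item, Stream, Pipeline and None locally so that it runs without importing rivo. The names countTo, square, sumInto and sumOfSquares are hypothetical and are not part of the library.

```go
package main

import (
	"context"
	"fmt"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

type Pipeline[T, U any] func(ctx context.Context, stream Stream[T]) Stream[U]

type None struct{}

// countTo is a hypothetical generator: it ignores its input and emits 1..n.
func countTo(n int) Pipeline[None, int] {
	return func(ctx context.Context, _ Stream[None]) Stream[int] {
		out := make(chan Item[int])
		go func() {
			defer close(out)
			for i := 1; i <= n; i++ {
				select {
				case out <- Item[int]{Val: i}:
				case <-ctx.Done():
					return
				}
			}
		}()
		return out
	}
}

// square is a hypothetical transformer: it squares each value and
// forwards items that carry an error unchanged.
func square() Pipeline[int, int] {
	return func(ctx context.Context, in Stream[int]) Stream[int] {
		out := make(chan Item[int])
		go func() {
			defer close(out)
			for item := range in {
				if item.Err == nil {
					item.Val *= item.Val
				}
				select {
				case out <- item:
				case <-ctx.Done():
					return
				}
			}
		}()
		return out
	}
}

// sumInto is a hypothetical sink: it consumes every item, emits nothing,
// and closes its output once the input is exhausted.
func sumInto(total *int) Pipeline[int, None] {
	return func(ctx context.Context, in Stream[int]) Stream[None] {
		out := make(chan Item[None])
		go func() {
			defer close(out)
			for item := range in {
				if item.Err == nil {
					*total += item.Val
				}
			}
		}()
		return out
	}
}

// sumOfSquares wires generator -> transformer -> sink by hand.
func sumOfSquares(n int) int {
	ctx := context.Background()
	total := 0
	<-sumInto(&total)(ctx, square()(ctx, countTo(n)(ctx, nil)))
	return total
}

func main() {
	fmt.Println(sumOfSquares(3)) // 1 + 4 + 9 = 14
}
```

Each stage owns its output channel and closes it when done, which is why a single receive from the sink's output is enough to wait for the whole chain.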

Here's a basic example:

package main

import (
	"context"
	"fmt"
	"github.com/agiac/rivo"
)

func main() {
	ctx := context.Background()

	// `Of` returns a generator which returns a stream that will emit the provided values
	in := rivo.Of(1, 2, 3, 4, 5)

	// `Filter` returns a transformer that filters the input stream using the given function.
	onlyEven := rivo.Filter(func(ctx context.Context, i rivo.Item[int]) (bool, error) {
		// Always check for errors
		if i.Err != nil {
			return true, i.Err // Propagate the error
		}

		return i.Val%2 == 0, nil
	})

	// `Do` returns a sink that applies the given function to each item in the input stream, without emitting any values.
	log := rivo.Do(func(ctx context.Context, i rivo.Item[int]) {
		if i.Err != nil {
			fmt.Printf("ERROR: %v\n", i.Err)
			return
		}

		fmt.Println(i.Val)
	})

	// `Pipe3` composes three pipelines together, returning a new pipeline.
	p := rivo.Pipe3(in, onlyEven, log)

	// By passing a context and an input channel to our pipeline, we can get the output stream.
	// Since our first pipeline `in` is a generator and does not depend on an input stream, we can pass a nil channel.
	// Also, since log is a sink, we only have to read once from the output channel to know that the pipeline has finished.
	<-p(ctx, nil)

	// Expected output:
	// 2
	// 4
}

Pipeline factories

rivo comes with a set of built-in pipeline factories.

Generators
  • Of: returns a pipeline which returns a stream that will emit the provided values;
  • FromFunc: returns a pipeline which returns a stream that will emit the values returned by the provided function;
  • FromSeq and FromSeq2: return a pipeline which returns a stream that will emit the values from the provided iterator;
  • Tee and TeeN: return two (Tee) or n (TeeN) pipelines that each receive a copy of each item from the input stream;
  • Segregate: returns two pipelines, where the first pipeline emits items that pass the predicate, and the second pipeline emits items that do not pass the predicate;
  • SegregateErrors: returns two pipelines, where the first pipeline emits items without errors, and the second pipeline emits items with errors;
Sinks
  • Do: returns a pipeline which performs a side effect for each item in the input stream;
Transformers
  • Filter: returns a pipeline which filters the input stream using the given function;
  • Map: returns a pipeline which maps the input stream using the given function;
  • Batch: returns a pipeline which groups the input stream into batches of the provided size;
  • Flatten: returns a pipeline which flattens the input stream of slices;
  • Pipe, Pipe2, Pipe3, Pipe4, Pipe5: return a pipeline which composes the provided pipelines together;
  • Connect: returns a sink which applies the given sinks to the input stream concurrently;

Besides these, the subpackages of the library contain more specialized pipeline factories.

Package rivo/io
  • FromReader: returns a pipeline which reads from the provided io.Reader and emits the read bytes;
  • ToWriter: returns a pipeline which writes the input stream to the provided io.Writer;
Package rivo/bufio
  • FromScanner: returns a pipeline which reads from the provided bufio.Scanner and emits the scanned items;
  • ToScanner: returns a pipeline which writes the input stream to the provided bufio.Writer;
Package rivo/csv
  • FromReader: returns a pipeline which reads from the provided csv.Reader and emits the read records;
  • ToWriter: returns a pipeline which writes the input stream to the provided csv.Writer;
Package rivo/aws/dynamodb
  • Scan
  • ScanItems
  • BatchWrite
  • BatchPutItems

Error handling

As mentioned, each item contains a value and an optional error. You can handle errors either individually inside the callbacks of pipelines like Map or Do, or (recommended) by creating dedicated pipelines for error handling. See examples/errorHanlidng for an example.
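As an illustration of the "dedicated pipeline" approach, here is a standalone sketch of a transformer that hands every error to a callback and forwards only the successful items. The types are redeclared locally, and handleErrors, parseAll and parseValid are hypothetical names rather than rivo functions.

```go
package main

import (
	"context"
	"fmt"
	"strconv"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

type Pipeline[T, U any] func(ctx context.Context, stream Stream[T]) Stream[U]

// handleErrors is a hypothetical transformer: it passes the error of every
// failed item to handle and forwards only the items without an error.
func handleErrors[T any](handle func(error)) Pipeline[T, T] {
	return func(ctx context.Context, in Stream[T]) Stream[T] {
		out := make(chan Item[T])
		go func() {
			defer close(out)
			for item := range in {
				if item.Err != nil {
					handle(item.Err)
					continue
				}
				select {
				case out <- item:
				case <-ctx.Done():
					return
				}
			}
		}()
		return out
	}
}

// parseAll emits one item per input string, carrying the parse error if any.
func parseAll(raw ...string) Stream[int] {
	out := make(chan Item[int], len(raw))
	for _, s := range raw {
		n, err := strconv.Atoi(s)
		out <- Item[int]{Val: n, Err: err}
	}
	close(out)
	return out
}

// parseValid returns the parsed values and the number of errors handled.
func parseValid(raw ...string) ([]int, int) {
	ctx := context.Background()
	failures := 0
	clean := handleErrors[int](func(error) { failures++ })

	var vals []int
	for item := range clean(ctx, parseAll(raw...)) {
		vals = append(vals, item.Val)
	}
	return vals, failures
}

func main() {
	vals, failures := parseValid("1", "x", "3")
	fmt.Println(vals, failures) // [1 3] 1
}
```

Stages placed after such a transformer no longer need to check Err themselves, which keeps their callbacks focused on the happy path.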

Examples

More examples can be found in the examples folder.


Contributing

Contributions are welcome! If you have any ideas, suggestions or bug reports, please open an issue or a pull request.

Roadmap

  • Review docs, in particular where "pipeline" is used instead of "generator", "sink" or "transformer"
  • Add more pipelines, also using the RxJS list of operators as a reference:
    • Tap
    • Time-based
    • SQL
    • AWS
    • ...
  • Add more utilities
    • Merge
  • Add more examples

License

rivo is licensed under the MIT license. See the LICENSE file for details.

Documentation

Overview

Package rivo is a library for stream processing.

Constants

This section is empty.

Variables

var ErrEOS = errors.New("end of stream")

Functions

func Segregate

func Segregate[T any](p Pipeline[None, T], predicate func(item Item[T]) bool) (Pipeline[None, T], Pipeline[None, T])

Segregate returns two pipelines, where the first pipeline emits items that pass the predicate, and the second pipeline emits items that do not pass the predicate.

Example
package main

import (
	"context"
	"fmt"
	"strconv"

	"github.com/agiac/rivo"
)

func main() {
	ctx := context.Background()

	g := rivo.Of("1", "2", "3", "4", "5")

	toInt := rivo.Map(func(ctx context.Context, i rivo.Item[string]) (int, error) {
		return strconv.Atoi(i.Val)
	})

	p := rivo.Pipe(g, toInt)

	even, odd := rivo.Segregate(p, func(item rivo.Item[int]) bool {
		return item.Val%2 == 0
	})

	evens := make([]int, 0)
	odds := make([]int, 0)

	<-rivo.Connect(
		rivo.Pipe(even, rivo.Do(func(ctx context.Context, i rivo.Item[int]) {
			evens = append(evens, i.Val)
		})),
		rivo.Pipe(odd, rivo.Do(func(ctx context.Context, i rivo.Item[int]) {
			odds = append(odds, i.Val)
		})),
	)(ctx, nil)

	for _, i := range append(evens, odds...) {
		fmt.Println(i)
	}

}
Output:

2
4
1
3
5

func SegregateErrors added in v0.4.0

func SegregateErrors[T any](p Pipeline[None, T]) (Pipeline[None, T], Pipeline[None, T])

SegregateErrors returns two pipelines, where the first pipeline emits items without errors, and the second pipeline emits items with errors.
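The idea of routing items by their error can be sketched at the stream level. The code below is a standalone illustration with locally redeclared types. Unlike SegregateErrors it works on a Stream rather than on a Pipeline, and splitErrors and countSplit are hypothetical names, so it should not be read as the library's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

// splitErrors is a hypothetical helper: items without an error go to the
// first stream, items with an error go to the second one.
func splitErrors[T any](ctx context.Context, in Stream[T]) (Stream[T], Stream[T]) {
	ok := make(chan Item[T])
	failed := make(chan Item[T])
	go func() {
		defer close(ok)
		defer close(failed)
		for item := range in {
			dst := ok
			if item.Err != nil {
				dst = failed
			}
			select {
			case dst <- item:
			case <-ctx.Done():
				return
			}
		}
	}()
	return ok, failed
}

// countSplit feeds three items (one of them failed) through splitErrors and
// counts what arrives on each side. Both outputs are drained concurrently,
// because the unbuffered channels would otherwise block each other.
func countSplit() (good, bad int) {
	ctx := context.Background()
	items := []Item[int]{{Val: 1}, {Err: errors.New("boom")}, {Val: 3}}
	in := make(chan Item[int], len(items))
	for _, it := range items {
		in <- it
	}
	close(in)

	ok, failed := splitErrors[int](ctx, in)
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for range ok {
			good++
		}
	}()
	go func() {
		defer wg.Done()
		for range failed {
			bad++
		}
	}()
	wg.Wait()
	return good, bad
}

func main() {
	fmt.Println(countSplit()) // 2 1
}
```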

func Tee

func Tee[T any](p Pipeline[None, T]) (Pipeline[None, T], Pipeline[None, T])

Tee returns two pipelines that each receive a copy of each item from the input stream.

Example
package main

import (
	"context"
	"fmt"
	"sync"

	"github.com/agiac/rivo"
)

func main() {
	ctx := context.Background()

	g := rivo.Of("hello", "hello", "hello")

	out1, out2 := rivo.Tee(g)

	wg := sync.WaitGroup{}
	wg.Add(2)

	go func() {
		defer wg.Done()
		for i := range out1(ctx, nil) {
			fmt.Println(i.Val)
		}
	}()

	go func() {
		defer wg.Done()
		for i := range out2(ctx, nil) {
			fmt.Println(i.Val)
		}
	}()

	wg.Wait()

}
Output:

hello
hello
hello
hello
hello
hello

Types

type BatchOption added in v0.4.0

type BatchOption func(*batchOptions) error

func BatchBufferSize added in v0.4.0

func BatchBufferSize(n int) BatchOption

func BatchMaxWait added in v0.4.0

func BatchMaxWait(d time.Duration) BatchOption
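The option types in this package follow Go's functional options pattern: each option is a function that mutates a private settings struct and may return an error. The sketch below shows how such options could be defined and applied. The fields of batchOptions, the default values, the validation rules and the helper applyBatchOptions are all assumptions made for illustration, since the real struct is unexported and not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// batchOptions is a hypothetical settings struct (fields are assumed).
type batchOptions struct {
	bufferSize int
	maxWait    time.Duration
}

// BatchOption has the same shape as the documented type.
type BatchOption func(*batchOptions) error

// BatchBufferSize is a stand-in with an assumed validation rule.
func BatchBufferSize(n int) BatchOption {
	return func(o *batchOptions) error {
		if n < 0 {
			return errors.New("buffer size must be non-negative")
		}
		o.bufferSize = n
		return nil
	}
}

// BatchMaxWait is a stand-in with an assumed validation rule.
func BatchMaxWait(d time.Duration) BatchOption {
	return func(o *batchOptions) error {
		if d <= 0 {
			return errors.New("max wait must be positive")
		}
		o.maxWait = d
		return nil
	}
}

// applyBatchOptions starts from assumed defaults and applies each option in
// order, stopping at the first error.
func applyBatchOptions(opts ...BatchOption) (batchOptions, error) {
	o := batchOptions{bufferSize: 0, maxWait: time.Second}
	for _, opt := range opts {
		if err := opt(&o); err != nil {
			return batchOptions{}, err
		}
	}
	return o, nil
}

func main() {
	o, err := applyBatchOptions(BatchBufferSize(8), BatchMaxWait(50*time.Millisecond))
	fmt.Println(o.bufferSize, o.maxWait, err) // 8 50ms <nil>
}
```

The pattern lets a factory such as Batch keep a short required signature while new settings are added later without breaking existing callers.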

type DoOption added in v0.4.0

type DoOption func(*doOptions) error

func DoPoolSize added in v0.4.0

func DoPoolSize(n int) DoOption

type ErrorStream added in v0.4.0

type ErrorStream = Stream[struct{}]

type FilterOption added in v0.4.0

type FilterOption func(*filterOptions) error

func FilterBufferSize added in v0.4.0

func FilterBufferSize(n int) FilterOption

func FilterPoolSize added in v0.4.0

func FilterPoolSize(n int) FilterOption

type FromFuncOption added in v0.4.0

type FromFuncOption func(*fromFuncOptions) error

func FromFuncBufferSize added in v0.4.0

func FromFuncBufferSize(bufferSize int) FromFuncOption

func FromFuncOnBeforeClose added in v0.4.0

func FromFuncOnBeforeClose(f func(context.Context) error) FromFuncOption

func FromFuncPoolSize added in v0.4.0

func FromFuncPoolSize(poolSize int) FromFuncOption

type FromSeq2Value

type FromSeq2Value[T, U any] struct {
	Val1 T
	Val2 U
}

type Item

type Item[T any] struct {
	// Val is the value of the item when there is no error.
	Val T
	// Err is the optional error of the item.
	Err error
}

Item represents a single item in a data stream. It contains a value of type T and an optional error.

func Collect

func Collect[T any](in Stream[T]) []Item[T]

Collect collects all items from the stream and returns them as a slice.

Example
ctx := context.Background()

s := Of(1, 2, 3, 4, 5)(ctx, nil)

for _, item := range Collect(s) {
	fmt.Println(item.Val)
}
Output:

1
2
3
4
5

func CollectWithContext

func CollectWithContext[T any](ctx context.Context, in Stream[T]) []Item[T]

CollectWithContext collects all items from the stream and returns them as a slice. If the context is cancelled, it stops collecting items.
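A context-aware collect loop can be sketched with a select over the context and the stream. The version below is a standalone illustration with locally redeclared types. The lowercase collectWithContext and the two helper functions are hypothetical, and the library's exact behaviour on cancellation may differ.

```go
package main

import (
	"context"
	"fmt"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

// collectWithContext gathers items until the stream is closed or the
// context is done, whichever happens first.
func collectWithContext[T any](ctx context.Context, in Stream[T]) []Item[T] {
	var items []Item[T]
	for {
		select {
		case <-ctx.Done():
			return items
		case item, ok := <-in:
			if !ok {
				return items
			}
			items = append(items, item)
		}
	}
}

// collectClosed collects from a stream that is closed after three items.
func collectClosed() int {
	in := make(chan Item[int], 3)
	for i := 1; i <= 3; i++ {
		in <- Item[int]{Val: i}
	}
	close(in)
	return len(collectWithContext[int](context.Background(), in))
}

// collectCancelled collects from a stream that never produces anything,
// using a context that is already cancelled.
func collectCancelled() int {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	in := make(chan Item[int]) // never written to, never closed
	return len(collectWithContext[int](ctx, in))
}

func main() {
	fmt.Println(collectClosed(), collectCancelled()) // 3 0
}
```

Without the context case, the second call would block forever, which is the situation a context-aware variant is meant to avoid.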

type MapOption added in v0.4.0

type MapOption func(*mapOptions) error

func MapBufferSize added in v0.4.0

func MapBufferSize(bufferSize int) MapOption

func MapPoolSize added in v0.4.0

func MapPoolSize(poolSize int) MapOption

type None

type None struct{}

None is a type that represents no value. It is typically used as the input type of a generator pipeline, which does not depend on any input stream, or as the output type of a sink pipeline, which does not emit any items.

type Pipeline added in v0.3.0

type Pipeline[T, U any] func(ctx context.Context, stream Stream[T]) Stream[U]

Pipeline is a function that takes a context and a stream and returns a stream of the same type or a different type.

func Batch added in v0.1.0

func Batch[T any](n int, opt ...BatchOption) Pipeline[T, []T]

Batch returns a Pipeline that batches items from the input Stream into slices of n items. If a batch is not full after the maximum wait time (see BatchMaxWait), it is sent anyway. Any error in the input Stream is propagated to the output Stream immediately.

Example
ctx := context.Background()

in := Of(1, 2, 3, 4, 5)

b := Batch[int](2)

p := Pipe(in, b)

for item := range p(ctx, nil) {
	fmt.Printf("%v\n", item.Val)
}
Output:

[1 2]
[3 4]
[5]
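The interplay between batch size and maximum wait can be sketched with a timer and a select loop. The code below is a standalone approximation with locally redeclared types, not rivo's implementation. The function batch and its parameters are hypothetical, n is assumed to be positive, and the wait is measured from the last flush.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

// batch groups values into slices of up to n items. A partial batch is
// flushed when maxWait elapses or when the input closes. Items carrying an
// error are forwarded immediately, outside of any batch.
func batch[T any](ctx context.Context, in Stream[T], n int, maxWait time.Duration) Stream[[]T] {
	out := make(chan Item[[]T])
	go func() {
		defer close(out)
		buf := make([]T, 0, n)
		timer := time.NewTimer(maxWait)
		defer timer.Stop()

		// send delivers one item downstream unless the context ends first.
		send := func(item Item[[]T]) bool {
			select {
			case out <- item:
				return true
			case <-ctx.Done():
				return false
			}
		}
		// flush restarts the wait and emits the pending batch, if any.
		flush := func() bool {
			timer.Reset(maxWait)
			if len(buf) == 0 {
				return true
			}
			full := buf
			buf = make([]T, 0, n)
			return send(Item[[]T]{Val: full})
		}

		for {
			select {
			case <-ctx.Done():
				return
			case <-timer.C: // waited long enough: send what we have
				if !flush() {
					return
				}
			case item, ok := <-in:
				switch {
				case !ok: // input closed: send the remainder and stop
					flush()
					return
				case item.Err != nil: // errors bypass the batch
					if !send(Item[[]T]{Err: item.Err}) {
						return
					}
				default:
					buf = append(buf, item.Val)
					if len(buf) == n && !flush() {
						return
					}
				}
			}
		}
	}()
	return out
}

// batchSizes batches 1..5 in pairs with a wait long enough never to fire.
func batchSizes() []int {
	in := make(chan Item[int], 5)
	for i := 1; i <= 5; i++ {
		in <- Item[int]{Val: i}
	}
	close(in)

	var sizes []int
	for b := range batch[int](context.Background(), in, 2, time.Hour) {
		sizes = append(sizes, len(b.Val))
	}
	return sizes
}

func main() {
	fmt.Println(batchSizes()) // [2 2 1]
}
```

Calling timer.Reset on a timer that may already have fired relies on the timer semantics of Go 1.23 and later, which matches the minimum version stated in the prerequisites.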

func Connect

func Connect[A any](pp ...Pipeline[A, None]) Pipeline[A, None]

Connect returns a sink pipeline that applies the given sink pipelines to the input stream concurrently. The output stream will not emit any items, and it will be closed when the input stream is closed or the context is done.

Example
package main

import (
	"context"
	"fmt"
	"strings"

	"github.com/agiac/rivo"
)

func main() {
	ctx := context.Background()

	g := rivo.Of("Hello", "Hello", "Hello")

	capitalize := rivo.Map(func(ctx context.Context, i rivo.Item[string]) (string, error) {
		return strings.ToUpper(i.Val), nil
	})

	lowercase := rivo.Map(func(ctx context.Context, i rivo.Item[string]) (string, error) {
		return strings.ToLower(i.Val), nil
	})

	resA := make([]string, 0)
	a := rivo.Do(func(ctx context.Context, i rivo.Item[string]) {
		resA = append(resA, i.Val)
	})

	resB := make([]string, 0)
	b := rivo.Do(func(ctx context.Context, i rivo.Item[string]) {
		resB = append(resB, i.Val)
	})

	p1 := rivo.Pipe(capitalize, a)
	p2 := rivo.Pipe(lowercase, b)

	<-rivo.Connect(p1, p2)(ctx, g(ctx, nil))

	for _, s := range resA {
		fmt.Println(s)
	}

	for _, s := range resB {
		fmt.Println(s)
	}

}
Output:

HELLO
HELLO
HELLO
hello
hello
hello

func Do

func Do[T any](f func(context.Context, Item[T]), opt ...DoOption) Pipeline[T, None]

Do returns a sink pipeline that applies the given function to each item in the stream. The output stream will not emit any items, and it will be closed when the input stream is closed or the context is done.

Example
ctx := context.Background()

in := make(chan Item[int])
go func() {
	defer close(in)
	in <- Item[int]{Val: 1}
	in <- Item[int]{Val: 2}
	in <- Item[int]{Err: errors.New("error 1")}
	in <- Item[int]{Val: 4}
	in <- Item[int]{Err: errors.New("error 2")}
}()

d := Do(func(ctx context.Context, i Item[int]) {
	if i.Err != nil {
		fmt.Printf("ERROR: %v\n", i.Err)
	}
})

<-d(ctx, in)
Output:

ERROR: error 1
ERROR: error 2

func Filter

func Filter[T any](f func(context.Context, Item[T]) (bool, error), opt ...FilterOption) Pipeline[T, T]

Filter returns a pipeline that filters the input stream using the given function.

Example
ctx := context.Background()

in := Of(1, 2, 3, 4, 5)

onlyEven := Filter(func(ctx context.Context, i Item[int]) (bool, error) {
	// Always check for errors
	if i.Err != nil {
		return true, i.Err // Propagate the error
	}

	return i.Val%2 == 0, nil
})

p := Pipe(in, onlyEven)

s := p(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

2
4

func Flatten added in v0.2.0

func Flatten[T any]() Pipeline[[]T, T]

Flatten returns a Pipeline that flattens a Stream of slices into a Stream of individual items.

Example
ctx := context.Background()

in := Of([]int{1, 2}, []int{3, 4}, []int{5})

f := Flatten[int]()

p := Pipe(in, f)

for item := range p(ctx, nil) {
	fmt.Printf("%v\n", item.Val)
}
Output:

1
2
3
4
5

func FromFunc

func FromFunc[T any](f func(context.Context) (T, error), options ...FromFuncOption) Pipeline[None, T]

FromFunc returns a generator Pipeline that emits items generated by the given function. The input stream is ignored. The returned stream will emit items until the function returns ErrEOS.

Example
ctx := context.Background()

count := atomic.Int32{}

genFn := func(ctx context.Context) (int32, error) {
	value := count.Add(1)

	if value > 5 {
		return 0, ErrEOS
	}

	return value, nil
}

in := FromFunc(genFn)

s := in(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

1
2
3
4
5

func FromSeq

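func FromSeq[T any](seq iter.Seq[T]) Pipeline[None, T]

FromSeq returns a generator Pipeline that emits the values from the provided iterator. The input stream is ignored.
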
Example
ctx := context.Background()

seq := slices.Values([]int{1, 2, 3, 4, 5})
in := FromSeq(seq)

s := in(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

1
2
3
4
5

func FromSeq2

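func FromSeq2[T, U any](seq iter.Seq2[T, U]) Pipeline[None, FromSeq2Value[T, U]]

FromSeq2 returns a generator Pipeline that emits a FromSeq2Value for each pair of values produced by the provided iterator. The input stream is ignored.
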
Example
ctx := context.Background()

seq := slices.All([]string{"a", "b", "c", "d", "e"})

in := FromSeq2(seq)

s := in(ctx, nil)

for item := range s {
	fmt.Printf("%d, %s\n", item.Val.Val1, item.Val.Val2)
}
Output:

0, a
1, b
2, c
3, d
4, e

func Map

func Map[T, U any](f func(context.Context, Item[T]) (U, error), opt ...MapOption) Pipeline[T, U]

Map returns a pipeline that applies a function to each item from the input stream.

Example
ctx := context.Background()

in := Of(1, 2, 3, 4, 5)

double := Map(func(ctx context.Context, i Item[int]) (int, error) {
	// Always check for errors
	if i.Err != nil {
		return 0, i.Err // Propagate the error
	}

	return i.Val * 2, nil
})

p := Pipe(in, double)

s := p(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

2
4
6
8
10

func Of

func Of[T any](items ...T) Pipeline[None, T]

Of returns a generator Pipeline that emits the given items. The input stream is ignored.

Example
ctx := context.Background()

in := Of(1, 2, 3, 4, 5)

s := in(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

1
2
3
4
5

func Pipe

func Pipe[A, B, C any](a Pipeline[A, B], b Pipeline[B, C]) Pipeline[A, C]

Pipe pipes two pipelines together. It is a convenience function that calls Pipe2.

Example
ctx := context.Background()

a := Of(1, 2, 3, 4, 5)

b := Map(func(ctx context.Context, i Item[int]) (int, error) {
	return i.Val + 1, nil
})

p := Pipe(a, b)

s := p(ctx, nil)

for item := range s {
	fmt.Println(item.Val)
}
Output:

2
3
4
5
6

func Pipe2

func Pipe2[A, B, C any](a Pipeline[A, B], b Pipeline[B, C]) Pipeline[A, C]

Pipe2 pipes two pipelines together.

func Pipe3

func Pipe3[A, B, C, D any](a Pipeline[A, B], b Pipeline[B, C], c Pipeline[C, D]) Pipeline[A, D]

Pipe3 pipes three pipelines together.

func Pipe4

func Pipe4[A, B, C, D, E any](a Pipeline[A, B], b Pipeline[B, C], c Pipeline[C, D], d Pipeline[D, E]) Pipeline[A, E]

Pipe4 pipes four pipelines together.

func Pipe5

func Pipe5[A, B, C, D, E, F any](a Pipeline[A, B], b Pipeline[B, C], c Pipeline[C, D], d Pipeline[D, E], e Pipeline[E, F]) Pipeline[A, F]

Pipe5 pipes five pipelines together.
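Because a Pipeline is just a function from Stream to Stream, composition can be pictured as ordinary function composition. The sketch below is a standalone illustration with locally redeclared types. The helpers pipe2, pipe3, stage and runPipe3 are hypothetical, and the library's own Pipe functions may be implemented differently.

```go
package main

import (
	"context"
	"fmt"
	"strconv"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

type Pipeline[T, U any] func(ctx context.Context, stream Stream[T]) Stream[U]

// pipe2 feeds the output stream of a into b.
func pipe2[A, B, C any](a Pipeline[A, B], b Pipeline[B, C]) Pipeline[A, C] {
	return func(ctx context.Context, in Stream[A]) Stream[C] {
		return b(ctx, a(ctx, in))
	}
}

// pipe3 is built from two applications of pipe2.
func pipe3[A, B, C, D any](a Pipeline[A, B], b Pipeline[B, C], c Pipeline[C, D]) Pipeline[A, D] {
	return pipe2(pipe2(a, b), c)
}

// stage lifts a plain function into a transformer; items that carry an
// error are forwarded with the error and a zero value.
func stage[T, U any](f func(T) U) Pipeline[T, U] {
	return func(ctx context.Context, in Stream[T]) Stream[U] {
		out := make(chan Item[U])
		go func() {
			defer close(out)
			for item := range in {
				next := Item[U]{Err: item.Err}
				if item.Err == nil {
					next.Val = f(item.Val)
				}
				select {
				case out <- next:
				case <-ctx.Done():
					return
				}
			}
		}()
		return out
	}
}

// runPipe3 sends 1, 2, 3 through "add one", "double" and "to string".
func runPipe3() []string {
	in := make(chan Item[int], 3)
	for i := 1; i <= 3; i++ {
		in <- Item[int]{Val: i}
	}
	close(in)

	p := pipe3(
		stage(func(i int) int { return i + 1 }),
		stage(func(i int) int { return i * 2 }),
		stage(strconv.Itoa),
	)

	var got []string
	for item := range p(context.Background(), in) {
		got = append(got, item.Val)
	}
	return got
}

func main() {
	fmt.Println(runPipe3()) // [4 6 8]
}
```

The type parameters make the compiler check that the output type of each stage matches the input type of the next one, which is the type safety the README refers to.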

func TeeN

func TeeN[T any](p Pipeline[None, T], n int) []Pipeline[None, T]

TeeN returns n pipelines that each receive a copy of each item from the input stream.
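Copying every item to n consumers can be sketched at the stream level as follows. This is a standalone illustration with locally redeclared types. Unlike TeeN it takes a Stream rather than a Pipeline, and teeN and teeSums are hypothetical names, so it should not be read as the library's implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

// teeN is a hypothetical helper that copies every item of in to n output
// streams. Outputs are written one after another, so a slow consumer delays
// the others; all outputs must be read concurrently.
func teeN[T any](ctx context.Context, in Stream[T], n int) []Stream[T] {
	chans := make([]chan Item[T], n)
	outs := make([]Stream[T], n)
	for i := range chans {
		chans[i] = make(chan Item[T])
		outs[i] = chans[i]
	}
	go func() {
		defer func() {
			for _, c := range chans {
				close(c)
			}
		}()
		for item := range in {
			for _, c := range chans {
				select {
				case c <- item:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return outs
}

// teeSums copies 1, 2, 3 to three consumers and returns the sum each saw.
func teeSums() []int {
	ctx := context.Background()
	in := make(chan Item[int], 3)
	for i := 1; i <= 3; i++ {
		in <- Item[int]{Val: i}
	}
	close(in)

	outs := teeN[int](ctx, in, 3)
	sums := make([]int, len(outs))
	var wg sync.WaitGroup
	for i, out := range outs {
		wg.Add(1)
		go func(i int, out Stream[int]) {
			defer wg.Done()
			for item := range out {
				sums[i] += item.Val
			}
		}(i, out)
	}
	wg.Wait()
	return sums
}

func main() {
	fmt.Println(teeSums()) // [6 6 6]
}
```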

type Stream

type Stream[T any] <-chan Item[T]

Stream represents a data stream of items. It is a read-only channel of Item[T].

func OrDone

func OrDone[T any](ctx context.Context, in Stream[T]) Stream[T]

OrDone is a utility function that returns a channel that will be closed when the context is done.
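The classic "or-done" pattern from "Concurrency in Go" wraps a channel so that a plain range loop also stops on cancellation. The sketch below shows that pattern on locally redeclared types. The lowercase orDone and the two helpers are hypothetical, and whether the library forwards items in exactly this way is an assumption.

```go
package main

import (
	"context"
	"fmt"
)

// Local copies of the core types, so the sketch runs without importing rivo.
type Item[T any] struct {
	Val T
	Err error
}

type Stream[T any] <-chan Item[T]

// orDone forwards items from in and closes its output when in is closed or
// the context is done, whichever happens first.
func orDone[T any](ctx context.Context, in Stream[T]) Stream[T] {
	out := make(chan Item[T])
	go func() {
		defer close(out)
		for {
			select {
			case <-ctx.Done():
				return
			case item, ok := <-in:
				if !ok {
					return
				}
				select {
				case out <- item:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out
}

// forwardAll counts the items that pass through when nothing is cancelled.
func forwardAll() int {
	in := make(chan Item[int], 3)
	for i := 1; i <= 3; i++ {
		in <- Item[int]{Val: i}
	}
	close(in)

	n := 0
	for range orDone[int](context.Background(), in) {
		n++
	}
	return n
}

// stopOnCancel ranges over a stream that never closes; the loop still ends
// because the context is already cancelled.
func stopOnCancel() int {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	in := make(chan Item[int]) // never written to, never closed

	n := 0
	for range orDone[int](ctx, in) {
		n++
	}
	return n
}

func main() {
	fmt.Println(forwardAll(), stopOnCancel()) // 3 0
}
```

With such a wrapper, consumers can keep a simple for-range loop instead of repeating the select on ctx.Done at every read.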
