minimus

package module

v0.0.0-...-28f2576 Published: Oct 11, 2020 License: Apache-2.0 Imports: 8 Imported by: 0

Repository

github.com/cowdude/minimus-encoder

README ¶

minimus-encoder

Project description

Data stream encoder/decoder for compressing sequences of float/integer values to bits.

It focuses on encoding vectors of elements, i.e. interleaved series.

Heavily inspired by Facebook's Gorilla TSDB paper.
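The core idea borrowed from the Gorilla paper is to XOR each value with its predecessor and store only the bits that changed. The sketch below illustrates that idea on one pair of vectors, comparing each column with the same column of the previous vector. It is a standalone illustration of the technique, not this package's code; `xorDelta` and `meaningfulBits` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorDelta XORs the IEEE 754 bit patterns of two consecutive values.
// Values that are close together share sign, exponent and high mantissa
// bits, so the result tends to start with a long run of zeros.
func xorDelta(prev, cur float64) uint64 {
	return math.Float64bits(prev) ^ math.Float64bits(cur)
}

// meaningfulBits returns how many bits remain once the leading and trailing
// zeros of an XOR delta are stripped (0 when both values are identical).
func meaningfulBits(delta uint64) int {
	if delta == 0 {
		return 0
	}
	return 64 - bits.LeadingZeros64(delta) - bits.TrailingZeros64(delta)
}

func main() {
	// Two vectors of span 2 (hypothetical sample data).
	prev := []float64{60.5, 1.0}
	cur := []float64{60.75, 1.0}
	for i := range cur {
		d := xorDelta(prev[i], cur[i])
		fmt.Printf("column %d: delta=%064b meaningful=%d\n", i, d, meaningfulBits(d))
	}
}
```

An unchanged column yields a zero delta, which is why slowly varying series compress well under this scheme.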

Ships with a lossy float64 transform function that allows more efficient (but lossy) storage. The module can also be used for lossless encoding.

Thanks to icza for the bitio library, which is heavily used in this project.

Current status

The project is quite young and could definitely use more testing and eventually some optimizations. I am currently using it to store large amounts of fixed-interval time-series on the cloud.

There is currently an issue with values that rapidly oscillate around zero (i.e. that flip the float64 sign bit too often). If you would like to use this project, it is best to apply a bias to your inputs so that the sign bit stays stable.
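A minimal sketch of the bias workaround, assuming every input satisfies |v| < 1 so that a constant offset of 1.0 keeps all values positive. `signFlips` and `addBias` are hypothetical helpers written for this illustration; they are not part of the package.

```go
package main

import (
	"fmt"
	"math"
)

// signFlips counts how often the float64 sign bit changes between
// consecutive values of a series.
func signFlips(series []float64) int {
	flips := 0
	for i := 1; i < len(series); i++ {
		if math.Signbit(series[i]) != math.Signbit(series[i-1]) {
			flips++
		}
	}
	return flips
}

// addBias returns a copy of series shifted by a constant offset.
// The same offset has to be subtracted again after decoding.
func addBias(series []float64, bias float64) []float64 {
	out := make([]float64, len(series))
	for i, v := range series {
		out[i] = v + bias
	}
	return out
}

func main() {
	raw := []float64{0.02, -0.01, 0.03, -0.04, 0.01}
	biased := addBias(raw, 1.0) // assumption: |v| < 1 for every input
	fmt.Println("sign flips, raw:   ", signFlips(raw))
	fmt.Println("sign flips, biased:", signFlips(biased))
}
```

Note that adding and later subtracting a bias rounds in floating point, so the round trip is not guaranteed to be bit-exact even with lossless encoding.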

Pull requests and suggestions are welcome, feel free to open an issue.

Example

See the example directory. The program compresses and decompresses an arbitrary sequence of vectors at different loss levels. It then prints both the input and output sequences and displays the mean number of compressed bits per sample.

go run ./example

        0: 60.4700  0.0163  1.0000  1.0000  => 60.3750  0.0156  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0000  0.0078  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.3750  0.0117  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0000  0.0156 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.0625  2.1250  3.2500  4.3750
|e|=1e-01:  6.999 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0162  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0499  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0699  0.0216 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
|e|=1e-04: 11.838 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0163  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0500  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0700  0.0217 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
|e|=1e-07: 16.819 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0163  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0500  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0700  0.0217 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
|e|=1e-10: 21.762 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0163  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0500  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0700  0.0217 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
|e|=1e-13: 26.543 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0163  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0500  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0700  0.0217 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
|e|=1e-16: 29.764 b/sample

        0: 60.4700  0.0163  1.0000  1.0000  => 60.4700  0.0163  1.0000  1.0000
        1: 94.0500  0.0105  1.0000  1.0000  => 94.0500  0.0105  1.0000  1.0000
        2: 66.4600  0.0148  1.0000  1.0000  => 66.4600  0.0148  1.0000  1.0000
      999: 45.0700  0.0217 32.0000 77.0000  => 45.0700  0.0217 32.0000 77.0000
     1000:  1.1111  2.2222  3.3333  4.4444  =>  1.1111  2.2222  3.3333  4.4444
(lossless)    30.078 b/sample

Documentation ¶

Index ¶

type BitHint
- func LossyFloat64(n float64, maxAbsError float64, hint BitHint) (float64, BitHint)
type Decoder
- func NewDecoder(src io.Reader, span int) *Decoder
type Encoder
- func NewEncoder(dst io.Writer, span int) *Encoder
type Vec64
- func (v Vec64) Float64() []float64
- func (v Vec64) Uint64() []uint64
type VecPool
- func NewVecPool(span int) *VecPool
- func (p *VecPool) Get() Vec64
- func (p *VecPool) Put(v Vec64)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type BitHint ¶

type BitHint uint8

func LossyFloat64 ¶

func LossyFloat64(n float64, maxAbsError float64, hint BitHint) (float64, BitHint)

LossyFloat64 transforms a float64 into a compress-friendly approximation.

The function guarantees that abs(result - n) < maxAbsError.

Under the hood, the function zeroes out as many least-significant bits as the error bound allows.
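The bit-zeroing idea can be sketched as follows. This is an illustration under stated assumptions rather than the package's implementation: `truncateMantissa` is a hypothetical helper, it ignores the BitHint parameter entirely, and it simply tries the widest mask first.

```go
package main

import (
	"fmt"
	"math"
)

// truncateMantissa clears low-order mantissa bits of n, choosing the largest
// count that keeps the result strictly within maxAbsError of n.
// It returns the approximation and the number of bits that were cleared.
func truncateMantissa(n, maxAbsError float64) (result float64, cleared int) {
	raw := math.Float64bits(n)
	// A float64 has 52 explicit mantissa bits; try the widest mask first.
	for k := 52; k > 0; k-- {
		mask := ^uint64(0) << uint(k)
		candidate := math.Float64frombits(raw & mask)
		if math.Abs(candidate-n) < maxAbsError {
			return candidate, k
		}
	}
	// No truncation fits the bound: keep the value unchanged.
	return n, 0
}

func main() {
	for _, tol := range []float64{1e-1, 1e-4} {
		r, k := truncateMantissa(60.47, tol)
		fmt.Printf("tol=%g result=%v cleared=%d bits\n", tol, r, k)
	}
}
```

Long runs of trailing zeros are what make the XOR deltas between consecutive values short, which is where the storage savings come from.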

type Decoder ¶

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder reads a compressed data stream and allows iterating over the resulting decoded sequence of Vec64

func NewDecoder ¶

func NewDecoder(src io.Reader, span int) *Decoder

NewDecoder allocates and initializes a new Decoder

func (*Decoder) Current ¶

func (dec *Decoder) Current() Vec64

Current returns the last Vec64 decoded after calling Next. Only valid if Next returned true.

func (*Decoder) EnumBorrow ¶

func (dec *Decoder) EnumBorrow(ctx context.Context, out chan Vec64, pool *VecPool) error

EnumBorrow is a helper method for enumerating Vec64s into a go channel.

Channel elements should be returned to the pool when no longer needed.
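The borrow-and-return pattern described above can be sketched with the standard library alone. Everything here is a stand-in: `Vec64` is redeclared locally, a plain `sync.Pool` replaces VecPool, and `produce` plays the role of EnumBorrow. Whether EnumBorrow closes the channel is not stated in the documentation; this sketch closes it so the consumer loop terminates.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Vec64 is a local stand-in for the documented type.
type Vec64 []uint64

// produce sends n borrowed vectors on out, stopping early if ctx is cancelled.
func produce(ctx context.Context, out chan<- Vec64, pool *sync.Pool, n int) error {
	defer close(out)
	for i := 0; i < n; i++ {
		v := pool.Get().(Vec64)
		v[0] = uint64(i)
		select {
		case out <- v:
		case <-ctx.Done():
			pool.Put(v) // never delivered, so give it back right away
			return ctx.Err()
		}
	}
	return nil
}

// consume reads every vector and returns it to the pool once it is done with it.
func consume(out <-chan Vec64, pool *sync.Pool) uint64 {
	var sum uint64
	for v := range out {
		sum += v[0]
		pool.Put(v)
	}
	return sum
}

// run wires producer and consumer together and reports the consumer's sum.
func run(n int) (uint64, error) {
	pool := &sync.Pool{New: func() interface{} { return make(Vec64, 1) }}
	out := make(chan Vec64)
	errc := make(chan error, 1)
	go func() { errc <- produce(context.Background(), out, pool, n) }()
	sum := consume(out, pool)
	return sum, <-errc
}

func main() {
	fmt.Println(run(4))
}
```

The consumer must not keep a reference to a vector after returning it, because the pool may hand the same backing array to the producer again.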

func (*Decoder) Err ¶

func (dec *Decoder) Err() error

Err returns the last error after calling Next. Err returns nil at EOF, not io.EOF.

func (*Decoder) Next ¶

func (dec *Decoder) Next() bool

Next decodes the next Vec64 element from the stream. It returns false if an error occurs or the end of the stream is reached.
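The Next/Current/Err contract leads to the usual Go iteration loop. The sketch below is written against a local interface that mirrors the documented method names so that it compiles without the module; `vecIterator`, `sliceIter` and `drain` are hypothetical, and the local `Vec64` is a stand-in for the package type.

```go
package main

import "fmt"

// Vec64 is a local stand-in for the documented type.
type Vec64 []uint64

// vecIterator mirrors the shape of the documented Decoder methods.
type vecIterator interface {
	Next() bool
	Current() Vec64
	Err() error
}

// sliceIter is an in-memory iterator used only to exercise the loop.
type sliceIter struct {
	items []Vec64
	pos   int
	err   error
}

func (it *sliceIter) Next() bool {
	if it.err != nil || it.pos >= len(it.items) {
		return false
	}
	it.pos++
	return true
}

func (it *sliceIter) Current() Vec64 { return it.items[it.pos-1] }

func (it *sliceIter) Err() error { return it.err }

// drain reads Current only after Next returned true, and checks Err once
// the loop ends to tell a clean end of stream from a failure.
func drain(it vecIterator) (elements int, err error) {
	for it.Next() {
		elements += len(it.Current())
	}
	return elements, it.Err()
}

func main() {
	it := &sliceIter{items: []Vec64{{1, 2}, {3, 4}}}
	fmt.Println(drain(it))
}
```

Because Err returns nil at end of stream, a single check after the loop is enough; there is no need to compare against io.EOF.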

type Encoder ¶

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder stores the encoding context for compressing a sequence of Vec64 to an io.Writer

func NewEncoder ¶

func NewEncoder(dst io.Writer, span int) *Encoder

NewEncoder allocates and initializes a new Encoder to compress sequences of Vec64

func (*Encoder) Close ¶

func (enc *Encoder) Close() error

Close appends an EOF bit sequence and flushes any remaining buffered data to the underlying stream

func (*Encoder) Put ¶

func (enc *Encoder) Put(vec Vec64) error

Put encodes a Vec64 into a compressed, variable-length bit sequence and writes the result to the underlying stream.

Put will panic if the given Vec64 has a different span than the encoder.

func (*Encoder) PutFloat64 ¶

func (enc *Encoder) PutFloat64(vec []float64) error

PutFloat64 is a short-hand for appending a vector of float64 to the encoder.

func (*Encoder) PutUint64 ¶

func (enc *Encoder) PutUint64(vec []uint64) error

PutUint64 is a short-hand for appending a vector of uint64 to the encoder.

func (*Encoder) Reset ¶

func (enc *Encoder) Reset(dst io.Writer)

Reset resets the internal state of the encoder so that it writes to the given stream.

type Vec64 ¶

type Vec64 []uint64

Vec64 represents an N-element vector of 64-bit primitives

func (Vec64) Float64 ¶

func (v Vec64) Float64() []float64

Float64 casts a Vec64 as a float64 slice without copying it

func (Vec64) Uint64 ¶

func (v Vec64) Uint64() []uint64

Uint64 casts a Vec64 as a uint64 slice without copying it
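A zero-copy cast between `[]uint64` and `[]float64` is possible because both element types are 8 bytes wide with the same alignment. The sketch below shows one way to do it with `unsafe.Slice` (Go 1.17 or later). How the package implements its casts is not documented here; `asFloat64` is a hypothetical helper and `Vec64` is redeclared locally.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Vec64 is a local stand-in for the documented type.
type Vec64 []uint64

// asFloat64 reinterprets the backing array of v as float64 values.
// The returned slice aliases v: no data is copied or converted.
func asFloat64(v Vec64) []float64 {
	if len(v) == 0 {
		return nil
	}
	return unsafe.Slice((*float64)(unsafe.Pointer(&v[0])), len(v))
}

func main() {
	v := Vec64{0x3FF0000000000000} // IEEE 754 bit pattern of 1.0
	f := asFloat64(v)
	fmt.Println(f[0])

	// Writing through the float view changes the underlying uint64 as well.
	f[0] = 2.0
	fmt.Printf("%#x\n", v[0])
}
```

Since both views share memory, a vector should be treated as holding either floats or integers at any given time, not both.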

type VecPool ¶

type VecPool struct {
	// contains filtered or unexported fields
}

VecPool is a concurrent pool of Vec64

func NewVecPool ¶

func NewVecPool(span int) *VecPool

NewVecPool creates a new empty VecPool

func (*VecPool) Get ¶

func (p *VecPool) Get() Vec64

Get takes or allocates a new Vec64 from the pool

func (*VecPool) Put ¶

func (p *VecPool) Put(v Vec64)

Put returns an existing Vec64 to the pool
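A concurrent pool of fixed-span vectors can be sketched on top of `sync.Pool`. This is an assumption about one reasonable design, not a description of VecPool's unexported internals; `vecPool` and `newVecPool` are hypothetical names and `Vec64` is redeclared locally.

```go
package main

import (
	"fmt"
	"sync"
)

// Vec64 is a local stand-in for the documented type.
type Vec64 []uint64

// vecPool hands out vectors of a fixed span and is safe for concurrent use.
type vecPool struct {
	span int
	pool sync.Pool
}

func newVecPool(span int) *vecPool {
	p := &vecPool{span: span}
	// Allocate a fresh vector whenever the pool has none to reuse.
	p.pool.New = func() interface{} { return make(Vec64, span) }
	return p
}

// Get takes a vector from the pool, allocating one if necessary.
// Reused vectors keep their previous contents.
func (p *vecPool) Get() Vec64 { return p.pool.Get().(Vec64) }

// Put returns a vector to the pool; vectors of another span are dropped.
func (p *vecPool) Put(v Vec64) {
	if len(v) != p.span {
		return
	}
	p.pool.Put(v)
}

func main() {
	p := newVecPool(4)
	v := p.Get()
	fmt.Println(len(v))
	p.Put(v)
}
```

Callers should overwrite every element after Get, because a recycled vector is not zeroed in this sketch.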

Directories ¶

Path	Synopsis
example
