czlib

Version: v0.0.0-...-86a9592 | Published: Aug 14, 2024 | License: BSD-3-Clause | Imports: 8 | Imported by: 1

README ¶

Disclaimer

This repository is considered stable but is no longer actively maintained. It is still in use in many places and is safe for production use, but because the zlib protocol is stable, we have not made any changes recently. Response times on issues and PRs may not be on par with other Datadog repositories.

czlib

czlib started as a fork of the vitess project’s cgzip package. Our primary data pipeline uses zlib compressed messages, but the standard library’s pure Go implementation can be significantly slower than the C zlib library. In order to address this gap, we modified a few flags in cgzip to make it encode and decode with zlib wrapping rather than with gzip headers.

We’ve detailed some of the more novel design decisions in czlib, including its batch interfaces, in our general blog post on performance in Go from a couple of years ago. Performance varies quite a bit among the various interfaces, so it pays to benchmark with a message that is typical for your system. To do so, run the czlib benchmark suite with PAYLOAD=path_to_message go test -run=NONE -bench .

Here are some benchmark results for compression and decompression of czlib compared to the standard library:

go version go1.22.6 darwin/arm64
pkg: github.com/DataDog/czlib

# 2KiB file
     │ CompressStdZlib │               Compress               │
     │     sec/op      │    sec/op     vs base                │
*-10      75.20µ ± 12%   39.84µ ± 31%  -47.02% (p=0.000 n=10)
     │ CompressStdZlib │               Compress                │
     │       B/s       │      B/s       vs base                │
*-10     27.71Mi ± 11%   52.30Mi ± 24%  +88.73% (p=0.000 n=10)

     │ DecompressStdZlib │             Decompress              │
     │      sec/op       │   sec/op     vs base                │
*-10        18.353µ ± 5%   4.993µ ± 4%  -72.80% (p=0.000 n=10)
     │ DecompressStdZlib │              Decompress               │
     │        B/s        │     B/s       vs base                 │
*-10        113.5Mi ± 5%   417.4Mi ± 3%  +267.60% (p=0.000 n=10)

# Silesia compression corpus - mr (~10MB)
     │ CompressStdZlib │              Compress               │
     │     sec/op      │   sec/op     vs base                │
*-10       327.1m ± 1%   381.0m ± 1%  +16.46% (p=0.000 n=10)

     │ CompressStdZlib │               Compress               │
     │       B/s       │     B/s       vs base                │
*-10      29.07Mi ± 1%   24.96Mi ± 1%  -14.14% (p=0.000 n=10)

     │ DecompressStdZlib │             Decompress              │
     │      sec/op       │   sec/op     vs base                │
*-10         51.20m ± 1%   13.96m ± 2%  -72.74% (p=0.000 n=10)
     │ DecompressStdZlib │              Decompress               │
     │        B/s        │     B/s       vs base                 │
*-10        185.7Mi ± 1%   681.2Mi ± 2%  +266.81% (p=0.000 n=10)

See more on the blog post

Documentation ¶

Index ¶

Constants
Variables
func Compress(input []byte) ([]byte, error)
func Decompress(input []byte) ([]byte, error)
func NewReader(r io.Reader) (io.ReadCloser, error)
func NewReaderBuffer(r io.Reader, bufferSize int) (io.ReadCloser, error)
type UnsafeByte
- func (b UnsafeByte) Free()
type Writer

Constants ¶

const (
	NoCompression      = flate.NoCompression
	BestSpeed          = flate.BestSpeed
	BestCompression    = flate.BestCompression
	DefaultCompression = flate.DefaultCompression
)

Constants copied from the flate package, so that code that imports czlib does not also have to import "compress/flate".
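
The effect of these levels can be sketched with the standard library, whose flate constants czlib re-exports. The example below is an illustration only: it uses compress/zlib rather than czlib itself, and compressedSize and sample are hypothetical helpers, not part of either package. Sizes produced by the C zlib library at the same level may differ.

```go
package main

import (
	"bytes"
	"compress/flate"
	"compress/zlib"
	"fmt"
	"strings"
)

// compressedSize returns the number of bytes produced when input is
// zlib-compressed at the given level with the standard library.
// Hypothetical helper for illustration; not part of czlib.
func compressedSize(input []byte, level int) (int, error) {
	var buf bytes.Buffer
	w, err := zlib.NewWriterLevel(&buf, level)
	if err != nil {
		return 0, err
	}
	if _, err := w.Write(input); err != nil {
		return 0, err
	}
	// Close writes the final block and the trailing checksum.
	if err := w.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

// sample returns a highly repetitive payload, which compresses well.
func sample() []byte {
	return []byte(strings.Repeat("czlib wraps the C zlib library. ", 256))
}

func main() {
	levels := []struct {
		name  string
		level int
	}{
		{"NoCompression", flate.NoCompression},
		{"BestSpeed", flate.BestSpeed},
		{"BestCompression", flate.BestCompression},
		{"DefaultCompression", flate.DefaultCompression},
	}
	input := sample()
	for _, l := range levels {
		n, err := compressedSize(input, l.level)
		if err != nil {
			fmt.Println(l.name, "error:", err)
			continue
		}
		fmt.Printf("%-18s %d -> %d bytes\n", l.name, len(input), n)
	}
}
```

With NoCompression the data is only wrapped, so the output is slightly larger than the input; the other levels trade speed against size.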

const (
	Z_NO_FLUSH      = 0
	Z_PARTIAL_FLUSH = 1
	Z_SYNC_FLUSH    = 2
	Z_FULL_FLUSH    = 3
	Z_FINISH        = 4
	Z_BLOCK         = 5
	Z_TREES         = 6
)

Allowed flush values

const (
	Z_OK            = 0
	Z_STREAM_END    = 1
	Z_NEED_DICT     = 2
	Z_ERRNO         = -1
	Z_STREAM_ERROR  = -2
	Z_DATA_ERROR    = -3
	Z_MEM_ERROR     = -4
	Z_BUF_ERROR     = -5
	Z_VERSION_ERROR = -6
)

Return codes
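
When logging or debugging, it can help to turn these numeric codes back into names. The sketch below redeclares the constants with the values listed above so that it runs without cgo; codeName and isError are hypothetical helpers and are not part of czlib. It relies on the zlib convention that negative codes are errors, while positive codes signal special but normal events.

```go
package main

import "fmt"

// Return codes, redeclared here with the same values as the czlib constants.
const (
	Z_OK            = 0
	Z_STREAM_END    = 1
	Z_NEED_DICT     = 2
	Z_ERRNO         = -1
	Z_STREAM_ERROR  = -2
	Z_DATA_ERROR    = -3
	Z_MEM_ERROR     = -4
	Z_BUF_ERROR     = -5
	Z_VERSION_ERROR = -6
)

// codeName maps a zlib return code to its symbolic name.
// Hypothetical helper for illustration; not part of czlib.
func codeName(code int) string {
	switch code {
	case Z_OK:
		return "Z_OK"
	case Z_STREAM_END:
		return "Z_STREAM_END"
	case Z_NEED_DICT:
		return "Z_NEED_DICT"
	case Z_ERRNO:
		return "Z_ERRNO"
	case Z_STREAM_ERROR:
		return "Z_STREAM_ERROR"
	case Z_DATA_ERROR:
		return "Z_DATA_ERROR"
	case Z_MEM_ERROR:
		return "Z_MEM_ERROR"
	case Z_BUF_ERROR:
		return "Z_BUF_ERROR"
	case Z_VERSION_ERROR:
		return "Z_VERSION_ERROR"
	default:
		return fmt.Sprintf("unknown(%d)", code)
	}
}

// isError reports whether a return code signals a failure.
// In zlib, negative values are errors.
func isError(code int) bool {
	return code < 0
}

func main() {
	for _, c := range []int{Z_OK, Z_STREAM_END, Z_DATA_ERROR, Z_BUF_ERROR} {
		fmt.Printf("%2d %-15s error=%v\n", c, codeName(c), isError(c))
	}
}
```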

const (
	DEFAULT_COMPRESSED_BUFFER_SIZE = 32 * 1024
)

DEFAULT_COMPRESSED_BUFFER_SIZE is the default buffer size. Most Go io functions use a 32KB buffer, so 32KB works well here for the compressed data buffer.

Variables ¶

var (
	// ErrChecksum is returned when reading ZLIB data that has an invalid checksum.
	ErrChecksum = zlib.ErrChecksum
	// ErrDictionary is returned when reading ZLIB data that has an invalid dictionary.
	ErrDictionary = zlib.ErrDictionary
	// ErrHeader is returned when reading ZLIB data that has an invalid header.
	ErrHeader = zlib.ErrHeader
)
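
Because these variables are aliases of the compress/zlib sentinel errors, callers can compare against either package's values with errors.Is. The sketch below shows the pattern using the standard library reader only. It is an assumption, not something this page confirms, that czlib's reader surfaces these exact values in the same situations. compressStd, corruptTrailer and classify are hypothetical helpers.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"errors"
	"fmt"
	"io"
)

// compressStd zlib-compresses input with the standard library.
func compressStd(input []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(input); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// corruptTrailer returns a copy of stream with its last byte flipped.
// The last four bytes of a zlib stream hold the Adler-32 checksum.
func corruptTrailer(stream []byte) []byte {
	bad := append([]byte(nil), stream...)
	bad[len(bad)-1] ^= 0xff
	return bad
}

// classify fully decompresses data and labels the outcome.
func classify(data []byte) string {
	r, err := zlib.NewReader(bytes.NewReader(data))
	if err == nil {
		// The checksum is only verified once the stream is read to the end.
		_, err = io.Copy(io.Discard, r)
		r.Close()
	}
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, zlib.ErrHeader):
		return "bad header"
	case errors.Is(err, zlib.ErrChecksum):
		return "bad checksum"
	case errors.Is(err, zlib.ErrDictionary):
		return "bad dictionary"
	default:
		return "other: " + err.Error()
	}
}

func main() {
	good, err := compressStd([]byte("hello, zlib"))
	if err != nil {
		fmt.Println("compress:", err)
		return
	}
	fmt.Println("valid stream:     ", classify(good))
	fmt.Println("zeroed header:    ", classify([]byte{0x00, 0x00}))
	fmt.Println("corrupted trailer:", classify(corruptTrailer(good)))
}
```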

Functions ¶

func Compress ¶

func Compress(input []byte) ([]byte, error)

Compress returns the input compressed using zlib, or an error if encountered.

func Decompress ¶

func Decompress(input []byte) ([]byte, error)

Decompress returns the input decompressed using zlib, or an error if encountered.
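
For comparison, here is a minimal sketch of one-shot helpers with the same signatures built on the standard library's compress/zlib, roughly what a baseline like the CompressStdZlib benchmark might exercise. The names compressStd and decompressStd are hypothetical, and the actual benchmark code may differ. Since both packages emit zlib-wrapped streams rather than gzip, the output starts with the zlib header byte 0x78 instead of the gzip magic bytes.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// compressStd returns input compressed with zlib wrapping.
// Hypothetical stand-in with the same signature as czlib.Compress.
func compressStd(input []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(input); err != nil {
		return nil, err
	}
	// Close flushes the last block and appends the Adler-32 checksum.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressStd returns the decompressed form of a zlib stream.
// Hypothetical stand-in with the same signature as czlib.Decompress.
func decompressStd(input []byte) ([]byte, error) {
	r, err := zlib.NewReader(bytes.NewReader(input))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	msg := []byte("hello, zlib")
	packed, err := compressStd(msg)
	if err != nil {
		fmt.Println("compress:", err)
		return
	}
	unpacked, err := decompressStd(packed)
	if err != nil {
		fmt.Println("decompress:", err)
		return
	}
	fmt.Printf("%d bytes -> %d bytes (first byte %#x) -> %q\n",
		len(msg), len(packed), packed[0], unpacked)
}
```

In code that imports czlib, the two helpers would be replaced by direct calls to Compress and Decompress.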

func NewReader ¶

func NewReader(r io.Reader) (io.ReadCloser, error)

NewReader creates a new io.ReadCloser. Reads from the returned io.ReadCloser read and decompress data from r. The implementation buffers input and may read more data than necessary from r. It is the caller's responsibility to call Close on the ReadCloser when done.

func NewReaderBuffer ¶

func NewReaderBuffer(r io.Reader, bufferSize int) (io.ReadCloser, error)

NewReaderBuffer has the same behavior as NewReader but lets the caller provide a custom buffer size.
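
The caller's obligations described above (check the constructor error, read, then Close) can be sketched as follows. The standard library's zlib.NewReader has the same signature as czlib's NewReader, so it is used here as a stand-in so that the example runs without cgo; decompressStream and roundTripStream are hypothetical helpers.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
	"strings"
)

// decompressStream copies the decompressed form of src into dst and
// returns the number of decompressed bytes written.
func decompressStream(dst io.Writer, src io.Reader) (int64, error) {
	r, err := zlib.NewReader(src)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(dst, r)
	// Always close the reader; keep the first error encountered.
	if cerr := r.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// roundTripStream compresses s into memory, then streams it back
// through decompressStream.
func roundTripStream(s string) (string, error) {
	var packed bytes.Buffer
	w := zlib.NewWriter(&packed)
	if _, err := io.Copy(w, strings.NewReader(s)); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	var out strings.Builder
	if _, err := decompressStream(&out, &packed); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	got, err := roundTripStream("streamed through a zlib reader")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Because the reader buffers its input, src should not be assumed to be positioned exactly at the end of the zlib stream once decompression finishes.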

Types ¶

type UnsafeByte ¶

type UnsafeByte []byte

An UnsafeByte is a []byte whose backing array has been allocated in C and thus is not subject to the Go garbage collector. The Unsafe versions of Compress and Decompress return this in order to prevent copying the unsafe memory into collected memory.

func NewUnsafeByte ¶

func NewUnsafeByte(p *C.char, length int) UnsafeByte

NewUnsafeByte creates a []byte from the unsafe pointer without a copy, using the method outlined in this mailing list post:

https://groups.google.com/forum/#!topic/golang-nuts/KyXR0fDp0HA

but amended to use the three-index slices from go1.2 to set the capacity of b correctly:

https://tip.golang.org/doc/go1.2#three_index

This means this code only works in go1.2+.

This shouldn't copy the underlying array; it's just casting it. Afterwards, we use reflect to fix the cap and len of the slice.

func UnsafeCompress ¶

func UnsafeCompress(input []byte) (UnsafeByte, error)

UnsafeCompress compresses input into an UnsafeByte without copying the result malloced in C. The returned UnsafeByte can be used as a normal []byte but must be freed manually with UnsafeByte.Free().

func UnsafeDecompress ¶

func UnsafeDecompress(input []byte) (UnsafeByte, error)

UnsafeDecompress decompresses input into an UnsafeByte without copying the result malloced in C. The returned UnsafeByte can be used as a normal []byte but must be freed manually with UnsafeByte.Free().

func (UnsafeByte) Free ¶

func (b UnsafeByte) Free()

Free releases the underlying byte array, which was allocated in C. It must be called only once per value; freeing the same memory twice would be bad.

type Writer ¶

type Writer struct {
	// contains filtered or unexported fields
}

Writer implements an io.WriteCloser. deflateEnd is called when err is set to a value, which is one of:
- whatever error is returned by the underlying writer
- io.EOF if Close was called

func NewWriter ¶

func NewWriter(w io.Writer) *Writer

NewWriter returns a new zlib writer that writes to the underlying writer.

func NewWriterLevel ¶

func NewWriterLevel(w io.Writer, level int) (*Writer, error)

NewWriterLevel is like NewWriter but lets the caller provide a compression level.

func NewWriterLevelBuffer ¶

func NewWriterLevelBuffer(w io.Writer, level, bufferSize int) (*Writer, error)

NewWriterLevelBuffer is like NewWriter but lets the caller provide both a compression level and a buffer size.

func (*Writer) Close ¶

func (z *Writer) Close() error

Close closes the zlib buffer but does not close the wrapped io.Writer originally passed to the NewWriter* constructor.

func (*Writer) Flush ¶

func (z *Writer) Flush() error

Flush flushes the zlib buffer to the underlying writer.

func (*Writer) Write ¶

func (z *Writer) Write(p []byte) (n int, err error)

Write implements the io.Writer interface.
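
A typical Writer lifecycle is: construct with a level, Write, optionally Flush, then Close, after which the wrapped writer is still open. The sketch below walks through that sequence using the standard library's zlib.Writer, which exposes NewWriterLevel, Write, Flush and Close with the same signatures, as a stand-in so that it runs without cgo. writeMessages and readBack are hypothetical helpers, and it is an assumption that czlib rejects out-of-range levels the same way.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// writeMessages compresses msgs into a single zlib stream at the given
// level, flushing after each message, then appends trailer uncompressed to
// show that closing the zlib writer leaves the underlying writer usable.
func writeMessages(level int, trailer string, msgs ...string) ([]byte, error) {
	var sink bytes.Buffer
	zw, err := zlib.NewWriterLevel(&sink, level)
	if err != nil {
		return nil, err // for example, an out-of-range level
	}
	for _, m := range msgs {
		if _, err := zw.Write([]byte(m)); err != nil {
			return nil, err
		}
		// Flush pushes pending compressed data into sink so that a reader
		// could consume this message before the stream ends.
		if err := zw.Flush(); err != nil {
			return nil, err
		}
	}
	// Close finishes the zlib stream only; sink itself stays open.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	sink.WriteString(trailer)
	return sink.Bytes(), nil
}

// readBack decompresses the zlib stream at the start of data.
func readBack(data []byte) (string, error) {
	r, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer r.Close()
	plain, err := io.ReadAll(r)
	return string(plain), err
}

func main() {
	out, err := writeMessages(zlib.BestSpeed, "|tail", "one,", "two")
	if err != nil {
		fmt.Println("write:", err)
		return
	}
	text, err := readBack(out)
	if err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Printf("decompressed %q from %d bytes\n", text, len(out))
}
```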
