pzlib


README

pzlib


Go parallel zlib compression/decompression. This is a fully zlib-compatible drop-in replacement for compress/zlib.

Compression is split into blocks that are compressed in parallel, which can be useful when compressing large amounts of data. The output is standard zlib-compressed data.

The zlib decompression is modified so it decompresses ahead of the current reader. This means that reads will be non-blocking if the decompressor can keep ahead of your code reading from it. CRC calculation also takes place in a separate goroutine.

pzlib is an adaptation of klauspost/pgzip to the zlib use case.

Installation

$> go get git.sr.ht/~sbinet/pzlib/...

Usage

Godoc Documentation

To use pzlib as a replacement for compress/zlib, replace

import "compress/zlib"

with:

import zlib "git.sr.ht/~sbinet/pzlib"
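
Assuming the drop-in API documented below (NewWriter/NewReader mirroring compress/zlib), a minimal compression/decompression round trip might look like this sketch:

package main

import (
	"bytes"
	"io"
	"os"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer

	// Compress: writes are split into blocks and compressed in parallel.
	w := zlib.NewWriter(&b)
	if _, err := w.Write([]byte("hello, world\n")); err != nil {
		panic(err)
	}
	if err := w.Close(); err != nil {
		panic(err)
	}

	// Decompress: the reader decompresses ahead of the caller.
	r, err := zlib.NewReader(&b)
	if err != nil {
		panic(err)
	}
	defer r.Close()
	io.Copy(os.Stdout, r)
}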

License

This package contains large portions of code from the Go repository; see GO_LICENSE for more information. The changes are released under the MIT License; see LICENSE for more information.

Documentation

Overview

Package pzlib implements reading and writing of zlib format compressed data, as specified in RFC 1950.

This is a drop-in replacement for "compress/zlib". Compression is split into blocks that are compressed in parallel, which can be useful when compressing large amounts of data. Decompression is modified so it decompresses ahead of the current reader, so the package can be used as a complete replacement for "compress/zlib".

See more at https://sr.ht/~sbinet/pzlib


Constants

const (
	NoCompression       = flate.NoCompression
	BestSpeed           = flate.BestSpeed
	BestCompression     = flate.BestCompression
	DefaultCompression  = flate.DefaultCompression
	ConstantCompression = flate.ConstantCompression // Deprecated: Use HuffmanOnly.
	HuffmanOnly         = flate.HuffmanOnly
)

These constants are copied from the flate package, so that code that imports "git.sr.ht/~sbinet/pzlib" does not also have to import "compress/flate".

Variables

var (
	// ErrChecksum is returned when reading ZLIB data that has an invalid checksum.
	ErrChecksum = zlib.ErrChecksum
	// ErrDictionary is returned when reading ZLIB data that has an invalid dictionary.
	ErrDictionary = zlib.ErrDictionary
	// ErrHeader is returned when reading ZLIB data that has an invalid header.
	ErrHeader = zlib.ErrHeader
)

Functions

func NewReader

func NewReader(r io.Reader) (io.ReadCloser, error)

NewReader creates a new ReadCloser. Reads from the returned ReadCloser read and decompress data from r. If r does not implement io.ByteReader, the decompressor may read more data than necessary from r. It is the caller's responsibility to call Close on the ReadCloser when done.

The io.ReadCloser returned by NewReader also implements [Resetter].

Example
package main

import (
	"bytes"
	"io"
	"os"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	buff := []byte{120, 156, 202, 72, 205, 201, 201, 215, 81, 40, 207,
		47, 202, 73, 225, 2, 4, 0, 0, 255, 255, 33, 231, 4, 147}
	b := bytes.NewReader(buff)

	r, err := zlib.NewReader(b)
	if err != nil {
		panic(err)
	}
	io.Copy(os.Stdout, r)

	r.Close()
}
Output:

hello, world

func NewReaderDict

func NewReaderDict(r io.Reader, dict []byte) (io.ReadCloser, error)

NewReaderDict is like NewReader but uses a preset dictionary. NewReaderDict ignores the dictionary if the compressed data does not refer to it. If the compressed data refers to a different dictionary, NewReaderDict returns ErrDictionary.

The ReadCloser returned by NewReaderDict also implements [Resetter].
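
A round-trip sketch combining NewReaderDict with NewWriterLevelDict (documented below); the dictionary content is arbitrary and only for illustration:

package main

import (
	"bytes"
	"io"
	"os"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	// A preset dictionary shared between writer and reader.
	dict := []byte("hello, world")

	var b bytes.Buffer
	w, err := zlib.NewWriterLevelDict(&b, zlib.DefaultCompression, dict)
	if err != nil {
		panic(err)
	}
	w.Write([]byte("hello, world\n"))
	w.Close()

	// The same dictionary must be supplied to the reader.
	r, err := zlib.NewReaderDict(&b, dict)
	if err != nil {
		panic(err)
	}
	defer r.Close()
	io.Copy(os.Stdout, r)
}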

Types

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

A Writer takes data written to it and writes the compressed form of that data to an underlying writer (see NewWriter).

func NewWriter

func NewWriter(w io.Writer) *Writer

NewWriter creates a new Writer. Writes to the returned Writer are compressed and written to w.

It is the caller's responsibility to call Close on the Writer when done. Writes may be buffered and not flushed until Close.

Example
package main

import (
	"bytes"
	"fmt"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer

	w := zlib.NewWriter(&b)
	w.SetConcurrency(1<<20, 2)
	w.Write([]byte("hello, world\n"))
	w.Close()
	fmt.Println(b.Bytes())
}
Output:

[120 156 0 13 0 242 255 104 101 108 108 111 44 32 119 111 114 108 100 10 0 0 0 255 255 3 0 33 231 4 147]

func NewWriterLevel

func NewWriterLevel(w io.Writer, level int) (*Writer, error)

NewWriterLevel is like NewWriter but specifies the compression level instead of assuming DefaultCompression.

The compression level can be DefaultCompression, NoCompression, HuffmanOnly or any integer value between BestSpeed and BestCompression inclusive. The error returned will be nil if the level is valid.
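
A sketch of selecting a non-default level; BestCompression is used here purely as an illustration:

package main

import (
	"bytes"
	"fmt"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer

	// Trade speed for ratio by requesting BestCompression explicitly.
	w, err := zlib.NewWriterLevel(&b, zlib.BestCompression)
	if err != nil { // e.g. an out-of-range level
		panic(err)
	}
	w.Write([]byte("hello, world\n"))
	w.Close()

	fmt.Println(b.Len(), "compressed bytes")
}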

func NewWriterLevelDict

func NewWriterLevelDict(w io.Writer, level int, dict []byte) (*Writer, error)

NewWriterLevelDict is like NewWriterLevel but specifies a dictionary to compress with.

The dictionary may be nil. If not, its contents should not be modified until the Writer is closed.

func (*Writer) Close

func (z *Writer) Close() error

Close closes the Writer, flushing any unwritten data to the underlying io.Writer, but does not close the underlying io.Writer.

func (*Writer) Flush

func (z *Writer) Flush() error

Flush flushes any pending compressed data to the underlying writer.

It is useful mainly in compressed network protocols, to ensure that a remote reader has enough data to reconstruct a packet. Flush does not return until the data has been written. If the underlying writer returns an error, Flush returns that error.

In the terminology of the zlib library, Flush is equivalent to Z_SYNC_FLUSH.
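
A sketch of using Flush to push a logical packet to the underlying writer without closing the stream; the bytes.Buffer here stands in for a network connection:

package main

import (
	"bytes"
	"fmt"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer
	w := zlib.NewWriter(&b)

	// Write a "packet" and force it out to the underlying buffer
	// without closing the stream (Z_SYNC_FLUSH semantics).
	w.Write([]byte("packet 1\n"))
	if err := w.Flush(); err != nil {
		panic(err)
	}
	fmt.Println("bytes available after Flush:", b.Len())

	w.Write([]byte("packet 2\n"))
	w.Close()
}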

func (*Writer) Reset

func (z *Writer) Reset(w io.Writer)

Reset clears the state of the Writer z such that it is equivalent to its initial state from NewWriterLevel or NewWriterLevelDict, but instead writing to w.
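
A sketch of reusing a single Writer for two independent streams via Reset:

package main

import (
	"bytes"
	"fmt"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b1, b2 bytes.Buffer

	w := zlib.NewWriter(&b1)
	w.Write([]byte("first stream\n"))
	w.Close()

	// Reuse the same Writer for a second, independent stream.
	w.Reset(&b2)
	w.Write([]byte("second stream\n"))
	w.Close()

	fmt.Println(b1.Len(), b2.Len())
}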

func (*Writer) SetConcurrency

func (z *Writer) SetConcurrency(blockSize, blocks int) error

Use SetConcurrency to fine-tune the concurrency level if needed.

With this you can control the approximate size of your blocks, as well as how many you want to be processing in parallel.

The default is SetConcurrency(defaultBlockSize, runtime.GOMAXPROCS(0)), meaning blocks are split at 1 MB and up to one block per CPU thread can be compressed at once before the writer blocks.
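
A short sketch of tuning the block size and worker count; the 4 MB block size and GOMAXPROCS worker count below are illustrative values, not recommendations:

package main

import (
	"bytes"
	"runtime"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer
	w := zlib.NewWriter(&b)

	// Larger blocks, one worker per CPU thread; call before writing.
	if err := w.SetConcurrency(4<<20, runtime.GOMAXPROCS(0)); err != nil {
		panic(err)
	}

	w.Write([]byte("hello, world\n"))
	w.Close()
}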

func (*Writer) Write

func (z *Writer) Write(p []byte) (n int, err error)

Write writes a compressed form of p to the underlying io.Writer. The compressed bytes are not necessarily flushed until the Writer is closed or explicitly flushed.

Write returns quickly if there are unused buffers available. The slice p is copied, so the caller is free to reuse the buffer once Write returns.

Errors that occur during compression are reported later: a nil error does not mean that compression succeeded, since it is most likely still running, and the call that returns an error may not be the call that caused it. Only Flush and Close are guaranteed to report any errors that have occurred up to that point.
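
A sketch of the error-handling pattern this implies, checking Close for compression errors deferred from earlier writes:

package main

import (
	"bytes"
	"log"

	zlib "git.sr.ht/~sbinet/pzlib"
)

func main() {
	var b bytes.Buffer
	w := zlib.NewWriter(&b)

	// A nil error from Write only means the data was accepted;
	// compression may still be running in background goroutines.
	if _, err := w.Write([]byte("hello, world\n")); err != nil {
		log.Fatal(err)
	}

	// Close (or Flush) reports any error that has occurred so far.
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
}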
