lz4

Version: v2.6.1+incompatible | Published: Jun 3, 2021 | License: BSD-3-Clause | Imports: 12 | Imported by: 0

Repository

github.com/TetrationAnalytics/lz4

README ¶

lz4 : LZ4 compression in pure Go

Overview

This package provides a streaming interface to LZ4 data streams as well as low-level compress and uncompress functions for LZ4 data blocks. The implementation is based on the reference C implementation.

Install

Assuming you have the go toolchain installed:

go get github.com/pierrec/lz4

There is a command line interface tool to compress and decompress LZ4 files.

go install github.com/pierrec/lz4/cmd/lz4c

Usage

Usage of lz4c:
  -version
        print the program version

Subcommands:
Compress the given files or from stdin to stdout.
compress [arguments] [<file name> ...]
  -bc
        enable block checksum
  -l int
        compression level (0=fastest)
  -sc
        disable stream checksum
  -size string
        block max size [64K,256K,1M,4M] (default "4M")

Uncompress the given files or from stdin to stdout.
uncompress [arguments] [<file name> ...]

Example

// Compress and uncompress an input string.
s := "hello world"
r := strings.NewReader(s)

// The pipe will uncompress the data from the writer.
pr, pw := io.Pipe()
zw := lz4.NewWriter(pw)
zr := lz4.NewReader(pr)

go func() {
	// Compress the input string.
	_, _ = io.Copy(zw, r)
	_ = zw.Close() // Make sure the writer is closed
	_ = pw.Close() // Terminate the pipe
}()

_, _ = io.Copy(os.Stdout, zr)

// Output:
// hello world

Contributing

Contributions are very welcome, whether bug fixes or performance improvements. To contribute:

- Open an issue with a proper description
- Send a pull request with appropriate test case(s)

Contributors

Thanks to all contributors so far!

Special thanks to @Zariel for his asm implementation of the decoder.

Special thanks to @klauspost for his work on optimizing the code.

Documentation ¶

Overview ¶

Package lz4 implements reading and writing lz4 compressed data (a frame), as specified in http://fastcompression.blogspot.fr/2013/04/lz4-streaming-format-final.html.

Although the block level compression and decompression functions are exposed and are fully compatible with the lz4 block format definition, they are low level and should not be used directly. For a complete description of an lz4 compressed block, see: http://fastcompression.blogspot.fr/2011/05/lz4-explained.html

See https://github.com/Cyan4973/lz4 for the reference C implementation.
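A frame can be recognized by its first four bytes. The sketch below is not part of this package: it is a minimal illustration that assumes the magic numbers from the LZ4 frame format specification (0x184D2204 for the current frame format and 0x184C2102 for the legacy one, both stored little-endian). The helper name frameKind is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Magic numbers taken from the LZ4 frame format specification.
const (
	frameMagic  uint32 = 0x184D2204 // current frame format
	legacyMagic uint32 = 0x184C2102 // legacy frame format
)

// frameKind is a hypothetical helper that classifies a buffer by its
// first four bytes, read as a little-endian 32-bit value.
func frameKind(b []byte) string {
	if len(b) < 4 {
		return "too short"
	}
	switch binary.LittleEndian.Uint32(b) {
	case frameMagic:
		return "frame"
	case legacyMagic:
		return "legacy"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(frameKind([]byte{0x04, 0x22, 0x4D, 0x18}))
	fmt.Println(frameKind([]byte{0x02, 0x21, 0x4C, 0x18}))
	fmt.Println(frameKind([]byte("hello")))
}
```

A check like this can help decide between NewReader and NewReaderLegacy before any data is decompressed.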

Example ¶

package main

import (
	"io"
	"os"
	"strings"

	"github.com/pierrec/lz4"
)

func main() {
	// Compress and uncompress an input string.
	s := "hello world"
	r := strings.NewReader(s)

	// The pipe will uncompress the data from the writer.
	pr, pw := io.Pipe()
	zw := lz4.NewWriter(pw)
	zr := lz4.NewReader(pr)

	go func() {
		// Compress the input string.
		_, _ = io.Copy(zw, r)
		_ = zw.Close() // Make sure the writer is closed
		_ = pw.Close() // Terminate the pipe
	}()

	_, _ = io.Copy(os.Stdout, zr)

}

Output:

hello world

Index ¶

Constants
Variables
func CompressBlock(src, dst []byte, hashTable []int) (_ int, err error)
func CompressBlockBound(n int) int
func CompressBlockHC(src, dst []byte, depth int) (_ int, err error)
func UncompressBlock(src, dst []byte) (int, error)
type Header
- func (h *Header) Reset()
- func (h Header) String() string
type Reader
- func NewReader(src io.Reader) *Reader
- func (z *Reader) Read(buf []byte) (int, error)
- func (z *Reader) Reset(r io.Reader)
- func (z *Reader) Seek(offset int64, whence int) (int64, error)
type ReaderLegacy
- func NewReaderLegacy(src io.Reader) *ReaderLegacy
- func (z *ReaderLegacy) Read(buf []byte) (int, error)
- func (z *ReaderLegacy) Reset(r io.Reader)
- func (z *ReaderLegacy) Seek(offset int64, whence int) (int64, error)
type Writer
- func NewWriter(dst io.Writer) *Writer
- func (z *Writer) Close() error
- func (z *Writer) Flush() error
- func (z *Writer) Reset(w io.Writer)
- func (z *Writer) WithConcurrency(n int) *Writer
- func (z *Writer) Write(buf []byte) (int, error)
type WriterLegacy
- func NewWriterLegacy(dst io.Writer) *WriterLegacy
- func (z *WriterLegacy) Close() error
- func (z *WriterLegacy) Flush() error
- func (z *WriterLegacy) Reset(w io.Writer)
- func (z *WriterLegacy) Write(buf []byte) (int, error)

Constants ¶

const (
	// Extension is the LZ4 frame file name extension
	Extension = ".lz4"
	// Version is the LZ4 frame format version
	Version = 1
)

Variables ¶

var (
	// ErrInvalidSourceShortBuffer is returned by UncompressBlock or CompressBlock when a compressed
	// block is corrupted or the destination buffer is not large enough for the uncompressed data.
	ErrInvalidSourceShortBuffer = errors.New("lz4: invalid source or destination buffer too short")
	// ErrInvalid is returned when reading an invalid LZ4 archive.
	ErrInvalid = errors.New("lz4: bad magic number")
	// ErrBlockDependency is returned when attempting to decompress an archive created with block dependency.
	ErrBlockDependency = errors.New("lz4: block dependency not supported")
	// ErrUnsupportedSeek is returned when attempting to Seek any way but forward from the current position.
	ErrUnsupportedSeek = errors.New("lz4: can only seek forward from io.SeekCurrent")
)

Functions ¶

func CompressBlock ¶

func CompressBlock(src, dst []byte, hashTable []int) (_ int, err error)

CompressBlock compresses the source buffer into the destination one. This is the fast version of LZ4 compression and also the default one.

The argument hashTable is scratch space for a hash table used by the compressor. If provided, it should have length at least 1<<16. If it is shorter (or nil), CompressBlock allocates its own hash table.

The size of the compressed data is returned.

If the destination buffer is smaller than CompressBlockBound(len(src)) and CompressBlock returns a size of 0 with no error, the data is incompressible.

An error is returned if the destination buffer is too small.

Example ¶

package main

import (
	"fmt"
	"strings"

	"github.com/pierrec/lz4"
)

func main() {
	s := "hello world"
	data := []byte(strings.Repeat(s, 100))
	buf := make([]byte, len(data))
	ht := make([]int, 64<<10) // buffer for the compression table

	n, err := lz4.CompressBlock(data, buf, ht)
	if err != nil {
		fmt.Println(err)
	}
	if n >= len(data) {
		fmt.Printf("`%s` is not compressible", s)
	}
	buf = buf[:n] // compressed data

	// Allocate a very large buffer for decompression.
	out := make([]byte, 10*len(data))
	n, err = lz4.UncompressBlock(buf, out)
	if err != nil {
		fmt.Println(err)
	}
	out = out[:n] // uncompressed data

	fmt.Println(string(out[:len(s)]))

}

Output:

hello world

func CompressBlockBound ¶

func CompressBlockBound(n int) int

CompressBlockBound returns the maximum compressed size of a buffer of size n, that is, the size reached when the data is not compressible.
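To give a feel for how small the worst-case overhead is, the sketch below uses the expression from the reference C implementation's LZ4_COMPRESSBOUND macro, n + n/255 + 16. That this package computes exactly the same value is an assumption, and worstCaseBound is a hypothetical name. Real code should call CompressBlockBound.

```go
package main

import "fmt"

// worstCaseBound mirrors the reference C macro LZ4_COMPRESSBOUND.
// It is an illustration only; use lz4.CompressBlockBound in real code.
func worstCaseBound(n int) int {
	return n + n/255 + 16
}

func main() {
	// Compare input size and worst-case output size for a few block sizes.
	for _, n := range []int{0, 255, 64 << 10, 4 << 20} {
		b := worstCaseBound(n)
		fmt.Printf("n=%d bound=%d overhead=%d\n", n, b, b-n)
	}
}
```

A destination buffer of at least this size avoids the "destination buffer too short" error for any input of size n.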

func CompressBlockHC ¶

func CompressBlockHC(src, dst []byte, depth int) (_ int, err error)

CompressBlockHC compresses the source buffer src into the destination dst with max search depth (use 0 or negative value for no max).

CompressBlockHC compression ratio is better than CompressBlock but it is also slower.

The size of the compressed data is returned.

If the destination buffer is smaller than CompressBlockBound(len(src)) and CompressBlockHC returns a size of 0 with no error, the data is incompressible.

An error is returned if the destination buffer is too small.

func UncompressBlock ¶

func UncompressBlock(src, dst []byte) (int, error)

UncompressBlock uncompresses the source buffer into the destination one, and returns the uncompressed size.

The destination buffer must be sized appropriately.

An error is returned if the source data is invalid or the destination buffer is too small.
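A raw LZ4 block does not record its uncompressed size, so the caller has to carry that size separately in order to size the destination buffer. One common convention, which is not part of this package, is to prefix each compressed block with the uncompressed length. The sketch below assumes a 4-byte little-endian prefix. The names wrapBlock and unwrapBlock are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// wrapBlock prepends the uncompressed length (4 bytes, little-endian)
// to an already compressed block.
func wrapBlock(rawLen int, compressed []byte) []byte {
	out := make([]byte, 4+len(compressed))
	binary.LittleEndian.PutUint32(out, uint32(rawLen))
	copy(out[4:], compressed)
	return out
}

// unwrapBlock splits a wrapped block into the recorded uncompressed
// length and the compressed payload.
func unwrapBlock(b []byte) (rawLen int, compressed []byte, err error) {
	if len(b) < 4 {
		return 0, nil, errors.New("block too short for length prefix")
	}
	return int(binary.LittleEndian.Uint32(b)), b[4:], nil
}

func main() {
	wrapped := wrapBlock(1100, []byte{0xAA, 0xBB, 0xCC})
	rawLen, payload, err := unwrapBlock(wrapped)
	if err != nil {
		fmt.Println(err)
		return
	}
	// dst is now sized exactly; payload and dst would be passed to
	// lz4.UncompressBlock(payload, dst).
	dst := make([]byte, rawLen)
	fmt.Println(len(dst), len(payload))
}
```

The frame-level Reader and Writer avoid this bookkeeping entirely, which is one reason the block functions are described as low level.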

Types ¶

type Header ¶

type Header struct {
	BlockChecksum    bool   // Compressed blocks checksum flag.
	NoChecksum       bool   // Frame checksum flag.
	BlockMaxSize     int    // Size of the uncompressed data block (one of [64KB, 256KB, 1MB, 4MB]). Default=4MB.
	Size             uint64 // Frame total size. It is _not_ computed by the Writer.
	CompressionLevel int    // Compression level (higher is better, use 0 for fastest compression).
	// contains filtered or unexported fields
}

Header describes the various flags that can be set on a Writer or obtained from a Reader. The default values match those of the LZ4 frame format definition (http://fastcompression.blogspot.com/2013/04/lz4-streaming-format-final.html).

NB. in a Reader, in case of concatenated frames, the Header values may change between Read() calls. It is the caller's responsibility to check them if necessary.
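BlockMaxSize accepts only the four listed sizes. As a sketch of how a caller might validate user input before setting the field, the hypothetical helper below maps the size labels used by the lz4c -size flag to byte counts, assuming binary units (64K = 64<<10 bytes). It is not part of this package.

```go
package main

import "fmt"

// blockSizes maps the labels accepted by `lz4c compress -size` to bytes.
// Binary units are an assumption of this sketch.
var blockSizes = map[string]int{
	"64K":  64 << 10,
	"256K": 256 << 10,
	"1M":   1 << 20,
	"4M":   4 << 20,
}

// parseBlockMaxSize is a hypothetical helper. An empty label selects
// the documented default of 4MB.
func parseBlockMaxSize(label string) (int, error) {
	if label == "" {
		return 4 << 20, nil
	}
	n, ok := blockSizes[label]
	if !ok {
		return 0, fmt.Errorf("unsupported block max size %q", label)
	}
	return n, nil
}

func main() {
	for _, label := range []string{"", "64K", "2M"} {
		n, err := parseBlockMaxSize(label)
		fmt.Println(label, n, err)
	}
}
```

The returned value could then be assigned to Header.BlockMaxSize on a Writer before the first Write.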

func (h *Header) Reset()

Reset resets the internal status.

func (h Header) String() string

type Reader ¶

type Reader struct {
	Header
	// Handler called when a block has been successfully read.
	// It provides the number of bytes read.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

Reader implements the LZ4 frame decoder. The Header is set after the first call to Read(). The Header may change between Read() calls in case of concatenated frames.

func NewReader ¶

func NewReader(src io.Reader) *Reader

NewReader returns a new LZ4 frame decoder. No access to the underlying io.Reader is performed.

func (*Reader) Read ¶

func (z *Reader) Read(buf []byte) (int, error)

Read decompresses data from the underlying source into the supplied buffer.

Since there can be multiple streams concatenated, Header values may change between calls to Read(). If that is the case, no data is actually read from the underlying io.Reader, to allow for potential input buffer resizing.

func (*Reader) Reset ¶

func (z *Reader) Reset(r io.Reader)

Reset discards the Reader's state and makes it equivalent to the result of its original state from NewReader, but reading from r instead. This permits reusing a Reader rather than allocating a new one.

func (*Reader) Seek ¶

func (z *Reader) Seek(offset int64, whence int) (int64, error)

Seek implements io.Seeker, but supports seeking forward from the current position only. Any other seek returns an error. Seeking allows skipping output bytes which aren't needed, which in some scenarios is faster than reading and discarding them. Note that this may cause future calls to Read() to read 0 bytes if all of the data they would have returned is skipped.

type ReaderLegacy ¶

type ReaderLegacy struct {
	Header
	// Handler called when a block has been successfully read.
	// It provides the number of bytes read.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

ReaderLegacy implements the LZ4Demo frame decoder. The Header is set after the first call to Read().

func NewReaderLegacy ¶

func NewReaderLegacy(src io.Reader) *ReaderLegacy

NewReaderLegacy returns a new LZ4Demo frame decoder. No access to the underlying io.Reader is performed.

func (*ReaderLegacy) Read ¶

func (z *ReaderLegacy) Read(buf []byte) (int, error)

Read decompresses data from the underlying source into the supplied buffer.

Since there can be multiple streams concatenated, Header values may change between calls to Read(). If that is the case, no data is actually read from the underlying io.Reader, to allow for potential input buffer resizing.

func (*ReaderLegacy) Reset ¶

func (z *ReaderLegacy) Reset(r io.Reader)

Reset discards the ReaderLegacy's state and makes it equivalent to the result of its original state from NewReaderLegacy, but reading from r instead. This permits reusing a ReaderLegacy rather than allocating a new one.

func (*ReaderLegacy) Seek ¶

func (z *ReaderLegacy) Seek(offset int64, whence int) (int64, error)

Seek implements io.Seeker, but supports seeking forward from the current position only. Any other seek returns an error. Seeking allows skipping output bytes which aren't needed, which in some scenarios is faster than reading and discarding them. Note that this may cause future calls to Read() to read 0 bytes if all of the data they would have returned is skipped.

type Writer ¶

type Writer struct {
	Header
	// Handler called when a block has been successfully written out.
	// It provides the number of bytes written.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

Writer implements the LZ4 frame encoder.

func NewWriter ¶

func NewWriter(dst io.Writer) *Writer

NewWriter returns a new LZ4 frame encoder. No access to the underlying io.Writer is performed. The supplied Header is checked at the first Write. The Header may be changed before the first Write, but must not be changed afterwards until Reset() is called.

func (*Writer) Close ¶

func (z *Writer) Close() error

Close closes the Writer, flushing any unwritten data to the underlying io.Writer, but does not close the underlying io.Writer.

func (*Writer) Flush ¶

func (z *Writer) Flush() error

Flush flushes any pending compressed data to the underlying writer. Flush does not return until the data has been written. If the underlying writer returns an error, Flush returns that error.

func (*Writer) Reset ¶

func (z *Writer) Reset(w io.Writer)

Reset clears the state of the Writer z such that it is equivalent to its initial state from NewWriter, but instead writing to w. No access to the underlying io.Writer is performed.

func (*Writer) WithConcurrency ¶

func (z *Writer) WithConcurrency(n int) *Writer

WithConcurrency sets the number of concurrent goroutines used for compression. A negative value sets the concurrency to GOMAXPROCS.

func (*Writer) Write ¶

func (z *Writer) Write(buf []byte) (int, error)

Write compresses data from the supplied buffer into the underlying io.Writer. Write does not return until the data has been written.

type WriterLegacy ¶

type WriterLegacy struct {
	Header
	// Handler called when a block has been successfully written out.
	// It provides the number of bytes written.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

WriterLegacy implements the LZ4Demo frame encoder.

func NewWriterLegacy ¶

func NewWriterLegacy(dst io.Writer) *WriterLegacy

NewWriterLegacy returns a new LZ4 encoder for the legacy frame format. No access to the underlying io.Writer is performed. The supplied Header is checked at the first Write. The Header may be changed before the first Write, but must not be changed afterwards until Reset() is called.

func (*WriterLegacy) Close ¶

func (z *WriterLegacy) Close() error

Close closes the WriterLegacy, flushing any unwritten data to the underlying io.Writer, but does not close the underlying io.Writer.

func (*WriterLegacy) Flush ¶

func (z *WriterLegacy) Flush() error