lz4

v2.6.1+incompatible

This package is not in the latest version of its module.
Published: Jun 3, 2021 License: BSD-3-Clause Imports: 12 Imported by: 759

README

lz4 : LZ4 compression in pure Go

Overview

This package provides a streaming interface to LZ4 data streams as well as low level compress and uncompress functions for LZ4 data blocks. The implementation is based on the reference C one.

Install

Assuming you have the go toolchain installed:

go get github.com/pierrec/lz4

There is a command line interface tool to compress and decompress LZ4 files.

go install github.com/pierrec/lz4/cmd/lz4c

Usage

Usage of lz4c:
  -version
        print the program version

Subcommands:
Compress the given files or from stdin to stdout.
compress [arguments] [<file name> ...]
  -bc
        enable block checksum
  -l int
        compression level (0=fastest)
  -sc
        disable stream checksum
  -size string
        block max size [64K,256K,1M,4M] (default "4M")

Uncompress the given files or from stdin to stdout.
uncompress [arguments] [<file name> ...]

Example

// Compress and uncompress an input string.
s := "hello world"
r := strings.NewReader(s)

// The pipe will uncompress the data from the writer.
pr, pw := io.Pipe()
zw := lz4.NewWriter(pw)
zr := lz4.NewReader(pr)

go func() {
	// Compress the input string.
	_, _ = io.Copy(zw, r)
	_ = zw.Close() // Make sure the writer is closed
	_ = pw.Close() // Terminate the pipe
}()

_, _ = io.Copy(os.Stdout, zr)

// Output:
// hello world

Contributing

Contributions are very welcome, for example bug fixes and performance improvements.

  • Open an issue with a proper description
  • Send a pull request with appropriate test case(s)

Contributors

Thanks to all contributors so far!

Special thanks to @Zariel for his asm implementation of the decoder.

Special thanks to @klauspost for his work on optimizing the code.

Documentation

Overview

Package lz4 implements reading and writing lz4 compressed data (a frame), as specified in http://fastcompression.blogspot.fr/2013/04/lz4-streaming-format-final.html.

Although the block level compression and decompression functions are exposed and are fully compatible with the lz4 block format definition, they are low level and should not be used directly. For a complete description of an lz4 compressed block, see: http://fastcompression.blogspot.fr/2011/05/lz4-explained.html

See https://github.com/Cyan4973/lz4 for the reference C implementation.

Example
package main

import (
	"io"
	"os"
	"strings"

	"github.com/pierrec/lz4"
)

func main() {
	// Compress and uncompress an input string.
	s := "hello world"
	r := strings.NewReader(s)

	// The pipe will uncompress the data from the writer.
	pr, pw := io.Pipe()
	zw := lz4.NewWriter(pw)
	zr := lz4.NewReader(pr)

	go func() {
		// Compress the input string.
		_, _ = io.Copy(zw, r)
		_ = zw.Close() // Make sure the writer is closed
		_ = pw.Close() // Terminate the pipe
	}()

	_, _ = io.Copy(os.Stdout, zr)

}
Output:

hello world
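The example above discards every error. The following is a minimal sketch of the same pipe pattern with error propagation, assuming nothing beyond the standard library: `pipeThrough`, `roundTrip` and `nopCloser` are hypothetical names, and `nopCloser` is a pass-through stand-in for the compressing writer so that the sketch runs without the lz4 package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// nopCloser is a hypothetical stand-in for a compressing writer: it passes
// bytes through unchanged so the sketch needs only the standard library.
type nopCloser struct{ io.Writer }

func (nopCloser) Close() error { return nil }

// pipeThrough copies src through an encoder/decoder pair joined by an io.Pipe
// and returns the decoded text, reporting the first error from either side.
func pipeThrough(src io.Reader, enc func(io.Writer) io.WriteCloser, dec func(io.Reader) io.Reader) (string, error) {
	pr, pw := io.Pipe()
	zw := enc(pw)
	zr := dec(pr)

	go func() {
		_, err := io.Copy(zw, src)
		if cerr := zw.Close(); err == nil {
			err = cerr
		}
		// CloseWithError(nil) behaves like Close: the reader sees io.EOF.
		pw.CloseWithError(err)
	}()

	var out strings.Builder
	if _, err := io.Copy(&out, zr); err != nil {
		// Unblock the writing goroutine if the reading side gives up early.
		pr.CloseWithError(err)
		return "", err
	}
	return out.String(), nil
}

// roundTrip runs pipeThrough with the pass-through stand-ins.
func roundTrip(s string) (string, error) {
	enc := func(w io.Writer) io.WriteCloser { return nopCloser{w} }
	dec := func(r io.Reader) io.Reader { return r }
	return pipeThrough(strings.NewReader(s), enc, dec)
}

func main() {
	out, err := roundTrip("hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

With this package, the stand-ins would presumably be replaced by closures that return `lz4.NewWriter(w)` and `lz4.NewReader(r)`, since `*lz4.Writer` has `Write` and `Close` methods and `*lz4.Reader` has `Read`.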

Constants

const (
	// Extension is the LZ4 frame file name extension
	Extension = ".lz4"
	// Version is the LZ4 frame format version
	Version = 1
)

Variables

var (
	// ErrInvalidSourceShortBuffer is returned by UncompressBlock or CompressBlock when a compressed
	// block is corrupted or the destination buffer is not large enough for the uncompressed data.
	ErrInvalidSourceShortBuffer = errors.New("lz4: invalid source or destination buffer too short")
	// ErrInvalid is returned when reading an invalid LZ4 archive.
	ErrInvalid = errors.New("lz4: bad magic number")
	// ErrBlockDependency is returned when attempting to decompress an archive created with block dependency.
	ErrBlockDependency = errors.New("lz4: block dependency not supported")
	// ErrUnsupportedSeek is returned when attempting to Seek any way but forward from the current position.
	ErrUnsupportedSeek = errors.New("lz4: can only seek forward from io.SeekCurrent")
)
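ErrInvalid reports a bad magic number. As a rough illustration of what such a check involves, the sketch below classifies the first four bytes of a buffer. The magic values are taken from the LZ4 frame format description, not from this page, and `frameKind` and the constant names are hypothetical; this is not the package's own detection code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Magic numbers as given in the LZ4 frame format description. They are
// stored little-endian at the start of a frame.
const (
	frameMagic       uint32 = 0x184D2204 // LZ4 frame format
	frameMagicLegacy uint32 = 0x184C2102 // legacy frame format
)

// frameKind reports which known magic number, if any, starts buf.
func frameKind(buf []byte) string {
	if len(buf) < 4 {
		return "too short"
	}
	switch binary.LittleEndian.Uint32(buf) {
	case frameMagic:
		return "frame"
	case frameMagicLegacy:
		return "legacy"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(frameKind([]byte{0x04, 0x22, 0x4D, 0x18, 0x64}))
	fmt.Println(frameKind([]byte{0x02, 0x21, 0x4C, 0x18}))
	fmt.Println(frameKind([]byte("hello")))
}
```

A check like this can help decide between NewReader and NewReaderLegacy before any decompression is attempted.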

Functions

func CompressBlock

func CompressBlock(src, dst []byte, hashTable []int) (_ int, err error)

CompressBlock compresses the source buffer into the destination one. This is the fast version of LZ4 compression and also the default one.

The argument hashTable is scratch space for a hash table used by the compressor. If provided, it should have length at least 1<<16. If it is shorter (or nil), CompressBlock allocates its own hash table.

The size of the compressed data is returned.

If the destination buffer is smaller than CompressBlockBound(len(src)) and CompressBlock returns a size of 0 with no error, then the data is incompressible.

An error is returned if the destination buffer is too small.

Example
package main

import (
	"fmt"
	"strings"

	"github.com/pierrec/lz4"
)

func main() {
	s := "hello world"
	data := []byte(strings.Repeat(s, 100))
	buf := make([]byte, len(data))
	ht := make([]int, 64<<10) // buffer for the compression table

	n, err := lz4.CompressBlock(data, buf, ht)
	if err != nil {
		fmt.Println(err)
		return
	}
	if n == 0 || n >= len(data) {
		// buf is smaller than CompressBlockBound(len(data)), so a size of 0
		// with no error means the data is incompressible.
		fmt.Printf("`%s` is not compressible\n", s)
		return
	}
	buf = buf[:n] // compressed data

	// Allocate a very large buffer for decompression.
	out := make([]byte, 10*len(data))
	n, err = lz4.UncompressBlock(buf, out)
	if err != nil {
		fmt.Println(err)
		return
	}
	out = out[:n] // uncompressed data

	fmt.Println(string(out[:len(s)]))

}
Output:

hello world

func CompressBlockBound

func CompressBlockBound(n int) int

CompressBlockBound returns the maximum size of the compressed output for an input of size n, which is reached when the input is not compressible.
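To give a sense of the overhead involved, the sketch below evaluates the worst-case bound used by the reference C implementation (`LZ4_COMPRESSBOUND`: n + n/255 + 16). That this package computes exactly the same value is an assumption, and `blockBound` is a hypothetical name; real code should call CompressBlockBound.

```go
package main

import "fmt"

// blockBound mirrors the reference C macro LZ4_COMPRESSBOUND.
// Assumption: the package's CompressBlockBound follows the same formula.
func blockBound(n int) int {
	return n + n/255 + 16
}

func main() {
	for _, n := range []int{0, 255, 1000, 4 << 20} {
		b := blockBound(n)
		fmt.Printf("n=%d bound=%d overhead=%d\n", n, b, b-n)
	}
}
```

A destination buffer of at least this size removes the need to handle the incompressible case (size 0, no error) described for CompressBlock.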

func CompressBlockHC

func CompressBlockHC(src, dst []byte, depth int) (_ int, err error)

CompressBlockHC compresses the source buffer src into the destination dst with max search depth (use 0 or negative value for no max).

CompressBlockHC compression ratio is better than CompressBlock but it is also slower.

The size of the compressed data is returned.

If the destination buffer is smaller than CompressBlockBound(len(src)) and CompressBlockHC returns a size of 0 with no error, then the data is incompressible.

An error is returned if the destination buffer is too small.

func UncompressBlock

func UncompressBlock(src, dst []byte) (int, error)

UncompressBlock uncompresses the source buffer into the destination one, and returns the uncompressed size.

The destination buffer must be sized appropriately.

An error is returned if the source data is invalid or the destination buffer is too small.

Types

type Header

type Header struct {
	BlockChecksum    bool   // Compressed blocks checksum flag.
	NoChecksum       bool   // Frame checksum flag.
	BlockMaxSize     int    // Size of the uncompressed data block (one of [64KB, 256KB, 1MB, 4MB]). Default=4MB.
	Size             uint64 // Frame total size. It is _not_ computed by the Writer.
	CompressionLevel int    // Compression level (higher is better, use 0 for fastest compression).
	// contains filtered or unexported fields
}

Header describes the various flags that can be set on a Writer or obtained from a Reader. The default values match those of the LZ4 frame format definition (http://fastcompression.blogspot.com/2013/04/lz4-streaming-format-final.html).

NB. in a Reader, in case of concatenated frames, the Header values may change between Read() calls. It is the caller's responsibility to check them if necessary.
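The four permitted BlockMaxSize values correspond to the block size IDs 4 to 7 that the LZ4 frame format stores in the frame descriptor. The sketch below shows that mapping; `blockSizeFromID` and `blockSizeID` are hypothetical helpers written for illustration, not functions of this package.

```go
package main

import "fmt"

// blockSizeFromID converts a frame descriptor block size ID (4..7) into a
// size in bytes: 4 = 64KB, 5 = 256KB, 6 = 1MB, 7 = 4MB.
func blockSizeFromID(id int) (int, bool) {
	if id < 4 || id > 7 {
		return 0, false
	}
	return 1 << (8 + 2*uint(id)), true
}

// blockSizeID is the inverse mapping. It reports false for any size that is
// not one of the four values allowed for Header.BlockMaxSize.
func blockSizeID(size int) (int, bool) {
	for id := 4; id <= 7; id++ {
		if s, _ := blockSizeFromID(id); s == size {
			return id, true
		}
	}
	return 0, false
}

func main() {
	for id := 4; id <= 7; id++ {
		size, _ := blockSizeFromID(id)
		fmt.Printf("id=%d size=%d bytes\n", id, size)
	}
	_, ok := blockSizeID(100 << 10)
	fmt.Println("100KB allowed:", ok)
}
```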

func (*Header) Reset

func (h *Header) Reset()

Reset resets the Header's internal state.

func (Header) String

func (h Header) String() string

type Reader

type Reader struct {
	Header
	// Handler called when a block has been successfully read.
	// It provides the number of bytes read.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

Reader implements the LZ4 frame decoder. The Header is set after the first call to Read(). The Header may change between Read() calls in case of concatenated frames.

func NewReader

func NewReader(src io.Reader) *Reader

NewReader returns a new LZ4 frame decoder. No access to the underlying io.Reader is performed.

func (*Reader) Read

func (z *Reader) Read(buf []byte) (int, error)

Read decompresses data from the underlying source into the supplied buffer.

Since there can be multiple streams concatenated, Header values may change between calls to Read(). If that is the case, no data is actually read from the underlying io.Reader, to allow for potential input buffer resizing.

func (*Reader) Reset

func (z *Reader) Reset(r io.Reader)

Reset discards the Reader's state and makes it equivalent to a Reader freshly returned by NewReader, but reading from r instead. This permits reusing a Reader rather than allocating a new one.

func (*Reader) Seek

func (z *Reader) Seek(offset int64, whence int) (int64, error)

Seek implements io.Seeker, but supports seeking forward from the current position only; any other seek returns an error. It allows skipping output bytes that are not needed, which in some scenarios is faster than reading and discarding them. Note that this may cause later calls to Read() to read 0 bytes if all of the data they would have returned has been skipped.
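The forward-only contract can be illustrated with a small wrapper around any io.Reader. This is only a sketch of the calling convention: `forwardSeeker`, `skipAndRead` and `errForwardOnly` are hypothetical names, and the wrapper simply reads and discards, so it does not reproduce whatever shortcut makes the package's own Seek faster.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errForwardOnly = errors.New("can only seek forward from io.SeekCurrent")

// forwardSeeker is a hypothetical wrapper that accepts only forward seeks
// relative to the current position, implemented by discarding bytes.
type forwardSeeker struct {
	r   io.Reader
	pos int64 // number of bytes consumed so far
}

func (f *forwardSeeker) Read(p []byte) (int, error) {
	n, err := f.r.Read(p)
	f.pos += int64(n)
	return n, err
}

func (f *forwardSeeker) Seek(offset int64, whence int) (int64, error) {
	if whence != io.SeekCurrent || offset < 0 {
		return f.pos, errForwardOnly
	}
	// Skip offset bytes of output; io.EOF is returned if fewer are available.
	n, err := io.CopyN(io.Discard, f.r, offset)
	f.pos += n
	return f.pos, err
}

// skipAndRead skips the first skip bytes of s and returns the remainder.
func skipAndRead(s string, skip int64) (string, error) {
	fs := &forwardSeeker{r: strings.NewReader(s)}
	if _, err := fs.Seek(skip, io.SeekCurrent); err != nil {
		return "", err
	}
	rest, err := io.ReadAll(fs)
	return string(rest), err
}

func main() {
	rest, err := skipAndRead("hello world", 6)
	fmt.Println(rest, err)

	_, err = skipAndRead("hello world", -1)
	fmt.Println(err)
}
```

With a `*lz4.Reader`, the equivalent call would be `zr.Seek(n, io.SeekCurrent)` with a non-negative n, checking the returned error against ErrUnsupportedSeek.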

type ReaderLegacy

type ReaderLegacy struct {
	Header
	// Handler called when a block has been successfully read.
	// It provides the number of bytes read.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

ReaderLegacy implements the LZ4Demo frame decoder. The Header is set after the first call to Read().

func NewReaderLegacy

func NewReaderLegacy(src io.Reader) *ReaderLegacy

NewReaderLegacy returns a new LZ4Demo frame decoder. No access to the underlying io.Reader is performed.

func (*ReaderLegacy) Read

func (z *ReaderLegacy) Read(buf []byte) (int, error)

Read decompresses data from the underlying source into the supplied buffer.

Since there can be multiple streams concatenated, Header values may change between calls to Read(). If that is the case, no data is actually read from the underlying io.Reader, to allow for potential input buffer resizing.

func (*ReaderLegacy) Reset

func (z *ReaderLegacy) Reset(r io.Reader)

Reset discards the ReaderLegacy's state and makes it equivalent to a ReaderLegacy freshly returned by NewReaderLegacy, but reading from r instead. This permits reusing a ReaderLegacy rather than allocating a new one.

func (*ReaderLegacy) Seek

func (z *ReaderLegacy) Seek(offset int64, whence int) (int64, error)

Seek implements io.Seeker, but supports seeking forward from the current position only; any other seek returns an error. It allows skipping output bytes that are not needed, which in some scenarios is faster than reading and discarding them. Note that this may cause later calls to Read() to read 0 bytes if all of the data they would have returned has been skipped.

type Writer

type Writer struct {
	Header
	// Handler called when a block has been successfully written out.
	// It provides the number of bytes written.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

Writer implements the LZ4 frame encoder.

func NewWriter

func NewWriter(dst io.Writer) *Writer

NewWriter returns a new LZ4 frame encoder. No access to the underlying io.Writer is performed. The supplied Header is checked at the first Write. It is ok to change it before the first Write but then not until a Reset() is performed.

func (*Writer) Close

func (z *Writer) Close() error

Close closes the Writer, flushing any unwritten data to the underlying io.Writer, but does not close the underlying io.Writer.

func (*Writer) Flush

func (z *Writer) Flush() error

Flush flushes any pending compressed data to the underlying writer. Flush does not return until the data has been written. If the underlying writer returns an error, Flush returns that error.

func (*Writer) Reset

func (z *Writer) Reset(w io.Writer)

Reset clears the state of the Writer z such that it is equivalent to its initial state from NewWriter, but instead writing to w. No access to the underlying io.Writer is performed.

func (*Writer) WithConcurrency

func (z *Writer) WithConcurrency(n int) *Writer

WithConcurrency sets the number of concurrent go routines used for compression. A negative value sets the concurrency to GOMAXPROCS.

func (*Writer) Write

func (z *Writer) Write(buf []byte) (int, error)

Write compresses data from the supplied buffer into the underlying io.Writer. Write does not return until the data has been written.

type WriterLegacy

type WriterLegacy struct {
	Header
	// Handler called when a block has been successfully written out.
	// It provides the number of bytes written.
	OnBlockDone func(size int)
	// contains filtered or unexported fields
}

WriterLegacy implements the LZ4Demo frame encoder.

func NewWriterLegacy

func NewWriterLegacy(dst io.Writer) *WriterLegacy

NewWriterLegacy returns a new LZ4 encoder for the legacy frame format. No access to the underlying io.Writer is performed. The supplied Header is checked at the first Write. It is ok to change it before the first Write but then not until a Reset() is performed.

func (*WriterLegacy) Close

func (z *WriterLegacy) Close() error

Close closes the WriterLegacy, flushing any unwritten data to the underlying io.Writer, but does not close the underlying io.Writer.

func (*WriterLegacy) Flush

func (z *WriterLegacy) Flush() error

Flush flushes any pending compressed data to the underlying writer. Flush does not return until the data has been written. If the underlying writer returns an error, Flush returns that error.

func (*WriterLegacy) Reset

func (z *WriterLegacy) Reset(w io.Writer)

Reset clears the state of the WriterLegacy z such that it is equivalent to its initial state from NewWriterLegacy, but instead writing to w. No access to the underlying io.Writer is performed.

func (*WriterLegacy) Write

func (z *WriterLegacy) Write(buf []byte) (int, error)

Write compresses data from the supplied buffer into the underlying io.Writer. Write does not return until the data has been written.

Directories

Path Synopsis
cmd
internal
xxh32
Package xxh32 implements the very fast XXH hashing algorithm (32 bits version).
