xz

v0.6.0-alpha1
Published: Apr 30, 2023 License: BSD-3-Clause Imports: 12 Imported by: 746

README

Package xz

This Go language package supports the reading and writing of xz compressed streams. It also includes a gxz command for compressing and decompressing data. The package is written entirely in Go and has no dependency on C code.

The package is currently under development. There may be bugs, and the APIs are not considered stable. At this time the package cannot compete with the xz tool regarding compression speed and size. The algorithms there have been developed over a long time and are highly optimized. However, there are a number of improvements planned and I'm very optimistic about parallel compression and decompression. Stay tuned!

Using the API

The following example program shows how to use the API.

package main

import (
    "bytes"
    "io"
    "log"
    "os"

    "github.com/ulikunitz/xz"
)

func main() {
    const text = "The quick brown fox jumps over the lazy dog.\n"
    var buf bytes.Buffer
    // compress text
    w, err := xz.NewWriter(&buf)
    if err != nil {
        log.Fatalf("xz.NewWriter error %s", err)
    }
    if _, err := io.WriteString(w, text); err != nil {
        log.Fatalf("WriteString error %s", err)
    }
    if err := w.Close(); err != nil {
        log.Fatalf("w.Close error %s", err)
    }
    // decompress buffer and write output to stdout
    r, err := xz.NewReader(&buf)
    if err != nil {
        log.Fatalf("NewReader error %s", err)
    }
    if _, err = io.Copy(os.Stdout, r); err != nil {
        log.Fatalf("io.Copy error %s", err)
    }
}

Documentation

You can find the full documentation at pkg.go.dev.

Using the gxz compression tool

The package includes a gxz command line utility for compression and decompression.

Use the following command for installation (recent Go toolchains install binaries with go install rather than go get):

$ go install github.com/ulikunitz/xz/cmd/gxz@latest

To test it, call the following command:

$ gxz bigfile

After some time, a much smaller file bigfile.xz will replace bigfile. To decompress it, use the following command:

$ gxz -d bigfile.xz

Documentation

Overview

Package xz supports the compression and decompression of xz files. It supports version 1.1.0 of the specification without the non-LZMA2 filters. See http://tukaani.org/xz/xz-file-format-1.1.0.txt

Example
const text = "The quick brown fox jumps over the lazy dog."
var buf bytes.Buffer

// compress text
w, err := NewWriter(&buf)
if err != nil {
	log.Fatalf("NewWriter error %s", err)
}
if _, err := io.WriteString(w, text); err != nil {
	log.Fatalf("WriteString error %s", err)
}
if err := w.Close(); err != nil {
	log.Fatalf("w.Close error %s", err)
}

// decompress buffer and write result to stdout
r, err := NewReader(&buf)
if err != nil {
	log.Fatalf("NewReader error %s", err)
}
if _, err = io.Copy(os.Stdout, r); err != nil {
	log.Fatalf("io.Copy error %s", err)
}
Output:

The quick brown fox jumps over the lazy dog.

Constants

const (
	None   byte = 0x0
	CRC32  byte = 0x1
	CRC64  byte = 0x4
	SHA256 byte = 0xa
)

Constants for the checksum methods supported by xz.
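
To make the meaning of these IDs concrete, the following sketch maps each one to a standard-library hash and reports the size of the resulting check. It is not taken from the package; newCheckHash and checkSize are hypothetical helpers, and the choice of the IEEE polynomial for CRC32 and the ECMA polynomial for CRC64 follows the xz file format specification rather than anything stated on this page.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash"
	"hash/crc32"
	"hash/crc64"
)

// Check-method IDs, mirroring the constants documented above.
const (
	None   byte = 0x0
	CRC32  byte = 0x1
	CRC64  byte = 0x4
	SHA256 byte = 0xa
)

// newCheckHash is a hypothetical helper (not part of package xz) that
// returns a standard-library hash for a check-method ID. It returns a
// nil hash for None, because no checksum is stored in that case.
func newCheckHash(id byte) (hash.Hash, error) {
	switch id {
	case None:
		return nil, nil
	case CRC32:
		return crc32.NewIEEE(), nil
	case CRC64:
		return crc64.New(crc64.MakeTable(crc64.ECMA)), nil
	case SHA256:
		return sha256.New(), nil
	}
	return nil, fmt.Errorf("unsupported check method %#x", id)
}

// checkSize reports how many bytes the check for the given ID occupies.
func checkSize(id byte) (int, error) {
	h, err := newCheckHash(id)
	if err != nil {
		return 0, err
	}
	if h == nil {
		return 0, nil
	}
	return h.Size(), nil
}

func main() {
	for _, id := range []byte{None, CRC32, CRC64, SHA256} {
		n, err := checkSize(id)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("check %#x: %d bytes\n", id, n)
	}
}
```

The sizes matter when choosing a CheckSum value for a writer: a larger check adds a few more bytes per block in exchange for stronger error detection.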

const HeaderLen = 12

HeaderLen provides the length of the xz file header.

Variables

This section is empty.

Functions

func NewReader

func NewReader(xz io.Reader) (r io.ReadCloser, err error)

NewReader creates an io.ReadCloser. The function should never fail.

func NewReaderConfig

func NewReaderConfig(xz io.Reader, cfg ReaderConfig) (r io.ReadCloser, err error)

NewReaderConfig creates an xz reader using the provided configuration. If Workers is larger than one, the LZMA reader will only use single-threaded workers.

func ValidHeader

func ValidHeader(data []byte) bool

ValidHeader checks whether data is a correct xz file header. The length of data must be HeaderLen.
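
The 12 bytes of HeaderLen break down, according to the xz file format specification, into a six-byte magic signature, two bytes of stream flags and a four-byte little-endian CRC32 of those flags. The sketch below builds and checks such a header using only the standard library. It is an illustration of the layout, not the package's implementation: makeHeader and looksLikeHeader are hypothetical names, and ValidHeader may apply additional or different checks.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// headerLen mirrors the documented HeaderLen constant.
const headerLen = 12

// magic is the six-byte signature that starts an xz stream.
var magic = []byte{0xfd, '7', 'z', 'X', 'Z', 0x00}

// makeHeader builds a 12-byte stream header for the given check ID.
// Hypothetical helper for illustration only.
func makeHeader(checkID byte) []byte {
	h := make([]byte, headerLen)
	copy(h, magic)
	h[6] = 0x00    // first flag byte is reserved and must be zero
	h[7] = checkID // low four bits select the check method
	binary.LittleEndian.PutUint32(h[8:], crc32.ChecksumIEEE(h[6:8]))
	return h
}

// looksLikeHeader is a hypothetical stand-in for ValidHeader. It checks
// only the length, the magic bytes, the reserved flag bits and the
// CRC32 over the two flag bytes.
func looksLikeHeader(data []byte) bool {
	if len(data) != headerLen || !bytes.Equal(data[:6], magic) {
		return false
	}
	if data[6] != 0 || data[7]&0xf0 != 0 {
		return false
	}
	want := crc32.ChecksumIEEE(data[6:8])
	return binary.LittleEndian.Uint32(data[8:]) == want
}

func main() {
	h := makeHeader(0x4) // 0x4 is the CRC64 check ID
	fmt.Println(looksLikeHeader(h))

	h[8] ^= 0xff // corrupt the stored CRC32
	fmt.Println(looksLikeHeader(h))
}
```

A check of this kind is cheap enough to run on the first 12 bytes of a file before handing the whole input to a reader.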

Types

type ReaderConfig added in v0.5.1

type ReaderConfig struct {
	LZMA lzma.Reader2Config

	// input contains only a single stream without padding.
	SingleStream bool

	// Workers defines the number of readers for parallel reading. The
	// default is the value of GOMAXPROCS.
	Workers int
}

ReaderConfig defines the parameters for the xz reader. The SingleStream parameter requests the reader to assume that the underlying stream contains only a single stream without padding.

The Workers variable controls the number of parallel workers decoding the file. It only has an effect if the file was encoded with blocks whose headers record the compressed size. If Workers is not 1, the Workers variable in the LZMA configuration will be ignored.

func (*ReaderConfig) SetDefaults

func (cfg *ReaderConfig) SetDefaults()

SetDefaults sets the defaults in ReaderConfig.

func (*ReaderConfig) Verify added in v0.5.1

func (cfg *ReaderConfig) Verify() error

Verify checks the reader parameters for validity. Zero values will be replaced by default values.

type WriteFlushCloser

type WriteFlushCloser interface {
	io.WriteCloser
	Flush() error
}

func NewWriter

func NewWriter(xz io.Writer) (w WriteFlushCloser, err error)

func NewWriterConfig

func NewWriterConfig(xz io.Writer, cfg WriterConfig) (w WriteFlushCloser, err error)

NewWriterConfig creates a WriteFlushCloser instance. If multi-threading is requested by a Workers configuration larger than 1, single threading will be requested for the LZMA writer by setting the Workers variable there to 1.

type WriterConfig added in v0.5.1

type WriterConfig struct {
	// LZMA2 configuration
	LZMA lzma.Writer2Config

	// BlockSize defines the maximum uncompressed size of a block. The
	// default is MaxInt64 (2^63-1) if Workers equals 1, or 8 MB otherwise.
	BlockSize int64

	// checksum method: CRC32, CRC64 or SHA256 (default: CRC64)
	CheckSum byte
	// Forces NoChecksum (default: false)
	NoCheckSum bool

	// Workers defines the number of goroutines compressing data. If it is
	// zero, the value of GOMAXPROCS determines the number of goroutines.
	// If the number is 1, the compression happens in classic streaming
	// mode and the compressed file must also be decompressed serially.
	Workers int
}

WriterConfig describes the parameters for an xz writer. CRC64 is used as the default checksum, although the XZ specification only requires a decoder to support CRC32.

func (*WriterConfig) SetDefaults

func (c *WriterConfig) SetDefaults()

SetDefaults applies the defaults to the xz writer configuration.

func (*WriterConfig) Verify added in v0.5.1

func (c *WriterConfig) Verify() error

Verify checks the configuration for errors. Zero values will be replaced by default values.

Directories

Path Synopsis
internal/randtxt    Package randtxt supports the generation of random text using a trigram model for the English language.
lzma                Package lzma provides support for the encoding and decoding LZMA, LZMA2 and raw LZMA streams without a header.
