xz

Version: v0.6.0-alpha.3
Warning: This package is not in the latest version of its module.
Published: Jun 12, 2023 License: BSD-3-Clause Imports: 14 Imported by: 746

README

Package xz

This Go language package supports the reading and writing of xz compressed streams. It also includes a gxz command for compressing and decompressing data. The package is written completely in Go and has no dependency on any C code.

The package is currently under development. There might be bugs, and the APIs are not considered stable. At this time the package cannot compete with the xz tool regarding compression speed and size. The algorithms there have been developed over a long time and are highly optimized. However, there are a number of improvements planned, and I'm very optimistic about parallel compression and decompression. Stay tuned!

Using the API

The following example program shows how to use the API.

package main

import (
    "bytes"
    "io"
    "log"
    "os"

    "github.com/ulikunitz/xz"
)

func main() {
    const text = "The quick brown fox jumps over the lazy dog.\n"
    var buf bytes.Buffer
    // compress text
    w, err := xz.NewWriter(&buf)
    if err != nil {
        log.Fatalf("xz.NewWriter error %s", err)
    }
    if _, err := io.WriteString(w, text); err != nil {
        log.Fatalf("WriteString error %s", err)
    }
    if err := w.Close(); err != nil {
        log.Fatalf("w.Close error %s", err)
    }
    // decompress buffer and write output to stdout
    r, err := xz.NewReader(&buf)
    if err != nil {
        log.Fatalf("NewReader error %s", err)
    }
    if _, err = io.Copy(os.Stdout, r); err != nil {
        log.Fatalf("io.Copy error %s", err)
    }
}

Documentation

You can find the full documentation at pkg.go.dev.

Using the gxz compression tool

The package includes a gxz command line utility for compression and decompression.

Use the following command for installation (Go 1.17 and later install commands with go install; older toolchains used go get):

$ go install github.com/ulikunitz/xz/cmd/gxz@latest

To test it, call the following command:

$ gxz bigfile

After some time a much smaller file bigfile.xz will replace bigfile. To decompress it use the following command.

$ gxz -d bigfile.xz

Documentation

Overview

Package xz supports the compression and decompression of xz files. It supports version 1.1.0 of the specification without the non-LZMA2 filters. See http://tukaani.org/xz/xz-file-format-1.1.0.txt

Example
const text = "The quick brown fox jumps over the lazy dog."
var buf bytes.Buffer

// compress text
w, err := NewWriter(&buf)
if err != nil {
	log.Fatalf("NewWriter error %s", err)
}
if _, err := io.WriteString(w, text); err != nil {
	log.Fatalf("WriteString error %s", err)
}
if err := w.Close(); err != nil {
	log.Fatalf("w.Close error %s", err)
}

// decompress buffer and write result to stdout
r, err := NewReader(&buf)
if err != nil {
	log.Fatalf("NewReader error %s", err)
}
if _, err = io.Copy(os.Stdout, r); err != nil {
	log.Fatalf("io.Copy error %s", err)
}
Output:

The quick brown fox jumps over the lazy dog.

Constants

const (
	None   byte = 0x0
	CRC32  byte = 0x1
	CRC64  byte = 0x4
	SHA256 byte = 0xa
)

Constants for the checksum methods supported by xz.

const HeaderLen = 12

HeaderLen provides the length of the xz file header.

Variables

This section is empty.

Functions

func NewReader

func NewReader(xz io.Reader) (r io.ReadCloser, err error)

NewReader creates an io.ReadCloser. The function should never fail.

func NewReaderConfig

func NewReaderConfig(xz io.Reader, cfg ReaderConfig) (r io.ReadCloser, err error)

NewReaderConfig creates an xz reader using the provided configuration. If Workers is larger than one, the LZMA reader will only use single-threaded workers.

func ValidHeader

func ValidHeader(data []byte) bool

ValidHeader checks whether data is a correct xz file header. The length of data must be HeaderLen.
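
To illustrate what such a check involves: the xz file format specification lays out the 12-byte stream header as 6 magic bytes, 2 stream-flag bytes and a little-endian CRC32 of the flags. The sketch below uses only the standard library and is not the package's implementation; makeHeader and looksLikeHeader are hypothetical names, and ValidHeader may apply further checks (for example on the checksum ID).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// headerLen mirrors xz.HeaderLen: 6 magic bytes + 2 flag bytes + 4 CRC32 bytes.
const headerLen = 12

// magic is the byte sequence that starts every xz stream.
var magic = []byte{0xFD, '7', 'z', 'X', 'Z', 0x00}

// makeHeader (hypothetical) builds a stream header for the given check ID.
func makeHeader(check byte) []byte {
	h := make([]byte, headerLen)
	copy(h, magic)
	h[6] = 0x00  // first flag byte is reserved and must be zero
	h[7] = check // low four bits carry the check ID
	binary.LittleEndian.PutUint32(h[8:], crc32.ChecksumIEEE(h[6:8]))
	return h
}

// looksLikeHeader (hypothetical) checks length, magic bytes, reserved flag
// bits and the CRC32 over the two flag bytes.
func looksLikeHeader(data []byte) bool {
	if len(data) != headerLen || !bytes.Equal(data[:6], magic) {
		return false
	}
	if data[6] != 0 || data[7]&0xF0 != 0 {
		return false
	}
	return binary.LittleEndian.Uint32(data[8:]) == crc32.ChecksumIEEE(data[6:8])
}

func main() {
	h := makeHeader(0x4) // 0x4 is the CRC64 check ID
	fmt.Println(looksLikeHeader(h)) // prints "true"

	h[7] ^= 0x01 // change the flags without updating the CRC32
	fmt.Println(looksLikeHeader(h)) // prints "false"
}
```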

Types

type ReaderConfig added in v0.5.1

type ReaderConfig struct {
	// Workers defines the number of readers for parallel reading. The
	// default is the value of GOMAXPROCS.
	Workers int

	// Read a single xz stream from the underlying reader, stop and return
	// EOF. No checks are done whether the underlying reader finishes too.
	SingleStream bool

	// Runs the multiple Workers in LZMA mode. (This is an experimental
	// setup and is normally not required.)
	LZMAParallel bool

	// LZMAWorkSize provides the work size to the LZMA layer. It is only
	// required if LZMAParallel is set.
	LZMAWorkSize int
}

ReaderConfig defines the parameters for the xz reader. The SingleStream parameter requests the reader to assume that the underlying stream contains only a single stream without padding.

The Workers variable controls the number of parallel workers decoding the file. It only has an effect if the file was encoded in a way that created blocks with the compressed size set in the headers. If Workers is not 1, the Workers variable in LZMAConfig will be ignored.

func (*ReaderConfig) MarshalJSON

func (cfg *ReaderConfig) MarshalJSON() (p []byte, err error)

MarshalJSON creates the JSON structure for a ReaderConfig value.

func (*ReaderConfig) SetDefaults

func (cfg *ReaderConfig) SetDefaults()

SetDefaults sets the defaults in ReaderConfig.

func (*ReaderConfig) UnmarshalJSON

func (cfg *ReaderConfig) UnmarshalJSON(p []byte) error

UnmarshalJSON parses JSON and sets the ReaderConfig accordingly.

func (*ReaderConfig) Verify added in v0.5.1

func (cfg *ReaderConfig) Verify() error

Verify checks the reader parameters for validity. Zero values will be replaced by default values.

type WriteFlushCloser

type WriteFlushCloser interface {
	io.WriteCloser
	Flush() error
}

WriteFlushCloser supports the Write, Flush and Close methods.

func NewWriter

func NewWriter(xz io.Writer) (w WriteFlushCloser, err error)

NewWriter creates a new Writer for xz-compressed data. The Writer uses the preset #5. See Preset and NewWriterConfig for changing the parameters.

func NewWriterConfig

func NewWriterConfig(xz io.Writer, cfg WriterConfig) (w WriteFlushCloser, err error)

NewWriterConfig creates a WriteFlushCloser instance. If multi-threading is requested by a Workers configuration larger than 1, single threading will be requested for the LZMA writer by setting the Workers variable there to 1.

type WriterConfig added in v0.5.1

type WriterConfig struct {
	// WindowSize sets the dictionary size.
	WindowSize int

	// Properties for the LZMA algorithm.
	Properties lzma.Properties
	// FixedProperties indicates that Properties is indeed zero.
	FixedProperties bool

	// Number of workers processing data.
	Workers int
	// LZMAParallel indicates that the parallel execution should be on the
	// LZMA level. (This is an experimental setup and should normally not be
	// used.)
	LZMAParallel bool
	// Size of buffer used by the worker in LZMA work.
	LZMAWorkSize int

	// Configuration for the LZ parser.
	ParserConfig lz.ParserConfig

	// XZBlockSize defines the maximum uncompressed size of an xz-format
	// block. The default is MaxInt64=2^63-1 for a single-worker setup and
	// 256 kByte with multiple parallel workers. Note that the XZ block size
	// differs from the parser block size.
	XZBlockSize int64

	// checksum method: CRC32, CRC64 or SHA256 (default: CRC64)
	Checksum byte

	// Forces NoChecksum (default: false)
	NoChecksum bool
}

WriterConfig describes the parameters for an xz writer. CRC64 is used as the default checksum, although the XZ specification only requires a decoder to support CRC32.

func Preset

func Preset(n int) WriterConfig

Preset returns a WriterConfig with preset parameters. Supported presets range from 1 to 9, from fast to slow with increasing compression rate.

func (*WriterConfig) Clone

func (cfg *WriterConfig) Clone() WriterConfig

Clone creates a deep copy of the cfg value.

func (*WriterConfig) MarshalJSON

func (cfg *WriterConfig) MarshalJSON() (p []byte, err error)

MarshalJSON creates the JSON representation of the WriterConfig value.

func (*WriterConfig) SetDefaults

func (cfg *WriterConfig) SetDefaults()

SetDefaults applies the defaults to the xz writer configuration.

func (*WriterConfig) UnmarshalJSON

func (cfg *WriterConfig) UnmarshalJSON(p []byte) error

UnmarshalJSON parses a JSON value and sets the WriterConfig value accordingly.

func (*WriterConfig) Verify added in v0.5.1

func (cfg *WriterConfig) Verify() error

Verify checks the configuration for errors. Zero values will be replaced by default values.

Directories

Path: Synopsis
internal/randtxt: Package randtxt supports the generation of random text using a trigram model for the English language.
lzma: Package lzma provides support for the encoding and decoding of LZMA, LZMA2 and raw LZMA streams without a header.
