xz

v1.0.1
Published: Nov 4, 2020 License: CC0-1.0 Imports: 7 Imported by: 19

README

Xz

Package xz implements XZ decompression natively in Go.

Documentation at https://pkg.go.dev/github.com/therootcompany/xz.

Download and install with go get github.com/therootcompany/xz.

If you need compression as well as decompression, you might want to look at https://github.com/ulikunitz/xz.

LICENSE

This was originally released into the public domain by the AUTHORS.

Here it is licensed more explicitly as Creative Commons CC0 1.0 Universal (CC0-1.0) so that it can be detected by automated tooling, and satisfy the legal requirements for vendor integration for cases in which a "public domain" statement is not sufficient.

Documentation

Overview

Package xz implements XZ decompression natively in Go.

Usage

For ease of use, this package is designed to have a similar API to compress/gzip. See the examples for further details.
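
For comparison, the way compress/gzip reads concatenated members one at a time can be sketched with the standard library alone. The helper names gzipMember and splitMembers are made up for this illustration. With package xz the shape is expected to be the same, except that NewReader takes a dictMax argument and z.Reset(nil) continues on the existing source, as described under Reset.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// gzipMember compresses s into a single gzip member (illustration helper).
func gzipMember(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}
	return buf.Bytes()
}

// splitMembers decompresses each gzip member in data separately.
func splitMembers(data []byte) ([]string, error) {
	// bytes.Buffer implements io.ByteReader, so the gzip.Reader does not
	// read past the end of the member it is currently decoding.
	buf := bytes.NewBuffer(data)
	zr, err := gzip.NewReader(buf)
	if err != nil {
		return nil, err
	}
	var out []string
	for {
		// stop at the end of the current member instead of continuing
		zr.Multistream(false)
		b, err := io.ReadAll(zr)
		if err != nil {
			return nil, err
		}
		out = append(out, string(b))
		// move on to the next member; io.EOF means there is none
		err = zr.Reset(buf)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

func main() {
	data := append(gzipMember("Hello"), gzipMember("World!")...)
	parts, err := splitMembers(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", parts)
}
```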

Implementation

This package is a translation from C to Go of XZ Embedded (http://tukaani.org/xz/embedded.html) with enhancements made so as to implement all mandatory and optional parts of the XZ file format specification v1.0.4. It supports all filters and block check types, supports multiple streams, and performs index verification using SHA-256 as recommended by the specification.

Speed

On the author's Intel Ivy Bridge i5, decompression speed is about half that of the standard XZ Utils (tested with a recent Linux kernel tarball).

Thanks

Thanks are due to Lasse Collin and Igor Pavlov, the authors of XZ Embedded, on whose code package xz is based. It would not exist without their decision to allow others to modify and reuse their code.

Bug reports

For bug reports relating to this package please contact the author through https://github.com/xi2/xz/issues, and not the authors of XZ Embedded.

Constants

const DefaultDictMax = 1 << 26 // 64 MiB

DefaultDictMax is the default maximum dictionary size in bytes used by the decoder. This value is sufficient to decompress files created with XZ Utils "xz -9".
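
To make the 64 MiB figure concrete, the following is a minimal sketch of how an LZMA2 dictionary size byte can be decoded and compared against a limit, following the decoding rule given in the XZ file format specification. The names dictSize, checkDict, defaultDictMax, and errMemlimit are hypothetical stand-ins for this illustration and are not taken from the package's source.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultDictMax mirrors the documented DefaultDictMax value (64 MiB).
const defaultDictMax = 1 << 26

// errMemlimit is a local stand-in for the package's ErrMemlimit.
var errMemlimit = errors.New("dictionary size exceeds max")

// dictSize decodes the one-byte LZMA2 dictionary size property.
// Values 0..39 encode sizes from 4 KiB up to 3 GiB; 40 means 4 GiB - 1.
func dictSize(bits byte) (uint32, error) {
	if bits > 40 {
		return 0, errors.New("invalid dictionary size byte")
	}
	if bits == 40 {
		return 1<<32 - 1, nil
	}
	size := uint32(2 | (bits & 1))
	return size << (bits/2 + 11), nil
}

// checkDict reports errMemlimit when the encoded dictionary size is
// larger than dictMax. A dictMax of zero selects defaultDictMax.
func checkDict(bits byte, dictMax uint32) error {
	if dictMax == 0 {
		dictMax = defaultDictMax
	}
	size, err := dictSize(bits)
	if err != nil {
		return err
	}
	if size > dictMax {
		return errMemlimit
	}
	return nil
}

func main() {
	// 28 encodes 64 MiB; 30 encodes 128 MiB
	for _, bits := range []byte{28, 30} {
		size, _ := dictSize(bits)
		fmt.Printf("byte %d: %d MiB, limit check: %v\n",
			bits, size>>20, checkDict(bits, 0))
	}
}
```

Under this sketch, a stream whose filter declares a 128 MiB dictionary would need a dictMax of at least 1 << 27 passed to NewReader.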

Variables

var (
	ErrUnsupportedCheck = errors.New("xz: integrity check type not supported")
	ErrMemlimit         = errors.New("xz: LZMA2 dictionary size exceeds max")
	ErrFormat           = errors.New("xz: file format not recognized")
	ErrOptions          = errors.New("xz: compression options not supported")
	ErrData             = errors.New("xz: data is corrupt")
	ErrBuf              = errors.New("xz: data is truncated or corrupt")
)

Package specific errors.

Types

type CheckID

type CheckID int

CheckID is the type of the data integrity check in an XZ stream calculated from the uncompressed data.

const (
	CheckNone   CheckID = 0x00
	CheckCRC32  CheckID = 0x01
	CheckCRC64  CheckID = 0x04
	CheckSHA256 CheckID = 0x0A
)

func (CheckID) String

func (id CheckID) String() string

type Header

type Header struct {
	CheckType CheckID // type of the stream's data integrity check
}

An XZ stream contains a stream header which holds information about the stream. That information is exposed as fields of the Reader. Currently it contains only the stream's data integrity check type.
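
As a rough illustration of where the check type lives, the sketch below reads it directly from the 12-byte stream header using only the standard library. The layout (six magic bytes, two stream flag bytes, and a little-endian CRC32 of the flags) follows the XZ file format specification; parseCheckType and buildHeader are hypothetical helpers written for this example and are not part of package xz.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// headerMagic is the six-byte signature that starts every XZ stream.
var headerMagic = []byte{0xFD, '7', 'z', 'X', 'Z', 0x00}

// parseCheckType validates a stream header and returns the check type
// stored in the low four bits of the second stream flag byte.
func parseCheckType(hdr []byte) (byte, error) {
	if len(hdr) < 12 {
		return 0, errors.New("short stream header")
	}
	if !bytes.Equal(hdr[:6], headerMagic) {
		return 0, errors.New("bad magic bytes")
	}
	flags := hdr[6:8]
	if crc32.ChecksumIEEE(flags) != binary.LittleEndian.Uint32(hdr[8:12]) {
		return 0, errors.New("stream flags CRC32 mismatch")
	}
	// the first flag byte and the high four bits of the second are reserved
	if flags[0] != 0 || flags[1]&0xF0 != 0 {
		return 0, errors.New("reserved stream flag bits set")
	}
	return flags[1] & 0x0F, nil
}

// buildHeader assembles a stream header for the given check type so the
// example does not depend on a file on disk.
func buildHeader(check byte) []byte {
	hdr := append([]byte{}, headerMagic...)
	hdr = append(hdr, 0x00, check)
	crc := make([]byte, 4)
	binary.LittleEndian.PutUint32(crc, crc32.ChecksumIEEE(hdr[6:8]))
	return append(hdr, crc...)
}

func main() {
	ct, err := parseCheckType(buildHeader(0x0A)) // 0x0A matches CheckSHA256
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("check type: 0x%02X\n", ct)
}
```

In normal use there is no need for this: the same value is available as the CheckType field of a Reader.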

type Reader

type Reader struct {
	Header
	// contains filtered or unexported fields
}

A Reader is an io.Reader that can be used to retrieve uncompressed data from an XZ file.

In general, an XZ file can be a concatenation of other XZ files. Reads from the Reader return the concatenation of the uncompressed data of each.

func NewReader

func NewReader(r io.Reader, dictMax uint32) (*Reader, error)

NewReader creates a new Reader reading from r. The decompressor will use an LZMA2 dictionary of up to dictMax bytes. Passing zero sets dictMax to DefaultDictMax. If an individual XZ stream requires a dictionary larger than dictMax in order to decompress, Read will return ErrMemlimit.

If r is nil, NewReader creates a Reader for which all read attempts return io.EOF. This is useful if you just want to allocate memory for a Reader that will later be initialized with Reset.

Due to internal buffering, the Reader may read more data than necessary from r.

Example
package main

import (
	"bytes"
	"io"
	"io/ioutil"
	"log"
	"os"
	"path/filepath"

	"github.com/therootcompany/xz"
)

func main() {
	// load some XZ data into memory
	data, err := ioutil.ReadFile(
		filepath.Join("testdata", "xz-utils", "good-1-check-sha256.xz"))
	if err != nil {
		log.Fatal(err)
	}
	// create an xz.Reader to decompress the data
	r, err := xz.NewReader(bytes.NewReader(data), 0)
	if err != nil {
		log.Fatal(err)
	}
	// write the decompressed data to os.Stdout
	_, err = io.Copy(os.Stdout, r)
	if err != nil {
		log.Fatal(err)
	}
}
Output:

Hello
World!

func (*Reader) Multistream

func (z *Reader) Multistream(ok bool)

Multistream controls whether the reader is operating in multistream mode.

If enabled (the default), the Reader expects the input to be a sequence of XZ streams, possibly interspersed with stream padding, which it reads one after another. The effect is that the concatenation of a sequence of XZ streams or XZ files is treated as equivalent to the compressed result of the concatenation of the sequence. This is standard behaviour for XZ readers.

Calling Multistream(false) disables this behaviour, which can be useful when reading file formats that distinguish individual XZ streams. In this mode, when the Reader reaches the end of the stream, Read returns io.EOF. To start the next stream, call z.Reset(nil) followed by z.Multistream(false). If there is no next stream, z.Reset(nil) will return io.EOF.

Example
package main

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"log"
	"os"
	"path/filepath"

	"github.com/therootcompany/xz"
)

func main() {
	// load some XZ data into memory
	data, err := ioutil.ReadFile(
		filepath.Join("testdata", "xz-utils", "good-1-check-sha256.xz"))
	if err != nil {
		log.Fatal(err)
	}
	// create a MultiReader that will read the data twice
	mr := io.MultiReader(bytes.NewReader(data), bytes.NewReader(data))
	// create an xz.Reader from the MultiReader
	r, err := xz.NewReader(mr, 0)
	if err != nil {
		log.Fatal(err)
	}
	// set Multistream mode to false
	r.Multistream(false)
	// decompress the first stream
	_, err = io.Copy(os.Stdout, r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Read first stream")
	// reset the XZ reader so it is ready to read the second stream
	err = r.Reset(nil)
	if err != nil {
		log.Fatal(err)
	}
	// set Multistream mode to false again
	r.Multistream(false)
	// decompress the second stream
	_, err = io.Copy(os.Stdout, r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Read second stream")
	// reset the XZ reader so it is ready to read further streams
	err = r.Reset(nil)
	// confirm that the second stream was the last one
	if err == io.EOF {
		fmt.Println("No more streams")
	} else if err != nil {
		// any other error means the reset itself failed
		log.Fatal(err)
	}
}
Output:

Hello
World!
Read first stream
Hello
World!
Read second stream
No more streams

func (*Reader) Read

func (z *Reader) Read(p []byte) (n int, err error)

func (*Reader) Reset

func (z *Reader) Reset(r io.Reader) error

Reset, for non-nil values of io.Reader r, discards the Reader z's state and restores it to the state it had when first returned by NewReader, but reading from r instead. This permits reusing a Reader rather than allocating a new one.

To keep the current source, call z.Reset(nil), which also preserves internal buffering. If the Reader was at the end of a stream, it is then ready to read any follow-on streams; if there are none, z.Reset(nil) returns io.EOF. If the Reader was not at the end of a stream, z.Reset(nil) does nothing.
