dictzip

Version: v0.2.0 (latest) | Published: Nov 17, 2024 | License: Apache-2.0 | Imports: 12 | Imported by: 0

Repository

github.com/ianlewis/go-dictzip

README ¶

go-dictzip

go-dictzip is a Go library for reading and writing dictzip files.

Status

The API is currently unstable and will change. This package will use module version numbering to manage versions and compatibility.

Installation

To install this package, run:

go get github.com/ianlewis/go-dictzip

Examples

Reading compressed files

You can open a dictionary file and read it much like a normal reader.

// Open the dictionary.
f, _ := os.Open("dictionary.dict.dz")
r, _ := dictzip.NewReader(f)
defer r.Close()

uncompressedData, _ := io.ReadAll(r)

Random access

Random access can be performed using the ReadAt method.

// Open the dictionary.
f, _ := os.Open("dictionary.dict.dz")
r, _ := dictzip.NewReader(f)
defer r.Close()

buf := make([]byte, 12)
_, _ = r.ReadAt(buf, 5)

Writing compressed files

Dictzip files can be written using the dictzip.Writer. Compressed data is stored in chunks, and the chunk sizes are stored in the archive header, which allows for more efficient random access.

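// Create the compressed dictionary file for writing.
f, _ := os.OpenFile("dictionary.dict.dz", os.O_WRONLY|os.O_CREATE, 0o644)
w, _ := dictzip.NewWriter(f)
// Close must be called so that the header and chunks are written out.
defer w.Close()

buf := []byte("Hello World!")
_, _ = w.Write(buf)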
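
The chunking idea can be sketched with the standard library alone. This is not how the package lays data out on disk: for simplicity it deflates every chunk as an independent stream, and compressChunks and readChunk are hypothetical helper names. It only shows why recording per-chunk compressed sizes lets a reader inflate one chunk without touching the others.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// compressChunks deflates data in independent chunkSize-byte pieces and
// returns the concatenated output plus the compressed size of each piece.
// Hypothetical helper: a simplification, not the dictzip on-disk format.
func compressChunks(data []byte, chunkSize int) ([]byte, []int, error) {
	var out bytes.Buffer
	var sizes []int
	for start := 0; start < len(data); start += chunkSize {
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		before := out.Len()
		fw, err := flate.NewWriter(&out, flate.DefaultCompression)
		if err != nil {
			return nil, nil, err
		}
		if _, err := fw.Write(data[start:end]); err != nil {
			return nil, nil, err
		}
		if err := fw.Close(); err != nil {
			return nil, nil, err
		}
		sizes = append(sizes, out.Len()-before)
	}
	return out.Bytes(), sizes, nil
}

// readChunk inflates only chunk i. Earlier chunks are skipped by summing
// their recorded compressed sizes; they are never decompressed.
func readChunk(compressed []byte, sizes []int, i int) ([]byte, error) {
	if i < 0 || i >= len(sizes) {
		return nil, fmt.Errorf("chunk %d out of range", i)
	}
	off := 0
	for _, s := range sizes[:i] {
		off += s
	}
	fr := flate.NewReader(bytes.NewReader(compressed[off : off+sizes[i]]))
	defer fr.Close()
	return io.ReadAll(fr)
}

func main() {
	data := []byte("Hello World! Hello Gophers!")
	compressed, sizes, err := compressChunks(data, 8)
	if err != nil {
		panic(err)
	}
	chunk, err := readChunk(compressed, sizes, 1)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d chunks, chunk 1 = %q\n", len(sizes), chunk)
}
```

The 8-byte chunk size is chosen only to keep the demonstration small.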

dictzip Command

This repository also includes a dictzip command that is compatible with the dictzip(1) command.

# compress dictionary.dict to dictionary.dict.dz
$ dictzip dictionary.dict

# decompress dictionary.dict.dz to dictionary.dict
$ dictzip -d dictionary.dict.dz

# decompress part of the file and print to stdout
$ dictzip --stdout --start 1024 --size 25 dictionary.dict.dz
dictionary entry contents

Related projects

pebbe/dictzip

References

dictzip(1) - Linux man page
RFC 1952 - GZIP file format specification

Documentation ¶

Overview ¶

Package dictzip implements the dictzip compression format. Dictzip compresses files using the gzip(1) algorithm (LZ77) in a manner which is completely compatible with the gzip file format.

See: https://linux.die.net/man/1/dictzip
See: https://linux.die.net/man/1/gzip
See: https://datatracker.ietf.org/doc/html/rfc1952

Unless otherwise stated, clients should not assume that implementations in this package are safe for concurrent use.
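
Gzip compatibility comes from storing the random-access table in a gzip EXTRA sub-field. The following is a minimal sketch of reading that sub-field, assuming the "RA" layout described in dictzip(1): 16-bit little-endian version, chunk length, and chunk count, followed by one 16-bit compressed size per chunk. The raField type and parseRA function are hypothetical names and are not this package's parser.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// raField holds the values of the dictzip "RA" sub-field (hypothetical type).
type raField struct {
	Version  uint16
	ChunkLen uint16
	Sizes    []uint16
}

// parseRA scans a gzip EXTRA field, which RFC 1952 structures as a list of
// sub-fields (two ID bytes, a 16-bit length, then the data), for "RA".
func parseRA(extra []byte) (*raField, error) {
	for len(extra) >= 4 {
		id1, id2 := extra[0], extra[1]
		n := int(binary.LittleEndian.Uint16(extra[2:4]))
		if len(extra) < 4+n {
			return nil, errors.New("truncated sub-field")
		}
		body := extra[4 : 4+n]
		extra = extra[4+n:]
		if id1 != 'R' || id2 != 'A' {
			continue // some other sub-field; skip it
		}
		if len(body) < 6 {
			return nil, errors.New("RA sub-field too short")
		}
		count := int(binary.LittleEndian.Uint16(body[4:6]))
		if len(body) < 6+2*count {
			return nil, errors.New("RA chunk table truncated")
		}
		ra := &raField{
			Version:  binary.LittleEndian.Uint16(body[0:2]),
			ChunkLen: binary.LittleEndian.Uint16(body[2:4]),
			Sizes:    make([]uint16, count),
		}
		for i := range ra.Sizes {
			ra.Sizes[i] = binary.LittleEndian.Uint16(body[6+2*i:])
		}
		return ra, nil
	}
	return nil, errors.New("no RA sub-field")
}

func main() {
	extra := []byte{
		'R', 'A', 10, 0, // sub-field ID and data length (10 bytes)
		1, 0, // version 1
		0xff, 0xff, // chunk length 65535
		2, 0, // two chunks
		0x34, 0x12, 0x10, 0x00, // compressed sizes 0x1234 and 0x0010
	}
	ra, err := parseRA(extra)
	if err != nil {
		panic(err)
	}
	fmt.Println(ra.Version, ra.ChunkLen, ra.Sizes)
}
```

Because a plain gzip decoder ignores EXTRA sub-fields it does not recognise, a file carrying this table can still be read as ordinary gzip.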

Example ¶

path := "internal/testdata/hello.txt.dz"
f, err := os.Open(path)
if err != nil {
	panic(err)
}

r, err := dictzip.NewReader(f)
if err != nil {
	panic(err)
}

buf := make([]byte, 12)
_, err = r.ReadAt(buf, 5)
if err != nil {
	panic(err)
}

fmt.Println(string(buf))

Output:

Hello World!

Index ¶

Constants
Variables
type Header
- func (h *Header) ChunkSize() int
- func (h *Header) Sizes() []int
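type Reader
- func NewReader(r io.ReadSeeker) (*Reader, error)
- func (z *Reader) Close() error
- func (z *Reader) Read(p []byte) (int, error)
- func (z *Reader) ReadAt(p []byte, off int64) (int, error)
- func (z *Reader) Reset(r io.ReadSeeker) error
- func (z *Reader) Seek(offset int64, whence int) (int64, error)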
type Writer
- func NewWriter(w io.Writer) (*Writer, error)
- func NewWriterLevel(w io.Writer, level, chunkSize int) (*Writer, error)
- func (z *Writer) Close() error
- func (z *Writer) Write(p []byte) (int, error)

Examples ¶

Package

Constants ¶

const (
	// OSFAT represents a FAT filesystem OS (MS-DOS, OS/2, NT/Win32).
	OSFAT byte = iota

	// OSAmiga represents the Amiga OS.
	OSAmiga

	// OSVMS represents VMS (or OpenVMS).
	OSVMS

	// OSUnix represents Unix operating systems.
	OSUnix

	// OSVM represents VM/CMS.
	OSVM

	// OSAtari represents Atari TOS.
	OSAtari

	// OSHPFS represents HPFS filesystem (OS/2, NT).
	OSHPFS

	// OSMacintosh represents the Macintosh operating system.
	OSMacintosh

	// OSZSystem represents Z-System.
	OSZSystem

	// OSCPM represents the CP/M operating system.
	OSCPM

	// OSTOPS20 represents the TOPS-20 operating system.
	OSTOPS20

	// OSNTFS represents an NTFS filesystem OS (NT).
	OSNTFS

	// OSQDOS represents QDOS.
	OSQDOS

	// OSAcorn represents Acorn RISCOS.
	OSAcorn

	// OSUnknown represents an unknown operating system.
	OSUnknown = 0xff
)

const (
	// XFLSlowest indicates that the compressor used maximum compression (i.e. the slowest algorithm).
	XFLSlowest byte = 0x2

	// XFLFastest indicates that the compressor used the fastest algorithm.
	XFLFastest byte = 0x4
)
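
To show where these flag values live, the sketch below uses the standard library's compress/gzip (not this package) to write an empty gzip stream and read back bytes 8 and 9 of the fixed header, which RFC 1952 defines as XFL and OS. The headerBytes helper is a hypothetical name.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// headerBytes writes an empty gzip stream with the standard library and
// returns the XFL and OS bytes (offsets 8 and 9 of the 10-byte header).
func headerBytes(level int, osID byte) (byte, byte, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return 0, 0, err
	}
	zw.Header.OS = osID // must be set before the header is written
	if err := zw.Close(); err != nil {
		return 0, 0, err
	}
	b := buf.Bytes()
	return b[8], b[9], nil
}

func main() {
	// 3 is the OS value for Unix in RFC 1952 (OSUnix in the list above).
	xfl, osByte, err := headerBytes(gzip.BestCompression, 3)
	if err != nil {
		panic(err)
	}
	fmt.Printf("XFL=%#x OS=%d\n", xfl, osByte)
}
```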

const (
	// NoCompression performs no compression on the input.
	NoCompression = flate.NoCompression

	// BestSpeed provides the lowest level of compression but the fastest
	// performance.
	BestSpeed = flate.BestSpeed

	// BestCompression provides the highest level of compression but the slowest
	// performance.
	BestCompression = flate.BestCompression

	// DefaultCompression is the default compression level used for compressing
	// chunks. It provides a balance between compression and performance.
	DefaultCompression = flate.DefaultCompression

	// HuffmanOnly disables Lempel-Ziv match searching and only performs Huffman
	// entropy encoding. See [flate.HuffmanOnly].
	HuffmanOnly = flate.HuffmanOnly
)

const (
	// DefaultChunkSize is the default chunk size used when writing dictzip files.
	DefaultChunkSize = math.MaxUint16
)

Variables ¶

var (

	// ErrHeader indicates an error with gzip header data.
	ErrHeader = fmt.Errorf("%w: invalid header", errDictzip)
)
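
Because ErrHeader is built with %w, it can be matched with errors.Is even after further wrapping. The sketch below reproduces the pattern with local stand-ins: errBase, errHeader, checkMagic, and isHeaderErr are all hypothetical names, and the message text of the unexported errDictzip is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// errBase stands in for the package's unexported errDictzip.
	errBase = errors.New("dictzip")
	// errHeader mirrors the way ErrHeader wraps the base error.
	errHeader = fmt.Errorf("%w: invalid header", errBase)
)

// checkMagic is a hypothetical validator that adds detail to errHeader.
func checkMagic(b []byte) error {
	if len(b) < 2 || b[0] != 0x1f || b[1] != 0x8b {
		return fmt.Errorf("%w: bad gzip magic bytes", errHeader)
	}
	return nil
}

// isHeaderErr reports whether err is, or wraps, errHeader.
func isHeaderErr(err error) bool {
	return errors.Is(err, errHeader)
}

func main() {
	err := checkMagic([]byte{0x00, 0x00})
	fmt.Println(err)
	// Both the specific and the base sentinel match through the chain.
	fmt.Println(isHeaderErr(err), errors.Is(err, errBase))
}
```

With the real package the equivalent check would be errors.Is(err, dictzip.ErrHeader), assuming the errors it returns wrap ErrHeader; the documentation does not state this explicitly.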

Functions ¶

This section is empty.

Types ¶

type Header ¶

type Header struct {
	// Comment is the COMMENT header field.
	Comment string

	// Extra includes all EXTRA sub-fields except the dictzip RA sub-field.
	Extra []byte

	// ModTime is the MTIME modification time field.
	ModTime time.Time

	// Name is the NAME header field.
	Name string

	// OS is the OS header field.
	OS byte
	// contains filtered or unexported fields
}

Header is the gzip file header.

Strings must be UTF-8 encoded and may only contain Unicode code points U+0001 through U+00FF, due to limitations of the gzip file format.
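
A caller could check Name and Comment against this restriction before setting them. The helper below is a hypothetical sketch, not part of the package; whether the package performs the same check itself is not stated here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// validHeaderString reports whether s is valid UTF-8 and every code point
// lies in the range U+0001 through U+00FF.
func validHeaderString(s string) bool {
	if !utf8.ValidString(s) {
		return false
	}
	for _, r := range s {
		if r < 0x01 || r > 0xff {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"café.dict", "辞書.dict", "a\x00b"} {
		fmt.Printf("%q valid: %v\n", s, validHeaderString(s))
	}
}
```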

func (h *Header) ChunkSize() int

ChunkSize returns the dictzip uncompressed data chunk size.

func (h *Header) Sizes() []int

Sizes returns the dictzip sizes for the compressed data chunks.
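
These two values are enough to find the chunk that holds a given uncompressed offset. The sketch below shows the arithmetic only; locate is a hypothetical helper, and the returned start position is relative to the first compressed chunk rather than to the beginning of the file.

```go
package main

import "fmt"

// locate maps an uncompressed offset to the index of the chunk containing
// it, the offset inside that chunk once inflated, and the position of the
// chunk's compressed bytes relative to the first chunk. ok is false when
// the offset lies outside the data described by sizes.
func locate(off int64, chunkSize int, sizes []int) (chunk, within int, start int64, ok bool) {
	if off < 0 || chunkSize <= 0 {
		return 0, 0, 0, false
	}
	chunk = int(off / int64(chunkSize))
	if chunk >= len(sizes) {
		return 0, 0, 0, false
	}
	// Sum the compressed sizes of all chunks before the target one.
	for _, s := range sizes[:chunk] {
		start += int64(s)
	}
	return chunk, int(off % int64(chunkSize)), start, true
}

func main() {
	// Illustrative values in place of h.ChunkSize() and h.Sizes().
	chunkSize := 65535
	sizes := []int{1200, 900, 400}

	chunk, within, start, ok := locate(70000, chunkSize, sizes)
	fmt.Println(chunk, within, start, ok)
}
```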

type Reader ¶

type Reader struct {
	// Header is the gzip header data and is valid after [NewReader] or
	// [Reader.Reset].
	Header
	// contains filtered or unexported fields
}

Reader implements io.Reader, io.ReaderAt, io.Seeker, and io.Closer. It provides random access to the compressed data.

func NewReader ¶

func NewReader(r io.ReadSeeker) (*Reader, error)

NewReader returns a new dictzip Reader reading compressed data from the given reader. It does not assume control of the given io.Reader. It is the responsibility of the caller to call Close on that reader when it is no longer used.

NewReader will call Seek on the given reader to ensure that it is being read from the beginning.

It is the caller's responsibility to call Reader.Close on the returned Reader when done.

func (*Reader) Close ¶

func (z *Reader) Close() error

Close closes the reader. It does not close the underlying io.Reader.

func (*Reader) Read ¶

func (z *Reader) Read(p []byte) (int, error)

Read implements io.Reader.

func (*Reader) ReadAt ¶

func (z *Reader) ReadAt(p []byte, off int64) (int, error)

ReadAt implements io.ReaderAt.ReadAt.

func (*Reader) Reset ¶

func (z *Reader) Reset(r io.ReadSeeker) error

Reset discards the reader's state and resets it to the initial state as returned by NewReader, but reading from r instead.

Reset will call Seek on the given reader to ensure that it is being read from the beginning.

func (*Reader) Seek ¶

func (z *Reader) Seek(offset int64, whence int) (int64, error)

Seek implements io.Seeker.Seek.

type Writer ¶ added in v0.2.0

type Writer struct {
	// Header is written to the file when [Writer.Close] is called.
	Header
	// contains filtered or unexported fields
}

Writer implements io.WriteCloser for writing dictzip files. Writer writes chunks to a temporary file during write and copies the resulting data to the final file when Writer.Close is called.

For this reason, Writer.Close must be called in order to write the file correctly.
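
The reason for this two-phase design is that the header must list chunk sizes that are only known once all data has been written. The toy sketch below shows the same pattern with an in-memory buffer standing in for the temporary file. The deferredWriter type, the encode helper, and the size-table layout are all hypothetical and unrelated to the real dictzip format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// deferredWriter buffers each written record and, on Close, emits a table
// of record sizes followed by the buffered records.
type deferredWriter struct {
	dst   io.Writer
	body  bytes.Buffer // stands in for the temporary file
	sizes []uint16
}

// Write stores p as one record and remembers its length.
func (w *deferredWriter) Write(p []byte) (int, error) {
	if len(p) > 0xffff {
		return 0, fmt.Errorf("record of %d bytes exceeds 16-bit size field", len(p))
	}
	w.sizes = append(w.sizes, uint16(len(p)))
	return w.body.Write(p)
}

// Close writes the record count and sizes, then copies the buffered data.
// Nothing reaches dst until Close is called.
func (w *deferredWriter) Close() error {
	hdr := make([]byte, 2+2*len(w.sizes))
	binary.LittleEndian.PutUint16(hdr, uint16(len(w.sizes)))
	for i, s := range w.sizes {
		binary.LittleEndian.PutUint16(hdr[2+2*i:], s)
	}
	if _, err := w.dst.Write(hdr); err != nil {
		return err
	}
	_, err := io.Copy(w.dst, &w.body)
	return err
}

// encode writes each record through a deferredWriter and returns the result.
func encode(records ...string) ([]byte, error) {
	var out bytes.Buffer
	w := &deferredWriter{dst: &out}
	for _, r := range records {
		if _, err := w.Write([]byte(r)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

func main() {
	b, err := encode("ab", "cde")
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", b)
}
```

If Close is skipped in this sketch, the destination stays empty, which mirrors why Writer.Close is required above.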

func NewWriter ¶ added in v0.2.0

func NewWriter(w io.Writer) (*Writer, error)

NewWriter initializes a new dictzip Writer with the default compression level and chunk size.

The OS header field is set to OSUnknown (0xff) by default.

func NewWriterLevel ¶ added in v0.2.0

func NewWriterLevel(w io.Writer, level, chunkSize int) (*Writer, error)

NewWriterLevel initializes a new dictzip Writer with the given compression level and chunk size.

The OS header field is set to OSUnknown (0xff) by default.

func (*Writer) Close ¶ added in v0.2.0

func (z *Writer) Close() error

Close closes the writer by writing the header with calculated offsets and copying chunks from the temporary file to the final output file.

func (*Writer) Write ¶ added in v0.2.0

func (z *Writer) Write(p []byte) (int, error)

Write implements io.Writer.

Directories ¶

Path	Synopsis
cmd/dictzip	Package main is the main package for the `dictzip` command.
