ctw

package module
v0.0.0-...-5bf3644 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 8, 2024 License: BSD-3-Clause Imports: 10 Imported by: 0

README

Context Tree Weighting

Documentation

Package ctw provides an implementation of the Context Tree Weighting algorithm. Also contained is an implementation of the Rissanen-Langdon Arithmetic Coding algorithm, which is combined with Context Tree Weighting to create a lossless compression/decompression utility.

Below is an example of using this package to compress Lincoln's Gettysburg address:

go run compress/main.go gettysburg.txt > gettys.ctw
cat gettys.ctw | go run decompress/main.go > gettys.dctw
diff gettysburg.txt gettys.dctw

The results are noticeably superior to that of other commercial applications on a Mac OS X:

  • Original: 1463
  • tar.gz: 993
  • 7z: 908
  • zip: 874
  • xz: 828
  • CTW: 772

Reference: F.M.J. Willems and Tj. J. Tjalkens, Complexity Reduction of the Context-Tree Weighting Algorithm: A Study for KPN Research, Technical University of Eindhoven, EIDMA Report RS.97.01.

Full documentation at https://godoc.org/github.com/fumin/ctw.

Applications

A demonstration of using this compression tool to study mammalian evolution and construct the phylogeny of the SARS virus can be found here.

Testing

go test

Questions

  • Why does increasing the depth above 48 not improve the compression of gettysburg.txt? Depth 48 gives 772 bytes, while depth 60 also gives 772 bytes.
  • The exposition in https://cs.anu.edu.au/courses/comp4620/2015/slides-ctw.pdf gives a CTW based way of predicting the next bit. However, it is not clear how should we predict the next say 10 bits, without iterating through the 1024 different possibilities.

Documentation

Overview

Package ctw provides an implementation of the Context Tree Weighting algorithm. Also contained is an implementation of the Rissanen-Langdon Arithmetic Coding algorithm, which is combined with Context Tree Weighting to create a lossless compression/decompression utility.

Below is an example of using this package to compress Lincoln's Gettysburg address:

go run compress/main.go gettysburg.txt > gettys.ctw
cat gettys.ctw | go run decompress/main.go > gettys.dctw
diff gettysburg.txt gettys.dctw

Reference: F.M.J. Willems and Tj. J. Tjalkens, Complexity Reduction of the Context-Tree Weighting Algorithm: A Study for KPN Research, Technical University of Eindhoven, EIDMA Report RS.97.01.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Compress

func Compress(w io.Writer, name string, depth int) error

(Compress) compresses the named file using arithmetic coding supplied with a Context Tree Weighting probabilistic model of depth <depth>. The compressed result is written to <w>.

func Decompress

func Decompress(w io.Writer, r io.Reader, depth int) error

Decompress decompress a compressed stream of bytes generated by Compress. Decompress reads the compressed bytes from r, and writes the decompressed result to w. Decompress expects the same Context Tree Weighting depth used in Compress.

Types

type CTW

type CTW struct {
	Bits []int
	Root *treeNode
}

A CTW is a Context Tree Weighting based probabilistic model for binary data. CTW implements the arithmetic coding Model interface.

func NewCTW

func NewCTW(bits []int) *CTW

NewCTW returns a new CTW whose context tree's depth is len(bits). The prior context of the tree is given by bits.

func (*CTW) Observe

func (model *CTW) Observe(bit int)

Observe updates the context tree, given that the sequence is followed by bit.

func (*CTW) Prob0

func (model *CTW) Prob0() float64

Prob0 returns the probability that the next bit be zero.

type CTWReverter

type CTWReverter struct {
	// contains filtered or unexported fields
}

A CTWReverter is a CTW model that allows reverting to its previous state. This is useful for predicting several steps ahead, while keeping the model's original state intact.

func NewCTWReverter

func NewCTWReverter(model *CTW) *CTWReverter

func (*CTWReverter) Observe

func (cr *CTWReverter) Observe(bit int)

func (*CTWReverter) Prob0

func (cr *CTWReverter) Prob0() float64

func (*CTWReverter) Unobserve

func (cr *CTWReverter) Unobserve()

type FCTW

type FCTW struct {
	Trees     []*CTW
	Block_len int
	Index     int
}

An FCTW is a probabilistic model for structured binary data constructed from CTW models for each bit position within a block. FCTW implements the arithmetic coding Model interface

func NewFCTW

func NewFCTW(block_len int, bits []int) *FCTW

NewFCTW returns a new FCTW whose context tree's depth is len(bits). The prior context of the trees is given by bits. The initial index position is len(bits) mod block_len.

func (*FCTW) Observe

func (model *FCTW) Observe(bit int)

Observe updates the context tree, given that the sequence is followed by bit.

func (*FCTW) Prob0

func (model *FCTW) Prob0() float64

Prob0 returns the probability that the next bit be zero.

Directories

Path Synopsis
ac
Package ac defines the interfaces the arithmetic coding algorithm requires.
Package ac defines the interfaces the arithmetic coding algorithm requires.
willems
Package willems implements the arithmetic coding algorithm described in Chapter 6.
Package willems implements the arithmetic coding algorithm described in Chapter 6.
witten
Package witten implements the arithmetic coding algorithm described in Witten, Ian H.; Neal, Radford M.; Cleary, John G. (June 1987).
Package witten implements the arithmetic coding algorithm described in Witten, Ian H.; Neal, Radford M.; Cleary, John G. (June 1987).
app

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL