compress

package
v0.0.0-...-7eedc68
Published: Oct 4, 2022 License: MIT Imports: 12 Imported by: 0

Documentation

Constants

const (
	// MagicNumber is an arbitrary number at the start of all guppy files
	// which should help identify when the code is run on something else by
	// accident.
	MagicNumber = 0xbadf00d0
	// ReverseMagicNumber is the magic number if read on a machine with
	// flipped endianness.
	ReverseMagicNumber = 0xd000fdba
	Version            = 1
)

Variables

This section is empty.

Functions

func BlockToSlices

func BlockToSlices(span [3]int, firstDim int, x, buf []int64) [][]int64

BlockToSlices converts a block of x-major indices into a set of slices which each correspond to a 1-dimensional "skewer" through the block. These are organized so only one actual value needs to be stored for the block. The order is: first, one skewer along firstDim; then a face of skewers in the next direction; and finally a block of skewers filling out the rest of the data.

func ChooseFirstDim

func ChooseFirstDim(name string) int

ChooseFirstDim chooses the first encoded dimension for a variable with a given name. It is chosen so that almost all the deltas are perpendicular to the direction of the vector if the stored data is a vector.

func DeltaDecode

func DeltaDecode(offset int64, x, out []int64)

DeltaDecode decodes an integer array encoded with DeltaEncode.
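As a rough illustration of the delta scheme shared by DeltaEncode and DeltaDecode, here is a minimal sketch. The helpers deltaEncode and deltaDecode are hypothetical stand-ins, not the package's functions: they take no qPeriod argument and do no periodic wrapping, which the real DeltaEncode presumably handles.

```go
package main

import "fmt"

// deltaEncode stores x[i] - x[i-1] in out[i], treating offset as the
// element before x[0]. Hypothetical helper: unlike the package's
// DeltaEncode it has no qPeriod argument and never wraps differences.
func deltaEncode(offset int64, x, out []int64) {
	prev := offset
	for i, v := range x {
		out[i] = v - prev
		prev = v
	}
}

// deltaDecode inverts deltaEncode with a running sum starting at offset.
func deltaDecode(offset int64, x, out []int64) {
	prev := offset
	for i, d := range x {
		prev += d
		out[i] = prev
	}
}

func main() {
	x := []int64{10, 12, 11, 15}
	deltaEncode(10, x, x) // in place: x and out may be the same slice
	fmt.Println(x)        // [0 2 -1 4]
	deltaDecode(10, x, x)
	fmt.Println(x) // [10 12 11 15]
}
```

Both loops read element i before writing it, which is why passing the same slice as input and output is safe in this sketch.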

func DeltaDecodeFromSlices

func DeltaDecodeFromSlices(firstOffset int64, x [][]int64)

DeltaDecodeFromSlices runs DeltaDecode on a set of slices. This includes finding the correct offsets.

func DeltaEncode

func DeltaEncode(offset, qPeriod int64, x, out []int64)

DeltaEncode delta encodes the array x into the array out. The element before x[0] is taken to be offset. x and out can be the same array. Since the encoding is done into a uint64 array,

func Dequantize

func Dequantize(
	name string, q []int64, delta float64, qPeriod int64,
	typeFlag TypeFlag, buf *Buffer,
) particles.Field

Dequantize converts an []int64 array to an array of the type given by typeFlag. If the output type is floating point, each value x is mapped to delta*x + delta*uniform(0, 1) rather than converted directly. It assumes that buf has been resized to the same length as q.

func MakeDeltaSlices

func MakeDeltaSlices(span [3]int, firstDim int, buf []int64) [][]int64

MakeDeltaSlices splits up an array, buf, into slices according to the splitting strategy used by BlockToSlices: the first slice has length span[firstDim], the next span[firstDim] slices have length span[secondDim] - 1, and the next span[firstDim]*span[secondDim] slices have length span[thirdDim] - 1.
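The slice lengths listed above add up to exactly span[0]*span[1]*span[2] elements, which the following sketch makes concrete. makeSlices is a hypothetical helper, and it assumes that the second and third dimensions are the next two axes after firstDim in cyclic order; the package may order them differently.

```go
package main

import "fmt"

// makeSlices cuts buf into consecutive sub-slices: one of length n1,
// then n1 of length n2-1, then n1*n2 of length n3-1.
// Hypothetical helper: the choice of second and third dimension
// (cyclic order after firstDim) is an assumption.
func makeSlices(span [3]int, firstDim int, buf []int64) [][]int64 {
	d2, d3 := (firstDim+1)%3, (firstDim+2)%3
	n1, n2, n3 := span[firstDim], span[d2], span[d3]

	out := make([][]int64, 0, 1+n1+n1*n2)
	take := func(n int) {
		out = append(out, buf[:n])
		buf = buf[n:]
	}
	take(n1)
	for i := 0; i < n1; i++ {
		take(n2 - 1)
	}
	for i := 0; i < n1*n2; i++ {
		take(n3 - 1)
	}
	return out
}

func main() {
	span := [3]int{2, 3, 4}
	buf := make([]int64, span[0]*span[1]*span[2])
	s := makeSlices(span, 0, buf)
	// 1 + 2 + 6 slices with lengths 2, 2, and 3 respectively.
	fmt.Println(len(s), len(s[0]), len(s[1]), len(s[8])) // 9 2 2 3
}
```

The total is n1 + n1*(n2-1) + n1*n2*(n3-1) = n1*n2*n3, so the slices tile buf with nothing left over.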

func Quantize

func Quantize(f particles.Field, delta float64, qPeriod int64, out []int64)

Quantize converts an array to []int64 and writes it to out. If the array is floating point, it is stored to an accuracy of delta.
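The relationship between Quantize and Dequantize for floating point data can be sketched as follows. The helpers quantize and dequantize are hypothetical and work on single values; they ignore qPeriod and the Buffer, and the random draw is passed in as the parameter u so the example stays deterministic.

```go
package main

import (
	"fmt"
	"math"
)

// quantize maps x to the index of the width-delta bin that contains it.
// Hypothetical helper: no periodic wrapping (qPeriod) is applied.
func quantize(x, delta float64) int64 {
	return int64(math.Floor(x / delta))
}

// dequantize returns a point inside bin q, following the formula
// delta*x + delta*uniform(0, 1). u stands in for the uniform draw and
// must lie in [0, 1).
func dequantize(q int64, delta, u float64) float64 {
	return delta*float64(q) + delta*u
}

func main() {
	delta := 0.01
	for _, x := range []float64{1.234, -0.575, 42.0} {
		q := quantize(x, delta)
		y := dequantize(q, delta, 0.5)
		fmt.Printf("x=%g q=%d y=%g err=%.4f\n", x, q, y, math.Abs(x-y))
	}
}
```

Because the reconstructed value always lands in the same bin as the original, the round-trip error stays below delta (up to floating point rounding), which matches the stated accuracy.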

func ReadCompressedIntsZLib

func ReadCompressedIntsZLib(rd io.Reader, b []byte, q []int64) ([]byte, error)

ReadCompressedIntsZLib reads an array of ints, q, from an io.Reader using column-ordered zlib blocks. b is used as a temporary internal buffer and will be resized as needed. A resized version is returned by the function.

This function is based on zlib entropy encoding.

func ReadCompressedIntsZStd

func ReadCompressedIntsZStd(
	rd io.Reader, b, buf []byte, q []int64,
) (bOut, bufOut []byte, err error)

ReadCompressedIntsZStd reads an array of ints, q, from an io.Reader using column-ordered zstd blocks. b and buf are used as temporary internal buffers and will be resized as needed. Resized versions are returned by the function.

This function is based on zstd entropy encoding.

func RotateDecode

func RotateDecode(delta []int64, rot int64)

func RotateEncode

func RotateEncode(delta []int64, rot int64)

func SliceOffsets

func SliceOffsets(x [][]int64) []int64

SliceOffsets returns the offset associated with each slice within the overall block.

func SlicesToBlock

func SlicesToBlock(span [3]int, firstDim int, x [][]int64, out []int64)

SlicesToBlock joins a set of slices, x, into a block in out.

func WriteCompressedIntsZLib

func WriteCompressedIntsZLib(q []int64, b []byte, wr io.Writer) error

WriteCompressedIntsZLib writes an array of ints, q, to an io.Writer using column-ordered zlib blocks. b is used as a temporary internal buffer and must be the same length as q.

This function is based on zlib entropy encoding.
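To show what "column-ordered" can mean here, the sketch below stores byte 0 of every integer first, then byte 1 of every integer, and so on, before handing the bytes to zlib. This is an illustration under assumptions, not the package's on-disk format: the helper names are hypothetical, and all eight columns go into a single zlib stream for brevity, whereas the real functions appear to work one column at a time (b has the same length as q).

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// toColumns writes byte k of q[i] to b[k*len(q)+i]; len(b) must be 8*len(q).
func toColumns(q []int64, b []byte) {
	n := len(q)
	for i, v := range q {
		u := uint64(v)
		for k := 0; k < 8; k++ {
			b[k*n+i] = byte(u >> (8 * uint(k)))
		}
	}
}

// fromColumns inverts toColumns.
func fromColumns(b []byte, q []int64) {
	n := len(q)
	for i := range q {
		var u uint64
		for k := 0; k < 8; k++ {
			u |= uint64(b[k*n+i]) << (8 * uint(k))
		}
		q[i] = int64(u)
	}
}

// compressColumns is a hypothetical helper: column-order q, then zlib it.
func compressColumns(q []int64) ([]byte, error) {
	b := make([]byte, 8*len(q))
	toColumns(q, b)
	var out bytes.Buffer
	zw := zlib.NewWriter(&out)
	if _, err := zw.Write(b); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// decompressColumns fills q from data produced by compressColumns.
func decompressColumns(data []byte, q []int64) error {
	zr, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return err
	}
	defer zr.Close()
	b, err := io.ReadAll(zr)
	if err != nil {
		return err
	}
	if len(b) != 8*len(q) {
		return fmt.Errorf("got %d bytes, want %d", len(b), 8*len(q))
	}
	fromColumns(b, q)
	return nil
}

func main() {
	q := make([]int64, 1000)
	for i := range q {
		q[i] = int64(120 + i%16) // small values: the upper 7 columns are all zero
	}
	data, err := compressColumns(q)
	if err != nil {
		panic(err)
	}
	out := make([]int64, len(q))
	if err := decompressColumns(data, out); err != nil {
		panic(err)
	}
	fmt.Println("raw bytes:", 8*len(q), "compressed bytes:", len(data))
}
```

The point of the column layout is that, for small deltas, the high-order byte columns are long runs of identical bytes, which a general-purpose compressor handles well.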

func WriteCompressedIntsZStd

func WriteCompressedIntsZStd(
	q []int64, b, buf []byte, wr io.Writer,
) ([]byte, error)

WriteCompressedIntsZStd writes an array of ints, q, to an io.Writer using column-ordered zstd blocks. b is used as a temporary internal buffer and must be the same length as q. buf is a buffer used internally; it will be resized as needed and returned. Keep passing the returned buffer to later calls of WriteCompressedIntsZStd to avoid extra allocations.

This function is based on zstd entropy encoding.

Types

type Buffer

type Buffer struct {
	// contains filtered or unexported fields
}

Buffer is an expandable buffer which is used by many of compress's functions to avoid unneeded heap allocations.

func NewBuffer

func NewBuffer(seed uint64) *Buffer

NewBuffer creates a new, resizable Buffer.

func (*Buffer) Resize

func (buf *Buffer) Resize(n int)

Resize resizes the buffer so its arrays all have length n.

type DeltaStats

type DeltaStats struct {
	// contains filtered or unexported fields
}

DeltaStats is a histogram containing delta values which can be used to compute various statistics of the delta distribution.

func (*DeltaStats) Load

func (stats *DeltaStats) Load(delta []int64)

Load loads an array into the DeltaStats histogram. It must be called before other methods are called.

func (*DeltaStats) Mean

func (stats *DeltaStats) Mean() int64

Mean returns the mean of the histogram.

func (*DeltaStats) Mode

func (stats *DeltaStats) Mode() int64

Mode returns the mode of the histogram.

func (*DeltaStats) NeededRotation

func (stats *DeltaStats) NeededRotation(mid int64) int64

NeededRotation returns how far each element in delta would need to be shifted upward to make sure that all elements are positive and that mid % 256 = 127. If mid is chosen appropriately, this latter condition can allow zlib to compress values more efficiently.
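One way to compute such a shift is sketched below. neededRotation and rotate are hypothetical helpers that work directly on a slice rather than on a DeltaStats histogram, and they assume that mid lies within the range of the data. The idea, presumably, is that placing the typical value in the middle of the lowest byte keeps nearby values from spilling into the higher bytes.

```go
package main

import "fmt"

// neededRotation returns the smallest shift rot >= 0 such that every
// element of delta plus rot is non-negative and (mid+rot) % 256 == 127.
// Hypothetical helper: delta must be non-empty and mid is assumed to be
// no smaller than the minimum of delta.
func neededRotation(delta []int64, mid int64) int64 {
	lo := delta[0]
	for _, d := range delta {
		if d < lo {
			lo = d
		}
	}
	rot := int64(0)
	if lo < 0 {
		rot = -lo
	}
	// Go's % keeps the sign of the dividend, so normalize into [0, 256).
	rem := ((127-(mid+rot))%256 + 256) % 256
	return rot + rem
}

// rotate adds rot to every element in place; subtracting rot undoes it.
func rotate(delta []int64, rot int64) {
	for i := range delta {
		delta[i] += rot
	}
}

func main() {
	delta := []int64{-3, 0, 2, 5}
	rot := neededRotation(delta, 0)
	rotate(delta, rot)
	fmt.Println(rot, delta) // 127 [124 127 129 132]
}
```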

func (*DeltaStats) Window

func (stats *DeltaStats) Window(size int) int64

Window returns the center of the "window" of the given size which contains the maximum number of values.
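A minimal sketch of the window search, using a sorted copy of the values instead of a histogram. windowCenter is a hypothetical helper; it assumes the window is the half-open interval [lo, lo+size) and reports lo + size/2 as its center, and the package's exact conventions (and tie-breaking) may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// windowCenter returns the center of the interval [lo, lo+size) that
// contains the most elements of vals. Ties go to the lowest window.
// Hypothetical helper; size is assumed to be positive.
func windowCenter(vals []int64, size int) int64 {
	s := append([]int64(nil), vals...)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })

	best, bestLo := 0, int64(0)
	j := 0
	for i := range s {
		// Advance j to the first element outside [s[i], s[i]+size).
		for j < len(s) && s[j] < s[i]+int64(size) {
			j++
		}
		if j-i > best {
			best, bestLo = j-i, s[i]
		}
	}
	return bestLo + int64(size)/2
}

func main() {
	vals := []int64{0, 1, 1, 2, 2, 2, 3, 50, 51}
	fmt.Println(windowCenter(vals, 4)) // 2
}
```

It is enough to try windows that start at a data value, since any best window can be slid right until its lower edge touches one without losing elements.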

type FixedWidthHeader

type FixedWidthHeader struct {
	// N and NTot give the number of particles in the file and in the
	// total simulation, respectively.
	N, NTot int64
	// Span gives the dimensions of the slab of particles in the file.
	// Span[0], Span[1], and Span[2] are the x-, y-, and z- dimensions.
	// Offset gives the ID-coordinates of the particle at index zero. This
	// allows the original IDs to be reconstructed. TotalSpan gives the span
	// of the entire simulation volume (or the HR region if you're looking
	// at a zoom-in).
	Span, Offset, TotalSpan [3]int64
	// Z, OmegaM, OmegaL, H100, L, and Mass give the redshift, Omega_m,
	// Omega_Lambda, H0 / (100 km/s/Mpc), box width in comoving Mpc/h,
	// and particle mass in Msun/h.
	Z, OmegaM, OmegaL, H100, L, Mass float64
}

type Header

type Header struct {
	FixedWidthHeader
	// OriginalHeader is the original header of one of the simulation files.
	OriginalHeader []byte
	// Names gives the names of all the variables stored in the file.
	// Types gives the types of these variables. "u32"/"u64" give 32-bit and
	// 64-bit unsigned integers, respectively, and "f32"/"f64" give 32-bit
	// and 64-bit floats, respectively.
	Names, Types []string
	Sizes        []int64
}

type LagrangianDelta

type LagrangianDelta struct {
	// contains filtered or unexported fields
}

LagrangianDelta is a compression method which encodes the difference between variables along lines in Lagrangian space. It implements the Method interface. See the documentation for Method for descriptions of the various class methods.

func NewLagrangianDelta

func NewLagrangianDelta(span [3]int, delta, period float64) *LagrangianDelta

NewLagrangianDelta creates a new LagrangianDelta object. The span of the particles in ID-space is given by span, the minimum accuracy is given by delta, and the periodicity is given by period (i.e. "the size of the box"). If this method is being used on non-periodic data, set period to a non-positive number.

func (*LagrangianDelta) Compress

func (m *LagrangianDelta) Compress(
	f particles.Field, buf *Buffer, wr io.Writer,
) error

(see documentation for the Method interface)

func (*LagrangianDelta) Decompress

func (m *LagrangianDelta) Decompress(
	buf *Buffer, rd io.Reader, name string,
) (particles.Field, error)

(see documentation for the Method interface)

func (*LagrangianDelta) MethodFlag

func (m *LagrangianDelta) MethodFlag() MethodFlag

(see documentation for the Method interface)

func (*LagrangianDelta) ReadInfo

func (m *LagrangianDelta) ReadInfo(order binary.ByteOrder, rd io.Reader) error

(see documentation for the Method interface)

func (*LagrangianDelta) SetOrder

func (m *LagrangianDelta) SetOrder(order binary.ByteOrder)

(see documentation for the Method interface)

func (*LagrangianDelta) Span

func (m *LagrangianDelta) Span() [3]int

(see documentation for the Method interface)

func (*LagrangianDelta) WriteInfo

func (m *LagrangianDelta) WriteInfo(wr io.Writer) error

(see documentation for the Method interface)

type Method

type Method interface {
	// MethodFlag returns the method used to compress the data.
	MethodFlag() MethodFlag
	// SetOrder sets the byte order of the compression method.
	SetOrder(order binary.ByteOrder)
	// Span returns the span of the data compressed by the method.
	Span() [3]int

	// WriteInfo writes initialization information to a Writer.
	WriteInfo(wr io.Writer) error
	// ReadInfo reads initialization information from a Reader.
	ReadInfo(order binary.ByteOrder, rd io.Reader) error

	// Compress compresses the particles in a given field and writes them to
	// a Writer. The buffer buf is used for intermediate allocations.
	Compress(f particles.Field, buf *Buffer, wr io.Writer) error
	// Decompress decompresses the particles from a Reader and returns a Field
	// containing them. This Field will use the Buffer buf to create the space
	// for the Field, so you need to copy that data elsewhere before calling
	// Decompress again.
	Decompress(buf *Buffer, rd io.Reader, name string) (particles.Field, error)
}

Method is an interface representing a compression method.

type MethodFlag

type MethodFlag uint32

MethodFlag is a flag representing the method used to compress the data.

const (
	LagrangianDeltaFlag MethodFlag = iota
)

type RNG

type RNG struct {
	// contains filtered or unexported fields
}

RNG is an xorshift random number generator. It is the same as gotetra's xorshiftGenerator. It is not thread safe.

func NewRNG

func NewRNG(seed uint64) *RNG

NewRNG creates an RNG initialized with a given seed.

func (*RNG) Uniform

func (gen *RNG) Uniform() float64

Uniform generates a single random number in the range [0, 1).

func (*RNG) UniformSequence

func (gen *RNG) UniformSequence(target []float64)

UniformSequence generates one random number in the range [0, 1) for each element of the array target and writes them to that array.
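For readers unfamiliar with xorshift generators, here is a minimal sketch of one. It is an assumption-laden illustration: the type xorshift64, its shift constants (13, 7, 17), and its zero-seed handling are hypothetical and not necessarily what RNG or gotetra's xorshiftGenerator use.

```go
package main

import "fmt"

// xorshift64 is a hypothetical 64-bit xorshift generator with the shift
// triple (13, 7, 17). Like RNG, it is not safe for concurrent use.
type xorshift64 struct{ state uint64 }

func newXorshift64(seed uint64) *xorshift64 {
	if seed == 0 {
		seed = 0x9e3779b97f4a7c15 // an all-zero state would never change
	}
	return &xorshift64{state: seed}
}

// next advances the state and returns it.
func (g *xorshift64) next() uint64 {
	x := g.state
	x ^= x << 13
	x ^= x >> 7
	x ^= x << 17
	g.state = x
	return x
}

// uniform returns a float64 in [0, 1) built from the top 53 bits.
func (g *xorshift64) uniform() float64 {
	return float64(g.next()>>11) / (1 << 53)
}

// uniformSequence fills target with independent draws from uniform.
func (g *xorshift64) uniformSequence(target []float64) {
	for i := range target {
		target[i] = g.uniform()
	}
}

func main() {
	gen := newXorshift64(42)
	vals := make([]float64, 4)
	gen.uniformSequence(vals)
	fmt.Println(vals)
}
```

Seeding two generators with the same value yields the same sequence, which is what makes a seeded generator useful for reproducible dequantization.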

type Reader

type Reader struct {
	Header
	// contains filtered or unexported fields
}

Reader handles the I/O and navigation associated with reading compressed fields from disk. Unlike Writer, it will need to be closed after use.

func NewReader

func NewReader(
	fname string, buf *Buffer, midBuf []byte,
) (*Reader, error)

NewReader creates a new Reader associated with the given file and uses the given buffers to avoid unnecessary heap allocation.

func (*Reader) Close

func (rd *Reader) Close()

Close closes the files associated with the Reader.

func (*Reader) ReadField

func (rd *Reader) ReadField(name string) (particles.Field, error)

ReadField reads the field with the given name from the Reader. (Note: see Names in the Header for the available names.)

NOTE: ReadField uses the array space in Buffer to allocate the Field. If you want to call ReadField again, YOU WILL NEED TO COPY THE DATA OUT OF THE FIELD and into your own locally-allocated array or you could lose it.
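The hazard described in this note is ordinary slice aliasing, and it can be reproduced without the package. In the sketch below, scratch and its read method are hypothetical stand-ins for a Buffer-backed read; they are not part of the compress API.

```go
package main

import "fmt"

// scratch is a hypothetical stand-in for a reusable buffer.
type scratch struct{ data []float32 }

// read fills the shared backing array with n copies of v and returns a
// slice that aliases it. The array is only reallocated when n exceeds
// its capacity.
func (s *scratch) read(n int, v float32) []float32 {
	if cap(s.data) < n {
		s.data = make([]float32, n)
	}
	s.data = s.data[:n]
	for i := range s.data {
		s.data[i] = v
	}
	return s.data
}

func main() {
	s := &scratch{}
	x := s.read(3, 1)
	xCopy := append([]float32(nil), x...) // private copy of the first read
	_ = s.read(3, 2)                      // second read overwrites x's storage

	fmt.Println(x[0], xCopy[0]) // 2 1
}
```

Copying with append (or with make plus copy) before the next read is enough to keep the earlier field's values.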

func (*Reader) ReuseMidBuf

func (rd *Reader) ReuseMidBuf() []byte

ReuseMidBuf returns the midBuf used by the Reader so that it can be used by a later reader without excess heap allocation.

type TypeFlag

type TypeFlag int64

TypeFlag is a flag representing an array type.

const (
	Uint32Flag TypeFlag = iota
	Uint64Flag
	Float32Flag
	Float64Flag
)

func GetTypeFlag

func GetTypeFlag(x interface{}) TypeFlag

GetTypeFlag returns the type flag associated with an array. Only []uint32, []uint64, []float32, and []float64 are supported.
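A lookup like this is typically written as a type switch. The sketch below uses hypothetical lowercase names that mirror the constants above; it also returns an ok flag for unsupported types, which is an assumption, since the documentation does not say what GetTypeFlag does in that case.

```go
package main

import "fmt"

type typeFlag int64

const (
	uint32Flag typeFlag = iota
	uint64Flag
	float32Flag
	float64Flag
)

// getTypeFlag maps a supported slice type to its flag. Hypothetical
// helper: ok is false for any type outside the four supported ones.
func getTypeFlag(x interface{}) (flag typeFlag, ok bool) {
	switch x.(type) {
	case []uint32:
		return uint32Flag, true
	case []uint64:
		return uint64Flag, true
	case []float32:
		return float32Flag, true
	case []float64:
		return float64Flag, true
	}
	return 0, false
}

func main() {
	flag, ok := getTypeFlag([]float32{1, 2, 3})
	fmt.Println(flag, ok) // 2 true
	_, ok = getTypeFlag([]int{1})
	fmt.Println(ok) // false
}
```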

type Writer

type Writer struct {
	Header
	// contains filtered or unexported fields
}

Writer is a type which handles writing to disk. The pattern is that you create a single Writer with NewWriter, add fields to it with AddField, and finally call Flush() when you want to flush all the buffers and write to disk.

func NewWriter

func NewWriter(
	fname string, snapioHeader snapio.Header,
	span, offset, totalSpan [3]int64,
	buf *Buffer, b []byte, order binary.ByteOrder,
) *Writer

NewWriter creates a Writer targeting a given file and using a given byte ordering. Two buffers need to be passed as arguments: a compress.Buffer to handle all the internal arrays needed by the compression methods, and a byte array that's used to store an in-RAM version of the file. If you don't want to make excess heap allocations, pass the same array returned by Flush(). You can pass the same compress.Buffer each time.

func (*Writer) AddField

func (wr *Writer) AddField(field particles.Field, method Method) error

AddField adds a new field to the file which will be compressed with a given method.

func (*Writer) Flush

func (wr *Writer) Flush() ([]byte, error)

Flush flushes the internal buffers to disk. It returns a (potentially cap-expanded) byte array that can be passed to a later call to NewWriter().
