Documentation ¶
Overview ¶
Package s2 implements the S2 compression format.
S2 is an extension of Snappy. Like Snappy, S2 aims for high throughput, which is why it features concurrent compression for larger payloads.
Decoding is compatible with Snappy compressed content, but content compressed with S2 cannot be decompressed by Snappy.
For more information on Snappy/S2 differences, see the README at https://github.com/klauspost/compress/tree/master/s2
There are actually two S2 formats: block and stream. They are related, but different: trying to decompress block-compressed data as an S2 stream will fail, and vice versa. The block format is handled by the Decode and Encode functions; the stream format by the Reader and Writer types.
A "better" compression option is available. This will trade some compression speed
The block format, the more common case, is used when the complete size (the number of bytes) of the original data is known upfront, at the time compression starts. The stream format, also known as the framing format, is for when that isn't always true.
Blocks do not offer much data protection, so it is up to you to add data validation of decompressed blocks.
Streams perform CRC validation of the decompressed data. Stream compression is also performed concurrently on multiple CPU cores, significantly improving throughput.
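For illustration, a minimal block-format round trip might look like the sketch below (import path taken from the repository linked above; error handling abbreviated):

package main

import (
    "bytes"
    "fmt"

    "github.com/klauspost/compress/s2"
)

func main() {
    src := bytes.Repeat([]byte("hello s2 "), 100)

    // Block format: the whole input is known up front.
    encoded := s2.Encode(nil, src)

    decoded, err := s2.Decode(nil, encoded)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%d -> %d bytes, round trip ok: %v\n",
        len(src), len(encoded), bytes.Equal(src, decoded))
}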
Index ¶
- Constants
- Variables
- func ConcatBlocks(dst []byte, blocks ...[]byte) ([]byte, error)
- func Decode(dst, src []byte) ([]byte, error)
- func DecodedLen(src []byte) (int, error)
- func Encode(dst, src []byte) []byte
- func EncodeBetter(dst, src []byte) []byte
- func MaxEncodedLen(srcLen int) int
- type Reader
- type Writer
- type WriterOption
Constants ¶
const MaxBlockSize = math.MaxUint32 - binary.MaxVarintLen32 - 5
MaxBlockSize is the maximum value where MaxEncodedLen will return a valid block size. Blocks this big are highly discouraged, though.
Variables ¶
var (
    // ErrCorrupt reports that the input is invalid.
    ErrCorrupt = errors.New("s2: corrupt input")
    // ErrCRC reports that the input failed CRC validation (streams only).
    ErrCRC = errors.New("s2: corrupt input, crc mismatch")
    // ErrTooLarge reports that the uncompressed length is too large.
    ErrTooLarge = errors.New("s2: decoded block is too large")
    // ErrUnsupported reports that the input isn't supported.
    ErrUnsupported = errors.New("s2: unsupported input")
)
Functions ¶
func ConcatBlocks ¶
func ConcatBlocks(dst []byte, blocks ...[]byte) ([]byte, error)
ConcatBlocks will concatenate the supplied blocks and append them to the supplied destination. If the destination is nil or too small, a new one will be allocated. The blocks are not validated, so garbage in = garbage out. dst may not overlap block data. Any data in dst is preserved as is, so it will not be considered a block.
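A short sketch of joining two independently encoded blocks (assuming the imports from the overview example; error handling abbreviated):

a := s2.Encode(nil, []byte("hello "))
b := s2.Encode(nil, []byte("world"))

// The result is a single block that decodes to both inputs back to back.
joined, err := s2.ConcatBlocks(nil, a, b)
if err != nil {
    panic(err)
}
out, err := s2.Decode(nil, joined)
if err != nil {
    panic(err)
}
fmt.Println(string(out)) // hello world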
func Decode ¶
func Decode(dst, src []byte) ([]byte, error)
Decode returns the decoded form of src. The returned slice may be a sub-slice of dst if dst was large enough to hold the entire decoded block. Otherwise, a newly allocated slice will be returned.
The dst and src must not overlap. It is valid to pass a nil dst.
func DecodedLen ¶
func DecodedLen(src []byte) (int, error)
DecodedLen returns the length of the decoded block.
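A sketch of sizing a destination buffer before decoding, so Decode can return a sub-slice instead of allocating (encoded is assumed to hold a valid block):

n, err := s2.DecodedLen(encoded)
if err != nil {
    panic(err)
}
buf := make([]byte, n)

// buf is large enough, so decoded is a sub-slice of buf.
decoded, err := s2.Decode(buf, encoded)
if err != nil {
    panic(err)
}
fmt.Println(len(decoded) == n) // true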
func Encode ¶
func Encode(dst, src []byte) []byte
Encode returns the encoded form of src. The returned slice may be a sub-slice of dst if dst was large enough to hold the entire encoded block. Otherwise, a newly allocated slice will be returned.
The dst and src must not overlap. It is valid to pass a nil dst.
Decoding a block requires the same amount of memory that encoding it did, and the block format does not allow concurrent decoding. Also note that blocks do not contain CRC information, so corruption may go undetected.
If you need to encode larger amounts of data, consider using the streaming interface, which provides all of these features.
func EncodeBetter ¶
func EncodeBetter(dst, src []byte) []byte
EncodeBetter returns the encoded form of src. The returned slice may be a sub-slice of dst if dst was large enough to hold the entire encoded block. Otherwise, a newly allocated slice will be returned.
EncodeBetter compresses better than Encode but typically with a 10-40% speed decrease on both compression and decompression.
The dst and src must not overlap. It is valid to pass a nil dst.
Decoding a block requires the same amount of memory that encoding it did, and the block format does not allow concurrent decoding. Also note that blocks do not contain CRC information, so corruption may go undetected.
If you need to encode larger amounts of data, consider using the streaming interface, which provides all of these features.
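A sketch comparing the two block encoders on the same input (src is a hypothetical []byte; the actual size difference depends entirely on the data):

fast := s2.Encode(nil, src)
small := s2.EncodeBetter(nil, src)
fmt.Printf("Encode: %d bytes, EncodeBetter: %d bytes\n", len(fast), len(small))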
func MaxEncodedLen ¶
func MaxEncodedLen(srcLen int) int
MaxEncodedLen returns the maximum length of an encoded block, given its uncompressed length.
It will return a negative value if srcLen is too large to encode. 32-bit platforms will have lower thresholds for rejecting large content.
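A sketch of pre-allocating a worst-case destination so Encode never needs to allocate (src is a hypothetical []byte):

bound := s2.MaxEncodedLen(len(src))
if bound < 0 {
    panic("src too large to encode as a single block")
}
dst := make([]byte, bound)

// dst can hold any encoding of src, so the result is a sub-slice of dst.
encoded := s2.Encode(dst, src)
fmt.Printf("compressed to %d bytes\n", len(encoded))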
Types ¶
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader is an io.Reader that can read Snappy-compressed bytes.
func NewReader ¶
NewReader returns a new Reader that decompresses from r, using the framing format described at https://github.com/google/snappy/blob/master/framing_format.txt with S2 changes.
func (*Reader) Reset ¶
Reset discards any buffered data, resets all state, and switches the Snappy reader to read from r. This permits reusing a Reader rather than allocating a new one.
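A sketch of reusing one Reader across several streams (inputs is a hypothetical slice of io.Reader values; assumes the io package is imported):

dec := s2.NewReader(nil)
for _, in := range inputs {
    // Reset drops all state from the previous stream and switches to in.
    dec.Reset(in)
    if _, err := io.Copy(io.Discard, dec); err != nil {
        panic(err)
    }
}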
func (*Reader) Skip ¶
Skip will skip n bytes forward in the decompressed output. For larger skips this consumes less CPU and is faster than reading output and discarding it. CRC is not checked on skipped blocks. io.ErrUnexpectedEOF is returned if the stream ends before all bytes have been skipped. If a decoding error is encountered, subsequent calls to Read will also fail.
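A sketch of seeking ahead in a stream before reading (compressedStream is a hypothetical io.Reader holding S2 stream data):

dec := s2.NewReader(compressedStream)

// Skip the first 1MB of decompressed output without materializing it.
if err := dec.Skip(1 << 20); err != nil {
    panic(err)
}
rest, err := io.ReadAll(dec)
if err != nil {
    panic(err)
}
fmt.Printf("read %d bytes after the skip\n", len(rest))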
type Writer ¶
type Writer struct {
// contains filtered or unexported fields
}
Writer is an io.Writer that can write Snappy-compressed bytes.
func NewWriter ¶
func NewWriter(w io.Writer, opts ...WriterOption) *Writer
NewWriter returns a new Writer that compresses to w, using the framing format described at https://github.com/google/snappy/blob/master/framing_format.txt
Users must call Close to guarantee all data has been forwarded to the underlying io.Writer and that resources are released. They may also call Flush zero or more times before calling Close.
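A sketch of a full stream round trip, including the required Close (assuming the imports from the overview example plus bytes and io):

var buf bytes.Buffer

enc := s2.NewWriter(&buf)
if _, err := enc.Write([]byte("stream me")); err != nil {
    panic(err)
}
// Close flushes any buffered data and releases encoder resources.
if err := enc.Close(); err != nil {
    panic(err)
}

dec := s2.NewReader(&buf)
out, err := io.ReadAll(dec)
if err != nil {
    panic(err)
}
fmt.Println(string(out)) // stream me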
func (*Writer) Close ¶
Close calls Flush and then closes the Writer. Calling Close multiple times is ok.
func (*Writer) Flush ¶
Flush flushes the Writer to its underlying io.Writer. This does not apply padding.
func (*Writer) ReadFrom ¶
ReadFrom implements the io.ReaderFrom interface. Using this is typically more efficient since it avoids a memory copy. ReadFrom reads data from r until EOF or error. The return value n is the number of bytes read. Any error except io.EOF encountered during the read is also returned.
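A sketch of compressing directly from a source reader (f is a hypothetical open *os.File and dstWriter a hypothetical io.Writer):

enc := s2.NewWriter(dstWriter)

// ReadFrom consumes f until EOF, avoiding an extra copy through Write.
if _, err := enc.ReadFrom(f); err != nil {
    panic(err)
}
if err := enc.Close(); err != nil {
    panic(err)
}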
type WriterOption ¶
WriterOption is an option for creating an encoder.
func WriterBetterCompression ¶
func WriterBetterCompression() WriterOption
WriterBetterCompression will enable better compression. EncodeBetter compresses better than Encode but typically with a 10-40% speed decrease on both compression and decompression.
func WriterBlockSize ¶
func WriterBlockSize(n int) WriterOption
WriterBlockSize allows overriding the default block size. Blocks will be this size or smaller. The minimum size is 4KB and the maximum is 4MB.
Bigger blocks may give higher throughput on systems with many cores, and will improve compression slightly, but they limit the possible concurrency for smaller payloads during both encoding and decoding. The default block size is 1MB.
func WriterConcurrency ¶
func WriterConcurrency(n int) WriterOption
WriterConcurrency will set the concurrency, meaning the maximum number of blocks to encode concurrently. The value supplied must be at least 1. By default this will be set to GOMAXPROCS.
func WriterPadding ¶
func WriterPadding(n int) WriterOption
WriterPadding will add padding to all output so the size will be a multiple of n. This can be used to obfuscate the exact output size or to force blocks of a certain size. The contents will be a skippable frame, so it will be invisible to the decoder. n must be > 0 and <= 4MB. The padded area will be filled with data from crypto/rand.Reader. The padding will be applied whenever Close is called on the writer.
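A sketch combining the writer options above (w is a hypothetical io.Writer; the option values are illustrative, within the documented limits):

enc := s2.NewWriter(w,
    s2.WriterBetterCompression(), // trade ~10-40% speed for a better ratio
    s2.WriterBlockSize(4<<20),    // 4MB blocks (the maximum); default is 1MB
    s2.WriterConcurrency(4),      // encode at most 4 blocks concurrently
    s2.WriterPadding(4096),       // pad output to a multiple of 4KB
)
defer enc.Close() // Close applies padding and flushes remaining data.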