dedup

package
v0.0.62
Published: Jun 7, 2023 License: Apache-2.0, NCSA Imports: 4 Imported by: 2

Documentation

Overview

Package dedup implements a duplication-reducing reader for streams of length-delimited byte records. Each record is read as a varint-encoded length in bytes, followed immediately by the record itself.

A stream consists of a sequence of such records packed consecutively without additional padding. There are no checksums or compression. See also: kythe.io/kythe/go/platform/delimited.
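The record layout described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: it assumes the length prefix is an unsigned varint as produced by encoding/binary, and the helper names appendRecord and splitRecords are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// appendRecord appends rec to stream as a varint length followed by the bytes.
// Assumption: the length is an unsigned varint (encoding/binary's Uvarint).
func appendRecord(stream, rec []byte) []byte {
	var hdr [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(hdr[:], uint64(len(rec)))
	stream = append(stream, hdr[:n]...)
	return append(stream, rec...)
}

// splitRecords walks a complete in-memory stream and returns its records.
func splitRecords(stream []byte) ([]string, error) {
	var out []string
	for len(stream) > 0 {
		size, n := binary.Uvarint(stream)
		if n <= 0 {
			return out, errors.New("malformed length prefix")
		}
		stream = stream[n:]
		if uint64(len(stream)) < size {
			return out, io.ErrUnexpectedEOF // fewer bytes than the prefix promised
		}
		out = append(out, string(stream[:size]))
		stream = stream[size:]
	}
	return out, nil
}

func main() {
	var stream []byte
	for _, s := range []string{"alpha", "", "beta"} {
		stream = appendRecord(stream, []byte(s))
	}
	fmt.Printf("% x\n", stream) // records are packed back to back, no padding
	recs, err := splitRecords(stream)
	fmt.Printf("%q %v\n", recs, err)
}
```

Because nothing separates records except the length prefixes, a reader can only find record boundaries by decoding the stream from the start.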

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

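Reader implements the Reader interface of the delimited package (kythe.io/kythe/go/platform/delimited). Duplicate records are removed by hashing each record and checking the hash against a set of known record hashes. This is a quick-and-dirty method of removing duplicates and is not exact: because the set of known hashes is bounded in size, some duplicates may still pass through.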
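The hash-and-check idea can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: SHA-256 is chosen arbitrarily as the hash, the set is left unbounded for simplicity, and the names deduper, fresh, and unique are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// deduper remembers a hash of every record it has seen. Unlike the package's
// Reader, this sketch never bounds the size of the set.
type deduper struct {
	seen    map[[sha256.Size]byte]struct{}
	skipped uint64 // number of duplicates dropped so far
}

func newDeduper() *deduper {
	return &deduper{seen: make(map[[sha256.Size]byte]struct{})}
}

// fresh reports whether rec is new, and remembers its hash if so.
func (d *deduper) fresh(rec []byte) bool {
	h := sha256.Sum256(rec)
	if _, dup := d.seen[h]; dup {
		d.skipped++
		return false
	}
	d.seen[h] = struct{}{}
	return true
}

// unique returns recs with repeats removed, plus the number skipped.
func unique(recs []string) ([]string, uint64) {
	d := newDeduper()
	var out []string
	for _, r := range recs {
		if d.fresh([]byte(r)) {
			out = append(out, r)
		}
	}
	return out, d.skipped
}

func main() {
	out, skipped := unique([]string{"a", "b", "a", "a", "c"})
	fmt.Println(out, skipped)
}
```

Storing fixed-size hashes instead of whole records keeps memory use independent of record length, at the cost of treating any two records with the same hash as duplicates.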

func NewReader

func NewReader(r io.Reader, maxSize int) (*Reader, error)

NewReader returns a reader that consumes records from r, using a cache of up to maxSize bytes for known record hashes.
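To see why a size-limited hash cache can let some duplicates through, consider the sketch below. It is only an illustration: the oldest-first eviction policy, the use of SHA-256, and the names boundedSet and add are all assumptions, and the package may manage its cache differently.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash = [sha256.Size]byte

// boundedSet holds at most limit hashes and evicts the oldest when full.
// Assumption: oldest-first eviction; the real cache policy is not documented here.
type boundedSet struct {
	limit int
	order []hash // insertion order, oldest first
	seen  map[hash]struct{}
}

// newBoundedSet sizes the set so its hashes occupy at most maxSize bytes.
func newBoundedSet(maxSize int) *boundedSet {
	limit := maxSize / sha256.Size
	if limit < 1 {
		limit = 1
	}
	return &boundedSet{limit: limit, seen: make(map[hash]struct{})}
}

// add records rec's hash and reports whether it was absent from the set.
func (s *boundedSet) add(rec []byte) bool {
	h := sha256.Sum256(rec)
	if _, ok := s.seen[h]; ok {
		return false
	}
	if len(s.order) == s.limit {
		delete(s.seen, s.order[0]) // forget the oldest hash
		s.order = s.order[1:]
	}
	s.order = append(s.order, h)
	s.seen[h] = struct{}{}
	return true
}

func main() {
	s := newBoundedSet(2 * sha256.Size) // room for two hashes
	for _, rec := range []string{"a", "b", "a", "c", "a"} {
		fmt.Printf("%q new=%v\n", rec, s.add([]byte(rec)))
	}
}
```

In this run the final "a" is reported as new, because its hash was evicted to make room for "c". A larger maxSize narrows that window but does not close it.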

func (*Reader) Next

func (u *Reader) Next() ([]byte, error)

Next returns the next length-delimited record from the input, or io.EOF if there are no more records available. It returns io.ErrUnexpectedEOF if a short record is found, that is, one with a declared length of n but fewer than n bytes of data. Because there is no resynchronization mechanism, it is generally not possible to recover from a short record in this format.
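The distinction between a clean end of input and a short record can be sketched with the standard library. This is not the package's code: the helpers next and readAll are hypothetical, and an unsigned varint length prefix is assumed.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// next reads one record from br. It returns io.EOF only when the input ends
// exactly at a record boundary.
func next(br *bufio.Reader) ([]byte, error) {
	size, err := binary.ReadUvarint(br)
	if err != nil {
		return nil, err // io.EOF here means a clean end of stream
	}
	// A production reader would also cap size before allocating.
	rec := make([]byte, size)
	if _, err := io.ReadFull(br, rec); err != nil {
		if err == io.EOF {
			err = io.ErrUnexpectedEOF // a length was read but no data followed
		}
		return nil, err
	}
	return rec, nil
}

// readAll collects records until the stream ends or an error occurs.
func readAll(data []byte) ([]string, error) {
	br := bufio.NewReader(bytes.NewReader(data))
	var out []string
	for {
		rec, err := next(br)
		if err == io.EOF {
			return out, nil
		} else if err != nil {
			return out, err
		}
		out = append(out, string(rec))
	}
}

func main() {
	// One complete record "hi", then a record claiming 5 bytes with only 1.
	recs, err := readAll([]byte{2, 'h', 'i', 5, 'x'})
	fmt.Printf("%q %v\n", recs, err)
}
```

After a short record the reader cannot tell where the next length prefix begins, so the loop stops at the first error instead of trying to continue.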

The slice returned is valid only until a subsequent call to Next.
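The validity rule matters to callers that keep records beyond one iteration. The sketch below uses a hypothetical stand-in type, reuser, that hands out records through a single shared buffer. This is an assumption about why such a rule would exist, not a description of the package's internals.

```go
package main

import "fmt"

// reuser hands out records through one shared buffer, as a reader might do
// to avoid allocating a new slice per record.
type reuser struct {
	recs []string
	buf  []byte
}

// next returns the next record, overwriting the buffer used by the last call.
func (r *reuser) next() ([]byte, bool) {
	if len(r.recs) == 0 {
		return nil, false
	}
	r.buf = append(r.buf[:0], r.recs[0]...)
	r.recs = r.recs[1:]
	return r.buf, true
}

// retain shows the difference between holding the returned slice and copying it.
func retain() (held, kept string) {
	r := &reuser{recs: []string{"first", "other"}}
	h, _ := r.next()                  // aliases the shared buffer
	k := append([]byte(nil), h...)    // private copy, safe to keep
	r.next()                          // overwrites the shared buffer
	return string(h), string(k)
}

func main() {
	held, kept := retain()
	fmt.Printf("held=%q kept=%q\n", held, kept)
}
```

A caller that needs a record to outlive the next call should copy it first, as kept does here.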

func (*Reader) NextProto

func (u *Reader) NextProto(pb proto.Message) error

NextProto consumes the next available record by calling u.Next, and decodes it into pb with proto.Unmarshal.

func (*Reader) Skipped

func (u *Reader) Skipped() uint64

Skipped returns the number of records that have been skipped so far by the deduplication process.
