dedup

package
v0.0.63
Note: This package is not in the latest version of its module.
Published: Jul 18, 2023
License: Apache-2.0, NCSA
Imports: 2
Imported by: 1

Documentation

Overview

Package dedup implements a utility to determine whether a record is unique, that is, whether it has not been seen before.

Constants

const HashSize = sha512.Size384

HashSize is the size of hash used to determine uniqueness in Deduper.
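
As a quick illustration of what this constant means in practice, the standalone sketch below computes a SHA-384 digest with the standard library and prints its length. The names hashSize and digest are local to the example and are not part of this package.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// hashSize mirrors the documented constant; sha512.Size384 is 48 bytes.
const hashSize = sha512.Size384

// digest returns the SHA-384 digest of a record as a fixed-size array.
// It is a helper for this example only, not part of package dedup.
func digest(record []byte) [hashSize]byte {
	return sha512.Sum384(record)
}

func main() {
	sum := digest([]byte("example record"))
	fmt.Println(hashSize, len(sum)) // prints "48 48"
}
```

Because the digest is a fixed-size array, two digests can be compared directly with == and used as map keys.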

Variables

This section is empty.

Functions

This section is empty.

Types

type Deduper

type Deduper struct {
	// contains filtered or unexported fields
}

Deduper determines if a data record has been seen before by checking its size-limited cache of hashes.
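
The fields of Deduper are unexported, so the following is only a rough sketch of how a size-limited cache of hashes could behave. The type sketchDeduper, its constructor, the entry-count limit, and the first-in-first-out eviction policy are all assumptions made for illustration; the real Deduper may size and evict its cache differently.

```go
package main

import (
	"crypto/sha512"
	"errors"
	"fmt"
)

type hash [sha512.Size384]byte

// sketchDeduper is a hypothetical stand-in, not the package's Deduper.
type sketchDeduper struct {
	seen  map[hash]struct{}
	order []hash // insertion order, oldest first (FIFO eviction is an assumption)
	limit int    // maximum number of hashes kept
}

func newSketchDeduper(limit int) (*sketchDeduper, error) {
	if limit < 1 {
		return nil, errors.New("limit must be at least 1")
	}
	return &sketchDeduper{seen: make(map[hash]struct{}), limit: limit}, nil
}

// isUnique reports whether data's hash is absent from the cache and
// records it, evicting the oldest hash when the cache is full.
func (d *sketchDeduper) isUnique(data []byte) bool {
	h := hash(sha512.Sum384(data))
	if _, ok := d.seen[h]; ok {
		return false
	}
	if len(d.order) >= d.limit {
		oldest := d.order[0]
		d.order = d.order[1:]
		delete(d.seen, oldest)
	}
	d.seen[h] = struct{}{}
	d.order = append(d.order, h)
	return true
}

func main() {
	d, err := newSketchDeduper(2)
	if err != nil {
		panic(err)
	}
	for _, rec := range []string{"a", "a", "b", "c", "a"} {
		fmt.Println(rec, d.isUnique([]byte(rec)))
	}
}
```

In this sketch the final "a" is reported as unique again because its hash was evicted when "c" arrived. Any bounded cache has this trade-off: a record can be forgotten once enough newer records have been seen.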

func New

func New(maxSize int) (*Deduper, error)

New returns a new Deduper with the given cache size (in bytes). maxSize must be at least HashSize/2.
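
The documented lower bound on maxSize can be expressed as a simple check. The sketch below is standalone and assumes that a value below the bound is reported through the returned error; validateMaxSize and hashSize are names invented for the example, and the error text is illustrative.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// hashSize mirrors the documented HashSize constant (48 bytes).
const hashSize = sha512.Size384

// validateMaxSize is a hypothetical helper that applies the documented
// rule: maxSize must be at least HashSize/2, which is 24 bytes.
func validateMaxSize(maxSize int) error {
	if maxSize < hashSize/2 {
		return fmt.Errorf("maxSize %d is below the minimum of %d bytes", maxSize, hashSize/2)
	}
	return nil
}

func main() {
	fmt.Println(validateMaxSize(16))      // non-nil error
	fmt.Println(validateMaxSize(1 << 20)) // <nil>
}
```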

func (*Deduper) Duplicates

func (d *Deduper) Duplicates() uint64

Duplicates returns the number of duplicate records seen so far.

func (*Deduper) IsUnique

func (d *Deduper) IsUnique(data []byte, rest ...[]byte) bool

IsUnique determines if the given data record has not been seen before.
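
The signature accepts a record as one or more byte slices. How the package combines data and rest is not documented here; one plausible approach, sketched below with the standard library only, is to stream every part into a single SHA-384 hash so that a record split into pieces hashes the same as its concatenation. The helpers digestParts and digestWhole are hypothetical.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// digestParts hashes data followed by each slice in rest, in order,
// without allocating a concatenated copy. This is an assumed strategy,
// not confirmed behavior of IsUnique.
func digestParts(data []byte, rest ...[]byte) [sha512.Size384]byte {
	h := sha512.New384()
	h.Write(data) // hash.Hash.Write never returns an error
	for _, part := range rest {
		h.Write(part)
	}
	var out [sha512.Size384]byte
	copy(out[:], h.Sum(nil))
	return out
}

// digestWhole hashes a single contiguous record, for comparison.
func digestWhole(record []byte) [sha512.Size384]byte {
	return sha512.Sum384(record)
}

func main() {
	split := digestParts([]byte("header:"), []byte("body"))
	whole := digestWhole([]byte("header:body"))
	fmt.Println(split == whole) // prints "true"
}
```

Under this assumption, callers that already hold a record in several buffers could avoid joining them before the uniqueness check.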

func (*Deduper) Unique

func (d *Deduper) Unique() uint64

Unique returns the number of unique records seen so far.
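
To show how the Unique and Duplicates counters might relate to calls to IsUnique, here is a minimal standalone sketch. It assumes each call increments exactly one of the two counters; countingDeduper and its methods are invented for the example, and the sketch omits the size limit for brevity.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// countingDeduper is a hypothetical stand-in that tracks both counters.
type countingDeduper struct {
	seen       map[[sha512.Size384]byte]struct{}
	unique     uint64
	duplicates uint64
}

func newCountingDeduper() *countingDeduper {
	return &countingDeduper{seen: make(map[[sha512.Size384]byte]struct{})}
}

// isUnique records the hash of data and bumps exactly one counter.
func (d *countingDeduper) isUnique(data []byte) bool {
	h := sha512.Sum384(data)
	if _, ok := d.seen[h]; ok {
		d.duplicates++
		return false
	}
	d.seen[h] = struct{}{}
	d.unique++
	return true
}

func main() {
	d := newCountingDeduper()
	for _, rec := range []string{"a", "b", "a", "a", "c"} {
		d.isUnique([]byte(rec))
	}
	fmt.Println(d.unique, d.duplicates) // prints "3 2"
}
```

Under this assumption, the sum of the two counters equals the total number of records checked.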
