filter

package
v0.0.0-...-ee0e00b Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 16, 2023 License: Apache-2.0 Imports: 12 Imported by: 1

Documentation

Overview

Package filter provides a set of generic filter implementations that can be used to track any comparable. For purposes of go-car-mirror, they are used to track carmirror.BlockId.

The name filter is inspired by bloom filters and the related family of space-efficient, probabilistic data structures that can be used to test whether an element is a member of a set. The test is probabilistic in that it may return false positives, but will never return a false negative.

Index

Constants

View Source
const XXH3_HASH_32_BYTES = 1

Variables

View Source
var (
	ErrIncompatibleFilter            = errors.New("incompatible filter type")
	ErrBloomOverflow                 = errors.New("bloom filter overflowing")
	ErrCantMarshalBloomWithoutHashId = errors.New("bloom must be created with a hash Id in order to marshal it")
)

Errors that can be returned for invalid filter operations.

Functions

func RegisterHash

func RegisterHash[T any](id uint64, hash bloom.HashFunction[T])

RegisterHash registers a hash function with the global registry.

func RegistryLookup

func RegistryLookup[T any](id uint64) (bloom.HashFunction[T], bool)

RegistryLookup looks up a hash function by its ID.

func XXH3Hash32Bytes

func XXH3Hash32Bytes(id [32]byte, seed uint64) uint64

XXH3Hash32Bytes is a 32-bit hash function for byte arrays.

Types

type BloomFilter

type BloomFilter[K comparable] struct {
	// contains filtered or unexported fields
}

BloomFilter is a classic, fixed-capacity bloom filter, which also stores an approximate count of the number of items.

func NewBloomFilter

func NewBloomFilter[K comparable](capacity uint, function bloom.HashFunction[K]) *BloomFilter[K]

NewBloomFilter creates a bloom filter with the given estimated capacity. The false positive rate is determined by bloom.EstimateFPP.

func NewBloomFilterFromBytes

func NewBloomFilterFromBytes[K comparable](bytes []byte, bitCount uint64, hashCount uint64, function bloom.HashFunction[K]) *BloomFilter[K]

NewBloomFilterFromBytes creates a bloom filter from a prepopulated byte array and related low-level information including the hash function.

func TryNewBloomFilter

func TryNewBloomFilter[K comparable](capacity uint, function uint64) (*BloomFilter[K], error)

TryNewBloomFilter creates a bloom filter of the given capacity using the registered hash function, failing if no such hash function can be found.

func TryNewBloomFilterFromBytes

func TryNewBloomFilterFromBytes[K comparable](bytes []byte, bitCount uint64, hashCount uint64, function uint64) (*BloomFilter[K], error)

TryNewBloomFilterFromBytes creates a bloom filter from a prepopulated byte array and related low-level information including the hash function, failing if no such hash function can be found.

func (*BloomFilter[K]) Add

func (f *BloomFilter[K]) Add(item K) Filter[K]

Add adds an item to the filter. When adding a new item would saturate the filter, it creates a compound filter composed of the existing filter plus a new bloom of double the capacity.

func (*BloomFilter[K]) AddAll

func (f *BloomFilter[K]) AddAll(other Filter[K]) Filter[K]

AddAll adds all items from another filter to this one.

func (*BloomFilter[K]) Capacity

func (f *BloomFilter[K]) Capacity() int

Capacity returns the maximum number of items the filter can hold.

func (*BloomFilter[K]) Clear

func (f *BloomFilter[K]) Clear() Filter[K]

Clear returns a new filter with the same capacity and hash function as the current one, but with no items.

func (*BloomFilter[K]) Copy

func (f *BloomFilter[K]) Copy() Filter[K]

Copy returns a copy of the filter.

func (*BloomFilter[K]) Count

func (f *BloomFilter[K]) Count() int

Count returns the number of items in the filter.

func (*BloomFilter[K]) DoesNotContain

func (f *BloomFilter[K]) DoesNotContain(item K) bool

DoesNotContain returns true if the item is not in the filter.

func (*BloomFilter[K]) Dump

func (bf *BloomFilter[K]) Dump(log *zap.SugaredLogger, prefix string)

Dump dumps the BloomFilter to the log.

func (*BloomFilter[K]) Equal

func (bf *BloomFilter[K]) Equal(other Filter[K]) bool

Equal returns true if the other filter is a BloomFilter with the same parameters and items.

func (*BloomFilter[K]) MarshalCBOR

func (bf *BloomFilter[K]) MarshalCBOR() ([]byte, error)

MarshalCBOR marshals the BloomFilter into a CBOR-encoded wire format.

func (*BloomFilter[K]) MarshalJSON

func (bf *BloomFilter[K]) MarshalJSON() ([]byte, error)

MarshalJSON marshals the BloomFilter into a JSON-encoded wire format. If the hash function is not registered, it returns ErrCantMarshalBloomWithoutHashId.

func (*BloomFilter[K]) TryAddAll

func (f *BloomFilter[K]) TryAddAll(other Filter[K]) error

TryAddAll attempts to add all items from another filter to this one.

func (*BloomFilter[K]) UnmarshalCBOR

func (bf *BloomFilter[K]) UnmarshalCBOR(bytes []byte) error

UnmarshalCBOR unmarshals the CBOR-encoded wire format into a BloomFilter.

func (*BloomFilter[K]) UnmarshalJSON

func (bf *BloomFilter[K]) UnmarshalJSON(bytes []byte) error

UnmarshalJSON unmarshals the JSON-encoded wire format into a BloomFilter.

type BloomWireFormat

type BloomWireFormat[K comparable] struct {
	Bytes        []byte `json:"bytes"`
	HashFunction uint64 `json:"hashFunction"`
	HashCount    uint64 `json:"hashCount"`
	BitCount     uint64 `json:"bitCount"`
}

BloomWireFormat is the wire format for a BloomFilter.

func (*BloomWireFormat[K]) UnmarshalTo

func (wf *BloomWireFormat[K]) UnmarshalTo(target *BloomFilter[K]) error

UnmarshalTo unmarshals the wire format into a BloomFilter. If the hash function is not registered, it returns ErrIncompatibleFilter.

type CompoundFilter

type CompoundFilter[K comparable] struct {
	SideA Filter[K]
	SideB Filter[K]
}

CompoundFilter implements the union of two disparate filter types

func (*CompoundFilter[K]) Add

func (cf *CompoundFilter[K]) Add(item K) Filter[K]

Add adds an item to the sparser side of the filter.

func (*CompoundFilter[K]) AddAll

func (cf *CompoundFilter[K]) AddAll(other Filter[K]) Filter[K]

AddAll adds all items from the other filter to this filter.

func (*CompoundFilter[K]) Capacity

func (cf *CompoundFilter[K]) Capacity() int

Capacity returns the total capacity of the filter.

func (*CompoundFilter[K]) Clear

func (cf *CompoundFilter[K]) Clear() Filter[K]

Clear returns a cleared filter with the same type as the sparser side.

func (*CompoundFilter[K]) Copy

func (cf *CompoundFilter[K]) Copy() Filter[K]

Copy returns a copy of the filter.

func (*CompoundFilter[K]) Count

func (cf *CompoundFilter[K]) Count() int

Count returns the number of items in the filter.

func (*CompoundFilter[K]) DoesNotContain

func (cf *CompoundFilter[K]) DoesNotContain(item K) bool

DoesNotContain returns true if item is not present in the filter.

func (*CompoundFilter[K]) Dump

func (cf *CompoundFilter[K]) Dump(log *zap.SugaredLogger, prefix string)

Dump prints a human-readable representation of the filter to the log.

func (*CompoundFilter[K]) Equal

func (cf *CompoundFilter[K]) Equal(other Filter[K]) bool

func (*CompoundFilter[K]) GetSparser

func (cf *CompoundFilter[K]) GetSparser() Side

GetSparser returns the side of the filter with the most unused capacity.

func (*CompoundFilter[K]) MarshalCBOR

func (cf *CompoundFilter[K]) MarshalCBOR() ([]byte, error)

MarshalCBOR marshals the filter into a CBOR-encoded CompoundFilterWireFormat.

func (*CompoundFilter[K]) MarshalJSON

func (cf *CompoundFilter[K]) MarshalJSON() ([]byte, error)

MarshalJSON marshals the filter into a JSON-encoded CompoundFilterWireFormat.

func (*CompoundFilter[K]) TryAddAll

func (cf *CompoundFilter[K]) TryAddAll(other Filter[K]) error

TryAddAll attempts to add all items from other filter to this filter.

func (*CompoundFilter[K]) UnmarshalCBOR

func (cf *CompoundFilter[K]) UnmarshalCBOR(bytes []byte) error

UnmarshalCBOR unmarshals the CBOR-encoded CompoundFilterWireFormat into the filter.

func (*CompoundFilter[K]) UnmarshalJSON

func (cf *CompoundFilter[K]) UnmarshalJSON(bytes []byte) error

UnmarshalJSON unmarshals the JSON-encoded CompoundFilterWireFormat into the filter.

type CompoundFilterWireFormat

type CompoundFilterWireFormat[K comparable] struct {
	SideA *FilterWireFormat[K] `json:"a"`
	SideB *FilterWireFormat[K] `json:"b"`
}

CompoundFilterWireFormat is the wire format for a CompoundFilter.

type EmptyFilter

type EmptyFilter[K comparable] struct {
	// contains filtered or unexported fields
}

EmptyFilter represents a filter which contains no items. An allocator is provided which is used to create a new filter if necessary.

func NewEmptyFilter

func NewEmptyFilter[K comparable](allocator func() Filter[K]) *EmptyFilter[K]

NewEmptyFilter returns a new EmptyFilter.

func (*EmptyFilter[K]) Add

func (ef *EmptyFilter[K]) Add(item K) Filter[K]

Add adds the item to the filter.

func (*EmptyFilter[K]) AddAll

func (ef *EmptyFilter[K]) AddAll(other Filter[K]) Filter[K]

AddAll adds all items from the other filter to the filter.

func (*EmptyFilter[K]) Capacity

func (ef *EmptyFilter[K]) Capacity() int

Capacity returns the capacity of the filter.

func (*EmptyFilter[K]) Clear

func (ef *EmptyFilter[K]) Clear() Filter[K]

Clear clears the filter.

func (*EmptyFilter[K]) Copy

func (ef *EmptyFilter[K]) Copy() Filter[K]

Copy returns a copy of the filter.

func (*EmptyFilter[K]) Count

func (ef *EmptyFilter[K]) Count() int

Count returns the number of items in the filter.

func (*EmptyFilter[K]) DoesNotContain

func (ef *EmptyFilter[K]) DoesNotContain(item K) bool

DoesNotContain returns true if the item is not in the filter.

func (*EmptyFilter[K]) Dump

func (ef *EmptyFilter[K]) Dump(log *zap.SugaredLogger, prefix string)

Dump dumps the filter to the log.

func (*EmptyFilter[K]) Equal

func (ef *EmptyFilter[K]) Equal(other Filter[K]) bool

Equal returns true if the other filter is equal to the filter.

func (*EmptyFilter[K]) TryAddAll

func (ef *EmptyFilter[K]) TryAddAll(other Filter[K]) error

TryAddAll adds all items from the other filter to the filter. Returns an error if the other filter is not compatible with the filter.

type Filter

type Filter[K comparable] interface {
	// DoesNotContain returns true if item is not present in the filter.
	DoesNotContain(item K) bool

	// Capacity returns the total number of items this filter can contain.
	// May return -1 to indicate unconstrained.
	Capacity() int

	// Count returns an estimate of the number of items in the filter.
	Count() int

	// Copy returns a copy or reference to the filter, as appropriate.
	Copy() Filter[K]

	// Equal returns true if the two filters are equivalent.
	Equal(Filter[K]) bool

	// Add adds the two filters together.
	// Mutate operations follow the Go convention of normally mutating, but returning a new filter when necessary.
	Add(item K) Filter[K]

	// Clear creates a new, empty filter of the same type as this one
	Clear() Filter[K]

	// TryAddAll attempts to add all items from other filter to this filter.
	// It fails if the other filter cannot be added.
	TryAddAll(other Filter[K]) error

	// AddAll adds all items from the other filter to this filter.
	// It creates a new filter, if necessary, to accomodate the merge.
	AddAll(other Filter[K]) Filter[K]

	// Dump dumps the filter data to the log with the given prefix.
	// Note: this may be quite slow, as there are multiple lines of output.
	Dump(*zap.SugaredLogger, string)
}

Filter is a space-efficient, probabilistic data structure that can be used to test whether an element is a member of a set. The test is probabilistic in that it may return false positives, but will never return a false negative.

func Desynchronize

func Desynchronize[K comparable](filter Filter[K]) Filter[K]

Desynchronize returns a filter which is not synchronized.

type FilterWireFormat

type FilterWireFormat[K comparable] struct {
	// BL is a Bloom filter.
	BL *BloomFilter[K] `json:"bl,omitempty"`
	// CM is a compound filter.
	CM *CompoundFilter[K] `json:"cm,omitempty"`
	// EM is an empty filter.
	EM *EmptyFilter[K] `json:"em,omitempty"`
	// PF is a perfect filter, which is mainly used for testing.
	PF *PerfectFilter[K] `json:"pf,omitempty"`
}

FilterWireFormat is a wire format for filters. It is used to serialize and deserialize filters and can support multiple filter types.

func NewFilterWireFormat

func NewFilterWireFormat[K comparable](filter Filter[K]) *FilterWireFormat[K]

NewFilterWireFormat creates a new wire format for the given filter.

func (*FilterWireFormat[K]) Any

func (wf *FilterWireFormat[K]) Any() Filter[K]

Any returns the first non-nil filter in the wire format.

type PerfectFilter

type PerfectFilter[K comparable] struct {
	// contains filtered or unexported fields
}

PerfectFilter is a Filter implementation which uses and underlying map - mainly for testing

func NewPerfectFilter

func NewPerfectFilter[K comparable]() *PerfectFilter[K]

NewPerfectFilter returns a new PerfectFilter.

func (*PerfectFilter[K]) Add

func (pf *PerfectFilter[K]) Add(item K) Filter[K]

Add adds the item to the filter.

func (*PerfectFilter[K]) AddAll

func (pf *PerfectFilter[K]) AddAll(other Filter[K]) Filter[K]

AddAll adds all items from the other filter to the filter.

func (*PerfectFilter[K]) Capacity

func (pf *PerfectFilter[K]) Capacity() int

Capacity returns the capacity of the filter.

func (*PerfectFilter[K]) Clear

func (pf *PerfectFilter[K]) Clear() Filter[K]

Clear clears the filter.

func (*PerfectFilter[K]) Copy

func (pf *PerfectFilter[K]) Copy() Filter[K]

Copy returns a copy of the filter.

func (*PerfectFilter[K]) Count

func (pf *PerfectFilter[K]) Count() int

Count returns the number of items in the filter.

func (*PerfectFilter[K]) DoesNotContain

func (pf *PerfectFilter[K]) DoesNotContain(item K) bool

DoesNotContain returns true if the item is not in the filter.

func (*PerfectFilter[K]) Dump

func (pf *PerfectFilter[K]) Dump(log *zap.SugaredLogger, prefix string)

Dump dumps the filter to the log.

func (*PerfectFilter[K]) Equal

func (pf *PerfectFilter[K]) Equal(other Filter[K]) bool

Equal returns true if the other filter is equal to the filter.

func (*PerfectFilter[K]) MarshalCBOR

func (pf *PerfectFilter[K]) MarshalCBOR() ([]byte, error)

MarshalCBOR marshals the PerfectFilter into the CBOR-encoded wire format.

func (*PerfectFilter[K]) MarshalJSON

func (pf *PerfectFilter[K]) MarshalJSON() ([]byte, error)

MarshalJSON marshals the PerfectFilter into the JSON-encoded wire format.

func (*PerfectFilter[K]) TryAddAll

func (pf *PerfectFilter[K]) TryAddAll(other Filter[K]) error

TryAddAll adds all items from the other filter to the filter.

func (*PerfectFilter[K]) UnmarshalCBOR

func (pf *PerfectFilter[K]) UnmarshalCBOR(bytes []byte) error

UnmarshalCBOR unmarshals the CBOR-encoded wire format into the PerfectFilter.

func (*PerfectFilter[K]) UnmarshalJSON

func (pf *PerfectFilter[K]) UnmarshalJSON(bytes []byte) error

UnmarshalJSON unmarshals the JSON-encoded wire format into the PerfectFilter.

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry is a global registry of hash functions.

type Side

type Side uint8
const (
	SIDE_A Side = 0
	SIDE_B Side = 1
)

type SynchronizedFilter

type SynchronizedFilter[K comparable] struct {
	// contains filtered or unexported fields
}

SyncFilter is a Filter that is a thread-safe Filter, which wraps a regular Filter with a mutex. All of its methods are safe for concurrent use.

func NewSynchronizedFilter

func NewSynchronizedFilter[K comparable](filter Filter[K]) *SynchronizedFilter[K]

NewSynchronizedFilter returns a new SynchronizedFilter.

func (*SynchronizedFilter[K]) Add

func (rf *SynchronizedFilter[K]) Add(item K) Filter[K]

Add adds the item to the filter.

func (*SynchronizedFilter[K]) AddAll

func (rf *SynchronizedFilter[K]) AddAll(other Filter[K]) Filter[K]

AddAll adds all items from the other filter to the filter.

func (*SynchronizedFilter[K]) Capacity

func (rf *SynchronizedFilter[K]) Capacity() int

Capacity returns the capacity of the filter.

func (*SynchronizedFilter[K]) Clear

func (rf *SynchronizedFilter[K]) Clear() Filter[K]

Clear clears the filter.

func (*SynchronizedFilter[K]) Copy

func (rf *SynchronizedFilter[K]) Copy() Filter[K]

Copy returns a copy of the filter.

func (*SynchronizedFilter[K]) Count

func (rf *SynchronizedFilter[K]) Count() int

Count returns the number of items in the filter.

func (*SynchronizedFilter[K]) DoesNotContain

func (rf *SynchronizedFilter[K]) DoesNotContain(item K) bool

DoesNotContain returns true if the item is not in the filter.

func (*SynchronizedFilter[K]) Dump

func (sf *SynchronizedFilter[K]) Dump(log *zap.SugaredLogger, prefix string)

Dump dumps the filter to the log.

func (*SynchronizedFilter[K]) Equal

func (sf *SynchronizedFilter[K]) Equal(other Filter[K]) bool

Equal returns true if the other filter is equal to the filter.

func (*SynchronizedFilter[K]) MarshalCBOR

func (sf *SynchronizedFilter[K]) MarshalCBOR() ([]byte, error)

MarshalCBOR marshals the SynchronizedFilter into the CBOR-encoded wire format.

func (*SynchronizedFilter[K]) MarshalJSON

func (sf *SynchronizedFilter[K]) MarshalJSON() ([]byte, error)

MarshalJSON marshals the SynchronizedFilter into the JSON-encoded wire format.

func (*SynchronizedFilter[K]) TryAddAll

func (rf *SynchronizedFilter[K]) TryAddAll(other Filter[K]) error

TryAddAll adds all items from the other filter to the filter.

func (*SynchronizedFilter[K]) UnmarshalCBOR

func (sf *SynchronizedFilter[K]) UnmarshalCBOR(bytes []byte) error

UnmarshalCBOR unmarshals the CBOR-encoded wire format into the SynchronizedFilter.

func (*SynchronizedFilter[K]) UnmarshalJSON

func (sf *SynchronizedFilter[K]) UnmarshalJSON(bytes []byte) error

UnmarshalJSON unmarshals the JSON-encoded wire format into the SynchronizedFilter.

func (*SynchronizedFilter[K]) UnsynchronizedCopy

func (rf *SynchronizedFilter[K]) UnsynchronizedCopy() Filter[K]

UnsynchronizedCopy returns an unsynchronized copy of the filter.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL