Documentation ¶
Overview ¶
Package bloom is a fast, space-efficient bloom filter.
Index ¶
- Variables
- func Jaccard(a, b *Filter) (float64, error)
- type Dynamic
- type Filter
- func (f *Filter) Add(key string)
- func (f *Filter) AddBytes(key []byte)
- func (f *Filter) ErrRate() float64
- func (f *Filter) Has(key string) bool
- func (f *Filter) HasBytes(key []byte) bool
- func (f *Filter) Intersect(f1, f2 *Filter) error
- func (f *Filter) MarshalBinary() (data []byte, err error)
- func (f *Filter) Size() int
- func (f *Filter) Stats() (hashes, nbits, popcount uint64)
- func (f *Filter) Union(f1, f2 *Filter) error
- func (f *Filter) UnmarshalBinary(data []byte) error
Constants ¶
This section is empty.
Variables ¶
var ( // ErrUnknownEncoding is returned from UnmarshalBinary if the version isn't // recognized. ErrUnknownEncoding = errors.New("bloom: unknown encoding") // ErrDataTooShort is returned from UnmarshalBinary if the length of the // data is too short. ErrDataTooShort = errors.New("bloom: data too short") )
Functions ¶
Types ¶
type Dynamic ¶
type Dynamic struct {
// contains filtered or unexported fields
}
Dynamic is a Bloom filter that doesn't need a pre-set size. The idea comes from http://gsd.di.uminho.pt/members/cbm/ps/dbloom.pdf
func NewDynamic ¶
NewDynamic creates a new Bloom filter for an unbounded number of items with probability p. See New for more information.
func (*Dynamic) MarshalBinary ¶
func (*Dynamic) UnmarshalBinary ¶
type Filter ¶
type Filter struct { N uint64 // original number of items P float64 // original p value // contains filtered or unexported fields }
Filter is a Bloom filter.
func New ¶
New creates a new Bloom filter for n items with probability p. p should be the between 0 and 1 and indicate the probability of false positives wanted. n should be a positive integer describing the number of items in the filter.
func (*Filter) ErrRate ¶
ErrRate returns the estimated false positive rate based on the number of items in the filter. This approaches 1 as more items are added to the filter.
func (*Filter) Intersect ¶
Intersect sets f to the intersection of f1 and f2. Intersect's special cases are the same as Union's.
func (*Filter) MarshalBinary ¶
MarshalBinary implements encoding.BinaryMarshaler.
func (*Filter) Size ¶
Size returns the approximate number of items in the filter. At most it should be within 3.5% of the actual amount, assuming the number of items in the filter is <= the original size of the filter.
Full disclosure: in practice, the variance is less than 1%; 3.5% is absolute maximum, tested up to 1e7 elements (see bloom_test.go). The algorithm is from http://pubs.acs.org/doi/abs/10.1021/ci600526a, but since I do not have access to ACS I do not know if the authors of the authors published the variance of the algorithm.
func (*Filter) Stats ¶
Stats returns basic memory information. hashes is the number of hash functions and nbits is the number of usable bits. The total number of bits allocated by the filter will be nbits rounded up to the nearest multiple of 64. popcount is the number of set bits.
func (*Filter) Union ¶
Union sets f to the union of f1 and f2.
The union of two filters can only be taken if their P values are the same and their bit vectors are the same length. In practice, this means f1.P == f2.P && f1.N == f2.N. f will not be modified if err != nil. It is safe for f to alias either f1 or f2.
func (*Filter) UnmarshalBinary ¶
UnmarshalBinary implements encoding.BinaryUnmarshaler.