gobloom

package
v0.0.0-...-41bdb7a Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 10, 2024 License: MIT Imports: 5 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var DefaultBuildBloomOptions = BloomBuildOptions{
	Entries:      100,
	Errors:       0.01,
	BloomOptions: BLOOM_OPT_NOROUND | BLOOM_OPT_FORCE64 | BLOOM_OPT_NO_SCALING,
	Growth:       2,
}

DefaultBuildBloomOptions is the default options for building a new bloom filter. The options values same as the redisbloom C version.

Functions

This section is empty.

Types

type BloomBuildOptions

type BloomBuildOptions struct {
	Entries      uint64
	Errors       float64
	BloomOptions GoBloomOptions
	Growth       uint8
}

type GoBloomChain

type GoBloomChain struct {
	// contains filtered or unexported fields
}

GoBloomChain is a chain of bloom filters It refers to the `SBChain` struct of redisbloom module. See: https://github.com/RedisBloom/RedisBloom/blob/master/src/sb.h

func NewGoBloomChain

func NewGoBloomChain(initsize uint64, errorRate float64, options GoBloomOptions, growth uint8) *GoBloomChain

NewGoBloomChain 创建一个新的布隆过滤器链 在布隆过滤器中,扩容时调整容错率(错误率,`error_rate`)乘以一个因子(在这个例子中是`0.5`,即`ERROR_TIGHTENING_RATIO`)是为了提高随后添加的过滤器链环节的精确度,并且减少误判的概率。这种做法基于以下考虑:

1. **过滤器链的性能保障**:随着数据量的增加,单一布隆过滤器由于其固定的大小和错误率,可能会遇到容量瓶颈,导致误判率上升。通过添加新的过滤器并降低其错误率,可以在整个过滤器链中平衡误判率,从而在容量增加的同时,保持或提高整体性能。

2. **误判率的累积影响**:布隆过滤器链是由多个布隆过滤器组成的,每个过滤器有自己的错误率。随着链的增长,如果每个过滤器保持相同的错误率,则整体的误判概率会逐步增加。通过降低新添加的过滤器的错误率,可以减少随着链扩展而导致的误判率增加的累积效应。

3. **扩容策略的权衡**:扩容时降低错误率是在空间效率和准确度之间的一种权衡。虽然降低错误率会导致单个过滤器需要更多的位来存储,增加了空间复杂度,但这可以有效减少整个过滤器链的总体误判率,特别是在数据规模不断增长的情况下。

4. **提高后续过滤器的效率**:随着数据的增长和过滤器链的扩展,新加入的数据项相对于整个数据集的比例逐渐减小。通过减少新过滤器的错误率,可以为新增的数据项提供更高的过滤精度,同时减少对已有数据项的影响。

综上所述,扩容时降低容错率是为了保持或提高整个布隆过滤器链的准确性和效率,尤其是在处理大规模数据集时,这种策略能够有效控制误判率的增长,保证过滤器链的长期性能和稳定性。

func (*GoBloomChain) Add

func (bc *GoBloomChain) Add(data []byte) bool
func (bc *GoBloomChain) AddLink(size uint64, errorRate float64)

func (*GoBloomChain) Check

func (bc *GoBloomChain) Check(data []byte) bool

Check 检查数据是否在布隆过滤器链中

func (*GoBloomChain) GetFilter

func (bc *GoBloomChain) GetFilter(index int) GoBloomFilter

func (*GoBloomChain) GetItems

func (bc *GoBloomChain) GetItems() uint64

func (*GoBloomChain) GetNumberFilters

func (bc *GoBloomChain) GetNumberFilters() uint64

type GoBloomFilter

type GoBloomFilter interface {
	Test(data []byte) bool
	Add(data []byte) bool
	Reset()
	SetBit(index uint64, value uint8)
	LoadBytes(offset uint64, data []byte) error
	// TestOrAdd(data []byte) bool
	GetItems() uint64
	GetEntries() uint64
	GetBytesNumber() uint64
	GetErrorRate() float64
}

type GoBloomFilterImpl

type GoBloomFilterImpl struct {
	// contains filtered or unexported fields
}

GoBloomFilterImpl is a bloom filter implementation which is refer to redisbloom module.

You can see the redisbloom source code from https://github.com/RedisBloom/RedisBloom/blob/master/deps/bloom/bloom.h

func NewGoBloom

func NewGoBloom(buildOptions *BloomBuildOptions) *GoBloomFilterImpl

NewGoBloom is a constructor for GoBloomFilterImpl. The `buildOptions` is the options for building a new bloom filter.

It is refer to the c version of `bloom_init` function in redisbloom module.

func (*GoBloomFilterImpl) Add

func (g *GoBloomFilterImpl) Add(data []byte) bool

func (*GoBloomFilterImpl) GetBytesNumber

func (g *GoBloomFilterImpl) GetBytesNumber() uint64

func (*GoBloomFilterImpl) GetEntries

func (g *GoBloomFilterImpl) GetEntries() uint64

func (*GoBloomFilterImpl) GetErrorRate

func (g *GoBloomFilterImpl) GetErrorRate() float64

func (*GoBloomFilterImpl) GetItems

func (g *GoBloomFilterImpl) GetItems() uint64

func (*GoBloomFilterImpl) Hash64

func (g *GoBloomFilterImpl) Hash64(data []byte) baseHash

func (*GoBloomFilterImpl) LoadBytes

func (g *GoBloomFilterImpl) LoadBytes(offset uint64, data []byte) error

func (*GoBloomFilterImpl) Reset

func (g *GoBloomFilterImpl) Reset()

func (*GoBloomFilterImpl) SetBit

func (g *GoBloomFilterImpl) SetBit(index uint64, value uint8)

func (*GoBloomFilterImpl) Test

func (g *GoBloomFilterImpl) Test(data []byte) bool

func (*GoBloomFilterImpl) TestOrAdd

func (g *GoBloomFilterImpl) TestOrAdd(data []byte) bool

type GoBloomOptions

type GoBloomOptions uint8

// Disable auto-scaling. Saves memory #define BLOOM_OPT_NO_SCALING 8

const (
	BLOOM_OPT_NOROUND GoBloomOptions = 1
	// Entries is actually the number of bits, not the number of entries to reserve
	BLOOM_OPT_ENTS_IS_BITS GoBloomOptions = 2
	BLOOM_OPT_FORCE64      GoBloomOptions = 4
	BLOOM_OPT_NO_SCALING   GoBloomOptions = 8
)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL