Documentation ¶
Index ¶
- Variables
- type BloomBuildOptions
- type GoBloomChain
- func (bc *GoBloomChain) Add(data []byte) bool
- func (bc *GoBloomChain) AddLink(size uint64, errorRate float64)
- func (bc *GoBloomChain) Check(data []byte) bool
- func (bc *GoBloomChain) GetFilter(index int) GoBloomFilter
- func (bc *GoBloomChain) GetItems() uint64
- func (bc *GoBloomChain) GetNumberFilters() uint64
- type GoBloomFilter
- type GoBloomFilterImpl
- func (g *GoBloomFilterImpl) Add(data []byte) bool
- func (g *GoBloomFilterImpl) GetBytesNumber() uint64
- func (g *GoBloomFilterImpl) GetEntries() uint64
- func (g *GoBloomFilterImpl) GetErrorRate() float64
- func (g *GoBloomFilterImpl) GetItems() uint64
- func (g *GoBloomFilterImpl) Hash64(data []byte) baseHash
- func (g *GoBloomFilterImpl) LoadBytes(offset uint64, data []byte) error
- func (g *GoBloomFilterImpl) Reset()
- func (g *GoBloomFilterImpl) SetBit(index uint64, value uint8)
- func (g *GoBloomFilterImpl) Test(data []byte) bool
- func (g *GoBloomFilterImpl) TestOrAdd(data []byte) bool
- type GoBloomOptions
Constants ¶
This section is empty.
Variables ¶
var DefaultBuildBloomOptions = BloomBuildOptions{ Entries: 100, Errors: 0.01, BloomOptions: BLOOM_OPT_NOROUND | BLOOM_OPT_FORCE64 | BLOOM_OPT_NO_SCALING, Growth: 2, }
DefaultBuildBloomOptions is the default options for building a new bloom filter. The options values same as the redisbloom C version.
Functions ¶
This section is empty.
Types ¶
type BloomBuildOptions ¶
type BloomBuildOptions struct { Entries uint64 Errors float64 BloomOptions GoBloomOptions Growth uint8 }
type GoBloomChain ¶
type GoBloomChain struct {
// contains filtered or unexported fields
}
GoBloomChain is a chain of bloom filters It refers to the `SBChain` struct of redisbloom module. See: https://github.com/RedisBloom/RedisBloom/blob/master/src/sb.h
func NewGoBloomChain ¶
func NewGoBloomChain(initsize uint64, errorRate float64, options GoBloomOptions, growth uint8) *GoBloomChain
NewGoBloomChain 创建一个新的布隆过滤器链 在布隆过滤器中,扩容时调整容错率(错误率,`error_rate`)乘以一个因子(在这个例子中是`0.5`,即`ERROR_TIGHTENING_RATIO`)是为了提高随后添加的过滤器链环节的精确度,并且减少误判的概率。这种做法基于以下考虑:
1. **过滤器链的性能保障**:随着数据量的增加,单一布隆过滤器由于其固定的大小和错误率,可能会遇到容量瓶颈,导致误判率上升。通过添加新的过滤器并降低其错误率,可以在整个过滤器链中平衡误判率,从而在容量增加的同时,保持或提高整体性能。
2. **误判率的累积影响**:布隆过滤器链是由多个布隆过滤器组成的,每个过滤器有自己的错误率。随着链的增长,如果每个过滤器保持相同的错误率,则整体的误判概率会逐步增加。通过降低新添加的过滤器的错误率,可以减少随着链扩展而导致的误判率增加的累积效应。
3. **扩容策略的权衡**:扩容时降低错误率是在空间效率和准确度之间的一种权衡。虽然降低错误率会导致单个过滤器需要更多的位来存储,增加了空间复杂度,但这可以有效减少整个过滤器链的总体误判率,特别是在数据规模不断增长的情况下。
4. **提高后续过滤器的效率**:随着数据的增长和过滤器链的扩展,新加入的数据项相对于整个数据集的比例逐渐减小。通过减少新过滤器的错误率,可以为新增的数据项提供更高的过滤精度,同时减少对已有数据项的影响。
综上所述,扩容时降低容错率是为了保持或提高整个布隆过滤器链的准确性和效率,尤其是在处理大规模数据集时,这种策略能够有效控制误判率的增长,保证过滤器链的长期性能和稳定性。
func (*GoBloomChain) Add ¶
func (bc *GoBloomChain) Add(data []byte) bool
func (*GoBloomChain) AddLink ¶
func (bc *GoBloomChain) AddLink(size uint64, errorRate float64)
func (*GoBloomChain) GetFilter ¶
func (bc *GoBloomChain) GetFilter(index int) GoBloomFilter
func (*GoBloomChain) GetItems ¶
func (bc *GoBloomChain) GetItems() uint64
func (*GoBloomChain) GetNumberFilters ¶
func (bc *GoBloomChain) GetNumberFilters() uint64
type GoBloomFilter ¶
type GoBloomFilterImpl ¶
type GoBloomFilterImpl struct {
// contains filtered or unexported fields
}
GoBloomFilterImpl is a bloom filter implementation which is refer to redisbloom module.
You can see the redisbloom source code from https://github.com/RedisBloom/RedisBloom/blob/master/deps/bloom/bloom.h
func NewGoBloom ¶
func NewGoBloom(buildOptions *BloomBuildOptions) *GoBloomFilterImpl
NewGoBloom is a constructor for GoBloomFilterImpl. The `buildOptions` is the options for building a new bloom filter.
It is refer to the c version of `bloom_init` function in redisbloom module.
func (*GoBloomFilterImpl) Add ¶
func (g *GoBloomFilterImpl) Add(data []byte) bool
func (*GoBloomFilterImpl) GetBytesNumber ¶
func (g *GoBloomFilterImpl) GetBytesNumber() uint64
func (*GoBloomFilterImpl) GetEntries ¶
func (g *GoBloomFilterImpl) GetEntries() uint64
func (*GoBloomFilterImpl) GetErrorRate ¶
func (g *GoBloomFilterImpl) GetErrorRate() float64
func (*GoBloomFilterImpl) GetItems ¶
func (g *GoBloomFilterImpl) GetItems() uint64
func (*GoBloomFilterImpl) Hash64 ¶
func (g *GoBloomFilterImpl) Hash64(data []byte) baseHash
func (*GoBloomFilterImpl) LoadBytes ¶
func (g *GoBloomFilterImpl) LoadBytes(offset uint64, data []byte) error
func (*GoBloomFilterImpl) Reset ¶
func (g *GoBloomFilterImpl) Reset()
func (*GoBloomFilterImpl) SetBit ¶
func (g *GoBloomFilterImpl) SetBit(index uint64, value uint8)
func (*GoBloomFilterImpl) Test ¶
func (g *GoBloomFilterImpl) Test(data []byte) bool
func (*GoBloomFilterImpl) TestOrAdd ¶
func (g *GoBloomFilterImpl) TestOrAdd(data []byte) bool
type GoBloomOptions ¶
type GoBloomOptions uint8
// Disable auto-scaling. Saves memory #define BLOOM_OPT_NO_SCALING 8
const ( BLOOM_OPT_NOROUND GoBloomOptions = 1 // Entries is actually the number of bits, not the number of entries to reserve BLOOM_OPT_ENTS_IS_BITS GoBloomOptions = 2 BLOOM_OPT_FORCE64 GoBloomOptions = 4 BLOOM_OPT_NO_SCALING GoBloomOptions = 8 )