Documentation ¶
Overview ¶
Package bloom implements parquet bloom filters.
Index ¶
- Constants
- func CheckSplitBlock(r io.ReaderAt, n int64, x uint64) (bool, error)
- func NumSplitBlocksOf(numValues int64, bitsPerValue uint) int
- type Block
- type Filter
- type Hash
- type SplitBlockFilter
- type Word
- type XXH64
- func (XXH64) MultiSum64Uint128(h []uint64, v [][16]byte) int
- func (XXH64) MultiSum64Uint16(h []uint64, v []uint16) int
- func (XXH64) MultiSum64Uint32(h []uint64, v []uint32) int
- func (XXH64) MultiSum64Uint64(h []uint64, v []uint64) int
- func (XXH64) MultiSum64Uint8(h []uint64, v []uint8) int
- func (XXH64) Sum64(b []byte) uint64
- func (XXH64) Sum64Uint128(v [16]byte) uint64
- func (XXH64) Sum64Uint16(v uint16) uint64
- func (XXH64) Sum64Uint32(v uint32) uint64
- func (XXH64) Sum64Uint64(v uint64) uint64
- func (XXH64) Sum64Uint8(v uint8) uint64
Constants ¶
const (
	// BlockSize is the size of bloom filter blocks in bytes.
	BlockSize = 32
)
Variables ¶
This section is empty.
Functions ¶
func CheckSplitBlock ¶
func CheckSplitBlock(r io.ReaderAt, n int64, x uint64) (bool, error)
CheckSplitBlock is similar to bloom.SplitBlockFilter.Check but reads the bloom filter of n bytes from r.
The size n of the bloom filter is assumed to be a multiple of the block size.
func NumSplitBlocksOf ¶
func NumSplitBlocksOf(numValues int64, bitsPerValue uint) int
NumSplitBlocksOf returns the number of blocks in a filter intended to hold the given number of values and bits of filter per value.
This function is useful to determine the number of blocks when creating bloom filters in memory, for example:
f := make(bloom.SplitBlockFilter, bloom.NumSplitBlocksOf(n, 10))
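The sizing arithmetic can be sketched as follows. This is an illustration only: numBlocksSketch is a hypothetical helper, and it assumes the bit budget is rounded up to whole bytes and then to whole 32-byte blocks, which may not match the package's rounding exactly.

```go
package main

import "fmt"

// blockSize mirrors the documented BlockSize constant (32 bytes per block).
const blockSize = 32

// numBlocksSketch is a hypothetical helper, not the package's function.
// It assumes the total bit budget is rounded up to whole bytes, then to
// whole blocks.
func numBlocksSketch(numValues int64, bitsPerValue uint) int {
	numBits := uint64(numValues) * uint64(bitsPerValue)
	numBytes := (numBits + 7) / 8
	return int((numBytes + blockSize - 1) / blockSize)
}

func main() {
	// 1000 values at 10 bits each need 10000 bits, or 1250 bytes,
	// which rounds up to 40 blocks of 32 bytes.
	n := numBlocksSketch(1000, 10)
	fmt.Println(n, "blocks,", n*blockSize, "bytes")
}
```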
Types ¶
type Block ¶
type Block [8]Word
Block represents a bloom filter block, which contains eight 32-bit words.
type Filter ¶
Filter is an interface representing read-only bloom filters that programs can probe for the possible presence of a hash key.
type Hash ¶
type Hash interface {
	// Returns the 64 bit hash of the value passed as argument.
	Sum64(value []byte) uint64

	// Compute hashes of individual values of primitive types.
	Sum64Uint8(value uint8) uint64
	Sum64Uint16(value uint16) uint64
	Sum64Uint32(value uint32) uint64
	Sum64Uint64(value uint64) uint64
	Sum64Uint128(value [16]byte) uint64

	// Compute hashes of the array of fixed size values passed as arguments,
	// returning the number of hashes written to the destination buffer.
	MultiSum64Uint8(dst []uint64, src []uint8) int
	MultiSum64Uint16(dst []uint64, src []uint16) int
	MultiSum64Uint32(dst []uint64, src []uint32) int
	MultiSum64Uint64(dst []uint64, src []uint64) int
	MultiSum64Uint128(dst []uint64, src [][16]byte) int
}
Hash is an interface abstracting the hashing algorithm used in bloom filters.
Hash instances must be safe to use concurrently from multiple goroutines.
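One simple way to meet the concurrency requirement is a stateless type, as sketched below for a subset of the interface. Everything here is illustrative: fnvHash is a hypothetical type that uses FNV-1a from the standard library in place of XXH64, the little-endian encoding of integers is an assumption, and bounding the bulk method by the shorter of the two slices is one plausible reading of "the number of hashes written".

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sync"
)

// fnvHash is a hypothetical stand-in, not the package's XXH64 type.
// It holds no state, so concurrent calls cannot interfere with each other.
type fnvHash struct{}

// Sum64 hashes b with FNV-1a, creating a fresh hasher on every call.
func (fnvHash) Sum64(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

// Sum64Uint64 hashes the little-endian encoding of v (an assumption).
func (f fnvHash) Sum64Uint64(v uint64) uint64 {
	var b [8]byte
	binary.LittleEndian.PutUint64(b[:], v)
	return f.Sum64(b[:])
}

// MultiSum64Uint64 writes one hash per source value into dst and returns
// how many were written, bounded here by the shorter slice.
func (f fnvHash) MultiSum64Uint64(dst []uint64, src []uint64) int {
	n := len(src)
	if len(dst) < n {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		dst[i] = f.Sum64Uint64(src[i])
	}
	return n
}

func main() {
	var h fnvHash
	src := []uint64{1, 2, 3, 4}

	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			dst := make([]uint64, len(src)) // each goroutine owns its buffer
			h.MultiSum64Uint64(dst, src)
		}()
	}
	wg.Wait()

	// A destination shorter than the source limits the count.
	fmt.Println(h.MultiSum64Uint64(make([]uint64, 3), src)) // 3
}
```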
type SplitBlockFilter ¶
type SplitBlockFilter []Block
SplitBlockFilter is an in-memory implementation of parquet bloom filters.
This type is useful to construct bloom filters that are later serialized to a storage medium.
func MakeSplitBlockFilter ¶
func MakeSplitBlockFilter(data []byte) SplitBlockFilter
MakeSplitBlockFilter constructs a SplitBlockFilter value from the data byte slice.
func (SplitBlockFilter) Block ¶
func (f SplitBlockFilter) Block(x uint64) *Block
Block returns a pointer to the block that the given value hashes to in the bloom filter.
func (SplitBlockFilter) Bytes ¶
func (f SplitBlockFilter) Bytes() []byte
Bytes converts f to a byte slice.
The returned slice shares the memory of f. The method is intended to be used to serialize the bloom filter to a storage medium.
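The consequence of sharing memory is that writes through either view are visible in the other. The sketch below demonstrates this aliasing with a local block type and a hypothetical bytesOf helper built on unsafe.Slice (Go 1.17 or later); how the package itself performs the conversion is not stated here, so treat the mechanism as an assumption.

```go
package main

import (
	"fmt"
	"unsafe"
)

// block is a local stand-in for the documented Block type.
type block [8]uint32

// bytesOf is a hypothetical helper that reinterprets a slice of blocks
// as bytes without copying.
func bytesOf(f []block) []byte {
	if len(f) == 0 {
		return nil
	}
	size := len(f) * int(unsafe.Sizeof(block{}))
	return unsafe.Slice((*byte)(unsafe.Pointer(&f[0])), size)
}

func main() {
	f := make([]block, 2)
	b := bytesOf(f)

	// Writing through the block view changes the byte view too.
	f[1][0] = 0xFFFFFFFF
	fmt.Println(len(b), b[32]) // 64 255

	// Copy the bytes when an independent snapshot is needed.
	snapshot := append([]byte(nil), b...)
	f[1][0] = 0
	fmt.Println(b[32], snapshot[32]) // 0 255
}
```

When serializing, this means the bytes should be written out (or copied) before the filter is modified or reset.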
func (SplitBlockFilter) Check ¶
func (f SplitBlockFilter) Check(x uint64) bool
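Check tests whether x is in f. As with any bloom filter, a true result may be a false positive, while a false result means x was never inserted.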
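For readers unfamiliar with split block filters, here is a minimal self-contained sketch of insertion and lookup. It follows the algorithm and salt constants described in the Parquet format specification, uses local stand-in types (word, block, filter) and hypothetical method names, and should not be read as the package's actual code.

```go
package main

import "fmt"

// Local stand-ins for the documented Word, Block and SplitBlockFilter types.
type word = uint32
type block [8]word
type filter []block

// Salt constants from the Parquet split block bloom filter specification.
var salt = [8]uint32{
	0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d,
	0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31,
}

// blockIndex maps the upper 32 bits of the hash to a block number.
func blockIndex(x uint64, numBlocks int) int {
	return int(((x >> 32) * uint64(numBlocks)) >> 32)
}

// mask derives one bit per word from the lower 32 bits of the hash.
func mask(x uint64) (m block) {
	for i, s := range salt {
		m[i] = 1 << ((uint32(x) * s) >> 27)
	}
	return m
}

// insert sets the eight mask bits in the block selected by x.
func (f filter) insert(x uint64) {
	b := &f[blockIndex(x, len(f))]
	for i, w := range mask(x) {
		b[i] |= w
	}
}

// check reports whether all eight mask bits are set in the selected block.
func (f filter) check(x uint64) bool {
	b := &f[blockIndex(x, len(f))]
	for i, w := range mask(x) {
		if b[i]&w != w {
			return false
		}
	}
	return true
}

func main() {
	f := make(filter, 4)
	fmt.Println(f.check(42)) // false: nothing has been inserted yet
	f.insert(42)
	fmt.Println(f.check(42)) // true
}
```

Because every probe touches a single 32-byte block, a lookup costs at most one cache line access, which is the main motivation for the split block layout.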
func (SplitBlockFilter) InsertBulk ¶
func (f SplitBlockFilter) InsertBulk(x []uint64)
InsertBulk adds all values from x into f.
func (SplitBlockFilter) Reset ¶
func (f SplitBlockFilter) Reset()
Reset clears the content of the filter f.
type XXH64 ¶
type XXH64 struct{}
XXH64 is an implementation of the Hash interface using the XXH64 algorithm.