Documentation ¶
Overview ¶
Package hyperloglog implements the HyperLogLog algorithm for cardinality estimation. In English: it counts things. It counts things using very small amounts of memory compared to the number of objects it is counting.
For a full description of the algorithm, see the paper HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm by Flajolet, et. al.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Murmur128 ¶
Murmur128 implements a fast version of the murmur hash function for two uint64s for little endian machines. Suitable for adding a 128bit value to an HLL counter.
func Murmur32 ¶
Murmur32 implements a fast version of the murmur hash function for uint32 for little endian machines. Suitable for adding 32bit integers to a HLL counter.
func Murmur64 ¶
Murmur64 implements a fast version of the murmur hash function for uint64 for little endian machines. Suitable for adding 64bit integers to a HLL counter.
func MurmurBytes ¶
MurmurBytes implements a fast version of the murmur hash function for bytes for little endian machines. Suitable for adding strings to HLL counter.
func MurmurString ¶
MurmurString implements a fast version of the murmur hash function for strings for little endian machines. Suitable for adding strings to HLL counter.
Types ¶
type HyperLogLog ¶
type HyperLogLog struct { M uint // Number of registers B uint32 // Number of bits used to determine register index Alpha float64 // Bias correction constant Registers []uint8 }
A HyperLogLog is a deterministic cardinality estimator. This version exports its fields so that it is suitable for saving eg. to a database.
func New ¶
func New(registers uint) (*HyperLogLog, error)
New creates a HyperLogLog with the given number of registers. More registers leads to lower error in your estimated count, at the expense of memory.
Choose a power of two number of registers, depending on the amount of memory you're willing to use and the error you're willing to tolerate. Each register uses one byte of memory.
Standard error will be: σ ≈ 1.04 / sqrt(registers) The estimates provided by hyperloglog are expected to be within σ, 2σ, 3σ of the exact count in respectively 65%, 95%, 99% of all the cases.
func (*HyperLogLog) Add ¶
func (h *HyperLogLog) Add(val uint32)
Add to the count. val should be a 32 bit unsigned integer from a good hash function.
func (*HyperLogLog) Count ¶
func (h *HyperLogLog) Count() uint64
Count returns the estimated cardinality.
func (*HyperLogLog) CountWithoutLargeRangeCorrection ¶
func (h *HyperLogLog) CountWithoutLargeRangeCorrection() uint64
CountWithoutLargeRangeCorrection returns the estimated cardinality, without applying the large range correction proposed by Flajolet et al. as it can lead to significant overcounting.
func (*HyperLogLog) Merge ¶
func (h *HyperLogLog) Merge(other *HyperLogLog) error
Merge another HyperLogLog into this one. The number of registers in each must be the same.
func (*HyperLogLog) Reset ¶
func (h *HyperLogLog) Reset()
Reset all internal variables and set the count to zero.