sketchy

package module
v0.0.0-...-15845fb Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 16, 2015 License: MIT Imports: 5 Imported by: 0

README

Build Status GoDoc


Package sketchy provides count- and rate-tracking sketches. These are probabilistic data structures that can track the occurrences of a very large number of distinct keys using a relatively small amount of space. The resulting counts/rates are estimates with a guaranteed level of accuracy.

Documentation

Overview

Package sketchy provides count- and rate-tracking sketches. These are probabilistic data structures that can track the occurrences of a very large number of distinct keys using a relatively small amount of space. The resulting counts/rates are estimates with a guaranteed level of accuracy.

Index

Constants

This section is empty.

Variables

View Source
var (
	DefaultEpsilon = 0.999
	DefaultDelta   = 0.99
)

Functions

This section is empty.

Types

type CountSketch

type CountSketch interface {
	// Count adds delta to the count of occurrences of the given key.
	// Returns the updated estimated count.
	Count(key []byte, delta int) uint64

	// Query returns the estimated count of the given key.
	Query(key []byte) uint64
}

A Sketch counts occurrences of keys and returns approximate total counts.

func NewSketch

func NewSketch(epsilon, delta float64) CountSketch

NewSketch returns a new, empty count-min sketch with the given parameters. The values of epsilon and delta determine the desired accuracy of the sketch. Queries for the count of observations by a particular key by this sketch will be within a factor of epsilon of the true count, with probability delta.

The closer these parameters are to 1, the greater the storage and computation cost. The value of delta determines how many hashes we must compute for each key (and how many counters we must inspect to answer a count query). The bucket uses ceil(log(1 / (1-delta))) hashes. The value of epsilon determines the size of the domain we map these hashes to. The size of the domain is e / (1-epsilon). So, for epsilon=0.999, delta=0.99, we would store 2719 counters for each of five hash values.

type RateSketch

type RateSketch interface {
	// Count records delta occurrences of key, returning the updated observed
	// rate over the given interval. If interval is smaller than time.Second,
	// or the available data covers less than a second, then 0 is returned.
	Count(key []byte, delta int, interval time.Duration) float64

	// Query returns the observed rate of the given key over the given interval.
	// If interval is smaller than time.Second, or the available data covers
	// less than a second, then 0 is returned.
	Query(key []byte, interval time.Duration) float64
}

Counter provides an interface for tracking the rate at which keys are observed.

func RollingCounter

func RollingCounter(epsilon, delta float64, interval time.Duration, num int) RateSketch

RollingCounter maintains a series of count-min sketches to count events in time-based buckets. Counts are always applied to the "current" bucket (which is reinitialized as needed). Rate queries can use multiple buckets to account for the interval of the query.

The interval given to this constructor specifies the maximum duration of each bucket. If an event comes in beyond the current bucket's duration, then a new bucket is created. If the maximum number of buckets (given by num) is exceeded, then the oldest bucket is forgotten.

func RollupCounter

func RollupCounter(epsilon, delta float64, durations ...time.Duration) RateSketch

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL