histogram

package
v0.40.5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 1, 2022 License: Apache-2.0 Imports: 3 Imported by: 88

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Bucket

type Bucket[BC BucketCount] struct {
	Lower, Upper                   float64
	LowerInclusive, UpperInclusive bool
	Count                          BC

	// Index within schema. To easily compare buckets that share the same
	// schema and sign (positive or negative). Irrelevant for the zero bucket.
	Index int32
}

Bucket represents a bucket with lower and upper limit and the absolute count of samples in the bucket. It also specifies if each limit is inclusive or not. (Mathematically, inclusive limits create a closed interval, and non-inclusive limits an open interval.)

To represent cumulative buckets, Lower is set to -Inf, and the Count is then cumulative (including the counts of all buckets for smaller values).

func (Bucket[BC]) String

func (b Bucket[BC]) String() string

String returns a string representation of a Bucket, using the usual mathematical notation of '['/']' for inclusive bounds and '('/')' for non-inclusive bounds.

type BucketCount

type BucketCount interface {
	float64 | uint64
}

BucketCount is a type constraint for the count in a bucket, which can be float64 (for type FloatHistogram) or uint64 (for type Histogram).

type BucketIterator

type BucketIterator[BC BucketCount] interface {
	// Next advances the iterator by one.
	Next() bool
	// At returns the current bucket.
	At() Bucket[BC]
}

BucketIterator iterates over the buckets of a Histogram, returning decoded buckets.

type FloatHistogram

type FloatHistogram struct {
	// Currently valid schema numbers are -4 <= n <= 8.  They are all for
	// base-2 bucket schemas, where 1 is a bucket boundary in each case, and
	// then each power of two is divided into 2^n logarithmic buckets.  Or
	// in other words, each bucket boundary is the previous boundary times
	// 2^(2^-n).
	Schema int32
	// Width of the zero bucket.
	ZeroThreshold float64
	// Observations falling into the zero bucket. Must be zero or positive.
	ZeroCount float64
	// Total number of observations. Must be zero or positive.
	Count float64
	// Sum of observations. This is also used as the stale marker.
	Sum float64
	// Spans for positive and negative buckets (see Span below).
	PositiveSpans, NegativeSpans []Span
	// Observation counts in buckets. Each represents an absolute count and
	// must be zero or positive.
	PositiveBuckets, NegativeBuckets []float64
}

FloatHistogram is similar to Histogram but uses float64 for all counts. Additionally, bucket counts are absolute and not deltas.

A FloatHistogram is needed by PromQL to handle operations that might result in fractional counts. Since the counts in a histogram are unlikely to be too large to be represented precisely by a float64, a FloatHistogram can also be used to represent a histogram with integer counts and thus serves as a more generalized representation.

func (*FloatHistogram) Add

Add adds the provided other histogram to the receiving histogram. Count, Sum, and buckets from the other histogram are added to the corresponding components of the receiving histogram. Buckets in the other histogram that do not exist in the receiving histogram are inserted into the latter. The resulting histogram might have buckets with a population of zero or directly adjacent spans (offset=0). To normalize those, call the Compact method.

The method reconciles differences in the zero threshold and in the schema, but the schema of the other histogram must be ≥ the schema of the receiving histogram (i.e. must have an equal or higher resolution). This means that the schema of the receiving histogram won't change. Its zero threshold, however, will change if needed. The other histogram will not be modified in any case.

This method returns a pointer to the receiving histogram for convenience.

func (*FloatHistogram) AllBucketIterator

func (h *FloatHistogram) AllBucketIterator() BucketIterator[float64]

AllBucketIterator returns a BucketIterator to iterate over all negative, zero, and positive buckets in ascending order (starting at the lowest bucket and going up). If the highest negative bucket or the lowest positive bucket overlap with the zero bucket, their upper or lower boundary, respectively, is set to the zero threshold.

func (*FloatHistogram) Compact

func (h *FloatHistogram) Compact(maxEmptyBuckets int) *FloatHistogram

Compact eliminates empty buckets at the beginning and end of each span, then merges spans that are consecutive or at most maxEmptyBuckets apart, and finally splits spans that contain more consecutive empty buckets than maxEmptyBuckets. (The actual implementation might do something more efficient but with the same result.) The compaction happens "in place" in the receiving histogram, but a pointer to it is returned for convenience.

The ideal value for maxEmptyBuckets depends on circumstances. The motivation to set maxEmptyBuckets > 0 is the assumption that is is less overhead to represent very few empty buckets explicitly within one span than cutting the one span into two to treat the empty buckets as a gap between the two spans, both in terms of storage requirement as well as in terms of encoding and decoding effort. However, the tradeoffs are subtle. For one, they are different in the exposition format vs. in a TSDB chunk vs. for the in-memory representation as Go types. In the TSDB, as an additional aspects, the span layout is only stored once per chunk, while many histograms with that same chunk layout are then only stored with their buckets (so that even a single empty bucket will be stored many times).

For the Go types, an additional Span takes 8 bytes. Similarly, an additional bucket takes 8 bytes. Therefore, with a single separating empty bucket, both options have the same storage requirement, but the single-span solution is easier to iterate through. Still, the safest bet is to use maxEmptyBuckets==0 and only use a larger number if you know what you are doing.

func (*FloatHistogram) Copy

func (h *FloatHistogram) Copy() *FloatHistogram

Copy returns a deep copy of the Histogram.

func (*FloatHistogram) CopyToSchema

func (h *FloatHistogram) CopyToSchema(targetSchema int32) *FloatHistogram

CopyToSchema works like Copy, but the returned deep copy has the provided target schema, which must be ≤ the original schema (i.e. it must have a lower resolution).

func (*FloatHistogram) DetectReset

func (h *FloatHistogram) DetectReset(previous *FloatHistogram) bool

DetectReset returns true if the receiving histogram is missing any buckets that have a non-zero population in the provided previous histogram. It also returns true if any count (in any bucket, in the zero count, or in the count of observations, but NOT the sum of observations) is smaller in the receiving histogram compared to the previous histogram. Otherwise, it returns false.

Special behavior in case the Schema or the ZeroThreshold are not the same in both histograms:

  • A decrease of the ZeroThreshold or an increase of the Schema (i.e. an increase of resolution) can only happen together with a reset. Thus, the method returns true in either case.

  • Upon an increase of the ZeroThreshold, the buckets in the previous histogram that fall within the new ZeroThreshold are added to the ZeroCount of the previous histogram (without mutating the provided previous histogram). The scenario that a populated bucket of the previous histogram is partially within, partially outside of the new ZeroThreshold, can only happen together with a counter reset and therefore shortcuts to returning true.

  • Upon a decrease of the Schema, the buckets of the previous histogram are merged so that they match the new, lower-resolution schema (again without mutating the provided previous histogram).

Note that this kind of reset detection is quite expensive. Ideally, resets are detected at ingest time and stored in the TSDB, so that the reset information can be read directly from there rather than be detected each time again.

func (*FloatHistogram) NegativeBucketIterator

func (h *FloatHistogram) NegativeBucketIterator() BucketIterator[float64]

NegativeBucketIterator returns a BucketIterator to iterate over all negative buckets in descending order (starting next to the zero bucket and going down).

func (*FloatHistogram) NegativeReverseBucketIterator

func (h *FloatHistogram) NegativeReverseBucketIterator() BucketIterator[float64]

NegativeReverseBucketIterator returns a BucketIterator to iterate over all negative buckets in ascending order (starting at the lowest bucket and going up towards the zero bucket).

func (*FloatHistogram) PositiveBucketIterator

func (h *FloatHistogram) PositiveBucketIterator() BucketIterator[float64]

PositiveBucketIterator returns a BucketIterator to iterate over all positive buckets in ascending order (starting next to the zero bucket and going up).

func (*FloatHistogram) PositiveReverseBucketIterator

func (h *FloatHistogram) PositiveReverseBucketIterator() BucketIterator[float64]

PositiveReverseBucketIterator returns a BucketIterator to iterate over all positive buckets in descending order (starting at the highest bucket and going down towards the zero bucket).

func (*FloatHistogram) Scale

func (h *FloatHistogram) Scale(factor float64) *FloatHistogram

Scale scales the FloatHistogram by the provided factor, i.e. it scales all bucket counts including the zero bucket and the count and the sum of observations. The bucket layout stays the same. This method changes the receiving histogram directly (rather than acting on a copy). It returns a pointer to the receiving histogram for convenience.

func (*FloatHistogram) String

func (h *FloatHistogram) String() string

String returns a string representation of the Histogram.

func (*FloatHistogram) Sub

Sub works like Add but subtracts the other histogram.

func (*FloatHistogram) ZeroBucket

func (h *FloatHistogram) ZeroBucket() Bucket[float64]

ZeroBucket returns the zero bucket.

type Histogram

type Histogram struct {
	// Currently valid schema numbers are -4 <= n <= 8.  They are all for
	// base-2 bucket schemas, where 1 is a bucket boundary in each case, and
	// then each power of two is divided into 2^n logarithmic buckets.  Or
	// in other words, each bucket boundary is the previous boundary times
	// 2^(2^-n).
	Schema int32
	// Width of the zero bucket.
	ZeroThreshold float64
	// Observations falling into the zero bucket.
	ZeroCount uint64
	// Total number of observations.
	Count uint64
	// Sum of observations. This is also used as the stale marker.
	Sum float64
	// Spans for positive and negative buckets (see Span below).
	PositiveSpans, NegativeSpans []Span
	// Observation counts in buckets. The first element is an absolute
	// count. All following ones are deltas relative to the previous
	// element.
	PositiveBuckets, NegativeBuckets []int64
}

Histogram encodes a sparse, high-resolution histogram. See the design document for full details: https://docs.google.com/document/d/1cLNv3aufPZb3fNfaJgdaRBZsInZKKIHo9E6HinJVbpM/edit#

The most tricky bit is how bucket indices represent real bucket boundaries. An example for schema 0 (by which each bucket is twice as wide as the previous bucket):

Bucket boundaries →              [-2,-1)  [-1,-0.5) [-0.5,-0.25) ... [-0.001,0.001] ... (0.25,0.5] (0.5,1]  (1,2] ....
                                    ↑        ↑           ↑                  ↑                ↑         ↑      ↑
Zero bucket (width e.g. 0.001) →    |        |           |                  ZB               |         |      |
Positive bucket indices →           |        |           |                          ...     -1         0      1    2    3
Negative bucket indices →  3   2    1        0          -1       ...

Which bucket indices are actually used is determined by the spans.

func (*Histogram) Compact

func (h *Histogram) Compact(maxEmptyBuckets int) *Histogram

Compact works like FloatHistogram.Compact. See there for detailed explanations.

func (*Histogram) Copy

func (h *Histogram) Copy() *Histogram

Copy returns a deep copy of the Histogram.

func (*Histogram) CumulativeBucketIterator

func (h *Histogram) CumulativeBucketIterator() BucketIterator[uint64]

CumulativeBucketIterator returns a BucketIterator to iterate over a cumulative view of the buckets. This method currently only supports Histograms without negative buckets and panics if the Histogram has negative buckets. It is currently only used for testing.

func (*Histogram) Equals

func (h *Histogram) Equals(h2 *Histogram) bool

Equals returns true if the given histogram matches exactly. Exact match is when there are no new buckets (even empty) and no missing buckets, and all the bucket values match. Spans can have different empty length spans in between, but they must represent the same bucket layout to match.

func (*Histogram) NegativeBucketIterator

func (h *Histogram) NegativeBucketIterator() BucketIterator[uint64]

NegativeBucketIterator returns a BucketIterator to iterate over all negative buckets in descending order (starting next to the zero bucket and going down).

func (*Histogram) PositiveBucketIterator

func (h *Histogram) PositiveBucketIterator() BucketIterator[uint64]

PositiveBucketIterator returns a BucketIterator to iterate over all positive buckets in ascending order (starting next to the zero bucket and going up).

func (*Histogram) String

func (h *Histogram) String() string

String returns a string representation of the Histogram.

func (*Histogram) ToFloat

func (h *Histogram) ToFloat() *FloatHistogram

ToFloat returns a FloatHistogram representation of the Histogram. It is a deep copy (e.g. spans are not shared).

func (*Histogram) ZeroBucket

func (h *Histogram) ZeroBucket() Bucket[uint64]

ZeroBucket returns the zero bucket.

type Span

type Span struct {
	// Gap to previous span (always positive), or starting index for the 1st
	// span (which can be negative).
	Offset int32
	// Length of the span.
	Length uint32
}

A Span defines a continuous sequence of buckets.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL