histogram

package
v0.0.0-...-03ae564 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 31, 2024 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrHistogramCountNotBigEnough    = errors.New("histogram's observation count should be at least the number of observations found in the buckets")
	ErrHistogramCountMismatch        = errors.New("histogram's observation count should equal the number of observations found in the buckets (in absence of NaN)")
	ErrHistogramNegativeBucketCount  = errors.New("histogram has a bucket whose observation count is negative")
	ErrHistogramSpanNegativeOffset   = errors.New("histogram has a span whose offset is negative")
	ErrHistogramSpansBucketsMismatch = errors.New("histogram spans specify different number of buckets than provided")
)

Functions

This section is empty.

Types

type Bucket

type Bucket[BC BucketCount] struct {
	Lower, Upper                   float64
	LowerInclusive, UpperInclusive bool
	Count                          BC

	// Index within schema. To easily compare buckets that share the same
	// schema and sign (positive or negative). Irrelevant for the zero bucket.
	Index int32
}

Bucket represents a bucket with lower and upper limit and the absolute count of samples in the bucket. It also specifies if each limit is inclusive or not. (Mathematically, inclusive limits create a closed interval, and non-inclusive limits an open interval.)

To represent cumulative buckets, Lower is set to -Inf, and the Count is then cumulative (including the counts of all buckets for smaller values).

func (Bucket[BC]) String

func (b Bucket[BC]) String() string

String returns a string representation of a Bucket, using the usual mathematical notation of '['/']' for inclusive bounds and '('/')' for non-inclusive bounds.

type BucketCount

type BucketCount interface {
	float64 | uint64
}

BucketCount is a type constraint for the count in a bucket, which can be float64 (for type FloatHistogram) or uint64 (for type Histogram).

type BucketIterator

type BucketIterator[BC BucketCount] interface {
	// Next advances the iterator by one.
	Next() bool
	// At returns the current bucket.
	At() Bucket[BC]
}

BucketIterator iterates over the buckets of a Histogram, returning decoded buckets.

type CounterResetHint

type CounterResetHint byte

CounterResetHint contains the known information about a counter reset, or alternatively that we are dealing with a gauge histogram, where counter resets do not apply.

const (
	UnknownCounterReset CounterResetHint = iota // UnknownCounterReset means we cannot say if this histogram signals a counter reset or not.
	CounterReset                                // CounterReset means there was definitely a counter reset starting from this histogram.
	NotCounterReset                             // NotCounterReset means there was definitely no counter reset with this histogram.
	GaugeType                                   // GaugeType means this is a gauge histogram, where counter resets do not happen.
)

type FloatHistogram

type FloatHistogram struct {
	// Counter reset information.
	CounterResetHint CounterResetHint
	// Currently valid schema numbers are -4 <= n <= 8.  They are all for
	// base-2 bucket schemas, where 1 is a bucket boundary in each case, and
	// then each power of two is divided into 2^n logarithmic buckets.  Or
	// in other words, each bucket boundary is the previous boundary times
	// 2^(2^-n).
	Schema int32
	// Width of the zero bucket.
	ZeroThreshold float64
	// Observations falling into the zero bucket. Must be zero or positive.
	ZeroCount float64
	// Total number of observations. Must be zero or positive.
	Count float64
	// Sum of observations. This is also used as the stale marker.
	Sum float64
	// Spans for positive and negative buckets (see Span below).
	PositiveSpans, NegativeSpans []Span
	// Observation counts in buckets. Each represents an absolute count and
	// must be zero or positive.
	PositiveBuckets, NegativeBuckets []float64
}

FloatHistogram is similar to Histogram but uses float64 for all counts. Additionally, bucket counts are absolute and not deltas.

A FloatHistogram is needed by PromQL to handle operations that might result in fractional counts. Since the counts in a histogram are unlikely to be too large to be represented precisely by a float64, a FloatHistogram can also be used to represent a histogram with integer counts and thus serves as a more generalized representation.

func (*FloatHistogram) Add

Add adds the provided other histogram to the receiving histogram. Count, Sum, and buckets from the other histogram are added to the corresponding components of the receiving histogram. Buckets in the other histogram that do not exist in the receiving histogram are inserted into the latter. The resulting histogram might have buckets with a population of zero or directly adjacent spans (offset=0). To normalize those, call the Compact method.

The method reconciles differences in the zero threshold and in the schema, and changes them if needed. The other histogram will not be modified in any case.

This method returns a pointer to the receiving histogram for convenience.

func (*FloatHistogram) AllBucketIterator

func (h *FloatHistogram) AllBucketIterator() BucketIterator[float64]

AllBucketIterator returns a BucketIterator to iterate over all negative, zero, and positive buckets in ascending order (starting at the lowest bucket and going up). If the highest negative bucket or the lowest positive bucket overlap with the zero bucket, their upper or lower boundary, respectively, is set to the zero threshold.

func (*FloatHistogram) AllReverseBucketIterator

func (h *FloatHistogram) AllReverseBucketIterator() BucketIterator[float64]

AllReverseBucketIterator returns a BucketIterator to iterate over all negative, zero, and positive buckets in descending order (starting at the lowest bucket and going up). If the highest negative bucket or the lowest positive bucket overlap with the zero bucket, their upper or lower boundary, respectively, is set to the zero threshold.

func (*FloatHistogram) Compact

func (h *FloatHistogram) Compact(maxEmptyBuckets int) *FloatHistogram

Compact eliminates empty buckets at the beginning and end of each span, then merges spans that are consecutive or at most maxEmptyBuckets apart, and finally splits spans that contain more consecutive empty buckets than maxEmptyBuckets. (The actual implementation might do something more efficient but with the same result.) The compaction happens "in place" in the receiving histogram, but a pointer to it is returned for convenience.

The ideal value for maxEmptyBuckets depends on circumstances. The motivation to set maxEmptyBuckets > 0 is the assumption that is less overhead to represent very few empty buckets explicitly within one span than cutting the one span into two to treat the empty buckets as a gap between the two spans, both in terms of storage requirement as well as in terms of encoding and decoding effort. However, the tradeoffs are subtle. For one, they are different in the exposition format vs. in a TSDB chunk vs. for the in-memory representation as Go types. In the TSDB, as an additional aspects, the span layout is only stored once per chunk, while many histograms with that same chunk layout are then only stored with their buckets (so that even a single empty bucket will be stored many times).

For the Go types, an additional Span takes 8 bytes. Similarly, an additional bucket takes 8 bytes. Therefore, with a single separating empty bucket, both options have the same storage requirement, but the single-span solution is easier to iterate through. Still, the safest bet is to use maxEmptyBuckets==0 and only use a larger number if you know what you are doing.

func (*FloatHistogram) Copy

func (h *FloatHistogram) Copy() *FloatHistogram

Copy returns a deep copy of the Histogram.

func (*FloatHistogram) CopyTo

func (h *FloatHistogram) CopyTo(to *FloatHistogram)

CopyTo makes a deep copy into the given FloatHistogram. The destination object has to be a non-nil pointer.

func (*FloatHistogram) CopyToSchema

func (h *FloatHistogram) CopyToSchema(targetSchema int32) *FloatHistogram

CopyToSchema works like Copy, but the returned deep copy has the provided target schema, which must be ≤ the original schema (i.e. it must have a lower resolution).

func (*FloatHistogram) DetectReset

func (h *FloatHistogram) DetectReset(previous *FloatHistogram) bool

DetectReset returns true if the receiving histogram is missing any buckets that have a non-zero population in the provided previous histogram. It also returns true if any count (in any bucket, in the zero count, or in the count of observations, but NOT the sum of observations) is smaller in the receiving histogram compared to the previous histogram. Otherwise, it returns false.

This method will shortcut to true if a CounterReset is detected, and shortcut to false if NotCounterReset is detected. Otherwise it will do the work to detect a reset.

Special behavior in case the Schema or the ZeroThreshold are not the same in both histograms:

  • A decrease of the ZeroThreshold or an increase of the Schema (i.e. an increase of resolution) can only happen together with a reset. Thus, the method returns true in either case.

  • Upon an increase of the ZeroThreshold, the buckets in the previous histogram that fall within the new ZeroThreshold are added to the ZeroCount of the previous histogram (without mutating the provided previous histogram). The scenario that a populated bucket of the previous histogram is partially within, partially outside of the new ZeroThreshold, can only happen together with a counter reset and therefore shortcuts to returning true.

  • Upon a decrease of the Schema, the buckets of the previous histogram are merged so that they match the new, lower-resolution schema (again without mutating the provided previous histogram).

func (*FloatHistogram) Div

func (h *FloatHistogram) Div(scalar float64) *FloatHistogram

Div works like Mul but divides instead of multiplies. When dividing by 0, everything will be set to Inf.

func (*FloatHistogram) Equals

func (h *FloatHistogram) Equals(h2 *FloatHistogram) bool

Equals returns true if the given float histogram matches exactly. Exact match is when there are no new buckets (even empty) and no missing buckets, and all the bucket values match. Spans can have different empty length spans in between, but they must represent the same bucket layout to match. Sum, Count, ZeroCount and bucket values are compared based on their bit patterns because this method is about data equality rather than mathematical equality.

func (*FloatHistogram) Mul

func (h *FloatHistogram) Mul(factor float64) *FloatHistogram

Mul multiplies the FloatHistogram by the provided factor, i.e. it scales all bucket counts including the zero bucket and the count and the sum of observations. The bucket layout stays the same. This method changes the receiving histogram directly (rather than acting on a copy). It returns a pointer to the receiving histogram for convenience.

func (*FloatHistogram) NegativeBucketIterator

func (h *FloatHistogram) NegativeBucketIterator() BucketIterator[float64]

NegativeBucketIterator returns a BucketIterator to iterate over all negative buckets in descending order (starting next to the zero bucket and going down).

func (*FloatHistogram) NegativeReverseBucketIterator

func (h *FloatHistogram) NegativeReverseBucketIterator() BucketIterator[float64]

NegativeReverseBucketIterator returns a BucketIterator to iterate over all negative buckets in ascending order (starting at the lowest bucket and going up towards the zero bucket).

func (*FloatHistogram) PositiveBucketIterator

func (h *FloatHistogram) PositiveBucketIterator() BucketIterator[float64]

PositiveBucketIterator returns a BucketIterator to iterate over all positive buckets in ascending order (starting next to the zero bucket and going up).

func (*FloatHistogram) PositiveReverseBucketIterator

func (h *FloatHistogram) PositiveReverseBucketIterator() BucketIterator[float64]

PositiveReverseBucketIterator returns a BucketIterator to iterate over all positive buckets in descending order (starting at the highest bucket and going down towards the zero bucket).

func (*FloatHistogram) ReduceResolution

func (h *FloatHistogram) ReduceResolution(targetSchema int32) *FloatHistogram

ReduceResolution reduces the float histogram's spans, buckets into target schema. The target schema must be smaller than the current float histogram's schema.

func (*FloatHistogram) Size

func (h *FloatHistogram) Size() int

Size returns the total size of the FloatHistogram, which includes the size of the pointer to FloatHistogram, all its fields, and all elements contained in slices. NOTE: this is only valid for 64 bit architectures.

func (*FloatHistogram) String

func (h *FloatHistogram) String() string

String returns a string representation of the Histogram.

func (*FloatHistogram) Sub

Sub works like Add but subtracts the other histogram.

func (*FloatHistogram) TestExpression

func (h *FloatHistogram) TestExpression() string

TestExpression returns the string representation of this histogram as it is used in the internal PromQL testing framework as well as in promtool rules unit tests. The syntax is described in https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/#series

func (*FloatHistogram) Validate

func (h *FloatHistogram) Validate() error

Validate validates consistency between span and bucket slices. Also, buckets are checked against negative values. We do not check for h.Count being at least as large as the sum of the counts in the buckets because floating point precision issues can create false positives here.

func (*FloatHistogram) ZeroBucket

func (h *FloatHistogram) ZeroBucket() Bucket[float64]

ZeroBucket returns the zero bucket.

type Histogram

type Histogram struct {
	// Counter reset information.
	CounterResetHint CounterResetHint
	// Currently valid schema numbers are -4 <= n <= 8.  They are all for
	// base-2 bucket schemas, where 1 is a bucket boundary in each case, and
	// then each power of two is divided into 2^n logarithmic buckets.  Or
	// in other words, each bucket boundary is the previous boundary times
	// 2^(2^-n).
	Schema int32
	// Width of the zero bucket.
	ZeroThreshold float64
	// Observations falling into the zero bucket.
	ZeroCount uint64
	// Total number of observations.
	Count uint64
	// Sum of observations. This is also used as the stale marker.
	Sum float64
	// Spans for positive and negative buckets (see Span below).
	PositiveSpans, NegativeSpans []Span
	// Observation counts in buckets. The first element is an absolute
	// count. All following ones are deltas relative to the previous
	// element.
	PositiveBuckets, NegativeBuckets []int64
}

Histogram encodes a sparse, high-resolution histogram. See the design document for full details: https://docs.google.com/document/d/1cLNv3aufPZb3fNfaJgdaRBZsInZKKIHo9E6HinJVbpM/edit#

The most tricky bit is how bucket indices represent real bucket boundaries. An example for schema 0 (by which each bucket is twice as wide as the previous bucket):

Bucket boundaries →              [-2,-1)  [-1,-0.5) [-0.5,-0.25) ... [-0.001,0.001] ... (0.25,0.5] (0.5,1]  (1,2] ....
                                    ↑        ↑           ↑                  ↑                ↑         ↑      ↑
Zero bucket (width e.g. 0.001) →    |        |           |                  ZB               |         |      |
Positive bucket indices →           |        |           |                          ...     -1         0      1    2    3
Negative bucket indices →  3   2    1        0          -1       ...

Which bucket indices are actually used is determined by the spans.

func GenerateBigTestHistograms

func GenerateBigTestHistograms(numHistograms, numBuckets int) []*Histogram

GenerateBigTestHistograms generates a slice of histograms with given number of buckets each.

func (*Histogram) Compact

func (h *Histogram) Compact(maxEmptyBuckets int) *Histogram

Compact works like FloatHistogram.Compact. See there for detailed explanations.

func (*Histogram) Copy

func (h *Histogram) Copy() *Histogram

Copy returns a deep copy of the Histogram.

func (*Histogram) CopyTo

func (h *Histogram) CopyTo(to *Histogram)

CopyTo makes a deep copy into the given Histogram object. The destination object has to be a non-nil pointer.

func (*Histogram) CumulativeBucketIterator

func (h *Histogram) CumulativeBucketIterator() BucketIterator[uint64]

CumulativeBucketIterator returns a BucketIterator to iterate over a cumulative view of the buckets. This method currently only supports Histograms without negative buckets and panics if the Histogram has negative buckets. It is currently only used for testing.

func (*Histogram) Equals

func (h *Histogram) Equals(h2 *Histogram) bool

Equals returns true if the given histogram matches exactly. Exact match is when there are no new buckets (even empty) and no missing buckets, and all the bucket values match. Spans can have different empty length spans in between, but they must represent the same bucket layout to match. Sum is compared based on its bit pattern because this method is about data equality rather than mathematical equality.

func (*Histogram) NegativeBucketIterator

func (h *Histogram) NegativeBucketIterator() BucketIterator[uint64]

NegativeBucketIterator returns a BucketIterator to iterate over all negative buckets in descending order (starting next to the zero bucket and going down).

func (*Histogram) PositiveBucketIterator

func (h *Histogram) PositiveBucketIterator() BucketIterator[uint64]

PositiveBucketIterator returns a BucketIterator to iterate over all positive buckets in ascending order (starting next to the zero bucket and going up).

func (*Histogram) ReduceResolution

func (h *Histogram) ReduceResolution(targetSchema int32) *Histogram

ReduceResolution reduces the histogram's spans, buckets into target schema. The target schema must be smaller than the current histogram's schema.

func (*Histogram) String

func (h *Histogram) String() string

String returns a string representation of the Histogram.

func (*Histogram) ToFloat

func (h *Histogram) ToFloat(fh *FloatHistogram) *FloatHistogram

ToFloat returns a FloatHistogram representation of the Histogram. It is a deep copy (e.g. spans are not shared). The function accepts a FloatHistogram as an argument whose memory will be reused and overwritten if provided. If this argument is nil, a new FloatHistogram will be allocated.

func (*Histogram) Validate

func (h *Histogram) Validate() error

Validate validates consistency between span and bucket slices. Also, buckets are checked against negative values. For histograms that have not observed any NaN values (based on IsNaN(h.Sum) check), a strict h.Count = nCount + pCount + h.ZeroCount check is performed. Otherwise, only a lower bound check will be done (h.Count >= nCount + pCount + h.ZeroCount), because NaN observations do not increment the values of buckets (but they do increment the total h.Count).

func (*Histogram) ZeroBucket

func (h *Histogram) ZeroBucket() Bucket[uint64]

ZeroBucket returns the zero bucket.

type InternalBucketCount

type InternalBucketCount interface {
	float64 | int64
}

InternalBucketCount is used internally by Histogram and FloatHistogram. The difference to the BucketCount above is that Histogram internally uses deltas between buckets rather than absolute counts (while FloatHistogram uses absolute counts directly). Go type parameters don't allow type specialization. Therefore, where special treatment of deltas between buckets vs. absolute counts is important, this information has to be provided as a separate boolean parameter "deltaBuckets".

type Span

type Span struct {
	// Gap to previous span (always positive), or starting index for the 1st
	// span (which can be negative).
	Offset int32
	// Length of the span.
	Length uint32
}

A Span defines a continuous sequence of buckets.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL