stats

package
v0.0.0-...-dca2ff9
Warning

This package is not in the latest version of its module.

Published: Feb 10, 2020 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Constants

const (
	HITS     string = "hits"
	ERRORS          = "errors"
	DURATION        = "duration"
)

Hardcoded measure names, for ease of reference.

Variables

var (
	// DefaultCounts is an array of the measures we represent as Count by default
	DefaultCounts = [...]string{HITS, ERRORS, DURATION}
	// DefaultDistributions is an array of the measures we represent as Distribution by default
	// Not really used right now as we don't have a way to easily add new distros
	DefaultDistributions = [...]string{DURATION}
)

Functions

func EncodePayload

func EncodePayload(w io.Writer, payload *Payload) error

EncodePayload encodes the payload as Gzipped JSON into w.

func FilterTags

func FilterTags(tags, groups []string) []string

FilterTags returns the tags that belong to one of the given groups.

func GrainKey

func GrainKey(name, measure, aggr string) string

GrainKey generates the key used to aggregate counts and distributions. The key has the form name|measure|aggr, for example: serve|duration|service:webserver
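
The key format can be reproduced with a plain string join. The `grainKey` function below is a hypothetical re-implementation written for illustration; it assumes the three parts are joined with "|" exactly as in the example and that no escaping is applied.

```go
package main

import (
	"fmt"
	"strings"
)

// grainKey is a hypothetical sketch of the name|measure|aggr format.
// It assumes the parts are joined verbatim, with no escaping of "|".
func grainKey(name, measure, aggr string) string {
	return strings.Join([]string{name, measure, aggr}, "|")
}

func main() {
	fmt.Println(grainKey("serve", "duration", "service:webserver"))
	// prints "serve|duration|service:webserver"
}
```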

func SetSublayersOnSpan

func SetSublayersOnSpan(span *pb.Span, values []SublayerValue)

SetSublayersOnSpan takes some sublayers and pins them on the given span's Metrics map.

func SplitTag

func SplitTag(tag string) (group, value string)

SplitTag splits the tag into group and value. If it doesn't have a separator the empty string will be used for the group.

func TagGroup

func TagGroup(tag string) string

TagGroup will return the tag group from the given string. For example, "host:abc" => "host"
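
The behaviour described for SplitTag and TagGroup can be sketched as follows. `splitTag` and `tagGroup` are hypothetical helpers; this sketch assumes the separator is ":" and that the split happens at the first occurrence, which the documentation does not state explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTag is a hypothetical sketch: it splits at the first ":" (an assumption)
// and returns an empty group when there is no separator.
func splitTag(tag string) (group, value string) {
	i := strings.Index(tag, ":")
	if i < 0 {
		return "", tag
	}
	return tag[:i], tag[i+1:]
}

// tagGroup returns only the group part of a tag.
func tagGroup(tag string) string {
	group, _ := splitTag(tag)
	return group
}

func main() {
	group, value := splitTag("host:abc")
	fmt.Println(group, value)         // prints "host abc"
	fmt.Println(tagGroup("host:abc")) // prints "host"
}
```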

func Weight

func Weight(s *pb.Span) float64

Weight returns the weight of the span as defined for sampling, i.e. the inverse of the sampling rate.
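
To make "the inverse of the sampling rate" concrete: a span kept with a sampling rate of 0.25 stands for four spans. The helper below is a hypothetical sketch that works on a bare rate rather than a *pb.Span; falling back to a weight of 1 for rates outside (0, 1] is an assumption of this example, not documented behaviour.

```go
package main

import "fmt"

// weightFromRate is a hypothetical helper returning 1/rate.
// Assumption: rates outside (0, 1] are treated as "not sampled" and weigh 1.
func weightFromRate(rate float64) float64 {
	if rate <= 0 || rate > 1 {
		return 1
	}
	return 1 / rate
}

func main() {
	fmt.Println(weightFromRate(0.25)) // prints 4
	fmt.Println(weightFromRate(1))    // prints 1
}
```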

Types

type Bucket

type Bucket struct {
	Start    int64 // Timestamp of start in our format
	Duration int64 // Duration of a bucket in nanoseconds

	// Stats indexed by keys
	Counts           map[string]Count        // All the counts
	Distributions    map[string]Distribution // All the distributions (e.g.: for quantile queries)
	ErrDistributions map[string]Distribution // All the error distributions (e.g.: for apdex, as they account for frustrated)
}

Bucket is a time bucket that tracks statistics across multiple Counts.

func NewBucket

func NewBucket(ts, d int64) Bucket

NewBucket opens a new bucket for time ts and initializes it properly

func (Bucket) IsEmpty

func (sb Bucket) IsEmpty() bool

IsEmpty just says if this stats bucket has no information (in which case it's useless)

type Concentrator

type Concentrator struct {
	In  chan *Input
	Out chan []Bucket
	// contains filtered or unexported fields
}

Concentrator produces time-bucketed statistics from a stream of raw traces. It takes in a huge volume of traces and outputs pre-computed data structures that make it possible to find the gold (stats) among the traces. See https://en.wikipedia.org/wiki/Knelson_concentrator

func NewConcentrator

func NewConcentrator(aggregators []string, bsize int64, out chan []Bucket) *Concentrator

NewConcentrator initializes a new concentrator ready to be started

func (*Concentrator) Flush

func (c *Concentrator) Flush() []Bucket

Flush deletes and returns complete statistic buckets
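
One way to picture "deletes and returns complete buckets" is a map of buckets keyed by start timestamp. The sketch below is hypothetical throughout: the `store` and `bucket` types are invented for the example, and it assumes a bucket is complete once its end time (start plus bucket size) is not later than the current time. The real Concentrator may apply a different rule.

```go
package main

import (
	"fmt"
	"sort"
)

// bucket is a hypothetical, reduced version of a stats bucket.
type bucket struct {
	Start, Duration int64
	Hits            float64
}

// store is a hypothetical holder of open buckets keyed by start timestamp.
type store struct {
	bsize   int64
	buckets map[int64]bucket
}

// flush removes and returns every bucket that ended at or before now,
// ordered by start time. Assumption: "complete" means start+bsize <= now.
func (s *store) flush(now int64) []bucket {
	var done []bucket
	for start, b := range s.buckets {
		if start+s.bsize <= now {
			done = append(done, b)
			delete(s.buckets, start) // deleting while ranging over a map is safe in Go
		}
	}
	sort.Slice(done, func(i, j int) bool { return done[i].Start < done[j].Start })
	return done
}

func main() {
	s := &store{bsize: 10, buckets: map[int64]bucket{
		0:  {Start: 0, Duration: 10, Hits: 3},
		10: {Start: 10, Duration: 10, Hits: 5},
		20: {Start: 20, Duration: 10, Hits: 1},
	}}
	fmt.Println(len(s.flush(20)), len(s.buckets)) // prints "2 1"
}
```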

func (*Concentrator) Run

func (c *Concentrator) Run()

Run runs the main loop of the concentrator goroutine. Traces are received through `Add`; this loop only deals with flushing.

func (*Concentrator) Start

func (c *Concentrator) Start()

Start starts the concentrator.

func (*Concentrator) Stop

func (c *Concentrator) Stop()

Stop stops the main Run loop.

type Count

type Count struct {
	Key     string `json:"key"`
	Name    string `json:"name"`    // the name of the trace/spans we count (was a member of TagSet)
	Measure string `json:"measure"` // represents the entity we count, e.g. "hits", "errors", "time" (was Name)
	TagSet  TagSet `json:"tagset"`  // set of tags for which we account this Count

	TopLevel float64 `json:"top_level"` // number of top-level spans contributing to this count

	Value float64 `json:"value"` // accumulated values
}

Count represents one specific "metric" we track for a given tagset. A count keeps track of the total for a metric during a given time in a certain dimension. By default we keep count of "hits", "errors" and "duration". Others can be added (from the Metrics map in a span), but they have to be enabled manually.

Example: hits between X and X+5s for service:dogweb and resource:dash.list

func NewCount

func NewCount(m, ckey, name string, tgs TagSet) Count

NewCount returns a new Count for a metric and a given tag set

func (Count) Add

func (c Count) Add(v float64) Count

Add adds some values to one count

func (Count) Merge

func (c Count) Merge(c2 Count) Count

Merge is used when two Counts represent the same thing; it adds their Values.
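
Because Add and Merge use value receivers and return a Count, the caller has to keep the returned value; the receiver itself is not modified. The sketch below shows that pattern with a hypothetical, reduced `count` type (only Key and Value), which is an assumption made for brevity and not the package's own implementation.

```go
package main

import "fmt"

// count is a hypothetical, reduced version of Count.
type count struct {
	Key   string
	Value float64
}

// Add returns a copy of c with v added to its value.
func (c count) Add(v float64) count {
	c.Value += v
	return c
}

// Merge returns a copy of c with the value of c2 added.
func (c count) Merge(c2 count) count {
	c.Value += c2.Value
	return c
}

func main() {
	counts := map[string]count{}
	key := "serve|hits|service:webserver"

	// The result must be stored back, since Add works on a copy.
	counts[key] = counts[key].Add(1)
	counts[key] = counts[key].Add(2)

	merged := counts[key].Merge(count{Key: key, Value: 4})
	fmt.Println(counts[key].Value, merged.Value) // prints "3 7"
}
```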

type Distribution

type Distribution struct {
	Key     string `json:"key"`
	Name    string `json:"name"`    // the name of the trace/spans we count (was a member of TagSet)
	Measure string `json:"measure"` // represents the entity we count, e.g. "hits", "errors", "time"
	TagSet  TagSet `json:"tagset"`  // set of tags for which we account this Distribution

	TopLevel float64 `json:"top_level"` // number of top-level spans contributing to this distribution

	Summary *quantile.SliceSummary `json:"summary"` // actual representation of data
}

Distribution represents a true image of the spectrum of values, allowing arbitrary quantile queries. A distribution works the same way Counts do, but instead of accumulating values it keeps track of how the values are distributed. It uses the Greenwald-Khanna online summary algorithm.

A distribution can answer an arbitrary quantile query within a given epsilon. For each "range" of values in our pseudo-histogram we keep a trace ID (a sample) so that we can give the user an example of a trace for a given quantile query.

func NewDistribution

func NewDistribution(m, ckey, name string, tgs TagSet) Distribution

NewDistribution returns a new Distribution for a metric and a given tag set

func (Distribution) Add

func (d Distribution) Add(v float64, sampleID uint64)

Add inserts the proper values in a given distribution from a span

func (Distribution) Copy

func (d Distribution) Copy() Distribution

Copy returns a distro with the same data but a different underlying summary

func (Distribution) Merge

func (d Distribution) Merge(d2 Distribution)

Merge is used when 2 Distributions represent the same thing and it merges the 2 underlying summaries

func (Distribution) Weigh

func (d Distribution) Weigh(weight float64) Distribution

Weigh applies a weight factor to a distribution and returns the result as a new distribution.

type Input

type Input struct {
	Trace     WeightedTrace
	Sublayers SublayerMap
	Env       string
}

Input contains input for the concentrator.

type Payload

type Payload struct {
	HostName string   `json:"hostname"`
	Env      string   `json:"env"`
	Stats    []Bucket `json:"stats"`
}

Payload represents the payload to be flushed to the stats endpoint

type RawBucket

type RawBucket struct {
	// contains filtered or unexported fields
}

RawBucket is used to compute span data and aggregate it within a time-framed bucket. It should not be used outside the agent; use Bucket instead.

func NewRawBucket

func NewRawBucket(ts, d int64) *RawBucket

NewRawBucket opens a new calculation bucket for time ts and initializes it properly

func (*RawBucket) Export

func (sb *RawBucket) Export() Bucket

Export transforms a RawBucket into a Bucket, typically used before communicating data to the API, as RawBucket is the internal type while Bucket is the public, shared one.

func (*RawBucket) HandleSpan

func (sb *RawBucket) HandleSpan(s *WeightedSpan, env string, aggregators []string, sublayers []SublayerValue)

HandleSpan adds the span to this bucket stats, aggregated with the finest grain matching given aggregators

type SublayerMap

type SublayerMap map[*pb.Span][]SublayerValue

SublayerMap maps spans to their sublayer values.

type SublayerValue

type SublayerValue struct {
	Metric string
	Tag    Tag
	Value  float64
}

SublayerValue is just a span-metric placeholder for a given sublayer value.

func ComputeSublayers

func ComputeSublayers(trace pb.Trace) []SublayerValue

ComputeSublayers extracts sublayer values by type and service for a trace

Description of the algorithm, with the following trace as an example:

0  10  20  30  40  50  60  70  80  90 100 110 120 130 140 150
|===|===|===|===|===|===|===|===|===|===|===|===|===|===|===|
<-1------------------------------------------------->
    <-2----------------->       <-3--------->
        <-4--------->
      <-5------------------->
                        <--6-------------------->
                                            <-7------------->

1: service=web-server, type=web,   parent=nil
2: service=pg,         type=db,    parent=1
3: service=render,     type=web,   parent=1
4: service=pg-read,    type=db,    parent=2
5: service=redis,      type=cache, parent=1
6: service=rpc1,       type=rpc,   parent=1
7: service=alert,      type=rpc,   parent=6

Step 1: Find all time intervals to consider (the set of start/end times of spans):

[0, 10, 15, 20, 50, 60, 70, 80, 110, 120, 130, 150]

Step 2: Map each time interval to a set of "active" spans. A span is considered active for a given time interval if it has no direct child span at that time interval. This is done by iterating over the spans, iterating over each time interval, and checking if the span has a child running during that time interval. If not, it is considered active:

{
    0: [ 1 ],
    10: [ 2 ],
    15: [ 2, 5 ],
    20: [ 4, 5 ],
    ...
    110: [ 7 ],
    120: [ 1, 7 ],
    130: [ 7 ],
    150: [],
}

Step 3: Build a service and type duration mapping by:

  1. iterating over each time interval

  2. computing the time interval duration portion (time interval duration / number of active spans)

  3. iterating over each active span of that time interval

  4. adding the duration portion to the active span's type and service durations

    {
        web-server: 15,
        render: 15,
        pg: 12.5,
        pg-read: 15,
        redis: 27.5,
        rpc1: 30,
        alert: 35,
    }
    {
        web: 30,
        cache: 27.5,
        db: 27.5,
        rpc: 65,
    }
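
The steps above can be sketched in a few dozen lines. This is an illustrative re-implementation under stated assumptions, not the package's code: the `span` type is a hypothetical stand-in for pb.Span, timestamps use the unitless values from the diagram, and a span counts as running in an interval when it covers the interval entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical, simplified stand-in for pb.Span.
type span struct {
	id, parent   int // parent == 0 means "no parent"
	start, end   int64
	service, typ string
}

// exampleTrace returns the seven spans of the example trace.
func exampleTrace() []span {
	return []span{
		{1, 0, 0, 130, "web-server", "web"},
		{2, 1, 10, 60, "pg", "db"},
		{3, 1, 80, 110, "render", "web"},
		{4, 2, 20, 50, "pg-read", "db"},
		{5, 1, 15, 70, "redis", "cache"},
		{6, 1, 60, 120, "rpc1", "rpc"},
		{7, 6, 110, 150, "alert", "rpc"},
	}
}

// running reports whether s covers the whole interval [from, to].
func running(s span, from, to int64) bool { return s.start <= from && to <= s.end }

// sublayerDurations returns the duration attributed to each service and type.
func sublayerDurations(trace []span) (byService, byType map[string]float64) {
	// Collect the sorted, de-duplicated start/end timestamps.
	seen := map[int64]bool{}
	var ts []int64
	for _, s := range trace {
		for _, t := range []int64{s.start, s.end} {
			if !seen[t] {
				seen[t] = true
				ts = append(ts, t)
			}
		}
	}
	sort.Slice(ts, func(i, j int) bool { return ts[i] < ts[j] })

	byService, byType = map[string]float64{}, map[string]float64{}
	for i := 0; i+1 < len(ts); i++ {
		from, to := ts[i], ts[i+1]
		// A span is active if it is running and none of its direct children is.
		var active []span
		for _, s := range trace {
			if !running(s, from, to) {
				continue
			}
			hasChild := false
			for _, c := range trace {
				if c.parent == s.id && running(c, from, to) {
					hasChild = true
					break
				}
			}
			if !hasChild {
				active = append(active, s)
			}
		}
		if len(active) == 0 {
			continue
		}
		// Split the interval duration evenly between the active spans.
		portion := float64(to-from) / float64(len(active))
		for _, s := range active {
			byService[s.service] += portion
			byType[s.typ] += portion
		}
	}
	return byService, byType
}

func main() {
	byService, byType := sublayerDurations(exampleTrace())
	fmt.Println(byService)
	fmt.Println(byType)
}
```

The nested loops keep the sketch close to the prose description; they are quadratic in the number of spans per interval, which is fine for an example but may not reflect how the package optimizes this computation.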

func (SublayerValue) GoString

func (v SublayerValue) GoString() string

GoString returns a description of a sublayer value.

func (SublayerValue) String

func (v SublayerValue) String() string

String returns a description of a sublayer value.

type Subtrace

type Subtrace struct {
	Root  *pb.Span
	Trace pb.Trace
}

Subtrace represents the combination of a root span and the trace consisting of all its descendant spans

func ExtractTopLevelSubtraces

func ExtractTopLevelSubtraces(t pb.Trace, root *pb.Span) []Subtrace

ExtractTopLevelSubtraces extracts all subtraces rooted in a top-level span. ComputeTopLevel should be called beforehand.

type Tag

type Tag struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

Tag represents a key / value dimension on traces and stats.

func NewTagFromString

func NewTagFromString(raw string) Tag

NewTagFromString returns a new Tag from a raw string

func (Tag) String

func (t Tag) String() string

String returns a string representation of a tag

type TagSet

type TagSet []Tag

TagSet is a combination of given tags; it is equivalent to the contexts that we use for metrics, although we chose a different terminology here to avoid confusion, and tag sets have no notion of activeness over time. A tag can be:

• one of the fixed ones we defined in the span structure: service, resource and host
• one of the arbitrary metadata keys included in the span (it needs to be turned on manually)

When we track statistics by tag sets, we basically track every tag combination we're interested in to create dimensions, for instance:

  • (service)
  • (service, environment)
  • (service, host)
  • (service, resource, environment)
  • (service, resource)
  • ..

func MergeTagSets

func MergeTagSets(t1, t2 TagSet) TagSet

MergeTagSets merges two tag sets lazily.

func NewTagSetFromString

func NewTagSetFromString(raw string) TagSet

NewTagSetFromString returns a new TagSet from a raw string

func (TagSet) Get

func (t TagSet) Get(name string) Tag

Get returns the tag with the given name.

func (TagSet) HasExactly

func (t TagSet) HasExactly(groups []string) bool

HasExactly returns true if we have tags only for the given groups.

func (TagSet) Key

func (t TagSet) Key() string

Key returns a string representing a new set of tags.

func (TagSet) Len

func (t TagSet) Len() int

func (TagSet) Less

func (t TagSet) Less(i, j int) bool

func (TagSet) Match

func (t TagSet) Match(groups []string) TagSet

Match returns a new tag set with only the tags matching the given groups.

func (TagSet) MatchFilters

func (t TagSet) MatchFilters(filters []string) TagSet

MatchFilters returns a tag set of the tags that match certain filters. A filter is defined as "KEY:VALUE" where:

  • KEY is a non-empty string
  • VALUE is a string (can be empty)

A tag {Name:k, Value:v} from the input tag set will match if:

  • KEY==k and VALUE is non-empty and v==VALUE
  • KEY==k and VALUE is empty (don't care about v)
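
The two matching rules can be expressed compactly. The sketch below is a hypothetical re-implementation on a local `tag` type; it assumes the filter is split at the first ":" and that a filter without any ":" is treated as a KEY with an empty VALUE, neither of which is stated in the documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// tag is a hypothetical stand-in for Tag.
type tag struct{ Name, Value string }

// matchFilters keeps the tags that match at least one "KEY:VALUE" filter.
// Assumptions: the split happens at the first ":", and a filter without
// ":" behaves like "KEY:" (any value matches).
func matchFilters(tags []tag, filters []string) []tag {
	var matched []tag
	for _, t := range tags {
		for _, f := range filters {
			key, val := f, ""
			if i := strings.Index(f, ":"); i >= 0 {
				key, val = f[:i], f[i+1:]
			}
			// An empty VALUE matches any tag value for that KEY.
			if key == t.Name && (val == "" || val == t.Value) {
				matched = append(matched, t)
				break
			}
		}
	}
	return matched
}

func main() {
	tags := []tag{{"service", "web"}, {"host", "a"}, {"env", "prod"}}
	fmt.Println(matchFilters(tags, []string{"service:", "host:b", "env:prod"}))
	// prints "[{service web} {env prod}]"
}
```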

func (TagSet) Swap

func (t TagSet) Swap(i, j int)

func (TagSet) TagKey

func (t TagSet) TagKey(m string) string

TagKey returns a unique key from the string given and the tagset, useful to index stuff on tagsets

func (TagSet) Unset

func (t TagSet) Unset(name string) TagSet

Unset returns a new tag set without the tag of the given name.

type WeightedSpan

type WeightedSpan struct {
	Weight   float64 // Span weight. Similar to the trace root.Weight().
	TopLevel bool    // Is this span a service top-level or not. Similar to span.TopLevel().

	*pb.Span
}

WeightedSpan extends Span to contain weights required by the Concentrator.

type WeightedTrace

type WeightedTrace []*WeightedSpan

WeightedTrace is a slice of WeightedSpan pointers.

func NewWeightedTrace

func NewWeightedTrace(trace pb.Trace, root *pb.Span) WeightedTrace

NewWeightedTrace returns a weighted trace, with the coefficients required by the concentrator.

Directories

Path Synopsis
Package quantile implements "Space-Efficient Online Computation of Quantile Summaries" (Greenwald, Khanna 2001): http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf
