Documentation ¶
Overview ¶
Package summstat allows one to incrementally compute summary statistics for a data set.
It can return an exact median and exact percentiles, but doing so requires the entire dataset to be retained. If the dataset cannot be kept in memory, a subset of it can be stored initially to determine an approximate range of interest, which is then divided into bins whose counts are tracked. This allows one to collect approximate percentile data without the memory overhead.
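To illustrate what "incrementally" can mean here, the following is a minimal sketch of a running accumulator using Welford's algorithm. It is not the package's implementation: the `running` type, its fields, and the choice of the n-1 denominator for the standard deviation are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// running is a hypothetical accumulator, not the package's Stats type.
// It keeps only O(1) state, so no samples need to be retained.
type running struct {
	n        int
	mean, m2 float64 // m2 is the running sum of squared deviations from the mean
	min, max float64
}

// add folds one value into the running statistics (Welford's update).
func (r *running) add(x float64) {
	if r.n == 0 || x < r.min {
		r.min = x
	}
	if r.n == 0 || x > r.max {
		r.max = x
	}
	r.n++
	d := x - r.mean
	r.mean += d / float64(r.n)
	r.m2 += d * (x - r.mean)
}

// stddev returns the sample standard deviation.
// The n-1 denominator is an assumption; the package may use n.
func (r *running) stddev() float64 {
	if r.n < 2 {
		return 0
	}
	return math.Sqrt(r.m2 / float64(r.n-1))
}

func main() {
	var r running
	for _, x := range []float64{2, 4, 4, 4, 5, 5, 7, 9} {
		r.add(x)
	}
	fmt.Println(r.n, r.mean, r.min, r.max, r.stddev())
}
```

Count, mean, min, max, and standard deviation can all be maintained this way. The median and percentiles cannot, which is why they require either the retained samples or the binned approximation.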
Index ¶
- type Sample
- type Stats
- func (s *Stats) AddSample(val Sample)
- func (s *Stats) AddSampleSince(t time.Time)
- func (s *Stats) AddStats(stats *Stats)
- func (s Stats) Bin(i int) (count int, low, high Sample)
- func (s Stats) Count() int
- func (s *Stats) CreateBins(nbins int, low, high Sample)
- func (s *Stats) CreateBinsDiscard(nbins int, discardPct float64)
- func (s Stats) Max() Sample
- func (s Stats) Mean() float64
- func (s Stats) Median() float64
- func (s Stats) Min() Sample
- func (s Stats) NBins() int
- func (s Stats) Percentile(pct float64) Sample
- func (s Stats) Spread() Sample
- func (s Stats) Stddev() float64
Types ¶
type Stats ¶
type Stats struct {
// contains filtered or unexported fields
}
A Stats represents descriptive statistics about Samples which are being added incrementally.
func (*Stats) AddSampleSince ¶
AddSampleSince adds the time duration since time t as a sample.
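A typical use of this kind of method is timing a block of code. The sketch below uses a stand-in `timings` type rather than the real `Stats`, because this page does not show the underlying type of `Sample` or the unit in which the duration is recorded. Both `Sample` as `float64` and seconds as the unit are assumptions, and `timeIt` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// Sample stands in for the package's Sample type; float64 is an assumption.
type Sample float64

// timings is a hypothetical stand-in for Stats that simply keeps the samples.
type timings struct {
	samples []Sample
}

func (r *timings) AddSample(v Sample) {
	r.samples = append(r.samples, v)
}

// AddSampleSince records the time elapsed since t as a sample.
// Recording the value in seconds is an assumption.
func (r *timings) AddSampleSince(t time.Time) {
	r.AddSample(Sample(time.Since(t).Seconds()))
}

// timeIt is a hypothetical helper: it runs f and records how long it took.
func timeIt(r *timings, f func()) {
	start := time.Now()
	f()
	r.AddSampleSince(start)
}

func main() {
	var r timings
	timeIt(&r, func() { time.Sleep(10 * time.Millisecond) })
	fmt.Println(len(r.samples), r.samples[0])
}
```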
func (Stats) Bin ¶
Bin returns the count and the low and high ends of the i-th bin.
The bin interval is (low, high].
func (*Stats) CreateBins ¶
CreateBins divides the sample space into nbins bins for tracking counts.
As samples are added, the count for the corresponding bin will be incremented and the sample value will not be stored.
This saves memory at the expense of granularity. Percentile() and Median() cannot be called after CreateBins() because they are no longer meaningful. Instead, use Bin(i) to inspect the distribution of data by bin. Any existing stored samples are discarded.
The bins created will be:
(-Inf,low], (low, s/nmid+low], (s/nmid+low, 2s/nmid+low], ..., (high,+Inf)
where:
s = high - low
nmid = nbins-2
Thus, the space (high-low) is divided into nbins-2 equally sized pieces, and the remaining two bins extend from -math.MaxFloat64 to low and from high to math.MaxFloat64.
low must be strictly less than high, and nbins must be at least 3: the two outer bins plus at least one bin between low and high.
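The bin layout described above can be sketched as follows. This is an illustration of the documented formula, not the package's code: `binBounds` and `binIndex` are hypothetical helpers, and plain `float64` is used in place of `Sample`.

```go
package main

import (
	"fmt"
	"math"
)

// binBounds returns the bounds of bin i for the layout described above:
// bin 0 ends at low, bin nbins-1 starts at high, and the nbins-2 middle
// bins split (low, high] into equal-width pieces.
func binBounds(i, nbins int, low, high float64) (lo, hi float64) {
	width := (high - low) / float64(nbins-2)
	switch {
	case i == 0:
		return -math.MaxFloat64, low
	case i == nbins-1:
		return high, math.MaxFloat64
	default:
		return low + float64(i-1)*width, low + float64(i)*width
	}
}

// binIndex returns the index of the bin that v falls into.
// Intervals are open on the left and closed on the right.
func binIndex(v float64, nbins int, low, high float64) int {
	if v <= low {
		return 0
	}
	if v > high {
		return nbins - 1
	}
	width := (high - low) / float64(nbins-2)
	i := int(math.Ceil((v - low) / width))
	// Clamp to the middle bins to guard against floating-point rounding.
	if i < 1 {
		i = 1
	}
	if i > nbins-2 {
		i = nbins - 2
	}
	return i
}

func main() {
	const nbins = 6
	for i := 0; i < nbins; i++ {
		lo, hi := binBounds(i, nbins, 0, 100)
		fmt.Printf("bin %d: lower=%g upper=%g\n", i, lo, hi)
	}
	fmt.Println("25 falls in bin", binIndex(25, nbins, 0, 100))
}
```

With nbins = 6, low = 0 and high = 100, there are four middle bins of width 25, so a value of exactly 25 lands in the first middle bin because each interval includes its upper end.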
func (*Stats) CreateBinsDiscard ¶
CreateBinsDiscard is shorthand for calling CreateBins(nbins, low, high) with low set to s.Percentile(discardPct) and high set to s.Percentile(1-discardPct). It first checks that enough samples have been collected to make discardPct meaningful (1/discardPct samples are required).
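As a rough sketch of how the low and high bounds could be derived from the samples retained so far, consider the following. The `percentile` and `discardRange` helpers are hypothetical. The nearest-rank percentile method, the restriction of discardPct to (0, 0.5), and the use of `float64` in place of `Sample` are assumptions, not documented behavior.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank value at fraction p (0 to 1) of samples.
// The nearest-rank method is an assumption; the package may interpolate.
func percentile(samples []float64, p float64) float64 {
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	idx := int(math.Ceil(p*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

// discardRange picks the bounds that would be passed on as low and high,
// after checking that at least 1/discardPct samples are available.
func discardRange(samples []float64, discardPct float64) (low, high float64, err error) {
	if discardPct <= 0 || discardPct >= 0.5 {
		return 0, 0, errors.New("discardPct must be in (0, 0.5)") // assumed restriction
	}
	if float64(len(samples)) < 1/discardPct {
		return 0, 0, errors.New("too few samples for discardPct")
	}
	return percentile(samples, discardPct), percentile(samples, 1-discardPct), nil
}

func main() {
	samples := make([]float64, 0, 100)
	for i := 1; i <= 100; i++ {
		samples = append(samples, float64(i))
	}
	low, high, err := discardRange(samples, 0.05)
	fmt.Println(low, high, err)
}
```

The effect is that roughly discardPct of the samples at each tail fall into the two open-ended outer bins, so the equal-width middle bins cover the bulk of the distribution instead of being stretched by outliers.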
func (Stats) Median ¶
Median returns the median of the samples.
It may not be called after CreateBins, which discards the samples from which the median is calculated.
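The reason the full sample set is needed can be seen in a minimal sketch of an exact median. The `median` helper is hypothetical and is not the package's implementation. Averaging the two middle values for an even count is an assumption, suggested only by the fact that Median returns a float64 rather than a Sample.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the exact median of samples, or 0 for an empty slice.
// It sorts a copy, so every sample must still be available.
func median(samples []float64) float64 {
	n := len(samples)
	if n == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	if n%2 == 1 {
		return sorted[n/2]
	}
	// Even count: average the two middle values (an assumption).
	return (sorted[n/2-1] + sorted[n/2]) / 2
}

func main() {
	fmt.Println(median([]float64{3, 1, 2}))
	fmt.Println(median([]float64{4, 1, 3, 2}))
}
```

Once the samples have been replaced by bin counts, only the bin containing the middle observation can be identified, not its exact value.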
func (Stats) Percentile ¶
Percentile returns the sample value at the given percentile.
It may not be called after CreateBins, which discards the samples from which the percentile is calculated.