Documentation ¶
Overview ¶
Package bayesian implements Bayesian analysis for detecting change points.
Index ¶
- Constants
- func Lgamma(x float64) float64
- type BetaDistribution
- type ChangepointPredictor
- func (a ChangepointPredictor) ChangePointPositionDistribution(history []*inputbuffer.Run) *model.PositionDistribution
- func (a ChangepointPredictor) ChangePoints(history []*inputbuffer.Run, tail float64) []inputbuffer.ChangePoint
- func (a ChangepointPredictor) FindBestChangepoint(history []*inputbuffer.Run) (relativeLikelihood float64, position int)
- type SequenceLikelihood
- func NewSequenceLikelihood(prior BetaDistribution) SequenceLikelihood
- func (s SequenceLikelihood) LogLikelihood(x, n int) float64
- func (s SequenceLikelihood) LogLikelihoodPartial(x, n int, maxProportion float64) float64
Constants ¶
const (
	// Confidence interval tail is the default tail value for confidence analysis.
	// 0.005 tail gives 99% confidence interval.
	ConfidenceIntervalTail = 0.005
)
Variables ¶
This section is empty.
Functions ¶
func Lgamma ¶
func Lgamma(x float64) float64
Types ¶
type BetaDistribution ¶
type BetaDistribution struct {
	// Alpha parameter of Beta distribution.
	// Must be greater than 0.
	Alpha float64

	// Beta parameter of the Beta distribution.
	// Must be greater than 0.
	Beta float64
}
BetaDistribution specifies the parameters of a Beta distribution. See https://en.wikipedia.org/wiki/Beta_distribution.
In this package, we use the Beta distribution as the prior for a test's propensity to produce unexpected results, because it simplifies the maths.
Alpha = 1, Beta = 1 indicates an uninformative (uniform) prior.
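As a minimal sketch, the uninformative prior above can be written as a literal. The import path is an assumption about where this package lives and may need adjusting:

package example

import "go.chromium.org/luci/analysis/internal/changepoints/bayesian"

// uniformPrior is the uninformative prior Beta(1, 1) described above: it
// expresses no preference for any unexpected-result rate between 0 and 1.
// The import path above is assumed, not confirmed by this documentation.
var uniformPrior = bayesian.BetaDistribution{
	Alpha: 1.0,
	Beta:  1.0,
}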
type ChangepointPredictor ¶
type ChangepointPredictor struct {
	// Threshold for creating new change points.
	ChangepointLikelihood float64

	// The prior for the rate at which a test's runs have any
	// unexpected test result.
	// This is the prior for estimating the ratio
	// HasUnexpected / Runs of a segment.
	//
	// Generally tests tend to be either consistently passing or
	// consistently failing, with a bias towards consistently
	// passing, so shape parameters Alpha < 1, Beta < 1, Alpha < Beta
	// are typically selected (e.g. alpha = 0.3, beta = 0.5).
	HasUnexpectedPrior BetaDistribution

	// The prior for the rate at which a test's runs have
	// only unexpected results, given they have at least
	// two results and one is unexpected.
	//
	// This is the prior for estimating UnexpectedAfterRetry / Retried.
	// Generally the result of retrying a fail inside a test run
	// either leads to a pass (fairly consistently) or another failure
	// (fairly consistently). Consequently, shape parameters Alpha < 1,
	// Beta < 1 are advised (e.g. alpha = 0.5, beta = 0.5).
	UnexpectedAfterRetryPrior BetaDistribution
}
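As a hedged configuration sketch, a predictor using the shape parameters suggested in the field comments might be constructed as follows. The import path and the 0.01 likelihood threshold are illustrative assumptions, not values prescribed by this package:

package example

import "go.chromium.org/luci/analysis/internal/changepoints/bayesian"

// newPredictor builds a ChangepointPredictor with the shape parameters
// suggested in the field documentation. The import path and the 0.01
// threshold are assumptions for illustration only.
func newPredictor() bayesian.ChangepointPredictor {
	return bayesian.ChangepointPredictor{
		// Threshold for creating new change points (illustrative value).
		ChangepointLikelihood: 0.01,
		// Prior for HasUnexpected / Runs: biased towards consistently
		// passing tests (Alpha < Beta, both less than 1).
		HasUnexpectedPrior: bayesian.BetaDistribution{Alpha: 0.3, Beta: 0.5},
		// Prior for UnexpectedAfterRetry / Retried: retries tend to be
		// consistently passing or consistently failing.
		UnexpectedAfterRetryPrior: bayesian.BetaDistribution{Alpha: 0.5, Beta: 0.5},
	}
}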
func (ChangepointPredictor) ChangePointPositionDistribution ¶
func (a ChangepointPredictor) ChangePointPositionDistribution(history []*inputbuffer.Run) *model.PositionDistribution
ChangePointPositionDistribution returns a quantization of the changepoint commit start position probability distribution for the given history, assuming it has exactly one changepoint.
See PositionDistribution for interpretation.
For theory background, please see go/bayesian-changepoint-estimation.
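A minimal usage sketch follows; the import paths are assumptions and populating the run history is out of scope here:

package example

import (
	"go.chromium.org/luci/analysis/internal/changepoints/bayesian"
	"go.chromium.org/luci/analysis/internal/changepoints/inputbuffer"
	"go.chromium.org/luci/analysis/internal/changepoints/model"
)

// positionDistribution sketches a call to ChangePointPositionDistribution.
// The result is only meaningful if history contains exactly one changepoint.
// Import paths are assumed, not confirmed by this documentation.
func positionDistribution(a bayesian.ChangepointPredictor, history []*inputbuffer.Run) *model.PositionDistribution {
	return a.ChangePointPositionDistribution(history)
}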
func (ChangepointPredictor) ChangePoints ¶
func (a ChangepointPredictor) ChangePoints(history []*inputbuffer.Run, tail float64) []inputbuffer.ChangePoint
ChangePoints runs changepoint detection and confidence interval analysis for history. history must be sorted by commit position in ascending order (oldest commit first).
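For illustration, a call using the package's default tail might look like the sketch below (import paths assumed; obtaining and sorting the run history is out of scope):

package example

import (
	"go.chromium.org/luci/analysis/internal/changepoints/bayesian"
	"go.chromium.org/luci/analysis/internal/changepoints/inputbuffer"
)

// detectChangePoints sketches a call to ChangePoints. history must be
// sorted by commit position in ascending order (oldest commit first).
// Import paths are assumed, not confirmed by this documentation.
func detectChangePoints(a bayesian.ChangepointPredictor, history []*inputbuffer.Run) []inputbuffer.ChangePoint {
	// ConfidenceIntervalTail (0.005) yields a 99% confidence interval
	// around each detected change point.
	return a.ChangePoints(history, bayesian.ConfidenceIntervalTail)
}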
func (ChangepointPredictor) FindBestChangepoint ¶
func (a ChangepointPredictor) FindBestChangepoint(history []*inputbuffer.Run) (relativeLikelihood float64, position int)
FindBestChangepoint finds the change point position that maximises the likelihood of observing the given test history.
It returns the position of the change point in the history slice, as well as the change in log-likelihood attributable to the change point, relative to the `no change point` case.
The semantics of the returned position are as follows: a position p means the history is segmented as history[:p] and history[p:]. If the returned position is 0, it means no change point position was better than the `no change point` case.
This method requires the provided history to be sorted by commit position (either ascending or descending is fine). It allows multiple runs to be specified per commit position, by including those runs as adjacent elements in the history slice.
Note that if multiple runs are specified per commit position, the returned position will only ever be between two commit positions in the history, i.e. it holds that history[position-1].CommitPosition != history[position].CommitPosition (or position == 0).
This method assumes a uniform prior for all change point positions, including the no change point case. If we are to bias towards the no change point case, thresholding should be applied to relativeLikelihood before considering the change point real.
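The sketch below illustrates how a caller might interpret the returned values, including the segmentation semantics and the thresholding described above. The threshold parameter is a hypothetical caller-chosen value, and the import paths are assumptions:

package example

import (
	"go.chromium.org/luci/analysis/internal/changepoints/bayesian"
	"go.chromium.org/luci/analysis/internal/changepoints/inputbuffer"
)

// bestSplit finds the best change point and, if it clears the caller's
// threshold, returns the history segmented as history[:position] and
// history[position:]. The threshold is a hypothetical caller-chosen bias
// towards the `no change point` case; import paths are assumed.
func bestSplit(a bayesian.ChangepointPredictor, history []*inputbuffer.Run, threshold float64) (before, after []*inputbuffer.Run, found bool) {
	relativeLikelihood, position := a.FindBestChangepoint(history)
	// position == 0 means no split was better than the `no change point` case.
	// Thresholding relativeLikelihood applies the caller's bias towards
	// `no change point`, since a uniform prior over positions is assumed.
	if position == 0 || relativeLikelihood < threshold {
		return history, nil, false
	}
	return history[:position], history[position:], true
}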
type SequenceLikelihood ¶
type SequenceLikelihood struct {
// contains filtered or unexported fields
}
SequenceLikelihood is used to compute the likelihood of observing a particular sequence of pass/fail results, assuming the failure rate p of the test remains constant over the sequence, and p is unknown but has the known prior Beta(alpha,beta) for some alpha and beta where alpha > 0 and beta > 0.
See go/bayesian-changepoint-estimation for details.
func NewSequenceLikelihood ¶
func NewSequenceLikelihood(prior BetaDistribution) SequenceLikelihood
func (SequenceLikelihood) LogLikelihood ¶
func (s SequenceLikelihood) LogLikelihood(x, n int) float64
LogLikelihood returns the log-likelihood of observing a particular sequence of pass/fail results (Y_0, Y_1 ... Y_{n-1}) of a known length, i.e. it returns log(P(Y)).
See LogLikelihoodPartial for more details.
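A small usage sketch (the Beta(0.3, 0.5) prior follows the suggestion in ChangepointPredictor; the import path is an assumption):

package example

import (
	"fmt"

	"go.chromium.org/luci/analysis/internal/changepoints/bayesian"
)

// exampleLogLikelihood prints the log-likelihood of observing 3 runs with
// unexpected results in a sequence of 10 runs, under a Beta(0.3, 0.5)
// prior on the underlying rate. The import path is assumed.
func exampleLogLikelihood() {
	sl := bayesian.NewSequenceLikelihood(bayesian.BetaDistribution{Alpha: 0.3, Beta: 0.5})
	fmt.Printf("log P(Y) = %f\n", sl.LogLikelihood(3, 10))
}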
func (SequenceLikelihood) LogLikelihoodPartial ¶
func (s SequenceLikelihood) LogLikelihoodPartial(x, n int, maxProportion float64) float64
LogLikelihoodPartial returns the log-likelihood of:
- observing a particular binary sequence Y_0, Y_1 ... Y_{n-1}, knowing its length AND
- concurrently, the underlying proportion p (= P(Y_i = 1)) that led to that sequence being less than or equal to maxProportion.
i.e. the method returns log(P(Y AND p <= maxProportion)).
The following assumptions are made:
- Y ~ BernoulliProcess(p, n), i.e. Y is n independent samples of a Bernoulli process with probability p of observing a 1.
- p ~ Beta(a, b), i.e. p is unknown but taken from a Beta distribution with parameters a and b. If a = b = 1 is provided, the Beta distribution collapses to the uniform (uninformative) prior. In practice, where the test failure rate is being modelled, values of a and b less than 1 make more sense, as tests tend towards being either consistently failing or consistently passing.
maxProportion must be between 0.0 and 1.0 (inclusive), and if it is 1.0, the method returns P(Y).
While the method returns the probability of observing a precise sequence Y, the only statistics needed to evaluate this probability are x (the number of elements that are true) and n (the total), so these are the only values accepted as arguments. (As such, inputs are also constrained to 0 <= x <= n.)
Log-likelihoods are returned, as for larger values of n (e.g. n > 100) the probabilities for any particular sequence can get very small. Working in log-probabilities avoids floating point precision issues.
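For reference, the assumptions above imply the standard Beta-Bernoulli marginal likelihood P(Y) = B(a+x, b+n-x) / B(a, b) in the maxProportion = 1.0 case, where B is the Beta function. The standalone sketch below computes this quantity with the standard library; it illustrates the mathematics and is not this package's implementation:

package example

import "math"

// logBeta returns log B(a, b), the log of the Beta function, using
// log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a+b).
func logBeta(a, b float64) float64 {
	la, _ := math.Lgamma(a)
	lb, _ := math.Lgamma(b)
	lab, _ := math.Lgamma(a + b)
	return la + lb - lab
}

// sequenceLogLikelihood computes log P(Y) for a specific binary sequence
// with x ones out of n trials when p ~ Beta(alpha, beta):
//
//	P(Y) = B(alpha+x, beta+n-x) / B(alpha, beta)
//
// This corresponds to the maxProportion = 1.0 case described above and is
// an independent illustration, not the package's implementation.
func sequenceLogLikelihood(x, n int, alpha, beta float64) float64 {
	return logBeta(alpha+float64(x), beta+float64(n-x)) - logBeta(alpha, beta)
}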