rules

package
v0.0.1-alpha
Published: Sep 30, 2024 License: Apache-2.0 Imports: 30 Imported by: 0

Documentation

Constants

const (
	KindAlerting  = "alerting"
	KindRecording = "recording"
)

Variables

This section is empty.

Functions

func DefaultEvalIterationFunc

func DefaultEvalIterationFunc(ctx context.Context, g *Group, evalTimestamp time.Time)

DefaultEvalIterationFunc is the default implementation of GroupEvalIterationFunc that is periodically invoked to evaluate the rules in a group at a given point in time and updates Group state and metrics accordingly. Custom GroupEvalIterationFunc implementations are recommended to invoke this function as well, to ensure correct Group state and metrics are maintained.

func GroupKey

func GroupKey(file, name string) string

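GroupKey returns a key that combines the file and the group name. Group names need not be unique across filenames, so the name alone does not identify a group.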
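Why the file is part of the key can be shown with a minimal sketch. The helper `groupKey` below is a hypothetical stand-in, and the `;` separator is an assumption made for illustration; this page does not document the exact key format.

```go
package main

import "fmt"

// groupKey is a hypothetical stand-in for GroupKey. The ";" separator is an
// assumption made for this sketch, not something this page documents.
func groupKey(file, name string) string {
	return file + ";" + name
}

func main() {
	// Two groups share the name "node" but live in different files.
	groups := map[string]string{}
	groups[groupKey("a.yml", "node")] = "from a.yml"
	groups[groupKey("b.yml", "node")] = "from b.yml"

	// Keyed by name alone, the second entry would overwrite the first.
	fmt.Println(len(groups))
}
```

Keying a map by group name alone would silently drop one of the two groups, which is the collision the combined key avoids.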

func NewOriginContext

func NewOriginContext(ctx context.Context, rule RuleDetail) context.Context

NewOriginContext returns a new context with data about the origin attached.

Types

type Alert

type Alert struct {
	State AlertState

	Labels      labels.Labels
	Annotations labels.Labels

	// The value at the last evaluation of the alerting expression.
	Value float64
	// The interval during which the condition of this alert held true.
	// ResolvedAt will be 0 to indicate a still active alert.
	ActiveAt        time.Time
	FiredAt         time.Time
	ResolvedAt      time.Time
	LastSentAt      time.Time
	ValidUntil      time.Time
	KeepFiringSince time.Time
}

Alert is the user-level representation of a single instance of an alerting rule.

type AlertState

type AlertState int

AlertState denotes the state of an active alert.

const (
	// StateInactive is the state of an alert that is neither firing nor pending.
	StateInactive AlertState = iota
	// StatePending is the state of an alert that has been active for less than
	// the configured threshold duration.
	StatePending
	// StateFiring is the state of an alert that has been active for longer than
	// the configured threshold duration.
	StateFiring
)
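The relationship between the three states and the threshold duration can be sketched as follows. This is an illustration only: `classify`, its parameters, and the choice to treat an alert that has been active for exactly the threshold as firing are assumptions, and the sketch ignores KeepFiringFor.

```go
package main

import (
	"fmt"
	"time"
)

type alertState int

const (
	stateInactive alertState = iota
	statePending
	stateFiring
)

// classify is a hypothetical helper. It maps "is the condition true" and
// "for how long" onto the three states, given the threshold (hold) duration.
// Treating activeFor == hold as firing is an assumption of this sketch.
func classify(active bool, activeFor, hold time.Duration) alertState {
	switch {
	case !active:
		return stateInactive
	case activeFor < hold:
		return statePending
	default:
		return stateFiring
	}
}

func main() {
	hold := 5 * time.Minute
	fmt.Println(classify(true, 2*time.Minute, hold) == statePending)
	fmt.Println(classify(true, 7*time.Minute, hold) == stateFiring)
}
```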

func (AlertState) String

func (s AlertState) String() string

type AlertingRule

type AlertingRule struct {
	// contains filtered or unexported fields
}

An AlertingRule generates alerts from its vector expression.

func NewAlertingRule

func NewAlertingRule(
	name string, vec parser.Expr, hold, keepFiringFor time.Duration,
	labels, annotations, externalLabels labels.Labels, externalURL string,
	restored bool, logger log.Logger,
) *AlertingRule

NewAlertingRule constructs a new AlertingRule.

func (*AlertingRule) ActiveAlerts

func (r *AlertingRule) ActiveAlerts() []*Alert

ActiveAlerts returns a slice of active alerts.

func (*AlertingRule) ActiveAlertsCount

func (r *AlertingRule) ActiveAlertsCount() int

func (*AlertingRule) Annotations

func (r *AlertingRule) Annotations() labels.Labels

Annotations returns the annotations of the alerting rule.

func (*AlertingRule) Eval added in v1.4.0

func (r *AlertingRule) Eval(ctx context.Context, queryOffset time.Duration, ts time.Time, query QueryFunc, externalURL *url.URL, limit int) (promql.Vector, error)

Eval evaluates the rule expression and then creates pending alerts and fires or removes previously pending alerts accordingly.

func (*AlertingRule) ForEachActiveAlert

func (r *AlertingRule) ForEachActiveAlert(f func(*Alert))

ForEachActiveAlert runs the given function on each active alert. Use it when the function must operate on the rule's actual alerts rather than on copies. To work on copies instead, get the alerts from ActiveAlerts().

func (*AlertingRule) GetEvaluationDuration

func (r *AlertingRule) GetEvaluationDuration() time.Duration

GetEvaluationDuration returns the time it took to evaluate the alerting rule, as a time.Duration.

func (*AlertingRule) GetEvaluationTimestamp

func (r *AlertingRule) GetEvaluationTimestamp() time.Time

GetEvaluationTimestamp returns the time the evaluation took place.

func (*AlertingRule) Health

func (r *AlertingRule) Health() RuleHealth

Health returns the current health of the alerting rule.

func (*AlertingRule) HoldDuration

func (r *AlertingRule) HoldDuration() time.Duration

HoldDuration returns the hold duration of the alerting rule.

func (*AlertingRule) KeepFiringFor

func (r *AlertingRule) KeepFiringFor() time.Duration

KeepFiringFor returns the duration an alerting rule should keep firing for after resolution.

func (*AlertingRule) Labels

func (r *AlertingRule) Labels() labels.Labels

Labels returns the labels of the alerting rule.

func (*AlertingRule) LastError

func (r *AlertingRule) LastError() error

LastError returns the last error seen by the alerting rule.

func (*AlertingRule) Name

func (r *AlertingRule) Name() string

Name returns the name of the alerting rule.

func (*AlertingRule) NoDependencyRules

func (r *AlertingRule) NoDependencyRules() bool

func (*AlertingRule) NoDependentRules

func (r *AlertingRule) NoDependentRules() bool

func (*AlertingRule) Query

func (r *AlertingRule) Query() parser.Expr

Query returns the query expression of the alerting rule.

func (*AlertingRule) QueryForStateSeries

func (r *AlertingRule) QueryForStateSeries(ctx context.Context, q storage.Querier) (storage.SeriesSet, error)

QueryForStateSeries returns the series for ALERTS_FOR_STATE of the alert rule.

func (*AlertingRule) Restored

func (r *AlertingRule) Restored() bool

Restored returns the restoration state of the alerting rule.

func (*AlertingRule) SetEvaluationDuration

func (r *AlertingRule) SetEvaluationDuration(dur time.Duration)

SetEvaluationDuration updates evaluationDuration to the duration it took to evaluate the rule on its last evaluation.

func (*AlertingRule) SetEvaluationTimestamp

func (r *AlertingRule) SetEvaluationTimestamp(ts time.Time)

SetEvaluationTimestamp updates evaluationTimestamp to the timestamp of when the rule was last evaluated.

func (*AlertingRule) SetHealth

func (r *AlertingRule) SetHealth(health RuleHealth)

SetHealth sets the current health of the alerting rule.

func (*AlertingRule) SetLastError

func (r *AlertingRule) SetLastError(err error)

SetLastError sets the current error seen by the alerting rule.

func (*AlertingRule) SetNoDependencyRules

func (r *AlertingRule) SetNoDependencyRules(noDependencyRules bool)

func (*AlertingRule) SetNoDependentRules

func (r *AlertingRule) SetNoDependentRules(noDependentRules bool)

func (*AlertingRule) SetRestored

func (r *AlertingRule) SetRestored(restored bool)

SetRestored updates the restoration state of the alerting rule.

func (*AlertingRule) State

func (r *AlertingRule) State() AlertState

State returns the maximum state of alert instances for this rule. StateFiring > StatePending > StateInactive.
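Because the states are declared with iota in increasing order of severity, "maximum state" reduces to an integer comparison. A minimal sketch, using local stand-in names (`alertState`, `maxState`) rather than the package's own types:

```go
package main

import "fmt"

type alertState int

const (
	stateInactive alertState = iota
	statePending
	stateFiring
)

// maxState returns the highest state among the given alert instances.
// With no instances it returns stateInactive.
func maxState(states []alertState) alertState {
	highest := stateInactive
	for _, s := range states {
		if s > highest {
			highest = s
		}
	}
	return highest
}

func main() {
	instances := []alertState{statePending, stateFiring, stateInactive}
	fmt.Println(maxState(instances) == stateFiring)
}
```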

func (*AlertingRule) String

func (r *AlertingRule) String() string

type FileLoader

type FileLoader struct{}

FileLoader is the default GroupLoader implementation. It defers to rulefmt.ParseFile and parser.ParseExpr.

func (FileLoader) Load

func (FileLoader) Load(identifier string) (*rulefmt.RuleGroups, []error)

func (FileLoader) Parse

func (FileLoader) Parse(query string) (parser.Expr, error)

type Group

type Group struct {
	// contains filtered or unexported fields
}

Group is a set of rules that have a logical relation.

func NewGroup added in v1.4.0

func NewGroup(o GroupOptions) *Group

NewGroup makes a new Group from the given GroupOptions, which carry its name, rules, and other settings.

func (*Group) AlertingRules

func (g *Group) AlertingRules() []*AlertingRule

AlertingRules returns the list of the group's alerting rules.

func (*Group) Context

func (g *Group) Context() context.Context

Context returns the group's context.

func (*Group) CopyState

func (g *Group) CopyState(from *Group)

CopyState copies the alerting rule and staleness related state from the given group.

Rules are matched based on their name and labels. If there are duplicates, the first is matched with the first, second with the second etc.
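The matching rule described above (same name and labels, duplicates paired in order) can be sketched like this. The `ruleID` type and `matchRules` function are hypothetical, and labels are reduced to a single string for brevity.

```go
package main

import "fmt"

// ruleID identifies a rule by name and labels. Labels are flattened to a
// string here purely to keep the sketch short.
type ruleID struct {
	Name   string
	Labels string
}

// matchRules returns, for each rule in next, the index of the rule in old
// whose state it would inherit, or -1 if there is none. Duplicates are
// consumed in order: first with first, second with second, and so on.
func matchRules(old, next []ruleID) []int {
	byID := make(map[ruleID][]int)
	for i, r := range old {
		byID[r] = append(byID[r], i)
	}
	matches := make([]int, len(next))
	for i, r := range next {
		candidates := byID[r]
		if len(candidates) == 0 {
			matches[i] = -1
			continue
		}
		matches[i] = candidates[0]
		byID[r] = candidates[1:]
	}
	return matches
}

func main() {
	a := ruleID{Name: "HighLatency", Labels: `{severity="page"}`}
	b := ruleID{Name: "JobDown", Labels: `{severity="page"}`}
	c := ruleID{Name: "DiskFull", Labels: `{severity="warn"}`}
	fmt.Println(matchRules([]ruleID{a, a, b}, []ruleID{a, b, a, c}))
}
```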

func (*Group) Equals

func (g *Group) Equals(ng *Group) bool

Equals reports whether two groups are the same.

func (*Group) Eval added in v1.4.0

func (g *Group) Eval(ctx context.Context, ts time.Time)

Eval runs a single evaluation cycle in which all rules are evaluated. Rules are evaluated sequentially by default; they can be evaluated concurrently if the `concurrent-rule-eval` feature flag is enabled.

func (*Group) EvalTimestamp

func (g *Group) EvalTimestamp(startTime int64) time.Time

EvalTimestamp returns the immediately preceding consistently slotted evaluation time.
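One way to read "consistently slotted" is that each group evaluates at fixed points on the interval grid, shifted by a per-group offset. The sketch below assumes that model. `slotBefore` and its `offset` parameter are hypothetical (how the offset is derived is not documented here), all values are in nanoseconds, and `start` is assumed to be at least `offset`.

```go
package main

import (
	"fmt"
	"time"
)

// slotBefore returns the latest slot at or before start, where slots fall on
// offset, offset+interval, offset+2*interval, and so on. All arguments are
// nanoseconds; start >= offset is assumed.
func slotBefore(start, interval, offset int64) int64 {
	adjusted := start - offset
	base := adjusted - adjusted%interval
	return base + offset
}

func main() {
	interval := int64(time.Minute)
	offset := int64(7 * time.Second) // hypothetical per-group offset
	start := time.Date(2024, 9, 30, 12, 0, 5, 0, time.UTC).UnixNano()

	slot := slotBefore(start, interval, offset)
	fmt.Println(time.Unix(0, slot).UTC())
}
```

Because the result depends only on the interval and the offset, two calls with start times inside the same slot return the same timestamp.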

func (*Group) File

func (g *Group) File() string

File returns the group's file.

func (*Group) GetEvaluationTime

func (g *Group) GetEvaluationTime() time.Duration

GetEvaluationTime returns the time it took to evaluate the rule group, as a time.Duration.

func (*Group) GetLastEvalTimestamp

func (g *Group) GetLastEvalTimestamp() time.Time

GetLastEvalTimestamp returns the timestamp of the last evaluation.

func (*Group) GetLastEvaluation

func (g *Group) GetLastEvaluation() time.Time

GetLastEvaluation returns the time the last evaluation of the rule group took place.

func (*Group) HasAlertingRules

func (g *Group) HasAlertingRules() bool

HasAlertingRules returns true if the group contains at least one AlertingRule.

func (*Group) Interval

func (g *Group) Interval() time.Duration

Interval returns the group's interval.

func (*Group) Limit

func (g *Group) Limit() int

Limit returns the group's limit.

func (*Group) Logger

func (g *Group) Logger() log.Logger

func (*Group) Name

func (g *Group) Name() string

Name returns the group name.

func (*Group) QueryOffset

func (g *Group) QueryOffset() time.Duration

func (*Group) Queryable

func (g *Group) Queryable() storage.Queryable

Queryable returns the group's queryable.

func (*Group) RestoreForState

func (g *Group) RestoreForState(ts time.Time)

RestoreForState restores the 'for' state of the alerts by looking up last ActiveAt from storage.

func (*Group) Rules

func (g *Group) Rules(matcherSets ...[]*labels.Matcher) []Rule

Rules returns the group's rules.

type GroupEvalIterationFunc

type GroupEvalIterationFunc func(ctx context.Context, g *Group, evalTimestamp time.Time)

GroupEvalIterationFunc is used to implement and extend rule group evaluation iteration logic. It is configured in Group.evalIterationFunc, and periodically invoked at each group evaluation interval to evaluate the rules in the group at that point in time. DefaultEvalIterationFunc is the default implementation.
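A custom iteration function typically wraps the default one so that group state and metrics stay correct. The sketch below shows only the wrapping pattern, using local stand-ins: `group`, `evalIterationFunc`, `defaultEvalIteration`, `withTrace`, and `runOnce` are all hypothetical and do not touch the real package.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// group is a stand-in for *Group that only counts evaluations.
type group struct {
	name  string
	evals int
}

// evalIterationFunc mirrors the shape of GroupEvalIterationFunc.
type evalIterationFunc func(ctx context.Context, g *group, evalTimestamp time.Time)

// defaultEvalIteration stands in for DefaultEvalIterationFunc.
func defaultEvalIteration(_ context.Context, g *group, _ time.Time) {
	g.evals++
}

// withTrace adds behaviour around next and always delegates to it.
func withTrace(next evalIterationFunc, trace *[]string) evalIterationFunc {
	return func(ctx context.Context, g *group, ts time.Time) {
		*trace = append(*trace, "start "+g.name)
		next(ctx, g, ts)
		*trace = append(*trace, "end "+g.name)
	}
}

// runOnce invokes f a single time, as the scheduler would on each interval.
func runOnce(f evalIterationFunc, g *group) {
	f(context.Background(), g, time.Now())
}

func main() {
	var trace []string
	g := &group{name: "node"}
	runOnce(withTrace(defaultEvalIteration, &trace), g)
	fmt.Println(g.evals, trace)
}
```

With the real package, a function of this shape would be supplied through GroupOptions.EvalIterationFunc or the groupEvalIterationFunc argument of Manager.LoadGroups and Manager.Update.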

type GroupLoader

type GroupLoader interface {
	Load(identifier string) (*rulefmt.RuleGroups, []error)
	Parse(query string) (parser.Expr, error)
}

GroupLoader is responsible for loading rule groups from arbitrary sources and parsing them.

type GroupOptions

type GroupOptions struct {
	Name, File    string
	Interval      time.Duration
	Limit         int
	Rules         []Rule
	ShouldRestore bool
	Opts          *ManagerOptions
	QueryOffset   *time.Duration

	EvalIterationFunc GroupEvalIterationFunc
	// contains filtered or unexported fields
}

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

The Manager manages recording and alerting rules.

func NewManager

func NewManager(o *ManagerOptions) *Manager

NewManager returns an implementation of Manager, ready to be started by calling the Run method.

func (*Manager) AlertingRules

func (m *Manager) AlertingRules() []*AlertingRule

AlertingRules returns the list of the manager's alerting rules.

func (*Manager) LoadGroups

func (m *Manager) LoadGroups(
	interval time.Duration, externalLabels labels.Labels, externalURL string, groupEvalIterationFunc GroupEvalIterationFunc, filenames ...string,
) (map[string]*Group, []error)

LoadGroups reads groups from a list of files.

func (*Manager) RuleGroups

func (m *Manager) RuleGroups() []*Group

RuleGroups returns the list of manager's rule groups.

func (*Manager) Rules

func (m *Manager) Rules(matcherSets ...[]*labels.Matcher) []Rule

Rules returns the list of the manager's rules.

func (*Manager) Run

func (m *Manager) Run()

Run starts processing of the rule manager. It is blocking.

func (*Manager) Stop

func (m *Manager) Stop()

Stop the rule manager's rule evaluation cycles.

func (*Manager) Update

func (m *Manager) Update(interval time.Duration, files []string, externalLabels labels.Labels, externalURL string, groupEvalIterationFunc GroupEvalIterationFunc) error

Update updates the rule manager's state as the config requires. If loading the new rules fails, the old rule set is restored. This method is a no-op if the manager is already stopped.

type ManagerOptions

type ManagerOptions struct {
	ExternalURL               *url.URL
	QueryFunc                 QueryFunc
	NotifyFunc                NotifyFunc
	Context                   context.Context
	Appendable                storage.Appendable
	Queryable                 storage.Queryable
	Logger                    log.Logger
	Registerer                prometheus.Registerer
	OutageTolerance           time.Duration
	ForGracePeriod            time.Duration
	ResendDelay               time.Duration
	GroupLoader               GroupLoader
	DefaultRuleQueryOffset    func() time.Duration
	MaxConcurrentEvals        int64
	ConcurrentEvalsEnabled    bool
	RuleConcurrencyController RuleConcurrencyController
	RuleDependencyController  RuleDependencyController

	Metrics *Metrics
}

ManagerOptions bundles options for the Manager.

type Metrics

type Metrics struct {
	EvalDuration             prometheus.Summary
	IterationDuration        prometheus.Summary
	IterationsMissed         *prometheus.CounterVec
	IterationsScheduled      *prometheus.CounterVec
	EvalTotal                *prometheus.CounterVec
	EvalFailures             *prometheus.CounterVec
	GroupInterval            *prometheus.GaugeVec
	GroupLastEvalTime        *prometheus.GaugeVec
	GroupLastDuration        *prometheus.GaugeVec
	GroupLastRestoreDuration *prometheus.GaugeVec
	GroupRules               *prometheus.GaugeVec
	GroupSamples             *prometheus.GaugeVec
}

Metrics for rule evaluation.

func NewGroupMetrics

func NewGroupMetrics(reg prometheus.Registerer) *Metrics

NewGroupMetrics creates a new instance of Metrics and registers it with the provided registerer, if not nil.

type NotifyFunc

type NotifyFunc func(ctx context.Context, expr string, alerts ...*Alert)

NotifyFunc sends notifications about a set of alerts generated by the given expression.

func SendAlerts

func SendAlerts(s Sender, externalURL string) NotifyFunc

SendAlerts implements the rules.NotifyFunc for a Notifier.

type QueryFunc

type QueryFunc func(ctx context.Context, q string, t time.Time) (promql.Vector, error)

QueryFunc processes PromQL queries.

func EngineQueryFunc

func EngineQueryFunc(engine promql.QueryEngine, q storage.Queryable) QueryFunc

EngineQueryFunc returns a new query function that executes instant queries against the given engine. It converts scalar into vector results.

type RecordingRule

type RecordingRule struct {
	// contains filtered or unexported fields
}

A RecordingRule records its vector expression into new timeseries.

func NewRecordingRule

func NewRecordingRule(name string, vector parser.Expr, lset labels.Labels) *RecordingRule

NewRecordingRule returns a new recording rule.

func (*RecordingRule) Eval added in v1.4.0

func (rule *RecordingRule) Eval(ctx context.Context, queryOffset time.Duration, ts time.Time, query QueryFunc, _ *url.URL, limit int) (promql.Vector, error)

Eval evaluates the rule and then overrides the metric names and labels accordingly.

func (*RecordingRule) GetEvaluationDuration

func (rule *RecordingRule) GetEvaluationDuration() time.Duration

GetEvaluationDuration returns the time it took to evaluate the recording rule, as a time.Duration.

func (*RecordingRule) GetEvaluationTimestamp

func (rule *RecordingRule) GetEvaluationTimestamp() time.Time

GetEvaluationTimestamp returns the time the evaluation took place.

func (*RecordingRule) Health

func (rule *RecordingRule) Health() RuleHealth

Health returns the current health of the recording rule.

func (*RecordingRule) Labels

func (rule *RecordingRule) Labels() labels.Labels

Labels returns the rule labels.

func (*RecordingRule) LastError

func (rule *RecordingRule) LastError() error

LastError returns the last error seen by the recording rule.

func (*RecordingRule) Name

func (rule *RecordingRule) Name() string

Name returns the rule name.

func (*RecordingRule) NoDependencyRules

func (rule *RecordingRule) NoDependencyRules() bool

func (*RecordingRule) NoDependentRules

func (rule *RecordingRule) NoDependentRules() bool

func (*RecordingRule) Query

func (rule *RecordingRule) Query() parser.Expr

Query returns the rule query expression.

func (*RecordingRule) SetEvaluationDuration

func (rule *RecordingRule) SetEvaluationDuration(dur time.Duration)

SetEvaluationDuration updates evaluationDuration to the duration it took to evaluate the rule on its last evaluation.

func (*RecordingRule) SetEvaluationTimestamp

func (rule *RecordingRule) SetEvaluationTimestamp(ts time.Time)

SetEvaluationTimestamp updates evaluationTimestamp to the timestamp of when the rule was last evaluated.

func (*RecordingRule) SetHealth

func (rule *RecordingRule) SetHealth(health RuleHealth)

SetHealth sets the current health of the recording rule.

func (*RecordingRule) SetLastError

func (rule *RecordingRule) SetLastError(err error)

SetLastError sets the current error seen by the recording rule.

func (*RecordingRule) SetNoDependencyRules

func (rule *RecordingRule) SetNoDependencyRules(noDependencyRules bool)

func (*RecordingRule) SetNoDependentRules

func (rule *RecordingRule) SetNoDependentRules(noDependentRules bool)

func (*RecordingRule) String

func (rule *RecordingRule) String() string

type Rule

type Rule interface {
	Name() string
	// Labels of the rule.
	Labels() labels.Labels
	// Eval evaluates the rule, including any associated recording or alerting actions.
	Eval(ctx context.Context, queryOffset time.Duration, evaluationTime time.Time, queryFunc QueryFunc, externalURL *url.URL, limit int) (promql.Vector, error)
	// String returns a human-readable string representation of the rule.
	String() string
	// Query returns the rule query expression.
	Query() parser.Expr
	// SetLastError sets the current error experienced by the rule.
	SetLastError(error)
	// LastError returns the last error experienced by the rule.
	LastError() error
	// SetHealth sets the current health of the rule.
	SetHealth(RuleHealth)
	// Health returns the current health of the rule.
	Health() RuleHealth
	SetEvaluationDuration(time.Duration)
	// GetEvaluationDuration returns last evaluation duration.
	// NOTE: Used dynamically by rules.html template.
	GetEvaluationDuration() time.Duration
	SetEvaluationTimestamp(time.Time)
	// GetEvaluationTimestamp returns last evaluation timestamp.
	// NOTE: Used dynamically by rules.html template.
	GetEvaluationTimestamp() time.Time

	// SetNoDependentRules sets whether there's no other rule in the rule group that depends on this rule.
	SetNoDependentRules(bool)

	// NoDependentRules returns true if it's guaranteed that in the rule group there's no other rule
	// which depends on this one. In case this function returns false there's no such guarantee, which
	// means there may or may not be other rules depending on this one.
	NoDependentRules() bool

	// SetNoDependencyRules sets whether this rule doesn't depend on the output of any rule in the rule group.
	SetNoDependencyRules(bool)

	// NoDependencyRules returns true if it's guaranteed that this rule doesn't depend on the output of
	// any other rule in the group. In case this function returns false there's no such guarantee, which
	// means the rule may or may not depend on other rules.
	NoDependencyRules() bool
}

A Rule encapsulates a vector expression which is evaluated at a specified interval and acted upon (currently either recorded or used for alerting).

type RuleConcurrencyController

type RuleConcurrencyController interface {
	// Allow determines if the given rule is allowed to be evaluated concurrently.
	// If Allow() returns true, then Done() must be called to release the acquired slot and perform the corresponding cleanup.
	// It is important that neither *Group nor Rule is retained; both must only be used for the duration of the call.
	Allow(ctx context.Context, group *Group, rule Rule) bool

	// Done releases a concurrent evaluation slot.
	Done(ctx context.Context)
}

RuleConcurrencyController controls concurrency for rules that are safe to be evaluated concurrently. Its purpose is to bound the amount of concurrency in rule evaluations to avoid overwhelming the Prometheus server with additional query load. Concurrency is controlled globally, not on a per-group basis.
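The Allow/Done contract amounts to a non-blocking, globally shared semaphore. The sketch below uses a buffered channel for that. `slotController` is hypothetical and simplifies the interface by dropping the context, group, and rule parameters, so it does not satisfy RuleConcurrencyController as written.

```go
package main

import "fmt"

// slotController bounds the number of concurrent evaluations with a
// buffered channel used as a counting semaphore.
type slotController struct {
	slots chan struct{}
}

func newSlotController(max int) *slotController {
	return &slotController{slots: make(chan struct{}, max)}
}

// Allow tries to take a slot without blocking. It returns false when all
// slots are in use, in which case the caller evaluates sequentially.
func (c *slotController) Allow() bool {
	select {
	case c.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Done releases a slot. It must be called once for every Allow that
// returned true, and never otherwise.
func (c *slotController) Done() {
	<-c.slots
}

func main() {
	c := newSlotController(2)
	fmt.Println(c.Allow(), c.Allow(), c.Allow())
	c.Done()
	fmt.Println(c.Allow())
}
```

A single controller shared by every group gives the global bound described above, whereas one controller per group would multiply the permitted concurrency by the number of groups.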

type RuleDependencyController

type RuleDependencyController interface {
	// AnalyseRules analyses dependencies between the input rules. For each rule that is guaranteed
	// not to have any dependents and/or dependencies, this function should call Rule.SetNoDependentRules(true)
	// and/or Rule.SetNoDependencyRules(true).
	AnalyseRules(rules []Rule)
}

RuleDependencyController controls whether a set of rules have dependencies between each other.
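The kind of analysis an implementation might perform can be sketched with a simplified rule model. Everything here is hypothetical: the `rule` struct carries an explicit `Reads` list, whereas a real controller would have to derive the metric names a rule reads from its query expression.

```go
package main

import "fmt"

// rule is a simplified stand-in: Name is the series the rule produces and
// Reads lists the metric names its expression selects.
type rule struct {
	Name              string
	Reads             []string
	NoDependentRules  bool
	NoDependencyRules bool
}

// analyseRules sets both flags on every rule in the group.
func analyseRules(rules []*rule) {
	produced := make(map[string]bool)
	read := make(map[string]bool)
	for _, r := range rules {
		produced[r.Name] = true
		for _, m := range r.Reads {
			read[m] = true
		}
	}
	for _, r := range rules {
		// No rule in the group reads this rule's output.
		r.NoDependentRules = !read[r.Name]
		// This rule reads nothing produced by a rule in the group.
		r.NoDependencyRules = true
		for _, m := range r.Reads {
			if produced[m] {
				r.NoDependencyRules = false
				break
			}
		}
	}
}

func main() {
	a := &rule{Name: "job:up:sum", Reads: []string{"up"}}
	b := &rule{Name: "JobDown", Reads: []string{"job:up:sum"}}
	analyseRules([]*rule{a, b})
	fmt.Println(a.NoDependentRules, a.NoDependencyRules)
	fmt.Println(b.NoDependentRules, b.NoDependencyRules)
}
```

When the analysis cannot be sure, the safe choice is to leave a flag false, since false only means that no guarantee is given.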

type RuleDetail

type RuleDetail struct {
	Name   string
	Query  string
	Labels labels.Labels
	Kind   string

	// NoDependentRules is set to true if it's guaranteed that in the rule group there's no other rule
	// which depends on this one.
	NoDependentRules bool

	// NoDependencyRules is set to true if it's guaranteed that this rule doesn't depend on any other
	// rule within the rule group.
	NoDependencyRules bool
}

RuleDetail contains information about the rule that is being evaluated.

func FromOriginContext

func FromOriginContext(ctx context.Context) RuleDetail

FromOriginContext returns the RuleDetail origin data from the context.
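NewOriginContext and FromOriginContext form a store-and-retrieve pair around a context value. A minimal sketch of that pattern with the standard library follows. The names `ruleDetail`, `originKey`, `newOriginContext`, `fromOriginContext`, and `roundTrip` are stand-ins, and returning a zero value when nothing is attached is an assumption of this sketch.

```go
package main

import (
	"context"
	"fmt"
)

// ruleDetail is a reduced stand-in for RuleDetail.
type ruleDetail struct {
	Name  string
	Query string
	Kind  string
}

// originKey is an unexported key type, so other packages cannot collide
// with or overwrite the stored value.
type originKey struct{}

func newOriginContext(ctx context.Context, d ruleDetail) context.Context {
	return context.WithValue(ctx, originKey{}, d)
}

// fromOriginContext returns the attached detail, or the zero value when the
// context carries none (an assumption of this sketch).
func fromOriginContext(ctx context.Context) ruleDetail {
	if d, ok := ctx.Value(originKey{}).(ruleDetail); ok {
		return d
	}
	return ruleDetail{}
}

// roundTrip attaches d to a fresh context and reads it back.
func roundTrip(d ruleDetail) ruleDetail {
	return fromOriginContext(newOriginContext(context.Background(), d))
}

func main() {
	d := roundTrip(ruleDetail{Name: "JobDown", Query: "up == 0", Kind: "alerting"})
	fmt.Println(d.Name, d.Kind)
	fmt.Println(fromOriginContext(context.Background()) == ruleDetail{})
}
```

A query layer that receives such a context can attribute load to the rule that issued the query, for example in logs.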

func NewRuleDetail

func NewRuleDetail(r Rule) RuleDetail

NewRuleDetail creates a RuleDetail from a given Rule.

type RuleHealth

type RuleHealth string

RuleHealth describes the health state of a rule.

const (
	HealthUnknown RuleHealth = "unknown"
	HealthGood    RuleHealth = "ok"
	HealthBad     RuleHealth = "err"
)

The possible health states of a rule based on the last execution.
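How the health state might follow from the last execution can be sketched as below. `healthAfter` and `errQuery` are hypothetical. The mapping itself (unknown before any evaluation, "err" after a failed one, "ok" after a successful one) is an assumption based on the constant names.

```go
package main

import (
	"errors"
	"fmt"
)

type ruleHealth string

const (
	healthUnknown ruleHealth = "unknown"
	healthGood    ruleHealth = "ok"
	healthBad     ruleHealth = "err"
)

// errQuery is an example error for the demonstration below.
var errQuery = errors.New("query failed")

// healthAfter derives a health state from whether the rule has been
// evaluated yet and from the error of its last evaluation, if any.
func healthAfter(evaluated bool, lastErr error) ruleHealth {
	switch {
	case !evaluated:
		return healthUnknown
	case lastErr != nil:
		return healthBad
	default:
		return healthGood
	}
}

func main() {
	fmt.Println(healthAfter(false, nil), healthAfter(true, nil), healthAfter(true, errQuery))
}
```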

type Sender

type Sender interface {
	Send(alerts ...*notifier.Alert)
}
