Documentation ¶
Index ¶
- Constants
- Variables
- func AreIncidentsFlaky(incidents []Incident, numThreshold int, durationThreshold int64, ...) bool
- type Incident
- type Store
- func (s *Store) AddNote(encodedKey string, note note.Note) (*Incident, error)
- func (s *Store) AlertArrival(m map[string]string) (*Incident, error)
- func (s *Store) Archive(encodedKey string) (*Incident, error)
- func (s *Store) Assign(encodedKey string, user string) (*Incident, error)
- func (s *Store) DeleteNote(encodedKey string, index int) (*Incident, error)
- func (s *Store) GetAll() ([]Incident, error)
- func (s *Store) GetRecentlyResolved() ([]Incident, error)
- func (s *Store) GetRecentlyResolvedForID(id, excludeKey string) ([]Incident, error)
- func (s *Store) GetRecentlyResolvedInRange(d string) ([]Incident, error)
- func (s *Store) GetRecentlyResolvedInRangeWithID(d, id string) ([]Incident, error)
Constants ¶
const (
	ALERT_NAME       = "alertname"
	CATEGORY         = "category"
	SEVERITY         = "severity"
	ID               = "id"
	ASSIGNED_TO      = "assigned_to"
	ABBR             = "abbr"
	OWNER            = "owner"
	ABBR_OWNER_REGEX = "abbr_owner_regex"
	K8S_POD_NAME     = "kubernetes_pod_name"
	COMMITTED_IMAGE  = "committedImage"
	LIVE_IMAGE       = "liveImage"
)
Well known keys for Incident.Params.
const (
	TX_RETRIES                   = 5
	NUM_RECENTLY_RESOLVED        = 20
	NUM_RECENTLY_RESOLVED_FOR_ID = 20
)
const (
	DirtyCommittedK8sImageAlertName = "DirtyCommittedK8sImage"
	StaleK8sImageAlertName          = "StaleK8sImage"
	DirtyRunningK8sConfigAlertName  = "DirtyRunningK8sConfig"
)
Well known alert names.
const DockerImageRegexString = ".+?-(?P<Owner>\\w+?)-\\w+?-(dirty|clean)$"
Matches images like gcr.io/skia-public/autoroll-be:2021-04-30T14_04_37Z-borenet-c3ecfbb-dirty
Variables ¶
var DockerImageRegex = regexp.MustCompile(DockerImageRegexString)
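As a rough illustration of how the named Owner group can be read from a match, here is a minimal sketch. It recompiles the same pattern locally so that it is self-contained; ownerFromImage is a hypothetical helper and is not part of this package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as DockerImageRegexString, recompiled here so the sketch
// is self-contained.
var dockerImageRegex = regexp.MustCompile(".+?-(?P<Owner>\\w+?)-\\w+?-(dirty|clean)$")

// ownerFromImage is a hypothetical helper (not part of the package). It
// returns the Owner capture group, or "" if the image name does not match.
func ownerFromImage(image string) string {
	m := dockerImageRegex.FindStringSubmatch(image)
	if m == nil {
		return ""
	}
	return m[dockerImageRegex.SubexpIndex("Owner")]
}

func main() {
	image := "gcr.io/skia-public/autoroll-be:2021-04-30T14_04_37Z-borenet-c3ecfbb-dirty"
	fmt.Println(ownerFromImage(image)) // prints "borenet"
}
```

An image tag that does not end in -dirty or -clean does not match, so the helper returns an empty string for it.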
Functions ¶
func AreIncidentsFlaky ¶
func AreIncidentsFlaky(incidents []Incident, numThreshold int, durationThreshold int64, durationPercentage float32) bool
AreIncidentsFlaky is a utility function that helps determine whether a slice of incidents is flaky. Flaky incidents are alerts that occasionally show up and go away on their own, with no action taken to resolve them. They are also typically short-lived.
numThreshold is the number of incidents required to have sufficient sample size. If len(incidents) < numThreshold then incidents are determined to be not flaky.
durationThreshold is the duration in seconds below which incidents could be considered to be flaky.
durationPercentage is the minimum fraction of incidents that must have durations below durationThreshold for the incidents to be determined to be flaky. E.g. 0.50 for 50%, 1 for 100%.
Summary: The function uses the following to determine flakiness:
- At least durationPercentage of incidents lasted less than durationThreshold.
- Number of incidents must be >= numThreshold to have sufficient sample size.
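The two criteria in the summary can be sketched as follows. This is not the package's implementation. It assumes that an incident's duration is LastSeen minus Start, and that the incidents count as flaky when the fraction of short incidents is at least durationPercentage. The local incident type and the lower-case function name are stand-ins for illustration.

```go
package main

import "fmt"

// incident mirrors only the two Incident fields this sketch needs.
type incident struct {
	Start    int64 // Seconds since the epoch.
	LastSeen int64 // Seconds since the epoch.
}

// areIncidentsFlaky is an illustrative reimplementation under the
// assumptions stated above, not the package's own code.
func areIncidentsFlaky(incidents []incident, numThreshold int, durationThreshold int64, durationPercentage float32) bool {
	// Too small a sample (or an empty one) is never considered flaky.
	if len(incidents) == 0 || len(incidents) < numThreshold {
		return false
	}
	short := 0
	for _, i := range incidents {
		// Assumption: duration is the time between first and last sighting.
		if i.LastSeen-i.Start < durationThreshold {
			short++
		}
	}
	return float32(short)/float32(len(incidents)) >= durationPercentage
}

func main() {
	incidents := []incident{
		{Start: 0, LastSeen: 60},
		{Start: 1000, LastSeen: 1120},
		{Start: 5000, LastSeen: 9000},
		{Start: 20000, LastSeen: 20030},
	}
	// Three of the four incidents lasted less than 600 seconds.
	fmt.Println(areIncidentsFlaky(incidents, 3, 600, 0.5)) // prints "true"
	// Four incidents are fewer than the required sample size of five.
	fmt.Println(areIncidentsFlaky(incidents, 5, 600, 0.5)) // prints "false"
}
```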
Types ¶
type Incident ¶
type Incident struct {
	Key          string            `json:"key" datastore:"key"`                    // Key is the web-safe serialized Datastore key for the incident.
	ID           string            `json:"id" datastore:"id"`                      // Also appears in Params.
	Active       bool              `json:"active" datastore:"active"`              // Or archived.
	Start        int64             `json:"start" datastore:"start"`                // Time in seconds since the epoch.
	LastSeen     int64             `json:"last_seen" datastore:"last_seen"`        // Time in seconds since the epoch.
	Params       paramtools.Params `json:"params" datastore:"-"`                   // Params
	ParamsSerial string            `json:"-" datastore:"params_serial,noindex"`    // Params serialized as JSON for easy storing in the datastore.
	Notes        []note.Note       `json:"notes" datastore:"notes,flatten"`
}
Incident - An alert that is being acted on.
Each alert has an ID which is the same each time that exact alert is fired. Not to be confused with the Key which is the datastore key for a single incident of an alert firing. There will be many Incidents in the datastore with the same ID, but at most one will be Active.
func (*Incident) IsSilenced ¶
IsSilenced reports whether any of the given silences apply to this incident. Has support for regexes (see skbug.com/9587).
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store stores Incidents in, and retrieves them from, Cloud Datastore.
func NewStore ¶
NewStore creates a new Store.
ds - Datastore client.
ignoredAttr - A list of keys to ignore when calculating an Incident's ID.
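To illustrate what ignoring keys means for an ID, here is a minimal sketch of one way an ID could be derived from alert parameters. The hashing scheme (SHA-256 over sorted key/value pairs) and the idFromParams helper are assumptions made for this example. The package's actual ID calculation may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// idFromParams is a hypothetical helper. It hashes every key/value pair
// whose key is not listed in ignoredAttr, visiting keys in sorted order
// so that the result does not depend on map iteration order.
func idFromParams(params map[string]string, ignoredAttr []string) string {
	ignored := make(map[string]bool, len(ignoredAttr))
	for _, k := range ignoredAttr {
		ignored[k] = true
	}
	keys := make([]string, 0, len(params))
	for k := range params {
		if !ignored[k] {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// A NUL byte separates fields so that adjacent strings cannot
		// run together and collide.
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(params[k]))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := map[string]string{"alertname": "StaleK8sImage", "kubernetes_pod_name": "pod-1"}
	b := map[string]string{"alertname": "StaleK8sImage", "kubernetes_pod_name": "pod-2"}
	ignoredAttr := []string{"kubernetes_pod_name"}
	// The two alerts differ only in an ignored key, so they share an ID.
	fmt.Println(idFromParams(a, ignoredAttr) == idFromParams(b, ignoredAttr)) // prints "true"
}
```

Under this scheme, two alerts that differ only in ignored keys map to the same ID, which is what lets repeated firings of the same alert be grouped.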
func (*Store) AlertArrival ¶
AlertArrival turns alerts into Incidents, or archives Incidents if the arriving state is resolved.
Note that the returned incident can be nil even if the returned error is nil. For example, this can happen when an alert arrives for an incident that is no longer active.
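Because the returned incident can be nil, a caller may want to check it independently of the error. The sketch below shows one defensive calling pattern. The alertStore interface, the fakeStore type and its rules, and the trimmed incident type are all hypothetical stand-ins so that the example runs without Cloud Datastore.

```go
package main

import (
	"errors"
	"fmt"
)

// incident is a trimmed stand-in for Incident.
type incident struct{ ID string }

// alertStore is a hypothetical interface with the same method shape as
// (*Store).AlertArrival.
type alertStore interface {
	AlertArrival(m map[string]string) (*incident, error)
}

// fakeStore is an invented test double. Its rules are made up for this
// sketch and say nothing about how the real Store decides.
type fakeStore struct{}

func (fakeStore) AlertArrival(m map[string]string) (*incident, error) {
	switch m["id"] {
	case "":
		return nil, errors.New("alert has no id")
	case "stale":
		// Simulates an alert for an incident that is no longer active.
		return nil, nil
	}
	return &incident{ID: m["id"]}, nil
}

// describeArrival checks the error first and the nil incident second,
// so it never dereferences a nil pointer.
func describeArrival(s alertStore, m map[string]string) string {
	inc, err := s.AlertArrival(m)
	switch {
	case err != nil:
		return "error: " + err.Error()
	case inc == nil:
		return "no incident to update"
	default:
		return "incident " + inc.ID
	}
}

func main() {
	s := fakeStore{}
	fmt.Println(describeArrival(s, map[string]string{"id": "abc"}))   // prints "incident abc"
	fmt.Println(describeArrival(s, map[string]string{"id": "stale"})) // prints "no incident to update"
}
```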
func (*Store) DeleteNote ¶
func (*Store) GetRecentlyResolved ¶
GetRecentlyResolved returns the N most recently archived Incidents.
func (*Store) GetRecentlyResolvedForID ¶
GetRecentlyResolvedForID returns the N most recently archived Incidents with the given ID, excluding the Incident that matches the given key.
func (*Store) GetRecentlyResolvedInRange ¶
GetRecentlyResolvedInRange returns the most recently archived Incidents in the given range.
d - The range in human units, e.g. "1w".
func (*Store) GetRecentlyResolvedInRangeWithID ¶
GetRecentlyResolvedInRangeWithID returns the most recently archived Incidents with the given ID in the given range.
d - The range in human units, e.g. "1w".
id - The id of the incidents to return.