Documentation ¶
Types ¶
type HealthStatus ¶
type HealthStatus bool
const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)
func (HealthStatus) AsFloat64 ¶
func (h HealthStatus) AsFloat64() float64
func (HealthStatus) String ¶
func (h HealthStatus) String() string
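The two methods above are documented only by their signatures. As a rough sketch of what they might do, assuming AsFloat64 maps Healthy to 1 and Unhealthy to 0 (convenient for a metrics gauge) and String returns a lowercase label, neither of which is confirmed by this page:

```go
package main

import "fmt"

// HealthStatus mirrors the documented type. The method bodies below are
// assumptions for illustration, not the package's actual code.
type HealthStatus bool

const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)

// AsFloat64 maps Healthy to 1 and Unhealthy to 0 (assumed mapping).
func (h HealthStatus) AsFloat64() float64 {
	if h {
		return 1
	}
	return 0
}

// String returns a human-readable label (assumed wording).
func (h HealthStatus) String() string {
	if h {
		return "healthy"
	}
	return "unhealthy"
}

func main() {
	for _, h := range []HealthStatus{Unhealthy, Healthy} {
		fmt.Printf("%v -> %.0f\n", h, h.AsFloat64())
	}
}
```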
type Watcher ¶
type Watcher struct {
	Name string
	C    chan HealthStatus // Writing a health status to C allows you to mark the watcher as healthy or unhealthy.
	// contains filtered or unexported fields
}
Watcher watches for Alertmanager checkins, logs information when the health status changes, and serves an HTTP status page with information about the current state.
The idea is that Watcher acts as an Alertmanager webhook receiver, and you configure an alert that is always firing. When the HTTP handler receives that alert from Alertmanager, the watcher becomes "healthy" for the duration defined by `threshold`. If the alerts stop arriving, the watcher eventually becomes unhealthy and serves that status on the monitoring endpoint. You can then point generic "website down" monitoring at that endpoint and be alerted when you can't receive alerts.
We implement the watcher with a timer that expires `threshold` after the last "healthy" message. This is not strictly necessary, but it lets us log a message at the exact instant that we start serving an "unhealthy" status. The code would be much simpler if we just subtracted the last healthy time from the current time whenever someone asked for the status, but then nothing would be logged at the moment of the transition.
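For comparison, the simpler alternative mentioned above (compute the status on demand from the last healthy time) could look roughly like the following. This is a sketch, not the package's code: `lazyWatcher`, `markHealthy`, `statusAt`, and `statusAfter` are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type HealthStatus bool

const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)

// lazyWatcher is a hypothetical type that keeps no timer. It only remembers
// when the last check-in arrived.
type lazyWatcher struct {
	mu          sync.Mutex
	lastHealthy time.Time
	threshold   time.Duration
}

// markHealthy records a check-in at the given time.
func (w *lazyWatcher) markHealthy(now time.Time) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.lastHealthy = now
}

// statusAt derives the status by subtracting the last check-in from now.
// A watcher that has never seen a check-in is unhealthy.
func (w *lazyWatcher) statusAt(now time.Time) HealthStatus {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.lastHealthy.IsZero() {
		return Unhealthy
	}
	if now.Sub(w.lastHealthy) < w.threshold {
		return Healthy
	}
	return Unhealthy
}

// statusAfter is a small helper for the demo: it checks in once and asks for
// the status after `elapsed` has passed.
func statusAfter(threshold, elapsed time.Duration) HealthStatus {
	start := time.Now()
	w := &lazyWatcher{threshold: threshold}
	w.markHealthy(start)
	return w.statusAt(start.Add(elapsed))
}

func main() {
	fmt.Println("30s after check-in:", bool(statusAfter(time.Minute, 30*time.Second)))
	fmt.Println("2m after check-in: ", bool(statusAfter(time.Minute, 2*time.Minute)))
}
```

The trade-off is visible in the sketch: nothing runs at the moment the threshold passes, so there is nowhere to log the transition.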
func NewWatcher ¶
NewWatcher creates a new watcher in the "unhealthy" state. To mark the status as healthy, send Healthy to watcher.C.
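The signature of NewWatcher is not shown on this page, so the following is only a sketch of the timer-based design described earlier, under stated assumptions. `timerWatcher`, `newTimerWatcher`, `Status`, and `stopAndDrain` are hypothetical names; the real Watcher may be structured differently.

```go
package main

import (
	"fmt"
	"log"
	"sync"
	"time"
)

type HealthStatus bool

const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)

// timerWatcher is a hypothetical stand-in for Watcher.
type timerWatcher struct {
	C      chan HealthStatus
	mu     sync.Mutex
	status HealthStatus // zero value is Unhealthy
}

// newTimerWatcher starts in the unhealthy state and runs its event loop in a
// goroutine. The goroutine is never stopped in this sketch.
func newTimerWatcher(threshold time.Duration) *timerWatcher {
	w := &timerWatcher{C: make(chan HealthStatus)}
	go w.loop(threshold)
	return w
}

func (w *timerWatcher) loop(threshold time.Duration) {
	timer := time.NewTimer(threshold)
	stopAndDrain(timer) // no expiry is pending until the first check-in
	for {
		select {
		case h := <-w.C:
			stopAndDrain(timer)
			if h == Healthy {
				timer.Reset(threshold)
			}
			w.set(h)
		case <-timer.C:
			w.set(Unhealthy) // logged at the instant of the transition
		}
	}
}

// set stores the status and logs only when it changes.
func (w *timerWatcher) set(h HealthStatus) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.status != h {
		log.Printf("health changed: healthy=%v", bool(h))
	}
	w.status = h
}

func (w *timerWatcher) Status() HealthStatus {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.status
}

// stopAndDrain stops t and discards a tick that may already be queued.
func stopAndDrain(t *time.Timer) {
	if !t.Stop() {
		select {
		case <-t.C:
		default:
		}
	}
}

func main() {
	w := newTimerWatcher(100 * time.Millisecond)
	fmt.Println("initial:", bool(w.Status()))
	w.C <- Healthy
	time.Sleep(10 * time.Millisecond)
	fmt.Println("after check-in:", bool(w.Status()))
	time.Sleep(300 * time.Millisecond)
	fmt.Println("after silence:", bool(w.Status()))
}
```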
func (*Watcher) HandleAlertmanagerPing ¶
func (w *Watcher) HandleAlertmanagerPing(wr http.ResponseWriter, req *http.Request)
HandleAlertmanagerPing is an http.HandlerFunc that accepts alerts from Alertmanager via its webhook API, and sets the watcher status to Healthy if the alert is well-formed.
A bad request does not change the status to unhealthy; only the timer does that.
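A handler with this behavior might be sketched as follows. What counts as "well-formed" is not specified on this page; the sketch assumes it means the body decodes as an Alertmanager webhook message with status "firing" and at least one alert. `pingHandler`, `wellFormed`, and `webhookMessage` are hypothetical names, and only the `status` and `alerts` fields of the webhook payload are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

type HealthStatus bool

const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)

// webhookMessage decodes only the parts of the payload this sketch checks.
type webhookMessage struct {
	Status string `json:"status"`
	Alerts []struct {
		Status string `json:"status"`
	} `json:"alerts"`
}

// wellFormed applies the assumed definition: valid JSON, firing, and at
// least one alert.
func wellFormed(body []byte) bool {
	var m webhookMessage
	if err := json.Unmarshal(body, &m); err != nil {
		return false
	}
	return m.Status == "firing" && len(m.Alerts) > 0
}

// pingHandler sends Healthy on c for well-formed alerts. A bad request is
// rejected without touching the status.
func pingHandler(c chan<- HealthStatus) http.HandlerFunc {
	return func(wr http.ResponseWriter, req *http.Request) {
		body, err := io.ReadAll(http.MaxBytesReader(wr, req.Body, 1<<20))
		if err != nil || !wellFormed(body) {
			http.Error(wr, "malformed alert", http.StatusBadRequest)
			return
		}
		c <- Healthy
		wr.WriteHeader(http.StatusOK)
	}
}

func main() {
	c := make(chan HealthStatus, 1) // buffered so the demo needs no reader goroutine
	payload := `{"version":"4","status":"firing","alerts":[{"status":"firing"}]}`
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	pingHandler(c).ServeHTTP(rec, req)
	fmt.Println(rec.Code, bool(<-c))
}
```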
func (*Watcher) HandleHealthCheck ¶
func (w *Watcher) HandleHealthCheck(wr http.ResponseWriter, req *http.Request)
HandleHealthCheck is an http.HandlerFunc that accepts HTTP requests from an external "website monitoring" service, and returns a 200 status code if Alertmanager is healthy, or a 500 Internal Server Error code if Alertmanager is unhealthy. We choose 500 because it is most likely to be treated as an error by your health-checking service, even though it is technically incorrect: this server has not hit an internal error, it is reporting on Alertmanager.
This endpoint is NOT intended to be a liveness/readiness/health probe for the alertmanager-status service itself!
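The status-to-code mapping described above can be sketched like this. The real handler reads the status from the Watcher's unexported fields; here a `current` callback stands in for that, and `statusCode` and `healthCheckHandler` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

type HealthStatus bool

const (
	Unhealthy HealthStatus = false
	Healthy   HealthStatus = true
)

// statusCode maps the watcher status to the HTTP code served to the
// external monitoring service.
func statusCode(h HealthStatus) int {
	if h == Healthy {
		return http.StatusOK
	}
	return http.StatusInternalServerError
}

// healthCheckHandler reports Alertmanager's health, not this server's own.
// The response body text is an assumption.
func healthCheckHandler(current func() HealthStatus) http.HandlerFunc {
	return func(wr http.ResponseWriter, req *http.Request) {
		h := current()
		wr.WriteHeader(statusCode(h))
		fmt.Fprintf(wr, "alertmanager healthy: %v\n", bool(h))
	}
}

func main() {
	for _, h := range []HealthStatus{Healthy, Unhealthy} {
		h := h
		rec := httptest.NewRecorder()
		req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
		healthCheckHandler(func() HealthStatus { return h }).ServeHTTP(rec, req)
		fmt.Println(rec.Code)
	}
}
```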
func (*Watcher) HandleLiveness ¶
func (w *Watcher) HandleLiveness(wr http.ResponseWriter, req *http.Request)
HandleLiveness is an http.HandlerFunc that returns status 200 if the event loop is not locked up. This works as a liveness probe and a readiness probe.
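One common way to detect a locked-up event loop is to send it a request and wait for the reply with a timeout. The sketch below assumes that approach; this page does not say how the package does it. `loop`, `alive`, and `livenessHandler` are hypothetical names, and the 503 returned on failure is an assumption, since only the 200 case is documented.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// loop is a hypothetical event loop that answers pings.
type loop struct {
	ping chan chan struct{}
}

// run answers pings until stop is closed. A real loop would also handle
// health updates and timer expiry in the same select.
func (l *loop) run(stop <-chan struct{}) {
	for {
		select {
		case reply := <-l.ping:
			close(reply)
		case <-stop:
			return
		}
	}
}

// alive reports whether the loop accepted and answered a ping in time.
func (l *loop) alive(timeout time.Duration) bool {
	reply := make(chan struct{})
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case l.ping <- reply:
	case <-timer.C:
		return false
	}
	select {
	case <-reply:
		return true
	case <-timer.C:
		return false
	}
}

func livenessHandler(l *loop) http.HandlerFunc {
	return func(wr http.ResponseWriter, req *http.Request) {
		if l.alive(time.Second) {
			wr.WriteHeader(http.StatusOK)
			return
		}
		wr.WriteHeader(http.StatusServiceUnavailable) // assumed failure code
	}
}

func main() {
	l := &loop{ping: make(chan chan struct{})}
	fmt.Println("before loop starts:", l.alive(50*time.Millisecond))
	stop := make(chan struct{})
	go l.run(stop)
	rec := httptest.NewRecorder()
	livenessHandler(l).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/livez", nil))
	fmt.Println("with loop running:", rec.Code)
	close(stop)
}
```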