README ¶
Health Checking
Health Check Types
Readiness
Readiness is a special type of health check. Readiness checks will only run until they pass for the first time. After a readiness check passes, it will never be run again. These checks are typically used to indicate that the startup of a component has finished.
Health
Health checks typically indicate that a component is operating as expected. The health of a component may flip due to any arbitrary heuristic the component exposes.
Liveness
Liveness checks are intended to indicate that a component has become unhealthy and has no way to recover.
Naming and Tags
All registered checks must have a unique name which will be included in the health check results.
Additionally, checks can optionally specify an arbitrary number of tags which can be used to group health checks together.
Special Tags
- "all" (AllTag) is a tag that is automatically added to every registered check.
- "application" (ApplicationTag) checks are globally applicable: they are included in every response, so application-wide health checks cannot be filtered out.
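One way to read the tag rules above is as a matching function. The following is a minimal sketch, not the package's implementation: matches is a hypothetical helper, and treating an empty query as "return everything" is an assumption.

```go
package main

import "fmt"

const (
	AllTag         = "all"
	ApplicationTag = "application"
)

// matches is a hypothetical helper that reports whether a check registered
// with checkTags would be included in a query for queryTags.
func matches(checkTags, queryTags []string) bool {
	// Every registered check implicitly carries the "all" tag.
	tags := map[string]bool{AllTag: true}
	for _, t := range checkTags {
		tags[t] = true
	}
	// Application checks act as if they specified every tag.
	if tags[ApplicationTag] {
		return true
	}
	// Assumption: a query without tags selects every check.
	if len(queryTags) == 0 {
		return true
	}
	for _, q := range queryTags {
		if tags[q] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matches([]string{"db"}, []string{"db"}))
	fmt.Println(matches([]string{"db"}, []string{"network"}))
	fmt.Println(matches([]string{ApplicationTag}, []string{"network"}))
}
```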
Health Check Worker
Readiness, Health, and Liveness checks are each run by their own health check worker.
A health check worker starts a goroutine that updates the health of all registered checks every freq. By default, freq is set to 30s.
When a health check is added, it always initially reports as unhealthy.
Every health check runs in its own goroutine to maximize concurrency. It is guaranteed that no locks from the health checker are held during the execution of the health check.
When the health check worker is stopped, it will finish executing any currently running health checks and then terminate its primary goroutine. After the health check worker is stopped, the health checks will never run again.
Documentation ¶
Index ¶
- Constants
- func AwaitAlive(ctx context.Context, c Client, freq time.Duration, tags []string, ...) (bool, error)
- func AwaitHealthy(ctx context.Context, c Client, freq time.Duration, tags []string, ...) (bool, error)
- func AwaitReady(ctx context.Context, c Client, freq time.Duration, tags []string, ...) (bool, error)
- func NewGetAndPostHandler(log logging.Logger, reporter Reporter) (http.Handler, error)
- func NewGetHandler(reporter func(tags ...string) (map[string]Result, bool)) http.Handler
- type APIArgs
- type APIReply
- type Checker
- type CheckerFunc
- type Client
- type Health
- type Registerer
- type Reporter
- type Result
- type Service
Constants ¶
const (
	// AllTag is automatically added to every registered check.
	AllTag = "all"
	// ApplicationTag checks will act as if they specified every tag that has
	// been registered.
	// Registering a health check with this tag will ensure that it is always
	// included in all health query results.
	ApplicationTag = "application"
)
Variables ¶
This section is empty.
Functions ¶
func AwaitAlive ¶ added in v1.9.5
func AwaitAlive(ctx context.Context, c Client, freq time.Duration, tags []string, options ...rpc.Option) (bool, error)
AwaitAlive polls the node every [freq] until the node reports liveness. Only returns an error if [ctx] returns an error.
func AwaitHealthy ¶ added in v1.9.5
func AwaitHealthy(ctx context.Context, c Client, freq time.Duration, tags []string, options ...rpc.Option) (bool, error)
AwaitHealthy polls the node every [freq] until the node reports healthy. Only returns an error if [ctx] returns an error.
func AwaitReady ¶ added in v1.9.5
func AwaitReady(ctx context.Context, c Client, freq time.Duration, tags []string, options ...rpc.Option) (bool, error)
AwaitReady polls the node every [freq] until the node reports ready. Only returns an error if [ctx] returns an error.
func NewGetAndPostHandler ¶
NewGetAndPostHandler returns a health handler that supports GET and jsonrpc POST requests.
Types ¶
type APIArgs ¶ added in v1.10.0
type APIArgs struct {
Tags []string `json:"tags"`
}
APIArgs holds the arguments for Readiness, Health, and Liveness.
type Checker ¶
type Checker interface {
	// HealthCheck returns health check results and, if not healthy, a non-nil
	// error
	//
	// It is expected that the results are json marshallable.
	HealthCheck(context.Context) (interface{}, error)
}
Checker can have its health checked.
type CheckerFunc ¶
func (CheckerFunc) HealthCheck ¶
func (f CheckerFunc) HealthCheck(ctx context.Context) (interface{}, error)
type Client ¶
type Client interface {
	// Readiness returns if the node has finished initialization
	Readiness(ctx context.Context, tags []string, options ...rpc.Option) (*APIReply, error)
	// Health returns a summation of the health of the node
	Health(ctx context.Context, tags []string, options ...rpc.Option) (*APIReply, error)
	// Liveness returns if the node is in need of a restart
	Liveness(ctx context.Context, tags []string, options ...rpc.Option) (*APIReply, error)
}
Client is the interface for the Avalanche Health API endpoint. For helpers that wait for Readiness, Health, or Liveness, see AwaitReady, AwaitHealthy, and AwaitAlive.
type Health ¶
type Health interface {
	Registerer
	Reporter

	// Start running periodic health checks at the specified frequency.
	// Repeated calls to Start will be no-ops.
	Start(ctx context.Context, freq time.Duration)

	// Stop running periodic health checks. Stop should only be called after
	// Start. Once Stop returns, no more health checks will be executed.
	Stop()
}
Health defines the full health service interface for registering, reporting and refreshing health checks.
func New ¶
func New(log logging.Logger, registerer prometheus.Registerer) (Health, error)
type Registerer ¶
type Registerer interface {
	RegisterReadinessCheck(name string, checker Checker, tags ...string) error
	RegisterHealthCheck(name string, checker Checker, tags ...string) error
	RegisterLivenessCheck(name string, checker Checker, tags ...string) error
}
Registerer defines how to register new components to check the health of.
type Reporter ¶
type Reporter interface {
	Readiness(tags ...string) (map[string]Result, bool)
	Health(tags ...string) (map[string]Result, bool)
	Liveness(tags ...string) (map[string]Result, bool)
}
Reporter returns the current health status.
type Result ¶
type Result struct {
	// Details of the HealthCheck.
	Details interface{} `json:"message,omitempty"`

	// Error is the string representation of the error returned by the failing
	// HealthCheck. The value is nil if the check passed.
	Error *string `json:"error,omitempty"`

	// Timestamp of the last HealthCheck.
	Timestamp time.Time `json:"timestamp,omitempty"`

	// Duration is the amount of time this HealthCheck last took to evaluate.
	Duration time.Duration `json:"duration"`

	// ContiguousFailures the HealthCheck has returned.
	ContiguousFailures int64 `json:"contiguousFailures,omitempty"`

	// TimeOfFirstFailure of the HealthCheck,
	TimeOfFirstFailure *time.Time `json:"timeOfFirstFailure,omitempty"`
}