health

v0.8.0
Published: Oct 17, 2023 License: MIT Imports: 7 Imported by: 91

README

Health

A simple and flexible health check library for Go.

Table of Contents

  1. Getting Started
  2. Synchronous vs. Asynchronous Checks
  3. Caching
  4. Listening to Status Changes
  5. Middleware and Interceptors
  6. Compatibility With Other Libraries
  7. License

Getting Started

This library provides an http.Handler that acts as a health endpoint. It can be used by cloud infrastructure or other services to determine the availability of an application.

Rather than simply returning a response with HTTP status code 200, this library allows building health checks that test the availability of all required dependencies. The HTTP response contains the aggregated health result and details about the health status of each component.

Example

package main

import (
	"context"
	"database/sql"
	"fmt"
	"github.com/alexliesenfeld/health"
	_ "github.com/mattn/go-sqlite3"
	"log"
	"net/http"
	"time"
)

// This is a very simple example that shows the basic features of this library.
func main() {
	db, err := sql.Open("sqlite3", "simple.sqlite")
	if err != nil {
		log.Fatalln(err)
	}
	defer db.Close()

	// Create a new Checker.
	checker := health.NewChecker(
		
		// Set the time-to-live for our cache to 1 second (default).
		health.WithCacheDuration(1*time.Second),

		// Configure a global timeout that will be applied to all checks.
		health.WithTimeout(10*time.Second),

		// A check configuration to see if our database connection is up.
		// The check function will be executed for each HTTP request.
		health.WithCheck(health.Check{
			Name:    "database",      // A unique check name.
			Timeout: 2 * time.Second, // A check specific timeout.
			Check:   db.PingContext,
		}),

		// The following check will be executed periodically every 15 seconds
		// started with an initial delay of 3 seconds. The check function will NOT
		// be executed for each HTTP request.
		health.WithPeriodicCheck(15*time.Second, 3*time.Second, health.Check{
			Name: "search",
			// The check function checks the health of a component. If an error is
			// returned, the component is considered unavailable (or "down").
			// The context contains a deadline according to the configured timeouts.
			Check: func(ctx context.Context) error {
				return fmt.Errorf("this makes the check fail")
			},
		}),

		// Set a status listener that will be invoked when the health status changes.
		// More powerful hooks are also available (see docs).
		health.WithStatusListener(func(ctx context.Context, state health.CheckerState) {
			log.Printf("health status changed to %s", state.Status)
		}),
	)

	// Create a new health check http.Handler that returns the health status
	// serialized as a JSON string. You can pass further configuration
	// options to NewHandler to modify default configuration.
	http.Handle("/health", health.NewHandler(checker))
	log.Fatalln(http.ListenAndServe(":3000", nil))
}

Because our search component is down, the request curl http://localhost:3000/health would yield a response with HTTP status code 503 (Service Unavailable), and the following JSON response body:

{
  "status": "down",
  "details": {
    "database": {
      "status": "up",
      "timestamp": "2021-07-01T08:05:14.603364Z"
    },
    "search": {
      "status": "down",
      "timestamp": "2021-07-01T08:05:08.522685Z",
      "error": "this makes the check fail"
    }
  }
}

The example above covers only the basic features; the examples module contains a more complete showcase of this library.

Synchronous vs. Asynchronous Checks

By "synchronous" health checks we mean that every HTTP request initiates a health check and waits until all check functions complete before returning an aggregated health result. You can configure synchronous checks using the WithCheck configuration option (see example above).

Synchronous checks can be sufficient for smaller applications but might not scale well for more involved applications. Sometimes an application needs to read a large amount of data, experiences latency issues, or has to perform an expensive calculation to determine its health. With synchronous health checks, such an application cannot respond quickly to health check requests, and quick responses are necessary to avoid service disruptions in modern cloud infrastructure.

Rather than executing health check functions on every HTTP request, periodic (or "asynchronous") health checks execute the check function on a fixed schedule. With this approach, the health status is always read from a local cache that is regularly updated in the background. This allows responding to HTTP requests instantly without waiting for check functions to complete.

Periodic checks can be configured using the WithPeriodicCheck configuration option (see example above).

This library allows you to mix synchronous and asynchronous check functions, so you can start out simple and easily transition into a more scalable and robust health check implementation later.

Caching

Health check results are cached to avoid sending too many requests to the services that your program checks and to mitigate "denial of service" attacks. The TTL is set to 1 second by default. If you do not want to use caching at all, you can disable it using the health.WithDisabledCache() configuration option.

Listening to Status Changes

It can be useful to react to health status changes. For example, you might want to log status changes or adjust some metrics, so you can more easily correlate logs during root cause analysis or perform actions to mitigate the impact of an unhealthy component.

This library allows you to configure listener functions that will be called when either the overall/aggregated health status changes, or that of a specific component.

Example

health.WithPeriodicCheck(5*time.Second, 0, health.Check{
    Name:  "search",
    Check: myCheckFunc,
    StatusListener: func(ctx context.Context, name string, state health.CheckState) {
        log.Printf("status of component '%s' changed to %s", name, state.Status)
    },
}),

health.WithStatusListener(func(ctx context.Context, state health.CheckerState) {
    log.Printf("overall system health status changed to %s", state.Status)
}),

Middleware and Interceptors

It can be useful to hook into the checking lifecycle to do some processing before and after a health check. For example, you might want to add some tracing information to the Context before the check function executes, do some logging or modify the check result before sending the HTTP response (e.g., removing details on failed authentication).

This library provides two mechanisms that allow you to hook into processing:

  • Middleware lets you intercept all calls to Checker.Check, which corresponds to every incoming HTTP request. In contrast to the usual middleware pattern, this middleware lets you access check-related information and post-process a check result before it is sent in an HTTP response.

    Middleware                 Description
    BasicAuth                  Reduces exposed health details based on authentication success. Uses basic auth for authentication.
    CustomAuth                 Same as the BasicAuth middleware, but allows using an arbitrary function for authentication.
    FullDetailsOnQueryParam    Disables health details unless the request contains a previously configured query parameter name.
    BasicLogger                Basic request-oriented logging functionality.

  • Interceptors make it possible to intercept all calls to a check function. This is useful if you have cross-cutting code that needs to be reusable and should have access to check state information.

    Interceptor    Description
    BasicLogger    Basic component check function logging functionality.

Compatibility With Other Libraries

Most existing Go health check libraries come with their own implementations of tool-specific check functions (such as for Redis, memcached, Postgres, etc.). Rather than reinventing the wheel and coming up with yet another library-specific implementation of check functions, the goal was to design this library in a way that makes it easy to reuse existing solutions. The following (non-exhaustive) list of health check implementations should work with this library with little or no adjustment:

  • github.com/hellofresh/health-go
    import httpCheck "github.com/hellofresh/health-go/v4/checks/http"
    ...
    health.WithCheck(health.Check{
       Name:    "google",
       Check:   httpCheck.New(httpCheck.Config{
          URL: "https://www.google.com",
       }),
    }),
    
  • github.com/etherlabsio/healthcheck
    import "github.com/etherlabsio/healthcheck/v2/checkers"
    ...
    health.WithCheck(health.Check{
      Name:    "database",
      Check:   checkers.DiskSpace("/var/log", 90).Check,
    })
    
  • github.com/heptiolabs/healthcheck
    import "github.com/heptiolabs/healthcheck"
    ...
    health.WithCheck(health.Check{
        Name: "google",
        Check: func(ctx context.Context) error {
           deadline, _ := ctx.Deadline()
           timeout := time.Until(deadline) // remaining time until the deadline
           return healthcheck.HTTPGetCheck("https://www.google.com", timeout)()
        },
    }),
    
  • github.com/InVisionApp/go-health
      import "github.com/InVisionApp/go-health/checkers"
      ...
      // Create check as usual (errors are discarded for brevity)
      googleURL, _ := url.Parse("https://www.google.com")
      check, _ := checkers.NewHTTP(&checkers.HTTPConfig{
          URL: googleURL,
      })
      ...
      // Add the check in the Checker configuration.
      health.WithCheck(health.Check{
          Name: "google",
          Check: func(_ context.Context) error {
              _, err := check.Status() 
              return err
          },
      })
    

License

health is free software: you can redistribute it and/or modify it under the terms of the MIT Public License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the MIT Public License for more details.

Documentation

Constants

This section is empty.

Variables

var (
	CheckTimeoutErr = errors.New("check timed out")
)

Functions

func NewHandler added in v0.2.0

func NewHandler(checker Checker, options ...HandlerOption) http.HandlerFunc

NewHandler creates a new health check http.Handler.

Types

type AvailabilityStatus added in v0.3.0

type AvailabilityStatus string

AvailabilityStatus expresses the availability of either a component or the whole system.

const (
	// StatusUnknown holds the information that the availability
	// status is not known, because not all checks were executed yet.
	StatusUnknown AvailabilityStatus = "unknown"
	// StatusUp holds the information that the system or a component
	// is up and running.
	StatusUp AvailabilityStatus = "up"
	// StatusDown holds the information that the system or a component
	// is down and not available.
	StatusDown AvailabilityStatus = "down"
)

type Check

type Check struct {
	// The Name must be unique among all checks. Name is a required attribute.
	Name string // Required

	// Check is the check function that will be executed to check availability.
	// This function must return an error if the checked service is considered
	// not available. Check is a required attribute.
	Check func(ctx context.Context) error // Required

	// Timeout will override the global timeout value, if it is smaller than
	// the global timeout (see WithTimeout).
	Timeout time.Duration // Optional

	// MaxTimeInError will set a duration for how long a service must be
	// in an error state until it is considered down/unavailable.
	MaxTimeInError time.Duration // Optional

	// MaxContiguousFails will set a maximum number of contiguous
	// check fails until the service is considered down/unavailable.
	MaxContiguousFails uint // Optional

	// StatusListener allows setting a listener that will be called
	// whenever the AvailabilityStatus changes (e.g. from "up" to "down").
	StatusListener func(ctx context.Context, name string, state CheckState) // Optional

	// Interceptors holds a list of Interceptor instances that will be executed one after another in the
	// order as they appear in the list.
	Interceptors []Interceptor

	// DisablePanicRecovery disables automatic recovery from panics. If left in its default value (false),
	// panics will be automatically converted into errors instead.
	DisablePanicRecovery bool
	// contains filtered or unexported fields
}

Check allows configuring health checks.

type CheckResult added in v0.4.0

type CheckResult struct {
	// Status is the availability status of a component.
	Status AvailabilityStatus `json:"status"`
	// Timestamp holds the time when the check was executed.
	Timestamp time.Time `json:"timestamp,omitempty"`
	// Error contains the check error message, if the check failed.
	Error error `json:"error,omitempty"`
}

CheckResult holds a component's health information. Attention: This type is converted from/to JSON using a custom marshalling/unmarshalling function (see type jsonCheckResult). This is required because some fields (such as the error interface) are not converted automatically by the standard json.Marshal/json.Unmarshal functions. The JSON tags you see here are just there for the reader's convenience.

func (CheckResult) MarshalJSON added in v0.8.0

func (cr CheckResult) MarshalJSON() ([]byte, error)

MarshalJSON provides a custom marshaller for the CheckResult type.

func (*CheckResult) UnmarshalJSON added in v0.8.0

func (cr *CheckResult) UnmarshalJSON(data []byte) error

type CheckState added in v0.3.0

type CheckState struct {
	// LastCheckedAt holds the time of when the check was last executed.
	LastCheckedAt time.Time
	// LastSuccessAt holds the last time of when the check did not return an error.
	LastSuccessAt time.Time
	// LastFailureAt holds the last time of when the check did return an error.
	LastFailureAt time.Time
	// FirstCheckStartedAt holds the time of when the first check was started.
	FirstCheckStartedAt time.Time
	// ContiguousFails holds the number of how often the check failed in a row.
	ContiguousFails uint
	// Result holds the error of the last check (nil if successful).
	Result error
	// The current availability status of the check.
	Status AvailabilityStatus
}

CheckState represents the current state of a component check.

type Checker added in v0.3.0

type Checker interface {
	// Start will start all necessary background workers and prepare
	// the checker for further usage.
	Start()
	// Stop will stop the checker.
	Stop()
	// Check runs all synchronous (i.e., non-periodic) check functions.
	// It returns the aggregated health status, combined from the results
	// of this execution's synchronous checks and the previously reported
	// results of asynchronous/periodic checks. This function expects a
	// context that may contain a deadline, which will be adhered to.
	// The context will be passed to all downstream calls
	// (such as listeners, component check functions, and interceptors).
	Check(ctx context.Context) CheckerResult
	// GetRunningPeriodicCheckCount returns the number of currently
	// running periodic checks.
	GetRunningPeriodicCheckCount() int
	// IsStarted returns true, if the Checker was started (see Checker.Start)
	// and is currently still running. Returns false otherwise.
	IsStarted() bool
}

Checker is the main checker interface. It provides all health checking logic.

func NewChecker added in v0.3.0

func NewChecker(options ...CheckerOption) Checker

NewChecker creates a new Checker. The provided options will be used to modify its configuration. If the Checker was not yet started (see Checker.IsStarted), it will be started automatically (see Checker.Start). You can disable this autostart by adding the WithDisabledAutostart configuration option.

type CheckerOption added in v0.4.0

type CheckerOption func(config *checkerConfig)

CheckerOption is a configuration option for a Checker.

func WithCacheDuration

func WithCacheDuration(duration time.Duration) CheckerOption

WithCacheDuration sets the duration for how long the aggregated health check result will be cached. By default, the cache TTL (i.e., the duration for how long responses will be cached) is set to 1 second. Caching prevents each incoming HTTP request from triggering a new health check. A duration of 0 will effectively disable the cache and has the same effect as WithDisabledCache.

func WithCheck

func WithCheck(check Check) CheckerOption

WithCheck adds a new health check that contributes to the overall service availability status. This check will be triggered each time Checker.Check is called (i.e., for each HTTP request). If health checks are expensive, or you expect a high number of requests on the health endpoint, consider using WithPeriodicCheck instead.

func WithDisabledAutostart added in v0.4.0

func WithDisabledAutostart() CheckerOption

WithDisabledAutostart disables automatic startup of a Checker instance.

func WithDisabledCache

func WithDisabledCache() CheckerOption

WithDisabledCache disables the check cache. This is not recommended in most cases. It effectively leads to a health endpoint that initiates a new health check for each incoming HTTP request, which may have an impact on the systems that are being checked (especially if health checks are expensive). Caching also mitigates "denial of service" attacks. Caching is enabled by default.

func WithDisabledDetails added in v0.3.0

func WithDisabledDetails() CheckerOption

WithDisabledDetails disables all details in the JSON response body, so the AvailabilityStatus will be the only content. Example: { "status":"down" }. Details are enabled by default.

func WithInfo added in v0.8.0

func WithInfo(values map[string]interface{}) CheckerOption

WithInfo sets values that will be available in every health check result. For example, you can use this option if you want to set information about your system that will be returned in every health check result, such as version number, Git SHA, build date, etc. These values will be available in CheckerResult.Info. If you use the default HTTP handler of this library (see NewHandler) or convert the CheckerResult to JSON on your own, these values will be available in the "info" field.

func WithInterceptors added in v0.5.0

func WithInterceptors(interceptors ...Interceptor) CheckerOption

WithInterceptors adds a list of interceptors that will be applied to every check function. Interceptors may intercept the function call and do some pre- and post-processing, having the check state and check function result at hand. The interceptors will be executed in the order they are passed to this function.

func WithPeriodicCheck

func WithPeriodicCheck(refreshPeriod time.Duration, initialDelay time.Duration, check Check) CheckerOption

WithPeriodicCheck adds a new health check that contributes to the overall service availability status. The health check will be performed on a fixed schedule and will not be executed for each HTTP request (in contrast to WithCheck). This makes it possible to process a much higher number of HTTP requests without calling the checked services too often, and to execute long-running checks. This way, Checker.Check (and the health endpoint) always returns the last result of the periodic check.

func WithStatusListener added in v0.3.0

func WithStatusListener(listener func(ctx context.Context, state CheckerState)) CheckerOption

WithStatusListener registers a listener function that will be called whenever the overall/aggregated system health status changes (e.g. from "up" to "down"). Attention: Because this listener is also executed for synchronous (i.e., request-based) health checks, it should not block processing.

func WithTimeout

func WithTimeout(timeout time.Duration) CheckerOption

WithTimeout defines a timeout duration for all checks. You can override this timeout by using the timeout value in the Check configuration. Default value is 10 seconds.

type CheckerResult added in v0.4.0

type CheckerResult struct {
	// Info contains additional information about this health result.
	Info map[string]interface{} `json:"info,omitempty"`
	// Status is the aggregated system availability status.
	Status AvailabilityStatus `json:"status"`
	// Details contains health information for all checked components.
	Details map[string]CheckResult `json:"details,omitempty"`
}

CheckerResult holds the aggregated system availability status and detailed information about the individual checks.

type CheckerState added in v0.4.0

type CheckerState struct {
	// Status is the aggregated system health status.
	Status AvailabilityStatus
	// CheckState contains the state of all checks.
	CheckState map[string]CheckState
}

CheckerState represents the current state of the Checker.

type HandlerConfig added in v0.3.0

type HandlerConfig struct {
	// contains filtered or unexported fields
}

type HandlerOption added in v0.4.0

type HandlerOption func(*HandlerConfig)

HandlerOption is a configuration option for a Handler (see NewHandler).

func WithMiddleware

func WithMiddleware(middleware ...Middleware) HandlerOption

WithMiddleware configures a middleware that will be used by the handler to pre- and post-process HTTP requests and health checks. Refer to the documentation of type Middleware for more information.

func WithResultWriter added in v0.4.0

func WithResultWriter(writer ResultWriter) HandlerOption

WithResultWriter sets the ResultWriter that is responsible for writing a health check result (see CheckerResult) into an HTTP response. By default, JSONResultWriter will be used.

func WithStatusCodeDown added in v0.4.0

func WithStatusCodeDown(httpStatus int) HandlerOption

WithStatusCodeDown sets an HTTP status code that will be used for responses where the system is considered to be unavailable ("down"). Default is HTTP status code 503 (Service Unavailable).

func WithStatusCodeUp added in v0.4.0

func WithStatusCodeUp(httpStatus int) HandlerOption

WithStatusCodeUp sets an HTTP status code that will be used for responses where the system is considered to be available ("up"). Default is HTTP status code 200 (OK).

type Interceptor added in v0.4.0

type Interceptor func(next InterceptorFunc) InterceptorFunc

Interceptor is a factory function that allows creating new instances of an InterceptorFunc. The concept behind Interceptor is similar to the middleware pattern. An InterceptorFunc that is created by calling an Interceptor is expected to forward the function call to the next InterceptorFunc (passed to the Interceptor in parameter 'next'). This way, a chain of interceptors is constructed that will eventually invoke the component's health check function. Each interceptor must therefore invoke the 'next' interceptor. If the 'next' InterceptorFunc is not called, the component's health check function will never be executed.

type InterceptorFunc added in v0.4.0

type InterceptorFunc func(ctx context.Context, checkName string, state CheckState) CheckState

InterceptorFunc is an interceptor function that intercepts any call to a component's health check function.

type JSONResultWriter added in v0.4.0

type JSONResultWriter struct{}

JSONResultWriter writes a CheckerResult in JSON format into an http.ResponseWriter. This ResultWriter is set by default.

func NewJSONResultWriter added in v0.4.0

func NewJSONResultWriter() *JSONResultWriter

NewJSONResultWriter creates a new instance of a JSONResultWriter.

func (*JSONResultWriter) Write added in v0.4.0

func (rw *JSONResultWriter) Write(result *CheckerResult, statusCode int, w http.ResponseWriter, r *http.Request) error

Write implements ResultWriter.Write.

type Middleware

type Middleware func(next MiddlewareFunc) MiddlewareFunc

Middleware is a factory function that allows creating new instances of a MiddlewareFunc. A MiddlewareFunc is expected to forward the function call to the next MiddlewareFunc (passed in parameter 'next'). This way, a chain of middleware functions is constructed that will eventually invoke the Checker.Check function. Each middleware must therefore invoke the 'next' MiddlewareFunc. If the 'next' MiddlewareFunc is not called, Checker.Check will never be executed.

type MiddlewareFunc added in v0.4.0

type MiddlewareFunc func(r *http.Request) CheckerResult

MiddlewareFunc is a middleware for a health Handler (see NewHandler). It is invoked each time an HTTP request is processed.

type ResultWriter added in v0.4.0

type ResultWriter interface {
	// Write writes a CheckerResult into an http.ResponseWriter in a format
	// that the ResultWriter supports (such as XML, JSON, etc.).
	// A ResultWriter is expected to write at least the following information into the http.ResponseWriter:
	// (1) a MIME type header (e.g., "Content-Type" : "application/json"),
	// (2) the HTTP status code that is passed in parameter statusCode (this is necessary due to ordering constraints
	// when writing into an http.ResponseWriter, see https://github.com/alexliesenfeld/health/issues/9), and
	// (3) the response body in the format that the ResultWriter supports.
	Write(result *CheckerResult, statusCode int, w http.ResponseWriter, r *http.Request) error
}

ResultWriter enables a Handler (see NewHandler) to write the CheckerResult to an http.ResponseWriter in a specific format. For example, the JSONResultWriter writes the result in JSON format into the response body.

Directories

Path Synopsis
checks module
examples module
