Documentation ¶
Overview ¶
Package system manages the startup, running, metrics and shutdown of a Go service.
Most, if not all, services need to run several things in the background (such as HTTP servers, health checks, metrics and worker loops). They also need to shut down cleanly when told to. Services offering REST APIs in Kubernetes, in particular, should also wait a short time before shutting down, in order to avoid disconnecting active users.
This package rolls all of this up into an easy-to-consume form.
See the example project main func for a full canonical example of its usage.
Index ¶
- type GaugeProducer
- type HealthChecker
- type MetricProducer
- type System
- func (r *System) AddCleanup(c func(ctx context.Context) error)
- func (r *System) AddGauges(g GaugeProducer)
- func (r *System) AddHealthCheck(h HealthChecker)
- func (r *System) AddMetrics(m MetricProducer)
- func (r *System) AddService(s func(ctx context.Context) error)
- func (r *System) Cleanup(ctx context.Context)
- func (r *System) HealthChecks() []HealthChecker
- func (r *System) Run(ctx context.Context, terminationDelay time.Duration) (err error)
- type TaggedValue
Examples ¶
Types ¶
type GaugeProducer ¶
type GaugeProducer interface {
	// GaugeName The name for this group of metrics
	// (Name might be cleaner, but is much more likely to conflict in implementations)
	GaugeName() string
	// Gauges are instantaneous name value pairs
	Gauges(context.Context) map[string][]TaggedValue
}
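As a rough illustration of what an implementation of this interface could look like, the sketch below defines a producer that reports queue depths. It is stdlib-only: `TaggedValue` is redeclared locally with assumed fields (`Val`, `Tags`), because its definition is not shown in this document, and `queueGauges` and `depthCount` are hypothetical names invented for the example.

```go
package main

import (
	"context"
	"fmt"
)

// TaggedValue is a local stand-in for the package's TaggedValue type.
// Its fields are an assumption; the real definition is not shown here.
type TaggedValue struct {
	Val  float64
	Tags []string
}

// GaugeProducer mirrors the interface documented above.
type GaugeProducer interface {
	GaugeName() string
	Gauges(context.Context) map[string][]TaggedValue
}

// queueGauges is a hypothetical producer that reports the depth of named queues.
type queueGauges struct {
	depths map[string]int
}

// GaugeName names this group of gauges.
func (q queueGauges) GaugeName() string { return "queue" }

// Gauges returns one "depth" gauge with a tagged value per queue.
func (q queueGauges) Gauges(_ context.Context) map[string][]TaggedValue {
	values := make([]TaggedValue, 0, len(q.depths))
	for name, depth := range q.depths {
		values = append(values, TaggedValue{
			Val:  float64(depth),
			Tags: []string{"queue:" + name},
		})
	}
	return map[string][]TaggedValue{"depth": values}
}

// Compile-time check that queueGauges satisfies the interface.
var _ GaugeProducer = queueGauges{}

// depthCount returns how many tagged values the "depth" gauge carries.
func depthCount() int {
	g := queueGauges{depths: map[string]int{"emails": 3, "builds": 7}}
	return len(g.Gauges(context.Background())["depth"])
}

func main() {
	fmt.Println("tagged depth values:", depthCount())
}
```

In a real service, a value like this would presumably be handed to AddGauges; check the package source for the actual TaggedValue fields before relying on the names used here.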
type HealthChecker ¶
type MetricProducer ¶
type System ¶
type System struct {
// contains filtered or unexported fields
}
System is a list of concurrent services that provides useful features for those services, such as coordinated cancellation and service metrics. It can collect a set of health check functions and return them as a list (to pass into a single health check handler, for instance).
Example ¶
package main

import (
	"context"
	"errors"
	"flag"
	"fmt"
	"os"
	"time"

	"github.com/circleci/ex/httpserver"
	"github.com/circleci/ex/httpserver/ginrouter"
	"github.com/circleci/ex/httpserver/healthcheck"
	"github.com/circleci/ex/system"
	"github.com/circleci/ex/termination"
	"github.com/circleci/ex/testing/testcontext"
)

type cli struct {
	ShutdownDelay time.Duration
	AdminAddr     string
	APIAddr       string
}

func main() {
	err := run()
	if err != nil && !errors.Is(err, termination.ErrTerminated) {
		fmt.Println("Unexpected Error: ", err)
		os.Exit(1)
	}
	fmt.Println("exited 0")
}

func run() (err error) {
	cli := cli{}
	flag.DurationVar(&cli.ShutdownDelay, "shutdown-delay", 5*time.Second, "Delay shutdown by this amount")
	flag.StringVar(&cli.AdminAddr, "admin-addr", ":8001", "The address for the admin API to listen on")
	flag.StringVar(&cli.APIAddr, "api-addr", ":8000", "The address for the API to listen on")
	flag.Parse()

	// Use a properly wired o11y in a real application
	ctx := testcontext.Background()

	sys := system.New()
	defer sys.Cleanup(ctx)

	err = loadAPI(ctx, cli, sys)
	if err != nil {
		return err
	}

	// Should be last so it collects all the health checks
	_, err = healthcheck.Load(ctx, cli.AdminAddr, sys)
	if err != nil {
		return err
	}

	return sys.Run(ctx, cli.ShutdownDelay)
}

func loadAPI(ctx context.Context, cli cli, sys *system.System) error {
	r := ginrouter.Default(ctx, "api")

	_, err := httpserver.Load(ctx, httpserver.Config{
		Name:    "api",
		Addr:    cli.APIAddr,
		Handler: r,
	}, sys)

	return err
}
Output: exited 0
func New ¶
func New() *System
New creates a new system with a context that can be used to coordinate cancellation of the added services. The context is also expected to contain an o11y provider that will be used to produce metrics. It is expected that the context is cancelled when the service receives a signal; it will also be cancelled when any of the services returns an error. The caller may also cancel the context to stop the services.
func (*System) AddCleanup ¶
func (r *System) AddCleanup(c func(ctx context.Context) error)
AddCleanup stores a function in the system that will be invoked when Cleanup is called, which is typically after Run has returned.
func (*System) AddGauges ¶
func (r *System) AddGauges(g GaugeProducer)
func (*System) AddHealthCheck ¶
func (r *System) AddHealthCheck(h HealthChecker)
AddHealthCheck stores a health checker for later retrieval. It is generally a good idea for each service added to also add a health checker to represent the liveness and readiness of the service, though some services will be simple enough that a health checker is not needed.
func (*System) AddMetrics ¶
func (r *System) AddMetrics(m MetricProducer)
AddMetrics adds a metrics producer to the list of producers. These producers will be called periodically and the resultant gauges published via the system context.
func (*System) AddService ¶
func (r *System) AddService(s func(ctx context.Context) error)
AddService adds the service function to the list of coordinated services. Once the system's Run is called, each service function will be invoked. The context passed into each service can be used to coordinate graceful shutdown. Each service should monitor the context for cancellation, then stop taking on new work and allow in-flight work to complete (often called 'draining') before returning. It is expected that services that need to do any final work to exit gracefully will have added a cleanup function. If a service depends on other services or utilities (such as a database connection) to complete in-flight work, then the depended-upon systems should remain active enough during a context cancellation and only fully shut down via a cleanup function (for instance, closing a database connection).
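The draining behaviour described for AddService can be sketched with the standard library alone. The `worker` type, its `process` method and `drainDemo` below are hypothetical; only the `func(ctx context.Context) error` shape of `Run` comes from the documented AddService signature. Returning nil after a clean drain is a choice made for this sketch, not something this document specifies.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// worker is a hypothetical service that processes jobs from a channel.
type worker struct {
	jobs chan int

	mu        sync.Mutex
	completed int
}

// Run has the func(ctx context.Context) error shape that AddService accepts.
// On cancellation it stops taking new jobs, waits for in-flight jobs to
// finish (draining), and only then returns.
func (w *worker) Run(ctx context.Context) error {
	var inFlight sync.WaitGroup
	for {
		select {
		case <-ctx.Done():
			// Stop accepting work, then let started jobs complete.
			inFlight.Wait()
			return nil
		case job := <-w.jobs:
			inFlight.Add(1)
			go func() {
				defer inFlight.Done()
				w.process(job)
			}()
		}
	}
}

// process stands in for real work and records that the job completed.
func (w *worker) process(job int) {
	_ = job
	w.mu.Lock()
	w.completed++
	w.mu.Unlock()
}

// drainDemo hands three jobs to the worker, cancels the context, and
// reports how many jobs had completed by the time Run returned.
func drainDemo() int {
	ctx, cancel := context.WithCancel(context.Background())
	w := &worker{jobs: make(chan int)}

	done := make(chan error, 1)
	go func() { done <- w.Run(ctx) }()

	// The channel is unbuffered, so each send returns only once the
	// worker has accepted the job.
	for i := 0; i < 3; i++ {
		w.jobs <- i
	}
	cancel()
	<-done

	w.mu.Lock()
	defer w.mu.Unlock()
	return w.completed
}

func main() {
	fmt.Println("jobs completed before exit:", drainDemo())
}
```

With the real package, a method value such as `w.Run` would presumably be what gets passed to AddService, with any final teardown registered separately through AddCleanup.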
func (*System) Cleanup ¶
func (r *System) Cleanup(ctx context.Context)
Cleanup calls each function previously added with AddCleanup. It is expected to be called after Run has returned to do any final work to allow the system to exit gracefully.
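The relationship between AddCleanup and Cleanup can be pictured as a small registry of functions. The sketch below is an illustration under assumptions, not the package's implementation: `cleanups`, `Add`, `Run` and `cleanupDemo` are hypothetical names, calling the functions in registration order is an assumption, and returning the collected errors is a convenience for the demo (the documented Cleanup returns nothing).

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// cleanups is a hypothetical stand-in for the part of System that stores
// cleanup functions.
type cleanups struct {
	fns []func(ctx context.Context) error
}

// Add stores a cleanup function for later, in the spirit of AddCleanup.
func (c *cleanups) Add(fn func(ctx context.Context) error) {
	c.fns = append(c.fns, fn)
}

// Run calls every stored function in registration order (an assumption),
// continuing past failures so that one bad cleanup does not skip the rest.
func (c *cleanups) Run(ctx context.Context) []error {
	var errs []error
	for _, fn := range c.fns {
		if err := fn(ctx); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// cleanupDemo registers two cleanups, one of which fails, and reports
// which ones ran and how many failed.
func cleanupDemo() (calls []string, failures int) {
	c := &cleanups{}
	c.Add(func(context.Context) error {
		calls = append(calls, "close database")
		return nil
	})
	c.Add(func(context.Context) error {
		calls = append(calls, "flush buffers")
		return errors.New("flush failed")
	})
	errs := c.Run(context.Background())
	return calls, len(errs)
}

func main() {
	calls, failures := cleanupDemo()
	fmt.Println(calls, "failures:", failures)
}
```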
func (*System) HealthChecks ¶
func (r *System) HealthChecks() []HealthChecker
HealthChecks returns the list of previously stored health checkers. This list can be used to report on the liveness and readiness of the system.
func (*System) Run ¶
func (r *System) Run(ctx context.Context, terminationDelay time.Duration) (err error)
Run runs any services added to the system. It also adds a signal handler and a worker to gather and publish system metrics. Run is blocking and will only return when all its services have finished. The error returned will be the first error returned from any of the services. The terminationDelay passed in is the amount of time to wait between receiving a signal and cancelling the system context.
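To make the coordination described for Run more concrete, here is a simplified, stdlib-only sketch. It is not the package's implementation: `runAll`, `service`, `errBoom` and the demo helpers are hypothetical, a `stop` channel stands in for the OS signal, and metrics publishing is left out. It shows the three documented behaviours: block until every service has returned, report the first error, and wait for the termination delay between the stop notification and cancelling the shared context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// service matches the function shape that AddService accepts.
type service func(ctx context.Context) error

// runAll is a hypothetical stand-in for Run. The stop channel plays the
// role of the OS signal in this sketch.
func runAll(ctx context.Context, delay time.Duration, stop <-chan struct{}, services ...service) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	// After a stop notification, hold off for the termination delay
	// before cancelling the context shared by all services.
	go func() {
		select {
		case <-stop:
			select {
			case <-time.After(delay):
			case <-ctx.Done():
			}
			cancel()
		case <-ctx.Done():
		}
	}()

	var (
		wg    sync.WaitGroup
		once  sync.Once
		first error
	)
	for _, svc := range services {
		wg.Add(1)
		go func(svc service) {
			defer wg.Done()
			if err := svc(ctx); err != nil {
				// Keep only the first error and tell the others to stop.
				once.Do(func() { first = err })
				cancel()
			}
		}(svc)
	}

	// Block until every service has returned.
	wg.Wait()
	return first
}

var errBoom = errors.New("boom")

// waitForCancel is a service that simply runs until the context is cancelled.
func waitForCancel(ctx context.Context) error {
	<-ctx.Done()
	return nil
}

// failFastDemo runs one failing service next to one that waits for cancellation.
func failFastDemo() error {
	failer := func(context.Context) error { return errBoom }
	return runAll(context.Background(), 0, nil, waitForCancel, failer)
}

// delayedShutdownOK reports whether shutdown took at least the termination
// delay after an immediate stop notification.
func delayedShutdownOK() bool {
	const delay = 20 * time.Millisecond
	stop := make(chan struct{})
	close(stop)

	start := time.Now()
	err := runAll(context.Background(), delay, stop, waitForCancel)
	return err == nil && time.Since(start) >= delay
}

func main() {
	fmt.Println("first error:", failFastDemo())
	fmt.Println("delay respected:", delayedShutdownOK())
}
```

The delay gives load balancers time to stop routing traffic before in-flight requests are asked to drain, which is the Kubernetes scenario mentioned in the overview. In the package's own example the caller checks the returned error against termination.ErrTerminated; this sketch returns nil on a clean stop instead.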