annotations-publish-healthchecker
Introduction
This service reports whether the annotations publish flow works as expected. It checks and caches a "healthiness" status every minute, and the responses from the __health and __details endpoints are served from this cache.
Note: the results cover a period in the past; the most recent transactions are ignored.
(This is due to the current implementation of the monitoring flow, which closes transactions with a given delay. During that delay we cannot know whether an annotation publish succeeded.)
This check looks for unclosed annotation-publish transactions (transactions with no PublishEnd event). Since the monitoring service closes transactions every 5 minutes, the healthchecker only inspects transactions that happened before the latest 5 minutes (ignoring a further 2 minute SLA window, during which transactions can still be closed), and it covers a period of 10 minutes.
Installation
Download the source code, dependencies and test dependencies:
go get -u github.com/kardianos/govendor
go get -u github.com/Financial-Times/annotations-publish-healthchecker
cd $GOPATH/src/github.com/Financial-Times/annotations-publish-healthchecker
govendor sync
go build .
Running locally
- Run the tests and install the binary:
govendor sync
govendor test -v -race
go install
- Run the binary (using the help flag to see the available optional arguments):
$GOPATH/bin/annotations-publish-healthchecker [--help]
Options:
--app-system-code="annotations-publish-healthchecker" System Code of the application ($APP_SYSTEM_CODE)
--app-name="Annotations Publish Healthchecker" Application name ($APP_NAME)
--port="8080" Port to listen on ($APP_PORT)
--event-reader="http://localhost:8080/__splunk-event-reader" URL for the Splunk Event Reader
Build and deployment
- Built by Docker Hub on merge to master: coco/annotations-publish-healthchecker
- CI provided by CircleCI: annotations-publish-healthchecker
Service endpoints
e.g.
GET
Using curl:
curl http://localhost:8080/__health | json_pp
curl http://localhost:8080/__details | json_pp
Or using httpie:
http GET http://localhost:8080/__details
The expected response will contain information about the health of the annotations publish flow.
The response for the __details endpoint looks like this:
{
  "failed_transactions": [],
  "event_reader_checking_period": "Between -15m and -5m",
  "event_reader_checking_time": "2017-12-19T16:43:06.351754912+02:00",
  "event_reader_was_reachable": true
}
The response indicates:
- failed_transactions: list of the transactions that have recently failed (transaction_id, uuid, publish_start time, if known)
- event_reader_checking_period: the period that the check was executed for (defaults to an interval of 10 minutes, with a 5 minute delay)
- event_reader_checking_time: the exact time when the sanity check happened
- event_reader_was_reachable: whether the last sanity check was successful (the event reader could be reached); otherwise we cannot know that the publishing flow is working properly
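A client could decode this response along the following lines. The JSON field names come from the example above; the Go type and function names are hypothetical, and treating publish_start as an optional string is an assumption, since its exact format is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// failedTransaction mirrors one entry of failed_transactions.
type failedTransaction struct {
	TransactionID string `json:"transaction_id"`
	UUID          string `json:"uuid"`
	PublishStart  string `json:"publish_start,omitempty"` // only set if known
}

// detailsResponse mirrors the body returned by the __details endpoint.
type detailsResponse struct {
	FailedTransactions        []failedTransaction `json:"failed_transactions"`
	EventReaderCheckingPeriod string              `json:"event_reader_checking_period"`
	EventReaderCheckingTime   string              `json:"event_reader_checking_time"`
	EventReaderWasReachable   bool                `json:"event_reader_was_reachable"`
}

// parseDetails decodes a __details response body.
func parseDetails(body []byte) (detailsResponse, error) {
	var d detailsResponse
	err := json.Unmarshal(body, &d)
	return d, err
}

func main() {
	body := []byte(`{
		"failed_transactions": [],
		"event_reader_checking_period": "Between -15m and -5m",
		"event_reader_checking_time": "2017-12-19T16:43:06.351754912+02:00",
		"event_reader_was_reachable": true
	}`)

	details, err := parseDetails(body)
	if err != nil {
		fmt.Println("could not decode response:", err)
		return
	}
	fmt.Printf("reachable=%t failed=%d period=%q\n",
		details.EventReaderWasReachable,
		len(details.FailedTransactions),
		details.EventReaderCheckingPeriod)
}
```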
Utility endpoints
Endpoints that are there for support or testing, e.g. read endpoints on the writers.
Healthchecks
Admin endpoints are:
/__gtg
/__health
The health endpoint executes two checks:
- Splunk Event Reader is reachable: verifies whether the latest call to the splunk-event-reader was successful, and hence whether the healthcheck results are relevant.
- Annotations Publish Failures: the splunk-event-reader is reachable, and at least 2 publish failures were detected in the latest call.
/__build-info
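The two checks run by the health endpoint can be expressed as simple predicates over the cached result. The sketch below is illustrative only: the `checkResult` type and function names are hypothetical, and the behaviour of the publish-failures check when the reader is unreachable (it passes, leaving the reachability check to report the problem) is an assumption read from the wording above.

```go
package main

import "fmt"

// failureThreshold is the number of failed publishes that trips the check,
// as stated in the README ("at least 2 publish failures").
const failureThreshold = 2

// checkResult is a hypothetical summary of the latest cached check.
type checkResult struct {
	readerReachable    bool
	failedTransactions int
}

// readerCheckOK passes when the latest call to the event reader succeeded.
func readerCheckOK(r checkResult) bool {
	return r.readerReachable
}

// publishCheckOK fails only when the reader was reachable and the failure
// count reached the threshold. Assumption: an unreachable reader does not
// fail this check, since no failure data is available in that case.
func publishCheckOK(r checkResult) bool {
	return !(r.readerReachable && r.failedTransactions >= failureThreshold)
}

func main() {
	r := checkResult{readerReachable: true, failedTransactions: 2}
	fmt.Println("reader check ok:", readerCheckOK(r))
	fmt.Println("publish check ok:", publishCheckOK(r))
}
```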
Logging