dye-injector
An end-to-end prober for float's log collector service. It measures
the time it takes for a log to go through syslog and reach the
Elasticsearch index. This metric can be used to detect generic
problems with the log collection service.
The server will periodically emit a log with a unique identifier (the
dye marker), and then it will start querying Elasticsearch for this
identifier until it finds a result.
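The emit-then-poll cycle described above can be sketched roughly as follows. This is only an illustration, not the actual implementation: `run_probe`, `emit` and `found` are hypothetical names, with `emit` standing in for "write the marker to syslog" and `found` for "query Elasticsearch for the marker".

```python
import time
import uuid
from typing import Callable, Optional


def run_probe(
    emit: Callable[[str], None],
    found: Callable[[str], bool],
    timeout: float,
    poll_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Emit one dye marker and poll until it is found or the timeout expires.

    Returns the observed latency in seconds, or None on timeout.
    All names here are hypothetical; the real server may differ.
    """
    marker = uuid.uuid4().hex  # unique identifier for this probe
    start = clock()
    emit(marker)  # assumption: sends one log line containing the marker
    deadline = start + timeout
    while clock() < deadline:
        if found(marker):  # assumption: one lightweight index query
            return clock() - start
        sleep(poll_interval)
    return None
```

The clock and sleep functions are passed in so that the loop can be exercised without real delays.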
Metrics
The server runs an HTTP server to export the following Prometheus
metrics:
- log_collection_e2e_success is 1 if the unique identifier was
successfully found in the ES index within the allocated timeout;
- log_collection_e2e_duration_seconds is the time, in seconds, it
took for the unique identifier to appear in the index (it is a
measure of the ingestion latency of the log collection system).
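To show how the two metrics relate to a probe result, here is a small sketch that renders them in the Prometheus text exposition format. The function name `render_metrics` is hypothetical. Treating both metrics as gauges and omitting the duration when the probe fails are assumptions; the real server may behave differently in both respects.

```python
from typing import Optional


def render_metrics(latency: Optional[float]) -> str:
    """Render one probe result as Prometheus text-format metrics.

    `latency` is the observed latency in seconds, or None if the marker
    was not found within the timeout.
    """
    success = 0 if latency is None else 1
    lines = [
        "# TYPE log_collection_e2e_success gauge",  # assumption: gauge
        f"log_collection_e2e_success {success}",
    ]
    if latency is not None:
        # Assumption: the duration is only reported for successful probes.
        lines += [
            "# TYPE log_collection_e2e_duration_seconds gauge",
            f"log_collection_e2e_duration_seconds {latency}",
        ]
    return "\n".join(lines) + "\n"
```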
Other command-line options
The probing interval (how often a marker is sent) can be controlled
with the --interval option. After sending a marker, the server will
wait up to the amount of time specified by --timeout for it to appear
in the index. The timeout can be longer than the interval: the next
probe will just be delayed until the current one completes.
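The interaction between the interval and the timeout can be illustrated with a one-line scheduling rule. This is a sketch of one plausible reading of the behaviour described above (the exact scheduling in the server is not specified here), and `next_probe_start` is a hypothetical helper.

```python
def next_probe_start(started_at: float, finished_at: float, interval: float) -> float:
    """Return the earliest time the next probe may start.

    Probes are nominally spaced `interval` seconds apart, but a probe
    never starts before the current one has completed.
    """
    return max(started_at + interval, finished_at)
```

Under this rule, with a 10 second interval, a probe that starts at t=0 and finishes at t=3 is followed by one at t=10, while a probe that only completes (or times out) at t=25 pushes the next one back to t=25.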
The server waits for the marker to show up by repeatedly querying
Elasticsearch: the period of these queries (which is also the time
resolution of the reported latency) can be controlled with the
--poll-interval option. The queries should be very lightweight, so
there should be no reason to increase it.
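Why the poll interval bounds the resolution of the reported latency can be seen with a small calculation. The sketch below assumes that the first query is issued right after the marker is sent and that each query takes negligible time; `reported_latency` is a hypothetical helper, not part of the server.

```python
import math


def reported_latency(true_latency: float, poll_interval: float) -> float:
    """Approximate the latency a poller would report.

    The marker is only noticed at the first poll at or after the moment
    it becomes searchable, so the true latency is rounded up to a
    multiple of the poll interval.
    """
    polls = math.ceil(true_latency / poll_interval)
    return polls * poll_interval
```

For example, a marker that becomes searchable after 2.5 seconds is reported at 3 seconds with a 1 second poll interval, and at 2.5 seconds with a 0.5 second poll interval.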