README
Repaird
This small daemon detects and fixes problems on each host. The first use case is fixing Ceph mounts on a host by detecting that they are stale.
Design
There is a detection phase; if any problems are detected, the corresponding repair action is performed. Each distinct problem should live in its own *.go file in the repair/ subdirectory. All detectors are run serially, although this might change if it proves to become a problem.
- No logging should be done from these functions.
- Every detector will be run at roughly 1m intervals.
- A maximum of 3 repairs will be attempted in 12m. If this maximum is reached, the daemon will sleep for an hour.
Metrics
Per named problem:
repair_attempted_count_total{name="<name>"}
-- found something to repair for 'name'
repair_failed_count_total{name="<name>"}
-- repair failed for 'name'
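To illustrate how the two per-problem counters might be tracked and exposed, here is a standard-library-only sketch. The Counters type and its methods are hypothetical; the daemon itself may well use a Prometheus client library instead, which this README does not state. The output follows the Prometheus text exposition style for simple label values.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Counters is a hypothetical in-memory store for the two metrics.
type Counters struct {
	mu        sync.Mutex
	attempted map[string]uint64
	failed    map[string]uint64
}

// NewCounters returns an empty set of counters.
func NewCounters() *Counters {
	return &Counters{
		attempted: make(map[string]uint64),
		failed:    make(map[string]uint64),
	}
}

// Attempted records that something to repair was found for name.
func (c *Counters) Attempted(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.attempted[name]++
}

// Failed records that a repair for name failed.
func (c *Counters) Failed(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failed[name]++
}

// Render writes both metrics as text, one line per problem name,
// sorted by name so the output is stable.
func (c *Counters) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	var b strings.Builder
	write := func(metric string, m map[string]uint64) {
		names := make([]string, 0, len(m))
		for n := range m {
			names = append(names, n)
		}
		sort.Strings(names)
		for _, n := range names {
			fmt.Fprintf(&b, "%s{name=%q} %d\n", metric, n, m[n])
		}
	}
	write("repair_attempted_count_total", c.attempted)
	write("repair_failed_count_total", c.failed)
	return b.String()
}
```

Since detectors and repair functions are not supposed to log, counters like these would be updated by the caller that runs them.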