README
(Forked) Cachet Monitor
Features
- Creates & Resolves Incidents
- Posts monitor lag to cachet graphs
- HTTP Checks (body/status code)
- DNS Checks
- Updates Component to Partial Outage
- Updates Component to Major Outage if already in Partial Outage (works with distributed monitors)
- Can be run on multiple servers and geo regions
- NEW TCP Checks
- NEW SAP Cloud Application Status Checks
- NEW Configuration schema file
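The escalation from Partial Outage to Major Outage is what lets several independent monitors agree on severity: the first monitor to see a failure marks the component as partially down, and a second monitor that sees the same failure raises it. The sketch below illustrates that idea only. The `escalate` helper is hypothetical, and treating every non-outage status as a step to Partial Outage is an assumption; the numeric codes are Cachet's component status values.

```go
package main

import "fmt"

// Cachet component status codes.
const (
	statusOperational       = 1
	statusPerformanceIssues = 2
	statusPartialOutage     = 3
	statusMajorOutage       = 4
)

// escalate is a hypothetical helper. It returns the status a monitor would
// report after a failed check, given the component's current status.
func escalate(current int) int {
	if current == statusPartialOutage || current == statusMajorOutage {
		// Another monitor already reported the failure: raise to major outage.
		return statusMajorOutage
	}
	// First report of a failure.
	return statusPartialOutage
}

func main() {
	status := statusOperational
	status = escalate(status) // first monitor reports the failure
	fmt.Println(status)
	status = escalate(status) // second monitor confirms it
	fmt.Println(status)
}
```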
Example Configuration
Note: configuration can be in JSON or YAML format. See the example.config.json and example.config.yaml files.
api:
  # cachet url
  url: https://demo.cachethq.io/api/v1
  # cachet api token
  token: 9yMHsdioQosnyVK4iCVR
  insecure: false
# https://golang.org/src/time/format.go#L57
date_format: 02/01/2006 15:04:05 MST
monitors:
  # http monitor example
  - name: google
    # test url
    target: https://google.com
    # strict certificate checking for https
    strict: true
    # HTTP method
    method: POST
    # set to update component (either component_id or metric_id is required)
    component_id: 1
    # set to post lag to cachet metric (graph)
    metric_id: 4
    # custom templates (see readme for details)
    # leave empty for defaults
    template:
      investigating:
        subject: "{{ .Monitor.Name }} - {{ .SystemName }}"
        message: "{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}"
      fixed:
        subject: "I HAVE BEEN FIXED"
    # seconds between checks
    interval: 1
    # seconds for timeout
    timeout: 1
    # If % of downtime is over this threshold, open an incident
    threshold: 80
    # custom HTTP headers
    headers:
      Authorization: Basic <hash>
    # expected status code (either status code or body must be supplied)
    expected_status_code: 200
    # regex to match body
    expected_body: "P.*NG"
  # dns monitor example
  - name: dns
    # fqdn
    target: matej.me.
    # question type (A/AAAA/CNAME/...)
    question: mx
    type: dns
    # set component_id/metric_id
    component_id: 2
    # poll every 1s
    interval: 1
    timeout: 1
    # custom DNS server (defaults to system)
    dns: 8.8.4.4:53
    answers:
      # exact/regex check (the regex is quoted so YAML does not read "[" as a list)
      - regex: "[1-9] alt[1-9].aspmx.l.google.com."
      - exact: 10 aspmx2.googlemail.com.
      - exact: 1 aspmx.l.google.com.
      - exact: 10 aspmx3.googlemail.com.
Installation
- Download binary from release page
- Add the binary to an executable path (/usr/bin, etc.)
- Create a configuration following provided examples
- Run cachet-monitor -c /etc/cachet-monitor.yaml
Pro tip: to run in the background, use nohup cachet-monitor -c /etc/cachet-monitor.yaml > /var/log/cachet-monitor.log 2>&1 & or a tmux/screen session.
Usage:
cachet-monitor (-c PATH | --config PATH) [--log=LOGPATH] [--name=NAME] [--immediate]
cachet-monitor -h | --help | --version
Arguments:
PATH path to config.json
LOGPATH path to log output (defaults to STDOUT)
NAME name of this logger
Examples:
cachet-monitor -c /root/cachet-monitor.json
cachet-monitor -c /root/cachet-monitor.json --log=/var/log/cachet-monitor.log --name="development machine"
Options:
-c PATH.json --config PATH Path to configuration file
-h --help Show this screen.
--version Show version
--immediate Tick immediately (by default waits for first defined interval)
Environment variables:
CACHET_API override API url from configuration
CACHET_TOKEN override API token from configuration
CACHET_DEV set to enable dev logging
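The override behaviour of CACHET_API and CACHET_TOKEN can be pictured with a small sketch. The `apiSettings` type and the `applyEnvOverrides` helper are hypothetical names used for illustration, and treating an empty variable as "not set" is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"os"
)

// apiSettings is a hypothetical stand-in for the api block of the configuration.
type apiSettings struct {
	URL   string
	Token string
}

// applyEnvOverrides replaces the configured URL and token with the values of
// CACHET_API and CACHET_TOKEN whenever lookup returns a non-empty string.
func applyEnvOverrides(cfg apiSettings, lookup func(string) string) apiSettings {
	if v := lookup("CACHET_API"); v != "" {
		cfg.URL = v
	}
	if v := lookup("CACHET_TOKEN"); v != "" {
		cfg.Token = v
	}
	return cfg
}

func main() {
	cfg := apiSettings{URL: "https://demo.cachethq.io/api/v1", Token: "token-from-file"}
	cfg = applyEnvOverrides(cfg, os.Getenv)
	fmt.Println(cfg.URL)
}
```

Passing the lookup function in, instead of calling os.Getenv directly, keeps the helper easy to exercise without touching the real environment.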
Init script
If your system is running systemd (like Debian, Ubuntu 16.04, Fedora, RHEL7, or Archlinux) you can use the provided example file: example.cachet-monitor.service.
- Simply put it in the right place with cp example.cachet-monitor.service /etc/systemd/system/cachet-monitor.service
- Then run systemctl daemon-reload in your terminal to update the systemd configuration
- Finally, you can start cachet-monitor on every startup with systemctl enable cachet-monitor.service 👍
Templates
This package makes use of text/template. See the default HTTP template for reference.
The following variables are available:
Root objects | Description |
---|---|
.SystemName | system name |
.API | api object from configuration |
.Monitor | monitor object from configuration |
.now | formatted date string |

Monitor variables |
---|
.Name |
.Target |
.Type |
.Strict |
.MetricID |
... |
All monitor variables are available from monitor.go
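As a minimal sketch of how such a template renders with text/template, the snippet below executes the investigating subject and message from the example configuration against hand-built data. The `render` helper, the `monitor` struct, and the sample values are assumptions made for illustration; they are not the package's own types.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// monitor is a hypothetical stand-in for the monitor object from configuration.
type monitor struct {
	Name   string
	Target string
}

// render parses tpl and executes it against data, returning the rendered text.
func render(tpl string, data map[string]interface{}) (string, error) {
	t, err := template.New("message").Parse(tpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Sample values only; a map is used so the lowercase .now key resolves.
	data := map[string]interface{}{
		"SystemName": "eu-west-1",
		"Monitor":    monitor{Name: "google", Target: "https://google.com"},
		"now":        "04/03/2021 05:06:07 UTC",
		"FailReason": "unexpected status code",
	}

	subject, err := render("{{ .Monitor.Name }} - {{ .SystemName }}", data)
	if err != nil {
		panic(err)
	}
	message, err := render("{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}", data)
	if err != nil {
		panic(err)
	}
	fmt.Println(subject)
	fmt.Println(message)
}
```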
Vision and goals
We made this tool because we felt the need to have our own monitoring software (leveraging Cachet). The idea is a stateless program which collects data and pushes it to a central Cachet instance.
This lets us run an army of geographically distributed loggers and reveal both latency and downtime issues on client websites.
Package usage
When using cachet-monitor as a package in another program, you should follow what cli/main.go does. It is important to call Validate on CachetMonitor and all the monitors inside.
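The validation step can be sketched without importing the package. The example below relies only on the documented `Validate() []string` signature; the `validator` interface, `fakeMonitor`, and `validateAll` are hypothetical names, and the assumption is that an empty result means the monitor is valid.

```go
package main

import "fmt"

// validator is a local stand-in for anything exposing Validate() []string,
// the signature documented for the monitors.
type validator interface {
	Validate() []string
}

// fakeMonitor is a hypothetical monitor used only for this example.
type fakeMonitor struct {
	name   string
	target string
}

// Validate returns one message per problem found; an empty slice means valid.
func (m fakeMonitor) Validate() []string {
	var errs []string
	if m.target == "" {
		errs = append(errs, m.name+": target is required")
	}
	return errs
}

// validateAll calls Validate on every item and gathers all reported problems.
func validateAll(items []validator) []string {
	var all []string
	for _, it := range items {
		all = append(all, it.Validate()...)
	}
	return all
}

func main() {
	monitors := []validator{
		fakeMonitor{name: "google", target: "https://google.com"},
		fakeMonitor{name: "broken"},
	}
	for _, problem := range validateAll(monitors) {
		fmt.Println(problem)
	}
}
```

A caller would typically refuse to start the monitors when the returned slice is non-empty.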
Documentation
Index
- Constants
- func CheckTCPPortAlive(ip, port string, timeout int64) (bool, error)
- func GetMonitorType(t string) string
- type AbstractMonitor
- func (mon *AbstractMonitor) AnalyseData()
- func (mon *AbstractMonitor) ClockStart(cfg *CachetMonitor, iface MonitorInterface, wg *sync.WaitGroup)
- func (mon *AbstractMonitor) ClockStop()
- func (mon *AbstractMonitor) Describe() []string
- func (mon *AbstractMonitor) GetMonitor() *AbstractMonitor
- func (mon *AbstractMonitor) Validate() []string
- type CachetAPI
- type CachetMonitor
- type CachetResponse
- type DNSAnswer
- type DNSMonitor
- type HTTPMonitor
- type Incident
- type MessageTemplate
- type MonitorInterface
- type TCPMonitor
Constants
const DefaultInterval = time.Second * 60
const DefaultTimeFormat = "15:04:05 Jan 2 MST"
const DefaultTimeout = time.Second
const HistorySize = 10
Functions
func CheckTCPPortAlive
func CheckTCPPortAlive(ip, port string, timeout int64) (bool, error)
CheckTCPPortAlive func
Types
type AbstractMonitor
type AbstractMonitor struct {
    Name   string
    Target string

    // (default)http / dns
    Type   string
    Strict bool

    Interval time.Duration
    Timeout  time.Duration

    MetricID    int `mapstructure:"metric_id"`
    ComponentID int `mapstructure:"component_id"`

    // Templating stuff
    Template struct {
        Investigating MessageTemplate
        Fixed         MessageTemplate
    }

    // Threshold = percentage / number of down incidents
    Threshold      float32
    ThresholdCount bool `mapstructure:"threshold_count"`
    // contains filtered or unexported fields
}
AbstractMonitor data model
func (*AbstractMonitor) AnalyseData
func (mon *AbstractMonitor) AnalyseData()
AnalyseData decides whether the monitor is statistically up or down and creates or resolves an incident. (TODO: test)
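One way to read "statistically up or down" is as a percentage of failed checks over a bounded history, compared against the configured threshold. The sketch below follows that reading; it is not the package's implementation. The names `record` and `isDown` are hypothetical, the window size mirrors the HistorySize constant, and the strict "greater than" comparison is an assumption based on the phrase "over this threshold".

```go
package main

import "fmt"

// historySize mirrors the documented HistorySize constant.
const historySize = 10

// record appends a check result and keeps only the most recent historySize entries.
func record(history []bool, up bool) []bool {
	history = append(history, up)
	if len(history) > historySize {
		history = history[len(history)-historySize:]
	}
	return history
}

// isDown reports whether the share of failed checks in history is strictly
// greater than thresholdPercent. An empty history counts as up.
func isDown(history []bool, thresholdPercent float32) bool {
	if len(history) == 0 {
		return false
	}
	failures := 0
	for _, up := range history {
		if !up {
			failures++
		}
	}
	// Compare failures/len*100 > threshold without dividing.
	return float32(failures)*100 > thresholdPercent*float32(len(history))
}

func main() {
	var history []bool
	for i := 0; i < 10; i++ {
		history = record(history, i == 0) // one success followed by nine failures
	}
	fmt.Println(isDown(history, 80)) // 90% of the recorded checks failed
}
```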
func (*AbstractMonitor) ClockStart
func (mon *AbstractMonitor) ClockStart(cfg *CachetMonitor, iface MonitorInterface, wg *sync.WaitGroup)
func (*AbstractMonitor) ClockStop
func (mon *AbstractMonitor) ClockStop()
func (*AbstractMonitor) Describe
func (mon *AbstractMonitor) Describe() []string
func (*AbstractMonitor) GetMonitor
func (mon *AbstractMonitor) GetMonitor() *AbstractMonitor
func (*AbstractMonitor) Validate
func (mon *AbstractMonitor) Validate() []string
type CachetAPI
type CachetAPI struct {
    URL      string `json:"url"`
    Token    string `json:"token"`
    Insecure bool   `json:"insecure"`
}
func (CachetAPI) NewRequest
func (api CachetAPI) NewRequest(requestType, url string, reqBody []byte) (*http.Response, CachetResponse, error)
NewRequest wraps http.NewRequest. (TODO: test)
func (CachetAPI) SendMetric
SendMetric adds a data point to a cachet monitor
type CachetMonitor
type CachetMonitor struct {
    SystemName  string                   `json:"system_name" yaml:"system_name"`
    DateFormat  string                   `json:"date_format" yaml:"date_format"`
    API         CachetAPI                `json:"api"`
    RawMonitors []map[string]interface{} `json:"monitors" yaml:"monitors"`

    Monitors  []MonitorInterface `json:"-" yaml:"-"`
    Immediate bool               `json:"-" yaml:"-"`
}
type CachetResponse
type CachetResponse struct {
    Data json.RawMessage `json:"data"`
}
type DNSMonitor
type DNSMonitor struct {
    AbstractMonitor `mapstructure:",squash"`

    // IP:port format or blank to use system defined DNS
    DNS string

    // A(default), AAAA, MX, ...
    Question string

    Answers []DNSAnswer
    // contains filtered or unexported fields
}
func (*DNSMonitor) Validate
func (monitor *DNSMonitor) Validate() []string
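The example configuration lists DNS answers as either exact strings or regular expressions. The sketch below shows one plausible way to evaluate such rules against the records returned by a lookup. The `answerRule` type and both helpers are hypothetical, and two behaviours are assumptions: the regex is matched unanchored, and every rule must be satisfied by at least one answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// answerRule is a hypothetical stand-in for one entry under "answers".
// Exactly one of Exact or Regex is expected to be set.
type answerRule struct {
	Exact string
	Regex string
}

// matches reports whether a single DNS answer satisfies the rule.
func (r answerRule) matches(answer string) (bool, error) {
	if r.Regex != "" {
		return regexp.MatchString(r.Regex, answer)
	}
	return r.Exact == answer, nil
}

// allExpected reports whether every rule is satisfied by at least one answer.
func allExpected(rules []answerRule, answers []string) (bool, error) {
	for _, rule := range rules {
		found := false
		for _, a := range answers {
			ok, err := rule.matches(a)
			if err != nil {
				return false, err
			}
			if ok {
				found = true
				break
			}
		}
		if !found {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	rules := []answerRule{
		{Regex: "[1-9] alt[1-9].aspmx.l.google.com."},
		{Exact: "1 aspmx.l.google.com."},
	}
	answers := []string{"1 aspmx.l.google.com.", "5 alt1.aspmx.l.google.com."}
	ok, err := allExpected(rules, answers)
	fmt.Println(ok, err)
}
```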
type HTTPMonitor
type HTTPMonitor struct {
    AbstractMonitor `mapstructure:",squash"`

    Method             string
    ExpectedStatusCode int `mapstructure:"expected_status_code"`
    Headers            map[string]string

    // compiled to Regexp
    ExpectedBody string `mapstructure:"expected_body"`
    // contains filtered or unexported fields
}
func (*HTTPMonitor) Describe
func (mon *HTTPMonitor) Describe() []string
type Incident
type Incident struct {
    ID      int    `json:"id"`
    Name    string `json:"name"`
    Message string `json:"message"`
    Status  int    `json:"status"`
    Visible int    `json"visible"`
    Notify  bool   `json:"notify"`

    ComponentID     int `json:"component_id"`
    ComponentStatus int `json:"component_status"`
}
Incident Cachet data model
func (*Incident) GetComponentStatus
func (incident *Incident) GetComponentStatus(cfg *CachetMonitor) (int, error)
func (*Incident) Send
func (incident *Incident) Send(cfg *CachetMonitor) error
Send creates or updates an incident
func (*Incident) SetIdentified
func (incident *Incident) SetIdentified()
SetIdentified sets status to Identified
func (*Incident) SetInvestigating
func (incident *Incident) SetInvestigating()
SetInvestigating sets status to Investigating
func (*Incident) SetWatching
func (incident *Incident) SetWatching()
SetWatching sets status to Watching
type MessageTemplate
type MessageTemplate struct {
    Subject string `json:"subject"`
    Message string `json:"message"`
    // contains filtered or unexported fields
}
func (*MessageTemplate) Exec
func (t *MessageTemplate) Exec(data interface{}) (string, string)
func (*MessageTemplate) SetDefault
func (t *MessageTemplate) SetDefault(d MessageTemplate)
type MonitorInterface
type MonitorInterface interface {
    ClockStart(*CachetMonitor, MonitorInterface, *sync.WaitGroup)
    ClockStop()

    Validate() []string
    GetMonitor() *AbstractMonitor
    Describe() []string
    // contains filtered or unexported methods
}
type TCPMonitor
type TCPMonitor struct {
    AbstractMonitor `mapstructure:",squash"`
    Port            string
}
TCPMonitor struct