chihuahua

Version: v0.0.0-...-d04f0d8
Published: Jan 3, 2021 License: MIT Imports: 19 Imported by: 0

README

chihuahua

The smallest watchdog on earth. Tiny, monitoring-plugins compatible monitoring with a status page, built with Go.

  • Run nagios/monitoring-plugins checks on a remote server
  • Alert on state change via Gotify or email
  • Show a nice little status page

Getting Started

docker run -d -p 8080:80 -v "$PWD/data:/data" momar/chihuahua
nano data/chihuahua.hcl

Chihuahua now runs at http://localhost:8080.

A full example configuration file is available at chihuahua.hcl, and the configuration manual describes every option. To get started, a simple configuration like this is enough:

check "cpu" {
  # This check will be run on every server, unless the 
  # server specifies an overriding check called "cpu".
  name = "CPU Load"
  command = "check_load -r -w 3,2,1.25 -c 4,3,2"
}

server "server-local" {
  connection = "local"
}

server "server-01" {
  connection = "ssh chihuahua@example.org"
  check "ram" {
    command = "check_memory -w 8 -c 3" # parameters in percent
  }
}

notifier "email-myself" {
  # emails are delayed by 5 minutes by default to accumulate
  # multiple notifications into a single notification email.
  type = "smtp"
  server = "user:password@smtp.example.org"
  from = "Chihuahua <chihuahua@example.org>"
  to = ["myself@example.org"]
}

Use a systemd service to install the server without Docker (unsupported)

wget https://codeberg.org/momar/chihuahua/releases/download/v1.4/chihuahua-x64.gz -O- | gunzip > /tmp/chihuahua
sudo install -m755 /tmp/chihuahua /usr/local/bin/chihuahua

sudo mkdir -p /usr/local/lib/chihuahua/.ssh
sudo useradd -d /usr/local/lib/chihuahua -M -r -s /usr/sbin/nologin chihuahua
sudo ssh-keygen -b 2048 -f /usr/local/lib/chihuahua/.ssh/id_rsa -P "" -C "Chihuahua Monitoring"
sudo nano /etc/chihuahua.yml

sudo wget https://codeberg.org/momar/chihuahua/raw/branch/master/chihuahua.service -O /etc/systemd/system/chihuahua.service
sudo systemctl enable chihuahua.service
sudo systemctl start chihuahua.service

Set up a server for connections (Debian/Ubuntu/Alpine/...)

curl -Ls http://status.example.org:8080/setup.sh | sudo sh

You can now use connection = "ssh chihuahua@example.org" to connect to your server with a limited user.

To completely remove the Chihuahua setup from your server, use the following commands:

sudo userdel -r chihuahua
sudo apt-get remove --auto-remove monitoring-plugins # or "apk del monitoring-plugins" on Alpine
sudo rm /usr/local/bin/check_sudo
sudo sed -i '/^chihuahua /d' /etc/sudoers

API

TODO: this should be documented more thoroughly - maybe provide an API Blueprint?

GET /setup.sh
GET /checks
GET /checks/:server
GET /checks/:server/:check

Development

Requires a working Go toolchain.

git clone https://codeberg.org/momar/chihuahua.git && cd chihuahua
cp resources/chihuahua.hcl .
go run ./cmd --debug

Roadmap

  • Add custom messages to checks
  • Add private/hidden checks
  • More notification providers (mainly Clockwork SMS)
  • Gitea integration
  • Provide a Prometheus exporter

Documentation

Constants

const (
	// StatusOk is the result of a check that returned with the exit code 0
	StatusOk CheckStatus = 0
	// StatusWarning is the result of a check that returned with the exit code 1
	StatusWarning CheckStatus = 1
	// StatusCritical is the result of a check that returned with the exit code 2
	StatusCritical CheckStatus = 2
	// StatusUnknown is the result of a check that returned with a different exit code or threw an error during execution
	StatusUnknown CheckStatus = 3

	// UnitNumber is the unit used for a number of things (e.g. users, processes, load averages)
	UnitNumber CheckUnit = ""
	// UnitSeconds is the unit used for an elapsed time in seconds
	UnitSeconds CheckUnit = "s"
	// UnitMilliseconds is the unit used for an elapsed time in milliseconds
	UnitMilliseconds CheckUnit = "ms"
	// UnitMicroseconds is the unit used for an elapsed time in microseconds
	UnitMicroseconds CheckUnit = "us"
	// UnitPercentage is the unit used for a percentage, normally between 0 and 100
	UnitPercentage CheckUnit = "%"
	// UnitBytes is the unit used for data sizes in bytes
	UnitBytes CheckUnit = "B"
	// UnitKilobytes is the unit used for data sizes in kilobytes
	UnitKilobytes CheckUnit = "KB"
	// UnitMegabytes is the unit used for data sizes in megabytes
	UnitMegabytes CheckUnit = "MB"
	// UnitGigabytes is the unit used for data sizes in gigabytes
	UnitGigabytes CheckUnit = "GB"
	// UnitTerabytes is the unit used for data sizes in terabytes
	UnitTerabytes CheckUnit = "TB"
	// UnitCounter is the unit used for a continuous counter (such as bytes transmitted on an interface)
	UnitCounter CheckUnit = "c"
)
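
The status constants follow the exit-code convention described in their comments. As a rough illustration, the mapping could be written as below. The types are redeclared locally so the sketch compiles on its own, and statusFromExitCode is a hypothetical helper, not a function of this package.

```go
package main

import "fmt"

// CheckStatus mirrors the package type; redeclared here for a standalone sketch.
type CheckStatus int

const (
	StatusOk       CheckStatus = 0
	StatusWarning  CheckStatus = 1
	StatusCritical CheckStatus = 2
	StatusUnknown  CheckStatus = 3
)

// statusFromExitCode (hypothetical) maps a plugin exit code to a status:
// 0, 1 and 2 map directly, every other code is treated as unknown.
func statusFromExitCode(code int) CheckStatus {
	switch code {
	case 0:
		return StatusOk
	case 1:
		return StatusWarning
	case 2:
		return StatusCritical
	default:
		return StatusUnknown
	}
}

func main() {
	for _, code := range []int{0, 1, 2, 3, 127} {
		fmt.Printf("exit code %d -> status %d\n", code, statusFromExitCode(code))
	}
}
```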

Variables

var ConnectionTimeout = 30 * time.Second
var Notifiers = map[string]Notifier{}

Notifiers is a map of registered notification providers

var Working = sync.RWMutex{}

Functions

func Api

func Api(cfg *Config)

func GenerateKeys

func GenerateKeys()

func GetCheck

func GetCheck(cfg *Config) func(c *gin.Context)

GetCheck returns the check results

func RunOnce

func RunOnce(cfg *Config) []error

func Schedule

func Schedule(cfg *Config)

func SendUpdate

func SendUpdate(cfg *Config)

func UpdateKeys

func UpdateKeys()

Types

type Check

type Check struct {
	ID   string
	Name string

	// Command is the check command line to run inside the shell (e.g. `/usr/lib/monitoring-plugins/check_ping -H 8.8.8.8 -w 100,25% -c 200,50%`)
	Command string `json:"-"`

	Disable   bool
	Notifiers []string      `json:"-"`
	Verify    uint          `json:"-"`
	Interval  time.Duration `json:"-"`

	Result CheckResult

	Parent  *ServerOrGroup `json:"-"` // would lead to a loop if exposed to JSON!
	Job     *scheduler.Job `json:"-"`
	JobLock sync.Mutex     `json:"-"`
}

Check describes a command that shall be run in a specific shell and, if the check has already been run, the result of that command interpreted according to the monitoring-plugins guidelines (https://www.monitoring-plugins.org/doc/guidelines.html).

func (*Check) FullID

func (chk *Check) FullID() []string

func (*Check) FullName

func (chk *Check) FullName() []string

func (*Check) Notify

func (chk *Check) Notify(cfg *Config, previousState CheckResult)

func (*Check) Run

func (chk *Check) Run(cfg *Config)

func (*Check) Schedule

func (chk *Check) Schedule(cfg *Config)

If a check takes longer than its interval, the next occurrence is skipped.
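
One common way to get this skip-on-overlap behaviour is a non-blocking lock attempt. The sketch below is an assumption about how such skipping can be done, not a description of this package's scheduler; skipRunner and tryRun are hypothetical names, and sync.Mutex.TryLock requires Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"sync"
)

// skipRunner (hypothetical) guards a job so that overlapping runs are skipped.
type skipRunner struct {
	mu sync.Mutex
}

// tryRun executes fn unless a previous run is still in progress.
// It reports whether fn was executed.
func (r *skipRunner) tryRun(fn func()) bool {
	if !r.mu.TryLock() {
		return false // previous run still active: skip this occurrence
	}
	defer r.mu.Unlock()
	fn()
	return true
}

func main() {
	var r skipRunner
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan bool)

	// Simulate a slow check that is still running when the next one is due.
	go func() {
		done <- r.tryRun(func() {
			close(started)
			<-release
		})
	}()

	<-started
	fmt.Println("overlapping run executed:", r.tryRun(func() {}))
	close(release)
	fmt.Println("first run executed:", <-done)
	fmt.Println("later run executed:", r.tryRun(func() {}))
}
```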

type CheckPerformance

type CheckPerformance struct {
	// Unit is the unit of measurement (UOM) or the part
	Unit CheckUnit
	// Value is the current value, or NaN if the actual value couldn't be determined (UOM "U" or parsing issues (which additionally cause a warning))
	Value float64

	// Min is the smallest possible value, or NaN if it does not apply or in the case of parsing issues (which additionally cause a warning)
	Min float64
	// Max is the biggest possible value, or NaN if it does not apply or in the case of parsing issues (which additionally cause a warning)
	Max float64

	// Warning is the range definition that will result in a warning alert, or nil if it does not apply or in the case of parsing issues (which additionally cause a warning)
	Warning CheckRange
	// Critical is the range definition that will result in a critical alert, or nil if it does not apply or in the case of parsing issues (which additionally cause a warning)
	Critical CheckRange
}

CheckPerformance describes a performance data part of a completed check

type CheckRange

type CheckRange struct {
	// Start is the lower bound of the value (will send an alert if the actual value is smaller), or -Inf if it does not apply
	Start float64
	// End is the upper bound of the value (will send an alert if the actual value is bigger), or Inf if it does not apply
	End float64
	// Inside changes the behaviour (if set to true) to send an alert if the actual value is BIGGER than Start AND SMALLER than End
	Inside bool
}

CheckRange describes a range for warning and critical values for a performance data part of a completed check
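
The field comments above translate into a small predicate. This is a sketch that follows those comments literally; alerts is a hypothetical helper, the struct is redeclared so the example is standalone, and treating values exactly on a bound as non-alerting is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// CheckRange mirrors the package type; redeclared here for a standalone sketch.
type CheckRange struct {
	Start  float64
	End    float64
	Inside bool
}

// alerts (hypothetical) reports whether value v triggers an alert for range r.
func alerts(r CheckRange, v float64) bool {
	if r.Inside {
		// Alert when the value lies strictly between Start and End.
		return v > r.Start && v < r.End
	}
	// Alert when the value falls below Start or above End.
	return v < r.Start || v > r.End
}

func main() {
	upTo10 := CheckRange{Start: math.Inf(-1), End: 10}
	between := CheckRange{Start: 10, End: 20, Inside: true}

	fmt.Println(alerts(upTo10, 5), alerts(upTo10, 11))
	fmt.Println(alerts(between, 15), alerts(between, 25))
}
```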

type CheckResult

type CheckResult struct {
	// Status is the result of the check after it has been run
	Status CheckStatus
	// Error contains the STDERR output of the check command, and should normally be empty - if it is non-empty, it is very probable that the check couldn't be initiated correctly
	Error string
	// Details contains the STDOUT output of the check command
	Details string

	// Performance contains the performance data parts of the check, mapped to their label
	Performance map[string]CheckPerformance

	// LastUpdate is the last execution date of the check
	LastUpdate time.Time
}

type CheckStatus

type CheckStatus int

CheckStatus describes the result of a check (StatusOk, StatusWarning, StatusCritical, StatusUnknown)

func (CheckStatus) String

func (s CheckStatus) String() string

type CheckUnit

type CheckUnit string

CheckUnit describes the unit of measurement (UOM) for a check value

type Config

type Config struct {
	Servers   []*ServerOrGroup
	Notifiers map[string]NotifierConfig
	RootURL   string
	Silent    bool
}

func (*Config) Walk

func (cfg *Config) Walk(fn func(*ServerOrGroup))
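
Walk is not documented further. A plausible reading, given the Servers and Children fields, is a traversal of the server/group tree. The sketch below assumes a depth-first, parent-before-children order, which the source does not confirm; walk is a hypothetical standalone function and the types are reduced to the fields it needs.

```go
package main

import "fmt"

// Reduced copies of the package types, for a standalone sketch.
type ServerOrGroup struct {
	ID       string
	Children []*ServerOrGroup
}

type Config struct {
	Servers []*ServerOrGroup
}

// walk (hypothetical) calls fn for every server or group, visiting each
// node before its children.
func walk(cfg *Config, fn func(*ServerOrGroup)) {
	var visit func(*ServerOrGroup)
	visit = func(sog *ServerOrGroup) {
		fn(sog)
		for _, child := range sog.Children {
			visit(child)
		}
	}
	for _, s := range cfg.Servers {
		visit(s)
	}
}

func main() {
	cfg := &Config{Servers: []*ServerOrGroup{
		{ID: "group-a", Children: []*ServerOrGroup{{ID: "server-01"}, {ID: "server-02"}}},
		{ID: "server-local"},
	}}
	walk(cfg, func(sog *ServerOrGroup) { fmt.Println(sog.ID) })
}
```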

type Filter

type Filter struct {
	If     *vm.Program
	Accept *bool
}

type Notifier

type Notifier interface {
	Notify(*Config, Check, CheckResult)
}

type NotifierConfig

type NotifierConfig struct {
	Verify   int
	Filters  []Filter
	Notifier Notifier
}

func (NotifierConfig) Filter

func (c NotifierConfig) Filter(check Check, previousResult CheckResult) bool

func (NotifierConfig) Notify

func (c NotifierConfig) Notify(cfg *Config, notifier string, chk *Check, previousState CheckResult)

type NotifierWithContext

type NotifierWithContext interface {
	Notifier
	Export() interface{}
	Import(interface{})
}

type ServerOrGroup

type ServerOrGroup struct {
	ID               string
	Name             string
	ConnectionType   string           `json:",omitempty"`
	ConnectionParams string           `json:"-"`
	Checks           []*Check         `json:",omitempty"`
	Children         []*ServerOrGroup `json:",omitempty"`
	Parent           *ServerOrGroup   `json:"-"` // would lead to a loop if exposed to JSON!
}

func (*ServerOrGroup) FullID

func (sog *ServerOrGroup) FullID() []string

func (*ServerOrGroup) FullName

func (sog *ServerOrGroup) FullName() []string
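
FullID and FullName return slices, which suggests a path through the group hierarchy. The sketch below shows one way such a path could be built from the Parent pointers. This is a guess at the intent, not the package's implementation: fullID is a hypothetical function and the root-first ordering is an assumption.

```go
package main

import "fmt"

// Reduced copy of the package type, for a standalone sketch.
type ServerOrGroup struct {
	ID     string
	Parent *ServerOrGroup
}

// fullID (hypothetical) collects the IDs from the root down to sog
// by following the Parent pointers.
func fullID(sog *ServerOrGroup) []string {
	var path []string
	for n := sog; n != nil; n = n.Parent {
		path = append([]string{n.ID}, path...) // prepend, so the root comes first
	}
	return path
}

func main() {
	root := &ServerOrGroup{ID: "datacenter"}
	group := &ServerOrGroup{ID: "web", Parent: root}
	server := &ServerOrGroup{ID: "server-01", Parent: group}

	fmt.Println(fullID(server))
}
```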
