slotalk

command module

Version: v0.0.4 (latest) | Published: Jun 7, 2023 | License: MIT | Imports: 6 | Imported by: 0

Repository

github.com/tfadeyi/slotalk

README

Slotalk

⚠ The tool is not yet ready for production use.

Slotalk is a CLI tool that allows developers to embed Sloth SLO/SLI specifications as in-code annotations rather than in a YAML file.

Similar to what Swaggo does for Swagger docs, Slotalk moves the SLO/SLI specification closer to where the relevant Prometheus metric is defined.

Slotalk can be used in tandem with the Sloth CLI to generate Prometheus alert groups from the in-code annotations. These can be used in any Prometheus/Grafana monitoring system to keep track of the service's SLOs. See examples below.

Motivation

Experimentation: this was the main motivation behind development, trying out libraries such as go/ast, wazero, and participle.
Developer experience: finding ways to improve the developer experience around platform engineering concepts like SLIs and SLOs. I want to see whether moving these concepts closer to developers makes them less of an afterthought.
More experimentation: many of the cloud native tools I've seen and worked on are targeted at DevOps/SecOps and platform engineering personas, so I wanted to try my hand at building something for developers.
Trying ways to avoid writing YAML...

Prerequisites

Sloth CLI (optional)
Go
Nix (optional)

Try it!

Nix

Generate Prometheus SLO alert rules from an example metrics.go.

# creates a nix demo shell with slotalk and sloth. just follow the shell instructions
nix develop github:tfadeyi/slotalk#demo

Source

Generate Prometheus SLO alert rules from an example metrics.go.

Install Slotalk

# install the latest version of slotalk
go install github.com/tfadeyi/slotalk@latest
# install the latest version of sloth
go install github.com/slok/sloth/cmd/sloth@latest

Run slotalk and sloth to generate Prometheus alert rules from code annotations.

curl https://gist.githubusercontent.com/tfadeyi/df60aebd858d1c76428c045d4df7b114/raw/dfb96773dfb64086280845b9a0776012cbd7d26b/metrics.go > metrics.go
cat metrics.go | slotalk init -f - > ./sloth_defs.yaml
sloth generate -i ./sloth_defs.yaml -o ./rules.yml

You should now have Prometheus alerting rules that can be added to your Prometheus configuration.

Prometheus configuration

# my global config
global:
  scrape_interval: 5s # Set the scrape interval to every 5 seconds. Default is every 1 minute.
  evaluation_interval: 5s # Evaluate rules every 5 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's the exporter listening on localhost:9301.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "exporter"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9301"]

Installation

Go install

# install the latest version of slotalk
go install github.com/tfadeyi/slotalk@latest

Nix

nix run github:tfadeyi/slotalk

Pre-released binaries

Download a pre-compiled binary from the release page.

curl -LJO https://github.com/tfadeyi/slotalk/releases/download/v0.0.3/slotalk-linux-amd64.tar.gz && \
tar -xzvf slotalk-linux-amd64.tar.gz && \
cd slotalk-linux-amd64

Docker

docker pull ghcr.io/tfadeyi/slotalk:latest

Get Started

Add comments to your source code. See Declarative Comments.
Run slotalk init in the project's root. This parses your source code annotations and prints the Sloth definitions to standard output.
```
./slotalk init
```
You can also parse a specific file by using the -f flag.
```
./slotalk init -f metrics.go
```
Alternatively, pipe the input file to slotalk and pass - as the file name.
```
cat metrics.go | ./slotalk init -f -
```
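
To give a feel for what the parsing step involves, here is a minimal sketch that collects @sloth comment lines from Go source using the standard go/parser package. It is only an illustration: the extractAnnotations function and the sample source are hypothetical and are not taken from slotalk's actual implementation.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strings"
)

// extractAnnotations parses Go source and returns every line comment that
// starts with "@sloth", in source order. Hypothetical helper for illustration.
func extractAnnotations(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "metrics.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, group := range file.Comments {
		for _, c := range group.List {
			// c.Text still carries the leading "//" marker.
			line := strings.TrimSpace(strings.TrimPrefix(c.Text, "//"))
			if strings.HasPrefix(line, "@sloth") {
				out = append(out, line)
			}
		}
	}
	return out, nil
}

// sample is a made-up source file used only for this sketch.
const sample = `package metrics

// @sloth service chatgpt
var (
	// @sloth.slo name chat-gpt-availability
	// @sloth.slo objective 95.0
	// unrelated comment
	requests int
)
`

func main() {
	annotations, err := extractAnnotations(sample)
	if err != nil {
		panic(err)
	}
	for _, a := range annotations {
		fmt.Println(a)
	}
}
```

The sketch keeps the three @sloth lines and skips the unrelated comment. A real parser also has to turn each line into a structured definition, which is left out here.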

CLI usage

Usage:
  slotalk init [flags]

Flags:
      --dirs strings     Comma separated list of directories to be parses by the tool (default [/home/jetstack-oluwole/go/src/github.com/tfadeyi/slotalk])
  -f, --file string      Source file to parse.
      --format strings   Output format (yaml,json). (default [yaml])
  -h, --help             help for init
      --lang string      Language of the source files. (go) (default "go")

Global Flags:
      --log-level string   Only log messages with the given severity or above. One of: [none, debug, info, warn], errors will always be printed (default "info")

Declarative Comments (Sloth)

The Sloth definitions are added through declarative comments, as shown below.

// @sloth service chatgpt
// @sloth.slo name chat-gpt-availability
// @sloth.slo objective 95.0
// @sloth.sli error_query sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[{{.window}}])) OR on() vector(0)
// @sloth.sli total_query sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))
// @sloth.slo description 95% of logins to the chat-gpt app should be successful.
// @sloth.alerting name ChatGPTAvailability
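
The {{.window}} placeholder in the queries is filled in once per time window when the alert rules are generated. The sketch below reproduces that substitution with Go's text/template, on the assumption that the placeholder behaves like a regular template field. renderQuery is a hypothetical helper, not part of slotalk or Sloth.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderQuery expands the {{.window}} placeholder of an SLI query for a
// single time window such as "5m". Hypothetical helper for illustration.
func renderQuery(query, window string) (string, error) {
	tpl, err := template.New("sli").Parse(query)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tpl.Execute(&sb, map[string]string{"window": window}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	query := `sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))`
	for _, window := range []string{"5m", "30m", "1h"} {
		rendered, err := renderQuery(query, window)
		if err != nil {
			panic(err)
		}
		fmt.Println(rendered)
	}
}
```

Single braces, as in the {client="chat-gpt"} label matcher, are left untouched by the template engine. Only the double-brace placeholder is replaced.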

Service definitions

annotation	description	example
service	Required. The name of the service the definitions refer to.	@sloth service chat-gpt
version	The version of the Sloth specification.	@sloth version prometheus/v1
labels	The labels associated with the Sloth service.	@sloth labels foo bar @sloth labels test slo

SLO definitions

annotation	description	example
name	Required. The name of the SLO.	@sloth.slo name availability
objective	Required. The target of the SLO as a percentage in the range (0, 100] (e.g. 99.9).	@sloth.slo objective 95.0
description	The description of the SLO. Can span multiple lines.	@sloth.slo description 95% of logins to the chat-gpt app should be successful.
labels	The Prometheus labels added to all the recording and alerting rules for this specific SLO. These labels are merged with the previous level (service) labels.	@sloth.slo labels foo bar @sloth labels test slo
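
The objective range (0, 100] excludes 0 and includes 100. A minimal sketch of that check is shown below. parseObjective is a hypothetical helper, and slotalk itself may validate the value differently or not at all.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseObjective converts the value of an "@sloth.slo objective" annotation
// into a float and checks that it lies in the range (0, 100].
// Hypothetical helper for illustration.
func parseObjective(value string) (float64, error) {
	objective, err := strconv.ParseFloat(value, 64)
	if err != nil {
		return 0, fmt.Errorf("objective %q is not a number: %w", value, err)
	}
	// Written as a negated condition so that NaN is rejected as well.
	if !(objective > 0 && objective <= 100) {
		return 0, fmt.Errorf("objective %v is outside (0, 100]", objective)
	}
	return objective, nil
}

func main() {
	for _, value := range []string{"95.0", "100", "0", "100.1", "abc"} {
		objective, err := parseObjective(value)
		if err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("valid:", objective)
	}
}
```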

Alerting definitions

annotation	description	example
name	Required. Name is the name used by the alerts generated for this SLO.	@sloth.alerting name ChatGPTAvailability
labels	The Prometheus labels added to all the alerts generated by this SLO.	@sloth.alerting labels severity critical
annotations	The Prometheus annotations added to all the alerts generated by this SLO.	@sloth.alerting annotations runbook: "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"

Page Alerting definitions

annotation	description	example
labels	The Prometheus labels for the page alert. For example, they can be used to route the page alert to a specific Slack channel.	@sloth.alerting.page labels severity critical
annotations	Annotations are the Prometheus annotations for the specific alert.	@sloth.alerting.page annotations tier application

Ticket Alerting definitions

annotation	description	example
labels	The Prometheus labels for the ticket alert. For example, they can be used to route the ticket alert to a specific Slack channel.	@sloth.alerting.ticket labels severity critical
annotations	Annotations are the Prometheus annotations for the specific alert.	@sloth.alerting.ticket annotations tier application

Examples

Basic usage - Generate Sloth definitions using go:generate

The following example shows how to use go:generate to generate Sloth definitions from in-code annotations.

metrics.go

    // @sloth service chatgpt
    var (
        // @sloth.slo name chat-gpt-availability
        // @sloth.slo objective 95.0
        // @sloth.sli error_query sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[{{.window}}])) OR on() vector(0)
        // @sloth.sli total_query sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))
        // @sloth.slo description 95% of logins to the chat-gpt app should be successful.
        // @sloth.alerting name ChatGPTAvailability
        metricGaugeCertInventoryProcessingMessages = prometheus.NewGauge(
            prometheus.GaugeOpts{
                Namespace: "chatgpt",
                Subsystem: "auth0",
                Name:      "tenant_login_operations_total",
            })
        tenantFailedLogins = prometheus.NewCounter(
            prometheus.CounterOpts{
                Namespace: "chatgpt",
                Subsystem: "auth0",
                Name:      "tenant_failed_login_operations_total",
            })
    )

main.go

//go:generate slotalk init

package main

import (
)

// @sloth service chatgpt
func main() {
}

Running go generate makes slotalk walk through the different packages, parse the in-code annotations, and generate the Sloth definitions.

go generate ./...

Resulting Sloth definitions.

# Code generated by slotalk: https://github.com/tfadeyi/slotalk.
# DO NOT EDIT.
version: prometheus/v1
service: chatgpt
slos:
    - name: chat-gpt-availability
      description: 95% of logins to the chat-gpt app should be successful.
      objective: 95
      sli:
        events:
            error_query: sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[{{.window}}])) OR on() vector(0)
            total_query: sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))
      alerting:
        name: ChatGPTAvailability

Basic usage - Generate Prometheus alert groups from code annotations

This example shows how Sloth comments can be added next to the Prometheus metrics defined in a metrics.go file.

    // @sloth service chatgpt
    
    var (
        // @sloth.slo name chat-gpt-availability
        // @sloth.slo objective 95.0
        // @sloth.sli error_query sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[{{.window}}])) OR on() vector(0)
        // @sloth.sli total_query sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))
        // @sloth.slo description 95% of logins to the chat-gpt app should be successful.
        // @sloth.alerting name ChatGPTAvailability
        metricGaugeCertInventoryProcessingMessages = prometheus.NewGauge(
            prometheus.GaugeOpts{
                Namespace: "chatgpt",
                Subsystem: "auth0",
                Name:      "tenant_login_operations_total",
            })
        tenantFailedLogins = prometheus.NewCounter(
            prometheus.CounterOpts{
                Namespace: "chatgpt",
                Subsystem: "auth0",
                Name:      "tenant_failed_login_operations_total",
            })
    )

Now run the following command from the root of the project.

./slotalk init

This prints the following Sloth definitions to standard output.

version: prometheus/v1
service: "chatgpt"
slos:
    - name: chat-gpt-availability
      description: 95% of logins to the chat-gpt app should be successful.
      objective: 95
      sli:
        events:
            error_query: sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[{{.window}}])) OR on() vector(0)
            total_query: sum(rate(tenant_login_operations_total{client="chat-gpt"}[{{.window}}]))
      alerting:
        name: "ChatGPTAvailability"

This specification can then be passed to the Sloth CLI to generate Prometheus alert groups.

./slotalk init > sloth_defs.yaml && sloth generate -i sloth_defs.yaml

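Resulting alert groups. In this sample output the service is foo, the SLO carries the label foo: bar, and the alerts are named K8sApiserverAvailabilityAlert, so these values differ from the annotations above.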

# Code generated by Sloth (v0.11.0): https://github.com/slok/sloth.
# DO NOT EDIT.

groups:
- name: sloth-slo-sli-recordings-foo-chat-gpt-availability
  rules:
  - record: slo:sli_error:ratio_rate5m
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[5m])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[5m])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 5m
  - record: slo:sli_error:ratio_rate30m
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[30m])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[30m])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 30m
  - record: slo:sli_error:ratio_rate1h
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[1h])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[1h])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 1h
  - record: slo:sli_error:ratio_rate2h
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[2h])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[2h])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 2h
  - record: slo:sli_error:ratio_rate6h
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[6h])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[6h])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 6h
  - record: slo:sli_error:ratio_rate1d
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[1d])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[1d])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 1d
  - record: slo:sli_error:ratio_rate3d
    expr: |
      (sum(rate(tenant_failed_login_operations_total{client="chat-gpt"}[3d])) OR on() vector(0))
      /
      (sum(rate(tenant_login_operations_total{client="chat-gpt"}[3d])))
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 3d
  - record: slo:sli_error:ratio_rate30d
    expr: |
      sum_over_time(slo:sli_error:ratio_rate5m{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}[30d])
      / ignoring (sloth_window)
      count_over_time(slo:sli_error:ratio_rate5m{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}[30d])
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_window: 30d
- name: sloth-slo-meta-recordings-foo-chat-gpt-availability
  rules:
  - record: slo:objective:ratio
    expr: vector(0.95)
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: slo:error_budget:ratio
    expr: vector(1-0.95)
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: slo:time_period:days
    expr: vector(30)
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: slo:current_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate5m{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}
      / on(sloth_id, sloth_slo, sloth_service) group_left
      slo:error_budget:ratio{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: slo:period_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate30d{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}
      / on(sloth_id, sloth_slo, sloth_service) group_left
      slo:error_budget:ratio{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"}
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: slo:period_error_budget_remaining:ratio
    expr: 1 - slo:period_burn_rate:ratio{sloth_id="foo-chat-gpt-availability", sloth_service="foo",
      sloth_slo="chat-gpt-availability"}
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_service: foo
      sloth_slo: chat-gpt-availability
  - record: sloth_slo_info
    expr: vector(1)
    labels:
      foo: bar
      sloth_id: foo-chat-gpt-availability
      sloth_mode: cli-gen-prom
      sloth_objective: "95"
      sloth_service: foo
      sloth_slo: chat-gpt-availability
      sloth_spec: prometheus/v1
      sloth_version: v0.11.0
- name: sloth-slo-alerts-foo-chat-gpt-availability
  rules:
  - alert: K8sApiserverAvailabilityAlert
    expr: |
      (
          max(slo:sli_error:ratio_rate5m{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (14.4 * 0.05)) without (sloth_window)
          and
          max(slo:sli_error:ratio_rate1h{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (14.4 * 0.05)) without (sloth_window)
      )
      or
      (
          max(slo:sli_error:ratio_rate30m{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (6 * 0.05)) without (sloth_window)
          and
          max(slo:sli_error:ratio_rate6h{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (6 * 0.05)) without (sloth_window)
      )
    labels:
      sloth_severity: page
    annotations:
      summary: '{{$labels.sloth_service}} {{$labels.sloth_slo}} SLO error budget burn
        rate is over expected.'
      title: (page) {{$labels.sloth_service}} {{$labels.sloth_slo}} SLO error budget
        burn rate is too fast.
  - alert: K8sApiserverAvailabilityAlert
    expr: |
      (
          max(slo:sli_error:ratio_rate2h{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (3 * 0.05)) without (sloth_window)
          and
          max(slo:sli_error:ratio_rate1d{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (3 * 0.05)) without (sloth_window)
      )
      or
      (
          max(slo:sli_error:ratio_rate6h{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (1 * 0.05)) without (sloth_window)
          and
          max(slo:sli_error:ratio_rate3d{sloth_id="foo-chat-gpt-availability", sloth_service="foo", sloth_slo="chat-gpt-availability"} > (1 * 0.05)) without (sloth_window)
      )
    labels:
      sloth_severity: ticket
    annotations:
      summary: '{{$labels.sloth_service}} {{$labels.sloth_slo}} SLO error budget burn
        rate is over expected.'
      title: (ticket) {{$labels.sloth_service}} {{$labels.sloth_slo}} SLO error budget
        burn rate is too fast.
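
The thresholds in the alert expressions above, such as (14.4 * 0.05), are a burn-rate factor multiplied by the error budget, which is 1 - 0.95 for a 95% objective. A small sketch of that arithmetic follows; alertThreshold is a hypothetical helper, and the factors are simply the ones that appear in the generated rules.

```go
package main

import "fmt"

// alertThreshold returns the error ratio above which a burn-rate condition
// holds: the burn-rate factor times the error budget (1 - objective).
// Hypothetical helper for illustration.
func alertThreshold(factor, objectivePercent float64) float64 {
	errorBudget := 1 - objectivePercent/100
	return factor * errorBudget
}

func main() {
	// Factors taken from the page (14.4, 6) and ticket (3, 1) alerts above.
	for _, factor := range []float64{14.4, 6, 3, 1} {
		fmt.Printf("factor %4.1f -> error ratio threshold %.3f\n",
			factor, alertThreshold(factor, 95))
	}
}
```

For the 95% objective this gives thresholds of roughly 0.72, 0.30, 0.15 and 0.05, so the page alert only fires when the error ratio is far above the ticket alert's level.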

License

MIT, see LICENSE.md.

Directories

Path	Synopsis
cmd
options	Package options handles the different options the binary commands has.
options/common	Package common contains the different common options across the different commands.
options/init	Package spec contains the different options present under the spec generation command.
internal
generate	Package generate contains utilities to generate data from a given specification
logging	Package logging uses the logr.Logger interface to integrate with different logging implementation for structured logging
parser
parser/lang
parser/options	Package options contains the different options available for the Parser struct
parser/specification
parser/specification/sloth
parser/specification/sloth/grammar	Package grammar contains the grammar rules and lexer related to sloth
parser/specification/sloth/language
parser/specification/sloth/language/golang
version	Package version, returns the build info of the binary
