metric-explorer

command module
v0.0.0-...-7a175ec
Published: Feb 5, 2024 License: MIT Imports: 1 Imported by: 0

README

metric-explorer

As metrics carry more granular information, high cardinality is becoming an increasingly common problem. We talk about cardinality often, but we lack good ways to explore a time-series database for the information needed to control or reduce it.

Read more on what is high cardinality.

Introducing metric-explorer. It provides three modes of operation:

  1. System (system)
  2. Explore (explore)
  3. Cardinality Calculator (cc)

metric-explorer is compatible with:

  • VictoriaMetrics
  • Prometheus (WIP)

Build the tool from source

git clone git@github.com:pree-dew/metric-explorer.git
cd metric-explorer
make my_binary
./bin/metric-explorer --help
A tool that helps answer: I have detected high cardinality; what to do next?

It provides the capability to make decisions on how to control cardinality.
It supports three modes:
1. System(system): To get system-wide information about cardinality.
2. Explore (explore): To know more about specific metrics.
3. Cardinality Control(cc): To decide to control cardinality

Usage:
  metric-explorer [command]

Available Commands:
  cc          To understand the cardinality distribution of a metric
  completion  Generate the autocompletion script for the specified shell
  explore     Provide metrics information
  help        Help about any command
  system      Overview of your TSDB coverage

Flags:
      --config string   config file (default is $HOME/.metric_explorer.yaml)
  -h, --help            help for metric_explorer
  -v, --version         version for metric_explorer

Use "metric-explorer [command] --help" for more information about a command.
System Mode:

Everything begins with identifying the spread. When a database holds many metrics, it's essential to identify the most influential metrics in the system and their usage limits.

./bin/metric-explorer system --help

Use Case 1: Find the top N metrics by cardinality

./bin/metric-explorer system --config example/sample.yaml --cardinality --dump-as=table
╭───────────────────────────────────────┬─────────────╮
│ METRIC                                │ CARDINALITY │
├───────────────────────────────────────┼─────────────┤
│ http_request_total                    │         111 │
│ scrape_duration_seconds               │           1 │
│ scrape_samples_post_metric_relabeling │           1 │
│ scrape_samples_scraped                │           1 │
│ scrape_series_added                   │           1 │
│ scrape_timeout_seconds                │           1 │
│ up                                    │           1 │
╰───────────────────────────────────────┴─────────────╯

Use Case 2: Find my top N queries in decreasing order of average running time over the past x seconds

./bin/metric-explorer system --config example/sample.yaml --top-queries --topN=3 --top-query-max-lifetime=300 --dump-as=table
Explore Mode:

After identifying the troublesome metrics and queries, it's essential to understand more about a specific metric to narrow down the kind of problem: cardinality, resource crunch, reset counts, loss of signal, sparseness, etc.

./bin/metric-explorer explore --help

Use Case 1: Find the cardinality distribution of a specific metric, along with its label values in decreasing order of cardinality contribution

./bin/metric-explorer explore http_request_total --config example/sample.yaml --cardinality --label-count=5 --dump-as=table
╭───────────────────┬────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ METRIC            │ HTTP_REQUEST_TOTAL │                                                                                                                                                       │
│ CARDINALITY       │                111 │                                                                                                                                                       │
│ LABEL             │       UNIQUE VALUE │ LABEL VALUES                                                                                                                                          │
├───────────────────┼────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ endpoint          │                  8 │ /api/v1/label/env/values,/api/v1/label/scrape_job/values,/api/v1/label/service_name/values,/api/v1/label/handler/values,/api/v1/label/instance/values │
│ status_code       │                  5 │ 422,500,400,200,503                                                                                                                                   │
│ method            │                  3 │ PUT,GET,POST                                                                                                                                          │
│ __name__          │                  1 │ http_request_total                                                                                                                                    │
│ instance          │                  1 │ localhost:9100                                                                                                                                        │
│ job               │                  1 │ vmagent-01                                                                                                                                            │
│ exported_instance │                  9 │ 10.16.130.145:9100,10.16.128.122:8482,10.16.131.243:9153,10.16.131.183:8482,10.16.128.129:9100                                                        │
│ host              │                  9 │ 10.16.130.145,10.16.128.122,10.16.131.243,10.16.131.183,10.16.128.129                                                                                 │
╰───────────────────┴────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Use Case 2: Find the last loss of signal for a metric (supports both counters and gauges)

./bin/metric-explorer explore http_request_total --config example/sample.yaml --loss=360000 --lag=1800

Use Case 3: Find the sparseness % of a metric. Note: Sparseness here means the metric is absent.

./bin/metric-explorer explore http_request_total --config example/sample.yaml --sparse=360000 --lag=1800

Use Case 4: Find the number of resets that happened on a counter metric in the past x seconds

./bin/metric-explorer explore http_request_total --config example/sample.yaml --reset-counts --lag=1800
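As a rough illustration of what a reset count means, the sketch below treats any decrease between consecutive samples of a counter as one reset, which is the usual Prometheus-style convention. It is not taken from the tool's source; the function name countResets and the sample values are hypothetical, and the tool may compute resets differently.

```go
package main

import "fmt"

// countResets returns how many times the value decreased between two
// consecutive samples. A counter should only go up, so every decrease
// is assumed to be a reset (for example, a process restart).
func countResets(samples []float64) int {
	resets := 0
	for i := 1; i < len(samples); i++ {
		if samples[i] < samples[i-1] {
			resets++
		}
	}
	return resets
}

func main() {
	// Hypothetical counter samples, oldest first.
	samples := []float64{10, 25, 40, 3, 12, 30, 1, 5}
	fmt.Println("resets:", countResets(samples)) // drops at 40->3 and 30->1
}
```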

Use Case 5: Find churn rate, ingestion rate, sample received, active time series, scrape interval of a metric

./bin/metric-explorer explore http_request_total --config example/sample.yaml --active-timeseries --scrape-interval --ingestion-rate --sample-received --churn-rate --lag=1800
Cardinality Calculator Mode:

After finding that cardinality is the problem, we have to investigate which labels are the culprits and whether dropping a few of them would bring it under control. This information is hard to obtain for a very high cardinality metric. Moreover, the way cardinality has been estimated so far, as the cartesian product of the unique value counts of all labels, is not the right way to think about it, because labels are often correlated and many combinations never occur.
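To make the cartesian-product point concrete, here is a minimal Go sketch comparing the product of unique label value counts with the number of label combinations that actually exist. It is an illustration only, not the tool's implementation; the Series type, the helper names, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Series is one time series, represented as a label name -> value map.
type Series map[string]string

// cartesianUpperBound multiplies the number of unique values of each label.
// This is the naive estimate: it assumes every combination exists.
func cartesianUpperBound(series []Series) int {
	unique := map[string]map[string]struct{}{}
	for _, s := range series {
		for name, value := range s {
			if unique[name] == nil {
				unique[name] = map[string]struct{}{}
			}
			unique[name][value] = struct{}{}
		}
	}
	product := 1
	for _, values := range unique {
		product *= len(values)
	}
	return product
}

// key builds a canonical string for a label set so it can be used as a map key.
func key(s Series) string {
	names := make([]string, 0, len(s))
	for name := range s {
		names = append(names, name)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%q=%q,", name, s[name])
	}
	return b.String()
}

// actualCardinality counts the distinct label combinations that really occur.
func actualCardinality(series []Series) int {
	seen := map[string]struct{}{}
	for _, s := range series {
		seen[key(s)] = struct{}{}
	}
	return len(seen)
}

func main() {
	// host and exported_instance always change together (a 1:1 relation).
	series := []Series{
		{"host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "GET"},
		{"host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "POST"},
		{"host": "10.0.0.2", "exported_instance": "10.0.0.2:9100", "method": "GET"},
	}
	fmt.Println("cartesian upper bound:", cartesianUpperBound(series)) // 2 * 2 * 2
	fmt.Println("actual cardinality:", actualCardinality(series))
}
```

In this made-up data the naive product is 8 while only 3 series exist, which is why a per-label contribution is more informative than the product.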

./bin/metric-explorer cc --help

Use Case 1: Get the cardinality contribution of each label

Note: When a metric's cardinality exceeds --allowed-cardinality-limit, the tool automatically creates a relative query; you can also control this relative behavior with the --filter-label flag.

./bin/metric-explorer cc http_request_total --config example/sample.yaml --allowed-cardinality-limit=30000 --dump-as=table
╭───────────────────┬────────────────────┬───────────────╮
│ METRIC            │ HTTP_REQUEST_TOTAL │               │
│ CARDINALITY       │                111 │               │
│ LABEL             │       UNIQUE VALUE │ CARDINALITY % │
├───────────────────┼────────────────────┼───────────────┤
│ exported_instance │                  9 │            17 │
│ host              │                  9 │            17 │
│ endpoint          │                  8 │            37 │
│ status_code       │                  5 │            29 │
│ method            │                  3 │            25 │
│ job               │                  1 │            17 │
│ instance          │                  1 │            17 │
╰───────────────────┴────────────────────┴───────────────╯

Use Case 2: Define the label on which you want to check cardinality

./bin/metric-explorer cc http_request_total --config example/sample.yaml --filter-label=endpoint --allowed-cardinality-limit=100 --dump-as=table
╭───────────────────┬──────────────────────────────────────────────────────────────────┬───────────────╮
│ METRIC            │ HTTP_REQUEST_TOTAL{ENDPOINT="/API/V1/LABEL/SERVICE_NAME/VALUES"} │               │
│ CARDINALITY       │                                                               16 │               │
│ LABEL             │                                                     UNIQUE VALUE │ CARDINALITY % │
├───────────────────┼──────────────────────────────────────────────────────────────────┼───────────────┤
│ host              │                                                                8 │            18 │
│ exported_instance │                                                                8 │            18 │
│ status_code       │                                                                5 │            25 │
│ method            │                                                                3 │            18 │
│ endpoint          │                                                                1 │            18 │
│ instance          │                                                                1 │            18 │
│ job               │                                                                1 │            18 │
╰───────────────────┴──────────────────────────────────────────────────────────────────┴───────────────╯

Use Case 3: Evaluate the cardinality % of label pairs to identify whether the relation between two labels is 1:1, 1:M, or M:N. This is critical to understand, because a label in a 1:1 relation may add little to the cardinality.

./bin/metric-explorer cc http_request_total --config example/sample.yaml --filter-label=cluster --label-count=2 --dump-as=table
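The idea of a 1:1, 1:M, or M:N relation between two labels can be sketched as follows. This is a hedged illustration of the concept, not how the tool derives its cardinality %; the Series type, the function names, and the sample data are hypothetical.

```go
package main

import "fmt"

// Series is one time series, represented as a label name -> value map.
type Series map[string]string

// fanout records, for every value of label "from", the set of values of
// label "to" that occur together with it.
func fanout(series []Series, from, to string) map[string]map[string]struct{} {
	out := map[string]map[string]struct{}{}
	for _, s := range series {
		f, okF := s[from]
		t, okT := s[to]
		if !okF || !okT {
			continue // skip series that lack either label
		}
		if out[f] == nil {
			out[f] = map[string]struct{}{}
		}
		out[f][t] = struct{}{}
	}
	return out
}

// fansOut reports whether any value maps to more than one partner value.
func fansOut(m map[string]map[string]struct{}) bool {
	for _, partners := range m {
		if len(partners) > 1 {
			return true
		}
	}
	return false
}

// relation classifies how labels a and b co-occur across the series.
func relation(series []Series, a, b string) string {
	aMany := fansOut(fanout(series, a, b))
	bMany := fansOut(fanout(series, b, a))
	switch {
	case !aMany && !bMany:
		return "1:1"
	case aMany && !bMany:
		return "1:M"
	case !aMany && bMany:
		return "M:1"
	default:
		return "M:N"
	}
}

func main() {
	series := []Series{
		{"job": "vmagent-01", "host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "GET"},
		{"job": "vmagent-01", "host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "POST"},
		{"job": "vmagent-01", "host": "10.0.0.2", "exported_instance": "10.0.0.2:9100", "method": "GET"},
	}
	fmt.Println(relation(series, "host", "exported_instance")) // 1:1
	fmt.Println(relation(series, "job", "host"))               // 1:M
	fmt.Println(relation(series, "host", "method"))            // M:N
}
```

In a 1:1 pair, the two labels together produce exactly as many combinations as either one alone, so the second label adds no series of its own.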

Use Case 4: Find out whether dropping a label or a pair of labels will result in duplicate series

./bin/metric-explorer cc drop metric --config example/sample.yaml

To check specific labels:

./bin/metric-explorer cc drop metric --config example/sample.yaml --labels=pod --labels=host --labels=instance
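The duplicate check can be pictured with the sketch below: remove the chosen labels from every series and count how many series collapse onto an already-seen label set. This is an assumption about what "duplicates" means here, not the tool's actual code; the Series type, the function names, and the data are hypothetical, and the input series are assumed to be distinct.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Series is one time series, represented as a label name -> value map.
type Series map[string]string

// keyWithout builds a canonical key for a series, ignoring dropped labels.
func keyWithout(s Series, drop map[string]bool) string {
	names := make([]string, 0, len(s))
	for name := range s {
		if !drop[name] {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%q=%q,", name, s[name])
	}
	return b.String()
}

// duplicatesAfterDrop returns how many series would collide with another
// series once the given labels are removed.
func duplicatesAfterDrop(series []Series, labels ...string) int {
	drop := make(map[string]bool, len(labels))
	for _, l := range labels {
		drop[l] = true
	}
	seen := map[string]struct{}{}
	for _, s := range series {
		seen[keyWithout(s, drop)] = struct{}{}
	}
	return len(series) - len(seen)
}

func main() {
	series := []Series{
		{"host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "GET"},
		{"host": "10.0.0.1", "exported_instance": "10.0.0.1:9100", "method": "POST"},
		{"host": "10.0.0.2", "exported_instance": "10.0.0.2:9100", "method": "GET"},
	}
	fmt.Println(duplicatesAfterDrop(series, "exported_instance")) // safe: host still separates them
	fmt.Println(duplicatesAfterDrop(series, "method"))            // the first two series collide
}
```

A result of zero suggests the label is redundant with the remaining ones and can be dropped without merging series.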

Directories

Path Synopsis
client_golang/api
Package api provides clients for the HTTP APIs.
client_golang/api/prometheus/v1
Package v1 provides bindings to the Prometheus HTTP API v1: http://prometheus.io/docs/querying/api/
client_golang/examples/gocollector
A minimal example of how to include Prometheus instrumentation.
client_golang/examples/random
A simple example exposing fictional RPC latencies with different types of random distributions (uniform, normal, and exponential) as Prometheus metrics.
client_golang/examples/simple
A minimal example of how to include Prometheus instrumentation.
client_golang/prometheus
Package prometheus is the core instrumentation package.
client_golang/prometheus/collectors
Package collectors provides implementations of prometheus.Collector to conveniently collect process and Go-related metrics.
client_golang/prometheus/graphite
Package graphite provides a bridge to push Prometheus metrics to a Graphite server.
client_golang/prometheus/promauto
Package promauto provides alternative constructors for the fundamental Prometheus metric types and their …Vec and …Func variants.
client_golang/prometheus/promhttp
Package promhttp provides tooling around HTTP servers and clients.
client_golang/prometheus/push
Package push provides functions to push metrics to a Pushgateway.
client_golang/prometheus/testutil
Package testutil provides helpers to test code using the prometheus package of client_golang.
client_golang/prometheus/testutil/promlint
Package promlint provides a linter for Prometheus metrics.