prometheus

package
v0.0.0-...-6ade924
Published: Oct 8, 2022 License: GPL-3.0 Imports: 15 Imported by: 0

README

Prometheus endpoint monitoring with Netdata

The generic Prometheus endpoint collector gathers metrics from Prometheus endpoints that use the OpenMetrics exposition format.

  • As of v1.24, Netdata can autodetect more than 600 Prometheus endpoints, including support for Windows 10 via windows_exporter, and instantly generate new charts with the same high-granularity, per-second frequency as you expect from other collectors.

  • The full list of endpoints is available in the collector's configuration file.

  • For Kubernetes deployments, see the guide "Collecting metrics from Prometheus endpoints in Kubernetes".

Charts

Netdata will produce one or more charts for every metric collected via a Prometheus endpoint. The number of charts depends entirely on the number of exposed metrics.

For example, scraping node_exporter produces 3000+ metrics.

Configuration

Edit the go.d/prometheus.conf configuration file using edit-config from the Netdata config directory, which is typically at /etc/netdata.

cd /etc/netdata # Replace this path with your Netdata config directory
sudo ./edit-config go.d/prometheus.conf

To add a new endpoint to collect metrics from, or to change the URL that Netdata scrapes, add or edit the name and url values. Endpoints can be either local or remote, as long as they expose their metrics at the provided URL.

Here is an example with two endpoints:

jobs:
  - name: node_exporter_local
    url: http://127.0.0.1:9100/metrics

  - name: win10
    url: http://203.0.113.0:9182/metrics

Dimension algorithm

The incremental algorithm (values displayed as a rate) is used when:

  • the metric type is Counter, Histogram or Summary.
  • the metric name suffix is _total, _sum or _count.

The absolute algorithm (values displayed as is) is used in all other cases.

Use the force_absolute_algorithm configuration option to override this logic.

jobs:
  - name: node_exporter_local
    url: http://127.0.0.1:9100/metrics
    force_absolute_algorithm:
      - '*_sum'
      - '*_count'
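
The selection rules above can be sketched in Go as follows. This is only an illustration, not the collector's code: it assumes that either rule (type or suffix) alone is enough to select the incremental algorithm, and that force_absolute_algorithm patterns are simple globs that take precedence. The names chooseAlgorithm and metricType are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// metricType mirrors the Prometheus metric types named in the text.
type metricType string

const (
	typeGauge     metricType = "gauge"
	typeCounter   metricType = "counter"
	typeSummary   metricType = "summary"
	typeHistogram metricType = "histogram"
	typeUnknown   metricType = "unknown"
)

// chooseAlgorithm returns "incremental" or "absolute" for a metric.
// Assumption: forceAbsolute patterns are globs and are checked first;
// after that, either the type rule or the suffix rule selects "incremental".
func chooseAlgorithm(name string, typ metricType, forceAbsolute []string) string {
	for _, pattern := range forceAbsolute {
		// path.Match only errors on malformed patterns; treat those as no match.
		if ok, err := path.Match(pattern, name); err == nil && ok {
			return "absolute"
		}
	}
	switch typ {
	case typeCounter, typeHistogram, typeSummary:
		return "incremental"
	}
	for _, suffix := range []string{"_total", "_sum", "_count"} {
		if strings.HasSuffix(name, suffix) {
			return "incremental"
		}
	}
	return "absolute"
}

func main() {
	force := []string{"*_sum", "*_count"}
	fmt.Println(chooseAlgorithm("node_cpu_seconds_total", typeCounter, nil))
	fmt.Println(chooseAlgorithm("node_load1", typeGauge, nil))
	fmt.Println(chooseAlgorithm("http_request_duration_seconds_sum", typeUnknown, force))
}
```

With the force_absolute_algorithm list from the configuration above, the third call returns "absolute" even though the name ends in _sum.
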

Time Series Selector (filtering)

To filter out unwanted time series (metrics), use the selector configuration option.

Here is an example:

jobs:
  - name: node_exporter_local
    url: http://127.0.0.1:9100/metrics
    # (allow[0] || allow[1] || ...) && !(deny[0] || deny[1] || ...)
    selector:
      allow:
        - <PATTERN>
        - <PATTERN>
      deny:
        - <PATTERN>
        - <PATTERN>

For a description of the PATTERN syntax and more examples, see the selectors readme.
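
The allow/deny formula in the configuration comment can be read as the following Go sketch. It is a simplified illustration: it matches only metric names, uses the standard library's path.Match globs instead of the real PATTERN syntax, and follows the formula literally, so an empty allow list matches nothing here (the collector itself may behave differently). The selector type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// selector holds allow and deny glob patterns for metric names.
type selector struct {
	allow []string
	deny  []string
}

// matchAny reports whether name matches at least one pattern.
func matchAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// matches implements (allow[0] || allow[1] || ...) && !(deny[0] || deny[1] || ...).
func (s selector) matches(name string) bool {
	return matchAny(s.allow, name) && !matchAny(s.deny, name)
}

func main() {
	s := selector{
		allow: []string{"node_*"},
		deny:  []string{"node_scrape_*"},
	}
	for _, name := range []string{"node_load1", "node_scrape_collector_success", "go_goroutines"} {
		fmt.Printf("%s: %v\n", name, s.matches(name))
	}
}
```

A series is kept only if some allow pattern matches and no deny pattern matches, so deny always wins over allow.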

Time Series Grouping

This module groups time series into charts. It has built-in grouping logic based on the metric type, which you can extend with the group configuration option.

Gauge and Counter
  • One chart per metric.
  • Dimensions are label sets.
  • The limit is 50 dimensions per chart. If there are more dimensions, the chart is split into several charts.
  • Values as is.

For instance, the following time series produce 1 chart.

example_device_cur_state{name="0",type="Fan"} 0
example_device_cur_state{name="1",type="Fan"} 0
example_device_cur_state{name="10",type="Processor"} 0
example_device_cur_state{name="11",type="intel_powerclamp"} -1
example_device_cur_state{name="2",type="Fan"} 0
example_device_cur_state{name="3",type="Fan"} 0
example_device_cur_state{name="4",type="Fan"} 0
example_device_cur_state{name="5",type="Processor"} 0
example_device_cur_state{name="6",type="Processor"} 0
example_device_cur_state{name="7",type="Processor"} 0
example_device_cur_state{name="8",type="Processor"} 0
example_device_cur_state{name="9",type="Processor"} 0
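
The 50-dimension limit can be illustrated with a small Go sketch. It assumes the split is a simple consecutive chunking of the dimension list, which the text does not specify; splitDims and maxDimsPerChart are hypothetical names.

```go
package main

import "fmt"

// maxDimsPerChart is the per-chart dimension limit stated in the text.
const maxDimsPerChart = 50

// splitDims splits dimension IDs into consecutive chunks of at most limit
// entries, one chunk per chart. limit must be greater than zero.
func splitDims(dims []string, limit int) [][]string {
	var charts [][]string
	for len(dims) > limit {
		charts = append(charts, dims[:limit])
		dims = dims[limit:]
	}
	if len(dims) > 0 {
		charts = append(charts, dims)
	}
	return charts
}

func main() {
	dims := make([]string, 120)
	for i := range dims {
		dims[i] = fmt.Sprintf("dim%d", i)
	}
	for i, c := range splitDims(dims, maxDimsPerChart) {
		fmt.Printf("chart %d: %d dimensions\n", i+1, len(c))
	}
}
```

The 12 label sets in the example above fit under the limit, which is why they produce a single chart.
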

Custom Grouping (Gauge and Counter only)

To group time series, use the group configuration option.

Here is an example:

jobs:
  - name: node_exporter_local
    url: http://127.0.0.1:9100/metrics
    group:
      - selector: <PATTERN>
        by_label: <a space-separated list of label names>
      - selector: <PATTERN>
        by_label: <a space-separated list of label names>

For a description of the PATTERN syntax and more examples, see the selectors readme.

The following example configuration groups all time series whose metric name is example_device_cur_state into multiple charts by the type label. The number of charts equals the number of distinct type label values.

jobs:
  - name: node_exporter_local
    url: http://127.0.0.1:9100/metrics
    group:
      - selector: example_device_cur_state
        by_label: type 
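
As a rough Go sketch of what grouping by label means, the code below buckets series by the values of the labels listed in by_label, with one bucket per chart. The series type, groupKey, and groupByLabels are hypothetical names, and only a subset of the example series is included.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// series is a minimal stand-in for one scraped time series.
type series struct {
	name   string
	labels map[string]string
	value  float64
}

// groupKey builds a chart key from the values of the labels in byLabel,
// a space-separated list of label names as in the by_label option.
func groupKey(labels map[string]string, byLabel string) string {
	var parts []string
	for _, l := range strings.Fields(byLabel) {
		parts = append(parts, l+"="+labels[l])
	}
	return strings.Join(parts, ",")
}

// groupByLabels buckets series by group key; each bucket becomes one chart.
func groupByLabels(all []series, byLabel string) map[string][]series {
	groups := make(map[string][]series)
	for _, s := range all {
		key := groupKey(s.labels, byLabel)
		groups[key] = append(groups[key], s)
	}
	return groups
}

func main() {
	all := []series{
		{"example_device_cur_state", map[string]string{"name": "0", "type": "Fan"}, 0},
		{"example_device_cur_state", map[string]string{"name": "1", "type": "Fan"}, 0},
		{"example_device_cur_state", map[string]string{"name": "10", "type": "Processor"}, 0},
		{"example_device_cur_state", map[string]string{"name": "11", "type": "intel_powerclamp"}, -1},
	}
	groups := groupByLabels(all, "type")
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map order is random, so sort for stable output
	for _, k := range keys {
		fmt.Printf("chart %s: %d dimensions\n", k, len(groups[k]))
	}
}
```

For the full example_device_cur_state data shown earlier, this grouping yields three charts: one each for Fan, Processor, and intel_powerclamp.
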

Summary
  • A chart per time series (label set).
  • Dimensions are quantiles.
  • Values as is.

For instance, the following time series produce 2 charts.

example_duration_seconds{interval="15s",quantile="0"} 4.693e-06
example_duration_seconds{interval="15s",quantile="0.25"} 2.4383e-05
example_duration_seconds{interval="15s",quantile="0.5"} 0.00013458
example_duration_seconds{interval="15s",quantile="0.75"} 0.000195183
example_duration_seconds{interval="15s",quantile="1"} 0.005386229

example_duration_seconds{interval="30s",quantile="0"} 4.693e-06
example_duration_seconds{interval="30s",quantile="0.25"} 2.4383e-05
example_duration_seconds{interval="30s",quantile="0.5"} 0.00013458
example_duration_seconds{interval="30s",quantile="0.75"} 0.000195183
example_duration_seconds{interval="30s",quantile="1"} 0.005386229
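
One way to picture "a chart per label set" is to key each chart by every label except quantile, which becomes the dimension. The Go sketch below does that for a subset of the samples above; chartKey is a hypothetical helper and the key format is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// chartKey builds a stable key from all labels except dimLabel,
// the label whose values become the chart's dimensions.
func chartKey(labels map[string]string, dimLabel string) string {
	pairs := make([]string, 0, len(labels))
	for k, v := range labels {
		if k == dimLabel {
			continue
		}
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs) // map iteration order is random
	return strings.Join(pairs, ",")
}

func main() {
	samples := []map[string]string{
		{"interval": "15s", "quantile": "0"},
		{"interval": "15s", "quantile": "0.5"},
		{"interval": "30s", "quantile": "0"},
		{"interval": "30s", "quantile": "0.5"},
	}
	charts := make(map[string][]string) // chart key -> quantile dimensions
	for _, labels := range samples {
		key := chartKey(labels, "quantile")
		charts[key] = append(charts[key], labels["quantile"])
	}
	fmt.Println("charts:", len(charts))
}
```

The same idea applies to histograms if le is used as the dimension label instead of quantile.
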

Histogram
  • A chart per time series (label set).
  • Dimensions are le buckets.
  • Values are not shown as is, because histogram buckets are cumulative: the le="1.2" bucket also counts every observation in the le="0.3" bucket. The collector calculates the exact (non-cumulative) value for each bucket.

For instance, the following time series produce 2 charts.

example_seconds_bucket{interval="15s",le="0.1"} 0
example_seconds_bucket{interval="15s",le="0.25"} 0
example_seconds_bucket{interval="15s",le="0.5"} 0
example_seconds_bucket{interval="15s",le="1"} 0
example_seconds_bucket{interval="15s",le="2.5"} 0
example_seconds_bucket{interval="15s",le="5"} 0
example_seconds_bucket{interval="15s",le="+Inf"} 0

example_seconds_bucket{interval="30s",le="0.1"} 0
example_seconds_bucket{interval="30s",le="0.25"} 0
example_seconds_bucket{interval="30s",le="0.5"} 0
example_seconds_bucket{interval="30s",le="1"} 0
example_seconds_bucket{interval="30s",le="2.5"} 0
example_seconds_bucket{interval="30s",le="5"} 0
example_seconds_bucket{interval="30s",le="+Inf"} 0
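
The conversion from cumulative to per-bucket values can be sketched in Go as below. This assumes that "exact values" means subtracting the previous bucket's cumulative count, and it uses made-up counts, because the example series above are all zero. The bucket type and perBucket function are hypothetical.

```go
package main

import "fmt"

// bucket is one histogram bucket as exposed by an endpoint.
type bucket struct {
	le    string  // upper bound label, e.g. "0.5" or "+Inf"
	count float64 // cumulative count of observations <= le
}

// perBucket converts cumulative counts into per-bucket counts.
// The input must be ordered by ascending upper bound.
func perBucket(cumulative []bucket) []bucket {
	out := make([]bucket, len(cumulative))
	prev := 0.0
	for i, b := range cumulative {
		out[i] = bucket{le: b.le, count: b.count - prev}
		prev = b.count
	}
	return out
}

func main() {
	// Hypothetical cumulative counts, not taken from the example above.
	cumulative := []bucket{{"0.1", 2}, {"0.5", 5}, {"1", 5}, {"+Inf", 9}}
	for _, b := range perBucket(cumulative) {
		fmt.Printf("le=%s: %g\n", b.le, b.count)
	}
}
```

With these sample counts, the per-bucket values are 2, 3, 0 and 4, which sum back to the cumulative total of 9 in the +Inf bucket.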

For all available options, see the Prometheus collector's configuration file.

Troubleshooting

To troubleshoot issues with the prometheus collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m prometheus
    

Documentation

Types

type Charts

type Charts = module.Charts

type Config

type Config struct {
	web.HTTP               `yaml:",inline"`
	Name                   string        `yaml:"name"`
	Application            string        `yaml:"app"`
	BearerTokenFile        string        `yaml:"bearer_token_file"` // TODO: part of web.Request?
	MaxTS                  int           `yaml:"max_time_series"`
	MaxTSPerMetric         int           `yaml:"max_time_series_per_metric"`
	Selector               selector.Expr `yaml:"selector"`
	Grouping               []GroupOption `yaml:"group"`
	ExpectedPrefix         string        `yaml:"expected_prefix"`
	ForceAbsoluteAlgorithm []string      `yaml:"force_absolute_algorithm"`
}

type Dims

type Dims = module.Dims

type GroupOption

type GroupOption struct {
	Selector string `yaml:"selector"`
	ByLabel  string `yaml:"by_label"`
}

type Prometheus

type Prometheus struct {
	module.Base
	Config `yaml:",inline"`
	// contains filtered or unexported fields
}

func New

func New() *Prometheus

func (Prometheus) Charts

func (p Prometheus) Charts() *module.Charts

func (*Prometheus) Check

func (p *Prometheus) Check() bool

func (Prometheus) Cleanup

func (Prometheus) Cleanup()

func (*Prometheus) Collect

func (p *Prometheus) Collect() map[string]int64

func (*Prometheus) Init

func (p *Prometheus) Init() bool
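
The method set above suggests a lifecycle of Init, Check, repeated Collect calls, and Cleanup. The Go sketch below drives that sequence against a local interface so it runs without the module itself; the call order is an assumption based on the method names, and collector, fakeCollector, and run are hypothetical stand-ins rather than part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// collector lists the documented methods this sketch relies on.
type collector interface {
	Init() bool
	Check() bool
	Collect() map[string]int64
	Cleanup()
}

// fakeCollector is a hypothetical stand-in for *Prometheus.
type fakeCollector struct{ calls int }

func (f *fakeCollector) Init() bool  { return true }
func (f *fakeCollector) Check() bool { return true }
func (f *fakeCollector) Collect() map[string]int64 {
	f.calls++
	return map[string]int64{"example_metric": int64(f.calls)}
}
func (f *fakeCollector) Cleanup() {}

// run drives one assumed lifecycle: Init, Check, n Collect calls, Cleanup.
func run(c collector, iterations int) ([]map[string]int64, error) {
	if !c.Init() {
		return nil, errors.New("init failed")
	}
	defer c.Cleanup()
	if !c.Check() {
		return nil, errors.New("check failed")
	}
	var results []map[string]int64
	for i := 0; i < iterations; i++ {
		results = append(results, c.Collect())
	}
	return results, nil
}

func main() {
	results, err := run(&fakeCollector{}, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, mx := range results {
		fmt.Printf("iteration %d: %v\n", i+1, mx)
	}
}
```

In a real program, the value returned by New() would take the place of fakeCollector, with its Config fields set before Init is called.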
