clickhouseexporter

package module
v0.112.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 22, 2024 License: Apache-2.0 Imports: 22 Imported by: 5

README

ClickHouse Exporter

Status
Stability alpha: metrics
beta: traces, logs
Distributions contrib
Issues Open issues Closed issues
Code Owners @hanjm, @dmitryax, @Frapschen, @SpencerTorres

This exporter supports sending OpenTelemetry data to ClickHouse.

ClickHouse is an open-source, high performance columnar OLAP database management system for real-time analytics using SQL. Throughput can be measured in rows per second or megabytes per second. If the data is placed in the page cache, a query that is not too complex is processed on modern hardware at a speed of approximately 2-10 GB/s of uncompressed data on a single server. If 10 bytes of columns are extracted, the speed is expected to be around 100-200 million rows per second.

Note: Always add batch-processor to collector pipeline, as ClickHouse document says:

We recommend inserting data in packets of at least 1000 rows, or no more than a single request per second. When inserting to a MergeTree table from a tab-separated dump, the insertion speed can be from 50 to 200 MB/s.

Visualization Tools

Official ClickHouse Plugin for Grafana

The official ClickHouse Datasource for Grafana contains features that integrate directly with this exporter. You can view associated logs and traces, as well as visualize other queries such as tables and time series graphs. Learn how to configure the OpenTelemetry integration here.

Altinity's ClickHouse Plugin for Grafana

If the official plugin doesn't meet your needs, you can try the Altinity plugin for ClickHouse, which also supports a wide range of features.

Logs
  • Get log severity count time series.
SELECT toDateTime(toStartOfInterval(TimestampTime, INTERVAL 60 second)) as time, SeverityText, count() as count
FROM otel_logs
WHERE time >= NOW() - INTERVAL 1 HOUR
GROUP BY SeverityText, time
ORDER BY time;
  • Find any log.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with specific service.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE ServiceName = 'clickhouse-exporter'
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with specific attribute.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE LogAttributes['container_name'] = '/example_flog_1'
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with body contain string token.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE hasToken(Body, 'http')
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with body contain string.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE Body like '%http%'
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with body regexp match string.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE match(Body, 'http')
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find log with body json extract.
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE JSONExtractFloat(Body, 'bytes') > 1000
  AND TimestampTime >= NOW() - INTERVAL 1 HOUR
Limit 100;
Traces
  • Find spans with specific attribute.
SELECT Timestamp,
       TraceId,
       SpanId,
       ParentSpanId,
       SpanName,
       SpanKind,
       ServiceName,
       Duration,
       StatusCode,
       StatusMessage,
       toString(SpanAttributes),
       toString(ResourceAttributes),
       toString(Events.Name),
       toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND SpanAttributes['peer.service'] = 'telemetrygen-server'
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find traces with traceID (using time primary index and TraceID skip index).
WITH
    '391dae938234560b16bb63f51501cb6f' as trace_id,
    (SELECT min(Start) FROM otel_traces_trace_id_ts WHERE TraceId = trace_id) as start,
    (SELECT max(End) + 1 FROM otel_traces_trace_id_ts WHERE TraceId = trace_id) as end
SELECT Timestamp,
       TraceId,
       SpanId,
       ParentSpanId,
       SpanName,
       SpanKind,
       ServiceName,
       Duration,
       StatusCode,
       StatusMessage,
       toString(SpanAttributes),
       toString(ResourceAttributes),
       toString(Events.Name),
       toString(Links.TraceId)
FROM otel_traces
WHERE TraceId = trace_id
  AND Timestamp >= start
  AND Timestamp <= end
Limit 100;
  • Find spans is error.
SELECT Timestamp,
       TraceId,
       SpanId,
       ParentSpanId,
       SpanName,
       SpanKind,
       ServiceName,
       Duration,
       StatusCode,
       StatusMessage,
       toString(SpanAttributes),
       toString(ResourceAttributes),
       toString(Events.Name),
       toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND StatusCode = 'Error'
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
Limit 100;
  • Find slow spans.
SELECT Timestamp,
       TraceId,
       SpanId,
       ParentSpanId,
       SpanName,
       SpanKind,
       ServiceName,
       Duration,
       StatusCode,
       StatusMessage,
       toString(SpanAttributes),
       toString(ResourceAttributes),
       toString(Events.Name),
       toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND Duration > 1 * 1e9
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
Limit 100;
Metrics

Metrics data is stored in different clickhouse tables depending on their types. The tables will have a suffix to distinguish which type of metrics data is stored.

Metrics Type Metrics Table
sum _sum
gauge _gauge
histogram _histogram
exponential histogram _exponential_histogram
summary _summary

Before you make a metrics query, you need to know the type of metric you wish to use. If your metrics come from Prometheus(or someone else uses OpenMetrics protocol), you also need to know the compatibility between Prometheus(OpenMetrics) and OTLP Metrics.

  • Find a sum metrics with name
select TimeUnix,MetricName,Attributes,Value from otel_metrics_sum
where MetricName='calls' limit 100
  • Find a sum metrics with name, attribute.
select TimeUnix,MetricName,Attributes,Value from otel_metrics_sum
where MetricName='calls' and Attributes['service_name']='featureflagservice'
limit 100

The OTLP Metrics define two type value for one datapoint, clickhouse only use one value of float64 to store them.

Performance Guide

A single ClickHouse instance with 32 CPU cores and 128 GB RAM can handle around 20 TB (20 Billion) logs per day, the data compression ratio is 7 ~ 11, the compressed data store in disk is 1.8 TB ~ 2.85 TB, add more clickhouse node to cluster can increase linearly.

The otel-collector with otlp receiver/batch processor/clickhouse tcp exporter can process around 40k/s logs entry per CPU cores, add more collector node can increase linearly.

Configuration options

The following settings are required:

  • endpoint (no default): The ClickHouse server address, support multi host with port, for example:
    • tcp protocol tcp://addr1:port,tcp://addr2:port or TLS tcp://addr1:port,addr2:port?secure=true
    • http protocol http://addr1:port,addr2:port or https https://addr1:port,addr2:port
    • clickhouse protocol clickhouse://addr1:port,addr2:port or TLS clickhouse://addr1:port,addr2:port?secure=true

Many other ClickHouse specific options can be configured through query parameters e.g. addr?dial_timeout=5s&compress=lz4. For a full list of options see the ClickHouse driver documentation

Connection options:

  • username (default = ): The authentication username.
  • password (default = ): The authentication password.
  • ttl (default = 0): The data time-to-live example 30m, 48h. Also, 0 means no ttl.
  • database (default = default): The database name. Overrides the database defined in endpoint when this setting is not equal to default.
  • connection_params (default = {}). Params is the extra connection parameters with map format. Query parameters provided in endpoint will be individually overwritten if present in this map.
  • create_schema (default = true): When set to true, will run DDL to create the database and tables. (See schema management)
  • compress (default = lz4): Controls the compression algorithm. Valid options: none (disabled), zstd, lz4 (default), gzip, deflate, br, true (lz4). Ignored if compress is set in the endpoint or connection_params.
  • async_insert (default = true): Enables async inserts. Ignored if async inserts are configured in the endpoint or connection_params. Async inserts may still be overridden server-side.

ClickHouse tables:

  • logs_table_name (default = otel_logs): The table name for logs.
  • traces_table_name (default = otel_traces): The table name for traces.
  • metrics_tables
    • gauge
      • name (default = "otel_metrics_gauge")
    • sum
      • name (default = "otel_metrics_sum")
    • summary
      • name (default = "otel_metrics_summary")
    • histogram
      • name (default = "otel_metrics_histogram")
    • exponential_histogram
      • name (default = "otel_metrics_exp_histogram")

Cluster definition:

  • cluster_name (default = ): Optional. If present, will include ON CLUSTER cluster_name when creating tables.

Table engine:

  • table_engine
    • name (default = MergeTree)
    • params (default = )

Modifies ENGINE definition when table is created. If not set then ENGINE defaults to MergeTree(). Can be combined with cluster_name to enable replication for fault tolerance.

Processing:

  • timeout (default = 5s): The timeout for every attempt to send data to the backend.
  • sending_queue
    • enabled (default = true)
    • num_consumers (default = 10): Number of consumers that dequeue batches; ignored if enabled is false
    • queue_size (default = 1000): Maximum number of batches kept in memory before dropping data.
  • retry_on_failure
    • enabled (default = true)
    • initial_interval (default = 5s): The Time to wait after the first failure before retrying; ignored if enabled is false
    • max_interval (default = 30s): The upper bound on backoff; ignored if enabled is false
    • max_elapsed_time (default = 300s): The maximum amount of time spent trying to send a batch; ignored if enabled is false

TLS

The exporter supports TLS. To enable TLS, you need to specify the secure=true query parameter in the endpoint URL or use the https scheme.

Schema management

By default the exporter will create the database and tables under the names defined in the config. This is fine for simple deployments, but for production workloads, it is recommended that you manage your own schema by setting create_schema to false in the config. This prevents each exporter process from racing to create the database and tables, and makes it easier to upgrade the exporter in the future.

In this mode, the only SQL sent to your server will be for INSERT statements.

The default DDL used by the exporter can be found in example/default_ddl. Be sure to customize the indexes, TTL, and partitioning to fit your deployment. Column names and types must be the same to preserve compatibility with the exporter's INSERT statements. As long as the column names/types match the INSERT statement, you can create whatever kind of table you want. See ClickHouse's LogHouse as an example of this flexibility.

Example

This example shows how to configure the exporter to send data to a ClickHouse server. It uses the native protocol without TLS. The exporter will create the database and tables if they don't exist. The data is stored for 72 hours (3 days).

receivers:
  examplereceiver:
processors:
  batch:
    timeout: 5s
    send_batch_size: 100000
exporters:
  clickhouse:
    endpoint: tcp://127.0.0.1:9000?dial_timeout=10s
    database: otel
    async_insert: true
    ttl: 72h
    compress: lz4
    create_schema: true
    logs_table_name: otel_logs
    traces_table_name: otel_traces
    timeout: 5s
    metrics_tables:
      gauge: 
        name: "otel_metrics_gauge"
      sum: 
        name: "otel_metrics_sum"
      summary: 
        name: "otel_metrics_summary"
      histogram: 
        name: "otel_metrics_histogram"
      exponential_histogram: 
        name: "otel_metrics_exp_histogram"
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    # cluster_name: my_cluster
    # table_engine:
    #   name: ReplicatedMergeTree
    #   params:
service:
  pipelines:
    logs:
      receivers: [ examplereceiver ]
      processors: [ batch ]
      exporters: [ clickhouse ]

Contributing

Before contributing, review the contribution guidelines in CONTRIBUTING.md.

Integration tests

Integration tests can be run with the following command:

go test -tags integration -run=TestIntegration

Note: Make sure integration tests pass after making changes to SQL.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewFactory

func NewFactory() exporter.Factory

NewFactory creates a factory for ClickHouse exporter.

Types

type Config

type Config struct {
	TimeoutSettings           exporterhelper.TimeoutConfig `mapstructure:",squash"`
	configretry.BackOffConfig `mapstructure:"retry_on_failure"`
	QueueSettings             exporterhelper.QueueConfig `mapstructure:"sending_queue"`

	// Endpoint is the clickhouse endpoint.
	Endpoint string `mapstructure:"endpoint"`
	// Username is the authentication username.
	Username string `mapstructure:"username"`
	// Password is the authentication password.
	Password configopaque.String `mapstructure:"password"`
	// Database is the database name to export.
	Database string `mapstructure:"database"`
	// ConnectionParams is the extra connection parameters with map format. for example compression/dial_timeout
	ConnectionParams map[string]string `mapstructure:"connection_params"`
	// LogsTableName is the table name for logs. default is `otel_logs`.
	LogsTableName string `mapstructure:"logs_table_name"`
	// TracesTableName is the table name for traces. default is `otel_traces`.
	TracesTableName string `mapstructure:"traces_table_name"`
	// MetricsTableName is the table name for metrics. default is `otel_metrics`.
	//
	// Deprecated: MetricsTableName exists for historical compatibility
	// and should not be used. To set the metrics tables name,
	// use the MetricsTables parameter instead.
	MetricsTableName string `mapstructure:"metrics_table_name"`
	// TTL is The data time-to-live example 30m, 48h. 0 means no ttl.
	TTL time.Duration `mapstructure:"ttl"`
	// TableEngine is the table engine to use. default is `MergeTree()`.
	TableEngine TableEngine `mapstructure:"table_engine"`
	// ClusterName if set will append `ON CLUSTER` with the provided name when creating tables.
	ClusterName string `mapstructure:"cluster_name"`
	// CreateSchema if set to true will run the DDL for creating the database and tables. default is true.
	CreateSchema bool `mapstructure:"create_schema"`
	// Compress controls the compression algorithm. Valid options: `none` (disabled), `zstd`, `lz4` (default), `gzip`, `deflate`, `br`, `true` (lz4).
	Compress string `mapstructure:"compress"`
	// AsyncInsert if true will enable async inserts. Default is `true`.
	// Ignored if async inserts are configured in the `endpoint` or `connection_params`.
	// Async inserts may still be overridden server-side.
	AsyncInsert bool `mapstructure:"async_insert"`
	// MetricsTables defines the table names for metric types.
	MetricsTables MetricTablesConfig `mapstructure:"metrics_tables"`
}

Config defines configuration for Elastic exporter.

func (*Config) Validate

func (cfg *Config) Validate() (err error)

Validate the ClickHouse server configuration.

type MetricTablesConfig added in v0.112.0

type MetricTablesConfig struct {
	// Gauge is the table name for gauge metric type. default is `otel_metrics_gauge`.
	Gauge internal.MetricTypeConfig `mapstructure:"gauge"`
	// Sum is the table name for sum metric type. default is `otel_metrics_sum`.
	Sum internal.MetricTypeConfig `mapstructure:"sum"`
	// Summary is the table name for summary metric type. default is `otel_metrics_summary`.
	Summary internal.MetricTypeConfig `mapstructure:"summary"`
	// Histogram is the table name for histogram metric type. default is `otel_metrics_histogram`.
	Histogram internal.MetricTypeConfig `mapstructure:"histogram"`
	// ExponentialHistogram is the table name for exponential histogram metric type. default is `otel_metrics_exponential_histogram`.
	ExponentialHistogram internal.MetricTypeConfig `mapstructure:"exponential_histogram"`
}

type TableEngine added in v0.97.0

type TableEngine struct {
	Name   string `mapstructure:"name"`
	Params string `mapstructure:"params"`
}

TableEngine defines the ENGINE string value when creating the table.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL