zipkin

package
v1.33.1 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 10, 2025 License: MIT Imports: 20 Imported by: 11

README

Zipkin Input Plugin

This plugin implements the Zipkin http server to gather trace and timing data needed to troubleshoot latency problems in microservice architectures.

Please Note: This plugin is experimental; its data schema may change based on its main use cases and the evolution of the OpenTracing standard.

Service Input

This plugin is a service input. Normal plugins gather metrics at the frequency set by the interval setting. Service plugins start a service that listens and waits for metrics or events to occur. Service plugins have two key differences from normal plugins:

  1. The global or plugin specific interval setting may not apply
  2. The CLI options of --test, --test-wait, and --once may not produce output for this plugin

Global configuration options

In addition to the plugin-specific configuration settings, plugins support additional global and plugin configuration settings. These settings are used to modify metrics, tags, and fields, or to create aliases and configure ordering, etc. See CONFIGURATION.md for more details.

Configuration

# This plugin implements the Zipkin http server to gather trace and timing data needed to troubleshoot latency problems in microservice architectures.
[[inputs.zipkin]]
  ## URL path for span data
  # path = "/api/v1/spans"

  ## Port on which Telegraf listens
  # port = 9411

  ## Maximum duration before timing out read of the request
  # read_timeout = "10s"
  ## Maximum duration before timing out write of the response
  # write_timeout = "10s"

The plugin accepts spans in JSON or thrift if the Content-Type is application/json or application/x-thrift, respectively. If Content-Type is not set, then the plugin assumes it is JSON format.
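As a minimal sketch of the client side, the Go snippet below builds a POST request for the plugin's default port and path with an explicit Content-Type. The helper name newSpanRequest, the localhost address, and the trimmed-down payload are illustrative assumptions, not part of the plugin.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newSpanRequest is a hypothetical helper (not part of the plugin). It builds
// a POST request carrying JSON-encoded spans and sets the Content-Type
// explicitly instead of relying on the plugin's JSON default.
func newSpanRequest(url string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed placeholder payload: a JSON array holding one trimmed-down span.
	payload := []byte(`[{"traceId":"bd7a977555f6b982","id":"be2d01e33cc78d97","name":"query"}]`)

	// Port 9411 and path /api/v1/spans are the defaults from the sample config.
	req, err := newSpanRequest("http://localhost:9411/api/v1/spans", payload)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("Content-Type"))
	// To send it to a running Telegraf instance: http.DefaultClient.Do(req)
}
```

For thrift-encoded spans the same request shape would apply with the header set to application/x-thrift.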

Tracing

This plugin uses annotation tags and fields to track data from spans.

  • TRACE: is a set of spans that share a single root span. Traces are built by collecting all Spans that share a traceId.

  • SPAN: is a set of Annotations and BinaryAnnotations that correspond to a particular RPC.

  • Annotations: record an occurrence in time at the beginning and end of a request. A metric is output for each annotation and binary annotation of a span.

    Annotations may have the following values:

    • CS (client start): beginning of span, request is made.
    • SR (server receive): the server receives the request and starts processing it. Network latency and clock jitter account for the difference from CS.
    • SS (server send): the server finishes processing and sends the response back to the client. The time taken to process the request accounts for the difference from SR.
    • CR (client receive): end of span, the client receives the response from the server. The RPC is considered complete with this annotation.

Metrics

  • "duration_ns": The time in nanoseconds between the end and beginning of a span.

Tags

  • "id": The 64-bit ID of the span.
  • "parent_id": The ID of the span's parent. If the span has no parent, the parent ID is set to the span's own ID.
  • "trace_id": The 64-bit or 128-bit ID of a particular trace. Every span in a trace shares this ID. It is the concatenation of the high and low parts, converted to hexadecimal.
  • "name": The name of the span.

Annotations have these additional tags:

  • "service_name": The name of the service.
  • "annotation": The value of the annotation.
  • "endpoint_host": The IPv4 address concatenated with the listening port. If the port is not present, it is not concatenated.

Binary annotations have these additional tags:

  • "service_name": The name of the service.
  • "annotation": The value of the annotation.
  • "endpoint_host": The IPv4 address concatenated with the listening port. If the port is not present, it is not concatenated.
  • "annotation_key": A label describing the annotation.
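The tag formatting rules can be sketched in Go as follows. This is an illustration under stated assumptions, not the plugin's code: the function names are hypothetical, the ":" separator in endpoint_host is assumed, a port of 0 stands in for "not present", and zero-padding of the low half of a 128-bit trace ID is assumed.

```go
package main

import (
	"fmt"
	"strconv"
)

// traceID renders a trace ID as hexadecimal. For a 64-bit ID (high == 0) only
// the low part is printed; otherwise high and low are concatenated.
// Assumption: the low part is zero-padded to 16 hex digits in the 128-bit case.
func traceID(high, low uint64) string {
	if high == 0 {
		return strconv.FormatUint(low, 16)
	}
	return fmt.Sprintf("%x%016x", high, low)
}

// endpointHost joins the IPv4 address and port, leaving the port out when it
// is absent. Assumptions: port 0 means "absent" and ":" is the separator.
func endpointHost(ipv4 string, port int) string {
	if port == 0 {
		return ipv4
	}
	return ipv4 + ":" + strconv.Itoa(port)
}

func main() {
	fmt.Println(traceID(0, 0xbd7a977555f6b982))
	fmt.Println(endpointHost("192.168.1.2", 9411))
	fmt.Println(endpointHost("127.0.0.1", 0))
}
```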

Sample Queries

Get All Span Names for Service my_web_server

SHOW TAG VALUES FROM "zipkin" with key="name" WHERE "service_name" = 'my_web_server'
  • Description: returns a list containing the names of the spans which have annotations with the given service_name of my_web_server.

Get All Service Names

SHOW TAG VALUES FROM "zipkin" WITH KEY = "service_name"
  • Description: returns a list of all distinct endpoint service names.

Find Spans With the Longest Duration

SELECT max("duration_ns") FROM "zipkin" WHERE "service_name" = 'my_service' AND "name" = 'my_span_name' AND time > now() - 20m GROUP BY "trace_id",time(30s) LIMIT 5
  • Description: in the last 20 minutes, finds the top 5 longest span durations for service my_service and span name my_span_name.

The plugin produces high-cardinality data (for example, one "trace_id" tag value per trace), so we recommend using the InfluxDB TSI engine.

How To Set Up InfluxDB To Work With Zipkin
Steps
  1. Update InfluxDB to >= 1.3, in order to use the new tsi engine.

  2. Generate a config file with the following command:

    influxd config > /path/for/config/file
    
  3. Add the following to your config file, under the [data] section:

    [data]
      index-version = "tsi1"
    
  4. Start influxd with your new config file:

    influxd -config=/path/to/your/config/file
    
  5. Update your retention policy:

    ALTER RETENTION POLICY "autogen" ON "telegraf" DURATION 1d SHARD DURATION 30m
    
Example Input Trace
Trace Example from Zipkin model
{
  "traceId": "bd7a977555f6b982",
  "name": "query",
  "id": "be2d01e33cc78d97",
  "parentId": "ebf33e1a81dc6f71",
  "timestamp": 1458702548786000,
  "duration": 13000,
  "annotations": [
    {
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      },
      "timestamp": 1458702548786000,
      "value": "cs"
    },
    {
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      },
      "timestamp": 1458702548799000,
      "value": "cr"
    }
  ],
  "binaryAnnotations": [
    {
      "key": "jdbc.query",
      "value": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?",
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      }
    },
    {
      "key": "sa",
      "value": true,
      "endpoint": {
        "serviceName": "spanstore-jdbc",
        "ipv4": "127.0.0.1",
        "port": 3306
      }
    }
  ]
}
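As a hedged sketch of how such a span maps onto the "duration_ns" field, the Go snippet below decodes a trimmed copy of the example and converts its duration, which the Zipkin model expresses in microseconds, to nanoseconds. The struct and function names are hypothetical and cover only a subset of the fields; the plugin's own decoder may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// endpoint, annotation and span mirror a subset of the JSON example.
// They are illustrative types, not the plugin's own.
type endpoint struct {
	ServiceName string `json:"serviceName"`
	IPv4        string `json:"ipv4"`
	Port        int    `json:"port"`
}

type annotation struct {
	Endpoint  endpoint `json:"endpoint"`
	Timestamp int64    `json:"timestamp"`
	Value     string   `json:"value"`
}

type span struct {
	TraceID     string       `json:"traceId"`
	Name        string       `json:"name"`
	ID          string       `json:"id"`
	ParentID    string       `json:"parentId"`
	Timestamp   int64        `json:"timestamp"` // microseconds since epoch
	Duration    int64        `json:"duration"`  // microseconds
	Annotations []annotation `json:"annotations"`
}

// parseSpan decodes a single span object; unknown fields are ignored.
func parseSpan(data []byte) (span, error) {
	var s span
	err := json.Unmarshal(data, &s)
	return s, err
}

// durationNS converts the span duration from microseconds to nanoseconds.
func (s span) durationNS() int64 {
	return (time.Duration(s.Duration) * time.Microsecond).Nanoseconds()
}

func main() {
	data := []byte(`{"traceId":"bd7a977555f6b982","name":"query","id":"be2d01e33cc78d97",
		"parentId":"ebf33e1a81dc6f71","timestamp":1458702548786000,"duration":13000}`)
	s, err := parseSpan(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s.Name, s.TraceID, s.durationNS())
}
```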

Example Output

Documentation

Functions

func ContentDecoder

func ContentDecoder(r *http.Request) (codec.Decoder, error)

ContentDecoder returns a Decoder that is able to produce Traces from bytes. Failure should yield an HTTP 415 (`http.StatusUnsupportedMediaType`). If a Content-Type is not set, zipkin assumes application/json.
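The dispatch described here can be sketched as a pure function. This is not the package's implementation: chooseFormat is a hypothetical name, it returns a format label instead of a codec.Decoder, and stripping media-type parameters with mime.ParseMediaType is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// chooseFormat maps a Content-Type header value to a span format label and
// the HTTP status a handler would answer with. Hypothetical helper.
func chooseFormat(contentType string) (string, int) {
	// A missing Content-Type is treated as JSON.
	if contentType == "" {
		return "json", http.StatusOK
	}
	// Assumption: parameters such as "; charset=utf-8" are ignored.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", http.StatusUnsupportedMediaType
	}
	switch mediaType {
	case "application/json":
		return "json", http.StatusOK
	case "application/x-thrift":
		return "thrift", http.StatusOK
	default:
		return "", http.StatusUnsupportedMediaType
	}
}

func main() {
	for _, ct := range []string{"", "application/json", "application/x-thrift", "text/plain"} {
		format, status := chooseFormat(ct)
		fmt.Printf("%q -> format=%q status=%d\n", ct, format, status)
	}
}
```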

Types

type Handler

type Handler interface {
	Register(router *mux.Router, recorder Recorder) error
}

Handler represents a type which can register itself with a router for http routing, and a Recorder for trace data collection.

type LineProtocolConverter

type LineProtocolConverter struct {
	// contains filtered or unexported fields
}

LineProtocolConverter implements the Recorder interface; it is a type meant to encapsulate the storage of zipkin tracing data in telegraf as line protocol.

func NewLineProtocolConverter

func NewLineProtocolConverter(acc telegraf.Accumulator) *LineProtocolConverter

NewLineProtocolConverter returns an instance of LineProtocolConverter that will add to the given telegraf.Accumulator

func (*LineProtocolConverter) Error

func (l *LineProtocolConverter) Error(err error)

func (*LineProtocolConverter) Record

func (l *LineProtocolConverter) Record(t trace.Trace) error

Record is LineProtocolConverter's implementation of the Record method of the Recorder interface; it takes a trace as input, and adds it to an internal telegraf.Accumulator.

type Recorder

type Recorder interface {
	Record(trace.Trace) error
	Error(error)
}

Recorder represents a type which can record zipkin trace data as well as any accompanying errors, and process that data.
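For illustration, a type satisfying a Recorder-shaped interface might look like the sketch below. Because trace.Trace lives in the Telegraf module, the Span and Trace types here are local stand-ins with made-up fields, and countingRecorder is a hypothetical implementation that only counts what it sees.

```go
package main

import (
	"errors"
	"fmt"
)

// Span and Trace are local stand-ins for the Telegraf trace types.
// Their shape is assumed for this example only.
type Span struct {
	Name string
}

type Trace []Span

// Recorder mirrors the two-method interface shown above, using the stand-in Trace.
type Recorder interface {
	Record(Trace) error
	Error(error)
}

// countingRecorder is a hypothetical Recorder that counts spans and keeps errors.
type countingRecorder struct {
	spans int
	errs  []error
}

// Record rejects empty traces and otherwise adds the span count.
func (c *countingRecorder) Record(t Trace) error {
	if len(t) == 0 {
		return errors.New("empty trace")
	}
	c.spans += len(t)
	return nil
}

// Error stores an accompanying error for later inspection.
func (c *countingRecorder) Error(err error) {
	c.errs = append(c.errs, err)
}

// Compile-time check that countingRecorder satisfies Recorder.
var _ Recorder = (*countingRecorder)(nil)

func main() {
	r := &countingRecorder{}
	if err := r.Record(Trace{{Name: "query"}, {Name: "render"}}); err != nil {
		r.Error(err)
	}
	if err := r.Record(nil); err != nil {
		r.Error(err)
	}
	fmt.Println("spans:", r.spans, "errors:", len(r.errs))
}
```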

type SpanHandler

type SpanHandler struct {
	Path string
	// contains filtered or unexported fields
}

SpanHandler is an implementation of a Handler which accepts zipkin thrift span data and sends it to the recorder

func NewSpanHandler

func NewSpanHandler(path string) *SpanHandler

NewSpanHandler returns a new server instance given path to handle

func (*SpanHandler) Register

func (s *SpanHandler) Register(router *mux.Router, recorder Recorder) error

Register implements the Service interface. Register accepts zipkin thrift data POSTed to the path of the mux router

func (*SpanHandler) Spans

func (s *SpanHandler) Spans(w http.ResponseWriter, r *http.Request)

Spans handles zipkin thrift spans

type Zipkin

type Zipkin struct {
	Port         int             `toml:"port"`
	Path         string          `toml:"path"`
	ReadTimeout  config.Duration `toml:"read_timeout"`
	WriteTimeout config.Duration `toml:"write_timeout"`

	Log telegraf.Logger
	// contains filtered or unexported fields
}

Zipkin is a telegraf configuration structure for the zipkin input plugin, but it also contains fields for the management of a separate, concurrent zipkin http server

func (*Zipkin) Gather

func (*Zipkin) Gather(telegraf.Accumulator) error

Gather is empty for the zipkin plugin; all gathering is done through the separate goroutine launched in (*Zipkin).Start()

func (*Zipkin) Listen

func (z *Zipkin) Listen(ln net.Listener, acc telegraf.Accumulator)

Listen creates an HTTP server on the zipkin instance it is called with, and serves HTTP until it is stopped by Zipkin's (*Zipkin).Stop() method.

func (*Zipkin) SampleConfig

func (*Zipkin) SampleConfig() string

func (*Zipkin) Start

func (z *Zipkin) Start(acc telegraf.Accumulator) error

Start launches a separate goroutine for collecting zipkin client http requests, passing in a telegraf.Accumulator such that data can be collected.

func (*Zipkin) Stop

func (z *Zipkin) Stop()

Stop shuts the internal HTTP server down via context.Context.
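The Start/Listen/Stop lifecycle follows a common Go pattern: serve in a background goroutine and shut down with a context. The sketch below shows that pattern with the standard library only. The service type, its fields, and the 5-second shutdown deadline are assumptions for illustration; it is not the plugin's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// service is a hypothetical stand-in for a service input that owns an HTTP server.
type service struct {
	server *http.Server
	wg     sync.WaitGroup
	addr   string
}

// Start opens the listener and serves in a separate goroutine, so the caller
// is not blocked while requests are being collected.
func (s *service) Start(address string, handler http.Handler) error {
	ln, err := net.Listen("tcp", address)
	if err != nil {
		return err
	}
	s.addr = ln.Addr().String()
	s.server = &http.Server{
		Handler:      handler,
		ReadTimeout:  10 * time.Second, // same values as the sample config defaults
		WriteTimeout: 10 * time.Second,
	}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		// Serve returns http.ErrServerClosed after a clean Shutdown.
		if err := s.server.Serve(ln); err != nil && !errors.Is(err, http.ErrServerClosed) {
			fmt.Println("serve error:", err)
		}
	}()
	return nil
}

// Stop shuts the server down via a context with an assumed 5s deadline and
// waits for the serving goroutine to exit.
func (s *service) Stop() error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := s.server.Shutdown(ctx)
	s.wg.Wait()
	return err
}

func main() {
	s := &service{}
	// Port 0 lets the OS pick a free port for this demonstration.
	if err := s.Start("127.0.0.1:0", http.NotFoundHandler()); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("listening on", s.addr)
	if err := s.Stop(); err != nil {
		fmt.Println("stop failed:", err)
	}
}
```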

Directories

Path                   Synopsis
cmd/thrift_serialize   A small cli utility meant to convert json to zipkin thrift binary format, and vice versa.
