ftdc

package
v0.48.0-rc0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 28, 2024 License: AGPL-3.0 Imports: 16 Imported by: 0

Documentation

Overview

Package ftdc ...

To build intuition for the file format, let's start with a simple example input being serialized with a simple format. Let's say we have two data points (two "datum"):

Datum1 = {time: 123, motor: {powerPct: 0.2, pos: 5000}, gps: {lat: 40.7128, long: -74.0060}} Datum2 = {time: 124, motor: {powerPct: 0.2, pos: 5001}, gps: {lat: 40.7128, long: -74.0061}}

We can serialize those two data points as pure json into a file. One after the other:

{"time", 123, "motor": {"powerPct": 0.2, "pos": 5000}, gps: {"lat": 40.7128, "long": -74.0060}}\n {"time", 124, "motor": {"powerPct": 0.2, "pos": 5001}, gps: {"lat": 40.7128, "long": -74.0061}}

While json can be a bit wasteful in its own right, there are two properties of the data we want to collect that allow us greatly compress data:

  • The metrics we're recording (e.g: `motor.powerPct`, `gps.lat`, ...) are often exactly the same between the readings.
  • Many of the values we're recording (e.g: the `0.2`) do not change between readings.

Thus, the two tricks employed are to: - Only write metric names when things change. - Minimize how much space is used to write out a value that hasn't changed.

Using a pseudo EBNF notation, an FTDC file is: FTDC = ftdc_doc*

ftdc_doc = schema | metric

schema =

schema_identifier : 0x01 (a full byte of value 1)
schema : <array of strings serialized as JSON, including a trailing \n(0xa)>

metric_reading =

metric_identifier : 0b0 (a single bit of value 0)
diff_bit : bit* + byte alignment padding
time: int64 <Golang: `time.Now().Unix()`. Nanoseconds since the 1970 epoch.>
values : float32*

Because a metric reading is not meaningful without a schema, a file will always start with a schema document. The first byte of a schema document is 0x01 followed by a JSON list of strings and a UNIX newline (0x0a). The JSON strings are "flattened" using a dot to concatenate the map key with the metric name. E.g:

0000 0001 ["motor.powerPct", "motor.pos", "gps.lat", "gps.long"]\n 7 0

Following a schema document will be 0 or more metric documents. A metric reading has one diff bit per reading (i.e: the "size" of the schema). In our example, that's four bits. A diff bit is set to `0` if the new reading for a given metric is same as the immediately prior reading. A diff bit is set to `1` if the readings differ. Each reading that differs will have one 32-bit float value written as part of this metric reading document. A metric can contain numeric values that are not 32-bit floats. This format is lossy. We don't expect to need the fully ~7 (`math.log10(2**23)`) significant figures of precision a float32 provides.

The diff bits immediately follow the metric bit value of 0. In other words, the first byte containing the metric bit is packed/merged with the first (up to seven) diff bits. The remaining diff bytes each contain up to eight diff bits. The last diff byte may not have eight metrics to fill out a full byte. A full byte will be written none the less for alignment. The higher bits will be wasted.

Note that the number of diff bytes to write/read is a function of the number of fields in the most recent schema.

The initial metric reading immediately following a schema document does not have a diff to compare against. In this case the format assumes a prior value of `0` for each metric. To continue our example, let's consider the first metric reading document.

For clarity, the diff bits are described in a binary representation detailing the exact bits. Where bit-0 is the metric bit (defined as 0) and the diff bits 1 through 4 (inclusive) are set to 1. And the numbers written for time/readings are annotated. All numbers are big-endian encoded for no good/benchmarked reason.

0001 1110 <64bit time> <32bit "motor.powerPct"> <32bit "motor.pos"> <32bit "gps.lat"> <32bit "gps.long"> 7 0

Now let's see what the second datum will look like. First we calculate which metric readings have changed: - motor.powerPct: 0.2 -> 0.2 (no diff) - motor.pos: 5000 -> 5001 (diff) - gps.lat: 40.7128 -> 40.7128 (no diff) - gps.long: -74.0060 -> -74.0061 (diff)

Giving us the following encoding:

0001 0100 <64bit time> <32bit "motor.pos"> <32bit "gps.long"> 7 0

If we now get a new datum that changes the schema (e.g: remove the motor), we can write out a new schema document:

0000 0001 ["gps.lat", "gps.long"]\n 7 0

To maybe illuminate the necessity of the schema/metric identifier, when a parser is about to read a new document, it needs to know whether: - to interpret the bytes as json for a new schema, or - as values for a new metric reading

A parser can read a single byte and look at the least significant bit to determine which path to take.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type FTDC

type FTDC struct {
	// contains filtered or unexported fields
}

FTDC is a tool for storing observability data on disk in a compact binary format for production debugging.

func New

func New(logger logging.Logger) *FTDC

New creates a new *FTDC.

func NewWithWriter

func NewWithWriter(writer io.Writer, logger logging.Logger) *FTDC

NewWithWriter creates a new *FTDC that outputs bytes to the specified writer.

func (*FTDC) Add

func (ftdc *FTDC) Add(name string, statser Statser)

Add regsiters a new staters that will be recorded in future FTDC loop iterations.

func (*FTDC) Remove

func (ftdc *FTDC) Remove(name string)

Remove removes a statser that was previously `Add`ed with the given `name`.

func (*FTDC) Start added in v0.48.1

func (ftdc *FTDC) Start()

Start spins off the background goroutine for collecting/writing FTDC data.

func (*FTDC) StopAndJoin added in v0.48.1

func (ftdc *FTDC) StopAndJoin(ctx context.Context)

StopAndJoin stops the background worker started by `Start`. It is only legal to call this after `Start` returns.

type FlatDatum added in v0.48.1

type FlatDatum struct {
	Time     int64
	Readings []Reading
}

FlatDatum has the same information as a `datum`, but without the arbitrarily nested `Data` map. Using dots to join keys as per the disk format. So where a `Data` map might be:

{ "webrtc": { "connections": 5, "bytesSent": 8096 } }

A `FlatDatum` would be:

[ Reading{"webrtc.connections", 5}, Reading{"webrtc.bytesSent", 8096} ].

func Parse added in v0.48.1

func Parse(rawReader io.Reader) ([]FlatDatum, error)

Parse reads the entire contents from `rawReader` and returns a list of `Datum`. If an error occurs, the []Datum parsed up until the place of the error will be returned, in addition to a non-nil error.

func ParseWithLogger added in v0.48.1

func ParseWithLogger(rawReader io.Reader, logger logging.Logger) ([]FlatDatum, error)

ParseWithLogger parses with a logger for output.

type Reading added in v0.48.1

type Reading struct {
	MetricName string
	Value      float32
}

Reading is a "fully qualified" metric name paired with a value.

type Statser

type Statser interface {
	// The Stats method must return a struct with public field members that are either:
	// - Numbers (e.g: int, float64, byte, etc...)
	// - A "recursive" structure that has the same properties as this return value (public field
	//   members with numbers, or more structures). (NOT YET SUPPORTED)
	//
	// The return value must not be a map. This is to enforce a "schema" constraints.
	Stats() any
}

Statser implements Stats.

Directories

Path Synopsis
main provides a CLI tool for viewing `.ftdc` files emitted by the `viam-server`.
main provides a CLI tool for viewing `.ftdc` files emitted by the `viam-server`.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL