antler

package module
v0.6.0 Latest
Published: Aug 30, 2024 License: GPL-3.0 Imports: 33 Imported by: 0

README

Antler

Antler is a tool for network and congestion control testing. The name stands for Active Network Tester of Load & Response, where '&' == Et. :)

Introduction

Antler can be used to set up and tear down test environments, coordinate traffic flows across multiple nodes, gather data using external tools like tcpdump, and generate reports and plots from the results. It grew out of testing needs for SCE (Some Congestion Experienced) and related congestion control projects in the IETF.

Why Antler?

In running tests with existing tools, I found that congestion control testing tends to be time-consuming and error-prone, because it involves more than just generating traffic and emitting stats. It also includes:

  • setting up and tearing down test environments
  • orchestrating actions across multiple nodes
  • running multiple tests with varied parameter combinations
  • re-running only some tests while retaining prior results
  • running external tools to gather pcaps or other data
  • gathering results from multiple nodes into a single source of truth
  • emitting results in different formats for consumption
  • saving results non-destructively so prior work isn't lost
  • making results available on the web
  • configuring all of the above in a common way, to avoid mistakes

Antler is an attempt to address the above. The test environment is set up and torn down before and after each test, preventing configuration mistakes and "config bleed" from run to run. The test nodes are auto-installed and uninstalled before and after each test, preventing version mismatch and dependency problems. Tests are orchestrated using a hierarchy of serial and parallel actions that can be coordinated over the control connections to each node. Results, logs and data from all the nodes are gathered into a single data stream, saved non-destructively, and processed in a report pipeline to produce the output. Partial test runs allow re-running only some tests, while hard linking results from prior runs so a complete result tree is always available. Results may be published using an internal, embedded web server. Finally, all of the configuration is done using CUE, a data language that helps avoid config mistakes and duplication.
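The "hierarchy of serial and parallel actions" mentioned above can be pictured with a small, self-contained sketch. The Action, Serial and Parallel names below are hypothetical and are not Antler's runner types. They only illustrate how nested serial and parallel steps compose, assuming each step is a plain function that returns an error.

```go
package main

import (
	"fmt"
	"sync"
)

// Action is a hypothetical unit of work; it is not an Antler type.
type Action func() error

// Serial runs the actions in order and stops at the first error.
func Serial(actions ...Action) Action {
	return func() error {
		for _, a := range actions {
			if err := a(); err != nil {
				return err
			}
		}
		return nil
	}
}

// Parallel runs the actions concurrently, waits for all of them, and
// returns the first non-nil error in argument order.
func Parallel(actions ...Action) Action {
	return func() error {
		errs := make([]error, len(actions))
		var wg sync.WaitGroup
		for i, a := range actions {
			wg.Add(1)
			go func(i int, a Action) {
				defer wg.Done()
				errs[i] = a()
			}(i, a)
		}
		wg.Wait()
		for _, err := range errs {
			if err != nil {
				return err
			}
		}
		return nil
	}
}

// recorder collects step names so the execution order can be inspected.
type recorder struct {
	mu  sync.Mutex
	log []string
}

// step returns an Action that appends name to the log.
func (r *recorder) step(name string) Action {
	return func() error {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.log = append(r.log, name)
		return nil
	}
}

func main() {
	var r recorder
	run := Serial(
		r.step("setup"),
		Parallel(r.step("flow-a"), r.step("flow-b")),
		r.step("teardown"),
	)
	if err := run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.log) // setup first, teardown last; flow order may vary
}
```

In this sketch the setup and teardown steps always bracket the parallel flows, which mirrors the per-test environment setup and teardown described above.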

Features

Tests
  • auto-installed test nodes that run either locally or via ssh, and optionally in Linux network namespaces
  • builtin traffic generator in Go:
    • support for tests using stream-oriented and packet-oriented protocols (for now, TCP and UDP)
    • configurable UDP packet release times and lengths, supporting anything from isochronous, to VBR or bursty traffic, or combinations in one flow
    • support for setting arbitrary sockopts, including CCA and the DS field
  • configuration using CUE, to support test parameter combinations, schema definition, data validation and config reuse
  • configurable hierarchy of "runners", that may execute in serial or parallel across nodes, and with arbitrary scheduled timing (e.g. TCP flow introductions on an exponential distribution with lognormal lengths)
  • incremental test runs to run only selected tests, and hard link the rest from prior results
  • system runner for system commands, e.g. for setup, teardown, data collection such as pcaps, and mid-test config changes
  • system information gathering from commands, files, environment variables and sysctls
  • parallel execution of entire tests, with nested serial and parallel test runs
Results/Reports
  • time series and FCT plots using Google Charts
  • plots/reports implemented with Go templates, which may eventually be written by users to target any plotting package
  • optional result streaming during test (may be configured to deliver only some results, e.g. logs, but not pcaps)
  • generation of index.html pages of tests
  • embedded web server to serve results

Status

As of version 0.6.0, many of the core features are implemented, along with some basic tests and visualizations. The Roadmap shows future plans. Overall, more work is needed to expand and improve the available plots, stabilize the config and data formats, and support platforms other than Linux.

Installation

  1. Install Go (1.21 or later required).
  2. cd
  3. mkdir -p go/src/github.com/heistp
  4. cd go/src/github.com/heistp
  5. git clone https://github.com/heistp/antler
  6. cd antler
  7. make (builds node binaries, installs antler command)

To run antler, the binary must be in your PATH, or the full path must be specified. Typically, you add ~/go/bin to your PATH so you can run binaries installed by Go. Note: if using sudo and the secure_path option is set in /etc/sudoers, either ~/go/bin must be added to secure_path, or additional configuration is required.

Examples

The examples output is available online, where you can view the HTML plots and log files.

To run the examples yourself (root required for network namespaces):

cd examples
sudo antler run

All configuration is in the .cue or .cue.tmpl files, and the output is written to the results directory.

Documentation

Antler is currently documented through the examples, and the comments in config.cue. Antler is configured using CUE, so it helps to get familiar with the language, but for simple tests, it may be enough to just follow the examples.

UDP Latency Accuracy Limits

The node and its builtin traffic generators are written in Go. This comes with some system call overhead and scheduling jitter, which makes the UDP latency results somewhat less accurate than those from a C/C++ implementation or, better yet, timings obtained from the kernel or network. The following comparison between ping and irtt gives some idea (note the log scale on the vertical axis):

[Figure: Ping vs IRTT]

While the UDP results are still useful for tests at most Internet RTTs, if microsecond level accuracy is required, external tools should be invoked using the System runner, or the times may be interpreted from pcaps instead. In the future, either the traffic generation or the entire node may be rewritten in another language, if required.

Roadmap

Version 1.0.0
  • undergo security audit
  • secure servers for use on the Internet
  • enhance stream server protocol to ensure streams have completed
  • add runner duration and use that to implement timeouts
  • add an antler init command to create a default project
  • write documentation (in markdown)
Inbox
Features
  • implement traffic generator in C (or rewrite node in Rust)
  • allow writing custom Go templates to generate any plot/report output
  • merge system info and logs into plots
  • add rm command to remove result and update latest symlink
  • add ls command to list results
  • add admin web UI to run a package of tests
  • add node-side compression support for System runner FileData output
  • handle tests both with and without node-synchronized time
  • process pcaps to get retransmits, CE/SCE marks, TCP RTT or other stats
  • add test progress bar
  • add ability to save System Stdout directly to local file
  • add ability to buffer System Stdout to a tmp file before sending as FileData
  • add log command to emit LogEntry's to stdout
  • implement flagForward optimization, and maybe invert it to flagProcess
  • add support for simulating conversational stream protocols
  • add Antler to CUE Unity
  • support MacOS
  • support FreeBSD
Refactoring
  • convert longer funcs/methods to use explicit return values
  • consistently document config in config.cue, with minimal doc in structs
  • replace use of chan any in conn
  • improve semantics for System.Stdout and Stderr
  • find a better way than unions to create interface implementations from CUE
  • consider moving all FileData to gob, for consistency with encoding
Bugs
  • improve poor error messages from CUE, especially under disjunctions
  • figure out why default for #EmitSysInfo:To doesn't work (default-default)

Thanks

A kind thanks to sponsors:

  • NLNet and NGI0 Core
  • NGI Pointer
  • RIPE NCC

and to Jonathan Morton and Rodney Grimes for advice.

[Image: NGI SCE Sticker]

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Run

func Run(ctx context.Context, cmd Command) error

Run runs an Antler Command.

Types

type AmbiguousNodeIDError

type AmbiguousNodeIDError struct {
	TestID TestID
	ID     []node.ID
}

AmbiguousNodeIDError is returned when multiple Nodes use the same ID but with different field values.

func (AmbiguousNodeIDError) Error

func (a AmbiguousNodeIDError) Error() string

Error implements error.

type Analyze added in v0.4.0

type Analyze struct {
}

Analyze is a reporter that processes stream and packet data for reports. This must be in the Report pipeline *before* reporters that require it.

type AndFilter

type AndFilter []TestFilter

AndFilter accepts a Test if each of its TestFilters accepts it. AndFilter panics if it has no TestFilters.

func (AndFilter) Accept

func (a AndFilter) Accept(test *Test) bool

Accept implements TestFilter.

type BoolFilter added in v0.4.0

type BoolFilter bool

BoolFilter is a TestFilter that accepts (if true) or rejects all Tests.

func (BoolFilter) Accept added in v0.4.0

func (b BoolFilter) Accept(test *Test) bool

Accept implements TestFilter.

type ChartsFCT

type ChartsFCT struct {
	// To lists the names of files to execute the template to. A file of "-"
	// emits to stdout.
	To []string

	// Series matches Flows to series.
	Series []FlowSeries

	// Options is an arbitrary structure of Charts options, with defaults
	// defined in config.cue.
	// https://developers.google.com/chart/interactive/docs/gallery/scatterchart#configuration-options
	Options map[string]any
}

ChartsFCT is a reporter that makes flow completion time (FCT) plots using Google Charts.

type ChartsTimeSeries

type ChartsTimeSeries struct {
	// FlowLabel sets custom labels for Flows.
	FlowLabel map[node.Flow]string

	// To lists the names of files to execute the template to. A file of "-"
	// emits to stdout.
	To []string

	// Options is an arbitrary structure of Charts options, with defaults
	// defined in config.cue.
	// https://developers.google.com/chart/interactive/docs/gallery/linechart#configuration-options
	Options map[string]any
}

ChartsTimeSeries is a reporter that makes time series plots using Google Charts.

type Codec added in v0.4.0

type Codec struct {
	ID             string
	Extension      []string
	Encode         string
	EncodeArg      []string
	EncodePriority int
	Decode         string
	DecodeArg      []string
	DecodePriority int
}

Codec configures a file encoder/decoder.

func (Codec) Equal added in v0.4.0

func (c Codec) Equal(other Codec) bool

Equal returns true if the Codecs are equal.

type Codecs added in v0.4.0

type Codecs map[string]Codec

Codecs wraps a map of Codecs to provide related methods.

type Command

type Command interface {
	// contains filtered or unexported methods
}

A Command is an Antler command.

type Config

type Config struct {
	Test        Tests
	MultiReport []MultiReport
	Results     Results
	Server      Server
}

Config is the Antler configuration, loaded from CUE.

func LoadConfig

func LoadConfig(cuecfg *load.Config) (cfg *Config, err error)

LoadConfig first executes templates in any .cue.tmpl files to create the corresponding .cue files, then uses the CUE API to load and return the Antler Config.

type DataFileUnsetError added in v0.4.0

type DataFileUnsetError struct {
	Test *Test
}

DataFileUnsetError is returned by DataWriter or DataReader when the Test's DataFile field is empty, so no data may be read or written. The Test field is the corresponding Test.

func (DataFileUnsetError) Error added in v0.4.0

func (n DataFileUnsetError) Error() string

Error implements error.

type DuplicatePathError added in v0.5.0

type DuplicatePathError struct {
	Path []string
}

DuplicatePathError is returned when multiple Tests have the same Path.

func (DuplicatePathError) Error added in v0.5.0

func (d DuplicatePathError) Error() string

Error implements error.

type DuplicateResultPrefixError added in v0.4.0

type DuplicateResultPrefixError struct {
	ResultPrefix []string
}

DuplicateResultPrefixError is returned when multiple Tests have the same ResultPrefix.

func (DuplicateResultPrefixError) Error added in v0.4.0

Error implements error.

type DuplicateTestIDError

type DuplicateTestIDError struct {
	ID []TestID
}

DuplicateTestIDError is returned when multiple Tests have the same ID.

func (DuplicateTestIDError) Error

func (d DuplicateTestIDError) Error() string

Error implements error.

type EmitLog

type EmitLog struct {
	// To lists the destinations to send output to. "-" sends output to stdout,
	// and everything else sends output to the named file. If To is empty,
	// output is emitted to stdout.
	To []string

	// Sort, if true, indicates to gather the logs, sort them by time, and emit
	// them after "in" is closed.
	Sort bool
}

EmitLog is a reporter that emits LogEntry's to files and/or stdout.

type EmitSysInfo added in v0.4.0

type EmitSysInfo struct {
	// To lists the destinations to send output to. "-" sends output to stdout,
	// and everything else sends output to the named file. If To is empty,
	// output is emitted to stdout. If To contains the verb %s, it is replaced
	// by the Node ID.
	To []string
}

EmitSysInfo is a reporter that emits SysInfoData's to files and/or stdout.
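To make the %s substitution concrete, here is a small sketch. destFor is a hypothetical helper and the file name pattern is invented. It uses plain string replacement as a stand-in, which may differ from how Antler expands the verb.

```go
package main

import (
	"fmt"
	"strings"
)

// destFor returns the output destination for one node: any %s in the To
// entry is replaced by the node ID, and other entries are returned as-is.
func destFor(to, nodeID string) string {
	return strings.ReplaceAll(to, "%s", nodeID)
}

func main() {
	for _, id := range []string{"left", "mid", "right"} {
		fmt.Println(destFor("sysinfo_%s.html", id))
	}
	fmt.Println(destFor("-", "mid")) // "-" still means stdout
}
```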

type Encode added in v0.4.0

type Encode struct {
	File        []string // list of glob patterns of files to encode
	Extension   string   // extension for newly encoded files (e.g. ".gz")
	ReEncode    bool     // if true, allow re-encoding of file
	Destructive bool     // if true, delete originals upon success
}

Encode is a reporter that encodes files referenced by FileRefs.

type FileRef added in v0.4.0

type FileRef struct {
	Name string
}

FileRef is sent as a data item by SaveFiles to record the presence of a file with the specified Name, even after its FileData items may have been consumed.

type FlowSeries

type FlowSeries struct {
	Name    string
	Pattern string
	// contains filtered or unexported fields
}

FlowSeries groups flows into series by matching the Flow ID with a Regex.

func (*FlowSeries) Compile

func (s *FlowSeries) Compile() (err error)

Compile compiles Pattern to a Regexp.

func (*FlowSeries) Match

func (s *FlowSeries) Match(flow node.Flow) (matches bool)

Match returns true if Flow matches Regex.
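A self-contained sketch of the Compile and Match pair follows. It assumes that a flow identifier is a string (the Flow type below is a local stand-in for node.Flow) and that the unexported field holds the compiled expression. The patterns and flow names are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// Flow is a local stand-in for node.Flow, assumed here to be a string.
type Flow string

// FlowSeries mirrors the documented fields, plus a compiled pattern.
type FlowSeries struct {
	Name    string
	Pattern string
	rgx     *regexp.Regexp
}

// Compile compiles Pattern and must be called before Match.
func (s *FlowSeries) Compile() (err error) {
	s.rgx, err = regexp.Compile(s.Pattern)
	return
}

// Match reports whether the flow ID matches the compiled pattern.
func (s *FlowSeries) Match(flow Flow) bool {
	return s.rgx.MatchString(string(flow))
}

func main() {
	series := []FlowSeries{
		{Name: "CUBIC", Pattern: `^cubic`},
		{Name: "BBR", Pattern: `^bbr`},
	}
	for i := range series {
		if err := series[i].Compile(); err != nil {
			fmt.Println(err)
			return
		}
	}
	for _, f := range []Flow{"cubic.1", "bbr.1", "cubic.2"} {
		for i := range series {
			if series[i].Match(f) {
				fmt.Printf("%s -> %s\n", f, series[i].Name)
			}
		}
	}
}
```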

type GoodputPoint added in v0.5.0

type GoodputPoint struct {
	// T is the time relative to the start of the earliest stream.
	T metric.RelativeTime

	// Goodput is the goodput bitrate.
	Goodput metric.Bitrate
}

GoodputPoint is a single Goodput data point.

type Index added in v0.5.0

type Index struct {
	To          string
	GroupBy     string
	Title       string
	ExcludeFile []string

	sync.Mutex
	// contains filtered or unexported fields
}

Index is a reporter that creates an index.html file for a Group.

type LinkError added in v0.4.0

type LinkError struct {
	Name string
}

LinkError is returned by resultRW.Link when the named file could not be found in any prior result.

func (LinkError) Error added in v0.4.0

func (l LinkError) Error() string

Error implements error.

func (LinkError) Is added in v0.4.0

func (l LinkError) Is(target error) bool

Is makes this error an fs.ErrNotExist for the errors package.

type LogEntry added in v0.4.0

type LogEntry interface {
	GetLogEntry() node.LogEntry
}

A LogEntry returns a node.LogEntry that should be logged. The method name GetLogEntry is non-idiomatic so that node.LogEntry may be embedded in implementations.

type MultiReport added in v0.5.0

type MultiReport struct {
	ID TestID
	// contains filtered or unexported fields
}

MultiReport represents the MultiReport configuration from CUE.

type OrFilter

type OrFilter []TestFilter

OrFilter accepts a Test if any of its TestFilters accepts it. OrFilter panics if it has no TestFilters.

func (OrFilter) Accept

func (o OrFilter) Accept(test *Test) bool

Accept implements TestFilter.

type PacketAnalysis added in v0.5.0

type PacketAnalysis struct {
	// data
	Flow       node.Flow
	Client     node.PacketInfo
	Server     node.PacketInfo
	ClientSent []node.PacketIO
	ClientRcvd []node.PacketIO
	ServerSent []node.PacketIO
	ServerRcvd []node.PacketIO

	// statistics
	Up      packetStats // stats from client to server
	Down    packetStats // stats from server to client
	RTT     []rtt
	RTTMean float64
}

PacketAnalysis contains the data and calculated stats for a packet flow.

func (*PacketAnalysis) T0 added in v0.5.0

func (y *PacketAnalysis) T0() time.Time

T0 returns the earliest absolute packet time.

type RegexFilter

type RegexFilter struct {
	Key   *regexp.Regexp
	Value *regexp.Regexp
}

RegexFilter is a TestFilter that matches Tests by their ID using regular expressions. If any of a Test ID's key/value pairs match the non-nil expressions in Key and Value, the Test is accepted. If both Key and Value are nil (i.e. a zero value RegexFilter), all Tests are accepted.

func NewRegexFilterArg

func NewRegexFilterArg(arg string) (flt *RegexFilter, err error)

NewRegexFilterArg returns a new RegexFilter from a string argument. The argument may be either a single pattern matching the value of any ID field, or a string in the form key=value, where key and value are separate patterns that must match both a Test ID key and value for it to be accepted.

func (*RegexFilter) Accept

func (f *RegexFilter) Accept(test *Test) bool

Accept implements TestFilter.
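The matching rules described for RegexFilter and NewRegexFilterArg can be sketched as follows. The regexFilter type, parseFilterArg and accept are local, hypothetical re-implementations written from the prose above. Splitting on the first "=" is an assumption about how the key=value form is parsed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// regexFilter is a local stand-in for the documented RegexFilter.
type regexFilter struct {
	Key   *regexp.Regexp
	Value *regexp.Regexp
}

// parseFilterArg builds a filter from "value" or "key=value" patterns.
func parseFilterArg(arg string) (*regexFilter, error) {
	f := &regexFilter{}
	key, value, hasKey := strings.Cut(arg, "=")
	if !hasKey {
		value = arg // a single pattern matches the value of any ID field
	}
	var err error
	if hasKey {
		if f.Key, err = regexp.Compile(key); err != nil {
			return nil, err
		}
	}
	if f.Value, err = regexp.Compile(value); err != nil {
		return nil, err
	}
	return f, nil
}

// accept returns true if any key/value pair of id matches the non-nil
// expressions, or if both expressions are nil.
func (f *regexFilter) accept(id map[string]string) bool {
	if f.Key == nil && f.Value == nil {
		return true
	}
	for k, v := range id {
		if (f.Key == nil || f.Key.MatchString(k)) &&
			(f.Value == nil || f.Value.MatchString(v)) {
			return true
		}
	}
	return false
}

func main() {
	id := map[string]string{"cca": "cubic", "rtt": "20ms"}
	for _, arg := range []string{"cca=^cubic$", "cca=bbr", "20ms"} {
		f, err := parseFilterArg(arg)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%q accepts: %v\n", arg, f.accept(id))
	}
}
```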

type Report

type Report []reporters

Report represents a list of reporters.

type ReportCommand

type ReportCommand struct {
	// DataFileUnset is called when a report was skipped because the Test's
	// DataFile field is empty.
	DataFileUnset func(test *Test)

	// NotFound is called when a report was skipped because the data file needed
	// to run it doesn't exist.
	NotFound func(test *Test, name string)

	// Reporting is called when a report starts running.
	Reporting func(test *Test)

	// Done is called when the ReportCommand is done.
	Done func(ReportInfo)
}

ReportCommand runs the After reports using the data files as the source.

type ReportInfo added in v0.4.0

type ReportInfo struct {
	Start     time.Time
	Elapsed   time.Duration
	Reported  int
	ResultDir string
}

ReportInfo contains stats and info for a report run.

type ResultInfo added in v0.4.0

type ResultInfo struct {
	Name string // base name of result directory
	Path string // path to result directory
}

ResultInfo contains information on one result.

type ResultReader added in v0.4.0

type ResultReader struct {
	// Name is the name of the result file as requested. This is not the name of
	// a file on the filesystem.
	Name string

	// Path is the path to the result file actually read, which may be either an
	// encoded or unencoded version of the file.
	Path string

	// Codec is the Codec used to decode the file. The zero value of Codec means
	// the file is read directly.
	Codec Codec

	// ReadCloser reads the result file, decoding it transparently if needed.
	io.ReadCloser
}

ResultReader reads a result file.

type ResultWriter added in v0.4.0

type ResultWriter struct {
	// Name is the name of the result file as requested. This does not
	// correspond to the name of a file on the filesystem.
	Name string

	// Path is the path to the result file actually written, including WorkDir
	// and the result prefix.
	Path string

	// Codec is the Codec used to encode the file (based on Name's extension).
	// The zero value of Codec means the file is written directly.
	Codec Codec

	// WriteCloser writes the result file, encoding it transparently if needed.
	io.WriteCloser
	// contains filtered or unexported fields
}

ResultWriter writes a result file.

func (*ResultWriter) Write added in v0.4.0

func (w *ResultWriter) Write(p []byte) (n int, err error)

Write implements io.Writer.

type Results added in v0.4.0

type Results struct {
	RootDir         string
	WorkDir         string
	ResultDirUTC    bool
	ResultDirFormat string
	LatestSymlink   string
	Codec           Codecs
}

Results configures the behavior for reading and writing result files, which include all output files and reports.

Callers must use the open method to obtain a resultRW to read and write results in WorkDir. See the doc on resultRW for more info.

type RunCommand

type RunCommand struct {
	// Filter selects which Tests to run. If Filter is nil, Tests which were not
	// run before or had errors are run.
	Filter TestFilter

	// Skipped is called when a Test was skipped because it wasn't accepted by
	// the Filter.
	Skipped func(*Test)

	// ReRunning is called when a Test is being re-run because the prior result
	// contains errors.
	ReRunning func(*Test)

	// Linked is called when Test data was linked from a prior run.
	Linked func(*Test)

	// Running is called when a Test starts running.
	Running func(*Test)

	// Done is called when the RunCommand is done.
	Done func(RunInfo)
}

RunCommand runs tests and reports.

type RunInfo added in v0.4.0

type RunInfo struct {
	sync.Mutex
	Start     time.Time
	Elapsed   time.Duration
	Ran       int
	Linked    int
	ResultDir string
}

RunInfo contains stats and info for a test run.

type SaveFiles

type SaveFiles struct {
	Consume bool
}

SaveFiles is a reporter that saves FileData. If Consume is true, FileData items are not forwarded to the out channel.
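The Consume flag can be pictured as a pipeline stage that either forwards or swallows file items. The sketch below uses channels of any and a hypothetical FileData struct. The real reporter interface and FileData fields are not shown on this page, so treat the signature as an assumption.

```go
package main

import "fmt"

// FileData is a hypothetical stand-in for a chunk of file content.
type FileData struct {
	Name string
	Data []byte
}

// saveFiles passes each FileData item to save. If consume is true, those
// items are dropped afterwards; every other item is forwarded to out.
// out is closed once in is drained.
func saveFiles(in <-chan any, out chan<- any, consume bool, save func(FileData)) {
	defer close(out)
	for item := range in {
		if fd, ok := item.(FileData); ok {
			save(fd)
			if consume {
				continue
			}
		}
		out <- item
	}
}

func main() {
	in := make(chan any, 2)
	in <- FileData{Name: "mid.pcap", Data: []byte{0x01}}
	in <- "log line"
	close(in)

	out := make(chan any, 2)
	saveFiles(in, out, true, func(fd FileData) {
		fmt.Println("saved", fd.Name)
	})
	for item := range out {
		fmt.Println("forwarded:", item)
	}
}
```

Consuming FileData keeps large items such as pcaps out of later pipeline stages, which is where a record like FileRef becomes useful.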

type Server added in v0.4.0

type Server struct {
	ListenAddr string
	RootDir    string
}

Server is the builtin web server.

func (Server) Run added in v0.4.0

func (s Server) Run(ctx context.Context) (err error)

Run runs the server.

type ServerCommand added in v0.4.0

type ServerCommand struct {
}

ServerCommand runs the builtin web server.

type StreamAnalysis added in v0.5.0

type StreamAnalysis struct {
	Flow         node.Flow
	Client       node.StreamInfo
	Server       node.StreamInfo
	Sent         []node.StreamIO
	Rcvd         []node.StreamIO
	TCPInfo      []node.TCPInfo
	GoodputPoint []GoodputPoint
	RtxCumAvg    []rtxCumAvg
	FCT          metric.Duration
	Length       metric.Bytes
}

StreamAnalysis contains the data and calculated stats for a stream.

func (*StreamAnalysis) Goodput added in v0.5.0

func (s *StreamAnalysis) Goodput() metric.Bitrate

Goodput returns the total goodput for the stream.

func (*StreamAnalysis) T0 added in v0.5.0

func (s *StreamAnalysis) T0() time.Time

T0 returns the earliest absolute time from Sent or Rcvd.

type Test

type Test struct {
	// ID uniquely identifies the Test in the test package.
	ID TestID

	// Path is the path prefix for result files.
	Path string

	// DataFile is the name of the gob file containing the raw result data. If
	// empty, raw result data is not saved for the Test.
	DataFile string

	// Run is the top-level Run instance.
	node.Run

	// DuringDefault is the first part of a pipeline of Reports run while the
	// Test runs.
	DuringDefault Report

	// During is the latter part of a pipeline of Reports run while the Test
	// runs.
	During Report

	// AfterDefault is the first part of a pipeline of Reports run after the
	// Test runs.
	AfterDefault Report

	// After is the latter part of a pipeline of Reports run after the Test
	// runs.
	After Report
}

Test is an Antler test.

func (*Test) DataHasError added in v0.4.0

func (t *Test) DataHasError(rw resultRW) (hasError bool, err error)

DataHasError returns true if the DataFile exists and has errors. See DataReader for the errors that may be returned.

func (*Test) DataReader

func (t *Test) DataReader(rw resultRW) (rc io.ReadCloser, err error)

DataReader returns a ReadCloser for reading result data.

If DataFile is empty, DataFileUnsetError is returned.

If the data file does not exist, errors.Is(err, fs.ErrNotExist) returns true.

func (*Test) DataWriter

func (t *Test) DataWriter(rw resultRW) (wc io.WriteCloser, err error)

DataWriter returns a WriteCloser for writing result data to the work directory.

If DataFile is empty, DataFileUnsetError is returned.

func (*Test) LinkPriorData added in v0.4.0

func (t *Test) LinkPriorData(rw resultRW) (err error)

LinkPriorData creates hard links to the most recent result data for this Test. DataFile is linked, along with any FileRefs it contains.

If DataFile is empty, DataFileUnsetError is returned.

If no prior result data for this Test could be found, LinkError is returned.

func (*Test) RW added in v0.4.0

func (t *Test) RW(work resultRW) resultRW

RW returns a child resultRW for reading and writing this Test's results.

type TestFilter

type TestFilter interface {
	Accept(*Test) bool
}

A TestFilter accepts or rejects Tests.

type TestID

type TestID map[string]string

TestID represents a compound Test identifier. Keys and values must match the regex defined in config.cue.

func (TestID) Equal

func (i TestID) Equal(other TestID) bool

Equal returns true if other is equal to this TestID (they contain the same key/value pairs).

func (TestID) Match added in v0.5.0

func (i TestID) Match(pattern TestID) (matched bool, err error)

Match returns matched true if each of the keys in pattern is in the TestID, and each of the value patterns in pattern match the TestID's corresponding values. A zero value pattern always matches the ID.

func (TestID) String

func (i TestID) String() string

String returns the Test ID in the form: [K=V ...] with key/value pairs sorted by their keys.
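The String and Match behavior described above can be sketched with a local TestID type. Two details are assumptions: the exact separator in the string form (a single space here), and the use of unanchored regular expression matching for value patterns.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// TestID is a local copy of the documented type, for illustration.
type TestID map[string]string

// String formats the ID as [K=V ...] with pairs sorted by key.
func (i TestID) String() string {
	keys := make([]string, 0, len(i))
	for k := range i {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteByte('[')
	for n, k := range keys {
		if n > 0 {
			b.WriteByte(' ')
		}
		fmt.Fprintf(&b, "%s=%s", k, i[k])
	}
	b.WriteByte(']')
	return b.String()
}

// Match reports whether every key in pattern exists in the ID and its
// value pattern matches the corresponding value. An empty or nil pattern
// always matches.
func (i TestID) Match(pattern TestID) (bool, error) {
	for k, p := range pattern {
		v, ok := i[k]
		if !ok {
			return false, nil
		}
		m, err := regexp.MatchString(p, v)
		if err != nil {
			return false, err
		}
		if !m {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	id := TestID{"rtt": "20ms", "cca": "cubic"}
	fmt.Println(id)
	fmt.Println(id.Match(TestID{"cca": "^cu"}))
	fmt.Println(id.Match(TestID{"qdisc": ".*"}))
}
```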

type Tests added in v0.5.0

type Tests []Test

Tests wraps a list of Tests to add functionality.

type VetCommand

type VetCommand struct {
}

VetCommand loads and checks the CUE config.

Directories

Path     Synopsis
cmd
metric   Package metric provides base types for units, measurement and statistics.
