GoBAT

Module version: v0.0.0-...-be71968
Published: Oct 14, 2021 License: Apache-2.0

GoBAT

Golang Background Traffic Generator Tooling for a large K8s Cluster.

Purpose and Philosophy

  • Load all network paths and involved HW/SW functions in the system to a significant level
  • Do not exceed 20-30% load anywhere, to avoid overload during planned failure or resilience test cases (TCs)
  • Cover all relevant supported networking options/functions in the traffic design
  • Use different traffic types, e.g. UDP and HTTP(S), kept as basic as possible while still covering K8s functionality
  • Generate and terminate traffic by workload pods inside the K8s cluster (scales with cluster size)
  • Simulate North-South traffic by routing packets through the DC-Edge or GW and back to the cluster (example: traffic from K8s secondary to primary network)
  • Random pairing of client with server pods/services to statistically cover all nodes, paths and functions
  • BAT traffic needs to survive or recover from source and destination pod restarts
  • Record per-stream stats (sent/dropped/RTT) every minute to capture traffic disturbances
  • Packet/request frequency above 100/s per stream, so that traffic interruptions as short as 10 ms are detected
  • Metadata of affected streams often allows isolating the root cause
  • Use aggregate traffic stats for monitoring the cluster health
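
The random pairing goal can be illustrated with a short Go sketch. This is not GoBAT's pairing script (which is out of scope of the project); the `Endpoint` and `Stream` types and the `pairRandomly` function are hypothetical names chosen for illustration, and the only rule assumed here is that a client is never paired with its own pod.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Endpoint is a hypothetical description of a BAT client or server pod.
type Endpoint struct {
	Pod  string
	Node string
}

// Stream is one client-to-server pairing.
type Stream struct {
	Client Endpoint
	Server Endpoint
}

// pairRandomly assigns each client one randomly chosen server, never the
// client's own pod. Clients without any valid peer are skipped.
func pairRandomly(clients, servers []Endpoint, seed int64) []Stream {
	rng := rand.New(rand.NewSource(seed))
	streams := make([]Stream, 0, len(clients))
	for _, c := range clients {
		// Collect all servers other than the client itself.
		candidates := make([]Endpoint, 0, len(servers))
		for _, s := range servers {
			if s.Pod != c.Pod {
				candidates = append(candidates, s)
			}
		}
		if len(candidates) == 0 {
			continue
		}
		server := candidates[rng.Intn(len(candidates))]
		streams = append(streams, Stream{Client: c, Server: server})
	}
	return streams
}

func main() {
	pods := []Endpoint{
		{Pod: "bat-a", Node: "node-1"},
		{Pod: "bat-b", Node: "node-2"},
		{Pod: "bat-c", Node: "node-3"},
	}
	// A fixed seed is used only to make the demo repeatable.
	for _, s := range pairRandomly(pods, pods, 1) {
		fmt.Printf("%s (%s) -> %s (%s)\n", s.Client.Pod, s.Client.Node, s.Server.Pod, s.Server.Node)
	}
}
```

Re-running the pairing with a different seed over time is what would spread streams statistically across nodes and paths; a single run only covers one random selection.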

Architecture

[Image: GoBAT Logo]

BAT Pairing script and ConfigMap

The script queries K8s for BAT services, pods and interfaces and generates a pairing ConfigMap (the script implementation is out of scope of this project). An example pairing ConfigMap is present here.

The traffic profile can also be configured using the net-bat-profile ConfigMap. Here is an example that configures UDP and HTTP streams.

Network BAT container
  • Implemented in Go, using goroutines for concurrency
  • Currently implements a UDP client and server; extensible to plug in other protocol clients/servers
  • TGC watches for BAT ConfigMap changes, processes the pairing file and creates a TGen instance per originating stream
  • Each TGen instance generates traffic and uses the Prometheus Go client to report per-stream metrics
  • One TApp instance per interface answers all incoming requests
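
The request/echo pattern between a TGen client and a TApp server can be sketched with the Go standard library alone. This is a minimal illustration, not GoBAT's code: the function names (`serveEcho`, `probe`, `runLoopback`), the 8-byte sequence-number payload and the 50 ms timeout are all assumptions made for the example. It shows how a missing echo within the timeout would be counted as a drop and how an RTT is taken per request.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// serveEcho plays the TApp role: it echoes every datagram back to its sender
// until the socket is closed.
func serveEcho(conn net.PacketConn) {
	buf := make([]byte, 1500)
	for {
		n, addr, err := conn.ReadFrom(buf)
		if err != nil {
			return // socket closed
		}
		_, _ = conn.WriteTo(buf[:n], addr)
	}
}

// probe plays the TGen role for one request: it sends a sequence number and
// waits for the echo. ok is false when the reply timed out (counted as a drop).
func probe(conn net.Conn, seq uint64, timeout time.Duration) (rtt time.Duration, ok bool, err error) {
	req := make([]byte, 8)
	binary.BigEndian.PutUint64(req, seq)
	start := time.Now()
	if _, err = conn.Write(req); err != nil {
		return 0, false, err // would be counted as send_failed
	}
	if err = conn.SetReadDeadline(start.Add(timeout)); err != nil {
		return 0, false, err
	}
	resp := make([]byte, 8)
	for {
		n, rerr := conn.Read(resp)
		if errors.Is(rerr, os.ErrDeadlineExceeded) {
			return 0, false, nil // no echo in time: dropped
		}
		if rerr != nil {
			return 0, false, rerr
		}
		if n == 8 && binary.BigEndian.Uint64(resp) == seq {
			return time.Since(start), true, nil
		}
		// A late echo of an earlier probe: ignore it and keep waiting.
	}
}

// runLoopback sends n probes, paced at up to 100 requests/s, to a local
// server that echoes only when echo is true, and returns the received and
// dropped counts. Probes are sent one at a time to keep the sketch short.
func runLoopback(n int, echo bool) (received, dropped int, err error) {
	srv, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer srv.Close()
	if echo {
		go serveEcho(srv)
	}
	conn, err := net.Dial("udp", srv.LocalAddr().String())
	if err != nil {
		return 0, 0, err
	}
	defer conn.Close()

	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for seq := uint64(1); seq <= uint64(n); seq++ {
		<-ticker.C
		_, ok, perr := probe(conn, seq, 50*time.Millisecond)
		switch {
		case perr != nil:
			return received, dropped, perr
		case ok:
			received++
		default:
			dropped++
		}
	}
	return received, dropped, nil
}

func main() {
	received, dropped, err := runLoopback(5, true)
	fmt.Println("received:", received, "dropped:", dropped, "err:", err)
}
```

A real generator would keep sending at a fixed rate regardless of outstanding replies; waiting for each echo before sending the next request is a simplification of this sketch.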
Prometheus PM server scraping all BAT pods for stream metrics
  • Compute and store stream interval metrics in Time Series DB (TSDB)
  • Compute and store summary stats in TSDB
  • Grafana Dashboard to visualize summary stats
  • Postprocessing scripts and trouble-shooting tools querying PM TSDB

The deployment and configuration of Prometheus and Grafana are out of scope of this project.

Network BAT Metrics in Prometheus

Metrics per BAT stream:
  • Duration
  • Packets/Requests send_failed/sent/received/dropped (a drop means a timeout in TGen)
  • RTT quantiles: 50%, 90%, 95%, 99%
Interval stats (one-minute interval, configured at PM)
  • Packets sent/received/dropped
  • Interval PPM drop rate
Metrics per BAT server process
  • Number of ongoing client connections
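
How interval stats can be derived from cumulative per-stream counters is sketched below. In the described setup this computation happens at the PM side; the Go code only illustrates the arithmetic. The `Counters` and `IntervalStats` types, the `interval` function, the definition of the PPM drop rate as dropped per million sent, and the handling of counter resets are assumptions, not taken from GoBAT.

```go
package main

import "fmt"

// Counters is a hypothetical snapshot of one stream's cumulative counters.
type Counters struct {
	Sent, Received, Dropped uint64
}

// IntervalStats holds the per-interval deltas and the drop rate in parts
// per million (PPM).
type IntervalStats struct {
	Sent, Received, Dropped uint64
	DropPPM                 float64
}

// interval derives the stats for one interval from two consecutive snapshots.
// If a counter went backwards (for example after a pod restart), the current
// value is taken as the delta.
func interval(prev, cur Counters) IntervalStats {
	delta := func(p, c uint64) uint64 {
		if c < p {
			return c
		}
		return c - p
	}
	s := IntervalStats{
		Sent:     delta(prev.Sent, cur.Sent),
		Received: delta(prev.Received, cur.Received),
		Dropped:  delta(prev.Dropped, cur.Dropped),
	}
	if s.Sent > 0 {
		// Assumed definition: dropped packets per million packets sent.
		s.DropPPM = float64(s.Dropped) * 1e6 / float64(s.Sent)
	}
	return s
}

func main() {
	prev := Counters{Sent: 60000, Received: 59990, Dropped: 10}
	cur := Counters{Sent: 66000, Received: 65987, Dropped: 13}
	s := interval(prev, cur)
	fmt.Printf("sent=%d received=%d dropped=%d drop rate=%.0f PPM\n",
		s.Sent, s.Received, s.Dropped, s.DropPPM)
}
```

In this example the stream sent 6000 packets during the interval and 3 of them timed out, which corresponds to 500 PPM.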

Acknowledgements

Thanks big time to Jan Scheurich for his invaluable design inputs and code reviews.

Directories

Path: Synopsis
  • cmd
  • pkg
  • tapp: Package tapp contains different protocol stream server implementations
  • tgc: Package tgc contains the controller implementation, which reacts to ConfigMap events and handles registration, start and teardown of tgen and tapp instances
  • tgen: Package tgen contains different protocol stream client implementations
  • util: Package util contains necessary type definitions and helper methods for the tgc, tgen and tapp modules
