xvs

package module
v0.1.15
Published: Jun 18, 2024 License: GPL-2.0 Imports: 20 Imported by: 1

README

XDP Virtual Server

An XDP/eBPF load balancer and Go API for Linux.

This code originally comes from the vc5 load balancer and has been split out to be developed separately.

This code implements an IPv4 layer 2 Direct Server Return load balancer with an eBPF data plane that is loaded into the kernel via XDP and a supporting Go library to configure the balancer. Backend servers need to share a VLAN with the load balancer. Multiple VLANs/interfaces are supported.

The ELF object file is committed to this repository (main branch) and is accessed via Go's embed feature, which means that the library can be used as a standard Go module without having to build the eBPF binary as a separate step. libbpf is still required for linking programs using the library (the CGO_CFLAGS and CGO_LDFLAGS environment variables may need to be used to specify the location of the library - see the Makefile for an example of how to do this).

Support for layer 3 DSR and IPv6 is planned.

Portability

eBPF code is JITted to the native instruction set at runtime, so this should run on any Linux architecture. Currently AMD64 and ARM (Raspberry Pi) are confirmed to work.

Devices with constrained memory might have issues loading the default-size flow state tables, so you may have to rebuild the eBPF object file with the defaults overridden (see the raspberrypi target in the Makefile).

Pi wi-fi load balancer:

cmd/balancer wlan0 192.168.0.16 192.168.101.1 192.168.0.10 192.168.0.11

Documentation

https://pkg.go.dev/github.com/davidcoles/xvs

The API is loosely modelled on the Cloudflare IPVS library (Go reference).
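As a rough illustration of how the types described below fit together, here is a minimal, untested sketch that starts the balancer on one interface and creates a TCP service with two destinations. The interface name and all addresses are placeholders:

package main

import (
	"log"
	"net/netip"

	"github.com/davidcoles/xvs"
)

func main() {
	client := &xvs.Client{
		Interfaces: []string{"ens192"},              // interface to attach the XDP program to
		Address:    netip.MustParseAddr("10.1.2.3"), // address of this load balancer
	}

	if err := client.Start(); err != nil {
		log.Fatal(err)
	}

	// A TCP service on the VIP, port 80
	svc := xvs.Service{Address: netip.MustParseAddr("192.168.101.1"), Port: 80, Protocol: xvs.TCP}

	if err := client.CreateService(svc); err != nil {
		log.Fatal(err)
	}

	// Add two real servers as destinations for the service
	for _, rip := range []string{"10.1.2.10", "10.1.2.11"} {
		dst := xvs.Destination{Address: netip.MustParseAddr(rip), Weight: 1}
		if err := client.CreateDestination(svc, dst); err != nil {
			log.Fatal(err)
		}
	}
}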

Sample application

A simple application in the cmd/ directory will balance traffic sent to a VIP (TCP port 80 by default; this can be changed with flags) across a number of backend servers on the same IP subnet.

Compile/run with:

  • make example
  • cmd/balancer ens192 10.1.2.3 192.168.101.1 10.1.2.10 10.1.2.11 10.1.2.12

Replace ens192 with your Ethernet interface name, 10.1.2.3 with the address of the machine you are running the program on, 192.168.101.1 with the VIP you want to use, and 10.1.2.10-12 with any number of real server addresses.

On a separate client machine on the same subnet you should add a static route for the VIP, e.g.:

  • ip r add 192.168.101.1 via 10.1.2.3

You should then be able to contact the service:

  • curl http://192.168.101.1/

No health checking is done, so you'll have to make sure that a webserver is running on the real servers and that the VIP has been configured on the loopback interface (ip a add 192.168.101.1 dev lo).

A more complete example with health checking and BGP route health injection is available at vc5.

Performance

This has mostly been tested using Icecast backend servers with clients pulling a mix of low and high bitrate streams (48kbps - 192kbps).

A VMware guest (4 core, 8GB) using the XDP generic driver was able to support 100K concurrent clients, 380Mbps/700Kpps through the load balancer and 8Gbps of traffic from the backends directly to the clients. Going above 700Kpps caused connections to be dropped, regardless of the number of cores or memory assigned to the VM, so I suspect that there is a limit on the number of interrupts that the VM is able to handle per second.

On a single (non-virtualised) Intel Xeon Gold 6314U CPU (2.30GHz, 32 physical cores, with hyperthreading enabled for 64 logical cores) and an Intel 10G 4P X710-T4L-t ethernet card, I was able to run 700K streams at 2Gbps/3.8Mpps ingress traffic and 46.5Gbps egress. The server was more than 90% idle. Unfortunately I did not have the resources available to create more clients/servers. I later realised that I had carried this out with the server's power profile set to performance-per-watt. In performance mode the CPU usage is barely 2% and latency is less than 250 nanoseconds.

On a Raspberry Pi (B+) ... don't get your hopes up!

Recalcitrant cards

I'm currently investigating issues with the Intel X710 card. We have had trouble getting the NIC to bring up links (particularly after a switch reboot), though this may be due to the SFP+ module/optics. I've been able to force a renegotiation with ethtool -r, but this then has the effect of breaking XDP. This seems to be fixable by reattaching the BPF section, so I have added a function to carry this out. The generic driver did not show this problem.

This has been extremely disappointing as the Intel X520 card (as well as an older Intel 1Gbps card that I can't remember the model of) worked perfectly, and pulling/reinserting cables on a bond behaved exactly as I would have hoped.

Documentation

Index

Constants

const (
	F_NO_SHARE_FLOWS    = bpf.F_NO_SHARE_FLOWS
	F_NO_TRACK_FLOWS    = bpf.F_NO_TRACK_FLOWS
	F_NO_ESTIMATE_CONNS = bpf.F_NO_ESTIMATE_CONNS
	F_NO_STORE_STATS    = bpf.F_NO_STORE_STATS
)
const PREFIXES = 1048576
const VETH = bpf.VETH_ID

Variables

This section is empty.

Functions

This section is empty.

Types

type Client

type Client struct {
	NAT        bool
	Native     bool
	Interfaces []string
	Address    netip.Addr
	VLANs      map[uint16]net.IPNet
	Debug      Debug
	InitDelay  uint8
	MaxFlows   uint32
	// contains filtered or unexported fields
}

func (*Client) Block added in v0.1.13

func (c *Client) Block(b [PREFIXES]bool)

func (*Client) CreateDestination

func (c *Client) CreateDestination(s Service, d Destination) error

func (*Client) CreateService

func (c *Client) CreateService(s Service) error

func (*Client) Destinations

func (c *Client) Destinations(s Service) (destinations []DestinationExtended, e error)

func (*Client) Flags added in v0.1.11

func (c *Client) Flags(f uint8)

func (*Client) Info

func (c *Client) Info() (i Info)

func (*Client) NATAddress

func (c *Client) NATAddress(vip, rip netip.Addr) (r netip.Addr, _ bool)

Return the NAT address of a virtual and real IP address pair - traffic to this address will be translated to target the services on the real server. This address can be used to query the services for health checking purposes.
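For example, a health check could be run against the NAT address rather than against the real server directly. This is an untested sketch using net/http and net/netip; the /healthz path is a hypothetical endpoint and the assumption is that the service speaks HTTP:

// checkBackend probes a backend's copy of the service via its NAT address.
func checkBackend(client *xvs.Client, vip, rip netip.Addr) bool {
	nat, ok := client.NATAddress(vip, rip)
	if !ok {
		return false
	}

	resp, err := http.Get("http://" + nat.String() + "/healthz")
	if err != nil {
		return false
	}
	defer resp.Body.Close()

	return resp.StatusCode == http.StatusOK
}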

func (*Client) Namespace

func (c *Client) Namespace() string

Retrieve the name of the network namespace that xvs is using.

func (*Client) NamespaceAddress

func (c *Client) NamespaceAddress() string

Retrieve the IP address of the interface in the network namespace that xvs is using.

func (*Client) Prefixes

func (c *Client) Prefixes() [PREFIXES]uint64
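Prefixes and Block operate on fixed-size arrays of 1048576 (2^20) entries. Assuming the uint64 values returned by Prefixes are per-prefix traffic counters (and noting that the mapping from array index to address prefix is not documented here), one way they could be combined is sketched below; the threshold is arbitrary:

// blockBusyPrefixes blocks any prefix whose counter exceeds a threshold
// (untested sketch; index-to-prefix mapping and counter semantics assumed).
func blockBusyPrefixes(client *xvs.Client, threshold uint64) {
	counts := client.Prefixes()

	var block [xvs.PREFIXES]bool
	for i, count := range counts {
		if count > threshold {
			block[i] = true
		}
	}

	client.Block(block)
}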

func (*Client) ReadFlow added in v0.1.3

func (c *Client) ReadFlow() []byte

Retrieve a flow state descriptor from the kernel via a queue. Retrieved descriptors can be shared with other load balancers in a cluster to facilitate failover. If no more flows are currently available then the length of the byte slice returned will be zero.

func (*Client) ReattachBPF added in v0.1.14

func (c *Client) ReattachBPF(nic string) error

Rerun bpf_xdp_attach with the XDP forwarding eBPF code. This can be used to correct an issue with some network cards/drivers which seem to forget about the XDP hook (e.g. the Intel X710) after being poked with ethtool (ethtool -r was problematic for me). In a bonded Ethernet setup I have had success with removing a member from the bond, running ethtool -r, reattaching the eBPF program and then re-introducing the member to the bond, with short pauses between each step.
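An untested sketch of that recovery sequence, using os/exec to run iproute2/ethtool commands; the interface and bond names are placeholders and the pauses are arbitrary:

// recycleBondMember removes an interface from a bond, renegotiates its link,
// reattaches the XDP program and puts the interface back into the bond.
func recycleBondMember(client *xvs.Client, nic, bond string) error {
	run := func(args ...string) error {
		return exec.Command(args[0], args[1:]...).Run()
	}

	if err := run("ip", "link", "set", "dev", nic, "nomaster"); err != nil {
		return err
	}
	time.Sleep(2 * time.Second)

	if err := run("ethtool", "-r", nic); err != nil { // may detach XDP on some NICs
		return err
	}
	time.Sleep(2 * time.Second)

	if err := client.ReattachBPF(nic); err != nil { // re-run bpf_xdp_attach
		return err
	}
	time.Sleep(2 * time.Second)

	return run("ip", "link", "set", "dev", nic, "master", bond)
}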

func (*Client) RemoveDestination

func (c *Client) RemoveDestination(s Service, d Destination) error

func (*Client) RemoveService

func (c *Client) RemoveService(s Service) error

func (*Client) Service

func (c *Client) Service(s Service) (se ServiceExtended, e error)

func (*Client) Services

func (c *Client) Services() (services []ServiceExtended, e error)

func (*Client) SetService added in v0.1.2

func (c *Client) SetService(s Service, dst ...Destination) error

func (*Client) Start

func (c *Client) Start() error

func (*Client) UpdateVLANs

func (c *Client) UpdateVLANs(vlans map[uint16]net.IPNet)
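A short, untested sketch of building the VLAN map from CIDR strings; the VLAN IDs and subnets are placeholders:

// declareVLANs tells the client which subnet lives on each tagged VLAN.
func declareVLANs(client *xvs.Client) error {
	cidrs := map[uint16]string{10: "10.10.0.0/24", 20: "10.20.0.0/24"}

	vlans := map[uint16]net.IPNet{}
	for vid, cidr := range cidrs {
		_, ipnet, err := net.ParseCIDR(cidr)
		if err != nil {
			return err
		}
		vlans[vid] = *ipnet
	}

	client.UpdateVLANs(vlans)
	return nil
}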

func (*Client) WriteFlow added in v0.1.3

func (c *Client) WriteFlow(fs []byte)

Make a flow returned from ReadFlow() known to the local eBPF program to facilitate cluster failover.
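An untested sketch of pairing ReadFlow and WriteFlow to share flow state between cluster members; sendToPeers and flowsFromPeers are hypothetical stand-ins for whatever transport you use between load balancers:

// shareFlows drains the local flow queue out to peers and feeds flows
// learned from peers into the local eBPF program.
func shareFlows(client *xvs.Client, sendToPeers func([]byte), flowsFromPeers <-chan []byte) {
	go func() {
		for {
			if f := client.ReadFlow(); len(f) > 0 {
				sendToPeers(f)
			} else {
				time.Sleep(10 * time.Millisecond) // queue empty; back off briefly
			}
		}
	}()

	go func() {
		for f := range flowsFromPeers {
			client.WriteFlow(f)
		}
	}()
}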

type Debug added in v0.1.11

type Debug interface {
	NAT(tag map[netip.Addr]int16, arp map[netip.Addr][6]byte, vrn map[[2]netip.Addr]netip.Addr, nat map[netip.Addr]string, out []netip.Addr, in []string)
	Redirects(vlans map[uint16]string)
	Backend(vip netip.Addr, port uint16, protocol uint8, backends []byte, took time.Duration)
}

type Destination

type Destination struct {
	Address netip.Addr // Destination server IP address
	Weight  uint8      // Not fully implemented; 0 - don't use, non-zero enables the destination
	// contains filtered or unexported fields
}

type DestinationExtended

type DestinationExtended struct {
	Destination Destination
	MAC         MAC
	Stats       Stats
}

type Info

type Info struct {
	Packets   uint64
	Octets    uint64
	Flows     uint64
	Latency   uint64
	Dropped   uint64
	Blocked   uint64
	NotQueued uint64
}

type MAC

type MAC = mac

type Protocol

type Protocol = uint8
const (
	TCP Protocol = 0x06
	UDP Protocol = 0x11
)

type Service

type Service struct {
	Address  netip.Addr // The Virtual IP address of the service
	Port     uint16     // Layer 4 port number
	Protocol Protocol   // IP protocol number; TCP (6) and UDP (17) are currently supported
	Sticky   bool       // Only use source and destination IP addresses when determining the backend
}

type ServiceExtended

type ServiceExtended struct {
	Service Service
	Stats   Stats
}

type Stats

type Stats struct {
	Packets uint64
	Octets  uint64
	Flows   uint64
	Current uint64
}

Directories

