contiv

package
v2.0.3+incompatible
Warning

This package is not in the latest version of its module.

Published: Dec 10, 2018 License: Apache-2.0 Imports: 63 Imported by: 0

Documentation

Overview

Package contiv implements a plugin providing a GRPC server that accepts requests from the contiv-CNI (acting as a GRPC client) and configures the networking between VPP and the PODs.

The plugin is configured via a config file whose path is given by the `-contiv-config="<path to config>"` argument when running the contiv-agent. The file is usually injected into the vswitch POD by a config map in the k8s deployment file of the contiv-VPP k8s networking plugin (see the contiv-agent-cfg ConfigMap in ../../k8s/contiv-vpp.yaml).

Based on the configuration, the plugin can wire PODs in 3 different ways:

1. VETH-based pod-VPP connectivity (default)

Each POD is wired to VPP using a virtual ethernet interface pair, where one end is connected to VPP using AF_PACKET interface and the other end is placed into the POD's network namespace:

    +-------------------------------------------------+
    |   vSwitch VPP                                host.go
    |                             +--------------+    |       +--------------+
    |                             |   VETH VPP   |____________|   VETH Host  |
    |          routing            |              |    |       |              |
    |                             +--------------+    |       +--------------+
    |    +------+       +------+                      |
    |    |  AF1 |       | AFn  |                      |
    |    |      |  ...  |      |                      |
    |    +------+       +------+                      |
    |      ^                                          |
    |      |                                          |
    +------|------------------------------------------+
           v
    +------------+
    |            |
    | VETH1-VPP  |
    |            |
    +------------+
           ^
           |              pod.go
    +------|------------+
    |  NS1 v            |
    |  +------------+   |
    |  |            |   |
    |  | VETH1-POD  |   |
    |  |            |   |
    |  +------------+   |
    |                   |
    +-------------------+

2. TAP-based pod-VPP connectivity

Each POD is wired to VPP using a TAP interface created on VPP. This mode is turned on by setting UseTAPInterfaces: True in the config file. Both the legacy and the new virtio-based TAP interfaces are supported; the latter is selected by setting TAPInterfaceVersion: 2.

    +-------------------------------------------------+
    |   vSwitch VPP                                host.go
    |                             +--------------+    |       +--------------+
    |                             |   VETH VPP   |____________|   VETH Host  |
    |          routing            |              |    |       |              |
    |                             +--------------+    |       +--------------+
    |    +-------+       +-------+                    |
    |    |  TAP1 |       | TAPn  |                    |
    |    |       |  ...  |       |                    |
    |    +-------+       +-------+                    |
    |      ^                                          |
    |      |                                          |
    +------|------------------------------------------+
           |
           |              pod.go
    +------|------------+
    |  NS1 v            |
    |  +------------+   |
    |  |            |   |
    |  |  TAP1-POD  |   |
    |  |            |   |
    |  +------------+   |
    |                   |
    +-------------------+

3. VPP TCP stack based pod-VPP connectivity

The PODs communicate with VPP via shared memory between the VPP TCP stack and the VCL library in the PODs. To enable this, the plugin needs to be configured with TCPstackDisabled: False in the plugin config file and the POD needs to be deployed with the ldpreload: "true" label. If the label is not specified for a POD, the communication between the POD and VPP falls back to option 1 or 2.

    +-------------------------------------------------+
    |   vSwitch VPP                                host.go
    |                             +--------------+    |       +--------------+
    |                             |   VETH VPP   |____________|   VETH Host  |
    |          routing            |              |    |       |              |
    |                             +--------------+    |       +--------------+
    |    +-------+       +-------+                    |
    |    | LOOP1 |       | LOOPn |                    |
    |    |       |  ...  |       |                    |
    |    +-------+       +-------+                    |
    |        ^               ^                        |
    |        |               |                        |
    |        v               v                        |
    |    +-----------------------+                    |
    |    |     VPP TCP Stack     |                    |
    |    +-----------------------+                    |
    |      ^                                          |
    |      |                                          |
    +------|------------------------------------------+
           |
           |                 pod.go
    +------|---------------+
    |  NS1 v               |
    |  +-----------------+ |
    |  |       VCL       | |
    |  | (LD_PRELOAD-ed) | |
    |  +-----------------+ |
    |          ^           |
    |          |           |
    |          v           |
    |      +------+        |
    |      | APP  |        |
    |      +------+        |
    +----------------------+

Note: the picture above is simplified. Each LD_PRELOAD-ed POD is also wired with the veth/tap (option 1/2), which serves non-TCP/UDP communication and applications that are not LD_PRELOAD-ed.
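The selection between the three wiring options can be summarized in a few lines of Go. The sketch below is only an illustration of the rules described above: the wiringConfig struct and the podWiring helper are hypothetical names, and only the field names UseTAPInterfaces, TAPInterfaceVersion, TCPstackDisabled and the ldpreload label come from this documentation.

```go
package main

import "fmt"

// wiringConfig mirrors only the Config fields discussed above. It is a
// stand-in for illustration, not the package's Config type.
type wiringConfig struct {
	UseTAPInterfaces    bool
	TAPInterfaceVersion uint8
	TCPstackDisabled    bool
}

// podWiring is a hypothetical helper listing the connections a pod would get.
func podWiring(cfg wiringConfig, podLabels map[string]string) []string {
	base := "veth" // option 1 (default)
	if cfg.UseTAPInterfaces {
		base = "tap" // option 2, legacy TAP
		if cfg.TAPInterfaceVersion == 2 {
			base = "tapv2" // option 2, virtio-based TAP
		}
	}
	wiring := []string{base}
	// Option 3 is added on top of the veth/tap wiring, never instead of it.
	if !cfg.TCPstackDisabled && podLabels["ldpreload"] == "true" {
		wiring = append(wiring, "vcl")
	}
	return wiring
}

func main() {
	cfg := wiringConfig{UseTAPInterfaces: true, TAPInterfaceVersion: 2}
	fmt.Println(podWiring(cfg, map[string]string{"ldpreload": "true"}))
	fmt.Println(podWiring(wiringConfig{TCPstackDisabled: true}, nil))
}
```

A pod without the label, or any pod on an agent with the TCP stack disabled, ends up with the veth or tap wiring only.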

Plugin Structure

The plugin consists of these components:

  1. Plugin base:
     - plugin_*.go: plugin definition and setup
     - node_events.go: handler of changes in nodes within the k8s cluster (node add / delete)

  2. Remote CNI Server - the main logic of the plugin that is in charge of wiring the PODs.

  3. Node ID Allocator - manages allocation/deallocation of a unique number identifying a node within the k8s cluster. The allocated identifier is used as an input to the IPAM calculations.

  4. IPAM module (separate package, described in its own doc.go) - provides node-local IP address assignments.

  5. Helper functions:
     - host.go: provides host-related helper functions and VPP-Agent NB API builders
     - pod.go: provides POD-related helper functions and VPP-Agent NB API builders


Constants

const (
	// ConfigFlagName is name of flag that can be used to define config for contiv plugin
	ConfigFlagName = "contiv"

	// ContivConfigPath is the default location of the Agent's Contiv plugin configuration. This path reflects configuration in k8s/contiv-vpp.yaml.
	ContivConfigPath = "/etc/agent/contiv.yaml"

	// ContivConfigPathUsage explains the purpose of the 'contiv-config' flag.
	ContivConfigPathUsage = "Path to the Agent's Contiv plugin configuration yaml file."
)
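The Overview mentions a `-contiv-config` argument, which suggests that the flag name is ConfigFlagName followed by a "-config" suffix. The sketch below works under that assumption and uses only the standard flag package; parseConfigPath is a hypothetical helper, and the agent's actual flag registration may be done differently by its plugin infrastructure.

```go
package main

import (
	"flag"
	"fmt"
)

// Constants copied from the documentation above.
const (
	ConfigFlagName        = "contiv"
	ContivConfigPath      = "/etc/agent/contiv.yaml"
	ContivConfigPathUsage = "Path to the Agent's Contiv plugin configuration yaml file."
)

// parseConfigPath is a hypothetical helper: it registers a
// "<ConfigFlagName>-config" flag (the suffix is an assumption) and returns
// the parsed path, falling back to ContivConfigPath when the flag is absent.
func parseConfigPath(args []string) (string, error) {
	fs := flag.NewFlagSet("contiv-agent", flag.ContinueOnError)
	path := fs.String(ConfigFlagName+"-config", ContivConfigPath, ContivConfigPathUsage)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	path, err := parseConfigPath([]string{"-contiv-config=/tmp/contiv.yaml"})
	fmt.Println(path, err)
}
```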
const (

	// TapHostEndLogicalName is the logical name of the VPP-host interconnect TAP interface (host end)
	TapHostEndLogicalName = "tap-vpp1"
	// TapHostEndName is the physical name of the VPP-host interconnect TAP interface (host end)
	TapHostEndName = "vpp1"
	// TapVPPEndLogicalName is the logical name of the VPP-host interconnect TAP interface (VPP end)
	TapVPPEndLogicalName = "tap-vpp2"
	// TapVPPEndName is the physical name of the VPP-host interconnect TAP interface (VPP end)
	TapVPPEndName = "vpp2"
	// HostInterconnectMAC is MAC address of tap that interconnects VPP with host stack
	HostInterconnectMAC = "01:23:45:67:89:42"
)
const (
	// Prefix is versioned prefix for REST urls
	Prefix = "/contiv/v1/"
	// PluginURL is versioned URL (using prefix) for IPAM REST endpoint
	PluginURL = Prefix + "ipam"
)
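Since PluginURL is built by constant concatenation, it evaluates to "/contiv/v1/ipam". The short sketch below shows how a client might compose the full endpoint address; ipamEndpoint and the base address used in main are illustrative assumptions, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Constants copied from the documentation above.
const (
	Prefix    = "/contiv/v1/"
	PluginURL = Prefix + "ipam"
)

// ipamEndpoint is a hypothetical helper that joins a base address with
// PluginURL, tolerating a trailing slash on the base.
func ipamEndpoint(baseURL string) string {
	return strings.TrimRight(baseURL, "/") + PluginURL
}

func main() {
	// The host and port below are placeholders for the agent's REST server.
	fmt.Println(ipamEndpoint("http://127.0.0.1:9999/"))
}
```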
const MgmtIPSeparator = ","

MgmtIPSeparator is a delimiter inserted between management IPs in nodeInfo structure
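As an illustration of how such a delimiter is typically used, the sketch below joins and splits a list of management IPs with the standard strings package. The helper names and the sample addresses are assumptions; the nodeInfo structure itself is not shown in this documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// MgmtIPSeparator is copied from the documentation above.
const MgmtIPSeparator = ","

// joinMgmtIPs is a hypothetical helper that encodes IPs into one string.
func joinMgmtIPs(ips []string) string {
	return strings.Join(ips, MgmtIPSeparator)
}

// splitMgmtIPs is a hypothetical helper that decodes the string again.
// An empty input yields no addresses rather than one empty address.
func splitMgmtIPs(s string) []string {
	if s == "" {
		return nil
	}
	return strings.Split(s, MgmtIPSeparator)
}

func main() {
	joined := joinMgmtIPs([]string{"10.0.0.1", "10.0.0.2"})
	fmt.Println(joined)
	fmt.Println(splitMgmtIPs(joined))
}
```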

Variables

This section is empty.

Functions

This section is empty.

Types

type API

type API interface {
	// GetIfName looks up logical interface name that corresponds to the interface
	// associated with the given pod.
	GetIfName(podNamespace string, podName string) (name string, exists bool)

	// GetNsIndex returns the index of the VPP session namespace associated
	// with the given pod.
	GetNsIndex(podNamespace string, podName string) (nsIndex uint32, exists bool)

	// GetPodByIf looks up podName and podNamespace that is associated with logical interface name.
	GetPodByIf(ifname string) (podNamespace string, podName string, exists bool)

	// GetPodByAppNsIndex looks up podName and podNamespace that is associated with the VPP application namespace.
	GetPodByAppNsIndex(nsIndex uint32) (podNamespace string, podName string, exists bool)

	// GetPodSubnet provides subnet used for allocating pod IP addresses across all nodes.
	GetPodSubnet() *net.IPNet

	// GetPodNetwork provides subnet used for allocating pod IP addresses on this host node.
	GetPodNetwork() *net.IPNet

	// GetContainerIndex exposes index of configured containers
	GetContainerIndex() containeridx.Reader

	// IsTCPstackDisabled returns true if the TCP stack is disabled and only VETHs/TAPs are configured
	IsTCPstackDisabled() bool

	// InSTNMode returns true if Contiv operates in the STN mode (single interface for each node).
	InSTNMode() bool

	// NatExternalTraffic returns true if traffic with cluster-outside destination should be S-NATed
	// with node IP before being sent out from the node.
	NatExternalTraffic() bool

	// CleanupIdleNATSessions returns true if cleanup of idle NAT sessions is enabled.
	CleanupIdleNATSessions() bool

	// GetTCPNATSessionTimeout returns NAT session timeout (in minutes) for TCP connections, used in case that CleanupIdleNATSessions is turned on.
	GetTCPNATSessionTimeout() uint32

	// GetOtherNATSessionTimeout returns NAT session timeout (in minutes) for non-TCP connections, used in case that CleanupIdleNATSessions is turned on.
	GetOtherNATSessionTimeout() uint32

	// GetServiceLocalEndpointWeight returns the load-balancing weight assigned to locally deployed service endpoints.
	GetServiceLocalEndpointWeight() uint8

	// GetNatLoopbackIP returns the IP address of a virtual loopback, used to route traffic
	// between clients and services via VPP even if the source and destination are the same
	// IP addresses and would otherwise be routed locally.
	GetNatLoopbackIP() net.IP

	// GetNodeIP returns the IP+network address of this node.
	// With DHCP the node IP may get assigned later or change in the runtime, therefore it is preferred
	// to watch for node IP via WatchNodeIP().
	GetNodeIP() (ip net.IP, network *net.IPNet)

	// GetHostIPs returns all IP addresses of this node present in the host network namespace (Linux).
	GetHostIPs() []net.IP

	// WatchNodeIP adds given channel to the list of subscribers that are notified upon change
	// of nodeIP address. If the channel is not ready to receive notification, the notification is dropped.
	WatchNodeIP(subscriber chan string)

	// GetMainPhysicalIfName returns name of the "main" interface - i.e. physical interface connecting
	// the node with the rest of the cluster.
	GetMainPhysicalIfName() string

	// GetOtherPhysicalIfNames returns a slice of names of all physical interfaces configured additionally
	// to the main interface.
	GetOtherPhysicalIfNames() []string

	// GetHostInterconnectIfName returns the name of the TAP/AF_PACKET interface
	// interconnecting VPP with the host stack.
	GetHostInterconnectIfName() string

	// GetVxlanBVIIfName returns the name of an BVI interface facing towards VXLAN tunnels to other hosts.
	// Returns an empty string if VXLAN is not used (in L2 interconnect mode).
	GetVxlanBVIIfName() string

	// GetDefaultInterface returns the name and the IP address of the interface
	// used by the default route to send packets out from VPP towards the default gateway.
	// If the default GW is not configured, the function returns zero values.
	GetDefaultInterface() (ifName string, ifAddress net.IP)

	// RegisterPodPreRemovalHook allows to register callback that will be run for each
	// pod immediately before its removal.
	RegisterPodPreRemovalHook(hook PodActionHook)

	// RegisterPodPostAddHook allows to register callback that will be run for each
	// pod once it is added and before the CNI reply is sent.
	RegisterPodPostAddHook(hook PodActionHook)

	// GetMainVrfID returns the ID of the main network connectivity VRF.
	GetMainVrfID() uint32

	// GetPodVrfID returns the ID of the POD VRF.
	GetPodVrfID() uint32
}

API for other plugins to query network-related information.
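A consuming plugin usually needs only a few of these methods, so it can depend on a narrower interface and be tested against a fake. The sketch below is a minimal example of that approach: podIfLookup, fakeAPI and describePod are hypothetical names, and only the two method signatures are taken from the API interface above.

```go
package main

import "fmt"

// podIfLookup is the subset of the API interface used by this example.
type podIfLookup interface {
	GetIfName(podNamespace string, podName string) (name string, exists bool)
	GetPodByIf(ifname string) (podNamespace string, podName string, exists bool)
}

// fakeAPI is a hypothetical in-memory implementation used only for the demo.
// Keys are {namespace, name} pairs, values are logical interface names.
type fakeAPI struct {
	ifByPod map[[2]string]string
}

func (f fakeAPI) GetIfName(ns, name string) (string, bool) {
	ifName, ok := f.ifByPod[[2]string{ns, name}]
	return ifName, ok
}

func (f fakeAPI) GetPodByIf(ifname string) (string, string, bool) {
	for pod, n := range f.ifByPod {
		if n == ifname {
			return pod[0], pod[1], true
		}
	}
	return "", "", false
}

// describePod shows the usual pattern: always check the exists flag.
func describePod(api podIfLookup, ns, name string) string {
	ifName, exists := api.GetIfName(ns, name)
	if !exists {
		return fmt.Sprintf("%s/%s: no interface configured", ns, name)
	}
	return fmt.Sprintf("%s/%s -> %s", ns, name, ifName)
}

func main() {
	api := fakeAPI{ifByPod: map[[2]string]string{{"default", "web"}: "tap-web"}}
	fmt.Println(describePod(api, "default", "web"))
	fmt.Println(describePod(api, "default", "missing"))
}
```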

type Config

type Config struct {
	TCPChecksumOffloadDisabled  bool
	TCPstackDisabled            bool
	UseL2Interconnect           bool
	UseTAPInterfaces            bool
	TAPInterfaceVersion         uint8
	TAPv2RxRingSize             uint16
	TAPv2TxRingSize             uint16
	Vmxnet3RxRingSize           uint16
	Vmxnet3TxRingSize           uint16
	InterfaceRxMode             string // "" = polling / interrupt / adaptive
	MTUSize                     uint32
	StealFirstNIC               bool
	StealInterface              string
	STNSocketFile               string
	NatExternalTraffic          bool   // if enabled, traffic with cluster-outside destination is SNATed on node output (for all nodes)
	CleanupIdleNATSessions      bool   // if enabled, the agent will periodically check for idle NAT sessions and delete inactive ones
	TCPNATSessionTimeout        uint32 // NAT session timeout (in minutes) for TCP connections, used in case that CleanupIdleNATSessions is turned on
	OtherNATSessionTimeout      uint32 // NAT session timeout (in minutes) for non-TCP connections, used in case that CleanupIdleNATSessions is turned on
	ScanIPNeighbors             bool   // if enabled, periodically scans and probes IP neighbors to maintain the ARP table
	IPNeighborScanInterval      uint8
	IPNeighborStaleThreshold    uint8
	MainVRFID                   uint32
	PodVRFID                    uint32
	ServiceLocalEndpointWeight  uint8
	DisableNATVirtualReassembly bool // if true, NAT plugin will drop fragmented packets
	EnablePacketTrace           bool
	RouteServiceCIDRToVPP       bool // if true, cluster IP CIDR will be routed towards VPP from Linux
	IPAMConfig                  ipam.Config
	NodeConfig                  []NodeConfig
}

Config represents configuration for the Contiv plugin. It can be injected or loaded from an external config file. Injection takes priority over the external config. To use an external config file, add the `-contiv-config="<path to config>"` argument when running the contiv-agent.

func (*Config) ApplyDefaults added in v1.4.0

func (cfg *Config) ApplyDefaults()

ApplyDefaults stores default values to undefined configuration fields.

func (*Config) ApplyIPAMConfig added in v1.4.0

func (cfg *Config) ApplyIPAMConfig() error

ApplyIPAMConfig populates the Config struct with the calculated subnets.

func (*Config) GetNodeConfig added in v1.4.0

func (cfg *Config) GetNodeConfig(nodeName string) *NodeConfig

GetNodeConfig returns configuration specific to a given node, or nil if none was found.
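The documented behavior (a per-node entry or nil) can be pictured as a lookup by node name over the NodeConfig slice. The sketch below uses stand-in types with only the NodeName field and is an assumption about how such a lookup might be written, not the package's implementation.

```go
package main

import "fmt"

// nodeConfig and config are stand-ins for NodeConfig and Config. The
// embedded CRD spec and all other Config fields are omitted.
type nodeConfig struct {
	NodeName string
}

type config struct {
	NodeConfig []nodeConfig
}

// getNodeConfig returns a pointer to the entry matching nodeName, or nil.
// Indexing the slice (instead of taking the address of the loop variable)
// keeps the pointer tied to the stored element.
func (cfg *config) getNodeConfig(nodeName string) *nodeConfig {
	for i := range cfg.NodeConfig {
		if cfg.NodeConfig[i].NodeName == nodeName {
			return &cfg.NodeConfig[i]
		}
	}
	return nil
}

func main() {
	cfg := &config{NodeConfig: []nodeConfig{{NodeName: "k8s-master"}, {NodeName: "k8s-worker1"}}}
	fmt.Println(cfg.getNodeConfig("k8s-worker1") != nil)
	fmt.Println(cfg.getNodeConfig("unknown") == nil)
}
```

Callers need to handle the nil result, for example by falling back to cluster-wide settings.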

type Deps

type Deps struct {
	infra.PluginDeps
	ServiceLabel servicelabel.ReaderAPI
	GRPC         grpc.Server
	Proxy        *kvdbproxy.Plugin
	VPP          *vpp.Plugin
	GoVPP        govppmux.API
	Resync       resync.Subscriber
	ETCD         *etcd.Plugin
	Bolt         keyval.KvProtoPlugin
	Watcher      datasync.KeyValProtoWatcher
	HTTPHandlers rest.HTTPHandlers
}

Deps groups the dependencies of the Plugin.

type KVBrokerFactory added in v1.4.0

type KVBrokerFactory interface {
	NewBroker(keyPrefix string) keyval.ProtoBroker
}

KVBrokerFactory is used to generalize different means of accessing KV-store for the purpose of reading CRD-defined node configuration.

type NodeConfig added in v1.4.0

type NodeConfig struct {
	NodeName string // name of the node, should match with the hostname
	nodeconfigcrd.NodeConfigSpec
}

NodeConfig represents configuration specific to a given node.

func LoadNodeConfigFromCRD added in v1.4.0

func LoadNodeConfigFromCRD(nodeName string, remoteDB, localDB KVBrokerFactory, log logging.Logger) *NodeConfig

LoadNodeConfigFromCRD loads node configuration defined via CRD, which was reflected into a remote kv-store by contiv-crd and mirrored into local kv-store by the agent.

type Option

type Option func(*Plugin)

Option is a function that acts on a Plugin to inject Dependencies or configuration

func UseDeps

func UseDeps(cb func(*Deps)) Option

UseDeps returns Option that can inject custom dependencies.
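Option and UseDeps follow the functional-options pattern. The sketch below reproduces the pattern with stand-in types to show how a callback passed to UseDeps can mutate the dependencies of a plugin under construction. The AgentLabel field is hypothetical, and the real NewPlugin may also fill in default dependencies, which is not modelled here.

```go
package main

import "fmt"

// Deps, Plugin and Option are simplified stand-ins for the documented types.
type Deps struct {
	AgentLabel string // hypothetical field, for illustration only
}

type Plugin struct {
	Deps
}

type Option func(*Plugin)

// UseDeps wraps a callback so that it runs against the plugin's Deps.
func UseDeps(cb func(*Deps)) Option {
	return func(p *Plugin) {
		cb(&p.Deps)
	}
}

// NewPlugin applies the options in the order they were given.
func NewPlugin(opts ...Option) *Plugin {
	p := &Plugin{}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	p := NewPlugin(UseDeps(func(d *Deps) {
		d.AgentLabel = "node-1"
	}))
	fmt.Println(p.AgentLabel)
}
```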

type Plugin

type Plugin struct {
	Deps

	Config *Config
	// contains filtered or unexported fields
}

Plugin represents the instance of the Contiv network plugin, that transforms CNI requests received over GRPC into configuration for the vswitch VPP in order to connect/disconnect a container into/from the network.

func NewPlugin

func NewPlugin(opts ...Option) *Plugin

NewPlugin creates a new Plugin with the provided Options.

func (*Plugin) AfterInit

func (plugin *Plugin) AfterInit() error

AfterInit is called by the plugin infra after Init of all plugins is finished. It registers to the ResyncOrchestrator. The registration is done in this phase in order to trigger the resync for this plugin once the resync of VPP plugins is finished.

func (*Plugin) CleanupIdleNATSessions

func (plugin *Plugin) CleanupIdleNATSessions() bool

CleanupIdleNATSessions returns true if cleanup of idle NAT sessions is enabled.

func (*Plugin) Close

func (plugin *Plugin) Close() error

Close is called by the plugin infra upon agent cleanup. It cleans up the resources allocated by the plugin.

func (*Plugin) GetContainerIndex

func (plugin *Plugin) GetContainerIndex() containeridx.Reader

GetContainerIndex returns the index of configured containers/pods

func (*Plugin) GetDefaultInterface

func (plugin *Plugin) GetDefaultInterface() (ifName string, ifAddress net.IP)

GetDefaultInterface returns the name and the IP address of the interface used by the default route to send packets out from VPP towards the default gateway. If the default GW is not configured, the function returns zero values.

func (*Plugin) GetHostIPs

func (plugin *Plugin) GetHostIPs() []net.IP

GetHostIPs returns all IP addresses of this node present in the host network namespace (Linux).

func (*Plugin) GetHostInterconnectIfName

func (plugin *Plugin) GetHostInterconnectIfName() string

GetHostInterconnectIfName returns the name of the TAP/AF_PACKET interface interconnecting VPP with the host stack.

func (*Plugin) GetIfName

func (plugin *Plugin) GetIfName(podNamespace string, podName string) (name string, exists bool)

GetIfName looks up logical interface name that corresponds to the interface associated with the given POD name.

func (*Plugin) GetMainPhysicalIfName

func (plugin *Plugin) GetMainPhysicalIfName() string

GetMainPhysicalIfName returns name of the "main" interface - i.e. physical interface connecting the node with the rest of the cluster.

func (*Plugin) GetMainVrfID

func (plugin *Plugin) GetMainVrfID() uint32

GetMainVrfID returns the ID of the main network connectivity VRF.

func (*Plugin) GetNatLoopbackIP

func (plugin *Plugin) GetNatLoopbackIP() net.IP

GetNatLoopbackIP returns the IP address of a virtual loopback, used to route traffic between clients and services via VPP even if the source and destination are the same IP addresses and would otherwise be routed locally.

func (*Plugin) GetNodeIP

func (plugin *Plugin) GetNodeIP() (ip net.IP, network *net.IPNet)

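GetNodeIP returns the IP address and network of this node. With DHCP the node IP may get assigned later or change at runtime, therefore it is preferred to watch for the node IP via WatchNodeIP().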
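The (net.IP, *net.IPNet) pair has the same shape as the result of the standard library's net.ParseCIDR, which may help when reasoning about the two return values. The sketch below only demonstrates that relationship; parseNodeIP and the sample address are assumptions and say nothing about how the plugin obtains its node IP.

```go
package main

import (
	"fmt"
	"net"
)

// parseNodeIP is a hypothetical helper: it splits a CIDR string into the
// host address and the network it belongs to.
func parseNodeIP(cidr string) (ip net.IP, network *net.IPNet, err error) {
	return net.ParseCIDR(cidr)
}

func main() {
	ip, network, err := parseNodeIP("192.168.16.1/24")
	if err != nil {
		fmt.Println("invalid address:", err)
		return
	}
	fmt.Println(ip)      // host address: 192.168.16.1
	fmt.Println(network) // enclosing network: 192.168.16.0/24
}
```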

func (*Plugin) GetNsIndex

func (plugin *Plugin) GetNsIndex(podNamespace string, podName string) (nsIndex uint32, exists bool)

GetNsIndex returns the index of the VPP session namespace associated with the given POD name.

func (*Plugin) GetOtherNATSessionTimeout

func (plugin *Plugin) GetOtherNATSessionTimeout() uint32

GetOtherNATSessionTimeout returns NAT session timeout (in minutes) for non-TCP connections, used in case that CleanupIdleNATSessions is turned on.

func (*Plugin) GetOtherPhysicalIfNames

func (plugin *Plugin) GetOtherPhysicalIfNames() []string

GetOtherPhysicalIfNames returns a slice of names of all physical interfaces configured additionally to the main interface.

func (*Plugin) GetPodByAppNsIndex

func (plugin *Plugin) GetPodByAppNsIndex(nsIndex uint32) (podNamespace string, podName string, exists bool)

GetPodByAppNsIndex looks up podName and podNamespace that is associated with the VPP application namespace.

func (*Plugin) GetPodByIf

func (plugin *Plugin) GetPodByIf(ifname string) (podNamespace string, podName string, exists bool)

GetPodByIf looks up podName and podNamespace that is associated with logical interface name.

func (*Plugin) GetPodNetwork

func (plugin *Plugin) GetPodNetwork() *net.IPNet

GetPodNetwork provides subnet used for allocating pod IP addresses on this node.

func (*Plugin) GetPodSubnet

func (plugin *Plugin) GetPodSubnet() *net.IPNet

GetPodSubnet provides subnet used for allocating pod IP addresses across all nodes.
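GetPodSubnet covers all nodes while GetPodNetwork covers a single node, so the node-level network would normally be expected to lie inside the cluster-wide subnet. The sketch below checks that relationship with the standard net package. The containsNet and mustCIDR helpers and the sample prefixes are hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// mustCIDR parses a CIDR string and panics on invalid input (demo only).
func mustCIDR(s string) *net.IPNet {
	_, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return n
}

// containsNet reports whether inner is fully contained in outer: same
// address family, outer prefix not longer than inner, and inner's base
// address inside outer.
func containsNet(outer, inner *net.IPNet) bool {
	outerOnes, outerBits := outer.Mask.Size()
	innerOnes, innerBits := inner.Mask.Size()
	return outerBits == innerBits && outerOnes <= innerOnes && outer.Contains(inner.IP)
}

func main() {
	podSubnet := mustCIDR("10.1.0.0/16")  // placeholder for GetPodSubnet()
	podNetwork := mustCIDR("10.1.3.0/24") // placeholder for GetPodNetwork()
	fmt.Println(containsNet(podSubnet, podNetwork))
}
```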

func (*Plugin) GetPodVrfID

func (plugin *Plugin) GetPodVrfID() uint32

GetPodVrfID returns the ID of the POD VRF.

func (*Plugin) GetServiceLocalEndpointWeight

func (plugin *Plugin) GetServiceLocalEndpointWeight() uint8

GetServiceLocalEndpointWeight returns the load-balancing weight assigned to locally deployed service endpoints.

func (*Plugin) GetTCPNATSessionTimeout

func (plugin *Plugin) GetTCPNATSessionTimeout() uint32

GetTCPNATSessionTimeout returns NAT session timeout (in minutes) for TCP connections, used in case that CleanupIdleNATSessions is turned on.
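Because the timeouts are plain uint32 values expressed in minutes, callers working with the time package need to convert them explicitly. A minimal sketch of that conversion follows; natTimeout is a hypothetical helper and the value used in main is a placeholder, not a documented default.

```go
package main

import (
	"fmt"
	"time"
)

// natTimeout converts a timeout given in minutes, as returned by
// GetTCPNATSessionTimeout or GetOtherNATSessionTimeout, to a time.Duration.
func natTimeout(minutes uint32) time.Duration {
	return time.Duration(minutes) * time.Minute
}

func main() {
	// 180 is a placeholder value chosen for the demo.
	fmt.Println(natTimeout(180))
}
```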

func (*Plugin) GetVxlanBVIIfName

func (plugin *Plugin) GetVxlanBVIIfName() string

GetVxlanBVIIfName returns the name of an BVI interface facing towards VXLAN tunnels to other hosts. Returns an empty string if VXLAN is not used (in L2 interconnect mode).

func (*Plugin) InSTNMode

func (plugin *Plugin) InSTNMode() bool

InSTNMode returns true if Contiv operates in the STN mode (single interface for each node).

func (*Plugin) Init

func (plugin *Plugin) Init() error

Init initializes the Contiv plugin. Called automatically by plugin infra upon contiv-agent startup.

func (*Plugin) IsTCPstackDisabled

func (plugin *Plugin) IsTCPstackDisabled() bool

IsTCPstackDisabled returns true if the VPP TCP stack is disabled and only VETHs/TAPs are configured.

func (*Plugin) NatExternalTraffic

func (plugin *Plugin) NatExternalTraffic() bool

NatExternalTraffic returns true if traffic with cluster-outside destination should be S-NATed with node IP before being sent out from the node.

func (*Plugin) RegisterPodPostAddHook

func (plugin *Plugin) RegisterPodPostAddHook(hook PodActionHook)

RegisterPodPostAddHook allows to register callback that will be run for each pod once it is added and before the CNI reply is sent.

func (*Plugin) RegisterPodPreRemovalHook

func (plugin *Plugin) RegisterPodPreRemovalHook(hook PodActionHook)

RegisterPodPreRemovalHook allows to register callback that will be run for each pod immediately before its removal.

func (*Plugin) WatchNodeIP

func (plugin *Plugin) WatchNodeIP(subscriber chan string)

WatchNodeIP adds given channel to the list of subscribers that are notified upon change of nodeIP address. If the channel is not ready to receive notification, the notification is dropped.
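The "dropped if not ready" behavior is what a non-blocking channel send does in Go. The sketch below illustrates the technique with a hypothetical nodeIPWatchers type and is not the plugin's code. It also shows why subscribers may prefer a buffered channel: an unbuffered channel with no receiver waiting misses the notification.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeIPWatchers is a hypothetical subscriber list for node IP changes.
type nodeIPWatchers struct {
	mu          sync.Mutex
	subscribers []chan string
}

// watch registers a subscriber channel.
func (w *nodeIPWatchers) watch(subscriber chan string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.subscribers = append(w.subscribers, subscriber)
}

// notify sends ip to every subscriber without blocking and returns how
// many subscribers actually received it.
func (w *nodeIPWatchers) notify(ip string) (delivered int) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for _, sub := range w.subscribers {
		select {
		case sub <- ip:
			delivered++
		default:
			// Subscriber not ready: the notification is dropped.
		}
	}
	return delivered
}

func main() {
	w := &nodeIPWatchers{}
	ready := make(chan string, 1) // buffered: can accept one notification
	busy := make(chan string)     // unbuffered, nobody is receiving
	w.watch(ready)
	w.watch(busy)
	fmt.Println(w.notify("192.168.16.1"), <-ready)
}
```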

type PodActionHook

type PodActionHook func(podNamespace string, podName string) error

PodActionHook defines parameters and the return value of a callback triggered during an event associated with a pod.
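A hook is simply a function with this signature. The sketch below shows one way hooks might be stored and invoked, using a hypothetical hookRegistry type. Stopping at the first error is an assumption made for the example; the documentation does not state how the plugin handles hook errors.

```go
package main

import (
	"errors"
	"fmt"
)

// PodActionHook is copied from the documentation above.
type PodActionHook func(podNamespace string, podName string) error

// errHookFailed is a sample error used by the demo and its checks.
var errHookFailed = errors.New("hook failed")

// hookRegistry is a hypothetical container for pre-removal hooks.
type hookRegistry struct {
	preRemoval []PodActionHook
}

func (r *hookRegistry) RegisterPodPreRemovalHook(hook PodActionHook) {
	r.preRemoval = append(r.preRemoval, hook)
}

// runPreRemoval calls the hooks in registration order and stops at the
// first error (an assumption, see the lead-in).
func (r *hookRegistry) runPreRemoval(ns, name string) error {
	for _, hook := range r.preRemoval {
		if err := hook(ns, name); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	r := &hookRegistry{}
	r.RegisterPodPreRemovalHook(func(ns, name string) error {
		fmt.Printf("about to remove %s/%s\n", ns, name)
		return nil
	})
	fmt.Println(r.runPreRemoval("default", "web"))
}
```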

type PodConfig

type PodConfig struct {
	// ID identifies the Pod
	ID string
	// PodName from the CNI request
	PodName string
	// PodNamespace from the CNI request
	PodNamespace string
	// Veth1 is one end of the veth pair that is in the given container namespace.
	// Nil if TAPs are used instead.
	Veth1 *linux_intf.LinuxInterfaces_Interface
	// Veth2 is the other end of veth pair in the default namespace
	// Nil if TAPs are used instead.
	Veth2 *linux_intf.LinuxInterfaces_Interface
	// VppIf is AF_PACKET/TAP interface connecting pod to VPP
	VppIf *vpp_intf.Interfaces_Interface
	// PodTap is the host end of the tap connecting pod to VPP
	// Nil if TAPs are not used
	PodTap *linux_intf.LinuxInterfaces_Interface
	// Loopback interface associated with the pod.
	// Nil if VPP TCP stack is disabled.
	Loopback *vpp_intf.Interfaces_Interface
	// StnRule is STN rule used to "punt" any traffic via VETHs/TAPs with no match in VPP TCP stack.
	// Nil if VPP TCP stack is disabled.
	StnRule *stn.STN_Rule
	// AppNamespace is the application namespace associated with the pod.
	// Nil if VPP TCP stack is disabled.
	AppNamespace *vpp_l4.AppNamespaces_AppNamespace
	// VppARPEntry is ARP entry configured in VPP to route traffic from VPP to pod.
	VppARPEntry *vpp_l3.ArpTable_ArpEntry
	// PodARPEntry is ARP entry configured in the pod to route traffic from pod to VPP.
	PodARPEntry *linux_l3.LinuxStaticArpEntries_ArpEntry
	// VppRoute is the route from VPP to the container
	VppRoute *vpp_l3.StaticRoutes_Route
	// PodLinkRoute is the route from pod to the default gateway.
	PodLinkRoute *linux_l3.LinuxStaticRoutes_Route
	// PodDefaultRoute is the default gateway for the pod.
	PodDefaultRoute *linux_l3.LinuxStaticRoutes_Route
}

PodConfig groups applied configuration for a container

Directories

Path Synopsis
Package containeridx implements a mapping structure that allows to store configured container networking.
Package ipam provides node-local IPAM calculations: POD IP addresses, VPP-host interconnect and node interconnect IP addresses.
model
Package model is a generated protocol buffer package.
model
cni
