Documentation ¶
Overview ¶
Package fdbased provides the implementation of data-link layer endpoints backed by boundary-preserving file descriptors (e.g., TUN devices, seqpacket/datagram sockets).
FD based endpoints can be used in the networking stack by calling New() to create a new endpoint, and then passing it as an argument to Stack.CreateNIC().
FD based endpoints can use more than one file descriptor to read incoming packets. If more than one FD is specified and the underlying FDs are AF_PACKET sockets, the endpoint enables FANOUT mode on the sockets so that the host kernel consistently hashes packets to the sockets. This ensures that packets belonging to the same TCP stream are not reordered.
Similarly, if more than one FD is specified and the underlying FDs are not AF_PACKET sockets, it is the caller's responsibility to ensure that all inbound packets on the descriptors are consistently 5-tuple hashed to one of the descriptors to prevent TCP reordering.
Since netstack does not currently compute 5-tuple hashes for outgoing packets, we use only the first FD to write outbound packets. Once 5-tuple hashes are available for all outbound packets, we will use all underlying FDs to write outbound packets.
Index ¶
- Constants
- Variables
- func New(opts *Options) (stack.LinkEndpoint, error)
- type InjectableEndpoint
- func (e *InjectableEndpoint) ARPHardwareType() header.ARPHardwareType
- func (e *InjectableEndpoint) AddHeader(pkt stack.PacketBufferPtr)
- func (e *InjectableEndpoint) Attach(dispatcher stack.NetworkDispatcher)
- func (e *InjectableEndpoint) Capabilities() stack.LinkEndpointCapabilities
- func (e *InjectableEndpoint) GSOMaxSize() uint32
- func (e *InjectableEndpoint) InjectInbound(protocol tcpip.NetworkProtocolNumber, pkt stack.PacketBufferPtr)
- func (e *InjectableEndpoint) InjectOutbound(dest tcpip.Address, packet *buffer.View) tcpip.Error
- func (e *InjectableEndpoint) IsAttached() bool
- func (e *InjectableEndpoint) LinkAddress() tcpip.LinkAddress
- func (e *InjectableEndpoint) MTU() uint32
- func (e *InjectableEndpoint) MaxHeaderLength() uint16
- func (e *InjectableEndpoint) ParseHeader(pkt stack.PacketBufferPtr) bool
- func (e *InjectableEndpoint) SupportedGSO() stack.SupportedGSO
- func (e *InjectableEndpoint) Wait()
- func (e *InjectableEndpoint) WritePackets(pkts stack.PacketBufferList) (int, tcpip.Error)
- type Options
- type PacketDispatchMode
Constants ¶
const BatchSize = 47
BatchSize is the number of packets to write in each syscall. It is 47 because, when GvisorGSO is in use, a single 65,536-byte (64 KiB) TCP segment can be split into 46 segments of 1420 bytes and a single 216-byte segment.
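The arithmetic behind the value 47 can be checked directly. The sketch below assumes the large segment is 65,536 bytes, which is the size that makes the 46 × 1420 + 216 split add up; the split helper is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// split returns how many full segments of segSize bytes fit into total
// bytes, and the size of the trailing partial segment (0 if none).
func split(total, segSize int) (full, remainder int) {
	return total / segSize, total % segSize
}

func main() {
	full, rem := split(65536, 1420)
	batch := full
	if rem > 0 {
		batch++ // the partial segment occupies one more slot in the batch
	}
	fmt.Println(full, rem, batch) // prints "46 216 47"
}
```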
const (
	// MaxMsgsPerRecv is the maximum number of packets we want to retrieve
	// in a single RecvMMsg call.
	MaxMsgsPerRecv = 8
)
Variables ¶
var BufConfig = []int{128, 256, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768}
BufConfig defines the shape of the buffer used to read packets from the NIC.
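To make the buffer shape concrete, the sketch below sums the sizes and counts how many leading buffers a packet of a given length would span. It assumes each entry is the size of one buffer in a scatter read that is filled in order; the bufConfig copy and both helper functions are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// bufConfig is a local copy of the sizes listed for BufConfig.
var bufConfig = []int{128, 256, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768}

// capacity returns the total number of bytes the buffers can hold.
func capacity(sizes []int) int {
	total := 0
	for _, s := range sizes {
		total += s
	}
	return total
}

// buffersNeeded returns how many leading buffers a packet of n bytes
// spans when the buffers are filled in order, or -1 if it does not fit.
func buffersNeeded(sizes []int, n int) int {
	used := 0
	for i, s := range sizes {
		used += s
		if n <= used {
			return i + 1
		}
	}
	return -1
}

func main() {
	fmt.Println(capacity(bufConfig))            // prints 65664
	fmt.Println(buffersNeeded(bufConfig, 1500)) // prints 5
}
```

Under this assumption, small packets touch only the first few small buffers, while the total capacity of 65,664 bytes is still large enough for a 65,536-byte segment.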
Functions ¶
Types ¶
type InjectableEndpoint ¶
type InjectableEndpoint struct {
// contains filtered or unexported fields
}
InjectableEndpoint is an injectable fd-based endpoint. The endpoint writes to the FD, but does not read from it. All reads come from injected packets.
func NewInjectable ¶
func NewInjectable(fd int, mtu uint32, capabilities stack.LinkEndpointCapabilities) (*InjectableEndpoint, error)
NewInjectable creates a new fd-based InjectableEndpoint.
func (*InjectableEndpoint) ARPHardwareType ¶
func (e *InjectableEndpoint) ARPHardwareType() header.ARPHardwareType
ARPHardwareType implements stack.LinkEndpoint.ARPHardwareType.
func (*InjectableEndpoint) AddHeader ¶
func (e *InjectableEndpoint) AddHeader(pkt stack.PacketBufferPtr)
AddHeader implements stack.LinkEndpoint.AddHeader.
func (*InjectableEndpoint) Attach ¶
func (e *InjectableEndpoint) Attach(dispatcher stack.NetworkDispatcher)
Attach saves the stack network-layer dispatcher for use later when packets are injected.
func (*InjectableEndpoint) Capabilities ¶
func (e *InjectableEndpoint) Capabilities() stack.LinkEndpointCapabilities
Capabilities implements stack.LinkEndpoint.Capabilities.
func (*InjectableEndpoint) GSOMaxSize ¶
func (e *InjectableEndpoint) GSOMaxSize() uint32
GSOMaxSize implements stack.GSOEndpoint.
func (*InjectableEndpoint) InjectInbound ¶
func (e *InjectableEndpoint) InjectInbound(protocol tcpip.NetworkProtocolNumber, pkt stack.PacketBufferPtr)
InjectInbound injects an inbound packet. If the endpoint is not attached, the packet is not delivered.
func (*InjectableEndpoint) InjectOutbound ¶
func (e *InjectableEndpoint) InjectOutbound(dest tcpip.Address, packet *buffer.View) tcpip.Error
InjectOutbound implements stack.InjectableEndpoint.InjectOutbound.
func (*InjectableEndpoint) IsAttached ¶
func (e *InjectableEndpoint) IsAttached() bool
IsAttached implements stack.LinkEndpoint.IsAttached.
func (*InjectableEndpoint) LinkAddress ¶
func (e *InjectableEndpoint) LinkAddress() tcpip.LinkAddress
LinkAddress returns the link address of this endpoint.
func (*InjectableEndpoint) MTU ¶
func (e *InjectableEndpoint) MTU() uint32
MTU implements stack.LinkEndpoint.MTU. It returns the value initialized during construction.
func (*InjectableEndpoint) MaxHeaderLength ¶
func (e *InjectableEndpoint) MaxHeaderLength() uint16
MaxHeaderLength returns the maximum size of the link-layer header.
func (*InjectableEndpoint) ParseHeader ¶
func (e *InjectableEndpoint) ParseHeader(pkt stack.PacketBufferPtr) bool
ParseHeader implements stack.LinkEndpoint.ParseHeader.
func (*InjectableEndpoint) SupportedGSO ¶
func (e *InjectableEndpoint) SupportedGSO() stack.SupportedGSO
SupportedGSO implements stack.GSOEndpoint.
func (*InjectableEndpoint) Wait ¶
func (e *InjectableEndpoint) Wait()
Wait implements stack.LinkEndpoint.Wait. It waits for the endpoint to stop reading from its FD.
func (*InjectableEndpoint) WritePackets ¶
func (e *InjectableEndpoint) WritePackets(pkts stack.PacketBufferList) (int, tcpip.Error)
WritePackets writes outbound packets to the underlying file descriptors. If one is not currently writable, the packet is dropped.
Being a batch API, each packet in pkts should have the following fields populated:
- pkt.EgressRoute
- pkt.GSOOptions
- pkt.NetworkProtocolNumber
type Options ¶
type Options struct {
	// FDs is a set of FDs used to read/write packets.
	FDs []int

	// MTU is the mtu to use for this endpoint.
	MTU uint32

	// EthernetHeader if true, indicates that the endpoint should read/write
	// ethernet frames instead of IP packets.
	EthernetHeader bool

	// ClosedFunc is a function to be called when an endpoint's peer (if
	// any) closes its end of the communication pipe.
	ClosedFunc func(tcpip.Error)

	// Address is the link address for this endpoint. Only used if
	// EthernetHeader is true.
	Address tcpip.LinkAddress

	// SaveRestore if true, indicates that this NIC capability set should
	// include CapabilitySaveRestore
	SaveRestore bool

	// DisconnectOk if true, indicates that this NIC capability set should
	// include CapabilityDisconnectOk.
	DisconnectOk bool

	// GSOMaxSize is the maximum GSO packet size. It is zero if GSO is
	// disabled.
	GSOMaxSize uint32

	// GvisorGSOEnabled indicates whether Gvisor GSO is enabled or not.
	GvisorGSOEnabled bool

	// PacketDispatchMode specifies the type of inbound dispatcher to be
	// used for this endpoint.
	PacketDispatchMode PacketDispatchMode

	// TXChecksumOffload if true, indicates that this endpoints capability
	// set should include CapabilityTXChecksumOffload.
	TXChecksumOffload bool

	// RXChecksumOffload if true, indicates that this endpoints capability
	// set should include CapabilityRXChecksumOffload.
	RXChecksumOffload bool

	// If MaxSyscallHeaderBytes is non-zero, it is the maximum number of bytes
	// of struct iovec, msghdr, and mmsghdr that may be passed by each host
	// system call.
	MaxSyscallHeaderBytes int

	// AFXDPFD is used with the experimental AF_XDP mode.
	// TODO(b/240191988): Use multiple sockets.
	// TODO(b/240191988): How do we handle the MTU issue?
	AFXDPFD *int

	// InterfaceIndex is the interface index of the underlying device.
	InterfaceIndex int
}
Options specify the details about the fd-based endpoint to be created.
type PacketDispatchMode ¶
type PacketDispatchMode int
PacketDispatchMode enumerates the supported methods of receiving and dispatching packets from the underlying FD.
const (
	// Readv is the default dispatch mode and is the least performant of the
	// dispatch options but the one that is supported by all underlying FD
	// types.
	Readv PacketDispatchMode = iota

	// RecvMMsg enables use of recvmmsg() syscall instead of readv() to
	// read inbound packets. This reduces # of syscalls needed to process
	// packets.
	//
	// NOTE: recvmmsg() is only supported for sockets, so if the underlying
	// FD is not a socket then the code will still fall back to the readv()
	// path.
	RecvMMsg

	// PacketMMap enables use of PACKET_RX_RING to receive packets from the
	// NIC. PacketMMap requires that the underlying FD be an AF_PACKET. The
	// primary use-case for this is runsc which uses an AF_PACKET FD to
	// receive packets from the veth device.
	PacketMMap
)
func (PacketDispatchMode) String ¶
func (p PacketDispatchMode) String() string
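No description is given for String. As a rough sketch of the usual pattern for an iota-based enumeration with a String method, the example below uses a local stand-in type. The dispatchMode type, its constants, and the exact strings returned are assumptions made for illustration; the strings returned by the real PacketDispatchMode.String may differ.

```go
package main

import "fmt"

// dispatchMode is a hypothetical stand-in for PacketDispatchMode.
type dispatchMode int

const (
	readv dispatchMode = iota // default mode, value 0
	recvMMsg
	packetMMap
)

// String implements fmt.Stringer so that the mode prints by name.
// The names chosen here are assumptions, not the package's output.
func (m dispatchMode) String() string {
	switch m {
	case readv:
		return "Readv"
	case recvMMsg:
		return "RecvMMsg"
	case packetMMap:
		return "PacketMMap"
	default:
		return fmt.Sprintf("unknown dispatch mode %d", int(m))
	}
}

func main() {
	// fmt.Println calls String on each value because the type
	// implements fmt.Stringer.
	fmt.Println(readv, recvMMsg, packetMMap, dispatchMode(7))
}
```

Because the zero value of the stand-in type is readv, a zero-valued mode selects the first constant, which matches the description of Readv as the default dispatch mode.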