Documentation ¶
Index ¶
- Constants
- func IsNamespaceSupported(ns NamespaceType) bool
- func KnownHookNames() []string
- func NsName(ns NamespaceType) string
- type Action
- type Arg
- type BlockIODevice
- type Capabilities
- type Cgroup
- type Command
- type CommandHook
- type Config
- type FreezerState
- type FuncHook
- type Hook
- type HookList
- type HookName
- type Hooks
- type HugepageLimit
- type IDMap
- type IfPrioMap
- type IntelRdt
- type LinuxRdma
- type Mount
- type Namespace
- type NamespaceType
- type Namespaces
- func (n *Namespaces) Add(t NamespaceType, path string)
- func (n *Namespaces) CloneFlags() uintptr
- func (n *Namespaces) Contains(t NamespaceType) bool
- func (n Namespaces) IsPrivate(t NamespaceType) bool
- func (n *Namespaces) PathOf(t NamespaceType) string
- func (n *Namespaces) Remove(t NamespaceType) bool
- type Network
- type Operator
- type Resources
- type Rlimit
- type Route
- type Seccomp
- type Syscall
- type ThrottleDevice
- type WeightDevice
Constants ¶
const (
	// EXT_COPYUP is a directive to copy up the contents of a directory when
	// a tmpfs is mounted over it.
	EXT_COPYUP = 1 << iota //nolint:golint // ignore "don't use ALL_CAPS" warning
)
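As a rough illustration of how a bit-flag directive such as EXT_COPYUP could be tested against a mount's Extensions field, here is a minimal sketch. The `mount` struct and the `wantsCopyUp` helper are hypothetical stand-ins written for this example, not part of the package.

```go
package main

import "fmt"

// extCopyUp mirrors the documented EXT_COPYUP constant (1 << 0).
const extCopyUp = 1 << iota

// mount is a hypothetical stand-in that carries only the fields
// needed here; the real Mount type has many more.
type mount struct {
	Destination string
	Extensions  int
}

// wantsCopyUp reports whether the copy-up bit is set in Extensions.
func wantsCopyUp(m mount) bool {
	return m.Extensions&extCopyUp != 0
}

func main() {
	m := mount{Destination: "/tmp", Extensions: extCopyUp}
	fmt.Println(wantsCopyUp(m))
	fmt.Println(wantsCopyUp(mount{Destination: "/run"}))
}
```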
Functions ¶
func IsNamespaceSupported ¶
func IsNamespaceSupported(ns NamespaceType) bool
IsNamespaceSupported returns whether a namespace is available or not
func KnownHookNames ¶
func KnownHookNames() []string
KnownHookNames returns the known hook names. Used by `runc features`.
func NsName ¶
func NsName(ns NamespaceType) string
NsName converts the namespace type to its filename
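The mapping from namespace type to file name can be sketched as below. The names follow the entries found under /proc/<pid>/ns on Linux; the `isSupported` helper assumes that support is detected by checking for the matching file under /proc/self/ns, which is an assumption of this sketch rather than something stated on this page.

```go
package main

import (
	"fmt"
	"os"
)

type NamespaceType string

const (
	NEWNET    NamespaceType = "NEWNET"
	NEWPID    NamespaceType = "NEWPID"
	NEWNS     NamespaceType = "NEWNS"
	NEWUTS    NamespaceType = "NEWUTS"
	NEWIPC    NamespaceType = "NEWIPC"
	NEWUSER   NamespaceType = "NEWUSER"
	NEWCGROUP NamespaceType = "NEWCGROUP"
)

// nsName maps a namespace type to the file name used under /proc/<pid>/ns.
// Unknown types yield an empty string.
func nsName(ns NamespaceType) string {
	switch ns {
	case NEWNET:
		return "net"
	case NEWPID:
		return "pid"
	case NEWNS:
		return "mnt"
	case NEWUTS:
		return "uts"
	case NEWIPC:
		return "ipc"
	case NEWUSER:
		return "user"
	case NEWCGROUP:
		return "cgroup"
	}
	return ""
}

// isSupported is a hypothetical check: it treats a namespace as available
// when the corresponding file exists under /proc/self/ns.
func isSupported(ns NamespaceType) bool {
	name := nsName(ns)
	if name == "" {
		return false
	}
	_, err := os.Stat("/proc/self/ns/" + name)
	return err == nil
}

func main() {
	for _, ns := range []NamespaceType{NEWNET, NEWNS, NEWUSER} {
		fmt.Printf("%s -> %q supported=%v\n", ns, nsName(ns), isSupported(ns))
	}
}
```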
Types ¶
type Arg ¶
type Arg struct {
	Index    uint     `json:"index"`
	Value    uint64   `json:"value"`
	ValueTwo uint64   `json:"value_two"`
	Op       Operator `json:"op"`
}
Arg is a rule to match a specific syscall argument in Seccomp
type BlockIODevice ¶
type BlockIODevice struct {
	// Major is the device's major number
	Major int64 `json:"major"`
	// Minor is the device's minor number
	Minor int64 `json:"minor"`
}
BlockIODevice holds major:minor format supported in blkio cgroup.
type Capabilities ¶
type Capabilities struct {
	// Bounding is the set of capabilities checked by the kernel.
	Bounding []string
	// Effective is the set of capabilities checked by the kernel.
	Effective []string
	// Inheritable is the capabilities preserved across execve.
	Inheritable []string
	// Permitted is the limiting superset for effective capabilities.
	Permitted []string
	// Ambient is the ambient set of capabilities that are kept.
	Ambient []string
}
type Cgroup ¶
type Cgroup struct {
	// Name specifies the name of the cgroup
	Name string `json:"name,omitempty"`

	// Parent specifies the name of parent of cgroup or slice
	Parent string `json:"parent,omitempty"`

	// Path specifies the path to cgroups that are created and/or joined by the container.
	// The path is assumed to be relative to the host system cgroup mountpoint.
	Path string `json:"path"`

	// ScopePrefix describes prefix for the scope name
	ScopePrefix string `json:"scope_prefix"`

	// Resources contains various cgroups settings to apply
	*Resources

	// Systemd tells if systemd should be used to manage cgroups.
	Systemd bool

	// SystemdProps are any additional properties for systemd,
	// derived from org.systemd.property.xxx annotations.
	// Ignored unless systemd is used for managing cgroups.
	SystemdProps []systemdDbus.Property `json:"-"`

	// Rootless tells if rootless cgroups should be used.
	Rootless bool

	// The host UID that should own the cgroup, or nil to accept
	// the default ownership. This should only be set when the
	// cgroupfs is to be mounted read/write.
	// Not all cgroup manager implementations support changing
	// the ownership.
	OwnerUID *int `json:"owner_uid,omitempty"`
}
Cgroup holds properties of a cgroup on Linux.
type Command ¶
type CommandHook ¶
type CommandHook struct {
Command
}
func NewCommandHook ¶
func NewCommandHook(cmd Command) CommandHook
NewCommandHook will execute the provided command when the hook is run.
type Config ¶
type Config struct {
	// NoPivotRoot will use MS_MOVE and a chroot to jail the process into the container's rootfs.
	// This is a common option when the container is running in ramdisk.
	NoPivotRoot bool `json:"no_pivot_root"`

	// ParentDeathSignal specifies the signal that is sent to the container's process in the case
	// that the parent process dies.
	ParentDeathSignal int `json:"parent_death_signal"`

	// Path to a directory containing the container's root filesystem.
	Rootfs string `json:"rootfs"`

	// Umask is the umask to use inside of the container.
	Umask *uint32 `json:"umask"`

	// Readonlyfs will remount the container's rootfs as readonly where only externally mounted
	// bind mounts are writable.
	Readonlyfs bool `json:"readonlyfs"`

	// Specifies the mount propagation flags to be applied to /.
	RootPropagation int `json:"rootPropagation"`

	// Mounts specify additional source and destination paths that will be mounted inside the container's
	// rootfs and mount namespace if specified.
	Mounts []*Mount `json:"mounts"`

	// The device nodes that should be automatically created within the container upon container start.
	// Note, make sure that the node is marked as allowed in the cgroup as well!
	Devices []*devices.Device `json:"devices"`

	MountLabel string `json:"mount_label"`

	// Hostname optionally sets the container's hostname if provided.
	Hostname string `json:"hostname"`

	// Domainname optionally sets the container's domainname if provided.
	Domainname string `json:"domainname"`

	// Namespaces specifies the container's namespaces that it should set up when cloning the init process.
	// If a namespace is not provided, that namespace is shared from the container's parent process.
	Namespaces Namespaces `json:"namespaces"`

	// Capabilities specify the capabilities to keep when executing the process inside the container.
	// All capabilities not specified will be dropped from the process's capability mask.
	Capabilities *Capabilities `json:"capabilities"`

	// Networks specifies the container's network setup to be created.
	Networks []*Network `json:"networks"`

	// Routes can be specified to create entries in the route table as the container is started.
	Routes []*Route `json:"routes"`

	// Cgroups specifies specific cgroup settings for the various subsystems that the container is
	// placed into to limit the resources the container has available.
	Cgroups *Cgroup `json:"cgroups"`

	// AppArmorProfile specifies the profile to apply to the process running in the container and is
	// changed at the time the process is execed.
	AppArmorProfile string `json:"apparmor_profile,omitempty"`

	// ProcessLabel specifies the label to apply to the process running in the container. It is
	// commonly used by selinux.
	ProcessLabel string `json:"process_label,omitempty"`

	// Rlimits specifies the resource limits, such as max open files, to set in the container.
	// If Rlimits are not set, the container will inherit rlimits from the parent process.
	Rlimits []Rlimit `json:"rlimits,omitempty"`

	// OomScoreAdj specifies the adjustment to be made by the kernel when calculating oom scores
	// for a process. Valid values are in the range [-1000, 1000], where processes with
	// higher scores are preferred for being killed. If it is unset then we don't touch the current
	// value.
	// More information about kernel oom score calculation here: https://lwn.net/Articles/317814/
	OomScoreAdj *int `json:"oom_score_adj,omitempty"`

	// UIDMappings is an array of User ID mappings for User Namespaces.
	UIDMappings []IDMap `json:"uid_mappings"`

	// GIDMappings is an array of Group ID mappings for User Namespaces.
	GIDMappings []IDMap `json:"gid_mappings"`

	// MaskPaths specifies paths within the container's rootfs to mask over with a bind
	// mount pointing to /dev/null so as to prevent reads of the file.
	MaskPaths []string `json:"mask_paths"`

	// ReadonlyPaths specifies paths within the container's rootfs to remount as read-only
	// so that any writes to these files are prevented.
	ReadonlyPaths []string `json:"readonly_paths"`

	// Sysctl is a map of properties and their values. It is the equivalent of using
	// sysctl -w my.property.name value in Linux.
	Sysctl map[string]string `json:"sysctl"`

	// Seccomp allows actions to be taken whenever a syscall is made within the container.
	// A number of rules are given, each having an action to be taken if a syscall matches it.
	// A default action to be taken if no rules match is also given.
	Seccomp *Seccomp `json:"seccomp"`

	// NoNewPrivileges controls whether processes in the container can gain additional privileges.
	NoNewPrivileges bool `json:"no_new_privileges,omitempty"`

	// Hooks are a collection of actions to perform at various container lifecycle events.
	// CommandHooks are serialized to JSON, but other hooks are not.
	Hooks Hooks

	// Version is the version of opencontainer specification that is supported.
	Version string `json:"version"`

	// Labels are user defined metadata that is stored in the config and populated on the state.
	Labels []string `json:"labels"`

	// NoNewKeyring will not allocate a new session keyring for the container. It will use the
	// caller's keyring in this case.
	NoNewKeyring bool `json:"no_new_keyring"`

	// IntelRdt specifies settings for Intel RDT group that the container is placed into
	// to limit the resources (e.g., L3 cache, memory bandwidth) the container has available.
	IntelRdt *IntelRdt `json:"intel_rdt,omitempty"`

	// RootlessEUID is set when runc was launched with non-zero EUID.
	// Note that RootlessEUID is set to false when launched with EUID=0 in userns.
	// When RootlessEUID is set, runc creates a new userns for the container.
	// (config.json needs to contain userns settings)
	RootlessEUID bool `json:"rootless_euid,omitempty"`

	// RootlessCgroups is set when unlikely to have the full access to cgroups.
	// When RootlessCgroups is set, cgroups errors are ignored.
	RootlessCgroups bool `json:"rootless_cgroups,omitempty"`

	// Do not try to remount a bind mount again after the first attempt failed on source
	// filesystems that have nodev, noexec, nosuid, noatime, relatime, strictatime, nodiratime set.
	NoMountFallback bool `json:"no_mount_fallback,omitempty"`
}
Config defines configuration options for executing a process inside a contained environment.
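To show how the struct tags above shape the serialized form, the following sketch marshals a heavily trimmed, hypothetical copy of Config (`trimmedConfig`) with a handful of its fields. Fields tagged `omitempty` disappear when unset, while pointer fields without it serialize as `null`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// trimmedConfig is a hypothetical subset of Config that reuses the
// documented field names and JSON tags.
type trimmedConfig struct {
	Rootfs          string  `json:"rootfs"`
	Umask           *uint32 `json:"umask"`
	Readonlyfs      bool    `json:"readonlyfs"`
	AppArmorProfile string  `json:"apparmor_profile,omitempty"`
	OomScoreAdj     *int    `json:"oom_score_adj,omitempty"`
}

// encode returns the JSON form of c, or an empty string on error.
func encode(c trimmedConfig) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	adj := -500 // must stay within [-1000, 1000]
	cfg := trimmedConfig{Rootfs: "/var/lib/demo/rootfs", OomScoreAdj: &adj}
	fmt.Println(encode(cfg))
}
```

Leaving OomScoreAdj as nil drops the key entirely, which matches the documented meaning of "unset": the current value is not touched.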
func (Config) HostGID ¶
HostGID gets the translated gid for the process on host which could be different when user namespaces are enabled.
func (Config) HostRootGID ¶
HostRootGID gets the root gid for the process on host which could be non-zero when user namespaces are enabled.
func (Config) HostRootUID ¶
HostRootUID gets the root uid for the process on host which could be non-zero when user namespaces are enabled.
type FreezerState ¶
type FreezerState string
const (
	Undefined FreezerState = ""
	Frozen    FreezerState = "FROZEN"
	Thawed    FreezerState = "THAWED"
)
type FuncHook ¶
type FuncHook struct {
// contains filtered or unexported fields
}
func NewFunctionHook ¶
NewFunctionHook will call the provided function when the hook is run.
type Hook ¶
type Hook interface {
	// Run executes the hook with the provided state.
	Run(*specs.State) error
}
type HookName ¶
type HookName string
const (
	// Prestart commands are executed after the container namespaces are created,
	// but before the user supplied command is executed from init.
	// Note: This hook is now deprecated
	// Prestart commands are called in the Runtime namespace.
	Prestart HookName = "prestart"

	// CreateRuntime commands MUST be called as part of the create operation after
	// the runtime environment has been created but before the pivot_root has been executed.
	// CreateRuntime is called immediately after the deprecated Prestart hook.
	// CreateRuntime commands are called in the Runtime Namespace.
	CreateRuntime HookName = "createRuntime"

	// CreateContainer commands MUST be called as part of the create operation after
	// the runtime environment has been created but before the pivot_root has been executed.
	// CreateContainer commands are called in the Container namespace.
	CreateContainer HookName = "createContainer"

	// StartContainer commands MUST be called as part of the start operation and before
	// the container process is started.
	// StartContainer commands are called in the Container namespace.
	StartContainer HookName = "startContainer"

	// Poststart commands are executed after the container init process starts.
	// Poststart commands are called in the Runtime Namespace.
	Poststart HookName = "poststart"

	// Poststop commands are executed after the container init process exits.
	// Poststop commands are called in the Runtime Namespace.
	Poststop HookName = "poststop"
)
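The relationship between hook names, the Hook interface, and function-backed hooks can be sketched roughly as follows. Everything here is a local stand-in: `state` replaces `specs.State`, `funcHook` imitates what NewFunctionHook is described as producing, and `runHooks` assumes hooks for one name run in order and stop at the first error. None of these names are the package's own.

```go
package main

import (
	"errors"
	"fmt"
)

// state is a hypothetical stand-in for specs.State.
type state struct{ ID string }

// hook mirrors the documented Hook interface with the stand-in state type.
type hook interface {
	Run(*state) error
}

type hookName string

const (
	createRuntime hookName = "createRuntime"
	poststop      hookName = "poststop"
)

// funcHook adapts a plain function to the hook interface.
type funcHook struct {
	run func(*state) error
}

func (f funcHook) Run(s *state) error { return f.run(s) }

// runHooks runs every hook registered under name, in order.
// Assumption of this sketch: the first failure aborts the rest.
func runHooks(hooks map[hookName][]hook, name hookName, s *state) error {
	for i, h := range hooks[name] {
		if err := h.Run(s); err != nil {
			return fmt.Errorf("%s hook #%d: %w", name, i, err)
		}
	}
	return nil
}

func main() {
	hooks := map[hookName][]hook{
		createRuntime: {funcHook{run: func(s *state) error {
			fmt.Println("createRuntime for", s.ID)
			return nil
		}}},
		poststop: {funcHook{run: func(*state) error {
			return errors.New("cleanup failed")
		}}},
	}
	s := &state{ID: "demo"}
	fmt.Println(runHooks(hooks, createRuntime, s))
	fmt.Println(runHooks(hooks, poststop, s))
}
```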
type HugepageLimit ¶
type IDMap ¶
type IDMap struct {
	ContainerID int `json:"container_id"`
	HostID      int `json:"host_id"`
	Size        int `json:"size"`
}
IDMap represents UID/GID Mappings for User Namespaces.
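Translating an ID inside the container to the host ID, as the HostUID/HostGID family of methods is described as doing, amounts to a range lookup over the mappings. The sketch below is one way to write that lookup; `hostID` is a hypothetical helper, and the mapping values are made up for illustration.

```go
package main

import "fmt"

type IDMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// hostID translates an ID as seen inside the container into the host ID.
// The boolean is false when no mapping covers the container ID.
func hostID(containerID int, maps []IDMap) (int, bool) {
	for _, m := range maps {
		if containerID >= m.ContainerID && containerID < m.ContainerID+m.Size {
			return m.HostID + (containerID - m.ContainerID), true
		}
	}
	return 0, false
}

func main() {
	// Container IDs 0..65535 map onto host IDs 100000..165535.
	maps := []IDMap{{ContainerID: 0, HostID: 100000, Size: 65536}}
	fmt.Println(hostID(0, maps))     // 100000 true
	fmt.Println(hostID(1000, maps))  // 101000 true
	fmt.Println(hostID(70000, maps)) // 0 false
}
```

With such a mapping in place, root inside the container (ID 0) corresponds to a non-zero host ID, which is why the host root UID and GID "could be non-zero when user namespaces are enabled".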
type IfPrioMap ¶
func (*IfPrioMap) CgroupString ¶
type IntelRdt ¶
type IntelRdt struct {
	// The identity for RDT Class of Service
	ClosID string `json:"closID,omitempty"`

	// The schema for L3 cache id and capacity bitmask (CBM)
	// Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..."
	L3CacheSchema string `json:"l3_cache_schema,omitempty"`

	// The schema of memory bandwidth per L3 cache id
	// Format: "MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;..."
	// The unit of memory bandwidth is specified in "percentages" by
	// default, and in "MBps" if MBA Software Controller is enabled.
	MemBwSchema string `json:"memBwSchema,omitempty"`
}
type LinuxRdma ¶
type LinuxRdma struct {
	// Maximum number of HCA handles that can be opened. Default is "no limit".
	HcaHandles *uint32 `json:"hca_handles,omitempty"`
	// Maximum number of HCA objects that can be created. Default is "no limit".
	HcaObjects *uint32 `json:"hca_objects,omitempty"`
}
LinuxRdma for Linux cgroup 'rdma' resource management (Linux 4.11)
type Mount ¶
type Mount struct {
	// Source path for the mount.
	Source string `json:"source"`

	// Destination path for the mount inside the container.
	Destination string `json:"destination"`

	// Device the mount is for.
	Device string `json:"device"`

	// Mount flags.
	Flags int `json:"flags"`

	// Propagation Flags
	PropagationFlags []int `json:"propagation_flags"`

	// Mount data applied to the mount.
	Data string `json:"data"`

	// Relabel source if set, "z" indicates shared, "Z" indicates unshared.
	Relabel string `json:"relabel"`

	// RecAttr represents mount properties to be applied recursively (AT_RECURSIVE), see mount_setattr(2).
	RecAttr *unix.MountAttr `json:"rec_attr"`

	// Extensions are additional flags that are specific to runc.
	Extensions int `json:"extensions"`

	// UIDMappings is used to change file user owners without calling chown.
	// Note that the underlying filesystem must support this feature for it
	// to be used.
	// Every mount point could have its own mapping.
	UIDMappings []IDMap `json:"uidMappings,omitempty"`

	// GIDMappings is used to change file group owners without calling chown.
	// Note that the underlying filesystem must support this feature for it
	// to be used.
	// Every mount point could have its own mapping.
	GIDMappings []IDMap `json:"gidMappings,omitempty"`
}
func (*Mount) IsIDMapped ¶
type Namespace ¶
type Namespace struct {
	Type NamespaceType `json:"type"`
	Path string        `json:"path"`
}
Namespace defines configuration for each namespace. It specifies an alternate path that is able to be joined via setns.
type NamespaceType ¶
type NamespaceType string
const (
	NEWNET    NamespaceType = "NEWNET"
	NEWPID    NamespaceType = "NEWPID"
	NEWNS     NamespaceType = "NEWNS"
	NEWUTS    NamespaceType = "NEWUTS"
	NEWIPC    NamespaceType = "NEWIPC"
	NEWUSER   NamespaceType = "NEWUSER"
	NEWCGROUP NamespaceType = "NEWCGROUP"
)
func NamespaceTypes ¶
func NamespaceTypes() []NamespaceType
type Namespaces ¶
type Namespaces []Namespace
func (*Namespaces) Add ¶
func (n *Namespaces) Add(t NamespaceType, path string)
func (*Namespaces) CloneFlags ¶
func (n *Namespaces) CloneFlags() uintptr
CloneFlags parses the container's Namespaces options to set the correct flags on clone, unshare. This function returns flags only for new namespaces.
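One plausible reading of "flags only for new namespaces" is that entries with a non-empty Path (which are joined via setns instead) contribute nothing. The sketch below follows that reading. The `cloneFlags` function and the `cloneFlag` table are hypothetical; the numeric values are the Linux CLONE_NEW* constants, written out locally so the example builds on any platform.

```go
package main

import "fmt"

type NamespaceType string

const (
	NEWNET    NamespaceType = "NEWNET"
	NEWPID    NamespaceType = "NEWPID"
	NEWNS     NamespaceType = "NEWNS"
	NEWUTS    NamespaceType = "NEWUTS"
	NEWIPC    NamespaceType = "NEWIPC"
	NEWUSER   NamespaceType = "NEWUSER"
	NEWCGROUP NamespaceType = "NEWCGROUP"
)

type Namespace struct {
	Type NamespaceType
	Path string
}

// cloneFlag holds the Linux CLONE_NEW* values for each namespace type.
var cloneFlag = map[NamespaceType]uintptr{
	NEWNS:     0x00020000, // CLONE_NEWNS
	NEWCGROUP: 0x02000000, // CLONE_NEWCGROUP
	NEWUTS:    0x04000000, // CLONE_NEWUTS
	NEWIPC:    0x08000000, // CLONE_NEWIPC
	NEWUSER:   0x10000000, // CLONE_NEWUSER
	NEWPID:    0x20000000, // CLONE_NEWPID
	NEWNET:    0x40000000, // CLONE_NEWNET
}

// cloneFlags ORs together the flags of all namespaces that must be created.
// Namespaces with a Path are skipped because they are joined, not created.
func cloneFlags(nss []Namespace) uintptr {
	var flags uintptr
	for _, ns := range nss {
		if ns.Path != "" {
			continue
		}
		flags |= cloneFlag[ns.Type]
	}
	return flags
}

func main() {
	nss := []Namespace{
		{Type: NEWPID},
		{Type: NEWNS},
		{Type: NEWNET, Path: "/proc/1234/ns/net"}, // joined, so no flag
	}
	fmt.Printf("%#x\n", cloneFlags(nss)) // 0x20020000
}
```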
func (*Namespaces) Contains ¶
func (n *Namespaces) Contains(t NamespaceType) bool
func (Namespaces) IsPrivate ¶
func (n Namespaces) IsPrivate(t NamespaceType) bool
IsPrivate tells whether the namespace of type t is configured as private (i.e. it exists and is not shared).
func (*Namespaces) PathOf ¶
func (n *Namespaces) PathOf(t NamespaceType) string
func (*Namespaces) Remove ¶
func (n *Namespaces) Remove(t NamespaceType) bool
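The methods on Namespaces have no descriptions here apart from CloneFlags and IsPrivate, so the following is only a sketch of behaviour consistent with their signatures. In particular, having Add update the path of an existing entry instead of appending a duplicate is an assumption of this example.

```go
package main

import "fmt"

type NamespaceType string

const (
	NEWNET NamespaceType = "NEWNET"
	NEWPID NamespaceType = "NEWPID"
)

type Namespace struct {
	Type NamespaceType
	Path string
}

type Namespaces []Namespace

// index returns the position of type t, or -1 when absent.
func (n Namespaces) index(t NamespaceType) int {
	for i, ns := range n {
		if ns.Type == t {
			return i
		}
	}
	return -1
}

// Add appends a namespace, or updates its path if t is already present
// (assumed behaviour).
func (n *Namespaces) Add(t NamespaceType, path string) {
	if i := n.index(t); i != -1 {
		(*n)[i].Path = path
		return
	}
	*n = append(*n, Namespace{Type: t, Path: path})
}

// Remove deletes the entry for t and reports whether one existed.
func (n *Namespaces) Remove(t NamespaceType) bool {
	i := n.index(t)
	if i == -1 {
		return false
	}
	*n = append((*n)[:i], (*n)[i+1:]...)
	return true
}

func (n Namespaces) Contains(t NamespaceType) bool { return n.index(t) != -1 }

// IsPrivate is true when t is configured and has no path to join.
func (n Namespaces) IsPrivate(t NamespaceType) bool {
	i := n.index(t)
	return i != -1 && n[i].Path == ""
}

// PathOf returns the configured path for t, or "" when absent.
func (n Namespaces) PathOf(t NamespaceType) string {
	if i := n.index(t); i != -1 {
		return n[i].Path
	}
	return ""
}

func main() {
	var nss Namespaces
	nss.Add(NEWPID, "")
	nss.Add(NEWNET, "/proc/1234/ns/net")
	fmt.Println(nss.IsPrivate(NEWPID), nss.IsPrivate(NEWNET), nss.PathOf(NEWNET))
	fmt.Println(nss.Remove(NEWNET), nss.Contains(NEWNET))
}
```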
type Network ¶
type Network struct {
	// Type sets the network's type, commonly veth and loopback
	Type string `json:"type"`

	// Name of the network interface
	Name string `json:"name"`

	// The bridge to use.
	Bridge string `json:"bridge"`

	// MacAddress contains the MAC address to set on the network interface
	MacAddress string `json:"mac_address"`

	// Address contains the IPv4 and mask to set on the network interface
	Address string `json:"address"`

	// Gateway sets the gateway address that is used as the default for the interface
	Gateway string `json:"gateway"`

	// IPv6Address contains the IPv6 and mask to set on the network interface
	IPv6Address string `json:"ipv6_address"`

	// IPv6Gateway sets the ipv6 gateway address that is used as the default for the interface
	IPv6Gateway string `json:"ipv6_gateway"`

	// Mtu sets the mtu value for the interface and will be mirrored on both the host and
	// container's interfaces if a pair is created, specifically in the case of type veth
	// Note: This does not apply to loopback interfaces.
	Mtu int `json:"mtu"`

	// TxQueueLen sets the tx_queuelen value for the interface and will be mirrored on both the host and
	// container's interfaces if a pair is created, specifically in the case of type veth
	// Note: This does not apply to loopback interfaces.
	TxQueueLen int `json:"txqueuelen"`

	// HostInterfaceName is the unique name of the veth pair's host-side interface.
	HostInterfaceName string `json:"host_interface_name"`

	// HairpinMode specifies if hairpin NAT should be enabled on the virtual interface
	// bridge port in the case of type veth
	// Note: This is unsupported on some systems.
	// Note: This does not apply to loopback interfaces.
	HairpinMode bool `json:"hairpin_mode"`
}
Network defines configuration for a container's networking stack.
The network configuration can be omitted from a container, causing the container to be set up with the host's networking stack.
type Operator ¶
type Operator int
Operator is a comparison operator to be used when matching syscall arguments in Seccomp
type Resources ¶
type Resources struct {
	// Devices is the set of access rules for devices in the container.
	Devices []*devices.Rule `json:"devices"`

	// Memory limit (in bytes)
	Memory int64 `json:"memory"`

	// Memory reservation or soft_limit (in bytes)
	MemoryReservation int64 `json:"memory_reservation"`

	// Total memory usage (memory + swap); set `-1` to enable unlimited swap
	MemorySwap int64 `json:"memory_swap"`

	CpuShares uint64 `json:"cpu_shares"`

	// CPU hardcap limit (in usecs). Allowed cpu time in a given period.
	CpuQuota int64 `json:"cpu_quota"`

	// CPU period to be used for hardcapping (in usecs). 0 to use system default.
	CpuPeriod uint64 `json:"cpu_period"`

	// How much time the CPU will use for realtime scheduling (in usecs).
	CpuRtRuntime int64 `json:"cpu_rt_quota"`

	// CPU period to be used for realtime scheduling (in usecs).
	CpuRtPeriod uint64 `json:"cpu_rt_period"`

	// CPU to use
	CpusetCpus string `json:"cpuset_cpus"`

	// MEM to use
	CpusetMems string `json:"cpuset_mems"`

	// cgroup SCHED_IDLE
	CPUIdle *int64 `json:"cpu_idle,omitempty"`

	// Process limit; set <= 0 to disable the limit.
	PidsLimit int64 `json:"pids_limit"`

	// Specifies per cgroup weight, range is from 10 to 1000.
	BlkioWeight uint16 `json:"blkio_weight"`

	// Specifies tasks' weight in the given cgroup while competing with the cgroup's child cgroups, range is from 10 to 1000, cfq scheduler only
	BlkioLeafWeight uint16 `json:"blkio_leaf_weight"`

	// Weight per cgroup per device, can override BlkioWeight.
	BlkioWeightDevice []*WeightDevice `json:"blkio_weight_device"`

	// IO read rate limit per cgroup per device, bytes per second.
	BlkioThrottleReadBpsDevice []*ThrottleDevice `json:"blkio_throttle_read_bps_device"`

	// IO write rate limit per cgroup per device, bytes per second.
	BlkioThrottleWriteBpsDevice []*ThrottleDevice `json:"blkio_throttle_write_bps_device"`

	// IO read rate limit per cgroup per device, IO per second.
	BlkioThrottleReadIOPSDevice []*ThrottleDevice `json:"blkio_throttle_read_iops_device"`

	// IO write rate limit per cgroup per device, IO per second.
	BlkioThrottleWriteIOPSDevice []*ThrottleDevice `json:"blkio_throttle_write_iops_device"`

	// Set the freeze value for the process
	Freezer FreezerState `json:"freezer"`

	// Hugetlb limit (in bytes)
	HugetlbLimit []*HugepageLimit `json:"hugetlb_limit"`

	// Whether to disable OOM Killer
	OomKillDisable bool `json:"oom_kill_disable"`

	// Tuning swappiness behaviour per cgroup
	MemorySwappiness *uint64 `json:"memory_swappiness"`

	// Set priority of network traffic for container
	NetPrioIfpriomap []*IfPrioMap `json:"net_prio_ifpriomap"`

	// Set class identifier for container's network packets
	NetClsClassid uint32 `json:"net_cls_classid_u"`

	// Rdma resource restriction configuration
	Rdma map[string]LinuxRdma `json:"rdma"`

	// CpuWeight sets a proportional bandwidth limit.
	CpuWeight uint64 `json:"cpu_weight"`

	// Unified is cgroupv2-only key-value map.
	Unified map[string]string `json:"unified"`

	// SkipDevices allows skipping the configuration of device permissions.
	// Used by e.g. kubelet while creating a parent cgroup (kubepods)
	// common for many containers, and by runc update.
	//
	// NOTE it is impossible to start a container which has this flag set.
	SkipDevices bool `json:"-"`

	// SkipFreezeOnSet is a flag for cgroup manager to skip the cgroup
	// freeze when setting resources. Only applicable to systemd legacy
	// (i.e. cgroup v1) manager (which uses freeze by default to avoid
	// spurious permission errors caused by systemd inability to update
	// device rules in a non-disruptive manner).
	//
	// If not set, a few methods (such as looking into cgroup's
	// devices.list and querying the systemd unit properties) are used
	// during Set() to figure out whether the freeze is required. Those
	// methods may be relatively slow, thus this flag.
	SkipFreezeOnSet bool `json:"-"`

	// MemoryCheckBeforeUpdate is a flag for cgroup v2 managers to check
	// if the new memory limits (Memory and MemorySwap) being set are lower
	// than the current memory usage, and reject if so.
	MemoryCheckBeforeUpdate bool `json:"memory_check_before_update"`
}
type Route ¶
type Route struct {
	// Destination specifies the destination IP address and mask in the CIDR form.
	Destination string `json:"destination"`

	// Source specifies the source IP address and mask in the CIDR form.
	Source string `json:"source"`

	// Gateway specifies the gateway IP address.
	Gateway string `json:"gateway"`

	// InterfaceName specifies the device to set this route up for, for example eth0.
	InterfaceName string `json:"interface_name"`
}
Route defines a routing table entry.
Routes can be specified to create entries in the routing table as the container is started.
Destination, source, and gateway must all be of the same IP family, either IPv4 or IPv6. At least one of the three must be present; omitted entries use their IP family's default for the route table. For IPv4, for example, setting the gateway to 1.2.3.4 and the interface to eth0 sets up a standard destination of 0.0.0.0 (or *) when viewed in the route table.
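The family and presence rules for a route can be checked with a few lines of standard-library code. The `checkFamily` helper below is a hypothetical validator written for this example; the package itself is not shown to perform such a check.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

type Route struct {
	Destination   string
	Source        string
	Gateway       string
	InterfaceName string
}

// checkFamily verifies that at least one of destination, source, and gateway
// is set and that all of the set ones belong to the same IP family.
func checkFamily(r Route) error {
	var isV4 []bool
	for _, cidr := range []string{r.Destination, r.Source} {
		if cidr == "" {
			continue
		}
		p, err := netip.ParsePrefix(cidr)
		if err != nil {
			return err
		}
		isV4 = append(isV4, p.Addr().Is4())
	}
	if r.Gateway != "" {
		a, err := netip.ParseAddr(r.Gateway)
		if err != nil {
			return err
		}
		isV4 = append(isV4, a.Is4())
	}
	if len(isV4) == 0 {
		return errors.New("route needs a destination, source, or gateway")
	}
	for _, v := range isV4[1:] {
		if v != isV4[0] {
			return errors.New("route mixes IPv4 and IPv6 addresses")
		}
	}
	return nil
}

func main() {
	// Gateway-only route: the destination falls back to the IPv4 default.
	fmt.Println(checkFamily(Route{Gateway: "1.2.3.4", InterfaceName: "eth0"}))
	fmt.Println(checkFamily(Route{Destination: "10.0.0.0/8", Gateway: "fe80::1"}))
}
```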
type Seccomp ¶
type Seccomp struct {
	DefaultAction    Action                   `json:"default_action"`
	Architectures    []string                 `json:"architectures"`
	Flags            []specs.LinuxSeccompFlag `json:"flags"`
	Syscalls         []*Syscall               `json:"syscalls"`
	DefaultErrnoRet  *uint                    `json:"default_errno_ret"`
	ListenerPath     string                   `json:"listener_path,omitempty"`
	ListenerMetadata string                   `json:"listener_metadata,omitempty"`
}
Seccomp represents syscall restrictions. By default, only the native architecture of the kernel is allowed to be used for syscalls. Additional architectures can be added by specifying them in Architectures.
type Syscall ¶
type Syscall struct {
	Name     string `json:"name"`
	Action   Action `json:"action"`
	ErrnoRet *uint  `json:"errnoRet"`
	Args     []*Arg `json:"args"`
}
Syscall is a rule to match a syscall in Seccomp
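To make the roles of Index, Value, ValueTwo, and Op concrete, the sketch below evaluates an argument rule against a set of syscall arguments in user space. This is purely illustrative: real filtering happens in the kernel, the operator constant names are not listed on this page and are assumed here, and the masked comparison follows the libseccomp convention of `(arg & Value) == ValueTwo`.

```go
package main

import "fmt"

type Operator int

// Operator names are assumptions for this sketch.
const (
	EqualTo Operator = iota + 1
	NotEqualTo
	GreaterThan
	GreaterThanOrEqualTo
	LessThan
	LessThanOrEqualTo
	MaskEqualTo
)

type Arg struct {
	Index    uint
	Value    uint64
	ValueTwo uint64
	Op       Operator
}

// matches reports whether the syscall argument selected by a.Index
// satisfies the comparison described by a.Op.
func matches(a Arg, args [6]uint64) bool {
	if a.Index >= uint(len(args)) {
		return false
	}
	v := args[a.Index]
	switch a.Op {
	case EqualTo:
		return v == a.Value
	case NotEqualTo:
		return v != a.Value
	case GreaterThan:
		return v > a.Value
	case GreaterThanOrEqualTo:
		return v >= a.Value
	case LessThan:
		return v < a.Value
	case LessThanOrEqualTo:
		return v <= a.Value
	case MaskEqualTo:
		// Value acts as the mask, ValueTwo as the expected result.
		return v&a.Value == a.ValueTwo
	}
	return false
}

func main() {
	// Hypothetical rule: argument 1 must have exactly bit 0x10 set within mask 0xF0.
	rule := Arg{Index: 1, Value: 0xF0, ValueTwo: 0x10, Op: MaskEqualTo}
	fmt.Println(matches(rule, [6]uint64{0, 0x1F}))
	fmt.Println(matches(rule, [6]uint64{0, 0x2F}))
}
```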
type ThrottleDevice ¶
type ThrottleDevice struct {
	BlockIODevice
	// Rate is the IO rate limit per cgroup per device
	Rate uint64 `json:"rate"`
}
ThrottleDevice struct holds a `major:minor rate_per_second` pair
func NewThrottleDevice ¶
func NewThrottleDevice(major, minor int64, rate uint64) *ThrottleDevice
NewThrottleDevice returns a configured ThrottleDevice pointer
func (*ThrottleDevice) String ¶
func (td *ThrottleDevice) String() string
String formats the struct to be writable to the cgroup specific file
func (*ThrottleDevice) StringName ¶
func (td *ThrottleDevice) StringName(name string) string
StringName formats the struct to be writable to the cgroup specific file
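A minimal sketch of the `major:minor rate_per_second` formatting, using local copies of the types. The `String` output follows the pair format stated above; the `name=value` layout used for `StringName` is an assumption modeled on cgroup v2 `io.max` entries and is not confirmed by this page.

```go
package main

import "fmt"

type BlockIODevice struct {
	Major int64
	Minor int64
}

type ThrottleDevice struct {
	BlockIODevice
	Rate uint64
}

// NewThrottleDevice returns a configured ThrottleDevice pointer.
func NewThrottleDevice(major, minor int64, rate uint64) *ThrottleDevice {
	td := &ThrottleDevice{}
	td.Major = major
	td.Minor = minor
	td.Rate = rate
	return td
}

// String renders "major:minor rate".
func (td *ThrottleDevice) String() string {
	return fmt.Sprintf("%d:%d %d", td.Major, td.Minor, td.Rate)
}

// StringName renders "major:minor name=rate" (assumed layout).
func (td *ThrottleDevice) StringName(name string) string {
	return fmt.Sprintf("%d:%d %s=%d", td.Major, td.Minor, name, td.Rate)
}

func main() {
	td := NewThrottleDevice(8, 0, 1048576) // limit device 8:0 to 1 MiB/s
	fmt.Println(td.String())
	fmt.Println(td.StringName("rbps"))
}
```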
type WeightDevice ¶
type WeightDevice struct {
	BlockIODevice
	// Weight is the bandwidth rate for the device, range is from 10 to 1000
	Weight uint16 `json:"weight"`
	// LeafWeight is the bandwidth rate for the device while competing with the cgroup's child cgroups, range is from 10 to 1000, cfq scheduler only
	LeafWeight uint16 `json:"leafWeight"`
}
WeightDevice struct holds a `major:minor weight`|`major:minor leaf_weight` pair
func NewWeightDevice ¶
func NewWeightDevice(major, minor int64, weight, leafWeight uint16) *WeightDevice
NewWeightDevice returns a configured WeightDevice pointer
func (*WeightDevice) LeafWeightString ¶
func (wd *WeightDevice) LeafWeightString() string
LeafWeightString formats the struct to be writable to the cgroup specific file
func (*WeightDevice) WeightString ¶
func (wd *WeightDevice) WeightString() string
WeightString formats the struct to be writable to the cgroup specific file
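The two string forms of a WeightDevice can be sketched in the same way, again with local copies of the types. The output follows the `major:minor weight` and `major:minor leaf_weight` pairs described above; the `inRange` helper is a hypothetical addition that checks the documented 10 to 1000 range.

```go
package main

import "fmt"

type BlockIODevice struct {
	Major int64
	Minor int64
}

type WeightDevice struct {
	BlockIODevice
	Weight     uint16
	LeafWeight uint16
}

// NewWeightDevice returns a configured WeightDevice pointer.
func NewWeightDevice(major, minor int64, weight, leafWeight uint16) *WeightDevice {
	wd := &WeightDevice{}
	wd.Major = major
	wd.Minor = minor
	wd.Weight = weight
	wd.LeafWeight = leafWeight
	return wd
}

// WeightString renders "major:minor weight".
func (wd *WeightDevice) WeightString() string {
	return fmt.Sprintf("%d:%d %d", wd.Major, wd.Minor, wd.Weight)
}

// LeafWeightString renders "major:minor leaf_weight".
func (wd *WeightDevice) LeafWeightString() string {
	return fmt.Sprintf("%d:%d %d", wd.Major, wd.Minor, wd.LeafWeight)
}

// inRange is a hypothetical check for the documented 10..1000 range.
func inRange(w uint16) bool { return w >= 10 && w <= 1000 }

func main() {
	wd := NewWeightDevice(8, 16, 500, 300)
	fmt.Println(wd.WeightString(), inRange(wd.Weight))
	fmt.Println(wd.LeafWeightString(), inRange(wd.LeafWeight))
}
```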