Documentation ¶
Overview ¶
Package kernel provides an emulation of the Linux kernel.
See README.md for a detailed overview.
Lock order (outermost locks must be taken first):
Kernel.extMu
ThreadGroup.timerMu
ktime.Timer.mu (for IntervalTimer) and Kernel.cpuClockMu
TaskSet.mu
SignalHandlers.mu
Task.mu
runningTasksMu
Locking SignalHandlers.mu in multiple SignalHandlers requires locking TaskSet.mu exclusively first. Locking Task.mu in multiple Tasks at the same time requires locking all of their signal mutexes first.
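The rule about holding an outer lock exclusively before taking several inner mutexes can be pictured with a small sketch. The `set` and `handlers` types below are hypothetical stand-ins that model only the locking discipline; they are not the package's TaskSet or SignalHandlers, and the real code may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// set and handlers are hypothetical stand-ins for TaskSet and SignalHandlers.
// They model only the locking discipline described above.
type set struct {
	mu sync.RWMutex
}

type handlers struct {
	mu      sync.Mutex
	actions int
}

// withAll locks every handlers.mu while holding set.mu exclusively. Because
// only one goroutine at a time can hold the outer lock, the order in which
// the inner mutexes are taken cannot lead to a deadlock.
func withAll(s *set, hs []*handlers, f func(*handlers)) {
	s.mu.Lock() // outer lock first, exclusively
	defer s.mu.Unlock()
	for _, h := range hs {
		h.mu.Lock()
	}
	for _, h := range hs {
		f(h)
	}
	for i := len(hs) - 1; i >= 0; i-- {
		hs[i].mu.Unlock()
	}
}

func main() {
	s := &set{}
	a, b := &handlers{}, &handlers{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Alternate the acquisition order of the inner mutexes.
			hs := []*handlers{a, b}
			if i%2 == 1 {
				hs = []*handlers{b, a}
			}
			withAll(s, hs, func(h *handlers) { h.actions++ })
		}(i)
	}
	wg.Wait()
	fmt.Println(a.actions, b.actions)
}
```

Without the exclusive outer lock, two goroutines locking `a` and `b` in opposite orders could each hold one mutex and wait forever for the other.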
Index ¶
- Constants
- Variables
- func ContextCanTrace(ctx context.Context, t *Task, attach bool) bool
- func ExtractErrno(err error, sysno int) int
- func IncrementUnimplementedSyscallCounter(sysno uintptr)
- func LoadSeccheckData(t *Task, mask seccheck.FieldMask, info *pb.ContextData)
- func LoadSeccheckDataLocked(t *Task, mask seccheck.FieldMask, info *pb.ContextData, cwd string)
- func RegisterSyscallTable(s *SyscallTable)
- func SignalInfoNoInfo(sig linux.Signal, sender, receiver *Task) *linux.SignalInfo
- func SignalInfoPriv(sig linux.Signal) *linux.SignalInfo
- type AIOCallback
- type Auxmap
- type Cgroup
- type CgroupController
- type CgroupControllerType
- type CgroupImpl
- type CgroupMigrationContext
- type CgroupMount
- type CgroupRegistry
- func (r *CgroupRegistry) AddCgroup(cg CgroupImpl)
- func (r *CgroupRegistry) FindCgroup(ctx context.Context, ctype CgroupControllerType, path string) (Cgroup, error)
- func (r *CgroupRegistry) FindHierarchy(name string, ctypes []CgroupControllerType) (*vfs.Filesystem, error)
- func (r *CgroupRegistry) GenerateProcCgroups(buf *bytes.Buffer)
- func (r *CgroupRegistry) GetCgroup(cid uint32) (CgroupImpl, error)
- func (r *CgroupRegistry) NextCgroupID() (uint32, error)
- func (r *CgroupRegistry) Register(name string, cs []CgroupController, fs cgroupFS) error
- func (r *CgroupRegistry) Unregister(hid uint32)
- type CgroupResourceType
- type CreateProcessArgs
- type FDFlags
- type FDTable
- func (f *FDTable) CurrentMaxFDs() int
- func (f *FDTable) DecRef(ctx context.Context)
- func (f *FDTable) Exists(fd int32) bool
- func (f *FDTable) Fork(ctx context.Context, maxFd int32) *FDTable
- func (f *FDTable) Get(fd int32) (*vfs.FileDescription, FDFlags)
- func (f *FDTable) GetFDs(ctx context.Context) []int32
- func (f *FDTable) GetLastFd() int32
- func (f *FDTable) NewFD(ctx context.Context, minFD int32, file *vfs.FileDescription, flags FDFlags) (int32, error)
- func (f *FDTable) NewFDAt(ctx context.Context, fd int32, file *vfs.FileDescription, flags FDFlags) (*vfs.FileDescription, error)
- func (f *FDTable) NewFDs(ctx context.Context, minFD int32, files []*vfs.FileDescription, flags FDFlags) (fds []int32, err error)
- func (f *FDTable) Remove(ctx context.Context, fd int32) *vfs.FileDescription
- func (f *FDTable) RemoveIf(ctx context.Context, cond func(*vfs.FileDescription, FDFlags) bool)
- func (f *FDTable) RemoveNextInRange(ctx context.Context, startFd int32, endFd int32) (int32, *vfs.FileDescription)
- func (f *FDTable) SetFlags(ctx context.Context, fd int32, flags FDFlags) error
- func (f *FDTable) SetFlagsForRange(ctx context.Context, startFd int32, endFd int32, flags FDFlags) error
- func (f *FDTable) String() string
- type FSContext
- func (f *FSContext) DecRef(ctx context.Context)
- func (f *FSContext) Fork() *FSContext
- func (f *FSContext) RootDirectory() vfs.VirtualDentry
- func (f *FSContext) SetRootDirectory(ctx context.Context, vd vfs.VirtualDentry)
- func (f *FSContext) SetWorkingDirectory(ctx context.Context, d vfs.VirtualDentry)
- func (f *FSContext) SwapUmask(mask uint) uint
- func (f *FSContext) Umask() uint
- func (f *FSContext) WorkingDirectory() vfs.VirtualDentry
- type IPCNamespace
- func (i *IPCNamespace) DecRef(ctx context.Context)
- func (i *IPCNamespace) Destroy(ctx context.Context)
- func (i *IPCNamespace) GetInode() *nsfs.Inode
- func (i *IPCNamespace) IncRef()
- func (i *IPCNamespace) InitPosixQueues(ctx context.Context, vfsObj *vfs.VirtualFilesystem, creds *auth.Credentials) error
- func (i *IPCNamespace) MsgqueueRegistry() *msgqueue.Registry
- func (i *IPCNamespace) PosixQueues() *mq.Registry
- func (i *IPCNamespace) SemaphoreRegistry() *semaphore.Registry
- func (i *IPCNamespace) SetInode(inode *nsfs.Inode)
- func (i *IPCNamespace) ShmRegistry() *shm.Registry
- func (i *IPCNamespace) Type() string
- func (i *IPCNamespace) UserNamespace() *auth.UserNamespace
- type InitKernelArgs
- type IntervalTimer
- type Kcov
- func (kcov *Kcov) Clear(ctx context.Context)
- func (kcov *Kcov) ConfigureMMap(ctx context.Context, opts *memmap.MMapOpts) error
- func (kcov *Kcov) DisableTrace(ctx context.Context) error
- func (kcov *Kcov) EnableTrace(ctx context.Context, traceKind uint8) error
- func (kcov *Kcov) InitTrace(size uint64) error
- func (kcov *Kcov) OnTaskExit()
- func (kcov *Kcov) TaskWork(t *Task)
- type Kernel
- func (k *Kernel) AddCgroupMount(ctl string, mnt *CgroupMount)
- func (k *Kernel) AddDevGofer(cid string, goferFD int) error
- func (k *Kernel) ApplicationCores() uint
- func (k *Kernel) CPUClockNow() uint64
- func (k *Kernel) CgroupRegistry() *CgroupRegistry
- func (k *Kernel) CreateProcess(args CreateProcessArgs) (*ThreadGroup, ThreadID, error)
- func (*Kernel) Deadline() (time.Time, bool)
- func (k *Kernel) DeleteSocket(sock *vfs.FileDescription)
- func (*Kernel) Done() <-chan struct{}
- func (k *Kernel) EmitUnimplementedEvent(ctx context.Context, sysno uintptr)
- func (*Kernel) Err() error
- func (k *Kernel) FeatureSet() cpuid.FeatureSet
- func (k *Kernel) GenerateInotifyCookie() uint32
- func (k *Kernel) GetCgroupMount(ctl string) *CgroupMount
- func (k *Kernel) GetNamespaceInode(ctx context.Context, ns vfs.Namespace) refs.TryRefCounter
- func (k *Kernel) GetUserCounters(uid auth.KUID) *UserCounters
- func (k *Kernel) GlobalInit() *ThreadGroup
- func (k *Kernel) HostMount() *vfs.Mount
- func (k *Kernel) Init(args InitKernelArgs) error
- func (k *Kernel) Kill(ws linux.WaitStatus)
- func (k *Kernel) ListSockets() []*SocketRecord
- func (k *Kernel) LoadFrom(ctx context.Context, r io.Reader, pagesFile *fd.FD, timeReady chan struct{}, ...) error
- func (k *Kernel) LoadTaskImage(ctx context.Context, args loader.LoadArgs) (*TaskImage, *syserr.Error)
- func (k *Kernel) MemoryFile() *pgalloc.MemoryFile
- func (k *Kernel) MonotonicClock() ktime.Clock
- func (k *Kernel) NetlinkPorts() *port.Manager
- func (k *Kernel) NewFDTable() *FDTable
- func (k *Kernel) NewKcov() *Kcov
- func (k *Kernel) NewThreadGroup(pidns *PIDNamespace, sh *SignalHandlers, terminationSignal linux.Signal, ...) *ThreadGroup
- func (k *Kernel) Pause()
- func (k *Kernel) PipeMount() *vfs.Mount
- func (k *Kernel) PopulateNewCgroupHierarchy(root Cgroup)
- func (k *Kernel) RealtimeClock() ktime.Clock
- func (k *Kernel) RebuildTraceContexts()
- func (k *Kernel) ReceiveTaskStates()
- func (k *Kernel) RecordSocket(sock *vfs.FileDescription)
- func (k *Kernel) RegisterContainerName(cid, containerName string)
- func (k *Kernel) Release()
- func (k *Kernel) ReleaseCgroupHierarchy(hid uint32)
- func (k *Kernel) RemoveDevGofer(cid string)
- func (k *Kernel) ReplaceFSContextRoots(ctx context.Context, oldRoot vfs.VirtualDentry, newRoot vfs.VirtualDentry)
- func (k *Kernel) RestoreContainerMapping(containerIDs map[string]string)
- func (k *Kernel) RootIPCNamespace() *IPCNamespace
- func (k *Kernel) RootNetworkNamespace() *inet.Namespace
- func (k *Kernel) RootPIDNamespace() *PIDNamespace
- func (k *Kernel) RootUTSNamespace() *UTSNamespace
- func (k *Kernel) RootUserNamespace() *auth.UserNamespace
- func (k *Kernel) SaveStatus() (saved, autosaved bool, err error)
- func (k *Kernel) SaveTo(ctx context.Context, w io.Writer, pagesFile *os.File) error
- func (k *Kernel) SendContainerSignal(cid string, info *linux.SignalInfo) error
- func (k *Kernel) SendExternalSignal(info *linux.SignalInfo, context string)
- func (k *Kernel) SendExternalSignalProcessGroup(pg *ProcessGroup, info *linux.SignalInfo) error
- func (k *Kernel) SendExternalSignalThreadGroup(tg *ThreadGroup, info *linux.SignalInfo) error
- func (k *Kernel) SetHostMount(mnt *vfs.Mount)
- func (k *Kernel) SetMemoryFile(mf *pgalloc.MemoryFile)
- func (k *Kernel) SetSaveError(err error)
- func (k *Kernel) SetSaveSuccess(autosave bool)
- func (k *Kernel) ShmMount() *vfs.Mount
- func (k *Kernel) SocketMount() *vfs.Mount
- func (k *Kernel) Start() error
- func (k *Kernel) StartProcess(tg *ThreadGroup)
- func (k *Kernel) SupervisorContext() context.Context
- func (k *Kernel) Syslog() *syslog
- func (k *Kernel) TaskContainerName(task *Task) string
- func (k *Kernel) TaskSet() *TaskSet
- func (k *Kernel) TestOnlySetGlobalInit(tg *ThreadGroup)
- func (k *Kernel) Timekeeper() *Timekeeper
- func (k *Kernel) UniqueID() uint64
- func (k *Kernel) Unpause()
- func (k *Kernel) VFS() *vfs.VirtualFilesystem
- func (k *Kernel) WaitExited()
- type MissingFn
- type OldRSeqCriticalRegion
- type PIDNamespace
- func (ns *PIDNamespace) ID() uint64
- func (ns *PIDNamespace) IDOfProcessGroup(pg *ProcessGroup) ProcessGroupID
- func (ns *PIDNamespace) IDOfSession(s *Session) SessionID
- func (ns *PIDNamespace) IDOfTask(t *Task) ThreadID
- func (ns *PIDNamespace) IDOfThreadGroup(tg *ThreadGroup) ThreadID
- func (ns *PIDNamespace) NewChild(userns *auth.UserNamespace) *PIDNamespace
- func (ns *PIDNamespace) NumTasks() int
- func (ns *PIDNamespace) NumTasksPerContainer(cid string) int
- func (ns *PIDNamespace) ProcessGroupWithID(id ProcessGroupID) *ProcessGroup
- func (ns *PIDNamespace) Root() *PIDNamespace
- func (ns *PIDNamespace) SessionWithID(id SessionID) *Session
- func (ns *PIDNamespace) TaskWithID(tid ThreadID) *Task
- func (ns *PIDNamespace) Tasks() []*Task
- func (ns *PIDNamespace) ThreadGroupWithID(tid ThreadID) *ThreadGroup
- func (ns *PIDNamespace) ThreadGroups() []*ThreadGroup
- func (ns *PIDNamespace) ThreadGroupsAppend(tgs []*ThreadGroup) []*ThreadGroup
- func (ns *PIDNamespace) UserNamespace() *auth.UserNamespace
- type ProcessGroup
- type ProcessGroupID
- type Session
- type SessionID
- type SignalAction
- type SignalHandlers
- type SocketRecord
- type SpecialOpts
- type Stracer
- type Syscall
- type SyscallControl
- type SyscallFlagsTable
- type SyscallFn
- type SyscallInfo
- type SyscallRestartBlock
- type SyscallSupportLevel
- type SyscallTable
- func (s *SyscallTable) Init()
- func (s *SyscallTable) Lookup(sysno uintptr) SyscallFn
- func (s *SyscallTable) LookupEmulate(addr hostarch.Addr) (uintptr, bool)
- func (s *SyscallTable) LookupName(sysno uintptr) string
- func (s *SyscallTable) LookupNo(name string) (uintptr, error)
- func (s *SyscallTable) LookupSyscallToProto(sysno uintptr) SyscallToProto
- func (s *SyscallTable) MaxSysno() (max uintptr)
- type SyscallToProto
- type TTY
- type Task
- func (t *Task) Activate()
- func (t *Task) AppendSyscallFilter(p bpf.Program, syncAll bool) error
- func (t *Task) Arch() *arch.Context64
- func (t *Task) AsyncContext() context.Context
- func (t *Task) BeginExternalStop()
- func (t *Task) Block(C <-chan struct{}) error
- func (t *Task) BlockOn(w waiter.Waitable, mask waiter.EventMask) bool
- func (t *Task) BlockWithDeadline(C <-chan struct{}, haveDeadline bool, deadline ktime.Time) error
- func (t *Task) BlockWithDeadlineFrom(C <-chan struct{}, clock ktime.Clock, haveDeadline bool, deadline ktime.Time) error
- func (t *Task) BlockWithTimeout(C chan struct{}, haveTimeout bool, timeout time.Duration) (time.Duration, error)
- func (t *Task) BlockWithTimeoutOn(w waiter.Waitable, mask waiter.EventMask, timeout time.Duration) (time.Duration, bool)
- func (t *Task) CPU() int32
- func (t *Task) CPUClock() ktime.Clock
- func (t *Task) CPUMask() sched.CPUSet
- func (t *Task) CPUStats() usage.CPUStats
- func (t *Task) CanTrace(target *Task, attach bool) bool
- func (t *Task) CgroupPrepareMigrate(dst Cgroup) (*CgroupMigrationContext, error)
- func (t *Task) ChargeFor(other *Task, ctl CgroupControllerType, res CgroupResourceType, value int64) (bool, Cgroup, error)
- func (t *Task) Children() map[*Task]struct{}
- func (t *Task) ClearRSeq(addr hostarch.Addr, length, signature uint32) error
- func (t *Task) ClearYAMAException()
- func (t *Task) Clone(args *linux.CloneArgs) (ThreadID, *SyscallControl, error)
- func (t *Task) CompareAndSwapUint32(addr hostarch.Addr, old, new uint32) (uint32, error)
- func (t *Task) ContainerID() string
- func (t *Task) CopyContext(ctx context.Context, opts usermem.IOOpts) *taskCopyContext
- func (t *Task) CopyInBytes(addr hostarch.Addr, dst []byte) (int, error)
- func (t *Task) CopyInIovecs(addr hostarch.Addr, numIovecs int) (hostarch.AddrRangeSeq, error)
- func (t *Task) CopyInIovecsAsSlice(addr hostarch.Addr, numIovecs int) ([]hostarch.AddrRange, error)
- func (t *Task) CopyInString(addr hostarch.Addr, maxlen int) (string, error)
- func (t *Task) CopyInVector(addr hostarch.Addr, maxElemSize, maxTotalSize int) ([]string, error)
- func (t *Task) CopyOutBytes(addr hostarch.Addr, src []byte) (int, error)
- func (t *Task) CopyOutIovecs(addr hostarch.Addr, src hostarch.AddrRangeSeq) error
- func (t *Task) CopyScratchBuffer(size int) []byte
- func (t *Task) Credentials() *auth.Credentials
- func (t *Task) Deactivate()
- func (*Task) Deadline() (time.Time, bool)
- func (t *Task) DebugDumpState()
- func (t *Task) Debugf(fmt string, v ...any)
- func (*Task) Done() <-chan struct{}
- func (t *Task) DropBoundingCapability(cp linux.Capability) error
- func (t *Task) EndExternalStop()
- func (t *Task) EnterCgroup(c Cgroup) error
- func (t *Task) EnterInitialCgroups(parent *Task, initCgroups map[Cgroup]struct{})
- func (*Task) Err() error
- func (t *Task) Execve(newImage *TaskImage, argv, env []string, executable *vfs.FileDescription, ...) (*SyscallControl, error)
- func (t *Task) ExitState() TaskExitState
- func (t *Task) ExitStatus() linux.WaitStatus
- func (t *Task) FDTable() *FDTable
- func (t *Task) FSContext() *FSContext
- func (t *Task) Futex() *futex.Manager
- func (t *Task) FutexWaiter() *futex.Waiter
- func (t *Task) GenerateProcTaskCgroup(buf *bytes.Buffer)
- func (t *Task) GetCgroupEntries() []TaskCgroupEntry
- func (t *Task) GetFile(fd int32) *vfs.FileDescription
- func (t *Task) GetIPCNamespace() *IPCNamespace
- func (t *Task) GetMountNamespace() *vfs.MountNamespace
- func (t *Task) GetNetworkNamespace() *inet.Namespace
- func (t *Task) GetRobustList() hostarch.Addr
- func (t *Task) GetSharedKey(addr hostarch.Addr) (futex.Key, error)
- func (t *Task) GetUTSNamespace() *UTSNamespace
- func (t *Task) Getitimer(id int32) (linux.ItimerVal, error)
- func (t *Task) GoroutineID() int64
- func (t *Task) HasCapability(cp linux.Capability) bool
- func (t *Task) HasCapabilityIn(cp linux.Capability, ns *auth.UserNamespace) bool
- func (t *Task) IOUsage() *usage.IO
- func (t *Task) IPCNamespace() *IPCNamespace
- func (t *Task) Infof(fmt string, v ...any)
- func (t *Task) Interrupt()
- func (t *Task) Interrupted() bool
- func (t *Task) IntervalTimerCreate(c ktime.Clock, sigev *linux.Sigevent) (linux.TimerID, error)
- func (t *Task) IntervalTimerDelete(id linux.TimerID) error
- func (t *Task) IntervalTimerGetoverrun(id linux.TimerID) (int32, error)
- func (t *Task) IntervalTimerGettime(id linux.TimerID) (linux.Itimerspec, error)
- func (t *Task) IntervalTimerSettime(id linux.TimerID, its linux.Itimerspec, abs bool) (linux.Itimerspec, error)
- func (t *Task) IovecsIOSequence(addr hostarch.Addr, iovcnt int, opts usermem.IOOpts) (usermem.IOSequence, error)
- func (t *Task) IsChrooted() bool
- func (t *Task) IsLogging(level log.Level) bool
- func (t *Task) IsNetworkNamespaced() bool
- func (t *Task) JoinSessionKeyring(keyDesc *string) (*auth.Key, error)
- func (t *Task) KGID() uint32
- func (t *Task) KUID() uint32
- func (t *Task) Kernel() *Kernel
- func (t *Task) LeaveCgroups()
- func (t *Task) Limits() *limits.LimitSet
- func (t *Task) LoadUint32(addr hostarch.Addr) (uint32, error)
- func (t *Task) LookupKey(keyID auth.KeySerial) (*auth.Key, error)
- func (t *Task) MaxRSS(which int32) uint64
- func (t *Task) MemoryManager() *mm.MemoryManager
- func (t *Task) MigrateCgroup(dst Cgroup) error
- func (t *Task) MountNamespace() *vfs.MountNamespace
- func (t *Task) Name() string
- func (t *Task) NetworkContext() inet.Stack
- func (t *Task) NetworkNamespace() *inet.Namespace
- func (t *Task) NewFDAt(fd int32, file *vfs.FileDescription, flags FDFlags) (*vfs.FileDescription, error)
- func (t *Task) NewFDFrom(minFD int32, file *vfs.FileDescription, flags FDFlags) (int32, error)
- func (t *Task) NewFDs(fd int32, files []*vfs.FileDescription, flags FDFlags) ([]int32, error)
- func (t *Task) Niceness() int
- func (t *Task) NotifyRlimitCPUUpdated()
- func (t *Task) NumaPolicy() (policy linux.NumaPolicy, nodeMask uint64)
- func (t *Task) OOMScoreAdj() int32
- func (t *Task) OldRSeqCPUAddr() hostarch.Addr
- func (t *Task) OldRSeqCriticalRegion() OldRSeqCriticalRegion
- func (t *Task) OwnCopyContext(opts usermem.IOOpts) *ownTaskCopyContext
- func (t *Task) PIDNamespace() *PIDNamespace
- func (t *Task) Parent() *Task
- func (t *Task) ParentDeathSignal() linux.Signal
- func (t *Task) ParentLocked() *Task
- func (t *Task) PendingSignals() linux.SignalSet
- func (t *Task) PrepareExit(ws linux.WaitStatus)
- func (t *Task) PrepareGroupExit(ws linux.WaitStatus)
- func (t *Task) Priority() int
- func (t *Task) Ptrace(req int64, pid ThreadID, addr, data hostarch.Addr) error
- func (t *Task) QueueAIO(cb AIOCallback)
- func (t *Task) RSeqAvailable() bool
- func (t *Task) RegisterWork(work TaskWorker)
- func (t *Task) ResetKcov()
- func (t *Task) ResetMemCgIDFromCgroup(cg Cgroup)
- func (t *Task) RestoreContainerID(cid string)
- func (t *Task) SeccompMode() int
- func (t *Task) SendGroupSignal(info *linux.SignalInfo) error
- func (t *Task) SendSignal(info *linux.SignalInfo) error
- func (t *Task) SessionKeyring() (*auth.Key, error)
- func (t *Task) SetCPUMask(mask sched.CPUSet) error
- func (t *Task) SetCapabilitySets(permitted, inheritable, effective auth.CapabilitySet) error
- func (t *Task) SetClearTID(addr hostarch.Addr)
- func (t *Task) SetExtraGIDs(gids []auth.GID) error
- func (t *Task) SetGID(gid auth.GID) error
- func (t *Task) SetKcov(k *Kcov)
- func (t *Task) SetKeepCaps(k bool)
- func (t *Task) SetMemCgID(memCgID uint32)
- func (t *Task) SetMemCgIDFromCgroup(cg Cgroup)
- func (t *Task) SetName(name string)
- func (t *Task) SetNiceness(n int)
- func (t *Task) SetNumaPolicy(policy linux.NumaPolicy, nodeMask uint64)
- func (t *Task) SetOOMScoreAdj(adj int32) error
- func (t *Task) SetOldRSeqCPUAddr(addr hostarch.Addr) error
- func (t *Task) SetOldRSeqCriticalRegion(r OldRSeqCriticalRegion) error
- func (t *Task) SetParentDeathSignal(sig linux.Signal)
- func (t *Task) SetPermsOnKey(key *auth.Key, perms auth.KeyPermissions) error
- func (t *Task) SetREGID(r, e auth.GID) error
- func (t *Task) SetRESGID(r, e, s auth.GID) error
- func (t *Task) SetRESUID(r, e, s auth.UID) error
- func (t *Task) SetREUID(r, e auth.UID) error
- func (t *Task) SetRSeq(addr hostarch.Addr, length, signature uint32) error
- func (t *Task) SetRobustList(addr hostarch.Addr)
- func (t *Task) SetSavedSignalMask(mask linux.SignalSet)
- func (t *Task) SetSignalMask(mask linux.SignalSet)
- func (t *Task) SetSignalStack(alt linux.SignalStack) bool
- func (t *Task) SetSyscallRestartBlock(r SyscallRestartBlock)
- func (t *Task) SetUID(uid auth.UID) error
- func (t *Task) SetUserNamespace(ns *auth.UserNamespace) error
- func (t *Task) SetYAMAException(tracer *Task)
- func (t *Task) Setitimer(id int32, newitv linux.ItimerVal) (linux.ItimerVal, error)
- func (t *Task) Setns(fd *vfs.FileDescription, flags int32) error
- func (t *Task) SigaltStack(setaddr hostarch.Addr, oldaddr hostarch.Addr) (*SyscallControl, error)
- func (t *Task) SignalMask() linux.SignalSet
- func (t *Task) SignalRegister(e *waiter.Entry)
- func (t *Task) SignalReturn(rt bool) (*SyscallControl, error)
- func (t *Task) SignalStack() linux.SignalStack
- func (t *Task) SignalUnregister(e *waiter.Entry)
- func (t *Task) Sigtimedwait(set linux.SignalSet, timeout time.Duration) (*linux.SignalInfo, error)
- func (t *Task) SingleIOSequence(addr hostarch.Addr, length int, opts usermem.IOOpts) (usermem.IOSequence, error)
- func (t *Task) Stack() *arch.Stack
- func (t *Task) Start(tid ThreadID)
- func (t *Task) StartTime() ktime.Time
- func (t *Task) StateStatus() string
- func (t *Task) SwapUint32(addr hostarch.Addr, new uint32) (uint32, error)
- func (t *Task) SyscallRestartBlock() SyscallRestartBlock
- func (t *Task) SyscallTable() *SyscallTable
- func (t *Task) TGIDInRoot() ThreadID
- func (t *Task) TaskGoroutineSchedInfo() TaskGoroutineSchedInfo
- func (t *Task) TaskImage() *TaskImage
- func (t *Task) TaskSet() *TaskSet
- func (t *Task) ThreadGroup() *ThreadGroup
- func (t *Task) ThreadID() ThreadID
- func (t *Task) Timekeeper() *Timekeeper
- func (t *Task) Tracer() *Task
- func (t *Task) UTSNamespace() *UTSNamespace
- func (t *Task) UninterruptibleSleepFinish(activate bool)
- func (t *Task) UninterruptibleSleepStart(deactivate bool)
- func (t *Task) Unshare(flags int32) error
- func (t *Task) UnshareFdTable(maxFd int32)
- func (t *Task) UserCPUClock() ktime.Clock
- func (t *Task) UserNamespace() *auth.UserNamespace
- func (t *Task) Value(key any) any
- func (t *Task) Wait(opts *WaitOptions) (*WaitResult, error)
- func (t *Task) Warningf(fmt string, v ...any)
- func (t *Task) WithMuLocked(f func(*Task))
- func (t *Task) Yield()
- type TaskCgroupEntry
- type TaskConfig
- type TaskExitState
- type TaskGoroutineSchedInfo
- type TaskGoroutineState
- type TaskImage
- type TaskOrigin
- type TaskSet
- type TaskStop
- type TaskWorker
- type ThreadGroup
- func (tg *ThreadGroup) CPUClock() ktime.Clock
- func (tg *ThreadGroup) CPUStats() usage.CPUStats
- func (tg *ThreadGroup) Count() int
- func (tg *ThreadGroup) CreateProcessGroup() error
- func (tg *ThreadGroup) CreateSession() (SessionID, error)
- func (tg *ThreadGroup) ExitStatus() linux.WaitStatus
- func (tg *ThreadGroup) ForegroundProcessGroupID(tty *TTY) (ProcessGroupID, error)
- func (tg *ThreadGroup) ID() ThreadID
- func (tg *ThreadGroup) IOUsage() *usage.IO
- func (tg *ThreadGroup) IsChildSubreaper() bool
- func (tg *ThreadGroup) IsInitIn(pidns *PIDNamespace) bool
- func (tg *ThreadGroup) JoinProcessGroup(pidns *PIDNamespace, pgid ProcessGroupID, checkExec bool) error
- func (tg *ThreadGroup) JoinedChildCPUStats() usage.CPUStats
- func (tg *ThreadGroup) Leader() *Task
- func (tg *ThreadGroup) Limits() *limits.LimitSet
- func (tg *ThreadGroup) MemberIDs(pidns *PIDNamespace) []ThreadID
- func (tg *ThreadGroup) MigrateCgroup(dst Cgroup) error
- func (tg *ThreadGroup) PIDNamespace() *PIDNamespace
- func (tg *ThreadGroup) ProcessGroup() *ProcessGroup
- func (tg *ThreadGroup) Release(ctx context.Context)
- func (tg *ThreadGroup) ReleaseControllingTTY(tty *TTY) error
- func (tg *ThreadGroup) SendSignal(info *linux.SignalInfo) error
- func (tg *ThreadGroup) Session() *Session
- func (tg *ThreadGroup) SetChildSubreaper(isSubreaper bool)
- func (tg *ThreadGroup) SetControllingTTY(tty *TTY, steal bool, isReadable bool) error
- func (tg *ThreadGroup) SetForegroundProcessGroupID(tty *TTY, pgid ProcessGroupID) error
- func (tg *ThreadGroup) SetSigAction(sig linux.Signal, actptr *linux.SigAction) (linux.SigAction, error)
- func (tg *ThreadGroup) SignalHandlers() *SignalHandlers
- func (tg *ThreadGroup) TTY() *TTY
- func (tg *ThreadGroup) TaskSet() *TaskSet
- func (tg *ThreadGroup) TerminationSignal() linux.Signal
- func (tg *ThreadGroup) UserCPUClock() ktime.Clock
- func (tg *ThreadGroup) WaitExited()
- type ThreadID
- type Timekeeper
- func (t *Timekeeper) AfterFunc(d time.Duration, f func()) tcpip.Timer
- func (t *Timekeeper) BootTime() ktime.Time
- func (t *Timekeeper) Destroy()
- func (t *Timekeeper) GetTime(c sentrytime.ClockID) (int64, error)
- func (t *Timekeeper) Now() time.Time
- func (t *Timekeeper) NowMonotonic() tcpip.MonotonicTime
- func (t *Timekeeper) PauseUpdates()
- func (t *Timekeeper) ResumeUpdates()
- func (t *Timekeeper) SetClocks(c sentrytime.Clocks)
- type UTSNamespace
- func (u *UTSNamespace) Clone(userns *auth.UserNamespace) *UTSNamespace
- func (u *UTSNamespace) DecRef(ctx context.Context)
- func (u *UTSNamespace) Destroy(ctx context.Context)
- func (u *UTSNamespace) DomainName() string
- func (u *UTSNamespace) GetInode() *nsfs.Inode
- func (u *UTSNamespace) HostName() string
- func (u *UTSNamespace) IncRef()
- func (u *UTSNamespace) SetDomainName(domain string)
- func (u *UTSNamespace) SetHostName(host string)
- func (u *UTSNamespace) SetInode(inode *nsfs.Inode)
- func (u *UTSNamespace) Type() string
- func (u *UTSNamespace) UserNamespace() *auth.UserNamespace
- type UserCounters
- type VDSOParamPage
- type Version
- type WaitOptions
- type WaitResult
Constants ¶
const (
	CgroupControllerCPU     = CgroupControllerType("cpu")
	CgroupControllerCPUAcct = CgroupControllerType("cpuacct")
	CgroupControllerCPUSet  = CgroupControllerType("cpuset")
	CgroupControllerDevices = CgroupControllerType("devices")
	CgroupControllerJob     = CgroupControllerType("job")
	CgroupControllerMemory  = CgroupControllerType("memory")
	CgroupControllerPIDs    = CgroupControllerType("pids")
)
Available cgroup controllers.
const (
	// CtxCanTrace is a Context.Value key for a function with the same
	// signature and semantics as kernel.Task.CanTrace.
	CtxCanTrace contextID = iota

	// CtxKernel is a Context.Value key for a Kernel.
	CtxKernel

	// CtxPIDNamespace is a Context.Value key for a PIDNamespace.
	CtxPIDNamespace

	// CtxTask is a Context.Value key for a Task.
	CtxTask

	// CtxUTSNamespace is a Context.Value key for a UTSNamespace.
	CtxUTSNamespace
)
const (
	// SupportUndocumented indicates the syscall is not documented yet.
	SupportUndocumented = iota

	// SupportUnimplemented indicates the syscall is unimplemented.
	SupportUnimplemented

	// SupportPartial indicates the syscall is partially supported.
	SupportPartial

	// SupportFull indicates the syscall is fully supported.
	SupportFull
)
const (
	// StraceEnableLog enables syscall log tracing.
	StraceEnableLog

	// StraceEnableEvent enables syscall event tracing.
	StraceEnableEvent

	// ExternalBeforeEnable enables the external hook before syscall execution.
	ExternalBeforeEnable

	// ExternalAfterEnable enables the external hook after syscall execution.
	ExternalAfterEnable

	// SecCheckEnter represents a schematized/enter syscall seccheck event.
	SecCheckEnter

	// SecCheckExit represents a schematized/exit syscall seccheck event.
	SecCheckExit

	// SecCheckRawEnter represents raw/enter syscall seccheck event.
	SecCheckRawEnter

	// SecCheckRawExit represents raw/exit syscall seccheck event.
	SecCheckRawExit
)
Possible flags for SyscallFlagsTable.enable.
const (
	// EventExit represents an exit notification generated for a child thread
	// group leader or a tracee under the conditions specified in the comment
	// above runExitNotify.
	EventExit waiter.EventMask = 1 << iota

	// EventChildGroupStop occurs when a child thread group completes a group
	// stop (i.e. all tasks in the child thread group have entered a stopped
	// state as a result of a group stop).
	EventChildGroupStop

	// EventTraceeStop occurs when a task that is ptraced by a task in the
	// notified thread group enters a ptrace stop (see ptrace(2)).
	EventTraceeStop

	// EventGroupContinue occurs when a child thread group, or a thread group
	// whose leader is ptraced by a task in the notified thread group, that had
	// initiated or completed a group stop leaves the group stop, due to the
	// child thread group or any task in the child thread group being sent
	// SIGCONT.
	EventGroupContinue
)
Task events that can be waited for.
const InvalidCgroupHierarchyID uint32 = 0
InvalidCgroupHierarchyID indicates an uninitialized hierarchy ID.
const InvalidCgroupID uint32 = 0
InvalidCgroupID indicates an uninitialized cgroup ID.
const MaxFdLimit int32 = int32(bitmap.MaxBitEntryLimit)
MaxFdLimit defines the upper limit on the integer value of file descriptors.
const SignalPanic = linux.SIGUSR2
SignalPanic is used to panic the running threads. It is a signal which cannot be used by the application: it must be caught and ignored by the runtime (in order to catch possible races).
const StraceEnableBits = StraceEnableLog | StraceEnableEvent
StraceEnableBits combines both strace log and event flags.
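A combined constant like this is typically used to ask whether any of several flags is set. The sketch below illustrates the idea with hypothetical flag values; the numeric values of the package's own constants are not shown in this documentation, so `enableLog`, `enableEvent`, `externalBefore`, and `straceEnabled` are assumptions made for the example.

```go
package main

import "fmt"

// Hypothetical flag values for illustration only.
const (
	enableLog uint32 = 1 << iota
	enableEvent
	externalBefore
)

// enableBits plays the role of StraceEnableBits: the OR of both strace flags.
const enableBits = enableLog | enableEvent

// straceEnabled reports whether at least one strace-style flag is set.
func straceEnabled(flags uint32) bool {
	return flags&enableBits != 0
}

func main() {
	fmt.Println(straceEnabled(enableEvent | externalBefore))
	fmt.Println(straceEnabled(externalBefore))
}
```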
const SupportedCloneFlags = linux.CLONE_VM | linux.CLONE_FS | linux.CLONE_FILES | linux.CLONE_SYSVSEM | linux.CLONE_THREAD | linux.CLONE_SIGHAND | linux.CLONE_CHILD_SETTID | linux.CLONE_NEWPID | linux.CLONE_CHILD_CLEARTID | linux.CLONE_CHILD_SETTID | linux.CLONE_PARENT | linux.CLONE_PARENT_SETTID | linux.CLONE_SETTLS | linux.CLONE_NEWUSER | linux.CLONE_NEWUTS | linux.CLONE_NEWIPC | linux.CLONE_NEWNET | linux.CLONE_PTRACE | linux.CLONE_UNTRACED | linux.CLONE_IO | linux.CLONE_VFORK | linux.CLONE_DETACHED | linux.CLONE_NEWNS
SupportedCloneFlags is the bitwise OR of all the supported flags for clone. TODO(b/290826530): Implement CLONE_INTO_CGROUP when cgroups v2 is implemented.
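A mask of supported flags is commonly used to reject requests that carry anything else. The following is a minimal sketch of that check, not the package's implementation: the flag values are the Linux clone(2) values copied in so the example is self-contained, and `supported` is a deliberately small hypothetical subset rather than SupportedCloneFlags itself.

```go
package main

import "fmt"

// Linux clone(2) flag values, copied here so the sketch is self-contained.
const (
	cloneVM      = 0x00000100
	cloneFS      = 0x00000200
	cloneFiles   = 0x00000400
	cloneSighand = 0x00000800
	cloneThread  = 0x00010000
)

// supported is a small hypothetical mask used only for this example.
const supported = cloneVM | cloneFS | cloneFiles

// unsupportedBits returns the bits of flags that fall outside supported.
// A caller would typically fail the request when the result is nonzero.
func unsupportedBits(flags uint64) uint64 {
	return flags &^ supported
}

func main() {
	fmt.Printf("%#x\n", unsupportedBits(cloneVM|cloneFS))
	fmt.Printf("%#x\n", unsupportedBits(cloneVM|cloneThread))
}
```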
const TasksLimit = (1 << 16)
TasksLimit is the maximum number of threads for an untrusted application. Linux doesn't really limit this directly; rather, it is limited by total memory size, stacks allocated and a global maximum. There's no real reason for us to limit it either (especially since threads are backed by goroutines), and we would expect to hit resource limits long before hitting this number. However, for correctness, we still check that the user doesn't exceed this number.
Note that because of the way futexes are implemented, there *are* in fact serious restrictions on valid thread IDs. They are limited to 2^30 - 1 (kernel/fork.c:MAX_THREADS).
Variables ¶
var CgroupCtrls = []CgroupControllerType{"cpu", "cpuacct", "cpuset", "devices", "job", "memory", "pids"}
CgroupCtrls is the list of cgroup controllers.
var (
	// CtrlDoExit is returned by the implementations of the exit and exit_group
	// syscalls to enter the task exit path directly, skipping syscall exit
	// tracing.
	CtrlDoExit = &SyscallControl{next: (*runExit)(nil), ignoreReturn: true}
)
var ErrNoWaitableEvent = errors.New("non-blocking Wait found eligible threads but no waitable events")
ErrNoWaitableEvent is returned by non-blocking Task.Waits (e.g. waitpid(WNOHANG)) that find no waitable events, but determine that waitable events may exist in the future. (In contrast, if a non-blocking or blocking Wait determines that there are no tasks that can produce a waitable event, Task.Wait returns ECHILD.)
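The distinction between "nothing to report yet" and "nothing can ever be reported" is a sentinel-error pattern. The sketch below shows how a caller might branch on it with `errors.Is`. Everything in it is hypothetical: `pollWait` is a toy non-blocking wait over a map, and `errNoWaitableEvent` and `errNoChildren` merely stand in for ErrNoWaitableEvent and ECHILD.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels standing in for ErrNoWaitableEvent and ECHILD.
var (
	errNoWaitableEvent = errors.New("eligible children exist but none has an event")
	errNoChildren      = errors.New("no eligible children")
)

// pollWait is a toy non-blocking wait. children maps a child ID to whether
// that child currently has an event ready to be collected.
func pollWait(children map[int]bool) (int, error) {
	if len(children) == 0 {
		return 0, errNoChildren
	}
	for id, ready := range children {
		if ready {
			return id, nil
		}
	}
	return 0, errNoWaitableEvent
}

// describe maps the outcome of pollWait to the action a caller would take.
func describe(err error) string {
	switch {
	case err == nil:
		return "event"
	case errors.Is(err, errNoWaitableEvent):
		return "retry later"
	default:
		return "give up"
	}
}

func main() {
	_, err := pollWait(map[int]bool{7: false})
	fmt.Println(describe(err))
	_, err = pollWait(nil)
	fmt.Println(describe(err))
}
```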
var IOUringEnabled = false
IOUringEnabled is set to true when IO_URING is enabled. Added as a global to allow easy access everywhere.
MAX_RW_COUNT is the maximum size in bytes of a single read or write. Reads and writes that exceed this size may be silently truncated. (Linux: include/linux/fs.h:MAX_RW_COUNT)
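For reference, Linux defines MAX_RW_COUNT as INT_MAX rounded down to a page boundary. The sketch below reproduces that arithmetic and the silent truncation it implies, assuming a 4096-byte page; `pageSize`, `maxRWCount`, and `clampRW` are hypothetical names for this example, not identifiers from this package.

```go
package main

import (
	"fmt"
	"math"
)

// pageSize is an assumption of this sketch; real page sizes vary by platform.
const pageSize = 4096

// maxRWCount mirrors the Linux definition INT_MAX & PAGE_MASK, that is,
// the largest 32-bit signed value rounded down to a page boundary.
const maxRWCount = math.MaxInt32 &^ (pageSize - 1)

// clampRW truncates a requested transfer length to maxRWCount, which is how
// an oversized read or write ends up silently shortened.
func clampRW(n int64) int64 {
	if n > maxRWCount {
		return maxRWCount
	}
	return n
}

func main() {
	fmt.Println(maxRWCount)       // 2147479552 with 4096-byte pages
	fmt.Println(clampRW(1 << 40)) // clamped to the same value
	fmt.Println(clampRW(512))     // small requests pass through unchanged
}
```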
var StopSignals = linux.MakeSignalSet(linux.SIGSTOP, linux.SIGTSTP, linux.SIGTTIN, linux.SIGTTOU)
StopSignals is the set of signals whose default action is SignalActionStop.
var UnblockableSignals = linux.MakeSignalSet(linux.SIGKILL, linux.SIGSTOP)
UnblockableSignals contains the set of signals which cannot be blocked.
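Signal sets of this kind are usually bitmasks in which signal number n occupies bit n-1. The sketch below assumes that layout and uses Linux x86 signal numbers. `signalSet`, `makeSignalSet`, and `contains` are hypothetical helpers written for this example and may not match how linux.MakeSignalSet is implemented.

```go
package main

import "fmt"

// signalSet is a hypothetical bitmask type: signal n is stored in bit n-1.
type signalSet uint64

// Linux x86 signal numbers used in this example.
const (
	sigKill = 9
	sigStop = 19
	sigTstp = 20
)

// makeSignalSet builds a set containing the given signal numbers.
func makeSignalSet(sigs ...int) signalSet {
	var s signalSet
	for _, sig := range sigs {
		s |= signalSet(1) << uint(sig-1)
	}
	return s
}

// contains reports whether sig is a member of s.
func (s signalSet) contains(sig int) bool {
	return s&(signalSet(1)<<uint(sig-1)) != 0
}

func main() {
	unblockable := makeSignalSet(sigKill, sigStop)

	// A requested mask that tries to block SIGKILL is filtered so that
	// only the blockable signals remain.
	requested := makeSignalSet(sigKill, sigTstp)
	effective := requested &^ unblockable

	fmt.Println(effective.contains(sigKill), effective.contains(sigTstp))
}
```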
Functions ¶
func ContextCanTrace ¶
func ContextCanTrace(ctx context.Context, t *Task, attach bool) bool
ContextCanTrace returns true if ctx is permitted to trace t, in the same sense as kernel.Task.CanTrace.
func ExtractErrno ¶
func ExtractErrno(err error, sysno int) int
ExtractErrno extracts an integer error number from the error. The syscall number is purely for context in the error case. Use -1 if syscall number is unknown.
func IncrementUnimplementedSyscallCounter ¶
func IncrementUnimplementedSyscallCounter(sysno uintptr)
IncrementUnimplementedSyscallCounter increments the "unimplemented syscall" metric for the given syscall number. A syscall table must have been initialized prior to calling this function. +checkescape:all
func LoadSeccheckData ¶
func LoadSeccheckData(t *Task, mask seccheck.FieldMask, info *pb.ContextData)
LoadSeccheckData sets info from the task based on mask.
func LoadSeccheckDataLocked ¶
func LoadSeccheckDataLocked(t *Task, mask seccheck.FieldMask, info *pb.ContextData, cwd string)
LoadSeccheckDataLocked sets info from the task based on mask.
Preconditions: The TaskSet mutex must be locked.
func RegisterSyscallTable ¶
func RegisterSyscallTable(s *SyscallTable)
RegisterSyscallTable registers a new syscall table for use by a Kernel.
func SignalInfoNoInfo ¶
func SignalInfoNoInfo(sig linux.Signal, sender, receiver *Task) *linux.SignalInfo
SignalInfoNoInfo returns a SignalInfo equivalent to Linux's SEND_SIG_NOINFO.
func SignalInfoPriv ¶
func SignalInfoPriv(sig linux.Signal) *linux.SignalInfo
SignalInfoPriv returns a SignalInfo equivalent to Linux's SEND_SIG_PRIV.
Types ¶
type AIOCallback ¶
AIOCallback is a function that does asynchronous I/O on behalf of a task.
type Cgroup ¶
type Cgroup struct { *kernfs.Dentry CgroupImpl }
Cgroup represents a named pointer to a cgroup in cgroupfs. When a task enters a cgroup, it holds a reference on the underlying dentry pointing to the cgroup.
+stateify savable
type CgroupController ¶
type CgroupController interface {
	// Type returns the type of this cgroup controller (e.g. "memory", "cpu").
	// Returned value is valid for the lifetime of the controller.
	Type() CgroupControllerType

	// HierarchyID returns the ID of the hierarchy this cgroup controller is
	// attached to. Returned value is valid for the lifetime of the controller.
	HierarchyID() uint32

	// EffectiveRootCgroup returns the effective root cgroup for this
	// controller. This is either the actual root of the underlying cgroupfs
	// filesystem, or the override root configured at sandbox startup. Returned
	// value is valid for the lifetime of the controller.
	EffectiveRootCgroup() Cgroup

	// NumCgroups returns the number of cgroups managed by this controller.
	// Returned value is a snapshot in time.
	NumCgroups() uint64

	// Enabled returns whether this controller is enabled. Returned value is a
	// snapshot in time.
	Enabled() bool
}
CgroupController is the common interface to cgroup controllers available to the entire sentry. The controllers themselves are defined by cgroupfs.
Callers of this interface are often unable to access the synchronization needed to ensure returned values remain valid. Some of the values returned from this interface are thus snapshots in time, and may become stale. This is acceptable for many callers, such as procfs.
type CgroupControllerType ¶
type CgroupControllerType string
CgroupControllerType is the name of a cgroup controller.
func ParseCgroupController ¶
func ParseCgroupController(val string) (CgroupControllerType, error)
ParseCgroupController parses a string as a CgroupControllerType.
type CgroupImpl ¶
type CgroupImpl interface {
	// Controllers lists the controllers associated with this cgroup.
	Controllers() []CgroupController

	// HierarchyID returns the id of the hierarchy that contains this cgroup.
	HierarchyID() uint32

	// Name returns the name for this cgroup, if any. If no name was provided
	// when the hierarchy was created, returns "".
	Name() string

	// Enter moves t into this cgroup.
	Enter(t *Task)

	// Leave moves t out of this cgroup.
	Leave(t *Task)

	// PrepareMigrate initiates a migration of t from src to this cgroup. See
	// cgroupfs.controller.PrepareMigrate.
	PrepareMigrate(t *Task, src *Cgroup) error

	// CommitMigrate completes an in-flight migration. See
	// cgroupfs.controller.CommitMigrate.
	CommitMigrate(t *Task, src *Cgroup)

	// AbortMigrate cancels an in-flight migration. See
	// cgroupfs.controller.AbortMigrate.
	AbortMigrate(t *Task, src *Cgroup)

	// Charge charges a controller in this cgroup for a particular resource. key
	// must match a valid resource for the specified controller type.
	//
	// The implementer should silently succeed if no matching controllers are
	// found.
	//
	// The underlying implementation will panic if passed an incompatible
	// resource type for a given controller.
	//
	// See cgroupfs.controller.Charge.
	Charge(t *Task, d *kernfs.Dentry, ctl CgroupControllerType, res CgroupResourceType, value int64) error

	// ReadControl allows a background context to read a cgroup's control
	// values.
	ReadControl(ctx context.Context, name string) (string, error)

	// WriteControl allows a background context to write a cgroup's control
	// values.
	WriteControl(ctx context.Context, name string, val string) error

	// ID returns the id of this cgroup.
	ID() uint32
}
CgroupImpl is the common interface to cgroups.
type CgroupMigrationContext ¶
type CgroupMigrationContext struct {
// contains filtered or unexported fields
}
CgroupMigrationContext represents an in-flight cgroup migration for a single task.
func (*CgroupMigrationContext) Abort ¶
func (ctx *CgroupMigrationContext) Abort()
Abort cancels a migration.
func (*CgroupMigrationContext) Commit ¶
func (ctx *CgroupMigrationContext) Commit()
Commit completes a migration.
type CgroupMount ¶
CgroupMount contains the cgroup mount. These mounts are created for the root container by default and are stored in the kernel.
+stateify savable
type CgroupRegistry ¶
type CgroupRegistry struct {
// contains filtered or unexported fields
}
CgroupRegistry tracks the active set of cgroup controllers on the system.
+stateify savable
func (*CgroupRegistry) AddCgroup ¶
func (r *CgroupRegistry) AddCgroup(cg CgroupImpl)
AddCgroup adds the cgroup to the registry's map, keyed by its ID.
func (*CgroupRegistry) FindCgroup ¶
func (r *CgroupRegistry) FindCgroup(ctx context.Context, ctype CgroupControllerType, path string) (Cgroup, error)
FindCgroup locates a cgroup with the given parameters.
A cgroup is considered a match even if it contains other controllers on the same hierarchy.
func (*CgroupRegistry) FindHierarchy ¶
func (r *CgroupRegistry) FindHierarchy(name string, ctypes []CgroupControllerType) (*vfs.Filesystem, error)
FindHierarchy returns a cgroup filesystem containing exactly the set of controllers named in ctypes, and optionally the name specified in name if it isn't empty. If no such FS is found, FindHierarchy returns nil. FindHierarchy takes a reference on the returned FS, which is transferred to the caller.
func (*CgroupRegistry) GenerateProcCgroups ¶
func (r *CgroupRegistry) GenerateProcCgroups(buf *bytes.Buffer)
GenerateProcCgroups writes the contents of /proc/cgroups to buf.
func (*CgroupRegistry) GetCgroup ¶
func (r *CgroupRegistry) GetCgroup(cid uint32) (CgroupImpl, error)
GetCgroup returns the cgroup associated with the cgroup ID.
func (*CgroupRegistry) NextCgroupID ¶
func (r *CgroupRegistry) NextCgroupID() (uint32, error)
NextCgroupID returns a newly allocated, unique cgroup ID.
func (*CgroupRegistry) Register ¶
func (r *CgroupRegistry) Register(name string, cs []CgroupController, fs cgroupFS) error
Register registers the provided set of controllers with the registry as a new hierarchy. If any controller is already registered, the function returns an error without modifying the registry. Register sets the hierarchy ID for the filesystem on success.
func (*CgroupRegistry) Unregister ¶
func (r *CgroupRegistry) Unregister(hid uint32)
Unregister removes a previously registered hierarchy from the registry. If no such hierarchy is registered, Unregister is a no-op.
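The all-or-nothing behavior of Register and the no-op behavior of Unregister can be sketched with a toy registry. This is a self-contained illustration under stated assumptions, not the CgroupRegistry implementation: the `registry` type, its fields, and the use of plain strings for controller types are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a hypothetical toy model of a controller registry.
type registry struct {
	nextID      uint32
	controllers map[string]uint32   // controller type -> hierarchy ID
	hierarchies map[uint32][]string // hierarchy ID -> controller types
}

func newRegistry() *registry {
	return &registry{
		nextID:      1,
		controllers: make(map[string]uint32),
		hierarchies: make(map[uint32][]string),
	}
}

// register validates every controller before mutating any state, so a
// failure leaves the registry unchanged.
func (r *registry) register(ctypes []string) (uint32, error) {
	for _, c := range ctypes {
		if _, ok := r.controllers[c]; ok {
			return 0, errors.New("controller already registered: " + c)
		}
	}
	hid := r.nextID
	r.nextID++
	for _, c := range ctypes {
		r.controllers[c] = hid
	}
	r.hierarchies[hid] = append([]string(nil), ctypes...)
	return hid, nil
}

// unregister removes a hierarchy; an unknown ID is a no-op.
func (r *registry) unregister(hid uint32) {
	for _, c := range r.hierarchies[hid] {
		delete(r.controllers, c)
	}
	delete(r.hierarchies, hid)
}

func main() {
	r := newRegistry()
	hid, _ := r.register([]string{"cpu", "memory"})
	_, err := r.register([]string{"pids", "cpu"})
	fmt.Println(hid, err != nil, len(r.controllers))
}
```

Checking all controllers before assigning a hierarchy ID is what keeps a rejected registration from leaving partial state behind.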
type CgroupResourceType ¶
type CgroupResourceType int
CgroupResourceType represents a resource type tracked by a particular controller.
const (
	// CgroupResourcePID represents a charge for pids.current.
	CgroupResourcePID CgroupResourceType = iota
)
Resources for the cpuacct controller.
type CreateProcessArgs ¶
type CreateProcessArgs struct {
	// Filename is the filename to load as the init binary.
	//
	// If this is provided as "", File will be checked, then the file will be
	// guessed via Argv[0].
	Filename string

	// File is a passed host FD pointing to a file to load as the init binary.
	//
	// This is checked if and only if Filename is "".
	File *vfs.FileDescription

	// Argv is a list of arguments.
	Argv []string

	// Envv is a list of environment variables.
	Envv []string

	// WorkingDirectory is the initial working directory.
	//
	// This defaults to the root if empty.
	WorkingDirectory string

	// Credentials is the initial credentials.
	Credentials *auth.Credentials

	// FDTable is the initial set of file descriptors. If CreateProcess succeeds,
	// it takes a reference on FDTable.
	FDTable *FDTable

	// Umask is the initial umask.
	Umask uint

	// Limits are the initial resource limits.
	Limits *limits.LimitSet

	// MaxSymlinkTraversals is the maximum number of symlinks to follow
	// during resolution.
	MaxSymlinkTraversals uint

	// UTSNamespace is the initial UTS namespace.
	UTSNamespace *UTSNamespace

	// IPCNamespace is the initial IPC namespace.
	IPCNamespace *IPCNamespace

	// PIDNamespace is the initial PID Namespace.
	PIDNamespace *PIDNamespace

	// MountNamespace optionally contains the mount namespace for this
	// process. If nil, the init process's mount namespace is used.
	//
	// Anyone setting MountNamespace must donate a reference (i.e.
	// increment it).
	MountNamespace *vfs.MountNamespace

	// ContainerID is the container that the process belongs to.
	ContainerID string

	// InitialCgroups are the cgroups the container is initialized to.
	InitialCgroups map[Cgroup]struct{}

	// Origin indicates how the task was first created.
	Origin TaskOrigin
}
CreateProcessArgs holds arguments to kernel.CreateProcess.
func (*CreateProcessArgs) NewContext ¶
func (args *CreateProcessArgs) NewContext(k *Kernel) context.Context
NewContext returns a context.Context that represents the task that will be created from args.
type FDFlags ¶
type FDFlags struct {
	// CloseOnExec indicates the descriptor should be closed on exec.
	CloseOnExec bool
}
FDFlags defines flags for an individual descriptor.
+stateify savable
func (FDFlags) ToLinuxFDFlags ¶
ToLinuxFDFlags converts a kernel.FDFlags object to a Linux descriptor flags representation.
func (FDFlags) ToLinuxFileFlags ¶
ToLinuxFileFlags converts a kernel.FDFlags object to a Linux file flags representation.
type FDTable ¶
type FDTable struct { FDTableRefs // contains filtered or unexported fields }
FDTable is used to manage File references and flags.
+stateify savable
func (*FDTable) CurrentMaxFDs ¶
CurrentMaxFDs returns the number of file descriptors that may be stored in f without reallocation.
func (*FDTable) DecRef ¶
DecRef implements RefCounter.DecRef.
If f reaches zero references, all of its file descriptors are removed.
func (*FDTable) Fork ¶
Fork returns an independent FDTable, cloning all FDs up to maxFds (non-inclusive).
func (*FDTable) Get ¶
func (f *FDTable) Get(fd int32) (*vfs.FileDescription, FDFlags)
Get returns a reference to the file and the flags for the FD or nil if no file is defined for the given fd.
N.B. Callers are required to use DecRef when they are done.
func (*FDTable) GetFDs ¶
GetFDs returns a sorted list of valid fds.
Precondition: The caller must be running on the task goroutine, or Task.mu must be locked.
func (*FDTable) NewFD ¶
func (f *FDTable) NewFD(ctx context.Context, minFD int32, file *vfs.FileDescription, flags FDFlags) (int32, error)
NewFD allocates a file descriptor greater than or equal to minFD for the given file description. If it succeeds, it takes a reference on file.
func (*FDTable) NewFDAt ¶
func (f *FDTable) NewFDAt(ctx context.Context, fd int32, file *vfs.FileDescription, flags FDFlags) (*vfs.FileDescription, error)
NewFDAt sets the file reference for the given FD. If there is an existing file description for that FD, it is returned.
N.B. Callers are required to use DecRef on the returned file when they are done.
Precondition: file != nil.
func (*FDTable) NewFDs ¶
func (f *FDTable) NewFDs(ctx context.Context, minFD int32, files []*vfs.FileDescription, flags FDFlags) (fds []int32, err error)
NewFDs allocates new FDs guaranteed to be the lowest number available greater than or equal to the minFD parameter. All files will share the set flags. Success is guaranteed to be all or none.
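The "lowest available" and "all or none" guarantees can be illustrated with a toy table. This sketch is self-contained and is not the FDTable implementation: the `fdTable` type with a fixed slice of slots, `newFDs`, and `errTooManyFiles` are hypothetical, and it assumes minFD is non-negative.

```go
package main

import (
	"errors"
	"fmt"
)

// fdTable is a hypothetical toy model: used[i] reports whether fd i is taken.
type fdTable struct {
	used []bool
}

var errTooManyFiles = errors.New("too many open files")

// newFDs reserves the n lowest free descriptors that are >= minFD. It first
// collects candidates and only marks them used once all n were found, so a
// failed call reserves nothing. minFD is assumed to be non-negative.
func (t *fdTable) newFDs(minFD, n int) ([]int, error) {
	fds := make([]int, 0, n)
	for fd := minFD; fd < len(t.used) && len(fds) < n; fd++ {
		if !t.used[fd] {
			fds = append(fds, fd)
		}
	}
	if len(fds) < n {
		return nil, errTooManyFiles
	}
	for _, fd := range fds {
		t.used[fd] = true
	}
	return fds, nil
}

func main() {
	tbl := &fdTable{used: []bool{true, true, false, true, false, false}}
	fds, err := tbl.newFDs(1, 2)
	fmt.Println(fds, err)
}
```

Separating the search from the commit step is one simple way to obtain the all-or-none property without needing to roll back partial allocations.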
func (*FDTable) Remove ¶
Remove removes an FD from f. It returns the removed file description.
N.B. Callers are required to use DecRef on the returned file when they are done.
func (*FDTable) RemoveNextInRange ¶
func (f *FDTable) RemoveNextInRange(ctx context.Context, startFd int32, endFd int32) (int32, *vfs.FileDescription)
RemoveNextInRange removes the next FD that falls within the given range, and returns the FD number and FileDescription of the removed FD.
N.B. Callers are required to use DecRef on the returned file when they are done.
func (*FDTable) SetFlags ¶
SetFlags sets the flags for the given file descriptor.
True is returned iff flags were changed.
type FSContext ¶
type FSContext struct { FSContextRefs // contains filtered or unexported fields }
FSContext contains filesystem context.
This includes umask and working directory.
+stateify savable
func NewFSContext ¶
func NewFSContext(root, cwd vfs.VirtualDentry, umask uint) *FSContext
NewFSContext returns a new filesystem context.
func (*FSContext) DecRef ¶
DecRef implements RefCounter.DecRef.
When f reaches zero references, DecRef will be called on both root and cwd Dirents.
Note that there may still be calls to WorkingDirectory() or RootDirectory() (that return nil). This is because valid references may still be held via proc files or other mechanisms.
func (*FSContext) RootDirectory ¶
func (f *FSContext) RootDirectory() vfs.VirtualDentry
RootDirectory returns the current filesystem root.
This will return an empty vfs.VirtualDentry if called after f is destroyed, otherwise it will return a Dirent with a reference taken.
func (*FSContext) SetRootDirectory ¶
func (f *FSContext) SetRootDirectory(ctx context.Context, vd vfs.VirtualDentry)
SetRootDirectory sets the root directory. It takes a reference on vd.
This is not a valid call after f is destroyed.
func (*FSContext) SetWorkingDirectory ¶
func (f *FSContext) SetWorkingDirectory(ctx context.Context, d vfs.VirtualDentry)
SetWorkingDirectory sets the current working directory. This will take an extra reference on the VirtualDentry.
This is not a valid call after f is destroyed.
func (*FSContext) SwapUmask ¶
SwapUmask atomically sets the current umask and returns the old umask.
func (*FSContext) WorkingDirectory ¶
func (f *FSContext) WorkingDirectory() vfs.VirtualDentry
WorkingDirectory returns the current working directory.
This will return an empty vfs.VirtualDentry if called after f is destroyed, otherwise it will return a Dirent with a reference taken.
type IPCNamespace ¶
type IPCNamespace struct {
// contains filtered or unexported fields
}
IPCNamespace represents an IPC namespace.
+stateify savable
func IPCNamespaceFromContext ¶
func IPCNamespaceFromContext(ctx context.Context) *IPCNamespace
IPCNamespaceFromContext returns the IPC namespace in which ctx is executing, or nil if there is no such IPC namespace. It takes a reference on the namespace.
func NewIPCNamespace ¶
func NewIPCNamespace(userNS *auth.UserNamespace) *IPCNamespace
NewIPCNamespace creates a new IPC namespace.
func (*IPCNamespace) DecRef ¶
func (i *IPCNamespace) DecRef(ctx context.Context)
DecRef decrements the namespace's refcount.
func (*IPCNamespace) Destroy ¶
func (i *IPCNamespace) Destroy(ctx context.Context)
Destroy implements nsfs.Namespace.Destroy.
func (*IPCNamespace) GetInode ¶
func (i *IPCNamespace) GetInode() *nsfs.Inode
GetInode returns the nsfs inode associated with the IPC namespace.
func (*IPCNamespace) IncRef ¶
func (i *IPCNamespace) IncRef()
IncRef increments the Namespace's refcount.
func (*IPCNamespace) InitPosixQueues ¶
func (i *IPCNamespace) InitPosixQueues(ctx context.Context, vfsObj *vfs.VirtualFilesystem, creds *auth.Credentials) error
InitPosixQueues creates a new POSIX queue registry, and returns an error if the registry was previously initialized.
func (*IPCNamespace) MsgqueueRegistry ¶
func (i *IPCNamespace) MsgqueueRegistry() *msgqueue.Registry
MsgqueueRegistry returns the message queue registry for this namespace.
func (*IPCNamespace) PosixQueues ¶
func (i *IPCNamespace) PosixQueues() *mq.Registry
PosixQueues returns the posix message queue registry for this namespace.
Precondition: i.InitPosixQueues must have been called.
func (*IPCNamespace) SemaphoreRegistry ¶
func (i *IPCNamespace) SemaphoreRegistry() *semaphore.Registry
SemaphoreRegistry returns the semaphore set registry for this namespace.
func (*IPCNamespace) SetInode ¶
func (i *IPCNamespace) SetInode(inode *nsfs.Inode)
SetInode sets the nsfs inode for the IPC namespace.
func (*IPCNamespace) ShmRegistry ¶
func (i *IPCNamespace) ShmRegistry() *shm.Registry
ShmRegistry returns the shm segment registry for this namespace.
func (*IPCNamespace) Type ¶
func (i *IPCNamespace) Type() string
Type implements nsfs.Namespace.Type.
func (*IPCNamespace) UserNamespace ¶
func (i *IPCNamespace) UserNamespace() *auth.UserNamespace
UserNamespace returns the user namespace associated with the namespace.
type InitKernelArgs ¶
type InitKernelArgs struct {
	// FeatureSet is the emulated CPU feature set.
	FeatureSet cpuid.FeatureSet

	// Timekeeper manages time for all tasks in the system.
	Timekeeper *Timekeeper

	// RootUserNamespace is the root user namespace.
	RootUserNamespace *auth.UserNamespace

	// RootNetworkNamespace is the root network namespace. If nil, no networking
	// will be available.
	RootNetworkNamespace *inet.Namespace

	// ApplicationCores is the number of logical CPUs visible to sandboxed
	// applications. The set of logical CPU IDs is [0, ApplicationCores); thus
	// ApplicationCores is analogous to Linux's nr_cpu_ids, the index of the
	// most significant bit in cpu_possible_mask + 1.
	ApplicationCores uint

	// If UseHostCores is true, Task.CPU() returns the task goroutine's CPU
	// instead of a virtualized CPU number, and Task.CopyToCPUMask() is a
	// no-op. If ApplicationCores is less than hostcpu.MaxPossibleCPU(), it
	// will be overridden.
	UseHostCores bool

	// ExtraAuxv contains additional auxiliary vector entries that are added to
	// each process by the ELF loader.
	ExtraAuxv []arch.AuxEntry

	// Vdso holds the VDSO and its parameter page.
	Vdso *loader.VDSO

	// RootUTSNamespace is the root UTS namespace.
	RootUTSNamespace *UTSNamespace

	// RootIPCNamespace is the root IPC namespace.
	RootIPCNamespace *IPCNamespace

	// PIDNamespace is the root PID namespace.
	PIDNamespace *PIDNamespace

	// MaxFDLimit specifies the maximum file descriptor number that can be
	// used by processes. If it is zero, the limit will be set to
	// unlimited.
	MaxFDLimit int32
}
InitKernelArgs holds arguments to Init.
type IntervalTimer ¶
type IntervalTimer struct {
// contains filtered or unexported fields
}
IntervalTimer represents a POSIX interval timer as described by timer_create(2).
+stateify savable
func (*IntervalTimer) DestroyTimer ¶
func (it *IntervalTimer) DestroyTimer()
DestroyTimer releases the timer's resources.
func (*IntervalTimer) NotifyTimer ¶
NotifyTimer implements ktime.TimerListener.NotifyTimer.
func (*IntervalTimer) PauseTimer ¶
func (it *IntervalTimer) PauseTimer()
PauseTimer pauses the associated Timer.
func (*IntervalTimer) ResumeTimer ¶
func (it *IntervalTimer) ResumeTimer()
ResumeTimer resumes the associated Timer.
type Kcov ¶
type Kcov struct {
// contains filtered or unexported fields
}
Kcov provides kernel coverage data to userspace through a memory-mapped region, as kcov does in Linux.
To give the illusion that the data is always up to date, we update the shared memory every time before we return to userspace.
func (*Kcov) Clear ¶
Clear resets the mode and clears the owning task and memory mapping for kcov. It is called when the fd corresponding to kcov is closed. Note that the mode needs to be set so that the next call to kcov.TaskWork() will exit early.
func (*Kcov) ConfigureMMap ¶
ConfigureMMap is called by the vfs.FileDescription for this kcov instance to implement vfs.FileDescription.ConfigureMMap.
func (*Kcov) DisableTrace ¶
DisableTrace performs the KCOV_DISABLE_TRACE ioctl.
func (*Kcov) EnableTrace ¶
EnableTrace performs the KCOV_ENABLE_TRACE ioctl.
func (*Kcov) OnTaskExit ¶
func (kcov *Kcov) OnTaskExit()
OnTaskExit is called when the owning task exits. It is similar to kcov.Clear(), except the memory mapping is not cleared, so that the same mapping can be used in the future if kcov is enabled again by another task.
type Kernel ¶
type Kernel struct {
	// Platform is the platform that is used to execute tasks in the created
	// Kernel.
	platform.Platform `state:"nosave"`

	// SpecialOpts contains special kernel options.
	SpecialOpts

	// If set to true, report address space activation waits as if the task is in
	// external wait so that the watchdog doesn't report the task stuck.
	SleepForAddressSpaceActivation bool

	// YAMAPtraceScope is the current level of YAMA ptrace restrictions.
	YAMAPtraceScope atomicbitops.Int32

	// MaxFDLimit specifies the maximum file descriptor number that can be
	// used by processes.
	MaxFDLimit atomicbitops.Int32
	// contains filtered or unexported fields
}
Kernel represents an emulated Linux kernel. It must be initialized by calling Init() or LoadFrom().
+stateify savable
func KernelFromContext ¶
KernelFromContext returns the Kernel in which ctx is executing, or nil if there is no such Kernel.
func (*Kernel) AddCgroupMount ¶
func (k *Kernel) AddCgroupMount(ctl string, mnt *CgroupMount)
AddCgroupMount adds the cgroup mounts to the cgroupMountsMap. These cgroup mounts are created during the creation of root container process and the reference ownership is transferred to the kernel.
func (*Kernel) AddDevGofer ¶
AddDevGofer initializes the dev gofer connection and starts tracking it. It takes ownership of goferFD.
func (*Kernel) ApplicationCores ¶
ApplicationCores returns the number of CPUs visible to sandboxed applications.
func (*Kernel) CPUClockNow ¶
CPUClockNow returns the current value of k.cpuClock.
func (*Kernel) CgroupRegistry ¶
func (k *Kernel) CgroupRegistry() *CgroupRegistry
CgroupRegistry returns the cgroup registry.
func (*Kernel) CreateProcess ¶
func (k *Kernel) CreateProcess(args CreateProcessArgs) (*ThreadGroup, ThreadID, error)
CreateProcess creates a new task in a new thread group with the given options. The new task has no parent and is in the root PID namespace.
If k.Start() has already been called, then the created process must be started by calling kernel.StartProcess(tg).
If k.Start() has not yet been called, then the created task will begin running when k.Start() is called.
CreateProcess has no analogue in Linux; it is used to create the initial application task, as well as processes started by the control server.
func (*Kernel) DeleteSocket ¶
func (k *Kernel) DeleteSocket(sock *vfs.FileDescription)
DeleteSocket removes a socket from the system-wide socket table.
func (*Kernel) EmitUnimplementedEvent ¶
EmitUnimplementedEvent emits an UnimplementedSyscall event via the event channel.
func (*Kernel) FeatureSet ¶
func (k *Kernel) FeatureSet() cpuid.FeatureSet
FeatureSet returns the FeatureSet.
func (*Kernel) GenerateInotifyCookie ¶
GenerateInotifyCookie generates a unique inotify event cookie.
Returned values may overlap with previously returned values if the value space is exhausted. 0 is not a valid cookie value; all other values representable in a uint32 are allowed.
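One way to obtain these properties is an atomic counter that skips zero when it wraps around. The following is a minimal sketch under that assumption; the `cookieGen` type and its `generate` method are hypothetical and are not claimed to be the Kernel's implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cookieGen is a hypothetical generator of non-zero uint32 cookies.
type cookieGen struct {
	next atomic.Uint32
}

// generate returns the next cookie. The counter is allowed to wrap around,
// but 0 is skipped because it is not a valid cookie value.
func (g *cookieGen) generate() uint32 {
	id := g.next.Add(1)
	if id == 0 {
		// The counter wrapped; take the following value instead.
		id = g.next.Add(1)
	}
	return id
}

func main() {
	var g cookieGen
	fmt.Println(g.generate(), g.generate())
}
```

After wraparound the generator starts reusing earlier values, which matches the documented caveat that returned values may overlap once the value space is exhausted.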
func (*Kernel) GetCgroupMount ¶
func (k *Kernel) GetCgroupMount(ctl string) *CgroupMount
GetCgroupMount returns the cgroup mount for the given cgroup controller.
func (*Kernel) GetNamespaceInode ¶
GetNamespaceInode returns a new nsfs inode which serves as a reference counter for the namespace.
func (*Kernel) GetUserCounters ¶
func (k *Kernel) GetUserCounters(uid auth.KUID) *UserCounters
GetUserCounters returns the user counters for the given KUID.
func (*Kernel) GlobalInit ¶
func (k *Kernel) GlobalInit() *ThreadGroup
GlobalInit returns the thread group with ID 1 in the root PID namespace, or nil if no such thread group exists. GlobalInit may return a thread group containing no tasks if the thread group has already exited.
func (*Kernel) Init ¶
func (k *Kernel) Init(args InitKernelArgs) error
Init initializes the Kernel with no tasks.
Callers must manually set Kernel.Platform and call Kernel.SetMemoryFile before calling Init.
func (*Kernel) Kill ¶
func (k *Kernel) Kill(ws linux.WaitStatus)
Kill requests that all tasks in k immediately exit as if group exiting with status ws. Kill does not wait for tasks to exit.
func (*Kernel) ListSockets ¶
func (k *Kernel) ListSockets() []*SocketRecord
ListSockets returns a snapshot of all sockets.
Callers of ListSockets() should use SocketRecord.Sock.TryIncRef() to get a reference on a socket in the table.
func (*Kernel) LoadFrom ¶
func (k *Kernel) LoadFrom(ctx context.Context, r io.Reader, pagesFile *fd.FD, timeReady chan struct{}, net inet.Stack, clocks sentrytime.Clocks, vfsOpts *vfs.CompleteRestoreOptions) error
LoadFrom loads the state of k from r.
func (*Kernel) LoadTaskImage ¶
func (k *Kernel) LoadTaskImage(ctx context.Context, args loader.LoadArgs) (*TaskImage, *syserr.Error)
LoadTaskImage loads a specified file into a new TaskImage.
args.MemoryManager does not need to be set by the caller.
func (*Kernel) MemoryFile ¶
func (k *Kernel) MemoryFile() *pgalloc.MemoryFile
MemoryFile returns the MemoryFile that provides application memory.
func (*Kernel) MonotonicClock ¶
MonotonicClock returns the application CLOCK_MONOTONIC clock.
func (*Kernel) NetlinkPorts ¶
NetlinkPorts returns the netlink port manager.
func (*Kernel) NewFDTable ¶
NewFDTable allocates a new FDTable that may be used by tasks in k.
func (*Kernel) NewThreadGroup ¶
func (k *Kernel) NewThreadGroup(pidns *PIDNamespace, sh *SignalHandlers, terminationSignal linux.Signal, limits *limits.LimitSet) *ThreadGroup
NewThreadGroup returns a new, empty thread group in PID namespace pidns. The thread group leader will send its parent terminationSignal when it exits. The new thread group isn't visible to the system until a task has been created inside of it by a successful call to TaskSet.NewTask.
func (*Kernel) Pause ¶
func (k *Kernel) Pause()
Pause requests that all tasks in k temporarily stop executing, and blocks until all tasks and asynchronous I/O operations in k have stopped. Multiple calls to Pause nest and require an equal number of calls to Unpause to resume execution.
func (*Kernel) PopulateNewCgroupHierarchy ¶
PopulateNewCgroupHierarchy moves all tasks into a newly created cgroup hierarchy.
Precondition: root must be a new cgroup with no tasks. This implies the controllers for root are also new and currently manage no tasks, which in turn implies the new cgroup can be populated without migrating tasks between cgroups.
func (*Kernel) RealtimeClock ¶
RealtimeClock returns the application CLOCK_REALTIME clock.
func (*Kernel) RebuildTraceContexts ¶
func (k *Kernel) RebuildTraceContexts()
RebuildTraceContexts rebuilds the trace context for all tasks.
Unfortunately, if these are built while tracing is not enabled, then we will not have meaningful trace data. Rebuilding here ensures that we can do so after tracing has been enabled.
func (*Kernel) ReceiveTaskStates ¶
func (k *Kernel) ReceiveTaskStates()
ReceiveTaskStates receives full states for all tasks.
func (*Kernel) RecordSocket ¶
func (k *Kernel) RecordSocket(sock *vfs.FileDescription)
RecordSocket adds a socket to the system-wide socket table for tracking.
Precondition: Caller must hold a reference to sock.
Note that the socket table will not hold a reference on the vfs.FileDescription.
func (*Kernel) RegisterContainerName ¶
RegisterContainerName registers a container name for a given container ID.
func (*Kernel) Release ¶
func (k *Kernel) Release()
Release releases resources owned by k.
Precondition: This should only be called after the kernel is fully initialized, e.g. after k.Start() has been called.
func (*Kernel) ReleaseCgroupHierarchy ¶
ReleaseCgroupHierarchy moves all tasks out of all cgroups belonging to the hierarchy with the provided id. This is intended for use during hierarchy teardown, as otherwise the tasks would be orphaned with respect to some controllers.
func (*Kernel) RemoveDevGofer ¶
RemoveDevGofer closes the dev gofer connection, if one exists, and stops tracking it.
func (*Kernel) ReplaceFSContextRoots ¶
func (k *Kernel) ReplaceFSContextRoots(ctx context.Context, oldRoot vfs.VirtualDentry, newRoot vfs.VirtualDentry)
ReplaceFSContextRoots updates root and cwd to `newRoot` in the FSContext across all tasks whose old root or cwd were `oldRoot`.
func (*Kernel) RestoreContainerMapping ¶
RestoreContainerMapping remaps old container IDs to new ones after a restore. containerIDs maps "name -> new container ID". Note that container names remain constant between restore sessions.
func (*Kernel) RootIPCNamespace ¶
func (k *Kernel) RootIPCNamespace() *IPCNamespace
RootIPCNamespace takes a reference and returns the root IPCNamespace.
func (*Kernel) RootNetworkNamespace ¶
RootNetworkNamespace returns the root network namespace, always non-nil.
func (*Kernel) RootPIDNamespace ¶
func (k *Kernel) RootPIDNamespace() *PIDNamespace
RootPIDNamespace returns the root PIDNamespace.
func (*Kernel) RootUTSNamespace ¶
func (k *Kernel) RootUTSNamespace() *UTSNamespace
RootUTSNamespace returns the root UTSNamespace.
func (*Kernel) RootUserNamespace ¶
func (k *Kernel) RootUserNamespace() *auth.UserNamespace
RootUserNamespace returns the root UserNamespace.
func (*Kernel) SaveStatus ¶
SaveStatus returns the sandbox save status. If it was saved successfully, autosaved indicates whether save was triggered by autosave. If it was not saved successfully, err indicates the sandbox error that caused the kernel to exit during save.
func (*Kernel) SaveTo ¶
SaveTo saves the state of k to w.
Preconditions: The kernel must be paused throughout the call to SaveTo.
func (*Kernel) SendContainerSignal ¶
func (k *Kernel) SendContainerSignal(cid string, info *linux.SignalInfo) error
SendContainerSignal sends the given signal to all processes inside the namespace that match the given container ID.
func (*Kernel) SendExternalSignal ¶
func (k *Kernel) SendExternalSignal(info *linux.SignalInfo, context string)
SendExternalSignal injects a signal into the kernel.
context is used only for debugging to describe how the signal was received.
Preconditions: Kernel must have an init process.
func (*Kernel) SendExternalSignalProcessGroup ¶
func (k *Kernel) SendExternalSignalProcessGroup(pg *ProcessGroup, info *linux.SignalInfo) error
SendExternalSignalProcessGroup sends a signal to all ThreadGroups in the given process group.
This function doesn't skip signals like SendExternalSignal does.
func (*Kernel) SendExternalSignalThreadGroup ¶
func (k *Kernel) SendExternalSignalThreadGroup(tg *ThreadGroup, info *linux.SignalInfo) error
SendExternalSignalThreadGroup injects a signal into a specific ThreadGroup.
This function doesn't skip signals like SendExternalSignal does.
func (*Kernel) SetHostMount ¶
SetHostMount sets the hostfs mount.
func (*Kernel) SetMemoryFile ¶
func (k *Kernel) SetMemoryFile(mf *pgalloc.MemoryFile)
SetMemoryFile sets Kernel.mf. SetMemoryFile must be called before Init or LoadFrom.
func (*Kernel) SetSaveError ¶
SetSaveError sets the sandbox error that caused the kernel to exit during save, if one is not already set.
func (*Kernel) SetSaveSuccess ¶
SetSaveSuccess sets the flag indicating that save completed successfully, if no status was already set.
func (*Kernel) SocketMount ¶
SocketMount returns the sockfs mount.
func (*Kernel) Start ¶
Start starts execution of all tasks in k.
Preconditions: Start may be called exactly once.
func (*Kernel) StartProcess ¶
func (k *Kernel) StartProcess(tg *ThreadGroup)
StartProcess starts running a process that was created with CreateProcess.
func (*Kernel) SupervisorContext ¶
SupervisorContext returns a Context with maximum privileges in k. It should only be used by goroutines outside the control of the emulated kernel defined by k.
Callers are responsible for ensuring that the returned Context is not used concurrently with changes to the Kernel.
func (*Kernel) TaskContainerName ¶
TaskContainerName returns the container name for a given task.
func (*Kernel) TestOnlySetGlobalInit ¶
func (k *Kernel) TestOnlySetGlobalInit(tg *ThreadGroup)
TestOnlySetGlobalInit sets the thread group with ID 1 in the root PID namespace.
func (*Kernel) Timekeeper ¶
func (k *Kernel) Timekeeper() *Timekeeper
Timekeeper returns the Timekeeper.
func (*Kernel) Unpause ¶
func (k *Kernel) Unpause()
Unpause ends the effect of a previous call to Pause. If Unpause is called without a matching preceding call to Pause, Unpause may panic.
func (*Kernel) VFS ¶
func (k *Kernel) VFS() *vfs.VirtualFilesystem
VFS returns the virtual filesystem for the kernel.
func (*Kernel) WaitExited ¶
func (k *Kernel) WaitExited()
WaitExited blocks until all tasks in k have exited.
type OldRSeqCriticalRegion ¶
type OldRSeqCriticalRegion struct {
	// When a task in this thread group has its CPU preempted (as defined by
	// platform.ErrContextCPUPreempted) or has a signal delivered to an
	// application handler while its instruction pointer is in CriticalSection,
	// set the instruction pointer to Restart and application register r10 (on
	// amd64) to the former instruction pointer.
	CriticalSection hostarch.AddrRange
	Restart         hostarch.Addr
}
OldRSeqCriticalRegion describes an old rseq critical region.
+stateify savable
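The restart rule described in the struct comment can be sketched as follows. This is a minimal illustration, not the package's implementation: addr, addrRange, regs, and onPreempt are hypothetical stand-ins, and the range is assumed to be half-open (Start inclusive, End exclusive).

```go
package main

import "fmt"

// addr and addrRange are hypothetical stand-ins for hostarch.Addr and
// hostarch.AddrRange. The range is assumed to be [Start, End).
type addr uintptr

type addrRange struct {
	Start, End addr
}

func (r addrRange) contains(a addr) bool { return r.Start <= a && a < r.End }

// criticalRegion mirrors the two fields of OldRSeqCriticalRegion.
type criticalRegion struct {
	CriticalSection addrRange
	Restart         addr
}

// regs is a hypothetical register file holding only the two registers that
// matter for this sketch.
type regs struct {
	IP  addr
	R10 addr
}

// onPreempt applies the documented rule: if the instruction pointer is inside
// the critical section, save it in r10 and jump to Restart. It reports
// whether a restart happened.
func onPreempt(cr criticalRegion, r *regs) bool {
	if !cr.CriticalSection.contains(r.IP) {
		return false
	}
	r.R10 = r.IP
	r.IP = cr.Restart
	return true
}

func main() {
	cr := criticalRegion{
		CriticalSection: addrRange{Start: 0x1000, End: 0x1040},
		Restart:         0x2000,
	}
	r := regs{IP: 0x1010}
	restarted := onPreempt(cr, &r)
	fmt.Printf("restarted=%v ip=%#x r10=%#x\n", restarted, r.IP, r.R10)
}
```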
type PIDNamespace ¶
type PIDNamespace struct {
// contains filtered or unexported fields
}
A PIDNamespace represents a PID namespace, a bimap between thread IDs and tasks. See the pid_namespaces(7) man page for further details.
N.B. A task is said to be visible in a PID namespace if the PID namespace contains a thread ID that maps to that task.
+stateify savable
func NewRootPIDNamespace ¶
func NewRootPIDNamespace(userns *auth.UserNamespace) *PIDNamespace
NewRootPIDNamespace creates the root PID namespace. 'owner' is not available yet when the root namespace is created and must be set by the caller.
func PIDNamespaceFromContext ¶
func PIDNamespaceFromContext(ctx context.Context) *PIDNamespace
PIDNamespaceFromContext returns the PID namespace in which ctx is executing, or nil if there is no such PID namespace.
func (*PIDNamespace) ID ¶
func (ns *PIDNamespace) ID() uint64
ID returns a non-zero ID that is unique across PID namespaces.
func (*PIDNamespace) IDOfProcessGroup ¶
func (ns *PIDNamespace) IDOfProcessGroup(pg *ProcessGroup) ProcessGroupID
IDOfProcessGroup returns the process group ID assigned to pg in PID namespace ns.
The same constraints apply as IDOfSession.
func (*PIDNamespace) IDOfSession ¶
func (ns *PIDNamespace) IDOfSession(s *Session) SessionID
IDOfSession returns the session ID assigned to s in PID namespace ns.
If the session isn't visible in this namespace, zero will be returned. It is the caller's responsibility to check that before using this function.
func (*PIDNamespace) IDOfTask ¶
func (ns *PIDNamespace) IDOfTask(t *Task) ThreadID
IDOfTask returns the TID assigned to the given task in PID namespace ns. If the task is not visible in that namespace, IDOfTask returns 0. (This return value is significant in some cases, e.g. getppid() is documented as returning 0 if the caller's parent is in an ancestor namespace and consequently not visible to the caller.) If the task is nil, IDOfTask returns 0.
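The bimap and visibility rules above can be illustrated with a small model. This is a sketch under stated assumptions: pidNamespace, task, threadID, add, idOfTask, and taskWithID are hypothetical names, and the model omits all locking and namespace hierarchy that the real type needs.

```go
package main

import "fmt"

type threadID int32

type task struct{ name string }

// pidNamespace is a hypothetical, unsynchronized bimap between thread IDs
// and tasks. A task is visible in the namespace if it has an entry in tids.
type pidNamespace struct {
	tasks map[threadID]*task
	tids  map[*task]threadID
}

func newPIDNamespace() *pidNamespace {
	return &pidNamespace{
		tasks: make(map[threadID]*task),
		tids:  make(map[*task]threadID),
	}
}

// add records the mapping in both directions.
func (ns *pidNamespace) add(tid threadID, t *task) {
	ns.tasks[tid] = t
	ns.tids[t] = tid
}

// idOfTask returns 0 if t is nil or not visible in ns.
func (ns *pidNamespace) idOfTask(t *task) threadID {
	if t == nil {
		return 0
	}
	return ns.tids[t] // zero value when t has no entry
}

// taskWithID returns nil if no task has the given thread ID.
func (ns *pidNamespace) taskWithID(tid threadID) *task {
	return ns.tasks[tid]
}

func main() {
	root := newPIDNamespace()
	child := newPIDNamespace()
	parent := &task{name: "parent"}
	kid := &task{name: "kid"}

	// The kid is visible in both namespaces; the parent only in root.
	root.add(1, parent)
	root.add(2, kid)
	child.add(1, kid)

	fmt.Println(root.idOfTask(kid), child.idOfTask(kid), child.idOfTask(parent), child.idOfTask(nil))
}
```

The third lookup models the getppid() case mentioned above: the parent lives in an ancestor namespace, so its ID as seen from the child namespace is 0.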
func (*PIDNamespace) IDOfThreadGroup ¶
func (ns *PIDNamespace) IDOfThreadGroup(tg *ThreadGroup) ThreadID
IDOfThreadGroup returns the TID assigned to tg's leader in PID namespace ns. If the task is not visible in that namespace, IDOfThreadGroup returns 0.
func (*PIDNamespace) NewChild ¶
func (ns *PIDNamespace) NewChild(userns *auth.UserNamespace) *PIDNamespace
NewChild returns a new, empty PID namespace that is a child of ns. Authority over the new PID namespace is controlled by userns.
func (*PIDNamespace) NumTasks ¶
func (ns *PIDNamespace) NumTasks() int
NumTasks returns the number of tasks in ns.
func (*PIDNamespace) NumTasksPerContainer ¶
func (ns *PIDNamespace) NumTasksPerContainer(cid string) int
NumTasksPerContainer returns the number of tasks in ns that belong to the given container.
func (*PIDNamespace) ProcessGroupWithID ¶
func (ns *PIDNamespace) ProcessGroupWithID(id ProcessGroupID) *ProcessGroup
ProcessGroupWithID returns the ProcessGroup with the given ID in the PID namespace ns, or nil if that given ID is not defined in this namespace.
A reference is not taken on the process group.
func (*PIDNamespace) Root ¶
func (ns *PIDNamespace) Root() *PIDNamespace
Root returns the root PID namespace of ns.
func (*PIDNamespace) SessionWithID ¶
func (ns *PIDNamespace) SessionWithID(id SessionID) *Session
SessionWithID returns the Session with the given ID in the PID namespace ns, or nil if that given ID is not defined in this namespace.
A reference is not taken on the session.
func (*PIDNamespace) TaskWithID ¶
func (ns *PIDNamespace) TaskWithID(tid ThreadID) *Task
TaskWithID returns the task with thread ID tid in PID namespace ns. If no task has that TID, TaskWithID returns nil.
func (*PIDNamespace) Tasks ¶
func (ns *PIDNamespace) Tasks() []*Task
Tasks returns a snapshot of the tasks in ns.
func (*PIDNamespace) ThreadGroupWithID ¶
func (ns *PIDNamespace) ThreadGroupWithID(tid ThreadID) *ThreadGroup
ThreadGroupWithID returns the thread group led by the task with thread ID tid in PID namespace ns. If no task has that TID, or if the task with that TID is not a thread group leader, ThreadGroupWithID returns nil.
func (*PIDNamespace) ThreadGroups ¶
func (ns *PIDNamespace) ThreadGroups() []*ThreadGroup
ThreadGroups returns a snapshot of the thread groups in ns.
func (*PIDNamespace) ThreadGroupsAppend ¶
func (ns *PIDNamespace) ThreadGroupsAppend(tgs []*ThreadGroup) []*ThreadGroup
ThreadGroupsAppend appends a snapshot of the thread groups in ns to tgs.
func (*PIDNamespace) UserNamespace ¶
func (ns *PIDNamespace) UserNamespace() *auth.UserNamespace
UserNamespace returns the user namespace associated with PID namespace ns.
type ProcessGroup ¶
type ProcessGroup struct {
// contains filtered or unexported fields
}
ProcessGroup contains an originator threadgroup and a parent Session.
+stateify savable
func (*ProcessGroup) IsOrphan ¶
func (pg *ProcessGroup) IsOrphan() bool
IsOrphan returns true if this process group is an orphan.
func (*ProcessGroup) Originator ¶
func (pg *ProcessGroup) Originator() *ThreadGroup
Originator returns the originator of the process group.
func (*ProcessGroup) SendSignal ¶
func (pg *ProcessGroup) SendSignal(info *linux.SignalInfo) error
SendSignal sends a signal to all processes inside the process group. It is analogous to kernel/signal.c:kill_pgrp.
func (*ProcessGroup) Session ¶
func (pg *ProcessGroup) Session() *Session
Session returns the process group's session without taking a reference.
type Session ¶
type Session struct {
	SessionRefs
	// contains filtered or unexported fields
}
Session contains a leader threadgroup and a list of ProcessGroups.
+stateify savable
type SignalAction ¶
type SignalAction int
SignalAction is an internal signal action.
const (
	SignalActionTerm SignalAction = iota
	SignalActionCore
	SignalActionStop
	SignalActionIgnore
	SignalActionHandler
)
Available signal actions. Note that although we refer to the complete set internally, the application is only capable of using the Default and Ignore actions from the system call interface.
type SignalHandlers ¶
type SignalHandlers struct {
// contains filtered or unexported fields
}
SignalHandlers holds information about signal actions.
+stateify savable
func NewSignalHandlers ¶
func NewSignalHandlers() *SignalHandlers
NewSignalHandlers returns a new SignalHandlers specifying all default actions.
func (*SignalHandlers) CopyForExec ¶
func (sh *SignalHandlers) CopyForExec() *SignalHandlers
CopyForExec returns a copy of sh for a thread group that is undergoing an execve. (See comments in Task.finishExec.)
func (*SignalHandlers) Fork ¶
func (sh *SignalHandlers) Fork() *SignalHandlers
Fork returns a copy of sh for a new thread group.
type SocketRecord ¶
type SocketRecord struct {
	Sock *vfs.FileDescription
	ID   uint64 // Socket table entry number.
	// contains filtered or unexported fields
}
SocketRecord represents a socket recorded in Kernel.sockets.
+stateify savable
type SpecialOpts ¶
type SpecialOpts struct{}
SpecialOpts contains non-standard options for the kernel.
+stateify savable
type Stracer ¶
type Stracer interface {
	// SyscallEnter is called on syscall entry.
	//
	// The returned private data is passed to SyscallExit.
	SyscallEnter(t *Task, sysno uintptr, args arch.SyscallArguments, flags uint32) any

	// SyscallExit is called on syscall exit.
	SyscallExit(context any, t *Task, sysno, rval uintptr, err error)
}
Stracer traces syscall execution.
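The enter/exit pairing can be sketched with a reduced interface. The stracer interface below is a hypothetical simplification (it drops the *Task, argument, and flags parameters), and seqStracer and invoke are illustrative names only; the point is that whatever SyscallEnter returns is handed back unchanged to SyscallExit.

```go
package main

import "fmt"

// stracer is a hypothetical, reduced form of the Stracer interface.
type stracer interface {
	SyscallEnter(sysno uintptr) any
	SyscallExit(context any, sysno, rval uintptr, err error)
}

// seqStracer tags each syscall with a sequence number on entry and uses that
// number as its private data on exit.
type seqStracer struct {
	next int
	log  []string
}

func (s *seqStracer) SyscallEnter(sysno uintptr) any {
	s.next++
	return s.next
}

func (s *seqStracer) SyscallExit(context any, sysno, rval uintptr, err error) {
	seq := context.(int) // the value returned by SyscallEnter
	s.log = append(s.log, fmt.Sprintf("#%d sys_%d = %d, err=%v", seq, sysno, rval, err))
}

// invoke shows where a dispatcher would call the two hooks around fn.
func invoke(tr stracer, sysno uintptr, fn func() (uintptr, error)) (uintptr, error) {
	priv := tr.SyscallEnter(sysno)
	rval, err := fn()
	tr.SyscallExit(priv, sysno, rval, err)
	return rval, err
}

func main() {
	tr := &seqStracer{}
	invoke(tr, 39, func() (uintptr, error) { return 1234, nil })
	invoke(tr, 1, func() (uintptr, error) { return 5, nil })
	for _, line := range tr.log {
		fmt.Println(line)
	}
}
```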
type Syscall ¶
type Syscall struct {
	// Name is the syscall name.
	Name string

	// Fn is the implementation of the syscall.
	Fn SyscallFn

	// SupportLevel is the level of support implemented in gVisor.
	SupportLevel SyscallSupportLevel

	// Note describes the compatibility of the syscall.
	Note string

	// URLs is set of URLs to any relevant bugs or issues.
	URLs []string

	// PointCallback is an optional callback that converts syscall arguments
	// to a proto that can be used with seccheck.Sink.
	// Callback functions must follow this naming convention:
	// PointSyscallNameInCamelCase, e.g. PointReadat, PointRtSigaction.
	PointCallback SyscallToProto
}
Syscall includes the syscall implementation and compatibility information.
type SyscallControl ¶
type SyscallControl struct {
// contains filtered or unexported fields
}
SyscallControl is returned by syscalls to control the behavior of Task.doSyscallInvoke.
type SyscallFlagsTable ¶
type SyscallFlagsTable struct {
// contains filtered or unexported fields
}
SyscallFlagsTable manages a set of enable/disable bit fields on a per-syscall basis.
func (*SyscallFlagsTable) Enable ¶
func (e *SyscallFlagsTable) Enable(bit uint32, s map[uintptr]bool, missingEnable bool)
Enable sets enable bit `bit` for all syscalls based on s.
Syscalls missing from `s` are disabled.
Syscalls missing from the initial table passed to Init cannot be added as individual syscalls. If present in s they will be ignored.
Callers to Word may see either the old or new value while this function is executing.
func (*SyscallFlagsTable) EnableAll ¶
func (e *SyscallFlagsTable) EnableAll(bit uint32)
EnableAll sets enable bit `bit` for all syscalls, present and missing.
func (*SyscallFlagsTable) UpdateSecCheck ¶
func (e *SyscallFlagsTable) UpdateSecCheck(state *seccheck.State)
UpdateSecCheck implements seccheck.SyscallFlagListener.
It is called when per-syscall seccheck event enablement changes.
func (*SyscallFlagsTable) Word ¶
func (e *SyscallFlagsTable) Word(sysno uintptr) uint32
Word returns the enable bitfield for sysno.
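A simplified model of the Enable and Word semantics follows. It is a sketch, not the actual implementation: flagsTable, newFlagsTable, setBit, enableBit, and word are hypothetical names, `bit` is treated as a bit mask, and a single shared word is assumed for syscalls missing from the initial table. Each word is updated atomically, so a concurrent reader sees either the old or the new value of a given word.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flagsTable is a hypothetical, simplified model of SyscallFlagsTable. Each
// known syscall has one word of enable bits; all unknown syscalls share
// missingEnable.
type flagsTable struct {
	enable        map[uintptr]*uint32 // key set is fixed at construction
	missingEnable uint32
}

func newFlagsTable(sysnos ...uintptr) *flagsTable {
	e := &flagsTable{enable: make(map[uintptr]*uint32, len(sysnos))}
	for _, n := range sysnos {
		e.enable[n] = new(uint32)
	}
	return e
}

// setBit atomically sets or clears the mask bit in *w.
func setBit(w *uint32, bit uint32, on bool) {
	for {
		old := atomic.LoadUint32(w)
		updated := old &^ bit
		if on {
			updated = old | bit
		}
		if atomic.CompareAndSwapUint32(w, old, updated) {
			return
		}
	}
}

// enableBit sets bit for known syscalls listed as true in s and clears it for
// the rest. Entries in s for unknown syscalls are ignored.
func (e *flagsTable) enableBit(bit uint32, s map[uintptr]bool, missingEnable bool) {
	for sysno, w := range e.enable {
		setBit(w, bit, s[sysno]) // absent from s reads as false
	}
	setBit(&e.missingEnable, bit, missingEnable)
}

// word returns the enable bits for sysno.
func (e *flagsTable) word(sysno uintptr) uint32 {
	if w, ok := e.enable[sysno]; ok {
		return atomic.LoadUint32(w)
	}
	return atomic.LoadUint32(&e.missingEnable)
}

func main() {
	const traceBit = 1 << 0 // hypothetical feature bit
	e := newFlagsTable(0, 1)
	e.enableBit(traceBit, map[uintptr]bool{0: true, 7: true}, false)
	fmt.Println(e.word(0), e.word(1), e.word(7))
}
```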
type SyscallFn ¶
type SyscallFn func(t *Task, sysno uintptr, args arch.SyscallArguments) (uintptr, *SyscallControl, error)
SyscallFn is a syscall implementation.
type SyscallInfo ¶
type SyscallInfo struct {
	Exit  bool
	Sysno uintptr
	Args  arch.SyscallArguments
	Rval  uintptr
	Errno int
}
SyscallInfo provides generic information about the syscall.
type SyscallRestartBlock ¶
SyscallRestartBlock represents the restart block for a syscall restartable with a custom function. It encapsulates the state required to restart a syscall across a save/restore (S/R).
type SyscallSupportLevel ¶
type SyscallSupportLevel int
SyscallSupportLevel is a syscall support level.
func (SyscallSupportLevel) String ¶
func (l SyscallSupportLevel) String() string
String returns a human readable representation of the support level.
type SyscallTable ¶
type SyscallTable struct {
	// OS is the operating system that this syscall table implements.
	OS abi.OS

	// Arch is the architecture that this syscall table targets.
	Arch arch.Arch

	// The OS version that this syscall table implements.
	Version Version

	// AuditNumber is a numeric constant that represents the syscall table. If
	// non-zero, auditNumber must be one of the AUDIT_ARCH_* values defined by
	// linux/audit.h.
	AuditNumber uint32

	// Table is the collection of functions.
	Table map[uintptr]Syscall

	// Emulate is a collection of instruction addresses to emulate. The
	// keys are addresses, and the values are system call numbers.
	Emulate map[hostarch.Addr]uintptr

	// The function to call in case of a missing system call.
	Missing MissingFn

	// Stracer traces this syscall table.
	Stracer Stracer

	// External is used to handle an external callback.
	External func(*Kernel)

	// ExternalFilterBefore is called before External is called before the syscall is executed.
	// External is not called if it returns false.
	ExternalFilterBefore func(*Task, uintptr, arch.SyscallArguments) bool

	// ExternalFilterAfter is called before External is called after the syscall is executed.
	// External is not called if it returns false.
	ExternalFilterAfter func(*Task, uintptr, arch.SyscallArguments) bool

	// FeatureEnable stores the strace and one-shot enable bits.
	FeatureEnable SyscallFlagsTable
	// contains filtered or unexported fields
}
SyscallTable is a lookup table of system calls.
Note that a SyscallTable is not savable directly. Instead, it is saved as an OS/Arch pair and lookup happens again on restore.
func LookupSyscallTable ¶
LookupSyscallTable returns the SyscallTable for the OS/Arch combination.
func SyscallTables ¶
func SyscallTables() []*SyscallTable
SyscallTables returns a read-only slice of registered SyscallTables.
func (*SyscallTable) Init ¶
func (s *SyscallTable) Init()
Init initializes the system call table.
This should normally be called only during registration.
func (*SyscallTable) Lookup ¶
func (s *SyscallTable) Lookup(sysno uintptr) SyscallFn
Lookup returns the syscall implementation, if one exists.
func (*SyscallTable) LookupEmulate ¶
func (s *SyscallTable) LookupEmulate(addr hostarch.Addr) (uintptr, bool)
LookupEmulate looks up an emulation syscall number.
func (*SyscallTable) LookupName ¶
func (s *SyscallTable) LookupName(sysno uintptr) string
LookupName looks up a syscall name.
func (*SyscallTable) LookupNo ¶
func (s *SyscallTable) LookupNo(name string) (uintptr, error)
LookupNo looks up a syscall number by name.
func (*SyscallTable) LookupSyscallToProto ¶
func (s *SyscallTable) LookupSyscallToProto(sysno uintptr) SyscallToProto
LookupSyscallToProto looks up the SyscallToProto callback for the given syscall. It may return nil if none is registered.
func (*SyscallTable) MaxSysno ¶
func (s *SyscallTable) MaxSysno() (max uintptr)
MaxSysno returns the largest system call number.
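The lookup methods above can be pictured with a small map-backed table. This is a sketch under stated assumptions: syscallTable, syscallEntry, syscallFn, and build are hypothetical, the reverse name index and cached maximum are one possible way to serve LookupNo and MaxSysno, and the real SyscallFn takes a *Task and arch.SyscallArguments rather than a plain array.

```go
package main

import "fmt"

// syscallFn is a hypothetical, reduced form of SyscallFn.
type syscallFn func(args [6]uintptr) (uintptr, error)

type syscallEntry struct {
	Name string
	Fn   syscallFn
}

// syscallTable is a hypothetical model: Table is filled in by the caller, and
// build derives a name index and the largest syscall number from it.
type syscallTable struct {
	Table  map[uintptr]syscallEntry
	byName map[string]uintptr
	max    uintptr
}

// build precomputes the derived lookups, in the spirit of Init.
func (s *syscallTable) build() {
	s.byName = make(map[string]uintptr, len(s.Table))
	for no, sc := range s.Table {
		s.byName[sc.Name] = no
		if no > s.max {
			s.max = no
		}
	}
}

// lookup returns the implementation, or nil if sysno is not in the table.
func (s *syscallTable) lookup(sysno uintptr) syscallFn {
	return s.Table[sysno].Fn
}

// lookupNo maps a syscall name back to its number.
func (s *syscallTable) lookupNo(name string) (uintptr, error) {
	no, ok := s.byName[name]
	if !ok {
		return 0, fmt.Errorf("syscall %q not found", name)
	}
	return no, nil
}

func (s *syscallTable) maxSysno() uintptr { return s.max }

func main() {
	getpid := func(args [6]uintptr) (uintptr, error) { return 42, nil }
	tbl := &syscallTable{Table: map[uintptr]syscallEntry{
		0:  {Name: "read"},
		1:  {Name: "write"},
		39: {Name: "getpid", Fn: getpid},
	}}
	tbl.build()

	no, err := tbl.lookupNo("getpid")
	fmt.Println(no, err, tbl.maxSysno(), tbl.lookup(999) == nil)
}
```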
type SyscallToProto ¶
type SyscallToProto func(*Task, seccheck.FieldSet, *pb.ContextData, SyscallInfo) (proto.Message, pb.MessageType)
SyscallToProto is a callback function that converts generic syscall data to schematized protobuf for the corresponding syscall.
type TTY ¶
type TTY struct {
	// Index is the terminal index. It is immutable.
	Index uint32
	// contains filtered or unexported fields
}
TTY defines the relationship between a thread group and its controlling terminal.
+stateify savable
func (*TTY) SignalForegroundProcessGroup ¶
func (tty *TTY) SignalForegroundProcessGroup(info *linux.SignalInfo)
SignalForegroundProcessGroup sends the signal to the foreground process group of the TTY.
type Task ¶
type Task struct {
	// Origin is the origin of the task.
	Origin TaskOrigin
	// contains filtered or unexported fields
}
Task represents a thread of execution in the untrusted app. It includes registers and any thread-specific state that you would normally expect.
Each task is associated with a goroutine, called the task goroutine, that executes code (application code, system calls, etc.) on behalf of that task. See Task.run (task_run.go).
All fields that are "owned by the task goroutine" can only be mutated by the task goroutine while it is running. The task goroutine does not require synchronization to read these fields, although it still requires synchronization as described for those fields to mutate them.
All fields that are "exclusive to the task goroutine" can only be accessed by the task goroutine while it is running. The task goroutine does not require synchronization to read or write these fields.
+stateify savable
func TaskFromContext ¶
TaskFromContext returns the Task associated with ctx, or nil if there is no such Task.
func (*Task) Activate ¶
func (t *Task) Activate()
Activate ensures that the task has an active address space.
func (*Task) AppendSyscallFilter ¶
AppendSyscallFilter adds BPF program p as a system call filter.
Preconditions: The caller must be running on the task goroutine.
func (*Task) Arch ¶
Arch returns t's arch.Context64.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) AsyncContext ¶
AsyncContext returns a context.Context representing t. The returned context.Context is intended for use by goroutines other than t's task goroutine; for example, signal delivery to t will not interrupt goroutines that are blocking using the returned context.Context.
func (*Task) BeginExternalStop ¶
func (t *Task) BeginExternalStop()
BeginExternalStop indicates the start of an external stop that applies to t. BeginExternalStop does not wait for t's task goroutine to stop.
func (*Task) BlockWithDeadline ¶
BlockWithDeadline blocks t until it is woken by an event, the application monotonic clock indicates a time of deadline (only if haveDeadline is true), or t is interrupted. It returns nil if an event is received from C, ETIMEDOUT if the deadline expired, and linuxerr.ErrInterrupted if t is interrupted.
Preconditions: The caller must be running on the task goroutine.
func (*Task) BlockWithDeadlineFrom ¶
func (t *Task) BlockWithDeadlineFrom(C <-chan struct{}, clock ktime.Clock, haveDeadline bool, deadline ktime.Time) error
BlockWithDeadlineFrom is similar to BlockWithDeadline, except it uses the passed clock (instead of application monotonic clock).
Most clients should use BlockWithDeadline or BlockWithTimeout instead.
Preconditions: The caller must be running on the task goroutine.
func (*Task) BlockWithTimeout ¶
func (t *Task) BlockWithTimeout(C chan struct{}, haveTimeout bool, timeout time.Duration) (time.Duration, error)
BlockWithTimeout blocks t until an event is received from C, the application monotonic clock indicates that timeout has elapsed (only if haveTimeout is true), or t is interrupted. It returns:
- The remaining timeout, which is guaranteed to be 0 if the timeout expired, and is unspecified if haveTimeout is false.
- An error which is nil if an event is received from C, ETIMEDOUT if the timeout expired, and linuxerr.ErrInterrupted if t is interrupted.
Preconditions: The caller must be running on the task goroutine.
func (*Task) BlockWithTimeoutOn ¶
func (t *Task) BlockWithTimeoutOn(w waiter.Waitable, mask waiter.EventMask, timeout time.Duration) (time.Duration, bool)
BlockWithTimeoutOn implements context.Context.BlockWithTimeoutOn.
func (*Task) CPUClock ¶
CPUClock returns a clock measuring the CPU time the task has spent executing application and "kernel" code.
func (*Task) CanTrace ¶
CanTrace checks that t is permitted to access target's state, as defined by ptrace(2), subsection "Ptrace access mode checking". If attach is true, it checks for access mode PTRACE_MODE_ATTACH; otherwise, it checks for access mode PTRACE_MODE_READ.
In Linux, ptrace access restrictions may be configured by LSMs. While we do not support LSMs, we do add additional restrictions based on the commoncap and YAMA LSMs.
TODO(gvisor.dev/issue/212): The result of CanTrace is immediately stale (e.g., a racing setuid(2) may change traceability). This may pose a risk when a task changes from traceable to not traceable. This is only problematic across execve, where privileges may increase.
We currently do not implement privileged executables (set-user/group-ID bits and file capabilities), so that case is not reachable.
func (*Task) CgroupPrepareMigrate ¶
func (t *Task) CgroupPrepareMigrate(dst Cgroup) (*CgroupMigrationContext, error)
CgroupPrepareMigrate starts a cgroup migration for this task to dst. The migration must be completed through the returned context.
func (*Task) ChargeFor ¶
func (t *Task) ChargeFor(other *Task, ctl CgroupControllerType, res CgroupResourceType, value int64) (bool, Cgroup, error)
ChargeFor charges t's cgroup on behalf of some other task. Returns the cgroup that's charged if any. Returned cgroup has an extra ref that's transferred to the caller.
func (*Task) ClearRSeq ¶
ClearRSeq unregisters addr as this thread's rseq structure.
Preconditions: The caller must be running on the task goroutine.
func (*Task) ClearYAMAException ¶
func (t *Task) ClearYAMAException()
ClearYAMAException removes any YAMA exception with t as the tracee.
func (*Task) Clone ¶
Clone implements the clone(2) syscall and returns the thread ID of the new task in t's PID namespace. Clone may return both a non-zero thread ID and a non-nil error.
Preconditions: The caller must be running Task.doSyscallInvoke on the task goroutine.
func (*Task) CompareAndSwapUint32 ¶
CompareAndSwapUint32 implements futex.Target.CompareAndSwapUint32.
func (*Task) ContainerID ¶
ContainerID returns t's container ID.
func (*Task) CopyContext ¶
CopyContext returns a marshal.CopyContext that copies to/from t's address space using opts.
func (*Task) CopyInBytes ¶
CopyInBytes is a fast version of CopyIn if the caller can serialize the data without reflection and pass in a byte slice.
This Task's AddressSpace must be active.
func (*Task) CopyInIovecs ¶
CopyInIovecs copies in IoVecs for Task.
Preconditions: Same as usermem.IO.CopyIn, plus:
- The caller must be running on the task goroutine.
- t's AddressSpace must be active.
func (*Task) CopyInIovecsAsSlice ¶
CopyInIovecsAsSlice copies in IoVecs and returns them in a slice.
Preconditions: Same as usermem.IO.CopyIn, plus:
- The caller must be running on the task goroutine or hold t.mu.
- t's AddressSpace must be active.
func (*Task) CopyInString ¶
CopyInString copies a NUL-terminated string of length at most maxlen in from the task's memory. The copy will fail with syscall.EFAULT if it traverses user memory that is unmapped or not readable by the user.
This Task's AddressSpace must be active.
func (*Task) CopyInVector ¶
CopyInVector copies a NULL-terminated vector of strings from the task's memory. The copy will fail with syscall.EFAULT if it traverses user memory that is unmapped or not readable by the user.
maxElemSize is the maximum size of each individual element.
maxTotalSize is the maximum total length of all elements plus the total number of elements. For example, the following strings correspond to the following set of sizes:
{ "a", "b", "c" } => 6 (3 for lengths, 3 for elements)
{ "abc" } => 4 (3 for length, 1 for elements)
This Task's AddressSpace must be active.
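The maxTotalSize accounting in the examples above amounts to the sum of the string lengths plus one unit per element. A minimal sketch follows; vectorTotalSize is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// vectorTotalSize returns the size that would be charged against
// maxTotalSize for v: the total length of all elements plus the number of
// elements.
func vectorTotalSize(v []string) int {
	total := len(v) // one unit per element
	for _, s := range v {
		total += len(s)
	}
	return total
}

func main() {
	fmt.Println(vectorTotalSize([]string{"a", "b", "c"})) // 3 lengths + 3 elements
	fmt.Println(vectorTotalSize([]string{"abc"}))         // 3 length + 1 element
}
```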
func (*Task) CopyOutBytes ¶
CopyOutBytes is a fast version of CopyOut if the caller can serialize the data without reflection and pass in a byte slice.
This Task's AddressSpace must be active.
func (*Task) CopyOutIovecs ¶
CopyOutIovecs converts src to an array of struct iovecs and copies it to the memory mapped at addr for Task.
Preconditions: Same as usermem.IO.CopyOut, plus:
- The caller must be running on the task goroutine.
- t's AddressSpace must be active.
func (*Task) CopyScratchBuffer ¶
CopyScratchBuffer returns a scratch buffer to be used in CopyIn/CopyOut functions. It must only be used within those functions and can only be used by the task goroutine; it exists to improve performance and thus intentionally lacks any synchronization.
Callers should pass a constant value as an argument if possible, which will allow the compiler to inline and optimize out the if statement below.
func (*Task) Credentials ¶
func (t *Task) Credentials() *auth.Credentials
Credentials returns t's credentials.
This value must be considered immutable.
func (*Task) Deactivate ¶
func (t *Task) Deactivate()
Deactivate relinquishes the task's active address space.
func (*Task) DebugDumpState ¶
func (t *Task) DebugDumpState()
DebugDumpState logs task state at log level debug.
Preconditions: The caller must be running on the task goroutine.
func (*Task) DropBoundingCapability ¶
func (t *Task) DropBoundingCapability(cp linux.Capability) error
DropBoundingCapability attempts to drop capability cp from t's capability bounding set.
func (*Task) EndExternalStop ¶
func (t *Task) EndExternalStop()
EndExternalStop indicates the end of an external stop started by a previous call to Task.BeginExternalStop. EndExternalStop does not wait for t's task goroutine to resume.
func (*Task) EnterInitialCgroups ¶
EnterInitialCgroups moves t into an initial set of cgroups. If initCgroups is not nil, the new task will be placed in the specified cgroups. Otherwise, if parent is not nil, the new task will be placed in the parent's cgroups. If neither is specified, the new task will be in the root cgroups.
This is analogous to Linux's kernel/cgroup/cgroup.c:cgroup_css_set_fork().
Precondition: t isn't in any cgroups yet, t.cgroups is empty.
func (*Task) Execve ¶
func (t *Task) Execve(newImage *TaskImage, argv, env []string, executable *vfs.FileDescription, pathname string) (*SyscallControl, error)
Execve implements the execve(2) syscall by killing all other tasks in its thread group and switching to newImage. Execve always takes ownership of newImage.
If executable is not nil, it is the first executable file that was loaded in the process of obtaining newImage, and pathname is a path to it.
Preconditions: The caller must be running Task.doSyscallInvoke on the task goroutine.
func (*Task) ExitState ¶
func (t *Task) ExitState() TaskExitState
ExitState returns t's current progress through the exit path.
func (*Task) ExitStatus ¶
func (t *Task) ExitStatus() linux.WaitStatus
ExitStatus returns t's exit status, which is only guaranteed to be meaningful if t.ExitState() != TaskExitNone.
func (*Task) FDTable ¶
FDTable returns t's FDTable. FDTable does not take an additional reference on the returned FDTable.
Precondition: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) FSContext ¶
FSContext returns t's FSContext. FSContext does not take an additional reference on the returned FSContext.
Precondition: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) Futex ¶
Futex returns t's futex manager.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) FutexWaiter ¶
FutexWaiter returns the Task's futex.Waiter.
func (*Task) GenerateProcTaskCgroup ¶
GenerateProcTaskCgroup writes the contents of /proc/<pid>/cgroup for t to buf.
func (*Task) GetCgroupEntries ¶
func (t *Task) GetCgroupEntries() []TaskCgroupEntry
GetCgroupEntries generates the contents of /proc/<pid>/cgroup as a TaskCgroupEntry array.
func (*Task) GetFile ¶
func (t *Task) GetFile(fd int32) *vfs.FileDescription
GetFile is a convenience wrapper for t.FDTable().Get.
Precondition: same as FDTable.Get.
func (*Task) GetIPCNamespace ¶
func (t *Task) GetIPCNamespace() *IPCNamespace
GetIPCNamespace takes a reference on the task IPC namespace and returns it. It will return nil if the task isn't alive.
func (*Task) GetMountNamespace ¶
func (t *Task) GetMountNamespace() *vfs.MountNamespace
GetMountNamespace returns t's MountNamespace. A reference is taken on the returned mount namespace.
func (*Task) GetNetworkNamespace ¶
GetNetworkNamespace takes a reference on the task network namespace and returns it. It can return nil if the task isn't alive.
func (*Task) GetRobustList ¶
GetRobustList returns the robust futex list for the task.
func (*Task) GetSharedKey ¶
GetSharedKey implements futex.Target.GetSharedKey.
func (*Task) GetUTSNamespace ¶
func (t *Task) GetUTSNamespace() *UTSNamespace
GetUTSNamespace takes a reference on the task UTS namespace and returns it. It will return nil if the task isn't alive.
func (*Task) Getitimer ¶
Getitimer implements getitimer(2).
Preconditions: The caller must be running on the task goroutine.
func (*Task) GoroutineID ¶
GoroutineID returns the ID of t's task goroutine.
func (*Task) HasCapability ¶
func (t *Task) HasCapability(cp linux.Capability) bool
HasCapability checks if the task has capability cp in its user namespace.
func (*Task) HasCapabilityIn ¶
func (t *Task) HasCapabilityIn(cp linux.Capability, ns *auth.UserNamespace) bool
HasCapabilityIn checks if the task has capability cp in user namespace ns.
func (*Task) IPCNamespace ¶
func (t *Task) IPCNamespace() *IPCNamespace
IPCNamespace returns the task's IPC namespace.
func (*Task) Interrupted ¶
Interrupted implements context.Context.Interrupted.
func (*Task) IntervalTimerCreate ¶
IntervalTimerCreate implements timer_create(2).
func (*Task) IntervalTimerDelete ¶
IntervalTimerDelete implements timer_delete(2).
func (*Task) IntervalTimerGetoverrun ¶
IntervalTimerGetoverrun implements timer_getoverrun(2).
Preconditions: The caller must be running on the task goroutine.
func (*Task) IntervalTimerGettime ¶
IntervalTimerGettime implements timer_gettime(2).
func (*Task) IntervalTimerSettime ¶
func (t *Task) IntervalTimerSettime(id linux.TimerID, its linux.Itimerspec, abs bool) (linux.Itimerspec, error)
IntervalTimerSettime implements timer_settime(2).
func (*Task) IovecsIOSequence ¶
func (t *Task) IovecsIOSequence(addr hostarch.Addr, iovcnt int, opts usermem.IOOpts) (usermem.IOSequence, error)
IovecsIOSequence returns a usermem.IOSequence representing the array of iovcnt struct iovecs at addr in t's address space. opts applies to the returned IOSequence, not the reading of the struct iovec array.
IovecsIOSequence is analogous to Linux's lib/iov_iter.c:import_iovec().
Preconditions: Same as Task.CopyInIovecs.
func (*Task) IsChrooted ¶
IsChrooted returns true if the root directory of t's FSContext is not the root directory of t's MountNamespace.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) IsNetworkNamespaced ¶
IsNetworkNamespaced returns true if t is in a non-root network namespace.
func (*Task) JoinSessionKeyring ¶
JoinSessionKeyring causes the task to join a keyring with the given key description (not ID). If `keyDesc` is nil, then the task joins a newly-instantiated session keyring instead.
func (*Task) LeaveCgroups ¶
func (t *Task) LeaveCgroups()
LeaveCgroups removes t from all its cgroups.
func (*Task) LoadUint32 ¶
LoadUint32 implements futex.Target.LoadUint32.
func (*Task) MaxRSS ¶
MaxRSS returns the maximum resident set size of the task in bytes. `which` should be one of RUSAGE_SELF, RUSAGE_CHILDREN, RUSAGE_THREAD, or RUSAGE_BOTH. See getrusage(2) for documentation on the behavior of these flags.
func (*Task) MemoryManager ¶
func (t *Task) MemoryManager() *mm.MemoryManager
MemoryManager returns t's MemoryManager. MemoryManager does not take an additional reference on the returned MM.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) MigrateCgroup ¶
MigrateCgroup migrates this task to the dst cgroup.
func (*Task) MountNamespace ¶
func (t *Task) MountNamespace() *vfs.MountNamespace
MountNamespace returns t's MountNamespace.
func (*Task) NetworkContext ¶
NetworkContext returns the network stack used by the task. NetworkContext may return nil if no network stack is available.
TODO(gvisor.dev/issue/1833): Migrate callers of this method to NetworkNamespace().
func (*Task) NetworkNamespace ¶
NetworkNamespace returns the network namespace observed by the task.
func (*Task) NewFDAt ¶
func (t *Task) NewFDAt(fd int32, file *vfs.FileDescription, flags FDFlags) (*vfs.FileDescription, error)
NewFDAt is a convenience wrapper for t.FDTable().NewFDAt.
This automatically passes the task as the context.
Precondition: same as FDTable.
func (*Task) NewFDFrom ¶
NewFDFrom is a convenience wrapper for t.FDTable().NewFD.
This automatically passes the task as the context.
Precondition: same as FDTable.Get.
func (*Task) NewFDs ¶
NewFDs is a convenience wrapper for t.FDTable().NewFDs.
This automatically passes the task as the context.
Precondition: same as FDTable.
func (*Task) NotifyRlimitCPUUpdated ¶
func (t *Task) NotifyRlimitCPUUpdated()
NotifyRlimitCPUUpdated is called by setrlimit.
Preconditions: The caller must be running on the task goroutine.
func (*Task) NumaPolicy ¶
func (t *Task) NumaPolicy() (policy linux.NumaPolicy, nodeMask uint64)
NumaPolicy returns t's current numa policy.
func (*Task) OOMScoreAdj ¶
OOMScoreAdj gets the task's thread group's OOM score adjustment.
func (*Task) OldRSeqCPUAddr ¶
OldRSeqCPUAddr returns the address that old rseq will keep updated with t's CPU number.
Preconditions: The caller must be running on the task goroutine.
func (*Task) OldRSeqCriticalRegion ¶
func (t *Task) OldRSeqCriticalRegion() OldRSeqCriticalRegion
OldRSeqCriticalRegion returns a copy of t's thread group's current old restartable sequence.
func (*Task) OwnCopyContext ¶
OwnCopyContext returns a marshal.CopyContext that copies to/from t's address space using opts. The returned CopyContext may only be used by t's task goroutine.
Since t already implements marshal.CopyContext, this is only needed to override the usermem.IOOpts used for the copy.
func (*Task) PIDNamespace ¶
func (t *Task) PIDNamespace() *PIDNamespace
PIDNamespace returns the PID namespace containing t.
func (*Task) ParentDeathSignal ¶
ParentDeathSignal returns t's parent death signal.
func (*Task) ParentLocked ¶
ParentLocked returns t's parent. Caller must ensure t's TaskSet mu is locked for at least reading.
+checklocks:t.tg.pidns.owner.mu
func (*Task) PendingSignals ¶
PendingSignals returns the set of pending signals.
func (*Task) PrepareExit ¶
func (t *Task) PrepareExit(ws linux.WaitStatus)
PrepareExit indicates an exit with the given status.
Preconditions: The caller must be running on the task goroutine.
func (*Task) PrepareGroupExit ¶
func (t *Task) PrepareGroupExit(ws linux.WaitStatus)
PrepareGroupExit indicates a group exit with status ws to t's thread group.
PrepareGroupExit is analogous to Linux's do_group_exit(), with two differences: it does not tail-call do_exit(), and it *does* set Task.exitStatus. (Linux does not do so until within do_exit(), since it reuses exit_code for ptrace.)
Preconditions: The caller must be running on the task goroutine.
func (*Task) QueueAIO ¶
func (t *Task) QueueAIO(cb AIOCallback)
QueueAIO queues an AIOCallback which will be run asynchronously.
func (*Task) RSeqAvailable ¶
RSeqAvailable returns true if t supports (old and new) restartable sequences.
func (*Task) RegisterWork ¶
func (t *Task) RegisterWork(work TaskWorker)
RegisterWork can be used to register additional task work that will be performed prior to returning to user space. See TaskWorker.TaskWork for semantics regarding registration.
func (*Task) ResetKcov ¶
func (t *Task) ResetKcov()
ResetKcov clears the kcov instance associated with t.
func (*Task) ResetMemCgIDFromCgroup ¶
ResetMemCgIDFromCgroup sets the memory cgroup id to zero, if the task has a memory cgroup.
func (*Task) RestoreContainerID ¶
RestoreContainerID sets t's container ID in case the restored container ID is different from when it was saved.
func (*Task) SeccompMode ¶
SeccompMode returns a SECCOMP_MODE_* constant indicating the task's current seccomp syscall filtering mode, appropriate for both prctl(PR_GET_SECCOMP) and /proc/[pid]/status.
func (*Task) SendGroupSignal ¶
func (t *Task) SendGroupSignal(info *linux.SignalInfo) error
SendGroupSignal sends the given signal to t's thread group.
func (*Task) SendSignal ¶
func (t *Task) SendSignal(info *linux.SignalInfo) error
SendSignal sends the given signal to t.
The following errors may be returned:
- linuxerr.ESRCH: The task has exited.
- linuxerr.EINVAL: The signal is not valid.
- linuxerr.EAGAIN: The signal is realtime, and cannot be queued.
func (*Task) SessionKeyring ¶
SessionKeyring returns this Task's session keyring. Session keyrings are inherited from the parent when a task is started. If the session keyring is unset, it is implicitly initialized. As such, this function should never return ENOKEY.
func (*Task) SetCPUMask ¶
SetCPUMask sets t's allowed CPU mask based on mask. It takes ownership of mask.
Preconditions: mask.Size() == sched.CPUSetSize(t.Kernel().ApplicationCores()).
func (*Task) SetCapabilitySets ¶
func (t *Task) SetCapabilitySets(permitted, inheritable, effective auth.CapabilitySet) error
SetCapabilitySets attempts to change t's permitted, inheritable, and effective capability sets.
func (*Task) SetClearTID ¶
SetClearTID sets t's cleartid.
Preconditions: The caller must be running on the task goroutine.
func (*Task) SetExtraGIDs ¶
SetExtraGIDs attempts to change t's supplemental groups. All IDs are interpreted as being in t's user namespace.
func (*Task) SetKeepCaps ¶
SetKeepCaps will set the keep capabilities flag PR_SET_KEEPCAPS.
func (*Task) SetMemCgID ¶
SetMemCgID sets the given memory cgroup id to the task.
func (*Task) SetMemCgIDFromCgroup ¶
SetMemCgIDFromCgroup sets the id of the given memory cgroup to the task.
func (*Task) SetNumaPolicy ¶
func (t *Task) SetNumaPolicy(policy linux.NumaPolicy, nodeMask uint64)
SetNumaPolicy sets t's numa policy.
func (*Task) SetOOMScoreAdj ¶
SetOOMScoreAdj sets the task's thread group's OOM score adjustment. The value should be between -1000 and 1000 inclusive.
func (*Task) SetOldRSeqCPUAddr ¶
SetOldRSeqCPUAddr replaces the address that old rseq will keep updated with t's CPU number.
Preconditions:
- t.RSeqAvailable() == true.
- The caller must be running on the task goroutine.
- t's AddressSpace must be active.
func (*Task) SetOldRSeqCriticalRegion ¶
func (t *Task) SetOldRSeqCriticalRegion(r OldRSeqCriticalRegion) error
SetOldRSeqCriticalRegion replaces t's thread group's old restartable sequence.
Preconditions: t.RSeqAvailable() == true.
func (*Task) SetParentDeathSignal ¶
SetParentDeathSignal sets t's parent death signal.
func (*Task) SetPermsOnKey ¶
SetPermsOnKey sets the permission bits on the given key using the task's credentials.
func (*Task) SetRSeq ¶
SetRSeq registers addr as this thread's rseq structure.
Preconditions: The caller must be running on the task goroutine.
func (*Task) SetRobustList ¶
SetRobustList sets the robust futex list for the task.
func (*Task) SetSavedSignalMask ¶
SetSavedSignalMask sets the saved signal mask (see Task.savedSignalMask's comment).
Preconditions: The caller must be running on the task goroutine.
func (*Task) SetSignalMask ¶
SetSignalMask sets t's signal mask.
Preconditions:
- The caller must be running on the task goroutine.
- t.exitState < TaskExitZombie.
func (*Task) SetSignalStack ¶
func (t *Task) SetSignalStack(alt linux.SignalStack) bool
SetSignalStack sets the task-private signal stack.
This value may not be changed if the task is currently executing on the signal stack, i.e. if t.onSignalStack returns true. In this case, this function will return false. Otherwise, true is returned.
func (*Task) SetSyscallRestartBlock ¶
func (t *Task) SetSyscallRestartBlock(r SyscallRestartBlock)
SetSyscallRestartBlock sets the restart block for use in restart_syscall(2). After registering a restart block, a syscall should return ERESTART_RESTARTBLOCK to request a restart using the block.
Precondition: The caller must be running on the task goroutine.
func (*Task) SetUserNamespace ¶
func (t *Task) SetUserNamespace(ns *auth.UserNamespace) error
SetUserNamespace attempts to move t into ns.
func (*Task) SetYAMAException ¶
SetYAMAException creates a YAMA exception allowing all descendants of tracer to trace t. If tracer is nil, then any task is allowed to trace t.
If there was an existing exception, it is overwritten with the new one.
func (*Task) Setitimer ¶
Setitimer implements setitimer(2).
Preconditions: The caller must be running on the task goroutine.
func (*Task) Setns ¶
func (t *Task) Setns(fd *vfs.FileDescription, flags int32) error
Setns reassociates the thread with the specified namespace.
func (*Task) SigaltStack ¶
SigaltStack implements the sigaltstack syscall.
func (*Task) SignalMask ¶
SignalMask returns a copy of t's signal mask.
func (*Task) SignalRegister ¶
SignalRegister registers a waiter for pending signals.
func (*Task) SignalReturn ¶
func (t *Task) SignalReturn(rt bool) (*SyscallControl, error)
SignalReturn implements sigreturn(2) (if rt is false) or rt_sigreturn(2) (if rt is true).
func (*Task) SignalStack ¶
func (t *Task) SignalStack() linux.SignalStack
SignalStack returns the task-private signal stack.
By precondition, a full state has to be pulled.
func (*Task) SignalUnregister ¶
SignalUnregister unregisters a waiter for pending signals.
func (*Task) Sigtimedwait ¶
Sigtimedwait implements the semantics of sigtimedwait(2).
Preconditions:
- The caller must be running on the task goroutine.
- t.exitState < TaskExitZombie.
func (*Task) SingleIOSequence ¶
func (t *Task) SingleIOSequence(addr hostarch.Addr, length int, opts usermem.IOOpts) (usermem.IOSequence, error)
SingleIOSequence returns a usermem.IOSequence representing [addr, addr+length) in t's address space. If this contains addresses outside the application address range, it returns EFAULT. If length exceeds MAX_RW_COUNT, the range is silently truncated.
SingleIOSequence is analogous to Linux's lib/iov_iter.c:import_single_range(). (Note that the non-vectorized read and write syscalls in Linux do not use import_single_range(). However they check access_ok() in fs/read_write.c:vfs_read/vfs_write, and overflowing address ranges are truncated to MAX_RW_COUNT by fs/read_write.c:rw_verify_area().)
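The truncate-then-check behavior described above can be sketched in isolation. This is a minimal illustration, not the package's implementation: singleRange, errFault, pageSize, and appAddrEnd are hypothetical names, the upper bound of the application address range is an assumed value, and rejecting negative lengths is a choice made for the sketch. Only the MAX_RW_COUNT definition (INT_MAX rounded down to a page boundary) is taken from Linux.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// pageSize is an assumed 4 KiB page size.
	pageSize = 4096
	// maxRWCount follows Linux's definition of MAX_RW_COUNT: INT_MAX
	// rounded down to a page boundary.
	maxRWCount = 0x7fffffff &^ (pageSize - 1)
	// appAddrEnd is a hypothetical exclusive upper bound of the application
	// address range; the real bound depends on the platform.
	appAddrEnd = uint64(1) << 47
)

// errFault is a local stand-in for EFAULT.
var errFault = errors.New("EFAULT")

// singleRange truncates length to maxRWCount and then checks that
// [addr, addr+length) lies inside the application address range.
func singleRange(addr uint64, length int) (start, end uint64, err error) {
	if length < 0 {
		return 0, 0, errFault // sketch choice: reject negative lengths
	}
	if length > maxRWCount {
		length = maxRWCount // silent truncation, no error
	}
	end = addr + uint64(length)
	if end < addr || end > appAddrEnd {
		return 0, 0, errFault // wrapped around or left the application range
	}
	return addr, end, nil
}

func main() {
	// An oversized request is shortened rather than rejected.
	start, end, err := singleRange(0x1000, maxRWCount+5)
	fmt.Println(end-start == maxRWCount, err)

	// A range that crosses the (assumed) upper bound fails.
	_, _, err = singleRange(appAddrEnd-8, 16)
	fmt.Println(err)
}
```

Truncating before the bounds check means an oversized request that starts at a valid address can still succeed as a shorter transfer, which is the behavior a short read or write would expose to the caller.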
func (*Task) Stack ¶
Stack returns the userspace stack.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) Start ¶
Start starts the task goroutine. Start must be called exactly once for each task returned by NewTask.
'tid' must be the task's TID in the root PID namespace and it's used for debugging purposes only (set as parameter to Task.run to make it visible in stack dumps).
func (*Task) StateStatus ¶
StateStatus returns a string representation of the task's current state, appropriate for /proc/[pid]/status.
func (*Task) SwapUint32 ¶
SwapUint32 implements futex.Target.SwapUint32.
func (*Task) SyscallRestartBlock ¶
func (t *Task) SyscallRestartBlock() SyscallRestartBlock
SyscallRestartBlock returns the currently registered restart block for use in restart_syscall(2). This function is *not* idempotent and may be called at most once per syscall. This function must not be called if a restart block has not been registered for the current syscall.
Precondition: The caller must be running on the task goroutine.
func (*Task) SyscallTable ¶
func (t *Task) SyscallTable() *SyscallTable
SyscallTable returns t's syscall table.
Preconditions: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) TGIDInRoot ¶
TGIDInRoot returns t's TGID in the root PID namespace.
func (*Task) TaskGoroutineSchedInfo ¶
func (t *Task) TaskGoroutineSchedInfo() TaskGoroutineSchedInfo
TaskGoroutineSchedInfo returns a copy of t's task goroutine scheduling info. Most clients should use t.CPUStats() instead.
func (*Task) TaskImage ¶
TaskImage returns t's TaskImage.
Precondition: The caller must be running on the task goroutine, or t.mu must be locked.
func (*Task) ThreadGroup ¶
func (t *Task) ThreadGroup() *ThreadGroup
ThreadGroup returns the thread group containing t.
func (*Task) ThreadID ¶
ThreadID returns t's thread ID in its own PID namespace. If the task is dead, ThreadID returns 0.
func (*Task) Timekeeper ¶
func (t *Task) Timekeeper() *Timekeeper
Timekeeper returns the system Timekeeper.
func (*Task) UTSNamespace ¶
func (t *Task) UTSNamespace() *UTSNamespace
UTSNamespace returns the task's UTS namespace.
func (*Task) UninterruptibleSleepFinish ¶
UninterruptibleSleepFinish implements context.Context.UninterruptibleSleepFinish.
func (*Task) UninterruptibleSleepStart ¶
UninterruptibleSleepStart implements context.Context.UninterruptibleSleepStart.
func (*Task) Unshare ¶
Unshare changes the set of resources t shares with other tasks, as specified by flags.
Preconditions: The caller must be running on the task goroutine.
func (*Task) UnshareFdTable ¶
UnshareFdTable unshares the FdTable that task t shares with other tasks, up to maxFd.
Preconditions: The caller must be running on the task goroutine.
func (*Task) UserCPUClock ¶
UserCPUClock returns a clock measuring the CPU time the task has spent executing application code.
func (*Task) UserNamespace ¶
func (t *Task) UserNamespace() *auth.UserNamespace
UserNamespace returns the user namespace associated with the task.
func (*Task) Value ¶
Value implements context.Context.Value.
Preconditions: The caller must be running on the task goroutine.
func (*Task) Wait ¶
func (t *Task) Wait(opts *WaitOptions) (*WaitResult, error)
Wait waits for an event from a thread group that is a child of t's thread group, or a task in such a thread group, or a task that is ptraced by t, subject to the options specified in opts.
func (*Task) WithMuLocked ¶
WithMuLocked executes f with t.mu locked.
type TaskCgroupEntry ¶
type TaskCgroupEntry struct {
	HierarchyID uint32 `json:"hierarchy_id"`
	Controllers string `json:"controllers,omitempty"`
	Path        string `json:"path,omitempty"`
}
TaskCgroupEntry represents a line in /proc/<pid>/cgroup, and is used to format a cgroup for display.
type TaskConfig ¶
type TaskConfig struct {
	// Kernel is the owning Kernel.
	Kernel *Kernel

	// Parent is the new task's parent. Parent may be nil.
	Parent *Task

	// If InheritParent is not nil, use InheritParent's parent as the new
	// task's parent.
	InheritParent *Task

	// ThreadGroup is the ThreadGroup the new task belongs to.
	ThreadGroup *ThreadGroup

	// SignalMask is the new task's initial signal mask.
	SignalMask linux.SignalSet

	// TaskImage is the TaskImage of the new task. Ownership of the
	// TaskImage is transferred to TaskSet.NewTask, whether or not it
	// succeeds.
	TaskImage *TaskImage

	// FSContext is the FSContext of the new task. A reference must be held on
	// FSContext, which is transferred to TaskSet.NewTask whether or not it
	// succeeds.
	FSContext *FSContext

	// FDTable is the FDTable of the new task. A reference must be held on
	// FDTable, which is transferred to TaskSet.NewTask whether or not it
	// succeeds.
	FDTable *FDTable

	// Credentials is the Credentials of the new task.
	Credentials *auth.Credentials

	// Niceness is the niceness of the new task.
	Niceness int

	// NetworkNamespace is the network namespace to be used for the new task.
	NetworkNamespace *inet.Namespace

	// AllowedCPUMask contains the cpus that this task can run on.
	AllowedCPUMask sched.CPUSet

	// UTSNamespace is the UTSNamespace of the new task.
	UTSNamespace *UTSNamespace

	// IPCNamespace is the IPCNamespace of the new task.
	IPCNamespace *IPCNamespace

	// MountNamespace is the MountNamespace of the new task.
	MountNamespace *vfs.MountNamespace

	// RSeqAddr is a pointer to the userspace linux.RSeq structure.
	RSeqAddr hostarch.Addr

	// RSeqSignature is the signature that the rseq abort IP must be signed
	// with.
	RSeqSignature uint32

	// ContainerID is the container the new task belongs to.
	ContainerID string

	// InitialCgroups are the cgroups the container is initialised to.
	InitialCgroups map[Cgroup]struct{}

	// UserCounters is user resource counters.
	UserCounters *UserCounters

	// SessionKeyring is the session keyring associated with the parent task.
	// It may be nil.
	SessionKeyring *auth.Key

	Origin TaskOrigin
}
TaskConfig defines the configuration of a new Task (see below).
type TaskExitState ¶
type TaskExitState int
TaskExitState represents a step in the task exit path.
"Exiting" and "exited" are often ambiguous; prefer to name specific states.
const (
	// TaskExitNone indicates that the task has not begun exiting.
	TaskExitNone TaskExitState = iota

	// TaskExitInitiated indicates that the task goroutine has entered the exit
	// path, and the task is no longer eligible to participate in group stops
	// or group signal handling. TaskExitInitiated is analogous to Linux's
	// PF_EXITING.
	TaskExitInitiated

	// TaskExitZombie indicates that the task has released its resources, and
	// the task no longer prevents a sibling thread from completing execve.
	TaskExitZombie

	// TaskExitDead indicates that the task's thread IDs have been released,
	// and the task no longer prevents its thread group leader from being
	// reaped. ("Reaping" refers to the transitioning of a task from
	// TaskExitZombie to TaskExitDead.)
	TaskExitDead
)
func (TaskExitState) String ¶
func (t TaskExitState) String() string
String implements fmt.Stringer.
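Because the four states are declared with iota in exit-path order, preconditions elsewhere in this package such as "t.exitState < TaskExitZombie" reduce to plain integer comparison. The sketch below redeclares the constants locally to show this; beforeZombie is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// TaskExitState and its constants are redeclared here so that the example
// is self-contained; the names and order match the declaration above.
type TaskExitState int

const (
	TaskExitNone TaskExitState = iota
	TaskExitInitiated
	TaskExitZombie
	TaskExitDead
)

// beforeZombie is a hypothetical helper expressing the precondition
// "exitState < TaskExitZombie".
func beforeZombie(s TaskExitState) bool {
	return s < TaskExitZombie
}

func main() {
	// Walk the exit path in order and report which states satisfy the
	// precondition.
	for s := TaskExitNone; s <= TaskExitDead; s++ {
		fmt.Println(int(s), beforeZombie(s))
	}
}
```

Only TaskExitNone and TaskExitInitiated satisfy the check, which matches the comments above: from TaskExitZombie onward the task has released its resources.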
type TaskGoroutineSchedInfo ¶
type TaskGoroutineSchedInfo struct {
	// Timestamp was the value of Kernel.cpuClock when this
	// TaskGoroutineSchedInfo was last updated.
	Timestamp uint64

	// State is the current state of the task goroutine.
	State TaskGoroutineState

	// UserTicks is the amount of time the task goroutine has spent executing
	// its associated Task's application code, in units of linux.ClockTick.
	UserTicks uint64

	// SysTicks is the amount of time the task goroutine has spent executing in
	// the sentry, in units of linux.ClockTick.
	SysTicks uint64
}
TaskGoroutineSchedInfo contains task goroutine scheduling state which must be read and updated atomically.
+stateify savable
type TaskGoroutineState ¶
type TaskGoroutineState int
TaskGoroutineState is a coarse representation of the current execution status of a kernel.Task goroutine.
const (
	// TaskGoroutineNonexistent indicates that the task goroutine has either
	// not yet been created by Task.Start() or has returned from Task.run().
	// This must be the zero value for TaskGoroutineState.
	TaskGoroutineNonexistent TaskGoroutineState = iota

	// TaskGoroutineRunningSys indicates that the task goroutine is executing
	// sentry code.
	TaskGoroutineRunningSys

	// TaskGoroutineRunningApp indicates that the task goroutine is executing
	// application code.
	TaskGoroutineRunningApp

	// TaskGoroutineBlockedInterruptible indicates that the task goroutine is
	// blocked in Task.block(), and hence may be woken by Task.interrupt()
	// (e.g. due to signal delivery).
	TaskGoroutineBlockedInterruptible

	// TaskGoroutineBlockedUninterruptible indicates that the task goroutine is
	// stopped outside of Task.block() and Task.doStop(), and hence cannot be
	// woken by Task.interrupt().
	TaskGoroutineBlockedUninterruptible

	// TaskGoroutineStopped indicates that the task goroutine is blocked in
	// Task.doStop(). TaskGoroutineStopped is similar to
	// TaskGoroutineBlockedUninterruptible, but is a separate state to make it
	// possible to determine when Task.stop is meaningful.
	TaskGoroutineStopped
)
type TaskImage ¶
type TaskImage struct {
	// Name is the thread name set by the prctl(PR_SET_NAME) system call.
	Name string

	// Arch is the architecture-specific context (registers, etc.)
	Arch *arch.Context64

	// MemoryManager is the task's address space.
	MemoryManager *mm.MemoryManager
	// contains filtered or unexported fields
}
TaskImage is the subset of a task's data that is provided by the loader.
+stateify savable
func (*TaskImage) FileCaps ¶
FileCaps returns the task image's security.capability extended attribute.
func (*TaskImage) Fork ¶
func (image *TaskImage) Fork(ctx context.Context, k *Kernel, shareAddressSpace bool) (*TaskImage, error)
Fork returns a duplicate of image. The copied TaskImage always has an independent arch.Context64. If shareAddressSpace is true, the copied TaskImage shares an address space with the original; otherwise, the copied TaskImage has an independent address space that is initially a duplicate of the original's.
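The two sharing modes can be illustrated with stand-in types. This is a sketch only: archContext, addressSpace, and the lower-case taskImage are hypothetical simplifications, and the eager map copy stands in for whatever duplication strategy the real MemoryManager uses.

```go
package main

import "fmt"

// archContext is a hypothetical stand-in for arch.Context64.
type archContext struct{ ip uint64 }

// addressSpace is a hypothetical stand-in for mm.MemoryManager.
type addressSpace struct{ mem map[uint64]byte }

// taskImage pairs register state with an address space.
type taskImage struct {
	arch *archContext
	mm   *addressSpace
}

// fork always copies the register state. The address space is shared when
// shareAddressSpace is true and duplicated otherwise.
func (im *taskImage) fork(shareAddressSpace bool) *taskImage {
	archCopy := *im.arch
	child := &taskImage{arch: &archCopy}
	if shareAddressSpace {
		child.mm = im.mm
		return child
	}
	dup := &addressSpace{mem: make(map[uint64]byte, len(im.mm.mem))}
	for addr, b := range im.mm.mem {
		dup.mem[addr] = b
	}
	child.mm = dup
	return child
}

func main() {
	parent := &taskImage{
		arch: &archContext{ip: 0x400000},
		mm:   &addressSpace{mem: map[uint64]byte{0x1000: 1}},
	}
	thread := parent.fork(true) // shares memory with parent
	proc := parent.fork(false)  // gets its own copy

	// A write through the parent is visible to the sharing child only.
	parent.mm.mem[0x1000] = 2
	fmt.Println(thread.mm.mem[0x1000], proc.mm.mem[0x1000])
	fmt.Println(thread.arch != parent.arch, proc.arch != parent.arch)
}
```

The shared case corresponds to creating a thread in the same address space, and the copied case to a fork-style child whose memory starts as a duplicate and diverges afterwards.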
type TaskOrigin ¶
type TaskOrigin int
TaskOrigin indicates how the task was initially created.
const (
	// OriginUnknown indicates that task creation source is not known (or not important).
	OriginUnknown TaskOrigin = iota

	// OriginExec indicates that task was created due to an exec request inside a container.
	OriginExec
)
type TaskSet ¶
type TaskSet struct {
	// Root is the root PID namespace, in which all tasks in the TaskSet are
	// visible. The Root pointer is immutable.
	Root *PIDNamespace
	// contains filtered or unexported fields
}
A TaskSet comprises all tasks in a system.
+stateify savable
func (*TaskSet) BeginExternalStop ¶
func (ts *TaskSet) BeginExternalStop()
BeginExternalStop indicates the start of an external stop that applies to all current and future tasks in ts. BeginExternalStop does not wait for task goroutines to stop.
func (*TaskSet) EndExternalStop ¶
func (ts *TaskSet) EndExternalStop()
EndExternalStop indicates the end of an external stop started by a previous call to TaskSet.BeginExternalStop. EndExternalStop does not wait for task goroutines to resume.
func (*TaskSet) Kill ¶
func (ts *TaskSet) Kill(ws linux.WaitStatus)
Kill requests that all tasks in ts exit as if group exiting with status ws. Kill does not wait for tasks to exit.
Kill has no analogue in Linux; it's provided for save/restore only.
func (*TaskSet) NewTask ¶
NewTask creates a new task defined by cfg.
NewTask does not start the returned task; the caller must call Task.Start.
If successful, NewTask transfers references held by cfg to the new task. Otherwise, NewTask releases them.
func (*TaskSet) PullFullState ¶
func (ts *TaskSet) PullFullState()
PullFullState receives full states for all tasks.
type TaskStop ¶
type TaskStop interface {
	// Killable returns true if Task.Kill should end the stop prematurely.
	// Killable is analogous to Linux's TASK_WAKEKILL.
	Killable() bool
}
A TaskStop is a condition visible to the task control flow graph that prevents a task goroutine from running or exiting, i.e. an internal stop.
NOTE(b/30793614): Most TaskStops don't contain any data; they're distinguished by their type. The obvious way to implement such a TaskStop is:
type groupStop struct{}
func (groupStop) Killable() bool { return true }
...
t.beginInternalStop(groupStop{})
However, this doesn't work because the state package can't serialize values, only pointers. Furthermore, the correctness of save/restore depends on the ability to pass a TaskStop to endInternalStop that will compare equal to the TaskStop that was passed to beginInternalStop, even if a save/restore cycle occurred between the two. As a result, the current idiom is to always use a typecast nil for data-free TaskStops:
type groupStop struct{}
func (*groupStop) Killable() bool { return true }
...
t.beginInternalStop((*groupStop)(nil))
This is pretty gross, but the alternatives seem grosser.
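The equality property the idiom relies on can be checked in plain Go. In the sketch below, TaskStop and groupStop mirror the snippets above, while otherStop and sameStop are hypothetical additions for illustration; serialization by the state package is not modeled.

```go
package main

import "fmt"

// TaskStop mirrors the interface shown above.
type TaskStop interface {
	Killable() bool
}

// groupStop carries no data; it is distinguished only by its type.
type groupStop struct{}

// Killable has a pointer receiver and never dereferences it, so it is safe
// to call on a nil *groupStop.
func (*groupStop) Killable() bool { return true }

// otherStop is a second hypothetical data-free stop, used to show that
// typed nils of different types do not compare equal.
type otherStop struct{}

func (*otherStop) Killable() bool { return false }

// sameStop reports whether two TaskStops compare equal as interface values,
// i.e. same dynamic type and same dynamic value.
func sameStop(a, b TaskStop) bool { return a == b }

func main() {
	begin := TaskStop((*groupStop)(nil))
	end := TaskStop((*groupStop)(nil))

	fmt.Println(sameStop(begin, end))               // same type, both nil pointers
	fmt.Println(sameStop(begin, (*otherStop)(nil))) // different dynamic types
	fmt.Println(begin.Killable())                   // method call on a typed nil
	fmt.Println(begin != nil)                       // the interface itself is non-nil
}
```

Two typed nil pointers of the same type always compare equal, so the value passed to endInternalStop matches the one passed to beginInternalStop regardless of how it was reconstructed. Non-nil pointers such as &groupStop{} would not give that guarantee, since the Go specification leaves the equality of pointers to distinct zero-size variables unspecified.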
type TaskWorker ¶
type TaskWorker interface {
	// TaskWork will be executed prior to returning to user space. Note that
	// TaskWork may call RegisterWork again, but this will not be executed until
	// the next return to user space, unlike in Linux. This effectively allows
	// registration of indefinite user return hooks, but not by default.
	TaskWork(t *Task)
}
TaskWorker is a deferred task.
This must be savable.
type ThreadGroup ¶
type ThreadGroup struct {
// contains filtered or unexported fields
}
A ThreadGroup is a logical grouping of tasks that has widespread significance to other kernel features (e.g. signal handling). ("Thread groups" are usually called "processes" in userspace documentation.)
ThreadGroup is a superset of Linux's struct signal_struct.
+stateify savable
func (*ThreadGroup) CPUClock ¶
func (tg *ThreadGroup) CPUClock() ktime.Clock
CPUClock returns a ktime.Clock that measures the time that a thread group has spent executing, including sentry time.
func (*ThreadGroup) CPUStats ¶
func (tg *ThreadGroup) CPUStats() usage.CPUStats
CPUStats returns the combined CPU usage statistics of all past and present threads in tg.
func (*ThreadGroup) Count ¶
func (tg *ThreadGroup) Count() int
Count returns the number of non-exited threads in the group.
func (*ThreadGroup) CreateProcessGroup ¶
func (tg *ThreadGroup) CreateProcessGroup() error
CreateProcessGroup creates a new process group.
An EPERM error will be returned if the ThreadGroup belongs to a different Session, is a Session leader or the group already exists.
func (*ThreadGroup) CreateSession ¶
func (tg *ThreadGroup) CreateSession() (SessionID, error)
CreateSession creates a new Session, with the ThreadGroup as the leader.
EPERM may be returned if either the given ThreadGroup is already a Session leader, or a ProcessGroup already exists for the ThreadGroup's ID.
func (*ThreadGroup) ExitStatus ¶
func (tg *ThreadGroup) ExitStatus() linux.WaitStatus
ExitStatus returns the exit status that would be returned by a consuming wait*() on tg.
func (*ThreadGroup) ForegroundProcessGroupID ¶
func (tg *ThreadGroup) ForegroundProcessGroupID(tty *TTY) (ProcessGroupID, error)
ForegroundProcessGroupID returns the foreground process group ID of the thread group.
func (*ThreadGroup) ID ¶
func (tg *ThreadGroup) ID() ThreadID
ID returns tg's leader's thread ID in its own PID namespace. If tg's leader is dead, ID returns 0.
func (*ThreadGroup) IOUsage ¶
func (tg *ThreadGroup) IOUsage() *usage.IO
IOUsage returns the total io usage of all dead and live threads in the group.
func (*ThreadGroup) IsChildSubreaper ¶
func (tg *ThreadGroup) IsChildSubreaper() bool
IsChildSubreaper returns whether this ThreadGroup is a child subreaper.
func (*ThreadGroup) IsInitIn ¶
func (tg *ThreadGroup) IsInitIn(pidns *PIDNamespace) bool
IsInitIn returns whether this ThreadGroup has TID 1 in the given PIDNamespace.
func (*ThreadGroup) JoinProcessGroup ¶
func (tg *ThreadGroup) JoinProcessGroup(pidns *PIDNamespace, pgid ProcessGroupID, checkExec bool) error
JoinProcessGroup joins an existing process group.
This function will return EACCES if an exec has been performed since fork by the given ThreadGroup, and EPERM if the Sessions are not the same or the group does not exist.
If checkExec is set, then the join is not permitted after the process has executed exec at least once.
func (*ThreadGroup) JoinedChildCPUStats ¶
func (tg *ThreadGroup) JoinedChildCPUStats() usage.CPUStats
JoinedChildCPUStats implements the semantics of RUSAGE_CHILDREN: "Return resource usage statistics for all children of [tg] that have terminated and been waited for. These statistics will include the resources used by grandchildren, and further removed descendants, if all of the intervening descendants waited on their terminated children."
func (*ThreadGroup) Limits ¶
func (tg *ThreadGroup) Limits() *limits.LimitSet
Limits returns tg's limits.
func (*ThreadGroup) MemberIDs ¶
func (tg *ThreadGroup) MemberIDs(pidns *PIDNamespace) []ThreadID
MemberIDs returns a snapshot of the ThreadIDs (in PID namespace pidns) for all tasks in tg.
func (*ThreadGroup) MigrateCgroup ¶
func (tg *ThreadGroup) MigrateCgroup(dst Cgroup) error
MigrateCgroup migrates all tasks in tg to the dst cgroup. Either all tasks are migrated, or none are. Atomicity of migrations with respect to cgroup membership (i.e. a task can't switch cgroups mid-migration due to another migration) is guaranteed because migrations are serialized by TaskSet.mu.
func (*ThreadGroup) PIDNamespace ¶
func (tg *ThreadGroup) PIDNamespace() *PIDNamespace
PIDNamespace returns the PID namespace containing tg.
func (*ThreadGroup) ProcessGroup ¶
func (tg *ThreadGroup) ProcessGroup() *ProcessGroup
ProcessGroup returns the ThreadGroup's ProcessGroup.
A reference is not taken on the process group.
func (*ThreadGroup) Release ¶
func (tg *ThreadGroup) Release(ctx context.Context)
Release releases the thread group's resources.
func (*ThreadGroup) ReleaseControllingTTY ¶
func (tg *ThreadGroup) ReleaseControllingTTY(tty *TTY) error
ReleaseControllingTTY gives up tty as the controlling tty of tg.
func (*ThreadGroup) SendSignal ¶
func (tg *ThreadGroup) SendSignal(info *linux.SignalInfo) error
SendSignal sends the given signal to tg, using tg's leader to determine if the signal is blocked.
func (*ThreadGroup) Session ¶
func (tg *ThreadGroup) Session() *Session
Session returns the ThreadGroup's Session.
A reference is not taken on the session.
func (*ThreadGroup) SetChildSubreaper ¶
func (tg *ThreadGroup) SetChildSubreaper(isSubreaper bool)
SetChildSubreaper sets the isChildSubreaper field on this ThreadGroup, and marks all child ThreadGroups as having a subreaper. Recursion stops if we find another subreaper process, which is either a ThreadGroup with the isChildSubreaper bit set, or a ThreadGroup with PID=1 inside a PID namespace.
func (*ThreadGroup) SetControllingTTY ¶
func (tg *ThreadGroup) SetControllingTTY(tty *TTY, steal bool, isReadable bool) error
SetControllingTTY sets tty as the controlling terminal of tg.
func (*ThreadGroup) SetForegroundProcessGroupID ¶
func (tg *ThreadGroup) SetForegroundProcessGroupID(tty *TTY, pgid ProcessGroupID) error
SetForegroundProcessGroupID sets the foreground process group of tty to pgid.
func (*ThreadGroup) SetSigAction ¶
func (tg *ThreadGroup) SetSigAction(sig linux.Signal, actptr *linux.SigAction) (linux.SigAction, error)
SetSigAction atomically sets the thread group's signal action for signal sig to *actptr (if actptr is not nil) and returns the old signal action.
func (*ThreadGroup) SignalHandlers ¶
func (tg *ThreadGroup) SignalHandlers() *SignalHandlers
SignalHandlers returns the signal handlers used by tg.
Preconditions: The caller must provide the synchronization required to read tg.signalHandlers, as described in the field's comment.
func (*ThreadGroup) TTY ¶
func (tg *ThreadGroup) TTY() *TTY
TTY returns the thread group's controlling terminal. If nil, there is no controlling terminal.
func (*ThreadGroup) TaskSet ¶
func (tg *ThreadGroup) TaskSet() *TaskSet
TaskSet returns the TaskSet containing tg.
func (*ThreadGroup) TerminationSignal ¶
func (tg *ThreadGroup) TerminationSignal() linux.Signal
TerminationSignal returns the thread group's termination signal, which is the signal that will be sent to its leader's parent when all threads have exited.
func (*ThreadGroup) UserCPUClock ¶
func (tg *ThreadGroup) UserCPUClock() ktime.Clock
UserCPUClock returns a ktime.Clock that measures the time that a thread group has spent executing.
func (*ThreadGroup) WaitExited ¶
func (tg *ThreadGroup) WaitExited()
WaitExited blocks until all task goroutines in tg have exited.
WaitExited does not correspond to anything in Linux; it's provided so that external callers of Kernel.CreateProcess can wait for the created thread group to terminate.
type Timekeeper ¶
type Timekeeper struct {
// contains filtered or unexported fields
}
Timekeeper manages all of the kernel clocks.
+stateify savable
func NewTimekeeper ¶
func NewTimekeeper(mf *pgalloc.MemoryFile, paramPage memmap.FileRange) *Timekeeper
NewTimekeeper returns a Timekeeper that is automatically kept up-to-date. NewTimekeeper does not take ownership of paramPage.
SetClocks must be called on the returned Timekeeper before it is usable.
func (*Timekeeper) AfterFunc ¶
func (t *Timekeeper) AfterFunc(d time.Duration, f func()) tcpip.Timer
AfterFunc implements tcpip.Clock.
func (*Timekeeper) BootTime ¶
func (t *Timekeeper) BootTime() ktime.Time
BootTime returns the system boot real time.
func (*Timekeeper) Destroy ¶
func (t *Timekeeper) Destroy()
Destroy destroys the Timekeeper, freeing all associated resources.
func (*Timekeeper) GetTime ¶
func (t *Timekeeper) GetTime(c sentrytime.ClockID) (int64, error)
GetTime returns the current time in nanoseconds.
func (*Timekeeper) NowMonotonic ¶
func (t *Timekeeper) NowMonotonic() tcpip.MonotonicTime
NowMonotonic implements tcpip.Clock.
func (*Timekeeper) PauseUpdates ¶
func (t *Timekeeper) PauseUpdates()
PauseUpdates stops clock parameter updates. This should only be used when Tasks are not running and thus cannot access the clock.
func (*Timekeeper) ResumeUpdates ¶
func (t *Timekeeper) ResumeUpdates()
ResumeUpdates restarts clock parameter updates stopped by PauseUpdates.
func (*Timekeeper) SetClocks ¶
func (t *Timekeeper) SetClocks(c sentrytime.Clocks)
SetClocks sets the backing clock source.
SetClocks must be called before the Timekeeper is used, and it may not be called more than once, as changing the clock source without extra correction could cause time discontinuities.
It must also be called after Load.
type UTSNamespace ¶
type UTSNamespace struct {
// contains filtered or unexported fields
}
UTSNamespace represents a UTS namespace, a holder of two system identifiers: the hostname and domain name.
+stateify savable
func NewUTSNamespace ¶
func NewUTSNamespace(hostName, domainName string, userns *auth.UserNamespace) *UTSNamespace
NewUTSNamespace creates a new UTS namespace.
func UTSNamespaceFromContext ¶
func UTSNamespaceFromContext(ctx context.Context) *UTSNamespace
UTSNamespaceFromContext returns the UTS namespace in which ctx is executing, or nil if there is no such UTS namespace.
func (*UTSNamespace) Clone ¶
func (u *UTSNamespace) Clone(userns *auth.UserNamespace) *UTSNamespace
Clone makes a copy of this UTS namespace, associating the given user namespace.
func (*UTSNamespace) DecRef ¶
func (u *UTSNamespace) DecRef(ctx context.Context)
DecRef decrements the namespace's refcount.
func (*UTSNamespace) Destroy ¶
func (u *UTSNamespace) Destroy(ctx context.Context)
Destroy implements nsfs.Namespace.Destroy.
func (*UTSNamespace) DomainName ¶
func (u *UTSNamespace) DomainName() string
DomainName returns the domain name of this UTS namespace.
func (*UTSNamespace) GetInode ¶
func (u *UTSNamespace) GetInode() *nsfs.Inode
GetInode returns the nsfs inode associated with the UTS namespace.
func (*UTSNamespace) HostName ¶
func (u *UTSNamespace) HostName() string
HostName returns the host name of this UTS namespace.
func (*UTSNamespace) IncRef ¶
func (u *UTSNamespace) IncRef()
IncRef increments the namespace's refcount.
func (*UTSNamespace) SetDomainName ¶
func (u *UTSNamespace) SetDomainName(domain string)
SetDomainName sets the domain name of this UTS namespace.
func (*UTSNamespace) SetHostName ¶
func (u *UTSNamespace) SetHostName(host string)
SetHostName sets the host name of this UTS namespace.
func (*UTSNamespace) SetInode ¶
func (u *UTSNamespace) SetInode(inode *nsfs.Inode)
SetInode sets the nsfs inode associated with the UTS namespace to `inode`.
func (*UTSNamespace) Type ¶
func (u *UTSNamespace) Type() string
Type implements nsfs.Namespace.Type.
func (*UTSNamespace) UserNamespace ¶
func (u *UTSNamespace) UserNamespace() *auth.UserNamespace
UserNamespace returns the user namespace associated with this UTS namespace.
type UserCounters ¶
type UserCounters struct {
// contains filtered or unexported fields
}
UserCounters is a set of user counters.
+stateify savable
type VDSOParamPage ¶
type VDSOParamPage struct {
// contains filtered or unexported fields
}
VDSOParamPage manages a VDSO parameter page.
Its memory layout looks like:
type page struct {
	// seq is a sequence counter that protects the fields below.
	seq uint64
	vdsoParams
}
Everything in the struct is 8 bytes for easy alignment.
It must be kept in sync with params in vdso/vdso_time.cc.
+stateify savable
func NewVDSOParamPage ¶
func NewVDSOParamPage(mf *pgalloc.MemoryFile, fr memmap.FileRange) *VDSOParamPage
NewVDSOParamPage returns a VDSOParamPage.
Preconditions:
- fr is a single page allocated from mf. VDSOParamPage does not take ownership of fr; it must remain allocated for the lifetime of the VDSOParamPage.
- VDSOParamPage must be the only writer to fr.
- mf.MapInternal(fr) must return a single safemem.Block.
func (*VDSOParamPage) Write ¶
func (v *VDSOParamPage) Write(f func() vdsoParams) error
Write updates the VDSO parameters.
Write starts a write block, calls f to get the new parameters, writes out the new parameters, then ends the write block.
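The write-block pattern described above resembles a sequence lock. The following is a minimal sketch of that idea, not the package's implementation: `params`, its two fields, and the `page` methods are hypothetical, and the odd/even counter convention is an assumption. Every field is an 8-byte atomic, in keeping with the layout note above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// params is a hypothetical stand-in for the unexported vdsoParams type.
type params struct {
	monotonicNanos uint64
	realtimeNanos  uint64
}

// page is a hypothetical in-memory analogue of the parameter page:
// a sequence counter followed by 8-byte parameter fields.
type page struct {
	seq       atomic.Uint64
	monotonic atomic.Uint64
	realtime  atomic.Uint64
}

// write assumes a single writer. It marks the page as being written
// (odd seq), obtains and stores the new parameters, then marks the
// write complete (even seq).
func (pg *page) write(f func() params) {
	pg.seq.Add(1) // begin write block: seq becomes odd
	p := f()
	pg.monotonic.Store(p.monotonicNanos)
	pg.realtime.Store(p.realtimeNanos)
	pg.seq.Add(1) // end write block: seq becomes even
}

// read retries until it observes the same even seq before and after
// loading the fields, which means no write overlapped the read.
func (pg *page) read() params {
	for {
		before := pg.seq.Load()
		if before%2 != 0 {
			continue // a write is in progress
		}
		p := params{
			monotonicNanos: pg.monotonic.Load(),
			realtimeNanos:  pg.realtime.Load(),
		}
		if pg.seq.Load() == before {
			return p
		}
	}
}

func main() {
	var pg page
	pg.write(func() params {
		return params{monotonicNanos: 5, realtimeNanos: 7}
	})
	got := pg.read()
	fmt.Println(got.monotonicNanos, got.realtimeNanos, pg.seq.Load())
}
```

Readers never block the writer in this scheme; they only retry, which suits a page that is read far more often than it is updated.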
type Version ¶
type Version struct {
	// Operating system name (e.g. "Linux").
	Sysname string

	// Operating system release (e.g. "4.4-amd64").
	Release string

	// Operating system version. On Linux this takes the shape
	// "#VERSION CONFIG_FLAGS TIMESTAMP"
	// where:
	//	- VERSION is a sequence counter incremented on every successful build
	//	- CONFIG_FLAGS is a space-separated list of major enabled kernel features
	//	  (e.g. "SMP" and "PREEMPT")
	//	- TIMESTAMP is the build timestamp as returned by `date`
	Version string
}
Version defines the application-visible system version.
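As an illustration of the "#VERSION CONFIG_FLAGS TIMESTAMP" shape, the sketch below assembles a Version value. The struct is redeclared locally so the example is self-contained; `formatVersion` is a hypothetical helper, and the build number, flags, and timestamp are made-up sample values.

```go
package main

import (
	"fmt"
	"strings"
)

// Version mirrors the fields of the documented struct; it is redeclared
// here only to keep the example self-contained.
type Version struct {
	Sysname string
	Release string
	Version string
}

// formatVersion is a hypothetical helper that joins a build counter,
// a list of config flags, and a timestamp into
// "#VERSION CONFIG_FLAGS TIMESTAMP".
func formatVersion(build int, flags []string, timestamp string) string {
	parts := []string{fmt.Sprintf("#%d", build)}
	parts = append(parts, flags...)
	parts = append(parts, timestamp)
	return strings.Join(parts, " ")
}

func main() {
	v := Version{
		Sysname: "Linux",
		Release: "4.4-amd64",
		// Sample values only; they do not describe any real build.
		Version: formatVersion(1, []string{"SMP"}, "Sun Jan 10 15:06:54 PST 2016"),
	}
	fmt.Println(v.Sysname, v.Release, v.Version)
	// prints "Linux 4.4-amd64 #1 SMP Sun Jan 10 15:06:54 PST 2016"
}
```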
type WaitOptions ¶
type WaitOptions struct {
	// If SpecificTID is non-zero, only events from the task with thread ID
	// SpecificTID are eligible to be waited for. SpecificTID is resolved in
	// the PID namespace of the waiter (the method receiver of Task.Wait). If
	// no such task exists, or that task would not otherwise be eligible to be
	// waited for by the waiting task, then there are no waitable tasks and
	// Wait will return ECHILD.
	SpecificTID ThreadID

	// If SpecificPGID is non-zero, only events from ThreadGroups with a
	// matching ProcessGroupID are eligible to be waited for. (Same
	// constraints as SpecificTID apply.)
	SpecificPGID ProcessGroupID

	// If NonCloneTasks is true, events from non-clone tasks are eligible to be
	// waited for.
	NonCloneTasks bool

	// If CloneTasks is true, events from clone tasks are eligible to be waited
	// for.
	CloneTasks bool

	// If SiblingChildren is true, events from children tasks of any task
	// in the thread group of the waiter are eligible to be waited for.
	SiblingChildren bool

	// Events is a bitwise combination of the events defined above that specify
	// what events are of interest to the call to Wait.
	Events waiter.EventMask

	// If ConsumeEvent is true, the Wait should consume the event such that it
	// cannot be returned by a future Wait. Note that if a task exit is
	// consumed in this way, in most cases the task will be reaped.
	ConsumeEvent bool

	// If BlockInterruptErr is not nil, Wait will block until either an event
	// is available or there are no tasks that could produce a waitable event;
	// if that blocking is interrupted, Wait returns BlockInterruptErr. If
	// BlockInterruptErr is nil, Wait will not block.
	BlockInterruptErr error
}
WaitOptions controls the behavior of Task.Wait.
type WaitResult ¶
type WaitResult struct {
	// Task is the task that reported the event.
	Task *Task

	// TID is the thread ID of Task in the PID namespace of the task that
	// called Wait (that is, the method receiver of the call to Task.Wait). TID
	// is provided because consuming exit waits cause the thread ID to be
	// deallocated.
	TID ThreadID

	// UID is the real UID of Task in the user namespace of the task that
	// called Wait.
	UID auth.UID

	// Event is exactly one of the events defined above.
	Event waiter.EventMask

	// Status is the wait status associated with the event.
	Status linux.WaitStatus
}
WaitResult contains information about a waited-for event.
Source Files ¶
- aio.go
- cgroup.go
- context.go
- fd_table.go
- fd_table_unsafe.go
- fs_context.go
- ipc_namespace.go
- kcov.go
- kcov_unsafe.go
- kernel.go
- kernel_opts.go
- kernel_state.go
- pending_signals.go
- pending_signals_state.go
- posixtimer.go
- ptrace.go
- ptrace_amd64.go
- rseq.go
- seccheck.go
- seccomp.go
- sessions.go
- signal.go
- signal_handlers.go
- syscalls.go
- syscalls_state.go
- syslog.go
- task.go
- task_acct.go
- task_block.go
- task_cgroup.go
- task_clone.go
- task_context.go
- task_exec.go
- task_exit.go
- task_futex.go
- task_identity.go
- task_image.go
- task_key.go
- task_log.go
- task_net.go
- task_run.go
- task_sched.go
- task_signals.go
- task_start.go
- task_stop.go
- task_syscall.go
- task_usermem.go
- task_work.go
- thread_group.go
- threads.go
- threads_impl.go
- timekeeper.go
- timekeeper_state.go
- tty.go
- uts_namespace.go
- vdso.go
- version.go
Directories ¶
Path | Synopsis |
---|---|
auth | Package auth implements an access control model that is a subset of Linux's. |
contexttest | Package contexttest provides a test context.Context which includes a dummy kernel pointing to a valid platform. |
fasync | Package fasync provides FIOASYNC related functionality. |
futex | Package futex provides an implementation of the futex interface as found in the Linux kernel. |
ipc | Package ipc defines functionality and utilities common to sysvipc mechanisms. |
memevent | Package memevent implements the memory usage events controller, which periodically emits events via the eventchannel. |
mq | Package mq provides an implementation for POSIX message queues. |
msgqueue | Package msgqueue implements System V message queues. |
pipe | Package pipe provides a pipe implementation. |
sched | Package sched implements scheduler related features. |
semaphore | Package semaphore implements System V semaphores. |
shm | Package shm implements sysv shared memory segments. |
time | Package time defines the Timer type, which provides a periodic timer that works by sampling a user-provided clock. |