syscall

package
v0.9.9 Latest
Published: Dec 31, 2024 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Constants

const (
	CLONE_VM             = 0x00000100 // set if VM shared between processes
	CLONE_FS             = 0x00000200 // set if fs info shared between processes
	CLONE_FILES          = 0x00000400 // set if open files shared between processes
	CLONE_SIGHAND        = 0x00000800 // set if signal handlers and blocked signals shared
	CLONE_PIDFD          = 0x00001000 // set if a pidfd should be placed in parent
	CLONE_PTRACE         = 0x00002000 // set if we want to let tracing continue on the child too
	CLONE_VFORK          = 0x00004000 // set if the parent wants the child to wake it up on mm_release
	CLONE_PARENT         = 0x00008000 // set if we want to have the same parent as the cloner
	CLONE_THREAD         = 0x00010000 // Same thread group?
	CLONE_NEWNS          = 0x00020000 // New mount namespace group
	CLONE_SYSVSEM        = 0x00040000 // share system V SEM_UNDO semantics
	CLONE_SETTLS         = 0x00080000 // create a new TLS for the child
	CLONE_PARENT_SETTID  = 0x00100000 // set the TID in the parent
	CLONE_CHILD_CLEARTID = 0x00200000 // clear the TID in the child
	CLONE_DETACHED       = 0x00400000 // Unused, ignored
	CLONE_UNTRACED       = 0x00800000 // set if the tracing process can't force CLONE_PTRACE on this clone
	CLONE_CHILD_SETTID   = 0x01000000 // set the TID in the child
	CLONE_NEWCGROUP      = 0x02000000 // New cgroup namespace
	CLONE_NEWUTS         = 0x04000000 // New utsname namespace
	CLONE_NEWIPC         = 0x08000000 // New ipc namespace
	CLONE_NEWUSER        = 0x10000000 // New user namespace
	CLONE_NEWPID         = 0x20000000 // New pid namespace
	CLONE_NEWNET         = 0x40000000 // New network namespace
	CLONE_IO             = 0x80000000 // Clone io context

	CLONE_CLEAR_SIGHAND = 0x100000000 // Clear any signal handler and reset to SIG_DFL.
	CLONE_INTO_CGROUP   = 0x200000000 // Clone into a specific cgroup given the right permissions.

	CLONE_NEWTIME = 0x00000080 // New time namespace
)

Linux unshare/clone/clone2/clone3 flags, architecture-independent, copied from linux/sched.h.
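The flags are bit masks, so several can be combined with a bitwise OR and tested with a bitwise AND. The following is a minimal sketch of that idea. It assumes the Go standard library `syscall` package on Linux as a stand-in for this package (the constants share the same names and values), and `describeNamespaces` is a hypothetical helper, not part of this API.

```go
package main

import (
	"fmt"
	"syscall"
)

// namespaceNames lists the namespace-related clone flags in a fixed order.
var namespaceNames = []struct {
	flag uintptr
	name string
}{
	{syscall.CLONE_NEWNS, "mount"},
	{syscall.CLONE_NEWUTS, "uts"},
	{syscall.CLONE_NEWIPC, "ipc"},
	{syscall.CLONE_NEWUSER, "user"},
	{syscall.CLONE_NEWPID, "pid"},
	{syscall.CLONE_NEWNET, "net"},
}

// describeNamespaces returns the names of the namespaces that a flag mask
// requests (hypothetical helper for illustration).
func describeNamespaces(flags uintptr) []string {
	var out []string
	for _, ns := range namespaceNames {
		if flags&ns.flag != 0 {
			out = append(out, ns.name)
		}
	}
	return out
}

func main() {
	// Combine two flags with a bitwise OR, as one would for Cloneflags.
	flags := uintptr(syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID)
	fmt.Printf("%#x %v\n", flags, describeNamespaces(flags))
}
```

Note that CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP do not fit in 32 bits, so they cannot be stored in a `uintptr` on 32-bit platforms.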

Variables

var (
	Stdin  = 0
	Stdout = 1
	Stderr = 2
)
var ForkLock sync.RWMutex

ForkLock is used to synchronize creation of new file descriptors with fork.

We want the child in a fork/exec sequence to inherit only the file descriptors we intend. To do that, we mark all file descriptors close-on-exec and then, in the child, explicitly unmark the ones we want the exec'ed program to keep. Unix doesn't make this easy: there is, in general, no way to allocate a new file descriptor close-on-exec. Instead you have to allocate the descriptor and then mark it close-on-exec. If a fork happens between those two events, the child's exec will inherit an unwanted file descriptor.

This lock solves that race: creating a new fd and marking it close-on-exec is done while holding ForkLock for reading, and the fork itself is done while holding ForkLock for writing. At least, that's the idea. There are some complications.

Some system calls that create new file descriptors can block for arbitrarily long times: open on a hung NFS server or named pipe, accept on a socket, and so on. We can't reasonably grab the lock across those operations.

It is worse to inherit some file descriptors than others. If a non-malicious child accidentally inherits an open ordinary file, that's not a big deal. On the other hand, if a long-lived child accidentally inherits the write end of a pipe, then the reader of that pipe will not see EOF until that child exits, potentially causing the parent program to hang. This is a common problem in threaded C programs that use popen.

Luckily, the file descriptors that are most important not to inherit are not the ones that can take an arbitrarily long time to create: pipe returns instantly, and the net package uses non-blocking I/O to accept on a listening socket. The rules for which file descriptor-creating operations use the ForkLock are as follows:

  • Pipe. Use pipe2 if available. Otherwise, does not block, so use ForkLock.
  • Socket. Use SOCK_CLOEXEC if available. Otherwise, does not block, so use ForkLock.
  • Open. Use O_CLOEXEC if available. Otherwise, may block, so live with the race.
  • Dup. Use F_DUPFD_CLOEXEC or dup3 if available. Otherwise, does not block, so use ForkLock.
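The fallback path in the Pipe rule (no pipe2 available) can be sketched as follows. This is an illustration only: it assumes the standard library `syscall` package on Linux as a stand-in for this package, and `cloexecPipe` and `isCloexec` are hypothetical helpers. `isCloexec` uses the raw fcntl system call, which this page does not list.

```go
package main

import (
	"fmt"
	"syscall"
)

// cloexecPipe creates a pipe and marks both ends close-on-exec while holding
// ForkLock for reading, so that no fork can run between the two steps.
func cloexecPipe() (int, int, error) {
	var p [2]int
	syscall.ForkLock.RLock()
	defer syscall.ForkLock.RUnlock()
	if err := syscall.Pipe(p[:]); err != nil {
		return -1, -1, err
	}
	syscall.CloseOnExec(p[0])
	syscall.CloseOnExec(p[1])
	return p[0], p[1], nil
}

// isCloexec reports whether FD_CLOEXEC is set on fd (Linux-specific check).
func isCloexec(fd int) bool {
	flags, _, errno := syscall.Syscall(syscall.SYS_FCNTL, uintptr(fd), syscall.F_GETFD, 0)
	return errno == 0 && flags&syscall.FD_CLOEXEC != 0
}

func main() {
	r, w, err := cloexecPipe()
	if err != nil {
		fmt.Println("pipe:", err)
		return
	}
	defer syscall.Close(r)
	defer syscall.Close(w)
	fmt.Println("close-on-exec set:", isCloexec(r), isCloexec(w))
}
```

Where pipe2 is available, passing O_CLOEXEC to Pipe2 creates the descriptors atomically and the lock is not needed.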

Functions

func Clearenv added in v0.9.2

func Clearenv()

func Close added in v0.8.10

func Close(fd int) (err error)

func CloseOnExec added in v0.9.1

func CloseOnExec(fd int)

func Environ added in v0.9.2

func Environ() []string

func Exec added in v0.9.1

func Exec(argv0 string, argv []string, envv []string) (err error)

Exec invokes the execve(2) system call.

func Exit added in v0.9.1

func Exit(code int)

func Faccessat added in v0.9.1

func Faccessat(dirfd int, path string, mode uint32, flags int) (err error)

func ForkExec added in v0.9.1

func ForkExec(argv0 string, argv []string, attr *ProcAttr) (pid int, err error)

ForkExec combines fork and exec and takes care to be thread-safe.
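A typical use pairs ForkExec with Wait4 to collect the child's exit status. The sketch below assumes the standard library `syscall` package on Linux as a stand-in for this package and assumes `/bin/sh` exists on the system. `runShell` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"syscall"
)

// runShell starts "/bin/sh -c script" with ForkExec, waits for it with
// Wait4, and returns its exit status.
func runShell(script string) (int, error) {
	attr := &syscall.ProcAttr{
		Env:   syscall.Environ(),
		Files: []uintptr{0, 1, 2}, // child inherits stdin, stdout, stderr
	}
	pid, err := syscall.ForkExec("/bin/sh", []string{"sh", "-c", script}, attr)
	if err != nil {
		return -1, err
	}
	var ws syscall.WaitStatus
	for {
		// Retry if the wait is interrupted by a signal.
		_, err = syscall.Wait4(pid, &ws, 0, nil)
		if err != syscall.EINTR {
			break
		}
	}
	if err != nil {
		return -1, err
	}
	if !ws.Exited() {
		return -1, fmt.Errorf("child did not exit normally (status %#x)", uint32(ws))
	}
	return ws.ExitStatus(), nil
}

func main() {
	code, err := runShell("exit 3")
	fmt.Println(code, err)
}
```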

func Getcwd

func Getcwd(buf []byte) (n int, err error)

func Getenv added in v0.8.10

func Getenv(key string) (value string, found bool)
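The second return value of Getenv distinguishes a variable that is unset from one that is set to the empty string. A small sketch of that distinction, assuming the standard library `syscall` package as a stand-in for this package; the variable name and both helpers are hypothetical.

```go
package main

import (
	"fmt"
	"syscall"
)

// describeEnv reports how Getenv sees key: unset, set but empty, or set to a value.
func describeEnv(key string) string {
	value, found := syscall.Getenv(key)
	switch {
	case !found:
		return "unset"
	case value == "":
		return "set but empty"
	default:
		return "set to " + value
	}
}

// demo walks one variable through unset, empty, and non-empty states.
func demo() ([]string, error) {
	const key = "EXAMPLE_SYSCALL_VAR" // hypothetical variable name
	var out []string
	if err := syscall.Unsetenv(key); err != nil {
		return nil, err
	}
	out = append(out, describeEnv(key))
	if err := syscall.Setenv(key, ""); err != nil {
		return nil, err
	}
	out = append(out, describeEnv(key))
	if err := syscall.Setenv(key, "42"); err != nil {
		return nil, err
	}
	out = append(out, describeEnv(key))
	return out, syscall.Unsetenv(key)
}

func main() {
	fmt.Println(demo())
}
```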

func Getpagesize added in v0.9.1

func Getpagesize() int

func Getpid added in v0.8.10

func Getpid() (pid int)

func Getrlimit added in v0.9.2

func Getrlimit(which int, lim *Rlimit) (err error)
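As an illustration, the soft and hard limits on open file descriptors can be read like this. The sketch assumes the standard library `syscall` package on Linux as a stand-in for this package, where Rlimit has `Cur` and `Max` fields and the resource constant is `RLIMIT_NOFILE`. `openFileLimit` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limits on open file descriptors.
func openFileLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	return lim.Cur, lim.Max, nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```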

func Getwd

func Getwd() (string, error)

func Kill added in v0.8.10

func Kill(pid int, signum Signal) (err error)

func Lstat added in v0.9.1

func Lstat(path string, stat *Stat_t) (err error)

func Open added in v0.8.10

func Open(path string, mode int, perm uint32) (fd int, err error)

func Pipe added in v0.9.2

func Pipe(p []int) (err error)

func Pipe2 added in v0.9.3

func Pipe2(p []int, flags int) error

func Read added in v0.8.10

func Read(fd int, p []byte) (n int, err error)

func Seek added in v0.8.10

func Seek(fd int, offset int64, whence int) (newoffset int64, err error)
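Open, Read, Seek, and Close operate on raw file descriptors. The sketch below reads the start of a file twice to show that Seek resets the offset. It assumes the standard library `syscall` package on Linux as a stand-in for this package; `readTwice` and `demo` are hypothetical helpers, and the temporary file is created with package os only to have something to read.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// readTwice opens path read-only, reads up to n bytes, seeks back to the
// start, and reads again.
func readTwice(path string, n int) (string, string, error) {
	fd, err := syscall.Open(path, syscall.O_RDONLY|syscall.O_CLOEXEC, 0)
	if err != nil {
		return "", "", err
	}
	defer syscall.Close(fd)

	buf := make([]byte, n)
	m, err := syscall.Read(fd, buf)
	if err != nil {
		return "", "", err
	}
	first := string(buf[:m])

	// Move the file offset back to the beginning.
	if _, err = syscall.Seek(fd, 0, io.SeekStart); err != nil {
		return "", "", err
	}
	m, err = syscall.Read(fd, buf)
	if err != nil {
		return "", "", err
	}
	return first, string(buf[:m]), nil
}

// demo writes a temporary file and reads its first five bytes twice.
func demo() (string, string, error) {
	f, err := os.CreateTemp("", "syscall-example-*")
	if err != nil {
		return "", "", err
	}
	defer os.Remove(f.Name())
	if _, err = f.WriteString("hello world"); err != nil {
		f.Close()
		return "", "", err
	}
	if err = f.Close(); err != nil {
		return "", "", err
	}
	return readTwice(f.Name(), 5)
}

func main() {
	fmt.Println(demo())
}
```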

func SetNonblock added in v0.9.1

func SetNonblock(fd int, nonblocking bool) (err error)

func Setenv added in v0.9.2

func Setenv(key, value string) error

func Setrlimit added in v0.9.2

func Setrlimit(resource int, rlim *Rlimit) error

func StartProcess added in v0.9.1

func StartProcess(argv0 string, argv []string, attr *ProcAttr) (pid int, handle uintptr, err error)

StartProcess wraps ForkExec for package os.

func Stat added in v0.9.1

func Stat(path string, stat *Stat_t) (err error)
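Stat follows symbolic links while Lstat describes the link itself. A minimal sketch of the difference, assuming the standard library `syscall` package on Linux as a stand-in for this package (where `Stat_t.Mode` is a `uint32`); `kindOf` and `demo` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// kindOf classifies a Stat_t mode by its file-type bits.
func kindOf(mode uint32) string {
	switch mode & syscall.S_IFMT {
	case syscall.S_IFREG:
		return "regular file"
	case syscall.S_IFLNK:
		return "symlink"
	case syscall.S_IFDIR:
		return "directory"
	}
	return "other"
}

// demo creates a file and a symlink to it, then inspects the symlink with
// both Stat and Lstat.
func demo() (viaStat, viaLstat string, err error) {
	dir, err := os.MkdirTemp("", "syscall-example-*")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "target")
	link := filepath.Join(dir, "link")
	if err = os.WriteFile(target, []byte("x"), 0o600); err != nil {
		return "", "", err
	}
	if err = os.Symlink(target, link); err != nil {
		return "", "", err
	}

	var st, lst syscall.Stat_t
	if err = syscall.Stat(link, &st); err != nil {
		return "", "", err
	}
	if err = syscall.Lstat(link, &lst); err != nil {
		return "", "", err
	}
	return kindOf(st.Mode), kindOf(lst.Mode), nil
}

func main() {
	fmt.Println(demo())
}
```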

func Unsetenv added in v0.9.2

func Unsetenv(key string) error

func Wait4 added in v0.9.3

func Wait4(pid int, wstatus *WaitStatus, options int, rusage *syscall.Rusage) (wpid int, err error)

Types

type Credential added in v0.9.1

type Credential struct {
	Uid         uint32   // User ID.
	Gid         uint32   // Group ID.
	Groups      []uint32 // Supplementary group IDs.
	NoSetGroups bool     // If true, don't set supplementary groups
}

Credential holds user and group identities to be assumed by a child process started by StartProcess.

type Errno added in v0.8.10

type Errno uintptr

func (Errno) Error added in v0.8.10

func (e Errno) Error() string

func (Errno) Is added in v0.8.10

func (e Errno) Is(target error) bool

func (Errno) Temporary added in v0.8.10

func (e Errno) Temporary() bool

func (Errno) Timeout added in v0.8.10

func (e Errno) Timeout() bool
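The Is, Temporary, and Timeout methods let callers classify an Errno without comparing raw numbers. The sketch below assumes the standard library `syscall` package on Linux as a stand-in for this package, where `Errno.Is` matches sentinel errors such as `os.ErrNotExist`. Whether this package's Errno behaves identically is an assumption. `classify` and `examples` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// classify sorts an error into a coarse category using the Errno methods.
func classify(err error) string {
	var errno syscall.Errno
	if !errors.As(err, &errno) {
		return "not an Errno"
	}
	switch {
	case errors.Is(errno, os.ErrNotExist):
		return "not exist"
	case errno.Timeout():
		return "timeout"
	case errno.Temporary():
		return "temporary"
	default:
		return "other: " + errno.Error()
	}
}

// examples classifies a real failure from Open plus a few fixed Errno values.
func examples() []string {
	_, openErr := syscall.Open("/nonexistent/syscall-example", syscall.O_RDONLY, 0)
	return []string{
		classify(openErr),
		classify(syscall.EAGAIN),
		classify(syscall.EINTR),
		classify(syscall.EPERM),
	}
}

func main() {
	for _, s := range examples() {
		fmt.Println(s)
	}
}
```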

type ProcAttr added in v0.9.1

type ProcAttr struct {
	Dir   string    // Current working directory.
	Env   []string  // Environment.
	Files []uintptr // File descriptors.
	Sys   *SysProcAttr
}

ProcAttr holds attributes that will be applied to a new process started by StartProcess.

type Rlimit added in v0.9.2

type Rlimit syscall.Rlimit

type Signal added in v0.8.10

type Signal int

A Signal is a number describing a process signal. It implements the os.Signal interface.

func (Signal) Signal added in v0.8.10

func (s Signal) Signal()

func (Signal) String added in v0.8.10

func (s Signal) String() string
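Because Signal implements os.Signal, a value can be passed to os/signal and to Kill alike. The following sketch sends SIGUSR1 to the current process and receives it on a channel. It assumes the standard library `syscall` package on Linux as a stand-in for this package; `sendSelf` and `demo` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// sendSelf registers for sig, sends it to the current process with Kill,
// and returns the signal that arrives on the channel.
func sendSelf(sig syscall.Signal) (os.Signal, error) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, sig) // register before sending so the default action is not taken
	defer signal.Stop(ch)

	if err := syscall.Kill(syscall.Getpid(), sig); err != nil {
		return nil, err
	}
	select {
	case got := <-ch:
		return got, nil
	case <-time.After(2 * time.Second):
		return nil, errors.New("timed out waiting for signal")
	}
}

// demo reports whether the received signal equals the one sent.
func demo() (bool, error) {
	got, err := sendSelf(syscall.SIGUSR1)
	if err != nil {
		return false, err
	}
	return got == syscall.SIGUSR1, nil
}

func main() {
	got, err := sendSelf(syscall.SIGUSR1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("received:", got.String())
}
```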

type Stat_t added in v0.9.1

type Stat_t = syscall.Stat_t

type SysProcAttr added in v0.9.1

type SysProcAttr struct {
	Chroot     string      // Chroot.
	Credential *Credential // Credential.
	// Ptrace tells the child to call ptrace(PTRACE_TRACEME).
	// Call runtime.LockOSThread before starting a process with this set,
	// and don't call UnlockOSThread until done with PtraceSyscall calls.
	Ptrace bool
	Setsid bool // Create session.
	// Setpgid sets the process group ID of the child to Pgid,
	// or, if Pgid == 0, to the new child's process ID.
	Setpgid bool
	// Setctty sets the controlling terminal of the child to
	// file descriptor Ctty. Ctty must be a descriptor number
	// in the child process: an index into ProcAttr.Files.
	// This is only meaningful if Setsid is true.
	Setctty bool
	Noctty  bool // Detach fd 0 from controlling terminal
	Ctty    int  // Controlling TTY fd
	// Foreground places the child process group in the foreground.
	// This implies Setpgid. The Ctty field must be set to
	// the descriptor of the controlling TTY.
	// Unlike Setctty, in this case Ctty must be a descriptor
	// number in the parent process.
	Foreground bool
	Pgid       int // Child's process group ID if Setpgid.
	// Pdeathsig, if non-zero, is a signal that the kernel will send to
	// the child process when the creating thread dies. Note that the signal
	// is sent on thread termination, which may happen before process termination.
	// There are more details at https://go.dev/issue/27505.
	Pdeathsig    Signal
	Cloneflags   uintptr        // Flags for clone calls (Linux only)
	Unshareflags uintptr        // Flags for unshare calls (Linux only)
	UidMappings  []SysProcIDMap // User ID mappings for user namespaces.
	GidMappings  []SysProcIDMap // Group ID mappings for user namespaces.
	// GidMappingsEnableSetgroups enables the setgroups syscall.
	// If false, the setgroups syscall is disabled for the child process.
	// This parameter is a no-op if GidMappings == nil. Otherwise, for unprivileged
	// users it should be set to false for the mappings to work.
	GidMappingsEnableSetgroups bool
	AmbientCaps                []uintptr // Ambient capabilities (Linux only)
	UseCgroupFD                bool      // Whether to make use of the CgroupFD field.
	CgroupFD                   int       // File descriptor of a cgroup to put the new process into.
}

type SysProcIDMap added in v0.9.1

type SysProcIDMap struct {
	ContainerID int // Container ID.
	HostID      int // Host ID.
	Size        int // Size.
}

SysProcIDMap holds Container ID to Host ID mappings used for User Namespaces in Linux. See user_namespaces(7).
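To make the mapping fields concrete, here is a sketch that builds a SysProcAttr requesting a new user namespace in which the caller's IDs appear as root (ID 0). It only constructs and prints the value and does not start a process, since user namespaces may be unavailable on a given system. It assumes the standard library `syscall` package on Linux as a stand-in for this package; `userNamespaceAttr` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// userNamespaceAttr maps a single host uid/gid to ID 0 inside a new user
// namespace. Setgroups stays disabled, which unprivileged callers need for
// the gid mapping to be accepted.
func userNamespaceAttr(uid, gid int) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUSER,
		UidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: uid, Size: 1},
		},
		GidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: gid, Size: 1},
		},
		GidMappingsEnableSetgroups: false,
	}
}

func main() {
	attr := userNamespaceAttr(os.Getuid(), os.Getgid())
	fmt.Printf("flags=%#x uid=%+v gid=%+v\n", attr.Cloneflags, attr.UidMappings, attr.GidMappings)
}
```

The resulting value would be assigned to the Sys field of a ProcAttr before calling ForkExec or StartProcess.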

type Timespec added in v0.9.1

type Timespec syscall.Timespec

func (*Timespec) Nano added in v0.9.1

func (ts *Timespec) Nano() int64

Nano returns the time stored in ts as nanoseconds.

func (*Timespec) Unix added in v0.9.1

func (ts *Timespec) Unix() (sec int64, nsec int64)

Unix returns the time stored in ts as seconds plus nanoseconds.

type Timeval added in v0.9.1

type Timeval syscall.Timeval

func (*Timeval) Nano added in v0.9.1

func (tv *Timeval) Nano() int64

Nano returns the time stored in tv as nanoseconds.

func (*Timeval) Unix added in v0.9.1

func (tv *Timeval) Unix() (sec int64, nsec int64)

Unix returns the time stored in tv as seconds plus nanoseconds.
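A short worked example of the Nano and Unix conversions for both types. It assumes the standard library `syscall` package on Linux as a stand-in for this package, where Timespec has `Sec` and `Nsec` fields and Timeval has `Sec` and `Usec` fields. `demo` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"syscall"
)

// demo converts 1.5 seconds, stored as a Timespec and as a Timeval, with
// the Nano and Unix methods.
func demo() (tsNano, tvNano, tvSec, tvNsec int64) {
	ts := syscall.Timespec{Sec: 1, Nsec: 500_000_000} // 1 s + 500,000,000 ns
	tv := syscall.Timeval{Sec: 1, Usec: 500_000}      // 1 s + 500,000 µs
	tsNano = ts.Nano()
	tvNano = tv.Nano()
	tvSec, tvNsec = tv.Unix() // microseconds are scaled to nanoseconds
	return
}

func main() {
	tsNano, tvNano, tvSec, tvNsec := demo()
	fmt.Println(tsNano, tvNano, tvSec, tvNsec)
}
```

Both values represent the same instant, so both Nano results are 1,500,000,000, and Timeval's Unix reports the fractional part as 500,000,000 nanoseconds.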

type WaitStatus added in v0.9.3

type WaitStatus uint32

func (WaitStatus) Continued added in v0.9.3

func (w WaitStatus) Continued() bool

func (WaitStatus) CoreDump added in v0.9.3

func (w WaitStatus) CoreDump() bool

func (WaitStatus) ExitStatus added in v0.9.3

func (w WaitStatus) ExitStatus() int

func (WaitStatus) Exited added in v0.9.3

func (w WaitStatus) Exited() bool

func (WaitStatus) Signal added in v0.9.3

func (w WaitStatus) Signal() Signal

func (WaitStatus) Signaled added in v0.9.3

func (w WaitStatus) Signaled() bool

func (WaitStatus) StopSignal added in v0.9.3

func (w WaitStatus) StopSignal() Signal

func (WaitStatus) Stopped added in v0.9.3

func (w WaitStatus) Stopped() bool

func (WaitStatus) TrapCause added in v0.9.6

func (w WaitStatus) TrapCause() int
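The predicates above are usually checked in order to decode a status returned by Wait4. The sketch below assumes the standard library `syscall` package on Linux as a stand-in for this package, and the raw values in `main` rely on the Linux wait-status encoding (exit code in bits 8 to 15, terminating signal in the low 7 bits, core-dump flag 0x80). `describe` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"syscall"
)

// describe renders a WaitStatus as a human-readable string.
func describe(ws syscall.WaitStatus) string {
	switch {
	case ws.Exited():
		return fmt.Sprintf("exited with status %d", ws.ExitStatus())
	case ws.Signaled():
		s := fmt.Sprintf("killed by signal %d", int(ws.Signal()))
		if ws.CoreDump() {
			s += " (core dumped)"
		}
		return s
	case ws.Stopped():
		return fmt.Sprintf("stopped by signal %d", int(ws.StopSignal()))
	case ws.Continued():
		return "continued"
	}
	return "unknown"
}

func main() {
	// Hand-built statuses using the Linux encoding (an assumption).
	for _, ws := range []syscall.WaitStatus{3 << 8, 15, 11 | 0x80, 19<<8 | 0x7f, 0xffff} {
		fmt.Println(describe(ws))
	}
}
```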
