vfs

package
v0.0.0-...-ba09d25 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 29, 2021 License: Apache-2.0, MIT Imports: 28 Imported by: 0

README

The gVisor Virtual Filesystem

Implementation Notes

Reference Counting

Filesystem, Dentry, Mount, MountNamespace, and FileDescription are all reference-counted. Mount and MountNamespace are exclusively VFS-managed; when their reference count reaches zero, VFS releases their resources. Filesystem and FileDescription management is shared between VFS and filesystem implementations; when their reference count reaches zero, VFS notifies the implementation by calling FilesystemImpl.Release() or FileDescriptionImpl.Release() respectively and then releases VFS-owned resources. Dentries are exclusively managed by filesystem implementations; reference count changes are abstracted through DentryImpl, which should release resources when reference count reaches zero.

Filesystem references are held by:

  • Mount: Each referenced Mount holds a reference on the mounted Filesystem.

Dentry references are held by:

  • FileDescription: Each referenced FileDescription holds a reference on the Dentry through which it was opened, via FileDescription.vd.dentry.

  • Mount: Each referenced Mount holds a reference on its mount point and on the mounted filesystem root. The mount point is mutable (mount(MS_MOVE)).

Mount references are held by:

  • FileDescription: Each referenced FileDescription holds a reference on the Mount on which it was opened, via FileDescription.vd.mount.

  • Mount: Each referenced Mount holds a reference on its parent, which is the mount containing its mount point.

  • VirtualFilesystem: A reference is held on each Mount that has been connected to a mount point, but not yet umounted.

MountNamespace and FileDescription references are held by users of VFS. The expectation is that each kernel.Task holds a reference on its corresponding MountNamespace, and each file descriptor holds a reference on its represented FileDescription.

Notes:

  • Dentries do not hold a reference on their owning Filesystem. Instead, all uses of a Dentry occur in the context of a Mount, which holds a reference on the relevant Filesystem (see e.g. the VirtualDentry type). As a corollary, when releasing references on both a Dentry and its corresponding Mount, the Dentry's reference must be released first (because releasing the Mount's reference may release the last reference on the Filesystem, whose state may be required to release the Dentry reference).
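
The release-order rule in the note above can be illustrated with a minimal sketch. The types below (`filesystem`, `mount`, `dentry`, `virtualDentry`) are simplified stand-ins written for this example, not the real vfs types; the only point is that dropping the Dentry reference first keeps the Filesystem alive while the Dentry is released.

```go
package main

import (
	"errors"
	"fmt"
)

// filesystem is a hypothetical stand-in for a reference-counted Filesystem.
type filesystem struct {
	refs     int
	released bool
}

func (fs *filesystem) DecRef() {
	fs.refs--
	if fs.refs == 0 {
		fs.released = true
	}
}

// mount holds a reference on its filesystem, as described above.
type mount struct {
	fs   *filesystem
	refs int
}

func (m *mount) DecRef() {
	m.refs--
	if m.refs == 0 {
		m.fs.DecRef() // may drop the last Filesystem reference
	}
}

// dentry holds no reference on its filesystem but needs its state on release.
type dentry struct {
	fs   *filesystem
	refs int
}

func (d *dentry) DecRef() error {
	if d.fs.released {
		return errors.New("dentry released after its filesystem")
	}
	d.refs--
	return nil
}

// virtualDentry pairs a mount with a dentry (assumed shape for this sketch).
type virtualDentry struct {
	mnt *mount
	d   *dentry
}

// DecRef releases the dentry reference first, then the mount reference.
func (vd virtualDentry) DecRef() error {
	err := vd.d.DecRef()
	vd.mnt.DecRef()
	return err
}

func main() {
	fs := &filesystem{refs: 1}
	vd := virtualDentry{mnt: &mount{fs: fs, refs: 1}, d: &dentry{fs: fs, refs: 1}}
	fmt.Println(vd.DecRef(), fs.released)
}
```

Reversing the two calls in `virtualDentry.DecRef` would make `dentry.DecRef` observe a released filesystem in this sketch.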

The Inheritance Pattern

Filesystem, Dentry, and FileDescription are all concepts featuring both state that must be shared between VFS and filesystem implementations, and operations that are implementation-defined. To facilitate this, each of these three concepts follows the same pattern, shown below for Dentry:

// Dentry represents a node in a filesystem tree.
type Dentry struct {
  // VFS-required dentry state.
  parent *Dentry
  // ...

  // impl is the DentryImpl associated with this Dentry. impl is immutable.
  // This should be the last field in Dentry.
  impl DentryImpl
}

// Init must be called before first use of d.
func (d *Dentry) Init(impl DentryImpl) {
  d.impl = impl
}

// Impl returns the DentryImpl associated with d.
func (d *Dentry) Impl() DentryImpl {
  return d.impl
}

// DentryImpl contains implementation-specific details of a Dentry.
// Implementations of DentryImpl should contain their associated Dentry by
// value as their first field.
type DentryImpl interface {
  // VFS-required implementation-defined dentry operations.
  IncRef()
  // ...
}

This construction, which is essentially a type-safe analogue to Linux's container_of pattern, has the following properties:

  • VFS works almost exclusively with pointers to Dentry rather than DentryImpl interface objects, such as in the type of Dentry.parent. This avoids interface method calls (which are somewhat expensive to perform, and defeat inlining and escape analysis), reduces the size of VFS types (since an interface object is two pointers in size), and allows pointers to be loaded and stored atomically using sync/atomic. Implementation-defined behavior is accessed via Dentry.impl when required.

  • Filesystem implementations can access the implementation-defined state associated with objects of VFS types by type-asserting or type-switching (e.g. Dentry.Impl().(*myDentry)). Type assertions to a concrete type require only an equality comparison of the interface object's type pointer to a static constant, and are consequently very fast.

  • Filesystem implementations can access the VFS state associated with objects of implementation-defined types directly.

  • VFS and implementation-defined state for a given type occupy the same object, minimizing memory allocations and maximizing memory locality. impl is the last field in Dentry, and Dentry is the first field in DentryImpl implementations, for similar reasons: this tends to cause fetching of the Dentry.impl interface object to also fetch DentryImpl fields, either because they are in the same cache line or via next-line prefetching.
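
As a rough illustration of how a filesystem implementation might use this pattern, the sketch below redefines a reduced `Dentry` and `DentryImpl` locally so that it is self-contained. `myDentry`, `newMyDentry`, and `nameOf` are hypothetical names invented for the example, and the one-method interface is a simplification of the real one.

```go
package main

import "fmt"

// Dentry stands in for vfs.Dentry: VFS-owned state plus the impl field.
type Dentry struct {
	parent *Dentry
	impl   DentryImpl // last field, as recommended above
}

func (d *Dentry) Init(impl DentryImpl) { d.impl = impl }
func (d *Dentry) Impl() DentryImpl     { return d.impl }

// DentryImpl stands in for vfs.DentryImpl, reduced to a single method.
type DentryImpl interface {
	IncRef()
}

// myDentry is a hypothetical implementation-specific dentry. It contains
// its Dentry by value as the first field.
type myDentry struct {
	vfsd Dentry
	refs int64
	name string
}

func (d *myDentry) IncRef() { d.refs++ }

// newMyDentry allocates both VFS and implementation state in one object.
func newMyDentry(name string) *myDentry {
	d := &myDentry{refs: 1, name: name}
	d.vfsd.Init(d)
	return d
}

// nameOf recovers implementation state from a VFS pointer by type assertion.
func nameOf(vfsd *Dentry) string {
	return vfsd.Impl().(*myDentry).name
}

func main() {
	d := newMyDentry("etc")
	vfsd := &d.vfsd      // what VFS would hold
	vfsd.Impl().IncRef() // implementation-defined behavior via impl
	fmt.Println(nameOf(vfsd), d.refs) // prints "etc 2"
}
```

The implementation reaches VFS state directly through `d.vfsd`, while VFS reaches implementation state only through `Impl()`.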

Future Work

  • Most mount(2) features, and unmounting, are incomplete.

  • VFS1 filesystems are not directly compatible with VFS2. It may be possible to implement shims that implement vfs.FilesystemImpl for fs.MountNamespace, vfs.DentryImpl for fs.Dirent, and vfs.FileDescriptionImpl for fs.File, which may be adequate for filesystems that are not performance-critical (e.g. sysfs); however, it is not clear that this will be less effort than simply porting the filesystems in question. Practically speaking, the following filesystems will probably need to be ported or made compatible through a shim to evaluate filesystem performance on realistic workloads:

    • devfs/procfs/sysfs, which will realistically be necessary to execute most applications. (Note that procfs and sysfs do not support hard links, so they do not require the complexity of separate inode objects. Also note that Linux's /dev is actually a variant of tmpfs called devtmpfs.)

    • tmpfs. This should be relatively straightforward: copy/paste memfs, store regular file contents in pgalloc-allocated memory instead of []byte, and add support for file timestamps. (In fact, it probably makes more sense to convert memfs to tmpfs and not keep the former.)

    • A remote filesystem, either lisafs (if it is ready by the time that other benchmarking prerequisites are) or v9fs (aka 9P, aka gofers).

    • epoll files.

    Filesystems that will need to be ported before switching to VFS2, but can probably be skipped for early testing:

    • overlayfs, which is needed for (at least) synthetic mount points.

    • Support for host ttys.

    • timerfd files.

    Filesystems that can probably be dropped:

    • ashmem, which is far too incomplete to use.

    • binder, which is similarly far too incomplete to use.

  • Save/restore. For instance, it is unclear if the current implementation of the state package supports the inheritance pattern described above.

  • Many features that were previously implemented by VFS must now be implemented by individual filesystems (though, in most cases, this should consist of calls to hooks or libraries provided by vfs or other packages). This includes, but is not necessarily limited to:

    • Block and character device special files

    • Inotify

    • File locking

    • O_ASYNC

Documentation

Overview

Package vfs implements a virtual filesystem layer.

Lock order:

EpollInstance.interestMu

FileDescription.epollMu
  FilesystemImpl/FileDescriptionImpl locks
    VirtualFilesystem.mountMu
      Dentry.mu
        Locks acquired by FilesystemImpls between Prepare{Delete,Rename}Dentry and Commit{Delete,Rename*}Dentry
      VirtualFilesystem.filesystemsMu
    fdnotifier.notifier.mu
      EpollInstance.mu
        Locks acquired by FileDescriptionImpl.Readiness
    Inotify.mu
      Watches.mu
        Inotify.evMu

VirtualFilesystem.fsTypesMu

Locking Dentry.mu in multiple Dentries requires holding VirtualFilesystem.mountMu. Locking EpollInstance.interestMu in multiple EpollInstances requires holding epollCycleMu.

Index

Constants

const (
	// CtxMountNamespace is a Context.Value key for a MountNamespace.
	CtxMountNamespace contextID = iota

	// CtxRoot is a Context.Value key for a VFS root.
	CtxRoot
)
const FileCreationFlags = linux.O_CREAT | linux.O_EXCL | linux.O_NOCTTY | linux.O_TRUNC

FileCreationFlags are the set of flags passed to FileDescription.Init() but omitted from FileDescription.StatusFlags().
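
A minimal sketch of what masking with FileCreationFlags looks like. The numeric values are the Linux x86-64 open(2) flag values, copied locally so the example is standalone; `statusFlags` is a hypothetical helper, not the package's actual implementation.

```go
package main

import "fmt"

// Linux x86-64 open(2) flag values, duplicated here for a standalone sketch.
// Real code would use the constants from the linux package.
const (
	oWRONLY = 0x1
	oCREAT  = 0x40
	oEXCL   = 0x80
	oNOCTTY = 0x100
	oTRUNC  = 0x200
	oAPPEND = 0x400
)

const fileCreationFlags = oCREAT | oEXCL | oNOCTTY | oTRUNC

// statusFlags (hypothetical) drops the creation-only flags from the flags
// that were passed when the file was opened.
func statusFlags(openFlags uint32) uint32 {
	return openFlags &^ fileCreationFlags
}

func main() {
	flags := uint32(oWRONLY | oCREAT | oTRUNC | oAPPEND)
	fmt.Printf("%#x\n", statusFlags(flags)) // O_WRONLY|O_APPEND remain
}
```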

Variables

This section is empty.

Functions

func CanActAsOwner

func CanActAsOwner(creds *auth.Credentials, kuid auth.KUID) bool

CanActAsOwner returns true if creds can act as the owner of a file with the given owning UID, consistent with Linux's fs/inode.c:inode_owner_or_capable().

func CheckDeleteSticky

func CheckDeleteSticky(creds *auth.Credentials, parentMode linux.FileMode, parentKUID auth.KUID, childKUID auth.KUID, childKGID auth.KGID) error

CheckDeleteSticky checks whether the sticky bit is set on a directory with the given file mode, and if so, checks whether creds has permission to remove a file owned by childKUID from a directory with the given mode. CheckDeleteSticky is consistent with Linux's include/linux/fs.h:check_sticky().

func CheckLimit

func CheckLimit(ctx context.Context, offset, size int64) (int64, error)

CheckLimit enforces file size rlimits. It returns an error if the write operation must not proceed. Otherwise, it returns the maximum length that may be written without violating the limit.

func CheckSetStat

func CheckSetStat(ctx context.Context, creds *auth.Credentials, opts *SetStatOptions, mode linux.FileMode, kuid auth.KUID, kgid auth.KGID) error

CheckSetStat checks that creds has permission to change the metadata of a file with the given permissions, UID, and GID as specified by opts, subject to the rules of Linux's fs/attr.c:setattr_prepare().

func CheckXattrPermissions

func CheckXattrPermissions(creds *auth.Credentials, ats AccessTypes, mode linux.FileMode, kuid auth.KUID, name string) error

CheckXattrPermissions checks permissions for extended attribute access. This is analogous to fs/xattr.c:xattr_permission(). Some key differences:

  • Does not check for read-only filesystem property.
  • Does not check inode immutability or append only mode. In both cases EPERM must be returned by filesystem implementations.
  • Does not do inode permission checks. Filesystem implementations should handle inode permission checks as they may differ across implementations.

func ClearSUIDAndSGID

func ClearSUIDAndSGID(mode uint32) uint32

ClearSUIDAndSGID clears the setuid and/or setgid bits after a chown or write. Depending on the mode, neither bit, only the setuid bit, or both are cleared.
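
One way to read "neither bit, only the setuid bit, or both" is the convention Linux uses when stripping these bits: setuid is always cleared, while setgid is cleared only when group-execute is also set. The sketch below assumes that convention; `clearSUIDAndSGID` is a local reimplementation for illustration and may differ from the package's code.

```go
package main

import "fmt"

const (
	modeSetUID    = 0o4000 // S_ISUID
	modeSetGID    = 0o2000 // S_ISGID
	modeGroupExec = 0o0010 // S_IXGRP
)

// clearSUIDAndSGID is an assumed reimplementation: it always clears setuid,
// and clears setgid only if the group-execute bit is also set.
func clearSUIDAndSGID(mode uint32) uint32 {
	mode &^= modeSetUID
	if mode&(modeSetGID|modeGroupExec) == modeSetGID|modeGroupExec {
		mode &^= modeSetGID
	}
	return mode
}

func main() {
	fmt.Printf("%o\n", clearSUIDAndSGID(0o6755)) // prints 755: both cleared
	fmt.Printf("%o\n", clearSUIDAndSGID(0o6745)) // prints 2745: setgid kept
	fmt.Printf("%o\n", clearSUIDAndSGID(0o0644)) // prints 644: nothing to clear
}
```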

func CopyRegularFileData

func CopyRegularFileData(ctx context.Context, dstFD, srcFD *FileDescription) (int64, error)

CopyRegularFileData copies data from srcFD to dstFD until reading from srcFD returns EOF or an error. It returns the number of bytes copied.

func GenericCheckPermissions

func GenericCheckPermissions(creds *auth.Credentials, ats AccessTypes, mode linux.FileMode, kuid auth.KUID, kgid auth.KGID) error

GenericCheckPermissions checks that creds has the given access rights on a file with the given permissions, UID, and GID, subject to the rules of fs/namei.c:generic_permission().
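
The core of a generic_permission()-style check is choosing which permission triplet applies to the caller. The sketch below shows only that selection step, under strong simplifying assumptions: `creds` is a hypothetical two-field struct with no supplementary groups, and the capability overrides that the real check performs are omitted entirely.

```go
package main

import "fmt"

type AccessTypes uint16

const (
	MayExec  AccessTypes = 1
	MayWrite AccessTypes = 2
	MayRead  AccessTypes = 4
)

// creds is a hypothetical, heavily reduced stand-in for auth.Credentials.
type creds struct {
	uid, gid uint32
}

// checkPermissions picks the owner, group, or other bits of mode and tests
// them against ats. Capability-based overrides are intentionally left out.
func checkPermissions(c creds, ats AccessTypes, mode uint16, fileUID, fileGID uint32) bool {
	var shift uint // 0 selects the "other" bits
	switch {
	case c.uid == fileUID:
		shift = 6 // owner bits
	case c.gid == fileGID:
		shift = 3 // group bits
	}
	perms := AccessTypes(mode>>shift) & 0o7
	return perms&ats == ats
}

func main() {
	const mode = 0o640
	fmt.Println(checkPermissions(creds{1000, 1000}, MayRead|MayWrite, mode, 1000, 1000))
	fmt.Println(checkPermissions(creds{2000, 1000}, MayWrite, mode, 1000, 1000))
	fmt.Println(checkPermissions(creds{3000, 3000}, MayRead, mode, 1000, 1000))
}
```

Note that the owner's bits are used exclusively when the UID matches, even if the group or other bits would grant more access.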

func GenericConfigureMMap

func GenericConfigureMMap(fd *FileDescription, m memmap.Mappable, opts *memmap.MMapOpts) error

GenericConfigureMMap may be used by most implementations of FileDescriptionImpl.ConfigureMMap.

func GenericParseMountOptions

func GenericParseMountOptions(str string) map[string]string

GenericParseMountOptions parses a comma-separated list of options of the form "key" or "key=value", where neither key nor value contain commas, and returns it as a map. If str contains duplicate keys, then the last value wins. For example:

str = "key0=value0,key1,key2=value2,key0=value3" -> map{'key0':'value3','key1':'','key2':'value2'}

GenericParseMountOptions is not appropriate if values may contain commas, e.g. in the case of the mpol mount option for tmpfs(5).
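
The parsing rules described above can be sketched in a few lines. `parseMountOptions` is a local reimplementation for illustration only; skipping empty segments is an assumption of this sketch and is not stated in the documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMountOptions splits a comma-separated option string into a map.
// A bare "key" maps to the empty string; later duplicates overwrite earlier
// ones. Empty segments are skipped (an assumption of this sketch).
func parseMountOptions(str string) map[string]string {
	m := make(map[string]string)
	for _, opt := range strings.Split(str, ",") {
		if opt == "" {
			continue
		}
		kv := strings.SplitN(opt, "=", 2)
		if len(kv) == 2 {
			m[kv[0]] = kv[1]
		} else {
			m[kv[0]] = ""
		}
	}
	return m
}

func main() {
	opts := parseMountOptions("key0=value0,key1,key2=value2,key0=value3")
	fmt.Println(opts["key0"], opts["key2"], len(opts))
}
```

Because the split is on every comma, a value such as an mpol policy containing commas would be broken into several bogus keys, which is the limitation noted above.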

func GenericStatFS

func GenericStatFS(fsMagic uint64) linux.Statfs

GenericStatFS returns a statfs struct filled with the common fields for a general filesystem. This is analogous to Linux's fs/libfs.c:simple_statfs().

func HasCapabilityOnFile

func HasCapabilityOnFile(creds *auth.Credentials, cp linux.Capability, kuid auth.KUID, kgid auth.KGID) bool

HasCapabilityOnFile returns true if creds has the given capability with respect to a file with the given owning UID and GID, consistent with Linux's kernel/capability.c:capable_wrt_inode_uidgid().

func InotifyEventFromStatMask

func InotifyEventFromStatMask(mask uint32) uint32

InotifyEventFromStatMask generates the appropriate events for an operation that sets the stats specified in mask.

func InotifyRemoveChild

func InotifyRemoveChild(ctx context.Context, self, parent *Watches, name string)

InotifyRemoveChild sends the appropriate notifications to the watch sets of the child being removed and its parent. Note that unlike most pairs of parent/child notifications, the child is notified first in this case.

func InotifyRename

func InotifyRename(ctx context.Context, renamed, oldParent, newParent *Watches, oldName, newName string, isDir bool)

InotifyRename sends the appropriate notifications to the watch sets of the file being renamed and its old/new parents.

func MayLink

func MayLink(creds *auth.Credentials, mode linux.FileMode, kuid auth.KUID, kgid auth.KGID) error

MayLink determines whether creating a hard link to a file with the given mode, kuid, and kgid is permitted.

This corresponds to Linux's fs/namei.c:may_linkat.

func MayReadFileWithOpenFlags

func MayReadFileWithOpenFlags(flags uint32) bool

MayReadFileWithOpenFlags returns true if a file with the given open flags should be readable.

func MayWriteFileWithOpenFlags

func MayWriteFileWithOpenFlags(flags uint32) bool

MayWriteFileWithOpenFlags returns true if a file with the given open flags should be writable.

func WithMountNamespace

func WithMountNamespace(ctx context.Context, mntns *MountNamespace) context.Context

WithMountNamespace returns a copy of ctx with the given MountNamespace.

func WithRoot

func WithRoot(ctx context.Context, root VirtualDentry) context.Context

WithRoot returns a copy of ctx with the given root.

Types

type AccessTypes

type AccessTypes uint16

AccessTypes is a bitmask of Unix file permissions.

+stateify savable

const (
	MayExec  AccessTypes = 1
	MayWrite AccessTypes = 2
	MayRead  AccessTypes = 4
)

Bits in AccessTypes.

func AccessTypesForOpenFlags

func AccessTypesForOpenFlags(opts *OpenOptions) AccessTypes

AccessTypesForOpenFlags returns the access types required to open a file with the given OpenOptions.Flags. Note that this is NOT the same thing as the set of accesses permitted for the opened file:

- O_TRUNC causes MayWrite to be set in the returned AccessTypes (since it mutates the file), but does not permit writing to the open file description thereafter.

- "Linux reserves the special, nonstandard access mode 3 (binary 11) in flags to mean: check for read and write permission on the file and return a file descriptor that can't be used for reading or writing." - open(2). Thus AccessTypesForOpenFlags returns MayRead|MayWrite in this case.

Use May{Read,Write}FileWithOpenFlags() for these checks instead.
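
The distinction between "access required to open" and "access permitted afterwards" can be sketched as follows. The flag values are the Linux open(2) values, duplicated locally; `accessTypesForOpenFlags` and `mayWriteFileWithOpenFlags` are simplified local versions that take raw flags rather than *OpenOptions, and they are an approximation of the documented behavior rather than the package's code.

```go
package main

import "fmt"

type AccessTypes uint16

const (
	MayExec  AccessTypes = 1
	MayWrite AccessTypes = 2
	MayRead  AccessTypes = 4
)

// Linux open(2) access modes and O_TRUNC, duplicated for a standalone sketch.
const (
	oRDONLY  = 0x0
	oWRONLY  = 0x1
	oRDWR    = 0x2
	oACCMODE = 0x3
	oTRUNC   = 0x200
)

// accessTypesForOpenFlags returns the permissions needed to open a file.
func accessTypesForOpenFlags(flags uint32) AccessTypes {
	var ats AccessTypes
	switch flags & oACCMODE {
	case oRDONLY:
		ats = MayRead
	case oWRONLY:
		ats = MayWrite
	default: // O_RDWR, or the nonstandard access mode 3
		ats = MayRead | MayWrite
	}
	if flags&oTRUNC != 0 {
		ats |= MayWrite // truncation mutates the file
	}
	return ats
}

// mayWriteFileWithOpenFlags reports whether the opened file is writable.
func mayWriteFileWithOpenFlags(flags uint32) bool {
	switch flags & oACCMODE {
	case oWRONLY, oRDWR:
		return true
	}
	return false
}

func main() {
	flags := uint32(oRDONLY | oTRUNC)
	fmt.Println(accessTypesForOpenFlags(flags)&MayWrite != 0) // needed to open
	fmt.Println(mayWriteFileWithOpenFlags(flags))             // permitted after
}
```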

func (AccessTypes) MayExec

func (a AccessTypes) MayExec() bool

MayExec returns true if access allows exec.

func (AccessTypes) MayRead

func (a AccessTypes) MayRead() bool

MayRead returns true if access allows read.

func (AccessTypes) MayWrite

func (a AccessTypes) MayWrite() bool

MayWrite returns true if access allows write.

func (AccessTypes) OnlyRead

func (a AccessTypes) OnlyRead() bool

OnlyRead returns true if access _only_ allows read.

type BadLockFD

type BadLockFD struct{}

BadLockFD implements the Lock*/Unlock* portion of the FileDescriptionImpl interface, returning EBADF.

+stateify savable

func (BadLockFD) LockBSD

func (BadLockFD) LockBSD(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, block bool) error

LockBSD implements FileDescriptionImpl.LockBSD.

func (BadLockFD) LockPOSIX

func (BadLockFD) LockPOSIX(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, r fslock.LockRange, block bool) error

LockPOSIX implements FileDescriptionImpl.LockPOSIX.

func (BadLockFD) SupportsLocks

func (BadLockFD) SupportsLocks() bool

SupportsLocks implements FileDescriptionImpl.SupportsLocks.

func (BadLockFD) TestPOSIX

TestPOSIX implements FileDescriptionImpl.TestPOSIX.

func (BadLockFD) UnlockBSD

func (BadLockFD) UnlockBSD(ctx context.Context, uid fslock.UniqueID) error

UnlockBSD implements FileDescriptionImpl.UnlockBSD.

func (BadLockFD) UnlockPOSIX

func (BadLockFD) UnlockPOSIX(ctx context.Context, uid fslock.UniqueID, r fslock.LockRange) error

UnlockPOSIX implements FileDescriptionImpl.UnlockPOSIX.

type BoundEndpointOptions

type BoundEndpointOptions struct {
	// Addr is the path of the file whose socket endpoint is being retrieved.
	// It is generally irrelevant: most endpoints are stored at a dentry that
	// was created through a bind syscall, so the path can be stored on creation.
	// However, if the endpoint was created in FilesystemImpl.BoundEndpointAt(),
	// then we may not know what the original bind address was.
	//
	// For example, if connect(2) is called with address "foo" which corresponds
	// to a remote named socket in goferfs, we need to generate an endpoint wrapping
	// that file. In this case, we can use Addr to set the endpoint address to
	// "foo". Note that Addr is only a best-effort attempt--we still do not know
	// the exact address that was used on the remote fs to bind the socket (it
	// may have been "foo", "./foo", etc.).
	Addr string
}

BoundEndpointOptions contains options to VirtualFilesystem.BoundEndpointAt() and FilesystemImpl.BoundEndpointAt().

+stateify savable

type CompleteRestoreOptions

type CompleteRestoreOptions struct {
	// If ValidateFileSizes is true, filesystem implementations backed by
	// remote filesystems should verify that file sizes have not changed
	// between checkpoint and restore.
	ValidateFileSizes bool

	// If ValidateFileModificationTimestamps is true, filesystem
	// implementations backed by remote filesystems should validate that file
	// mtimes have not changed between checkpoint and restore.
	ValidateFileModificationTimestamps bool
}

CompleteRestoreOptions contains options to VirtualFilesystem.CompleteRestore() and FilesystemImplSaveRestoreExtension.CompleteRestore().

type Dentry

type Dentry struct {
	// contains filtered or unexported fields
}

Dentry represents a node in a Filesystem tree at which a file exists.

Dentries are reference-counted. Unless otherwise specified, all Dentry methods require that a reference is held.

Dentry is loosely analogous to Linux's struct dentry, but:

- VFS does not associate Dentries with inodes. gVisor interacts primarily with filesystems that are accessed through filesystem APIs (as opposed to raw block devices); many such APIs support only paths and file descriptors, and not inodes. Furthermore, when parties outside the scope of VFS can rename inodes on such filesystems, VFS generally cannot "follow" the rename, both due to synchronization issues and because it may not even be able to name the destination path; this implies that it would in fact be incorrect for Dentries to be associated with inodes on such filesystems. Consequently, operations that are inode operations in Linux are FilesystemImpl methods and/or FileDescriptionImpl methods in gVisor's VFS. Filesystems that do support inodes may store appropriate state in implementations of DentryImpl.

- VFS does not require that Dentries are instantiated for all paths accessed through VFS, only those that are tracked beyond the scope of a single Filesystem operation. This includes file descriptions, mount points, mount roots, process working directories, and chroots. This avoids instantiation of Dentries for operations on mutable remote filesystems that can't actually cache any state in the Dentry.

- VFS does not track filesystem structure (i.e. relationships between Dentries), since both the relevant state and synchronization are filesystem-specific.

- For the reasons above, VFS is not directly responsible for managing Dentry lifetime. Dentry reference counts only indicate the extent to which VFS requires Dentries to exist; Filesystems may elect to cache or discard Dentries with zero references.

+stateify savable

func (*Dentry) DecRef

func (d *Dentry) DecRef(ctx context.Context)

DecRef decrements d's reference count.

func (*Dentry) Impl

func (d *Dentry) Impl() DentryImpl

Impl returns the DentryImpl associated with d.

func (*Dentry) IncRef

func (d *Dentry) IncRef()

IncRef increments d's reference count.

func (*Dentry) Init

func (d *Dentry) Init(impl DentryImpl)

Init must be called before first use of d.

func (*Dentry) InotifyWithParent

func (d *Dentry) InotifyWithParent(ctx context.Context, events, cookie uint32, et EventType)

InotifyWithParent notifies all watches on the targets represented by d and its parent of events.

func (*Dentry) IsDead

func (d *Dentry) IsDead() bool

IsDead returns true if d has been deleted or invalidated by its owning filesystem.

func (*Dentry) OnZeroWatches

func (d *Dentry) OnZeroWatches(ctx context.Context)

OnZeroWatches performs cleanup tasks whenever the number of watches on a dentry drops to zero.

func (*Dentry) TryIncRef

func (d *Dentry) TryIncRef() bool

TryIncRef increments d's reference count and returns true. If d's reference count is zero, TryIncRef may instead do nothing and return false.
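
A common way to implement the TryIncRef contract is a compare-and-swap loop over an atomic counter. The sketch below shows one such approach; `refCount` is a hypothetical type, and this version always fails at zero, whereas the contract above only says an implementation may do so.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCount is a hypothetical atomic reference counter.
type refCount struct {
	refs int64
}

// IncRef requires that the caller already holds a reference.
func (r *refCount) IncRef() {
	atomic.AddInt64(&r.refs, 1)
}

// TryIncRef takes a reference only if the count is still positive, so it is
// safe to call without already holding a reference.
func (r *refCount) TryIncRef() bool {
	for {
		n := atomic.LoadInt64(&r.refs)
		if n <= 0 {
			return false
		}
		if atomic.CompareAndSwapInt64(&r.refs, n, n+1) {
			return true
		}
	}
}

// DecRef drops a reference and reports whether the count reached zero.
func (r *refCount) DecRef() bool {
	return atomic.AddInt64(&r.refs, -1) == 0
}

func main() {
	r := &refCount{refs: 1}
	fmt.Println(r.TryIncRef()) // true: count goes from 1 to 2
	r.DecRef()
	r.DecRef()
	fmt.Println(r.TryIncRef()) // false: count is zero
}
```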

func (*Dentry) Watches

func (d *Dentry) Watches() *Watches

Watches returns the set of inotify watches associated with d.

Watches will return nil if d belongs to a FilesystemImpl that does not support inotify.

type DentryImpl

type DentryImpl interface {
	// IncRef increments the Dentry's reference count. A Dentry with a non-zero
	// reference count must remain coherent with the state of the filesystem.
	IncRef()

	// TryIncRef increments the Dentry's reference count and returns true. If
	// the Dentry's reference count is zero, TryIncRef may do nothing and
	// return false. (It is also permitted to succeed if it can restore the
	// guarantee that the Dentry is coherent with the state of the filesystem.)
	//
	// TryIncRef does not require that a reference is held on the Dentry.
	TryIncRef() bool

	// DecRef decrements the Dentry's reference count.
	DecRef(ctx context.Context)

	// InotifyWithParent notifies all watches on the targets represented by this
	// dentry and its parent. The parent's watches are notified first, followed
	// by this dentry's.
	//
	// InotifyWithParent automatically adds the IN_ISDIR flag for dentries
	// representing directories.
	//
	// Note that the events may not actually propagate up to the user, depending
	// on the event masks.
	InotifyWithParent(ctx context.Context, events, cookie uint32, et EventType)

	// Watches returns the set of inotify watches for the file corresponding to
	// the Dentry. Dentries that are hard links to the same underlying file
	// share the same watches.
	//
	// Watches may return nil if the dentry belongs to a FilesystemImpl that
	// does not support inotify. If an implementation returns a non-nil watch
	// set, it must always return a non-nil watch set. Likewise, if an
	// implementation returns a nil watch set, it must always return a nil watch
	// set.
	//
	// The caller does not need to hold a reference on the dentry.
	Watches() *Watches

	// OnZeroWatches is called whenever the number of watches on a dentry drops
	// to zero. This is needed by some FilesystemImpls (e.g. gofer) to manage
	// dentry lifetime.
	//
	// The caller does not need to hold a reference on the dentry. OnZeroWatches
	// may acquire inotify locks, so to prevent deadlock, no inotify locks should
	// be held by the caller.
	OnZeroWatches(ctx context.Context)
}

DentryImpl contains implementation details for a Dentry. Implementations of DentryImpl should contain their associated Dentry by value as their first field.

+stateify savable

type DentryMetadataFileDescriptionImpl

type DentryMetadataFileDescriptionImpl struct{}

DentryMetadataFileDescriptionImpl may be embedded by implementations of FileDescriptionImpl for which FileDescriptionOptions.UseDentryMetadata is true to obtain implementations of Stat and SetStat that panic.

+stateify savable

func (DentryMetadataFileDescriptionImpl) SetStat

SetStat implements FileDescriptionImpl.SetStat.

func (DentryMetadataFileDescriptionImpl) Stat

Stat implements FileDescriptionImpl.Stat.

type Device

type Device interface {
	// Open returns a FileDescription representing this device.
	Open(ctx context.Context, mnt *Mount, d *Dentry, opts OpenOptions) (*FileDescription, error)
}

A Device backs device special files.

type DeviceKind

type DeviceKind uint32

DeviceKind indicates whether a device is a block or character device.

+stateify savable

const (
	// BlockDevice indicates a block device.
	BlockDevice DeviceKind = iota

	// CharDevice indicates a character device.
	CharDevice
)

func (DeviceKind) String

func (kind DeviceKind) String() string

String implements fmt.Stringer.String.

type DirectoryFileDescriptionDefaultImpl

type DirectoryFileDescriptionDefaultImpl struct{}

DirectoryFileDescriptionDefaultImpl may be embedded by implementations of FileDescriptionImpl that always represent directories to obtain implementations of non-directory I/O methods that return EISDIR.

+stateify savable

func (DirectoryFileDescriptionDefaultImpl) Allocate

func (DirectoryFileDescriptionDefaultImpl) Allocate(ctx context.Context, mode, offset, length uint64) error

Allocate implements FileDescriptionImpl.Allocate.

func (DirectoryFileDescriptionDefaultImpl) PRead

PRead implements FileDescriptionImpl.PRead.

func (DirectoryFileDescriptionDefaultImpl) PWrite

PWrite implements FileDescriptionImpl.PWrite.

func (DirectoryFileDescriptionDefaultImpl) Read

Read implements FileDescriptionImpl.Read.

func (DirectoryFileDescriptionDefaultImpl) Write

Write implements FileDescriptionImpl.Write.

type Dirent

type Dirent struct {
	// Name is the filename.
	Name string

	// Type is the file type, a linux.DT_* constant.
	Type uint8

	// Ino is the inode number.
	Ino uint64

	// NextOff is the offset of the *next* Dirent in the directory; that is,
	// FileDescription.Seek(NextOff, SEEK_SET) (as called by seekdir(3)) will
	// cause the next call to FileDescription.IterDirents() to yield the next
	// Dirent. (The offset of the first Dirent in a directory is always 0.)
	NextOff int64
}

Dirent holds the information contained in struct linux_dirent64.

+stateify savable
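
The NextOff convention can be illustrated with an in-memory directory. `iterDirents` below is a hypothetical stand-in for FileDescription.IterDirents that uses slice indices as offsets; real filesystems are free to use other offset schemes.

```go
package main

import "fmt"

// Dirent mirrors the struct documented above.
type Dirent struct {
	Name    string
	Type    uint8
	Ino     uint64
	NextOff int64
}

// iterDirents (hypothetical) yields the entries of an in-memory directory
// starting at offset off. Each entry records the offset of its successor.
func iterDirents(names []string, off int64) []Dirent {
	if off < 0 {
		off = 0
	}
	var out []Dirent
	for i := off; i < int64(len(names)); i++ {
		out = append(out, Dirent{
			Name:    names[i],
			Ino:     uint64(i + 1), // made-up inode numbers
			NextOff: i + 1,
		})
	}
	return out
}

func main() {
	names := []string{"a", "b", "c"}
	first := iterDirents(names, 0)[0]
	// Seeking to first.NextOff resumes iteration at the following entry.
	rest := iterDirents(names, first.NextOff)
	fmt.Println(first.Name, rest[0].Name, len(rest))
}
```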

type DynamicBytesFileDescriptionImpl

type DynamicBytesFileDescriptionImpl struct {
	// contains filtered or unexported fields
}

DynamicBytesFileDescriptionImpl may be embedded by implementations of FileDescriptionImpl that represent read-only regular files whose contents are backed by a bytes.Buffer that is regenerated when necessary, consistent with Linux's fs/seq_file.c:single_open().

If data additionally implements WritableDynamicBytesSource, writes are dispatched to the implementer. The source data is not automatically modified.

DynamicBytesFileDescriptionImpl.SetDataSource() must be called before first use.

+stateify savable

func (*DynamicBytesFileDescriptionImpl) PRead

PRead implements FileDescriptionImpl.PRead.

func (*DynamicBytesFileDescriptionImpl) PWrite

PWrite implements FileDescriptionImpl.PWrite.

func (*DynamicBytesFileDescriptionImpl) Read

Read implements FileDescriptionImpl.Read.

func (*DynamicBytesFileDescriptionImpl) Seek

func (fd *DynamicBytesFileDescriptionImpl) Seek(ctx context.Context, offset int64, whence int32) (int64, error)

Seek implements FileDescriptionImpl.Seek.

func (*DynamicBytesFileDescriptionImpl) SetDataSource

func (fd *DynamicBytesFileDescriptionImpl) SetDataSource(data DynamicBytesSource)

SetDataSource must be called exactly once on fd before first use.

func (*DynamicBytesFileDescriptionImpl) Write

Write implements FileDescriptionImpl.Write.

type DynamicBytesSource

type DynamicBytesSource interface {
	// Generate writes the file's contents to buf.
	Generate(ctx context.Context, buf *bytes.Buffer) error
}

DynamicBytesSource represents a data source for a DynamicBytesFileDescriptionImpl.

+stateify savable

type EpollInstance

type EpollInstance struct {
	FileDescriptionDefaultImpl
	DentryMetadataFileDescriptionImpl
	NoLockFD
	// contains filtered or unexported fields
}

EpollInstance represents an epoll instance, as described by epoll(7).

+stateify savable

func (*EpollInstance) AddInterest

func (ep *EpollInstance) AddInterest(file *FileDescription, num int32, event linux.EpollEvent) error

AddInterest implements the semantics of EPOLL_CTL_ADD.

Preconditions: A reference must be held on file.

func (*EpollInstance) DeleteInterest

func (ep *EpollInstance) DeleteInterest(file *FileDescription, num int32) error

DeleteInterest implements the semantics of EPOLL_CTL_DEL.

Preconditions: A reference must be held on file.

func (*EpollInstance) EventRegister

func (ep *EpollInstance) EventRegister(e *waiter.Entry) error

EventRegister implements waiter.Waitable.EventRegister.

func (*EpollInstance) EventUnregister

func (ep *EpollInstance) EventUnregister(e *waiter.Entry)

EventUnregister implements waiter.Waitable.EventUnregister.

func (*EpollInstance) ModifyInterest

func (ep *EpollInstance) ModifyInterest(file *FileDescription, num int32, event linux.EpollEvent) error

ModifyInterest implements the semantics of EPOLL_CTL_MOD.

Preconditions: A reference must be held on file.

func (*EpollInstance) ReadEvents

func (ep *EpollInstance) ReadEvents(events []linux.EpollEvent, maxEvents int) []linux.EpollEvent

ReadEvents appends up to maxEvents events to events and returns the updated slice of events.

func (*EpollInstance) Readiness

func (ep *EpollInstance) Readiness(mask waiter.EventMask) waiter.EventMask

Readiness implements waiter.Waitable.Readiness.

func (*EpollInstance) Release

func (ep *EpollInstance) Release(ctx context.Context)

Release implements FileDescriptionImpl.Release.

func (*EpollInstance) Seek

func (ep *EpollInstance) Seek(ctx context.Context, offset int64, whence int32) (int64, error)

Seek implements FileDescriptionImpl.Seek.

type ErrCorruption

type ErrCorruption struct {
	// Err is the wrapped error.
	Err error
}

ErrCorruption indicates a failed restore due to corruption of external file system state.

func (ErrCorruption) Error

func (e ErrCorruption) Error() string

Error returns a sensible description of the restore error.

type Event

type Event struct {
	// contains filtered or unexported fields
}

Event represents a struct inotify_event from Linux.

+stateify savable

func (*Event) CopyTo

func (e *Event) CopyTo(ctx context.Context, buf []byte, dst usermem.IOSequence) (int64, error)

CopyTo serializes this event to dst. buf is used as a scratch buffer to construct the output. We use a buffer allocated ahead of time for performance. buf must be at least inotifyEventBaseSize bytes.

type EventType

type EventType uint8

EventType defines different kinds of inotify events.

The way events are labelled appears somewhat arbitrary, but they must match Linux so that IN_EXCL_UNLINK behaves as it does in Linux.

+stateify savable

const (
	PathEvent  EventType = iota
	InodeEvent EventType = iota
)

PathEvent and InodeEvent correspond to FSNOTIFY_EVENT_PATH and FSNOTIFY_EVENT_INODE in Linux.

type FileAsync

type FileAsync interface {
	Register(w waiter.Waitable) error
	Unregister(w waiter.Waitable)
}

A FileAsync sends signals to its owner when w is ready for IO. This is only implemented by pkg/sentry/fasync:FileAsync, but we unfortunately need this interface to avoid circular dependencies.

type FileDescription

type FileDescription struct {
	FileDescriptionRefs
	// contains filtered or unexported fields
}

A FileDescription represents an open file description, which is the entity referred to by a file descriptor (POSIX.1-2017 3.258 "Open File Description").

FileDescriptions are reference-counted. Unless otherwise specified, all FileDescription methods require that a reference is held.

FileDescription is analogous to Linux's struct file.

+stateify savable

func NewInotifyFD

func NewInotifyFD(ctx context.Context, vfsObj *VirtualFilesystem, flags uint32) (*FileDescription, error)

NewInotifyFD constructs a new Inotify instance.

func (*FileDescription) Allocate

func (fd *FileDescription) Allocate(ctx context.Context, mode, offset, length uint64) error

Allocate grows the file represented by FileDescription to offset + length bytes.

func (*FileDescription) AsyncHandler

func (fd *FileDescription) AsyncHandler() FileAsync

AsyncHandler returns the FileAsync for fd.

func (*FileDescription) ComputeLockRange

func (fd *FileDescription) ComputeLockRange(ctx context.Context, start uint64, length uint64, whence int16) (lock.LockRange, error)

ComputeLockRange computes the range of a file lock based on the given values.

func (*FileDescription) ConfigureMMap

func (fd *FileDescription) ConfigureMMap(ctx context.Context, opts *memmap.MMapOpts) error

ConfigureMMap mutates opts to implement mmap(2) for the file represented by fd.

func (*FileDescription) DecRef

func (fd *FileDescription) DecRef(ctx context.Context)

DecRef decrements fd's reference count.
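The reference-counting contract (a release hook runs when the count reaches zero) can be sketched as follows. The `refCounted` type and its constructor are hypothetical and only illustrate the pattern; they are not the FileDescriptionRefs implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical reference-counted object. release stands in
// for the implementation's Release method.
type refCounted struct {
	refs    int64
	release func()
}

// newRefCounted returns an object holding one initial reference.
func newRefCounted(release func()) *refCounted {
	return &refCounted{refs: 1, release: release}
}

// IncRef takes an additional reference.
func (r *refCounted) IncRef() { atomic.AddInt64(&r.refs, 1) }

// DecRef drops a reference and calls release when the count reaches zero.
func (r *refCounted) DecRef() {
	n := atomic.AddInt64(&r.refs, -1)
	if n < 0 {
		panic("DecRef called without a reference")
	}
	if n == 0 {
		r.release()
	}
}

func main() {
	released := false
	fd := newRefCounted(func() { released = true })
	fd.IncRef()
	fd.DecRef()
	fmt.Println(released) // false: one reference remains
	fd.DecRef()
	fmt.Println(released) // true: last reference dropped
}
```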

func (*FileDescription) Dentry

func (fd *FileDescription) Dentry() *Dentry

Dentry returns the dentry at which fd was opened. It does not take a reference on the returned Dentry.

func (*FileDescription) DeviceID

func (fd *FileDescription) DeviceID() uint64

DeviceID implements memmap.MappingIdentity.DeviceID.

func (*FileDescription) EventRegister

func (fd *FileDescription) EventRegister(e *waiter.Entry) error

EventRegister implements waiter.Waitable.EventRegister.

It registers e for I/O readiness events in mask.

func (*FileDescription) EventUnregister

func (fd *FileDescription) EventUnregister(e *waiter.Entry)

EventUnregister implements waiter.Waitable.EventUnregister.

It unregisters e for I/O readiness events.

func (*FileDescription) GetXattr

func (fd *FileDescription) GetXattr(ctx context.Context, opts *GetXattrOptions) (string, error)

GetXattr returns the value associated with the given extended attribute for the file represented by fd.

If the size of the return value exceeds opts.Size, ERANGE may be returned (note that implementations are free to ignore opts.Size entirely and return without error). In all cases, if opts.Size is 0, the value should be returned without error, regardless of size.
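One way an implementation might apply the size rule is sketched below. The `getXattr` helper, the map-backed attribute store, and the `errRange` and `errNoData` values are hypothetical stand-ins (for ERANGE and for a missing attribute respectively). This sketch takes the strict option; as noted above, an implementation may also ignore the size entirely.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for errno values.
var (
	errRange  = errors.New("ERANGE")
	errNoData = errors.New("attribute not found")
)

// getXattr returns the value of name. A size of 0 disables the length check;
// otherwise a value longer than size yields errRange.
func getXattr(attrs map[string]string, name string, size uint64) (string, error) {
	value, ok := attrs[name]
	if !ok {
		return "", errNoData
	}
	if size != 0 && uint64(len(value)) > size {
		return "", errRange
	}
	return value, nil
}

func main() {
	attrs := map[string]string{"user.comment": "hello"}
	_, err := getXattr(attrs, "user.comment", 3)
	fmt.Println(err) // value is 5 bytes, limit is 3
	v, err := getXattr(attrs, "user.comment", 0)
	fmt.Println(v, err) // size 0 always returns the value
}
```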

func (*FileDescription) Impl

Impl returns the FileDescriptionImpl associated with fd.

func (*FileDescription) Init

func (fd *FileDescription) Init(impl FileDescriptionImpl, flags uint32, mnt *Mount, d *Dentry, opts *FileDescriptionOptions) error

Init must be called before first use of fd. If it succeeds, it takes references on mnt and d. flags is the initial file description flags, which is usually the full set of flags passed to open(2).

func (*FileDescription) InodeID

func (fd *FileDescription) InodeID() uint64

InodeID implements memmap.MappingIdentity.InodeID.

func (*FileDescription) Ioctl

Ioctl implements the ioctl(2) syscall.

func (*FileDescription) IsReadable

func (fd *FileDescription) IsReadable() bool

IsReadable returns true if fd was opened for reading.

func (*FileDescription) IsWritable

func (fd *FileDescription) IsWritable() bool

IsWritable returns true if fd was opened for writing.

func (*FileDescription) IterDirents

func (fd *FileDescription) IterDirents(ctx context.Context, cb IterDirentsCallback) error

IterDirents invokes cb on each entry in the directory represented by fd. If IterDirents has been called since the last call to Seek, it continues iteration from the end of the last call.
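The resume-after-interruption behavior can be sketched with a small in-memory directory. The `dirFD` type, its `Rewind` method (standing in for a Seek to offset 0), and `errBufferFull` are hypothetical. The sketch also assumes that the offset advances only past entries the callback accepted, which the text above does not state.

```go
package main

import (
	"errors"
	"fmt"
)

// errBufferFull is a hypothetical error a callback returns to stop iteration.
var errBufferFull = errors.New("buffer full")

// dirFD is a hypothetical directory file description with an iteration offset.
type dirFD struct {
	names []string
	off   int
}

// IterDirents invokes cb on each remaining entry and stops at the first
// error. The offset advances only past entries that cb accepted.
func (d *dirFD) IterDirents(cb func(name string) error) error {
	for d.off < len(d.names) {
		if err := cb(d.names[d.off]); err != nil {
			return err
		}
		d.off++
	}
	return nil
}

// Rewind resets iteration to the first entry.
func (d *dirFD) Rewind() { d.off = 0 }

func main() {
	d := &dirFD{names: []string{".", "..", "a", "b"}}
	var seen []string
	limit := 2
	cb := func(name string) error {
		if len(seen) >= limit {
			return errBufferFull
		}
		seen = append(seen, name)
		return nil
	}
	_ = d.IterDirents(cb) // stops after two entries
	limit = 4
	_ = d.IterDirents(cb) // resumes at "a"
	fmt.Println(seen)     // prints "[. .. a b]"
}
```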

func (*FileDescription) ListXattr

func (fd *FileDescription) ListXattr(ctx context.Context, size uint64) ([]string, error)

ListXattr returns all extended attribute names for the file represented by fd.

If the size of the list (including a NUL terminating byte after every entry) would exceed size, ERANGE may be returned (note that implementations are free to ignore size entirely and return without error). In all cases, if size is 0, the list should be returned without error, regardless of the list's length.

func (*FileDescription) LockBSD

func (fd *FileDescription) LockBSD(ctx context.Context, ownerPID int32, lockType lock.LockType, block bool) error

LockBSD tries to acquire a BSD-style advisory file lock.

func (*FileDescription) LockPOSIX

func (fd *FileDescription) LockPOSIX(ctx context.Context, uid lock.UniqueID, ownerPID int32, t lock.LockType, r lock.LockRange, block bool) error

LockPOSIX tries to acquire a POSIX-style file range lock.

func (*FileDescription) MappedName

func (fd *FileDescription) MappedName(ctx context.Context) string

MappedName implements memmap.MappingIdentity.MappedName.

func (*FileDescription) Mount

func (fd *FileDescription) Mount() *Mount

Mount returns the mount on which fd was opened. It does not take a reference on the returned Mount.

func (*FileDescription) Msync

Msync implements memmap.MappingIdentity.Msync.

func (*FileDescription) OnClose

func (fd *FileDescription) OnClose(ctx context.Context) error

OnClose is called when a file descriptor representing the FileDescription is closed. Returning a non-nil error should not prevent the file descriptor from being closed.

func (*FileDescription) Options

Options returns the options passed to fd.Init().

func (*FileDescription) PRead

func (fd *FileDescription) PRead(ctx context.Context, dst usermem.IOSequence, offset int64, opts ReadOptions) (int64, error)

PRead reads from the file represented by fd into dst, starting at the given offset, and returns the number of bytes read. PRead is permitted to return partial reads with a nil error.
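Because a partial read with a nil error is permitted, callers that need a full buffer have to loop. The sketch below shows one such loop over a hypothetical `preadFunc` type; `readFullAt` and `chunkedSource` are illustrative helpers and not part of this package. It assumes that a zero-byte read with a nil error means end of file.

```go
package main

import (
	"fmt"
	"io"
)

// preadFunc is a hypothetical positional read: it may return fewer bytes
// than requested together with a nil error.
type preadFunc func(dst []byte, offset int64) (int64, error)

// readFullAt calls pread until dst is full, the source is exhausted, or an
// error occurs. Assumption: a zero-byte read with a nil error means EOF.
func readFullAt(pread preadFunc, dst []byte, offset int64) (int64, error) {
	var total int64
	for total < int64(len(dst)) {
		n, err := pread(dst[total:], offset+total)
		total += n
		if err != nil {
			return total, err
		}
		if n == 0 {
			return total, io.EOF
		}
	}
	return total, nil
}

// chunkedSource returns a preadFunc over data that yields at most chunk
// bytes per call, to simulate partial reads.
func chunkedSource(data []byte, chunk int) preadFunc {
	return func(dst []byte, offset int64) (int64, error) {
		if offset >= int64(len(data)) {
			return 0, nil
		}
		end := offset + int64(chunk)
		if end > int64(len(data)) {
			end = int64(len(data))
		}
		return int64(copy(dst, data[offset:end])), nil
	}
}

func main() {
	src := chunkedSource([]byte("hello world"), 4)
	buf := make([]byte, 8)
	n, err := readFullAt(src, buf, 2)
	fmt.Println(n, err, string(buf[:n]))
}
```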

func (*FileDescription) PWrite

func (fd *FileDescription) PWrite(ctx context.Context, src usermem.IOSequence, offset int64, opts WriteOptions) (int64, error)

PWrite writes src to the file represented by fd, starting at the given offset, and returns the number of bytes written. PWrite is permitted to return partial writes with a nil error.

func (*FileDescription) Read

Read is similar to PRead, but does not specify an offset.

func (*FileDescription) Readiness

func (fd *FileDescription) Readiness(mask waiter.EventMask) waiter.EventMask

Readiness implements waiter.Waitable.Readiness.

It returns fd's I/O readiness.

func (*FileDescription) RemoveXattr

func (fd *FileDescription) RemoveXattr(ctx context.Context, name string) error

RemoveXattr removes the given extended attribute from the file represented by fd.

func (*FileDescription) Seek

func (fd *FileDescription) Seek(ctx context.Context, offset int64, whence int32) (int64, error)

Seek changes fd's offset (assuming one exists) and returns its new value.

func (*FileDescription) SetAsyncHandler

func (fd *FileDescription) SetAsyncHandler(newHandler func() FileAsync) (FileAsync, error)

SetAsyncHandler sets fd.asyncHandler if it has not been set before and returns it.

func (*FileDescription) SetStat

func (fd *FileDescription) SetStat(ctx context.Context, opts SetStatOptions) error

SetStat updates metadata for the file represented by fd.

func (*FileDescription) SetStatusFlags

func (fd *FileDescription) SetStatusFlags(ctx context.Context, creds *auth.Credentials, flags uint32) error

SetStatusFlags sets file description status flags, as for fcntl(F_SETFL).

func (*FileDescription) SetXattr

func (fd *FileDescription) SetXattr(ctx context.Context, opts *SetXattrOptions) error

SetXattr changes the value associated with the given extended attribute for the file represented by fd.

func (*FileDescription) Stat

func (fd *FileDescription) Stat(ctx context.Context, opts StatOptions) (linux.Statx, error)

Stat returns metadata for the file represented by fd.

func (*FileDescription) StatFS

func (fd *FileDescription) StatFS(ctx context.Context) (linux.Statfs, error)

StatFS returns metadata for the filesystem containing the file represented by fd.

func (*FileDescription) StatusFlags

func (fd *FileDescription) StatusFlags() uint32

StatusFlags returns file description status flags, as for fcntl(F_GETFL).

func (*FileDescription) SupportsLocks

func (fd *FileDescription) SupportsLocks() bool

SupportsLocks indicates whether file locks are supported.

func (*FileDescription) Sync

func (fd *FileDescription) Sync(ctx context.Context) error

Sync has the semantics of fsync(2).

func (*FileDescription) SyncFS

func (fd *FileDescription) SyncFS(ctx context.Context) error

SyncFS instructs the filesystem containing fd to execute the semantics of syncfs(2).

func (*FileDescription) TestPOSIX

TestPOSIX returns information about whether the specified lock can be held.

func (*FileDescription) UnlockBSD

func (fd *FileDescription) UnlockBSD(ctx context.Context) error

UnlockBSD releases a BSD-style advisory file lock.

func (*FileDescription) UnlockPOSIX

func (fd *FileDescription) UnlockPOSIX(ctx context.Context, uid lock.UniqueID, r lock.LockRange) error

UnlockPOSIX releases a POSIX-style file range lock.

func (*FileDescription) VirtualDentry

func (fd *FileDescription) VirtualDentry() VirtualDentry

VirtualDentry returns the location at which fd was opened. It does not take a reference on the returned VirtualDentry.

func (*FileDescription) Write

Write is similar to PWrite, but does not specify an offset.

type FileDescriptionDefaultImpl

type FileDescriptionDefaultImpl struct{}

FileDescriptionDefaultImpl may be embedded by implementations of FileDescriptionImpl to obtain implementations of many FileDescriptionImpl methods with default behavior analogous to Linux's.
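The embedding pattern can be sketched with a reduced, hypothetical interface. `fdImpl`, `defaultImpl`, `readOnlyFD`, and `errUnsupported` are stand-ins invented for this example, and the error returned by the defaults here is a placeholder, not the errno the real defaults return.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a hypothetical placeholder for the default failure.
var errUnsupported = errors.New("operation not supported")

// fdImpl is a reduced, hypothetical version of FileDescriptionImpl.
type fdImpl interface {
	Read(dst []byte) (int, error)
	Write(src []byte) (int, error)
}

// defaultImpl plays the role of FileDescriptionDefaultImpl: every method
// fails as if the operation were not provided.
type defaultImpl struct{}

func (defaultImpl) Read(dst []byte) (int, error)  { return 0, errUnsupported }
func (defaultImpl) Write(src []byte) (int, error) { return 0, errUnsupported }

// readOnlyFD embeds defaultImpl and overrides only Read; Write is promoted
// from the embedded struct.
type readOnlyFD struct {
	defaultImpl
	data []byte
}

func (fd *readOnlyFD) Read(dst []byte) (int, error) { return copy(dst, fd.data), nil }

func main() {
	var impl fdImpl = &readOnlyFD{data: []byte("hi")}
	buf := make([]byte, 2)
	n, _ := impl.Read(buf)
	_, err := impl.Write([]byte("x"))
	fmt.Println(n, err)
}
```

Embedding keeps each implementation limited to the operations it actually supports while still satisfying the full interface.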

+stateify savable

func (FileDescriptionDefaultImpl) Allocate

func (FileDescriptionDefaultImpl) Allocate(ctx context.Context, mode, offset, length uint64) error

Allocate implements FileDescriptionImpl.Allocate analogously to fallocate called on an invalid type of file in Linux.

Note that directories can rely on this implementation even though they should technically return EISDIR. Allocate should never be called for a directory, because it requires a writable fd.

func (FileDescriptionDefaultImpl) ConfigureMMap

func (FileDescriptionDefaultImpl) ConfigureMMap(ctx context.Context, opts *memmap.MMapOpts) error

ConfigureMMap implements FileDescriptionImpl.ConfigureMMap analogously to file_operations::mmap == NULL in Linux.

func (FileDescriptionDefaultImpl) EventRegister

func (FileDescriptionDefaultImpl) EventRegister(e *waiter.Entry) error

EventRegister implements waiter.Waitable.EventRegister analogously to file_operations::poll == NULL in Linux.

func (FileDescriptionDefaultImpl) EventUnregister

func (FileDescriptionDefaultImpl) EventUnregister(e *waiter.Entry)

EventUnregister implements waiter.Waitable.EventUnregister analogously to file_operations::poll == NULL in Linux.

func (FileDescriptionDefaultImpl) GetXattr

GetXattr implements FileDescriptionImpl.GetXattr analogously to inode::i_opflags & IOP_XATTR == 0 in Linux.

func (FileDescriptionDefaultImpl) Ioctl

Ioctl implements FileDescriptionImpl.Ioctl analogously to file_operations::unlocked_ioctl == NULL in Linux.

func (FileDescriptionDefaultImpl) IterDirents

IterDirents implements FileDescriptionImpl.IterDirents analogously to file_operations::iterate == file_operations::iterate_shared == NULL in Linux.

func (FileDescriptionDefaultImpl) ListXattr

func (FileDescriptionDefaultImpl) ListXattr(ctx context.Context, size uint64) ([]string, error)

ListXattr implements FileDescriptionImpl.ListXattr analogously to inode_operations::listxattr == NULL in Linux.

func (FileDescriptionDefaultImpl) OnClose

OnClose implements FileDescriptionImpl.OnClose analogously to file_operations::flush == NULL in Linux.

func (FileDescriptionDefaultImpl) PRead

PRead implements FileDescriptionImpl.PRead analogously to file_operations::read == file_operations::read_iter == NULL in Linux.

func (FileDescriptionDefaultImpl) PWrite

PWrite implements FileDescriptionImpl.PWrite analogously to file_operations::write == file_operations::write_iter == NULL in Linux.

func (FileDescriptionDefaultImpl) Read

Read implements FileDescriptionImpl.Read analogously to file_operations::read == file_operations::read_iter == NULL in Linux.

func (FileDescriptionDefaultImpl) Readiness

Readiness implements waiter.Waitable.Readiness analogously to file_operations::poll == NULL in Linux.

func (FileDescriptionDefaultImpl) RemoveXattr

func (FileDescriptionDefaultImpl) RemoveXattr(ctx context.Context, name string) error

RemoveXattr implements FileDescriptionImpl.RemoveXattr analogously to inode::i_opflags & IOP_XATTR == 0 in Linux.

func (FileDescriptionDefaultImpl) Seek

func (FileDescriptionDefaultImpl) Seek(ctx context.Context, offset int64, whence int32) (int64, error)

Seek implements FileDescriptionImpl.Seek analogously to file_operations::llseek == NULL in Linux.

func (FileDescriptionDefaultImpl) SetXattr

SetXattr implements FileDescriptionImpl.SetXattr analogously to inode::i_opflags & IOP_XATTR == 0 in Linux.

func (FileDescriptionDefaultImpl) StatFS

StatFS implements FileDescriptionImpl.StatFS analogously to super_operations::statfs == NULL in Linux.

func (FileDescriptionDefaultImpl) Sync

Sync implements FileDescriptionImpl.Sync analogously to file_operations::fsync == NULL in Linux.

func (FileDescriptionDefaultImpl) Write

Write implements FileDescriptionImpl.Write analogously to file_operations::write == file_operations::write_iter == NULL in Linux.

type FileDescriptionImpl

type FileDescriptionImpl interface {
	// Release is called when the associated FileDescription reaches zero
	// references.
	Release(ctx context.Context)

	// OnClose is called when a file descriptor representing the
	// FileDescription is closed. Note that returning a non-nil error does not
	// prevent the file descriptor from being closed.
	OnClose(ctx context.Context) error

	// Stat returns metadata for the file represented by the FileDescription.
	Stat(ctx context.Context, opts StatOptions) (linux.Statx, error)

	// SetStat updates metadata for the file represented by the
	// FileDescription. Implementations are responsible for checking if the
	// operation can be performed (see vfs.CheckSetStat() for common checks).
	SetStat(ctx context.Context, opts SetStatOptions) error

	// StatFS returns metadata for the filesystem containing the file
	// represented by the FileDescription.
	StatFS(ctx context.Context) (linux.Statfs, error)

	// Allocate grows the file to offset + length bytes.
	// Only mode == 0 is supported currently.
	//
	// Allocate should return EISDIR on directories, ESPIPE on pipes, and ENODEV on
	// other files where it is not supported.
	//
	// Preconditions: The FileDescription was opened for writing.
	Allocate(ctx context.Context, mode, offset, length uint64) error

	// waiter.Waitable methods may be used to poll for I/O events.
	waiter.Waitable

	// PRead reads from the file into dst, starting at the given offset, and
	// returns the number of bytes read. PRead is permitted to return partial
	// reads with a nil error.
	//
	// Errors:
	//
	// - If opts.Flags specifies unsupported options, PRead returns EOPNOTSUPP.
	//
	// Preconditions:
	// * The FileDescription was opened for reading.
	// * FileDescriptionOptions.DenyPRead == false.
	PRead(ctx context.Context, dst usermem.IOSequence, offset int64, opts ReadOptions) (int64, error)

	// Read is similar to PRead, but does not specify an offset.
	//
	// For files with an implicit FileDescription offset (e.g. regular files),
	// Read begins at the FileDescription offset, and advances the offset by
	// the number of bytes read; note that POSIX 2.9.7 "Thread Interactions
	// with Regular File Operations" requires that all operations that may
	// mutate the FileDescription offset are serialized.
	//
	// Errors:
	//
	// - If opts.Flags specifies unsupported options, Read returns EOPNOTSUPP.
	//
	// Preconditions: The FileDescription was opened for reading.
	Read(ctx context.Context, dst usermem.IOSequence, opts ReadOptions) (int64, error)

	// PWrite writes src to the file, starting at the given offset, and returns
	// the number of bytes written. PWrite is permitted to return partial
	// writes with a nil error.
	//
	// As in Linux (but not POSIX), if O_APPEND is in effect for the
	// FileDescription, PWrite should ignore the offset and append data to the
	// end of the file.
	//
	// Errors:
	//
	// - If opts.Flags specifies unsupported options, PWrite returns
	// EOPNOTSUPP.
	//
	// Preconditions:
	// * The FileDescription was opened for writing.
	// * FileDescriptionOptions.DenyPWrite == false.
	PWrite(ctx context.Context, src usermem.IOSequence, offset int64, opts WriteOptions) (int64, error)

	// Write is similar to PWrite, but does not specify an offset, which is
	// implied as for Read.
	//
	// Write is a FileDescriptionImpl method, instead of a wrapper around
	// PWrite that uses a FileDescription offset, to make it possible for
	// remote filesystems to implement O_APPEND correctly (i.e. atomically with
	// respect to writers outside the scope of VFS).
	//
	// Errors:
	//
	// - If opts.Flags specifies unsupported options, Write returns EOPNOTSUPP.
	//
	// Preconditions: The FileDescription was opened for writing.
	Write(ctx context.Context, src usermem.IOSequence, opts WriteOptions) (int64, error)

	// IterDirents invokes cb on each entry in the directory represented by the
	// FileDescription. If IterDirents has been called since the last call to
	// Seek, it continues iteration from the end of the last call.
	IterDirents(ctx context.Context, cb IterDirentsCallback) error

	// Seek changes the FileDescription offset (assuming one exists) and
	// returns its new value.
	//
	// For directories, if whence == SEEK_SET and offset == 0, the caller is
	// rewinddir(), such that Seek "shall also cause the directory stream to
	// refer to the current state of the corresponding directory" -
	// POSIX.1-2017.
	Seek(ctx context.Context, offset int64, whence int32) (int64, error)

	// Sync requests that cached state associated with the file represented by
	// the FileDescription is synchronized with persistent storage, and blocks
	// until this is complete.
	Sync(ctx context.Context) error

	// ConfigureMMap mutates opts to implement mmap(2) for the file. Most
	// implementations that support memory mapping can call
	// GenericConfigureMMap with the appropriate memmap.Mappable.
	ConfigureMMap(ctx context.Context, opts *memmap.MMapOpts) error

	// Ioctl implements the ioctl(2) syscall.
	Ioctl(ctx context.Context, uio usermem.IO, args arch.SyscallArguments) (uintptr, error)

	// ListXattr returns all extended attribute names for the file.
	ListXattr(ctx context.Context, size uint64) ([]string, error)

	// GetXattr returns the value associated with the given extended attribute
	// for the file.
	GetXattr(ctx context.Context, opts GetXattrOptions) (string, error)

	// SetXattr changes the value associated with the given extended attribute
	// for the file.
	SetXattr(ctx context.Context, opts SetXattrOptions) error

	// RemoveXattr removes the given extended attribute from the file.
	RemoveXattr(ctx context.Context, name string) error

	// SupportsLocks indicates whether file locks are supported.
	SupportsLocks() bool

	// LockBSD tries to acquire a BSD-style advisory file lock.
	LockBSD(ctx context.Context, uid lock.UniqueID, ownerPID int32, t lock.LockType, block bool) error

	// UnlockBSD releases a BSD-style advisory file lock.
	UnlockBSD(ctx context.Context, uid lock.UniqueID) error

	// LockPOSIX tries to acquire a POSIX-style advisory file lock.
	LockPOSIX(ctx context.Context, uid lock.UniqueID, ownerPID int32, t lock.LockType, r lock.LockRange, block bool) error

	// UnlockPOSIX releases a POSIX-style advisory file lock.
	UnlockPOSIX(ctx context.Context, uid lock.UniqueID, r lock.LockRange) error

	// TestPOSIX returns information about whether the specified lock can be held, in the style of the F_GETLK fcntl.
	TestPOSIX(ctx context.Context, uid lock.UniqueID, t lock.LockType, r lock.LockRange) (linux.Flock, error)
}

FileDescriptionImpl contains implementation details for a FileDescription. Implementations of FileDescriptionImpl should contain their associated FileDescription by value as their first field.

For all functions that return linux.Statx, Statx.Uid and Statx.Gid will be interpreted as IDs in the root UserNamespace (i.e. as auth.KUID and auth.KGID respectively).

All methods may return errors not specified.

FileDescriptionImpl is analogous to Linux's struct file_operations.

type FileDescriptionOptions

type FileDescriptionOptions struct {
	// If AllowDirectIO is true, allow O_DIRECT to be set on the file.
	AllowDirectIO bool

	// If DenyPRead is true, calls to FileDescription.PRead() return ESPIPE.
	DenyPRead bool

	// If DenyPWrite is true, calls to FileDescription.PWrite() return
	// ESPIPE.
	DenyPWrite bool

	// If UseDentryMetadata is true, calls to FileDescription methods that
	// interact with file and filesystem metadata (Stat, SetStat, StatFS,
	// ListXattr, GetXattr, SetXattr, RemoveXattr) are implemented by calling
	// the corresponding FilesystemImpl methods instead of the corresponding
	// FileDescriptionImpl methods.
	//
	// UseDentryMetadata is intended for file descriptions that are implemented
	// outside of individual filesystems, such as pipes, sockets, and device
	// special files. FileDescriptions for which UseDentryMetadata is true may
	// embed DentryMetadataFileDescriptionImpl to obtain appropriate
	// implementations of FileDescriptionImpl methods that should not be
	// called.
	UseDentryMetadata bool

	// If DenySpliceIn is true, splicing into the descriptor is not allowed.
	DenySpliceIn bool
}

FileDescriptionOptions contains options to FileDescription.Init().
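The effect of DenyPRead can be sketched as a check made before the implementation is consulted. The `options` and `fileDescription` types and the `errSpipe` value below are hypothetical stand-ins, not this package's internals.

```go
package main

import (
	"errors"
	"fmt"
)

// errSpipe is a hypothetical stand-in for the ESPIPE errno.
var errSpipe = errors.New("ESPIPE")

// options mirrors two of the fields documented above.
type options struct {
	DenyPRead  bool
	DenyPWrite bool
}

// fileDescription is a hypothetical wrapper that checks its options before
// delegating to the implementation-supplied pread function.
type fileDescription struct {
	opts  options
	pread func(dst []byte, offset int64) (int64, error)
}

// PRead rejects positional reads when DenyPRead is set.
func (fd *fileDescription) PRead(dst []byte, offset int64) (int64, error) {
	if fd.opts.DenyPRead {
		return 0, errSpipe
	}
	return fd.pread(dst, offset)
}

func main() {
	impl := func(dst []byte, offset int64) (int64, error) { return int64(len(dst)), nil }
	pipeLike := &fileDescription{opts: options{DenyPRead: true}, pread: impl}
	_, err := pipeLike.PRead(make([]byte, 4), 0)
	fmt.Println(err) // the implementation is never called
}
```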

+stateify savable

type FileLocks

type FileLocks struct {
	// contains filtered or unexported fields
}

FileLocks supports POSIX and BSD style locks, which correspond to fcntl(2) and flock(2) respectively in Linux. It can be embedded into various file implementations for VFS2 that support locking.

Note that in Linux these two types of locks are _not_ cooperative, because race and deadlock conditions make merging them prohibitive. We do the same and keep them oblivious to each other.
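The independence of the two lock kinds can be sketched by keeping them in separate sets. The `lockSet` and `fileLocks` types below are hypothetical and heavily simplified: they model only whole-file exclusive locks keyed by an owner string, with no ranges, shared locks, or blocking.

```go
package main

import (
	"errors"
	"fmt"
)

// errWouldBlock is a hypothetical error for a lock held by another owner.
var errWouldBlock = errors.New("lock held by another owner")

// lockSet is a simplified whole-file exclusive lock.
type lockSet struct {
	owner string // empty when unlocked
}

// lock acquires the lock for owner, failing if another owner holds it.
func (l *lockSet) lock(owner string) error {
	if l.owner != "" && l.owner != owner {
		return errWouldBlock
	}
	l.owner = owner
	return nil
}

// unlock always succeeds, even if owner held no lock.
func (l *lockSet) unlock(owner string) {
	if l.owner == owner {
		l.owner = ""
	}
}

// fileLocks keeps the two lock kinds in separate sets so they never interact.
type fileLocks struct {
	bsd   lockSet
	posix lockSet
}

func main() {
	var fl fileLocks
	fmt.Println(fl.bsd.lock("A"))   // A takes the BSD-style lock
	fmt.Println(fl.posix.lock("B")) // B's POSIX-style lock is unaffected by it
	fmt.Println(fl.bsd.lock("B"))   // B conflicts only within the BSD set
}
```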

+stateify savable

func (*FileLocks) LockBSD

func (fl *FileLocks) LockBSD(ctx context.Context, uid fslock.UniqueID, ownerID int32, t fslock.LockType, block bool) error

LockBSD tries to acquire a BSD-style lock on the entire file.

func (*FileLocks) LockPOSIX

func (fl *FileLocks) LockPOSIX(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, r fslock.LockRange, block bool) error

LockPOSIX tries to acquire a POSIX-style lock on a file region.

func (*FileLocks) TestPOSIX

TestPOSIX returns information about whether the specified lock can be held, in the style of the F_GETLK fcntl.

func (*FileLocks) UnlockBSD

func (fl *FileLocks) UnlockBSD(uid fslock.UniqueID)

UnlockBSD releases a BSD-style lock on the entire file.

This operation is always successful, even if uid did not hold a lock on the file in the first place.

func (*FileLocks) UnlockPOSIX

func (fl *FileLocks) UnlockPOSIX(ctx context.Context, uid fslock.UniqueID, r fslock.LockRange) error

UnlockPOSIX releases a POSIX-style lock on a file region.

This operation is always successful, even if uid did not hold a lock on the requested region in the first place.

type Filesystem

type Filesystem struct {
	FilesystemRefs
	// contains filtered or unexported fields
}

A Filesystem is a tree of nodes represented by Dentries, which forms part of a VirtualFilesystem.

Filesystems are reference-counted. Unless otherwise specified, all Filesystem methods require that a reference is held.

Filesystem is analogous to Linux's struct super_block.

+stateify savable

func (*Filesystem) DecRef

func (fs *Filesystem) DecRef(ctx context.Context)

DecRef decrements fs's reference count.

func (*Filesystem) FilesystemType

func (fs *Filesystem) FilesystemType() FilesystemType

FilesystemType returns the FilesystemType for this Filesystem.

func (*Filesystem) Impl

func (fs *Filesystem) Impl() FilesystemImpl

Impl returns the FilesystemImpl associated with fs.

func (*Filesystem) Init

func (fs *Filesystem) Init(vfsObj *VirtualFilesystem, fsType FilesystemType, impl FilesystemImpl)

Init must be called before first use of fs.

func (*Filesystem) VirtualFilesystem

func (fs *Filesystem) VirtualFilesystem() *VirtualFilesystem

VirtualFilesystem returns the containing VirtualFilesystem.

type FilesystemImpl

type FilesystemImpl interface {
	// Release is called when the associated Filesystem reaches zero
	// references.
	Release(ctx context.Context)

	// Sync "causes all pending modifications to filesystem metadata and cached
	// file data to be written to the underlying [filesystem]", as by syncfs(2).
	Sync(ctx context.Context) error

	// AccessAt checks whether a user with creds can access the file at rp.
	AccessAt(ctx context.Context, rp *ResolvingPath, creds *auth.Credentials, ats AccessTypes) error

	// GetDentryAt returns a Dentry representing the file at rp. A reference is
	// taken on the returned Dentry.
	//
	// GetDentryAt does not correspond directly to a Linux syscall; it is used
	// in the implementation of:
	//
	// - Syscalls that need to resolve two paths: link(), linkat().
	//
	// - Syscalls that need to refer to a filesystem position outside the
	// context of a file description: chdir(), fchdir(), chroot(), mount(),
	// umount().
	GetDentryAt(ctx context.Context, rp *ResolvingPath, opts GetDentryOptions) (*Dentry, error)

	// GetParentDentryAt returns a Dentry representing the directory at the
	// second-to-last path component in rp. (Note that, despite the name, this
	// is not necessarily the parent directory of the file at rp, since the
	// last path component in rp may be "." or "..".) A reference is taken on
	// the returned Dentry.
	//
	// GetParentDentryAt does not correspond directly to a Linux syscall; it is
	// used in the implementation of the rename() family of syscalls, which
	// must resolve the parent directories of two paths.
	//
	// Preconditions: !rp.Done().
	//
	// Postconditions: If GetParentDentryAt returns a nil error, then
	// rp.Final(). If GetParentDentryAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	GetParentDentryAt(ctx context.Context, rp *ResolvingPath) (*Dentry, error)

	// LinkAt creates a hard link at rp representing the same file as vd. It
	// does not take ownership of references on vd.
	//
	// Errors:
	//
	// - If the last path component in rp is "." or "..", LinkAt returns
	// EEXIST.
	//
	// - If a file already exists at rp, LinkAt returns EEXIST.
	//
	// - If rp.MustBeDir(), LinkAt returns ENOENT.
	//
	// - If the directory in which the link would be created has been removed
	// by RmdirAt or RenameAt, LinkAt returns ENOENT.
	//
	// - If rp.Mount != vd.Mount(), LinkAt returns EXDEV.
	//
	// - If vd represents a directory, LinkAt returns EPERM.
	//
	// - If vd represents a file for which all existing links have been
	// removed, or a file created by open(O_TMPFILE|O_EXCL), LinkAt returns
	// ENOENT. Equivalently, if vd represents a file with a link count of 0 not
	// created by open(O_TMPFILE) without O_EXCL, LinkAt returns ENOENT.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If LinkAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	LinkAt(ctx context.Context, rp *ResolvingPath, vd VirtualDentry) error

	// MkdirAt creates a directory at rp.
	//
	// Errors:
	//
	// - If the last path component in rp is "." or "..", MkdirAt returns
	// EEXIST.
	//
	// - If a file already exists at rp, MkdirAt returns EEXIST.
	//
	// - If the directory in which the new directory would be created has been
	// removed by RmdirAt or RenameAt, MkdirAt returns ENOENT.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If MkdirAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	MkdirAt(ctx context.Context, rp *ResolvingPath, opts MkdirOptions) error

	// MknodAt creates a regular file, device special file, or named pipe at
	// rp.
	//
	// Errors:
	//
	// - If the last path component in rp is "." or "..", MknodAt returns
	// EEXIST.
	//
	// - If a file already exists at rp, MknodAt returns EEXIST.
	//
	// - If rp.MustBeDir(), MknodAt returns ENOENT.
	//
	// - If the directory in which the file would be created has been removed
	// by RmdirAt or RenameAt, MknodAt returns ENOENT.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If MknodAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	MknodAt(ctx context.Context, rp *ResolvingPath, opts MknodOptions) error

	// OpenAt returns a FileDescription providing access to the file at rp. A
	// reference is taken on the returned FileDescription.
	//
	// Errors:
	//
	// - If opts.Flags specifies O_TMPFILE and this feature is unsupported by
	// the implementation, OpenAt returns EOPNOTSUPP. (All other unsupported
	// features are silently ignored, consistently with Linux's open*(2).)
	OpenAt(ctx context.Context, rp *ResolvingPath, opts OpenOptions) (*FileDescription, error)

	// ReadlinkAt returns the target of the symbolic link at rp.
	//
	// Errors:
	//
	// - If the file at rp is not a symbolic link, ReadlinkAt returns EINVAL.
	ReadlinkAt(ctx context.Context, rp *ResolvingPath) (string, error)

	// RenameAt renames the file named oldName in directory oldParentVD to rp.
	// It does not take ownership of references on oldParentVD.
	//
	// Errors [1]:
	//
	// - If opts.Flags specifies unsupported options, RenameAt returns EINVAL.
	//
	// - If the last path component in rp is "." or "..", and opts.Flags
	// contains RENAME_NOREPLACE, RenameAt returns EEXIST.
	//
	// - If the last path component in rp is "." or "..", and opts.Flags does
	// not contain RENAME_NOREPLACE, RenameAt returns EBUSY.
	//
	// - If rp.Mount != oldParentVD.Mount(), RenameAt returns EXDEV.
	//
	// - If the renamed file is not a directory, and opts.MustBeDir is true,
	// RenameAt returns ENOTDIR.
	//
	// - If renaming would replace an existing file and opts.Flags contains
	// RENAME_NOREPLACE, RenameAt returns EEXIST.
	//
	// - If there is no existing file at rp and opts.Flags contains
	// RENAME_EXCHANGE, RenameAt returns ENOENT.
	//
	// - If there is an existing non-directory file at rp, and rp.MustBeDir()
	// is true, RenameAt returns ENOTDIR.
	//
	// - If the renamed file is not a directory, opts.Flags does not contain
	// RENAME_EXCHANGE, and rp.MustBeDir() is true, RenameAt returns ENOTDIR.
	// (This check is not subsumed by the check for directory replacement below
	// since it applies even if there is no file to replace.)
	//
	// - If the renamed file is a directory, and the new parent directory of
	// the renamed file is either the renamed directory or a descendant
	// subdirectory of the renamed directory, RenameAt returns EINVAL.
	//
	// - If renaming would exchange the renamed file with an ancestor directory
	// of the renamed file, RenameAt returns EINVAL.
	//
	// - If renaming would replace an ancestor directory of the renamed file,
	// RenameAt returns ENOTEMPTY. (This check would be subsumed by the
	// non-empty directory check below; however, this check takes place before
	// the self-rename check.)
	//
	// - If the renamed file would replace or exchange with itself (i.e. the
	// source and destination paths resolve to the same file), RenameAt returns
	// nil, skipping the checks described below.
	//
	// - If the source or destination directory is not writable by the provider
	// of rp.Credentials(), RenameAt returns EACCES.
	//
	// - If the renamed file is a directory, and renaming would replace a
	// non-directory file, RenameAt returns ENOTDIR.
	//
	// - If the renamed file is not a directory, and renaming would replace a
	// directory, RenameAt returns EISDIR.
	//
	// - If the new parent directory of the renamed file has been removed by
	// RmdirAt or a preceding call to RenameAt, RenameAt returns ENOENT.
	//
	// - If the renamed file is a directory, it is not writable by the
	// provider of rp.Credentials(), and the source and destination parent
	// directories are different, RenameAt returns EACCES. (This is nominally
	// required to change the ".." entry in the renamed directory.)
	//
	// - If renaming would replace a non-empty directory, RenameAt returns
	// ENOTEMPTY.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	// * oldParentVD.Dentry() was obtained from a previous call to
	//   oldParentVD.Mount().Filesystem().Impl().GetParentDentryAt().
	// * oldName is not "." or "..".
	//
	// Postconditions: If RenameAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	//
	// [1] "The worst of all namespace operations - renaming directory.
	// "Perverted" doesn't even start to describe it. Somebody in UCB had a
	// heck of a trip..." - fs/namei.c:vfs_rename()
	RenameAt(ctx context.Context, rp *ResolvingPath, oldParentVD VirtualDentry, oldName string, opts RenameOptions) error

	// RmdirAt removes the directory at rp.
	//
	// Errors:
	//
	// - If the last path component in rp is ".", RmdirAt returns EINVAL.
	//
	// - If the last path component in rp is "..", RmdirAt returns ENOTEMPTY.
	//
	// - If no file exists at rp, RmdirAt returns ENOENT.
	//
	// - If the file at rp exists but is not a directory, RmdirAt returns
	// ENOTDIR.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If RmdirAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	RmdirAt(ctx context.Context, rp *ResolvingPath) error

	// SetStatAt updates metadata for the file at the given path. Implementations
	// are responsible for checking if the operation can be performed
	// (see vfs.CheckSetStat() for common checks).
	//
	// Errors:
	//
	// - If opts specifies unsupported options, SetStatAt returns EINVAL.
	SetStatAt(ctx context.Context, rp *ResolvingPath, opts SetStatOptions) error

	// StatAt returns metadata for the file at rp.
	StatAt(ctx context.Context, rp *ResolvingPath, opts StatOptions) (linux.Statx, error)

	// StatFSAt returns metadata for the filesystem containing the file at rp.
	// (This method takes a path because a FilesystemImpl may consist of any
	// number of constituent filesystems.)
	StatFSAt(ctx context.Context, rp *ResolvingPath) (linux.Statfs, error)

	// SymlinkAt creates a symbolic link at rp referring to the given target.
	//
	// Errors:
	//
	// - If the last path component in rp is "." or "..", SymlinkAt returns
	// EEXIST.
	//
	// - If a file already exists at rp, SymlinkAt returns EEXIST.
	//
	// - If rp.MustBeDir(), SymlinkAt returns ENOENT.
	//
	// - If the directory in which the symbolic link would be created has been
	// removed by RmdirAt or RenameAt, SymlinkAt returns ENOENT.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If SymlinkAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	SymlinkAt(ctx context.Context, rp *ResolvingPath, target string) error

	// UnlinkAt removes the file at rp.
	//
	// Errors:
	//
	// - If the last path component in rp is "." or "..", UnlinkAt returns
	// EISDIR.
	//
	// - If no file exists at rp, UnlinkAt returns ENOENT.
	//
	// - If rp.MustBeDir(), and the file at rp exists and is not a directory,
	// UnlinkAt returns ENOTDIR.
	//
	// - If the file at rp exists but is a directory, UnlinkAt returns EISDIR.
	//
	// Preconditions:
	// * !rp.Done().
	// * For the final path component in rp, !rp.ShouldFollowSymlink().
	//
	// Postconditions: If UnlinkAt returns an error returned by
	// ResolvingPath.Resolve*(), then !rp.Done().
	UnlinkAt(ctx context.Context, rp *ResolvingPath) error

	// ListXattrAt returns all extended attribute names for the file at rp.
	//
	// Errors:
	//
	// - If extended attributes are not supported by the filesystem,
	// ListXattrAt returns ENOTSUP.
	//
	// - If the size of the list (including a NUL terminating byte after every
	// entry) would exceed size, ERANGE may be returned. (Note that
	// implementations are free to ignore size entirely and return without
	// error.) In all cases, if size is 0, the list should be returned without
	// error, regardless of size.
	ListXattrAt(ctx context.Context, rp *ResolvingPath, size uint64) ([]string, error)

	// GetXattrAt returns the value associated with the given extended
	// attribute for the file at rp.
	//
	// Errors:
	//
	// - If extended attributes are not supported by the filesystem, GetXattrAt
	// returns ENOTSUP.
	//
	// - If an extended attribute named opts.Name does not exist, ENODATA is
	// returned.
	//
	// - If the size of the return value exceeds opts.Size, ERANGE may be
	// returned (note that implementations are free to ignore opts.Size entirely
	// and return without error). In all cases, if opts.Size is 0, the value
	// should be returned without error, regardless of size.
	GetXattrAt(ctx context.Context, rp *ResolvingPath, opts GetXattrOptions) (string, error)

	// SetXattrAt changes the value associated with the given extended
	// attribute for the file at rp.
	//
	// Errors:
	//
	// - If extended attributes are not supported by the filesystem, SetXattrAt
	// returns ENOTSUP.
	//
	// - If XATTR_CREATE is set in opts.Flags and opts.Name already exists,
	// EEXIST is returned. If XATTR_REPLACE is set and opts.Name does not exist,
	// ENODATA is returned.
	SetXattrAt(ctx context.Context, rp *ResolvingPath, opts SetXattrOptions) error

	// RemoveXattrAt removes the given extended attribute from the file at rp.
	//
	// Errors:
	//
	// - If extended attributes are not supported by the filesystem,
	// RemoveXattrAt returns ENOTSUP.
	//
	// - If name does not exist, ENODATA is returned.
	RemoveXattrAt(ctx context.Context, rp *ResolvingPath, name string) error

	// BoundEndpointAt returns the Unix socket endpoint bound at the path rp.
	//
	// Errors:
	//
	// - If the file does not have write permissions, then BoundEndpointAt
	// returns EACCES.
	//
	// - If a non-socket file exists at rp, then BoundEndpointAt returns
	// ECONNREFUSED.
	BoundEndpointAt(ctx context.Context, rp *ResolvingPath, opts BoundEndpointOptions) (transport.BoundEndpoint, error)

	// PrependPath prepends a path from vd to vd.Mount().Root() to b.
	//
	// If vfsroot.Ok(), it is the contextual VFS root; if it is encountered
	// before vd.Mount().Root(), PrependPath should stop prepending path
	// components and return a PrependPathAtVFSRootError.
	//
	// If traversal of vd.Dentry()'s ancestors encounters an independent
	// ("root") Dentry that is not vd.Mount().Root() (i.e. vd.Dentry() is not a
	// descendant of vd.Mount().Root()), PrependPath should stop prepending
	// path components and return a PrependPathAtNonMountRootError.
	//
	// Filesystems for which Dentries do not have meaningful paths may prepend
	// an arbitrary descriptive string to b and then return a
	// PrependPathSyntheticError.
	//
	// Most implementations can acquire the appropriate locks to ensure that
	// Dentry.Name() and Dentry.Parent() are fixed for vd.Dentry() and all of
	// its ancestors, then call GenericPrependPath.
	//
	// Preconditions: vd.Mount().Filesystem().Impl() == this FilesystemImpl.
	PrependPath(ctx context.Context, vfsroot, vd VirtualDentry, b *fspath.Builder) error

	// MountOptions returns mount options for the current filesystem. This
	// should only return options specific to the filesystem (i.e. don't return
	// "ro", "rw", etc). Options should be returned as a comma-separated string,
	// similar to the 5th argument to mount(2).
	//
	// If the implementation has no filesystem-specific options, it should
	// return the empty string.
	MountOptions() string
}

FilesystemImpl contains implementation details for a Filesystem. Implementations of FilesystemImpl should contain their associated Filesystem by value as their first field.

All methods that take a ResolvingPath must resolve the path before performing any other checks, including rejection of the operation if not supported by the FilesystemImpl. This is because the final FilesystemImpl (responsible for actually implementing the operation) isn't known until path resolution is complete.

Unless otherwise specified, FilesystemImpl methods are responsible for performing permission checks. In many cases, vfs package functions in permissions.go may be used to help perform these checks.

When multiple specified error conditions apply to a given method call, the implementation may return any applicable errno unless otherwise specified, but returning the earliest error specified is preferable to maximize compatibility with Linux.
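As a rough illustration of "returning the earliest error specified", the following sketch checks the error conditions documented for RmdirAt in the order in which they are listed. It is not the gVisor implementation: `fakeDir` and `rmdir` are hypothetical names, the directory is a plain in-memory map, and path resolution and permission checks are omitted entirely.

```go
package main

import (
	"fmt"
	"syscall"
)

// Errno values from the standard syscall package, named as in the text.
const (
	EINVAL    = syscall.EINVAL
	ENOENT    = syscall.ENOENT
	ENOTDIR   = syscall.ENOTDIR
	ENOTEMPTY = syscall.ENOTEMPTY
)

// fakeDir is a hypothetical in-memory directory: it maps a child name to
// whether that child is itself a directory.
type fakeDir map[string]bool

// rmdir tests the documented RmdirAt error conditions in the order in which
// they are specified, so the earliest applicable errno is the one returned.
func (d fakeDir) rmdir(name string) error {
	if name == "." {
		return EINVAL
	}
	if name == ".." {
		return ENOTEMPTY
	}
	isDir, ok := d[name]
	if !ok {
		return ENOENT
	}
	if !isDir {
		return ENOTDIR
	}
	delete(d, name)
	return nil
}

func main() {
	d := fakeDir{"sub": true, "file": false}
	for _, name := range []string{".", "..", "missing", "file", "sub"} {
		fmt.Printf("rmdir(%q) = %v\n", name, d.rmdir(name))
	}
}
```

Keeping the checks in documentation order makes it easy to audit an implementation against the specified list.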

All methods may return errors not specified, notably including:

- ENOENT if a required path component does not exist.

- ENOTDIR if an intermediate path component is not a directory.

- Errors from vfs-package functions (ResolvingPath.Resolve*(), Mount.CheckBeginWrite(), permission-checking functions, etc.)

For all methods that take or return linux.Statx, Statx.Uid and Statx.Gid should be interpreted as IDs in the root UserNamespace (i.e. as auth.KUID and auth.KGID respectively).

FilesystemImpl combines elements of Linux's struct super_operations and struct inode_operations, for reasons described in the documentation for Dentry.

type FilesystemImplSaveRestoreExtension

type FilesystemImplSaveRestoreExtension interface {
	// PrepareSave prepares this filesystem for serialization.
	PrepareSave(ctx context.Context) error

	// CompleteRestore completes restoration from checkpoint for this
	// filesystem after deserialization.
	CompleteRestore(ctx context.Context, opts CompleteRestoreOptions) error
}

FilesystemImplSaveRestoreExtension is an optional extension to FilesystemImpl.

type FilesystemType

type FilesystemType interface {
	// GetFilesystem returns a Filesystem configured by the given options,
	// along with its mount root. A reference is taken on the returned
	// Filesystem and Dentry whose ownership is transferred to the caller.
	GetFilesystem(ctx context.Context, vfsObj *VirtualFilesystem, creds *auth.Credentials, source string, opts GetFilesystemOptions) (*Filesystem, *Dentry, error)

	// Name returns the name of this FilesystemType.
	Name() string

	// Release releases all resources held by this FilesystemType.
	Release(ctx context.Context)
}

A FilesystemType constructs filesystems.

FilesystemType is analogous to Linux's struct file_system_type.

type GetDentryOptions

type GetDentryOptions struct {
	// If CheckSearchable is true, FilesystemImpl.GetDentryAt() must check that
	// the returned Dentry is a directory for which creds has search
	// permission.
	CheckSearchable bool
}

GetDentryOptions contains options to VirtualFilesystem.GetDentryAt() and FilesystemImpl.GetDentryAt().

+stateify savable

type GetFilesystemOptions

type GetFilesystemOptions struct {
	// Data is the string passed as the 5th argument to mount(2), which is
	// usually a comma-separated list of filesystem-specific mount options.
	Data string

	// InternalData holds opaque FilesystemType-specific data. There is
	// intentionally no way for applications to specify InternalData; if it is
	// not nil, the call to GetFilesystem originates from within the sentry.
	InternalData interface{}
}

GetFilesystemOptions contains options to FilesystemType.GetFilesystem.

type GetXattrOptions

type GetXattrOptions struct {
	// Name is the name of the extended attribute to retrieve.
	Name string

	// Size is the maximum value size that the caller will tolerate. If the value
	// is larger than size, getxattr methods may return ERANGE, but they are also
	// free to ignore the hint entirely (i.e. the value returned may be larger
	// than size). All size checking is done independently at the syscall layer.
	Size uint64
}

GetXattrOptions contains options to VirtualFilesystem.GetXattrAt(), FilesystemImpl.GetXattrAt(), FileDescription.GetXattr(), and FileDescriptionImpl.GetXattr().

+stateify savable
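The Size semantics described above can be sketched as follows. This is a minimal model under stated assumptions, not gVisor code: `xattrs` and `get` are hypothetical names, and this particular sketch chooses to enforce the size hint, which the documentation explicitly allows implementations to skip.

```go
package main

import (
	"fmt"
	"syscall"
)

// Errno values from the standard syscall package, named as in the text.
const (
	ENODATA = syscall.ENODATA
	ERANGE  = syscall.ERANGE
)

// xattrs is a hypothetical in-memory extended attribute store.
type xattrs map[string]string

// get returns the value stored under name. A size of 0 always returns the
// value; a non-zero size smaller than the value yields ERANGE.
func (x xattrs) get(name string, size uint64) (string, error) {
	v, ok := x[name]
	if !ok {
		return "", ENODATA
	}
	if size != 0 && uint64(len(v)) > size {
		return "", ERANGE
	}
	return v, nil
}

func main() {
	x := xattrs{"user.comment": "hello"}
	for _, size := range []uint64{0, 2, 5} {
		v, err := x.get("user.comment", size)
		fmt.Printf("size=%d value=%q err=%v\n", size, v, err)
	}
}
```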

type Inotify

type Inotify struct {
	FileDescriptionDefaultImpl
	DentryMetadataFileDescriptionImpl
	NoLockFD
	// contains filtered or unexported fields
}

Inotify represents an inotify instance created by inotify_init(2) or inotify_init1(2). Inotify implements FileDescriptionImpl.

+stateify savable

func (*Inotify) AddWatch

func (i *Inotify) AddWatch(target *Dentry, mask uint32) (int32, error)

AddWatch constructs a new inotify watch and adds it to the target. It returns the watch descriptor returned by inotify_add_watch(2).

The caller must hold a reference on target.

func (*Inotify) Allocate

func (i *Inotify) Allocate(ctx context.Context, mode, offset, length uint64) error

Allocate implements FileDescription.Allocate.

func (*Inotify) EventRegister

func (i *Inotify) EventRegister(e *waiter.Entry) error

EventRegister implements waiter.Waitable.

func (*Inotify) EventUnregister

func (i *Inotify) EventUnregister(e *waiter.Entry)

EventUnregister implements waiter.Waitable.

func (*Inotify) Ioctl

func (i *Inotify) Ioctl(ctx context.Context, uio usermem.IO, args arch.SyscallArguments) (uintptr, error)

Ioctl implements FileDescriptionImpl.Ioctl.

func (*Inotify) PRead

func (*Inotify) PRead(ctx context.Context, dst usermem.IOSequence, offset int64, opts ReadOptions) (int64, error)

PRead implements FileDescriptionImpl.PRead.

func (*Inotify) PWrite

func (*Inotify) PWrite(ctx context.Context, src usermem.IOSequence, offset int64, opts WriteOptions) (int64, error)

PWrite implements FileDescriptionImpl.PWrite.

func (*Inotify) Read

func (i *Inotify) Read(ctx context.Context, dst usermem.IOSequence, opts ReadOptions) (int64, error)

Read implements FileDescriptionImpl.Read.

func (*Inotify) Readiness

func (i *Inotify) Readiness(mask waiter.EventMask) waiter.EventMask

Readiness implements waiter.Waitable.Readiness.

Readiness indicates whether there are pending events for an inotify instance.

func (*Inotify) Release

func (i *Inotify) Release(ctx context.Context)

Release implements FileDescriptionImpl.Release. Release removes all watches and frees all resources for an inotify instance.

func (*Inotify) RmWatch

func (i *Inotify) RmWatch(ctx context.Context, wd int32) error

RmWatch looks up an inotify watch for the given 'wd' and configures the target to stop sending events to this inotify instance.

func (*Inotify) Write

func (*Inotify) Write(ctx context.Context, src usermem.IOSequence, opts WriteOptions) (int64, error)

Write implements FileDescriptionImpl.Write.

type IterDirentsCallback

type IterDirentsCallback interface {
	// Handle handles the given iterated Dirent. If Handle returns a non-nil
	// error, FileDescriptionImpl.IterDirents must stop iteration and return
	// the error; the next call to FileDescriptionImpl.IterDirents should
	// restart with the same Dirent.
	Handle(dirent Dirent) error
}

IterDirentsCallback receives Dirents from FileDescriptionImpl.IterDirents.

type IterDirentsCallbackFunc

type IterDirentsCallbackFunc func(dirent Dirent) error

IterDirentsCallbackFunc implements IterDirentsCallback for a function with the semantics of IterDirentsCallback.Handle.

func (IterDirentsCallbackFunc) Handle

func (f IterDirentsCallbackFunc) Handle(dirent Dirent) error

Handle implements IterDirentsCallback.Handle.
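The function adapter and the "restart with the same Dirent" rule can be sketched together. In this sketch `Dirent` is reduced to a name, and `dirFD`, `collect`, and `errBufferFull` are hypothetical; only the shapes of IterDirentsCallback and IterDirentsCallbackFunc follow the declarations above.

```go
package main

import (
	"errors"
	"fmt"
)

// Dirent is a simplified stand-in for vfs.Dirent.
type Dirent struct{ Name string }

type IterDirentsCallback interface {
	Handle(dirent Dirent) error
}

// IterDirentsCallbackFunc adapts a plain function to IterDirentsCallback.
type IterDirentsCallbackFunc func(dirent Dirent) error

func (f IterDirentsCallbackFunc) Handle(dirent Dirent) error { return f(dirent) }

// dirFD is a hypothetical directory file description that remembers how far
// iteration has progressed.
type dirFD struct {
	entries []Dirent
	off     int
}

// IterDirents offers entries to cb starting at the saved offset. The offset
// advances only past entries that cb accepted, so a rejected entry is offered
// again by the next call.
func (fd *dirFD) IterDirents(cb IterDirentsCallback) error {
	for fd.off < len(fd.entries) {
		if err := cb.Handle(fd.entries[fd.off]); err != nil {
			return err
		}
		fd.off++
	}
	return nil
}

var errBufferFull = errors.New("buffer full")

// collect drains fd in batches of at most max names per IterDirents call.
// Precondition: max > 0.
func collect(fd *dirFD, max int) [][]string {
	var batches [][]string
	for {
		var batch []string
		err := fd.IterDirents(IterDirentsCallbackFunc(func(d Dirent) error {
			if len(batch) == max {
				return errBufferFull
			}
			batch = append(batch, d.Name)
			return nil
		}))
		if len(batch) > 0 {
			batches = append(batches, batch)
		}
		if err == nil {
			return batches
		}
	}
}

func main() {
	fd := &dirFD{entries: []Dirent{{"a"}, {"b"}, {"c"}}}
	fmt.Println(collect(fd, 2))
}
```

The adapter lets callers pass a closure that captures local state (here, the current batch) without declaring a named type for each callback.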

type LockFD

type LockFD struct {
	// contains filtered or unexported fields
}

LockFD may be used by most implementations of FileDescriptionImpl.Lock* functions. Caller must call Init().

+stateify savable

func (*LockFD) Init

func (fd *LockFD) Init(locks *FileLocks)

Init initializes fd with FileLocks to use.

func (*LockFD) LockBSD

func (fd *LockFD) LockBSD(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, block bool) error

LockBSD implements FileDescriptionImpl.LockBSD.

func (*LockFD) LockPOSIX

func (fd *LockFD) LockPOSIX(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, r fslock.LockRange, block bool) error

LockPOSIX implements FileDescriptionImpl.LockPOSIX.

func (*LockFD) Locks

func (fd *LockFD) Locks() *FileLocks

Locks returns the locks associated with this file.

func (LockFD) SupportsLocks

func (LockFD) SupportsLocks() bool

SupportsLocks implements FileDescriptionImpl.SupportsLocks.

func (*LockFD) TestPOSIX

func (fd *LockFD) TestPOSIX(ctx context.Context, uid fslock.UniqueID, t fslock.LockType, r fslock.LockRange) (linux.Flock, error)

TestPOSIX implements FileDescriptionImpl.TestPOSIX.

func (*LockFD) UnlockBSD

func (fd *LockFD) UnlockBSD(ctx context.Context, uid fslock.UniqueID) error

UnlockBSD implements FileDescriptionImpl.UnlockBSD.

func (*LockFD) UnlockPOSIX

func (fd *LockFD) UnlockPOSIX(ctx context.Context, uid fslock.UniqueID, r fslock.LockRange) error

UnlockPOSIX implements FileDescriptionImpl.UnlockPOSIX.

type MkdirOptions

type MkdirOptions struct {
	// Mode is the file mode bits for the created directory.
	Mode linux.FileMode

	// If ForSyntheticMountpoint is true, FilesystemImpl.MkdirAt() may create
	// the given directory in memory only (as opposed to persistent storage).
	// The created directory should be able to support the creation of
	// subdirectories with ForSyntheticMountpoint == true. It does not need to
	// support the creation of subdirectories with ForSyntheticMountpoint ==
	// false, or files of other types.
	//
	// FilesystemImpls are permitted to ignore the ForSyntheticMountpoint
	// option.
	//
	// The ForSyntheticMountpoint option exists because, unlike mount(2), the
	// OCI Runtime Specification permits the specification of mount points that
	// do not exist, under the expectation that container runtimes will create
	// them. (More accurately, the OCI Runtime Specification completely fails
	// to document this feature, but it's implemented by runc.)
	// ForSyntheticMountpoint allows such mount points to be created even when
	// the underlying persistent filesystem is immutable.
	ForSyntheticMountpoint bool
}

MkdirOptions contains options to VirtualFilesystem.MkdirAt() and FilesystemImpl.MkdirAt().

+stateify savable

type MknodOptions

type MknodOptions struct {
	// Mode is the file type and mode bits for the created file.
	Mode linux.FileMode

	// If Mode specifies a character or block device special file, DevMajor and
	// DevMinor are the major and minor device numbers for the created device.
	DevMajor uint32
	DevMinor uint32

	// Endpoint is the endpoint to bind to the created file, if a socket file is
	// being created for bind(2) on a Unix domain socket.
	Endpoint transport.BoundEndpoint
}

MknodOptions contains options to VirtualFilesystem.MknodAt() and FilesystemImpl.MknodAt().

+stateify savable

type Mount

type Mount struct {

	// ID is the immutable mount ID.
	ID uint64

	// Flags contains settings as specified for mount(2), e.g. MS_NOEXEC, except
	// for MS_RDONLY which is tracked in "writers". Immutable.
	Flags MountFlags
	// contains filtered or unexported fields
}

A Mount is a replacement of a Dentry (Mount.key.point) from one Filesystem (Mount.key.parent.fs) with a Dentry (Mount.root) from another Filesystem (Mount.fs), which applies to path resolution in the context of a particular Mount (Mount.key.parent).

Mounts are reference-counted. Unless otherwise specified, all Mount methods require that a reference is held.

Mount and Filesystem are distinct types because it's possible for a single Filesystem to be mounted at multiple locations and/or in multiple mount namespaces.

Mount is analogous to Linux's struct mount. (gVisor does not distinguish between struct mount and struct vfsmount.)

+stateify savable

func (*Mount) CheckBeginWrite

func (mnt *Mount) CheckBeginWrite() error

CheckBeginWrite increments the counter of in-progress write operations on mnt. If mnt is mounted MS_RDONLY, CheckBeginWrite does nothing and returns EROFS.

If CheckBeginWrite succeeds, EndWrite must be called when the write operation is finished.

func (*Mount) DecRef

func (mnt *Mount) DecRef(ctx context.Context)

DecRef decrements mnt's reference count.

func (*Mount) EndWrite

func (mnt *Mount) EndWrite()

EndWrite indicates that a write operation signaled by a previous successful call to CheckBeginWrite has finished.
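The pairing of CheckBeginWrite and EndWrite might look like the sketch below. It is a simplified model, not the gVisor implementation: the hypothetical `mount` type keeps the read-only flag and writer count in separate mutex-protected fields, and `writeFile` is an illustrative helper.

```go
package main

import (
	"fmt"
	"sync"
	"syscall"
)

// EROFS is the errno from the standard syscall package.
const EROFS = syscall.EROFS

// mount is a hypothetical stand-in for vfs.Mount that tracks only the state
// needed for write accounting.
type mount struct {
	mu       sync.Mutex
	readOnly bool
	writers  int64
}

// CheckBeginWrite counts a new in-progress write, or returns EROFS without
// changing the count if the mount is read-only.
func (m *mount) CheckBeginWrite() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.readOnly {
		return EROFS
	}
	m.writers++
	return nil
}

// EndWrite ends a write started by a successful CheckBeginWrite.
func (m *mount) EndWrite() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.writers--
}

// writeFile shows the calling pattern: EndWrite is deferred only after
// CheckBeginWrite has succeeded.
func writeFile(m *mount, write func() error) error {
	if err := m.CheckBeginWrite(); err != nil {
		return err
	}
	defer m.EndWrite()
	return write()
}

func main() {
	rw, ro := &mount{}, &mount{readOnly: true}
	noop := func() error { return nil }
	fmt.Println(writeFile(rw, noop), writeFile(ro, noop))
}
```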

func (*Mount) Filesystem

func (mnt *Mount) Filesystem() *Filesystem

Filesystem returns the mounted Filesystem. It does not take a reference on the returned Filesystem.

func (*Mount) IncRef

func (mnt *Mount) IncRef()

IncRef increments mnt's reference count.

func (*Mount) LeakMessage

func (mnt *Mount) LeakMessage() string

LeakMessage implements refsvfs2.CheckedObject.LeakMessage.

func (*Mount) LogRefs

func (mnt *Mount) LogRefs() bool

LogRefs implements refsvfs2.CheckedObject.LogRefs.

This should only be set to true for debugging purposes, as it can generate an extremely large amount of output and drastically degrade performance.

func (*Mount) Options

func (mnt *Mount) Options() MountOptions

Options returns a copy of the MountOptions currently applicable to mnt.

func (*Mount) ReadOnly

func (mnt *Mount) ReadOnly() bool

ReadOnly returns true if mnt is mounted read-only.

func (*Mount) RefType

func (mnt *Mount) RefType() string

RefType implements refsvfs2.CheckedObject.RefType.

func (*Mount) Root

func (mnt *Mount) Root() *Dentry

Root returns the mount's root. It does not take a reference on the returned Dentry.

type MountFlags

type MountFlags struct {
	// NoExec is equivalent to MS_NOEXEC.
	NoExec bool

	// NoATime is equivalent to MS_NOATIME and indicates that the
	// filesystem should not update access time in-place.
	NoATime bool

	// NoDev is equivalent to MS_NODEV and indicates that the
	// filesystem should not allow access to devices (special files).
	// TODO(gVisor.dev/issue/3186): respect this flag in non-FUSE
	// filesystems.
	NoDev bool

	// NoSUID is equivalent to MS_NOSUID and indicates that the
	// filesystem should not honor set-user-ID and set-group-ID bits or
	// file capabilities when executing programs.
	NoSUID bool
}

MountFlags contains flags as specified for mount(2), e.g. MS_NOEXEC. MS_RDONLY is not part of MountFlags because it's tracked in Mount.writers.

+stateify savable

type MountNamespace

type MountNamespace struct {
	MountNamespaceRefs

	// Owner is the user namespace that owns this mount namespace.
	Owner *auth.UserNamespace
	// contains filtered or unexported fields
}

A MountNamespace is a collection of Mounts.

MountNamespaces are reference-counted. Unless otherwise specified, all MountNamespace methods require that a reference is held.

MountNamespace is analogous to Linux's struct mnt_namespace.

+stateify savable

func MountNamespaceFromContext

func MountNamespaceFromContext(ctx context.Context) *MountNamespace

MountNamespaceFromContext returns the MountNamespace used by ctx. If ctx is not associated with a MountNamespace, MountNamespaceFromContext returns nil.

A reference is taken on the returned MountNamespace.

func (*MountNamespace) DecRef

func (mntns *MountNamespace) DecRef(ctx context.Context)

DecRef decrements mntns' reference count.

func (*MountNamespace) Root

func (mntns *MountNamespace) Root() VirtualDentry

Root returns mntns' root. It does not take a reference on the returned VirtualDentry.

type MountOptions

type MountOptions struct {
	// Flags contains flags as specified for mount(2), e.g. MS_NOEXEC.
	Flags MountFlags

	// ReadOnly is equivalent to MS_RDONLY.
	ReadOnly bool

	// GetFilesystemOptions contains options to FilesystemType.GetFilesystem().
	GetFilesystemOptions GetFilesystemOptions

	// InternalMount indicates whether the mount operation originates from
	// within the sentry rather than from the application (i.e. through
	// mount(2)). If InternalMount is true, allow the use of filesystem types
	// for which RegisterFilesystemTypeOptions.AllowUserMount == false.
	InternalMount bool
}

MountOptions contains options to VirtualFilesystem.MountAt().

+stateify savable

type NoLockFD

type NoLockFD struct{}

NoLockFD implements the Lock*/Unlock* portion of the FileDescriptionImpl interface, returning ENOLCK.

+stateify savable

func (NoLockFD) LockBSD

func (NoLockFD) LockBSD(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, block bool) error

LockBSD implements FileDescriptionImpl.LockBSD.

func (NoLockFD) LockPOSIX

func (NoLockFD) LockPOSIX(ctx context.Context, uid fslock.UniqueID, ownerPID int32, t fslock.LockType, r fslock.LockRange, block bool) error

LockPOSIX implements FileDescriptionImpl.LockPOSIX.

func (NoLockFD) SupportsLocks

func (NoLockFD) SupportsLocks() bool

SupportsLocks implements FileDescriptionImpl.SupportsLocks.

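func (NoLockFD) TestPOSIX

func (NoLockFD) TestPOSIX(ctx context.Context, uid fslock.UniqueID, t fslock.LockType, r fslock.LockRange) (linux.Flock, error)

TestPOSIX implements FileDescriptionImpl.TestPOSIX.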

func (NoLockFD) UnlockBSD

func (NoLockFD) UnlockBSD(ctx context.Context, uid fslock.UniqueID) error

UnlockBSD implements FileDescriptionImpl.UnlockBSD.

func (NoLockFD) UnlockPOSIX

func (NoLockFD) UnlockPOSIX(ctx context.Context, uid fslock.UniqueID, r fslock.LockRange) error

UnlockPOSIX implements FileDescriptionImpl.UnlockPOSIX.
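The intended use of a type like NoLockFD is struct embedding: a file description that does not support locks embeds it and inherits the stub methods. The sketch below shows the pattern with a reduced, hypothetical `locker` interface; `noLock` and `eventFD` are illustrative names, and having SupportsLocks return false is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"syscall"
)

// ENOLCK is the errno from the standard syscall package.
const ENOLCK = syscall.ENOLCK

// locker is a hypothetical, reduced version of the locking portion of an
// interface such as FileDescriptionImpl.
type locker interface {
	SupportsLocks() bool
	LockBSD(block bool) error
	UnlockBSD() error
}

// noLock mirrors the idea of NoLockFD: a zero-size type whose methods reject
// every locking operation.
type noLock struct{}

func (noLock) SupportsLocks() bool      { return false }
func (noLock) LockBSD(block bool) error { return ENOLCK }
func (noLock) UnlockBSD() error         { return ENOLCK }

// eventFD is a hypothetical file description without lock support. Embedding
// noLock promotes its methods, so eventFD satisfies locker with no extra code.
type eventFD struct {
	noLock
	counter uint64
}

// Compile-time check that the embedding satisfies the interface.
var _ locker = (*eventFD)(nil)

func main() {
	var l locker = &eventFD{}
	fmt.Println(l.SupportsLocks(), l.LockBSD(false), l.UnlockBSD())
}
```

Because the embedded type has no fields, it adds no size to the embedding struct.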

type OpenOptions

type OpenOptions struct {
	// Flags contains access mode and flags as specified for open(2).
	//
	// FilesystemImpls are responsible for implementing the following flags:
	// O_RDONLY, O_WRONLY, O_RDWR, O_APPEND, O_CREAT, O_DIRECT, O_DSYNC,
	// O_EXCL, O_NOATIME, O_NOCTTY, O_NONBLOCK, O_SYNC, O_TMPFILE, and
	// O_TRUNC. VFS is responsible for handling O_DIRECTORY, O_LARGEFILE, and
	// O_NOFOLLOW. VFS users are responsible for handling O_CLOEXEC, since file
	// descriptors are mostly outside the scope of VFS.
	Flags uint32

	// If FilesystemImpl.OpenAt() creates a file, Mode is the file mode for the
	// created file.
	Mode linux.FileMode

	// FileExec is set when the file is being opened to be executed.
	// VirtualFilesystem.OpenAt() checks that the caller has execute permissions
	// on the file, that the file is a regular file, and that the mount doesn't
	// have MS_NOEXEC set.
	FileExec bool
}

OpenOptions contains options to VirtualFilesystem.OpenAt() and FilesystemImpl.OpenAt().

+stateify savable

type PathOperation

type PathOperation struct {
	// Root is the VFS root. References on Root are borrowed from the provider
	// of the PathOperation.
	//
	// Invariants: Root.Ok().
	Root VirtualDentry

	// Start is the starting point for the path traversal. References on Start
	// are borrowed from the provider of the PathOperation (i.e. the caller of
	// the VFS method to which the PathOperation was passed).
	//
	// Invariants: Start.Ok(). If Path.Absolute, then Start == Root.
	Start VirtualDentry

	// Path is the pathname traversed by this operation.
	Path fspath.Path

	// If FollowFinalSymlink is true, and the Dentry traversed by the final
	// path component represents a symbolic link, the symbolic link should be
	// followed.
	FollowFinalSymlink bool
}

PathOperation specifies the path operated on by a VFS method.

PathOperation is passed to VFS methods by pointer to reduce memory copying: it's somewhat large and should never escape. (Options structs are passed by pointer to VFS and FileDescription methods for the same reason.)

+stateify savable

type PrependPathAtNonMountRootError

type PrependPathAtNonMountRootError struct{}

PrependPathAtNonMountRootError is returned by implementations of FilesystemImpl.PrependPath() when they encounter an independent ancestor Dentry that is not the Mount root.

+stateify savable

func (PrependPathAtNonMountRootError) Error

func (PrependPathAtNonMountRootError) Error() string

Error implements error.Error.

type PrependPathAtVFSRootError

type PrependPathAtVFSRootError struct{}

PrependPathAtVFSRootError is returned by implementations of FilesystemImpl.PrependPath() when they encounter the contextual VFS root.

+stateify savable

func (PrependPathAtVFSRootError) Error

func (PrependPathAtVFSRootError) Error() string

Error implements error.Error.

type PrependPathSyntheticError

type PrependPathSyntheticError struct{}

PrependPathSyntheticError is returned by implementations of FilesystemImpl.PrependPath() for which prepended names do not represent real paths.

+stateify savable

func (PrependPathSyntheticError) Error

func (PrependPathSyntheticError) Error() string

Error implements error.Error.
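To make the first two PrependPath error types concrete, here is a toy model of the ancestor walk that FilesystemImpl.PrependPath describes. Everything in it is hypothetical: `dentry` has only a name and a parent, plain sentinel errors stand in for the error types, and the result is returned as a string rather than written to an fspath.Builder. Checking the VFS root before the mount root is a choice made by this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// dentry is a hypothetical node; parent is nil for an independent ("root")
// dentry.
type dentry struct {
	name   string
	parent *dentry
}

var (
	errAtVFSRoot      = errors.New("reached the contextual VFS root")
	errAtNonMountRoot = errors.New("reached a root that is not the mount root")
)

// prependPath walks from d toward mountRoot, collecting names leaf-first.
// It stops early at vfsRoot (which may be nil) or at an independent dentry
// that is not mountRoot, returning the path gathered so far with an error.
func prependPath(vfsRoot, mountRoot, d *dentry) (string, error) {
	var names []string
	for {
		if d == vfsRoot {
			return join(names), errAtVFSRoot
		}
		if d == mountRoot {
			return join(names), nil
		}
		if d.parent == nil {
			return join(names), errAtNonMountRoot
		}
		names = append(names, d.name)
		d = d.parent
	}
}

// join turns names collected leaf-first into a slash-separated path.
func join(names []string) string {
	var b strings.Builder
	for i := len(names) - 1; i >= 0; i-- {
		b.WriteByte('/')
		b.WriteString(names[i])
	}
	return b.String()
}

func main() {
	root := &dentry{name: "/"}
	a := &dentry{name: "a", parent: root}
	b := &dentry{name: "b", parent: a}
	fmt.Println(prependPath(nil, root, b))
	fmt.Println(prependPath(a, root, b))
}
```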

type ReadOptions

type ReadOptions struct {
	// Flags contains flags as specified for preadv2(2).
	Flags uint32
}

ReadOptions contains options to FileDescription.PRead(), FileDescriptionImpl.PRead(), FileDescription.Read(), and FileDescriptionImpl.Read().

+stateify savable

type RegisterDeviceOptions

type RegisterDeviceOptions struct {
	// GroupName is the name shown for this device registration in
	// /proc/devices. If GroupName is empty, this registration will not be
	// shown in /proc/devices.
	GroupName string
}

RegisterDeviceOptions contains options to VirtualFilesystem.RegisterDevice().

+stateify savable

type RegisterFilesystemTypeOptions

type RegisterFilesystemTypeOptions struct {
	// AllowUserMount determines whether users are allowed to mount a file system
	// of this type, i.e. through mount(2). If AllowUserMount is true, allow calls
	// to VirtualFilesystem.MountAt() for which MountOptions.InternalMount == false
	// to use this filesystem type.
	AllowUserMount bool

	// If AllowUserList is true, make this filesystem type visible in
	// /proc/filesystems.
	AllowUserList bool

	// If RequiresDevice is true, indicate that mounting this filesystem
	// requires a block device as the mount source in /proc/filesystems.
	RequiresDevice bool
}

RegisterFilesystemTypeOptions contains options to VirtualFilesystem.RegisterFilesystem().

+stateify savable
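The interaction between RegisterFilesystemTypeOptions.AllowUserMount and MountOptions.InternalMount reduces to a single predicate. The sketch below uses reduced, hypothetical copies of the two structs that keep only those fields; `mountPermitted` is an illustrative name, and the error a real mount call would return when the check fails is not modeled.

```go
package main

import "fmt"

// Reduced, hypothetical copies of the two option structs, keeping only the
// fields that interact.
type RegisterFilesystemTypeOptions struct{ AllowUserMount bool }
type MountOptions struct{ InternalMount bool }

// mountPermitted reports whether a mount with opts may use a filesystem type
// registered with reg: internal mounts may use any registered type, while
// other mounts need AllowUserMount.
func mountPermitted(reg RegisterFilesystemTypeOptions, opts MountOptions) bool {
	return opts.InternalMount || reg.AllowUserMount
}

func main() {
	for _, allow := range []bool{false, true} {
		for _, internal := range []bool{false, true} {
			ok := mountPermitted(
				RegisterFilesystemTypeOptions{AllowUserMount: allow},
				MountOptions{InternalMount: internal},
			)
			fmt.Printf("AllowUserMount=%t InternalMount=%t permitted=%t\n", allow, internal, ok)
		}
	}
}
```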

type RenameOptions

type RenameOptions struct {
	// Flags contains flags as specified for renameat2(2).
	Flags uint32

	// If MustBeDir is true, the renamed file must be a directory.
	MustBeDir bool
}

RenameOptions contains options to VirtualFilesystem.RenameAt() and FilesystemImpl.RenameAt().

+stateify savable
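The first few RenameAt checks on RenameOptions.Flags can be sketched as below. The flag values are those Linux defines for renameat2(2). The rest is assumed for illustration: `checkRenameTarget` is a hypothetical helper, and this imaginary filesystem supports only RENAME_NOREPLACE.

```go
package main

import (
	"fmt"
	"syscall"
)

// Errno values from the standard syscall package, named as in the text.
const (
	EBUSY  = syscall.EBUSY
	EEXIST = syscall.EEXIST
	EINVAL = syscall.EINVAL
)

// Flag values as defined by Linux for renameat2(2).
const (
	RENAME_NOREPLACE = 1 << 0
	RENAME_EXCHANGE  = 1 << 1
	RENAME_WHITEOUT  = 1 << 2
)

// supportedFlags is an assumption of this sketch: the hypothetical filesystem
// supports RENAME_NOREPLACE only.
const supportedFlags = RENAME_NOREPLACE

// checkRenameTarget applies the earliest documented RenameAt checks to the
// flags and to the last component of the destination path.
func checkRenameTarget(flags uint32, newName string) error {
	if flags&^supportedFlags != 0 {
		return EINVAL
	}
	if newName == "." || newName == ".." {
		if flags&RENAME_NOREPLACE != 0 {
			return EEXIST
		}
		return EBUSY
	}
	return nil
}

func main() {
	fmt.Println(checkRenameTarget(RENAME_EXCHANGE, "new"))
	fmt.Println(checkRenameTarget(RENAME_NOREPLACE, "."))
	fmt.Println(checkRenameTarget(0, ".."))
	fmt.Println(checkRenameTarget(0, "new"))
}
```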

type ResolvingPath

type ResolvingPath struct {
	// contains filtered or unexported fields
}

ResolvingPath represents the state of an in-progress path resolution, shared between VFS and FilesystemImpl methods that take a path.

From the perspective of FilesystemImpl methods, a ResolvingPath represents a starting Dentry on the associated Filesystem (on which a reference is already held), a stream of path components relative to that Dentry, and elements of the invoking Context that are commonly required by FilesystemImpl methods.

ResolvingPath is loosely analogous to Linux's struct nameidata.

+stateify savable

func (*ResolvingPath) Advance

func (rp *ResolvingPath) Advance()

Advance advances the stream of path components represented by rp.

Preconditions: !rp.Done().

func (*ResolvingPath) CheckMount

func (rp *ResolvingPath) CheckMount(ctx context.Context, d *Dentry) error

CheckMount is called after resolving the parent or child of another Dentry to d. If d is a mount point, such that path resolution should switch to another Mount, CheckMount returns a non-nil error. Otherwise, CheckMount returns nil.

func (*ResolvingPath) CheckRoot

func (rp *ResolvingPath) CheckRoot(ctx context.Context, d *Dentry) (bool, error)

CheckRoot is called before resolving the parent of the Dentry d. If the Dentry is contextually a VFS root, such that path resolution should treat d's parent as itself, CheckRoot returns (true, nil). If the Dentry is the root of a non-root mount, such that path resolution should switch to another Mount, CheckRoot returns (unspecified, non-nil error). Otherwise, path resolution should resolve d's parent normally, and CheckRoot returns (false, nil).

func (*ResolvingPath) Component

func (rp *ResolvingPath) Component() string

Component returns the current path component in the stream represented by rp.

Preconditions: !rp.Done().

func (*ResolvingPath) Copy

func (rp *ResolvingPath) Copy() *ResolvingPath

Copy creates another ResolvingPath with the same state as the original. Copies are independent: using the copy does not change the original, and vice versa.

The caller must call Release() when done.

func (*ResolvingPath) Credentials

func (rp *ResolvingPath) Credentials() *auth.Credentials

Credentials returns the credentials of rp's provider.

func (*ResolvingPath) Done

func (rp *ResolvingPath) Done() bool

Done returns true if there are no remaining path components in the stream represented by rp.

func (*ResolvingPath) Final

func (rp *ResolvingPath) Final() bool

Final returns true if there is exactly one remaining path component in the stream represented by rp.

Preconditions: !rp.Done().
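The Done, Final, Component, and Advance methods together describe a simple iteration protocol over path components. The following is a minimal sketch of that protocol, assuming a hypothetical componentStream type that stands in for the unexported state of ResolvingPath; it is not the vfs implementation and ignores mounts, symlinks, and credentials.

```go
package main

import (
	"fmt"
	"strings"
)

// componentStream is a hypothetical stand-in for the path component stream
// held by a ResolvingPath.
type componentStream struct {
	parts []string
	pos   int
}

// newComponentStream splits path on '/' and drops empty components.
func newComponentStream(path string) *componentStream {
	var parts []string
	for _, p := range strings.Split(path, "/") {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return &componentStream{parts: parts}
}

// Done reports whether no components remain.
func (s *componentStream) Done() bool { return s.pos >= len(s.parts) }

// Final reports whether exactly one component remains. Requires !s.Done().
func (s *componentStream) Final() bool { return s.pos == len(s.parts)-1 }

// Component returns the current component. Requires !s.Done().
func (s *componentStream) Component() string { return s.parts[s.pos] }

// Advance moves to the next component. Requires !s.Done().
func (s *componentStream) Advance() { s.pos++ }

// walk visits every component in order and tags the final one.
func walk(path string) []string {
	var seen []string
	for s := newComponentStream(path); !s.Done(); s.Advance() {
		name := s.Component()
		if s.Final() {
			name += " (final)"
		}
		seen = append(seen, name)
	}
	return seen
}

func main() {
	fmt.Println(walk("usr/lib/go"))
}
```

Checking Done before each call to Component, Final, or Advance is what satisfies the stated preconditions.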

func (*ResolvingPath) HandleJump

func (rp *ResolvingPath) HandleJump(target VirtualDentry) error

HandleJump is called when the current path component is a "magic" link to the given VirtualDentry, like /proc/[pid]/fd/[fd]. If the calling Filesystem method should continue path traversal, HandleJump updates the path component stream to reflect the magic link target and returns nil. Otherwise it returns a non-nil error.

Preconditions: !rp.Done().

func (*ResolvingPath) HandleSymlink

func (rp *ResolvingPath) HandleSymlink(target string) error

HandleSymlink is called when the current path component is a symbolic link to the given target. If the calling Filesystem method should continue path traversal, HandleSymlink updates the path component stream to reflect the symlink target and returns nil. Otherwise it returns a non-nil error.

Preconditions: !rp.Done().

Postconditions: If HandleSymlink returns a nil error, then !rp.Done().
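One way to picture "updates the path component stream to reflect the symlink target" is as a splice: the current component is replaced by the components of the target. The sketch below assumes a hypothetical stream type, a hypothetical traversal limit, and relative targets only; absolute targets, mounts, and the real error values are deliberately not modeled.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxSymlinks is an assumed traversal limit chosen for illustration.
const maxSymlinks = 40

var (
	errTooManyLinks = errors.New("too many levels of symbolic links")
	errEmptyTarget  = errors.New("empty symlink target")
)

// stream is a hypothetical model of the remaining path components.
type stream struct {
	parts    []string
	symlinks int
}

// Done reports whether no components remain.
func (s *stream) Done() bool { return len(s.parts) == 0 }

// handleSymlink replaces the current component with the components of a
// relative target. Requires !s.Done(). On success the stream is non-empty,
// mirroring the documented postcondition.
func (s *stream) handleSymlink(target string) error {
	if s.symlinks >= maxSymlinks {
		return errTooManyLinks
	}
	var t []string
	for _, p := range strings.Split(target, "/") {
		if p != "" {
			t = append(t, p)
		}
	}
	if len(t) == 0 {
		return errEmptyTarget
	}
	s.symlinks++
	s.parts = append(t, s.parts[1:]...)
	return nil
}

func main() {
	s := &stream{parts: []string{"link", "file"}}
	if err := s.handleSymlink("a/b"); err != nil {
		fmt.Println("stop traversal:", err)
		return
	}
	fmt.Println(s.parts)
}
```

Rejecting an empty target before mutating the stream is what keeps the postcondition (a nil error implies the stream is not done) true in this model.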

func (*ResolvingPath) Mount

func (rp *ResolvingPath) Mount() *Mount

Mount returns the Mount on which path resolution is currently occurring. It does not take a reference on the returned Mount.

func (*ResolvingPath) MustBeDir

func (rp *ResolvingPath) MustBeDir() bool

MustBeDir returns true if the file traversed by rp must be a directory.

func (*ResolvingPath) Pit

func (rp *ResolvingPath) Pit() fspath.Iterator

Pit returns a copy of rp's current path iterator. Modifying the iterator does not change rp.

func (*ResolvingPath) Release

func (rp *ResolvingPath) Release(ctx context.Context)

Release decrements references if needed and returns the object to the pool.

func (*ResolvingPath) ShouldFollowSymlink

func (rp *ResolvingPath) ShouldFollowSymlink() bool

ShouldFollowSymlink returns true if, supposing that the current path component in rp represents a symbolic link, the symbolic link should be followed.

If the path is terminated with '/', the '/' is considered the last element and any symlink before that is followed:

  • For most non-creating walks, the last path component is handled by fs/namei.c:lookup_last(), which sets LOOKUP_FOLLOW if the first byte after the path component is non-NULL (which is only possible if it's '/') and the path component is of type LAST_NORM.

  • For open/openat/openat2 without O_CREAT, the last path component is handled by fs/namei.c:do_last(), which does the same, though without the LAST_NORM check.

Preconditions: !rp.Done().
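The rule described above can be summarized as a three-way disjunction. The function below is a hypothetical model, not the vfs code: its parameters are assumptions standing in for "this is the last component", "the caller asked to follow a final symlink", and "the path ended in '/'".

```go
package main

import "fmt"

// shouldFollow models the decision: non-final symlinks are always followed,
// and a final symlink is followed if the caller requested it or if the path
// was terminated with '/'.
func shouldFollow(final, followFinal, trailingSlash bool) bool {
	return !final || followFinal || trailingSlash
}

func main() {
	// A final component without a follow request, e.g. lstat("link").
	fmt.Println(shouldFollow(true, false, false))
	// The same lookup with a trailing slash, e.g. lstat("link/").
	fmt.Println(shouldFollow(true, false, true))
}
```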

func (*ResolvingPath) Start

func (rp *ResolvingPath) Start() *Dentry

Start returns the starting Dentry represented by rp. It does not take a reference on the returned Dentry.

func (*ResolvingPath) VirtualFilesystem

func (rp *ResolvingPath) VirtualFilesystem() *VirtualFilesystem

VirtualFilesystem returns the containing VirtualFilesystem.

type SetStatOptions

type SetStatOptions struct {
	// Stat is the metadata that should be set. Only fields indicated by
	// Stat.Mask should be set.
	//
	// If Stat specifies that a timestamp should be set,
	// FilesystemImpl.SetStatAt() and FileDescriptionImpl.SetStat() must
	// special-case StatxTimestamp.Nsec == UTIME_NOW as described by
	// utimensat(2); however, they do not need to check for StatxTimestamp.Nsec
	// == UTIME_OMIT (VFS users must unset the corresponding bit in Stat.Mask
	// instead).
	Stat linux.Statx

	// NeedWritePerm indicates that write permission on the file is needed for
	// this operation. This is needed for truncate(2) (note that ftruncate(2)
	// does not require the same check--instead, it checks that the fd is
	// writable).
	NeedWritePerm bool
}

SetStatOptions contains options to VirtualFilesystem.SetStatAt(), FilesystemImpl.SetStatAt(), FileDescription.SetStat(), and FileDescriptionImpl.SetStat().

+stateify savable
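The division of labor for UTIME_OMIT can be sketched from the caller's side: before passing SetStatOptions down, a VFS user clears the mask bit of any timestamp marked UTIME_OMIT, so implementations only ever have to special-case UTIME_NOW. The types and the applyOmit helper below are hypothetical stand-ins for linux.Statx and caller code; the constant values are those of the Linux ABI.

```go
package main

import "fmt"

const (
	statxATIME = 0x20 // STATX_ATIME
	statxMTIME = 0x40 // STATX_MTIME

	utimeNow  = (1 << 30) - 1 // UTIME_NOW
	utimeOmit = (1 << 30) - 2 // UTIME_OMIT
)

// timestamp and statx are simplified stand-ins for the linux package types.
type timestamp struct {
	Sec  int64
	Nsec uint32
}

type statx struct {
	Mask  uint32
	Atime timestamp
	Mtime timestamp
}

// applyOmit clears the mask bit of every timestamp marked UTIME_OMIT, so
// that the filesystem implementation never sees it.
func applyOmit(s statx) statx {
	if s.Atime.Nsec == utimeOmit {
		s.Mask &^= statxATIME
	}
	if s.Mtime.Nsec == utimeOmit {
		s.Mask &^= statxMTIME
	}
	return s
}

func main() {
	s := applyOmit(statx{
		Mask:  statxATIME | statxMTIME,
		Atime: timestamp{Nsec: utimeOmit},
		Mtime: timestamp{Nsec: utimeNow},
	})
	fmt.Printf("mask after omit handling: %#x\n", s.Mask)
}
```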

type SetXattrOptions

type SetXattrOptions struct {
	// Name is the name of the extended attribute being mutated.
	Name string

	// Value is the extended attribute's new value.
	Value string

	// Flags contains flags as specified for setxattr/lsetxattr/fsetxattr(2).
	Flags uint32
}

SetXattrOptions contains options to VirtualFilesystem.SetXattrAt(), FilesystemImpl.SetXattrAt(), FileDescription.SetXattr(), and FileDescriptionImpl.SetXattr().

+stateify savable

type StatOptions

type StatOptions struct {
	// Mask is the set of fields in the returned Statx that the FilesystemImpl
	// or FileDescriptionImpl should provide. Bits are as in linux.Statx.Mask.
	//
	// The FilesystemImpl or FileDescriptionImpl may return fields not
	// requested in Mask, and may fail to return fields requested in Mask that
	// are not supported by the underlying filesystem implementation, without
	// returning an error.
	Mask uint32

	// Sync specifies the synchronization required, and is one of
	// linux.AT_STATX_SYNC_AS_STAT (which is 0, and therefore the default),
	// linux.AT_STATX_SYNC_FORCE_SYNC, or linux.AT_STATX_SYNC_DONT_SYNC.
	Sync uint32
}

StatOptions contains options to VirtualFilesystem.StatAt(), FilesystemImpl.StatAt(), FileDescription.Stat(), and FileDescriptionImpl.Stat().

+stateify savable
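Because an implementation may silently omit requested fields, a caller that depends on a field has to compare the returned mask with the requested one. The helper below is a hypothetical illustration of that check using plain uint32 masks; the constant values are those of the Linux statx ABI.

```go
package main

import "fmt"

const (
	statxSize  = 0x200 // STATX_SIZE
	statxBtime = 0x800 // STATX_BTIME
)

// missingFields returns the requested bits that are absent from the mask
// reported back by the implementation.
func missingFields(requested, returned uint32) uint32 {
	return requested &^ returned
}

func main() {
	requested := uint32(statxSize | statxBtime)
	returned := uint32(statxSize) // e.g. a filesystem without birth times
	if missingFields(requested, returned)&statxBtime != 0 {
		fmt.Println("birth time not provided; treat it as unknown")
	}
}
```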

type StaticData

type StaticData struct {
	Data string
}

StaticData implements DynamicBytesSource over a static string.

+stateify savable

func (*StaticData) Generate

func (s *StaticData) Generate(ctx context.Context, buf *bytes.Buffer) error

Generate implements DynamicBytesSource.
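As a rough illustration of the pattern, the sketch below reimplements the idea with local stand-ins: a dynamicBytesSource interface with the same method shape, and a staticData type that writes its string into the buffer. It uses the standard library context package as an assumption; the vfs signature refers to gVisor's own context.Context.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
)

// dynamicBytesSource is a local stand-in for the interface that Generate
// implements.
type dynamicBytesSource interface {
	Generate(ctx context.Context, buf *bytes.Buffer) error
}

// staticData serves a fixed string.
type staticData struct {
	Data string
}

// Generate writes the static string to buf.
func (s *staticData) Generate(ctx context.Context, buf *bytes.Buffer) error {
	buf.WriteString(s.Data)
	return nil
}

// render collects the bytes produced by src into a string.
func render(src dynamicBytesSource) (string, error) {
	var buf bytes.Buffer
	if err := src.Generate(context.Background(), &buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render(&staticData{Data: "hello\n"})
	if err != nil {
		fmt.Println("generate failed:", err)
		return
	}
	fmt.Print(out)
}
```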

type UmountOptions

type UmountOptions struct {
	// Flags contains flags as specified for umount2(2).
	Flags uint32
}

UmountOptions contains options to VirtualFilesystem.UmountAt().

+stateify savable

type VirtualDentry

type VirtualDentry struct {
	// contains filtered or unexported fields
}

A VirtualDentry represents a node in a VFS tree, by combining a Dentry (which represents a node in a Filesystem's tree) and a Mount (which represents the Filesystem's position in a VFS mount tree).

VirtualDentry's semantics are similar to those of a Go interface object representing a pointer: it is a copyable value type that represents references to another entity. The zero value of VirtualDentry is an "empty VirtualDentry", directly analogous to a nil interface object. VirtualDentry.Ok() checks that a VirtualDentry is not zero-valued; unless otherwise specified, all other VirtualDentry methods require VirtualDentry.Ok() == true.

Mounts and Dentries are reference-counted, requiring that users call VirtualDentry.{Inc,Dec}Ref() as appropriate. We often colloquially refer to references on the Mount and Dentry referred to by a VirtualDentry as references on the VirtualDentry itself. Unless otherwise specified, all VirtualDentry methods require that a reference is held on the VirtualDentry.

VirtualDentry is analogous to Linux's struct path.

+stateify savable
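The value-type semantics can be sketched with a small model. The counted and virtualDentry types below are hypothetical and use plain integer counters; they only illustrate that copying the value shares the underlying objects, that the zero value is "empty", and that IncRef and DecRef act on both the Mount and the Dentry.

```go
package main

import "fmt"

// counted is a hypothetical reference-counted object.
type counted struct {
	refs int
}

// virtualDentry pairs a mount and a dentry, like a small value-type handle.
type virtualDentry struct {
	mount  *counted
	dentry *counted
}

// Ok reports whether vd is non-empty. It needs no reference.
func (vd virtualDentry) Ok() bool { return vd.mount != nil }

// IncRef takes a reference on both underlying objects.
func (vd virtualDentry) IncRef() {
	vd.mount.refs++
	vd.dentry.refs++
}

// DecRef drops a reference on both underlying objects.
func (vd virtualDentry) DecRef() {
	vd.mount.refs--
	vd.dentry.refs--
}

func main() {
	var empty virtualDentry
	fmt.Println("empty ok:", empty.Ok())

	m, d := &counted{refs: 1}, &counted{refs: 1}
	vd := virtualDentry{mount: m, dentry: d}

	dup := vd    // copying the value does not take a reference
	dup.IncRef() // the copy now owns its own reference
	fmt.Println("refs:", m.refs, d.refs)
	dup.DecRef()
}
```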

func MakeVirtualDentry

func MakeVirtualDentry(mount *Mount, dentry *Dentry) VirtualDentry

MakeVirtualDentry creates a VirtualDentry.

func RootFromContext

func RootFromContext(ctx context.Context) VirtualDentry

RootFromContext returns the VFS root used by ctx. It takes a reference on the returned VirtualDentry. If ctx does not have a specific VFS root, RootFromContext returns a zero-value VirtualDentry.

func (VirtualDentry) DecRef

func (vd VirtualDentry) DecRef(ctx context.Context)

DecRef decrements the reference counts on the Mount and Dentry represented by vd.

func (VirtualDentry) Dentry

func (vd VirtualDentry) Dentry() *Dentry

Dentry returns the Dentry associated with vd. It does not take a reference on the returned Dentry.

func (VirtualDentry) IncRef

func (vd VirtualDentry) IncRef()

IncRef increments the reference counts on the Mount and Dentry represented by vd.

func (VirtualDentry) Mount

func (vd VirtualDentry) Mount() *Mount

Mount returns the Mount associated with vd. It does not take a reference on the returned Mount.

func (VirtualDentry) Ok

func (vd VirtualDentry) Ok() bool

Ok returns true if vd is not empty. It does not require that a reference is held.

type VirtualFilesystem

type VirtualFilesystem struct {
	// contains filtered or unexported fields
}

A VirtualFilesystem (VFS for short) combines Filesystems in trees of Mounts.

There is no analogue to the VirtualFilesystem type in Linux, as the equivalent state in Linux is global.

+stateify savable

func (*VirtualFilesystem) AbortDeleteDentry

func (vfs *VirtualFilesystem) AbortDeleteDentry(d *Dentry)

AbortDeleteDentry must be called after PrepareDeleteDentry if the deletion fails.

+checklocksrelease:d.mu

func (*VirtualFilesystem) AbortRenameDentry

func (vfs *VirtualFilesystem) AbortRenameDentry(from, to *Dentry)

AbortRenameDentry must be called after PrepareRenameDentry if the rename fails.

+checklocksrelease:from.mu
+checklocksrelease:to.mu

func (*VirtualFilesystem) AccessAt

func (vfs *VirtualFilesystem) AccessAt(ctx context.Context, creds *auth.Credentials, ats AccessTypes, pop *PathOperation) error

AccessAt checks whether a user with creds has access to the file at the given path.

func (*VirtualFilesystem) BoundEndpointAt

BoundEndpointAt gets the bound endpoint at the given path, if one exists.

func (*VirtualFilesystem) CommitDeleteDentry

func (vfs *VirtualFilesystem) CommitDeleteDentry(ctx context.Context, d *Dentry)

CommitDeleteDentry must be called after PrepareDeleteDentry if the deletion succeeds.

+checklocksrelease:d.mu

func (*VirtualFilesystem) CommitRenameExchangeDentry

func (vfs *VirtualFilesystem) CommitRenameExchangeDentry(from, to *Dentry)

CommitRenameExchangeDentry must be called after the files represented by from and to are exchanged by rename(RENAME_EXCHANGE).

Preconditions: PrepareRenameDentry was previously called on from and to.

+checklocksrelease:from.mu
+checklocksrelease:to.mu

func (*VirtualFilesystem) CommitRenameReplaceDentry

func (vfs *VirtualFilesystem) CommitRenameReplaceDentry(ctx context.Context, from, to *Dentry)

CommitRenameReplaceDentry must be called after the file represented by from is renamed without RENAME_EXCHANGE. If to is not nil, it represents the file that was replaced by from.

Preconditions: PrepareRenameDentry was previously called on from and to.

+checklocksrelease:from.mu
+checklocksrelease:to.mu

func (*VirtualFilesystem) CompleteRestore

func (vfs *VirtualFilesystem) CompleteRestore(ctx context.Context, opts *CompleteRestoreOptions) error

CompleteRestore completes restoration from checkpoint for all filesystems after deserialization.

func (*VirtualFilesystem) ConnectMountAt

func (vfs *VirtualFilesystem) ConnectMountAt(ctx context.Context, creds *auth.Credentials, mnt *Mount, target *PathOperation) error

ConnectMountAt connects mnt at the path represented by target.

Preconditions: mnt must be disconnected.

func (*VirtualFilesystem) GenerateProcFilesystems

func (vfs *VirtualFilesystem) GenerateProcFilesystems(buf *bytes.Buffer)

GenerateProcFilesystems emits the contents of /proc/filesystems for vfs to buf.
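To show how AllowUserList and RequiresDevice could shape this output, here is a minimal sketch that follows the Linux /proc/filesystems line format ("nodev", a tab, then the type name). The fsTypeOptions type and both functions are hypothetical; names are sorted only to make the output deterministic, which is an assumption of this sketch rather than a documented property.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// fsTypeOptions is a hypothetical subset of RegisterFilesystemTypeOptions.
type fsTypeOptions struct {
	AllowUserList  bool
	RequiresDevice bool
}

// generateProcFilesystems writes one line per listable filesystem type.
func generateProcFilesystems(types map[string]fsTypeOptions, buf *bytes.Buffer) {
	names := make([]string, 0, len(types))
	for name, opts := range types {
		if opts.AllowUserList {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	for _, name := range names {
		// Types that do not need a block device are marked "nodev".
		if !types[name].RequiresDevice {
			buf.WriteString("nodev")
		}
		buf.WriteString("\t")
		buf.WriteString(name)
		buf.WriteString("\n")
	}
}

// render is a convenience wrapper returning the generated text.
func render(types map[string]fsTypeOptions) string {
	var buf bytes.Buffer
	generateProcFilesystems(types, &buf)
	return buf.String()
}

func main() {
	fmt.Print(render(map[string]fsTypeOptions{
		"tmpfs":    {AllowUserList: true},
		"ext4":     {AllowUserList: true, RequiresDevice: true},
		"internal": {},
	}))
}
```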

func (*VirtualFilesystem) GenerateProcMountInfo

func (vfs *VirtualFilesystem) GenerateProcMountInfo(ctx context.Context, taskRootDir VirtualDentry, buf *bytes.Buffer)

GenerateProcMountInfo emits the contents of /proc/[pid]/mountinfo for vfs to buf.

Preconditions: taskRootDir.Ok().

func (*VirtualFilesystem) GenerateProcMounts

func (vfs *VirtualFilesystem) GenerateProcMounts(ctx context.Context, taskRootDir VirtualDentry, buf *bytes.Buffer)

GenerateProcMounts emits the contents of /proc/[pid]/mounts for vfs to buf.

Preconditions: taskRootDir.Ok().

func (*VirtualFilesystem) GetAnonBlockDevMinor

func (vfs *VirtualFilesystem) GetAnonBlockDevMinor() (uint32, error)

GetAnonBlockDevMinor allocates and returns an unused minor device number for an "anonymous" block device with major number UNNAMED_MAJOR.

func (*VirtualFilesystem) GetDentryAt

func (vfs *VirtualFilesystem) GetDentryAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *GetDentryOptions) (VirtualDentry, error)

GetDentryAt returns a VirtualDentry representing the given path, at which a file must exist. A reference is taken on the returned VirtualDentry.

func (*VirtualFilesystem) GetXattrAt

func (vfs *VirtualFilesystem) GetXattrAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *GetXattrOptions) (string, error)

GetXattrAt returns the value associated with the given extended attribute for the file at the given path.

func (*VirtualFilesystem) Init

func (vfs *VirtualFilesystem) Init(ctx context.Context) error

Init initializes a new VirtualFilesystem with no mounts or FilesystemTypes.

func (*VirtualFilesystem) InvalidateDentry

func (vfs *VirtualFilesystem) InvalidateDentry(ctx context.Context, d *Dentry)

InvalidateDentry is called when d ceases to represent the file it formerly did for reasons outside of VFS' control (e.g. d represents the local state of a file on a remote filesystem on which the file has already been deleted).

func (*VirtualFilesystem) LinkAt

func (vfs *VirtualFilesystem) LinkAt(ctx context.Context, creds *auth.Credentials, oldpop, newpop *PathOperation) error

LinkAt creates a hard link at newpop representing the existing file at oldpop.

func (*VirtualFilesystem) ListXattrAt

func (vfs *VirtualFilesystem) ListXattrAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, size uint64) ([]string, error)

ListXattrAt returns all extended attribute names for the file at the given path.

func (*VirtualFilesystem) MakeSyntheticMountpoint

func (vfs *VirtualFilesystem) MakeSyntheticMountpoint(ctx context.Context, target string, root VirtualDentry, creds *auth.Credentials) error

MakeSyntheticMountpoint creates the parent directories of target if they do not exist and attempts to create a directory for the mount point. If a non-directory file already exists at target, it is accepted as the mount point.

func (*VirtualFilesystem) MkdirAllAt

func (vfs *VirtualFilesystem) MkdirAllAt(ctx context.Context, currentPath string, root VirtualDentry, creds *auth.Credentials, mkdirOpts *MkdirOptions) error

MkdirAllAt recursively creates non-existent directories on the given path (including the last component).

func (*VirtualFilesystem) MkdirAt

func (vfs *VirtualFilesystem) MkdirAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *MkdirOptions) error

MkdirAt creates a directory at the given path.

func (*VirtualFilesystem) MknodAt

func (vfs *VirtualFilesystem) MknodAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *MknodOptions) error

MknodAt creates a file of the given mode at the given path. It returns an error from the linuxerr package.

func (*VirtualFilesystem) MountAt

func (vfs *VirtualFilesystem) MountAt(ctx context.Context, creds *auth.Credentials, source string, target *PathOperation, fsTypeName string, opts *MountOptions) (*Mount, error)

MountAt creates and mounts a Filesystem configured by the given arguments. The VirtualFilesystem will hold a reference to the Mount until it is unmounted.

This method returns the mounted Mount without a reference, for convenience during VFS setup when there is no chance of racing with unmount.

func (*VirtualFilesystem) MountDisconnected

func (vfs *VirtualFilesystem) MountDisconnected(ctx context.Context, creds *auth.Credentials, source string, fsTypeName string, opts *MountOptions) (*Mount, error)

MountDisconnected creates a Filesystem configured by the given arguments, then returns a Mount representing it. The new Mount is not associated with any MountNamespace and is not connected to any other Mounts.

func (*VirtualFilesystem) MustRegisterFilesystemType

func (vfs *VirtualFilesystem) MustRegisterFilesystemType(name string, fsType FilesystemType, opts *RegisterFilesystemTypeOptions)

MustRegisterFilesystemType is equivalent to RegisterFilesystemType but panics on failure.
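The relationship between the two methods follows the common Go "Must" convention. The sketch below uses a hypothetical registry type in which registering a duplicate name is the only failure; the real failure conditions of RegisterFilesystemType are not specified here.

```go
package main

import (
	"errors"
	"fmt"
)

var errAlreadyRegistered = errors.New("filesystem type already registered")

// registry is a hypothetical stand-in for the set of registered types.
type registry struct {
	types map[string]struct{}
}

// register returns an error if name is already present.
func (r *registry) register(name string) error {
	if _, ok := r.types[name]; ok {
		return errAlreadyRegistered
	}
	r.types[name] = struct{}{}
	return nil
}

// mustRegister wraps register and panics on failure.
func (r *registry) mustRegister(name string) {
	if err := r.register(name); err != nil {
		panic(fmt.Sprintf("failed to register filesystem type %q: %v", name, err))
	}
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	r := &registry{types: map[string]struct{}{}}
	r.mustRegister("tmpfs")
	fmt.Println("duplicate panics:", panics(func() { r.mustRegister("tmpfs") }))
}
```

The panicking variant suits setup code where a registration failure is a programming error; the error-returning variant suits callers that can recover.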

func (*VirtualFilesystem) NewAnonVirtualDentry

func (vfs *VirtualFilesystem) NewAnonVirtualDentry(name string) VirtualDentry

NewAnonVirtualDentry returns a VirtualDentry with the given synthetic name, consistent with Linux's fs/anon_inodes.c:anon_inode_getfile(). References are taken on the returned VirtualDentry.

func (*VirtualFilesystem) NewDisconnectedMount

func (vfs *VirtualFilesystem) NewDisconnectedMount(fs *Filesystem, root *Dentry, opts *MountOptions) *Mount

NewDisconnectedMount returns a Mount representing fs with the given root (which may be nil). The new Mount is not associated with any MountNamespace and is not connected to any other Mounts. References are taken on fs and root.

func (*VirtualFilesystem) NewEpollInstanceFD

func (vfs *VirtualFilesystem) NewEpollInstanceFD(ctx context.Context) (*FileDescription, error)

NewEpollInstanceFD returns a FileDescription representing a new epoll instance. A reference is taken on the returned FileDescription.

func (*VirtualFilesystem) NewMountNamespace

func (vfs *VirtualFilesystem) NewMountNamespace(ctx context.Context, creds *auth.Credentials, source, fsTypeName string, opts *MountOptions) (*MountNamespace, error)

NewMountNamespace returns a new mount namespace with a root filesystem configured by the given arguments. A reference is taken on the returned MountNamespace.

func (*VirtualFilesystem) OpenAt

OpenAt returns a FileDescription providing access to the file at the given path. A reference is taken on the returned FileDescription.

func (*VirtualFilesystem) OpenDeviceSpecialFile

func (vfs *VirtualFilesystem) OpenDeviceSpecialFile(ctx context.Context, mnt *Mount, d *Dentry, kind DeviceKind, major, minor uint32, opts *OpenOptions) (*FileDescription, error)

OpenDeviceSpecialFile returns a FileDescription representing the given device.

func (*VirtualFilesystem) PathnameForGetcwd

func (vfs *VirtualFilesystem) PathnameForGetcwd(ctx context.Context, vfsroot, vd VirtualDentry) (string, error)

PathnameForGetcwd returns an absolute pathname to vd, consistent with Linux's sys_getcwd().

func (*VirtualFilesystem) PathnameReachable

func (vfs *VirtualFilesystem) PathnameReachable(ctx context.Context, vfsroot, vd VirtualDentry) (string, error)

PathnameReachable returns an absolute pathname to vd, consistent with Linux's __d_path() (as used by seq_path_root()). If vfsroot.Ok() and vd is not reachable from vfsroot, such that seq_path_root() would return SEQ_SKIP (causing the entire containing entry to be skipped), PathnameReachable returns ("", nil).

func (*VirtualFilesystem) PathnameWithDeleted

func (vfs *VirtualFilesystem) PathnameWithDeleted(ctx context.Context, vfsroot, vd VirtualDentry) (string, error)

PathnameWithDeleted returns an absolute pathname to vd, consistent with Linux's d_path(). In particular, if vd.Dentry() has been disowned, PathnameWithDeleted appends " (deleted)" to the returned pathname.

func (*VirtualFilesystem) PivotRoot

func (vfs *VirtualFilesystem) PivotRoot(ctx context.Context, creds *auth.Credentials, newRootPop *PathOperation, putOldPop *PathOperation) error

PivotRoot makes the location pointed to by newRootPop the root of the current namespace, and moves the current root to the location pointed to by putOldPop.

func (*VirtualFilesystem) PrepareDeleteDentry

func (vfs *VirtualFilesystem) PrepareDeleteDentry(mntns *MountNamespace, d *Dentry) error

PrepareDeleteDentry must be called before attempting to delete the file represented by d. If PrepareDeleteDentry succeeds, the caller must call AbortDeleteDentry or CommitDeleteDentry depending on the deletion's outcome.

+checklocksacquire:d.mu
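The prepare, then commit-or-abort calling pattern can be sketched as follows. Everything here is a hypothetical model: the dentry type, the "busy if mounted on" check, and the error values are assumptions used to show the control flow and the lock handoff suggested by the checklocks annotations, not the vfs implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errBusy         = errors.New("dentry is in use as a mount point")
	errRemoveFailed = errors.New("filesystem-level removal failed")
)

// dentry is a hypothetical model with a lock held across the operation.
type dentry struct {
	mu     sync.Mutex
	mounts int
	dead   bool
}

// prepareDelete locks d and keeps it locked on success.
func prepareDelete(d *dentry) error {
	d.mu.Lock()
	if d.mounts != 0 {
		d.mu.Unlock()
		return errBusy
	}
	return nil
}

// abortDelete releases the lock without changing d.
func abortDelete(d *dentry) { d.mu.Unlock() }

// commitDelete marks d dead and releases the lock.
func commitDelete(d *dentry) {
	d.dead = true
	d.mu.Unlock()
}

// deleteFile runs remove between prepare and commit or abort.
func deleteFile(d *dentry, remove func() error) error {
	if err := prepareDelete(d); err != nil {
		return err
	}
	if err := remove(); err != nil {
		abortDelete(d)
		return err
	}
	commitDelete(d)
	return nil
}

func main() {
	d := &dentry{}
	err := deleteFile(d, func() error { return errRemoveFailed })
	fmt.Println(err, d.dead)
	err = deleteFile(d, func() error { return nil })
	fmt.Println(err, d.dead)
}
```

Every successful prepare is paired with exactly one commit or abort, which is what keeps the lock balanced on all paths.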

func (*VirtualFilesystem) PrepareRenameDentry

func (vfs *VirtualFilesystem) PrepareRenameDentry(mntns *MountNamespace, from, to *Dentry) error

PrepareRenameDentry must be called before attempting to rename the file represented by from. If to is not nil, it represents the file that will be replaced or exchanged by the rename. If PrepareRenameDentry succeeds, the caller must call AbortRenameDentry, CommitRenameReplaceDentry, or CommitRenameExchangeDentry depending on the rename's outcome.

Preconditions:

  • If to is not nil, it must be a child Dentry from the same Filesystem.

  • from != to.

+checklocksacquire:from.mu
+checklocksacquire:to.mu

func (*VirtualFilesystem) PrepareSave

func (vfs *VirtualFilesystem) PrepareSave(ctx context.Context) error

PrepareSave prepares all filesystems for serialization.

func (*VirtualFilesystem) PutAnonBlockDevMinor

func (vfs *VirtualFilesystem) PutAnonBlockDevMinor(minor uint32)

PutAnonBlockDevMinor deallocates a minor device number returned by a previous call to GetAnonBlockDevMinor.
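The Get/Put pair behaves like a small allocator for minor numbers. The sketch below is a hypothetical model with an assumed allocation policy (lowest free number, starting at 1, bounded by the 20-bit minor range used by Linux); the policy and error of the real implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// maxMinor assumes 20-bit minor numbers, as in Linux.
const maxMinor = 1<<20 - 1

var errNoMinors = errors.New("no anonymous block device minors available")

// minorAllocator is a hypothetical allocator of minor device numbers.
type minorAllocator struct {
	mu   sync.Mutex
	used map[uint32]struct{}
}

func newMinorAllocator() *minorAllocator {
	return &minorAllocator{used: make(map[uint32]struct{})}
}

// get returns the lowest unused minor number and marks it used.
func (a *minorAllocator) get() (uint32, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for m := uint32(1); m <= maxMinor; m++ {
		if _, ok := a.used[m]; !ok {
			a.used[m] = struct{}{}
			return m, nil
		}
	}
	return 0, errNoMinors
}

// put makes a previously returned minor number available again.
func (a *minorAllocator) put(minor uint32) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.used, minor)
}

func main() {
	a := newMinorAllocator()
	first, _ := a.get()
	second, _ := a.get()
	a.put(first)
	third, _ := a.get()
	fmt.Println(first, second, third)
}
```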

func (*VirtualFilesystem) ReadlinkAt

func (vfs *VirtualFilesystem) ReadlinkAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation) (string, error)

ReadlinkAt returns the target of the symbolic link at the given path.

func (*VirtualFilesystem) RegisterDevice

func (vfs *VirtualFilesystem) RegisterDevice(kind DeviceKind, major, minor uint32, dev Device, opts *RegisterDeviceOptions) error

RegisterDevice registers the given Device in vfs with the given major and minor device numbers.

func (*VirtualFilesystem) RegisterFilesystemType

func (vfs *VirtualFilesystem) RegisterFilesystemType(name string, fsType FilesystemType, opts *RegisterFilesystemTypeOptions) error

RegisterFilesystemType registers the given FilesystemType in vfs with the given name.

func (*VirtualFilesystem) Release

func (vfs *VirtualFilesystem) Release(ctx context.Context)

Release drops references on filesystem objects held by vfs.

Precondition: This must be called after VFS.Init() has succeeded.

func (*VirtualFilesystem) RemoveXattrAt

func (vfs *VirtualFilesystem) RemoveXattrAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, name string) error

RemoveXattrAt removes the given extended attribute from the file at the given path.

func (*VirtualFilesystem) RenameAt

func (vfs *VirtualFilesystem) RenameAt(ctx context.Context, creds *auth.Credentials, oldpop, newpop *PathOperation, opts *RenameOptions) error

RenameAt renames the file at oldpop to newpop.

func (*VirtualFilesystem) RmdirAt

func (vfs *VirtualFilesystem) RmdirAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation) error

RmdirAt removes the directory at the given path.

func (*VirtualFilesystem) SetMountReadOnly

func (vfs *VirtualFilesystem) SetMountReadOnly(mnt *Mount, ro bool) error

SetMountReadOnly sets whether mnt is read-only, according to ro.

func (*VirtualFilesystem) SetStatAt

func (vfs *VirtualFilesystem) SetStatAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *SetStatOptions) error

SetStatAt changes metadata for the file at the given path.

func (*VirtualFilesystem) SetXattrAt

func (vfs *VirtualFilesystem) SetXattrAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *SetXattrOptions) error

SetXattrAt changes the value associated with the given extended attribute for the file at the given path.

func (*VirtualFilesystem) StatAt

func (vfs *VirtualFilesystem) StatAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *StatOptions) (linux.Statx, error)

StatAt returns metadata for the file at the given path.

func (*VirtualFilesystem) StatFSAt

func (vfs *VirtualFilesystem) StatFSAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation) (linux.Statfs, error)

StatFSAt returns metadata for the filesystem containing the file at the given path.

func (*VirtualFilesystem) SymlinkAt

func (vfs *VirtualFilesystem) SymlinkAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, target string) error

SymlinkAt creates a symbolic link at the given path with the given target.

func (*VirtualFilesystem) SyncAllFilesystems

func (vfs *VirtualFilesystem) SyncAllFilesystems(ctx context.Context) error

SyncAllFilesystems has the semantics of Linux's sync(2).

func (*VirtualFilesystem) UmountAt

func (vfs *VirtualFilesystem) UmountAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation, opts *UmountOptions) error

UmountAt removes the Mount at the given path.

func (*VirtualFilesystem) UnlinkAt

func (vfs *VirtualFilesystem) UnlinkAt(ctx context.Context, creds *auth.Credentials, pop *PathOperation) error

UnlinkAt deletes the non-directory file at the given path.

type Watch

type Watch struct {
	// contains filtered or unexported fields
}

Watch represents a particular inotify watch created by inotify_add_watch.

+stateify savable

func (*Watch) ExcludeUnlinked

func (w *Watch) ExcludeUnlinked() bool

ExcludeUnlinked indicates whether the watched object should continue to be notified of events originating from a path that has been unlinked.

For example, if "foo/bar" is opened and then unlinked, operations on the open fd may be ignored by watches on "foo" and "foo/bar" with IN_EXCL_UNLINK.

func (*Watch) Notify

func (w *Watch) Notify(name string, events uint32, cookie uint32) bool

Notify queues a new event on this watch. It returns true if this is a one-shot watch that should be deleted after the event has been successfully queued.

func (*Watch) OwnerID

func (w *Watch) OwnerID() uint64

OwnerID returns the id of the inotify instance that owns this watch.

type Watches

type Watches struct {
	// contains filtered or unexported fields
}

Watches is the collection of all inotify watches on a single file.

+stateify savable

func (*Watches) Add

func (w *Watches) Add(watch *Watch)

Add adds watch to this set of watches.

Precondition: the inotify instance with the given id must be locked.

func (*Watches) HandleDeletion

func (w *Watches) HandleDeletion(ctx context.Context)

HandleDeletion is called when the watch target is destroyed. It clears the watch set, detaches watches from the inotify instances they belong to, and generates the appropriate events.

func (*Watches) Lookup

func (w *Watches) Lookup(id uint64) *Watch

Lookup returns the watch owned by an inotify instance with the given id. Returns nil if no such watch exists.

Precondition: the inotify instance with the given id must be locked to prevent the returned watch from being concurrently modified or replaced in Inotify.watches.

func (*Watches) Notify

func (w *Watches) Notify(ctx context.Context, name string, events, cookie uint32, et EventType, unlinked bool)

Notify queues a new event with watches in this set. Watches with IN_EXCL_UNLINK are skipped if the event is coming from a child that has been unlinked.
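The skipping rule can be illustrated with a simplified model. The watch type and notifyAll function below are hypothetical and ignore event types, cookies, and locking; only the IN_EXCL_UNLINK value is taken from the Linux inotify ABI.

```go
package main

import "fmt"

// inExclUnlink is the Linux IN_EXCL_UNLINK flag.
const inExclUnlink = 0x04000000

// watch is a hypothetical watch with a mask and a queue of event names.
type watch struct {
	mask   uint32
	queued []string
}

// excludeUnlinked reports whether the watch opted out of events from
// unlinked children.
func (w *watch) excludeUnlinked() bool { return w.mask&inExclUnlink != 0 }

// notifyAll queues name on every watch that should see the event and
// returns how many watches were notified.
func notifyAll(ws []*watch, name string, unlinked bool) int {
	notified := 0
	for _, w := range ws {
		if unlinked && w.excludeUnlinked() {
			continue
		}
		w.queued = append(w.queued, name)
		notified++
	}
	return notified
}

func main() {
	ws := []*watch{{mask: inExclUnlink}, {}}
	fmt.Println(notifyAll(ws, "bar", true))
	fmt.Println(notifyAll(ws, "bar", false))
}
```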

func (*Watches) Remove

func (w *Watches) Remove(id uint64)

Remove removes a watch with the given id from this set of watches and releases it. The caller is responsible for generating any watch removal event, as appropriate. The provided id must match an existing watch in this collection.

Precondition: the inotify instance with the given id must be locked.

func (*Watches) Size

func (w *Watches) Size() int

Size returns the number of watches held by w.

type WritableDynamicBytesSource

type WritableDynamicBytesSource interface {
	DynamicBytesSource

	// Write sends writes to the source.
	Write(ctx context.Context, src usermem.IOSequence, offset int64) (int64, error)
}

WritableDynamicBytesSource extends DynamicBytesSource to allow writes to the underlying source.

TODO(b/179825241): Make utility for integer-based writable files.

type WriteOptions

type WriteOptions struct {
	// Flags contains flags as specified for pwritev2(2).
	Flags uint32
}

WriteOptions contains options to FileDescription.PWrite(), FileDescriptionImpl.PWrite(), FileDescription.Write(), and FileDescriptionImpl.Write().

+stateify savable

Directories

Path Synopsis
Package genericfstree provides tools for implementing vfs.FilesystemImpls where a single statically-determined lock or set of locks is sufficient to ensure that a Dentry's name and parent are contextually immutable.
Package memxattr provides a default, in-memory extended attribute implementation.
