memmap

package
v0.0.0-...-4ba931d
Warning

This package is not in the latest version of its module.

Published: Jan 11, 2025 License: Apache-2.0, MIT Imports: 5 Imported by: 74

Documentation

Overview

Package memmap defines semantics for memory mappings.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CheckTranslateResult

func CheckTranslateResult(required, optional MappableRange, at hostarch.AccessType, ts []Translation, terr error) error

CheckTranslateResult returns an error if (ts, terr) does not satisfy all postconditions for Mappable.Translate(required, optional, at).

Preconditions: Same as Mappable.Translate.

Types

type BufferedIOFallbackErr

type BufferedIOFallbackErr struct{}

BufferedIOFallbackErr is returned (by value) by implementations of File.MapInternal() that cannot succeed, but can still support memory-mapped I/O by falling back to buffered reads and writes.

func (BufferedIOFallbackErr) Error

func (BufferedIOFallbackErr) Error() string

Error implements error.Error.
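The fallback described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `fakeFile`, `readAt`, the simplified `MapInternal` signature, and the error message text are all hypothetical stand-ins defined locally so that the example runs without gVisor imports.

```go
package main

import (
	"errors"
	"fmt"
)

// BufferedIOFallbackErr is a local stand-in for memmap.BufferedIOFallbackErr.
type BufferedIOFallbackErr struct{}

// Error implements error.Error. The message text is an assumption.
func (BufferedIOFallbackErr) Error() string { return "buffered I/O fallback required" }

// fakeFile is a hypothetical file whose MapInternal can never succeed.
type fakeFile struct{ data []byte }

// MapInternal (simplified signature) always asks for the buffered fallback.
func (f *fakeFile) MapInternal() ([]byte, error) {
	return nil, BufferedIOFallbackErr{}
}

// BufferReadAt reads len(dst) bytes starting at off. Like io.ReaderAt, it
// never returns a short read with a nil error.
func (f *fakeFile) BufferReadAt(off uint64, dst []byte) (uint64, error) {
	if off > uint64(len(f.data)) {
		return 0, errors.New("offset out of range")
	}
	n := copy(dst, f.data[off:])
	if n < len(dst) {
		return uint64(n), errors.New("short read")
	}
	return uint64(n), nil
}

// readAt tries the mapping first and falls back to buffered reads only when
// MapInternal returns BufferedIOFallbackErr (returned by value).
func readAt(f *fakeFile, off uint64, dst []byte) (uint64, error) {
	m, err := f.MapInternal()
	if err == nil {
		// Mapped path: copy straight out of the mapping.
		if off > uint64(len(m)) {
			return 0, errors.New("offset out of range")
		}
		return uint64(copy(dst, m[off:])), nil
	}
	var fallback BufferedIOFallbackErr
	if errors.As(err, &fallback) {
		return f.BufferReadAt(off, dst)
	}
	return 0, err
}

func main() {
	f := &fakeFile{data: []byte("hello, memmap")}
	buf := make([]byte, 5)
	n, err := readAt(f, 7, buf)
	fmt.Println(n, string(buf), err)
}
```

Because the error is returned by value, the `errors.As` target is a plain `BufferedIOFallbackErr` variable rather than a pointer type.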

type BusError

type BusError struct {
	// Err is the original error.
	Err error
}

BusError may be returned by implementations of Mappable.Translate for errors that should result in SIGBUS delivery if they cause application page fault handling to fail.

func (*BusError) Error

func (b *BusError) Error() string

Error implements error.Error.
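A caller handling a failed page fault might distinguish BusError from other errors as sketched below. The `BusError` type here is a local stand-in, and `faultSignal`, the message format, and the choice of SIGSEGV for non-bus errors are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// BusError is a local stand-in for memmap.BusError.
type BusError struct {
	// Err is the original error.
	Err error
}

// Error implements error.Error. The message format is an assumption.
func (b *BusError) Error() string { return "BusError: " + b.Err.Error() }

// errIO is an example underlying error.
var errIO = errors.New("I/O error")

// faultSignal is a hypothetical helper that names the signal to deliver when
// page fault handling fails with err. Using SIGSEGV for every other error is
// an assumption of this sketch.
func faultSignal(err error) string {
	var be *BusError
	if errors.As(err, &be) {
		return "SIGBUS"
	}
	return "SIGSEGV"
}

func main() {
	fmt.Println(faultSignal(&BusError{Err: errIO}))
	// errors.As also finds a BusError wrapped with %w.
	fmt.Println(faultSignal(fmt.Errorf("translate: %w", &BusError{Err: errIO})))
	fmt.Println(faultSignal(errIO))
}
```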

type File

type File interface {

	// IncRef increments the reference count on all pages in fr and
	// associates each page with a memCgID (memory cgroup id) to which it
	// belongs. memCgID will not be changed if the page already exists.
	//
	// Preconditions:
	//	* fr.Start and fr.End must be page-aligned.
	//	* fr.Length() > 0.
	//	* At least one reference must be held on all pages in fr. (The File
	//		interface does not provide a way to acquire an initial reference;
	//		implementors may define mechanisms for doing so.)
	IncRef(fr FileRange, memCgID uint32)

	// DecRef decrements the reference count on all pages in fr.
	//
	// Preconditions:
	//	* fr.Start and fr.End must be page-aligned.
	//	* fr.Length() > 0.
	//	* At least one reference must be held on all pages in fr.
	DecRef(fr FileRange)

	// MapInternal returns a mapping of the given file offsets in the invoking
	// process' address space for reading and writing.
	//
	// Note that fr.Start and fr.End need not be page-aligned.
	//
	// Preconditions:
	//	* fr.Length() > 0.
	//	* At least one reference must be held on all pages in fr.
	//
	// Postconditions: The returned mapping is valid as long as at least one
	// reference is held on the mapped pages.
	MapInternal(fr FileRange, at hostarch.AccessType) (safemem.BlockSeq, error)

	// DataFD blocks until offsets fr in the file contain valid data, then
	// returns the file descriptor represented by the File.
	//
	// Note that fr.Start and fr.End need not be page-aligned.
	//
	// Preconditions:
	//	* fr.Length() > 0.
	//	* At least one reference must be held on all pages in fr.
	DataFD(fr FileRange) (int, error)

	// BufferReadAt reads len(dst) bytes from the file into dst, starting at
	// file offset off. It returns the number of bytes read. Like
	// io.ReaderAt.ReadAt(), it never returns a short read with a nil error.
	//
	// Implementations of File for which MapInternal() never returns
	// BufferedIOFallbackErr can embed NoBufferedIOFallback to obtain an
	// appropriate implementation of BufferReadAt.
	//
	// Preconditions:
	//	* MapInternal() returned a BufferedIOFallbackErr.
	//	* At least one reference must be held on all read pages.
	BufferReadAt(off uint64, dst []byte) (uint64, error)

	// BufferWriteAt writes len(src) bytes src to the file, starting at file
	// offset off. It returns the number of bytes written. Like
	// io.WriterAt.WriteAt(), it never returns a short write with a nil error.
	//
	// Implementations of File for which MapInternal() never returns
	// BufferedIOFallbackErr can embed NoBufferedIOFallback to obtain an
	// appropriate implementation of BufferWriteAt.
	//
	// Preconditions:
	//	* MapInternal() returned a BufferedIOFallbackErr.
	//	* At least one reference must be held on all written pages.
	BufferWriteAt(off uint64, src []byte) (uint64, error)

	// FD returns the file descriptor represented by the File. The returned
	// file descriptor should not be used to implement
	// platform.AddressSpace.MapFile, since the contents of the File may not be
	// valid; use DataFD instead.
	FD() int
}

File represents a host file that may be mapped into a platform.AddressSpace.
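The IncRef preconditions (page-aligned bounds, non-empty range, an existing reference on every page) can be made concrete with a toy reference counter. This is only a sketch: `pageRefs`, `checkRefRange`, and the 4096-byte page size are hypothetical, and unlike File.IncRef the method returns an error so that violations are observable.

```go
package main

import "fmt"

// pageSize is an assumed page size; the real value comes from hostarch.
const pageSize = 4096

// FileRange is a local stand-in for memmap.FileRange: offsets [Start, End).
type FileRange struct{ Start, End uint64 }

// checkRefRange reports a violation of the alignment and length
// preconditions shared by IncRef and DecRef.
func checkRefRange(fr FileRange) error {
	if fr.End <= fr.Start {
		return fmt.Errorf("range [%#x, %#x) is empty", fr.Start, fr.End)
	}
	if fr.Start%pageSize != 0 || fr.End%pageSize != 0 {
		return fmt.Errorf("range [%#x, %#x) is not page-aligned", fr.Start, fr.End)
	}
	return nil
}

// pageRefs is a hypothetical per-page reference counter keyed by file offset.
type pageRefs map[uint64]int

// IncRef adds a reference to every page in fr, but only if each page already
// holds at least one reference.
func (p pageRefs) IncRef(fr FileRange) error {
	if err := checkRefRange(fr); err != nil {
		return err
	}
	for off := fr.Start; off < fr.End; off += pageSize {
		if p[off] == 0 {
			return fmt.Errorf("no reference held on page at offset %#x", off)
		}
	}
	for off := fr.Start; off < fr.End; off += pageSize {
		p[off]++
	}
	return nil
}

func main() {
	refs := pageRefs{0: 1, pageSize: 1}
	fmt.Println(refs.IncRef(FileRange{0, 2 * pageSize}))            // both pages already referenced
	fmt.Println(refs.IncRef(FileRange{0, 100}))                     // unaligned end
	fmt.Println(refs.IncRef(FileRange{2 * pageSize, 3 * pageSize})) // no initial reference
}
```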

type InvalidateOpts

type InvalidateOpts struct {
	// InvalidatePrivate is true if private pages in the invalidated region
	// should also be discarded, causing their data to be lost.
	InvalidatePrivate bool
}

InvalidateOpts holds options to MappingSpace.Invalidate.

type MLockMode

type MLockMode int

MLockMode specifies the memory locking behavior of a memory mapping.

const (
	// MLockNone specifies that a mapping has no memory locking behavior.
	//
	// This must be the zero value for MLockMode.
	MLockNone MLockMode = iota

	// MLockEager specifies that a mapping is memory-locked, as by mlock() or
	// similar. Pages in the mapping should be made, and kept, resident in
	// physical memory as soon as possible.
	//
	// As of this writing, MLockEager does not cause memory-locking to be
	// requested from the host; it only affects the sentry's memory management
	// behavior.
	//
	// MLockEager is analogous to Linux's VM_LOCKED.
	MLockEager

	// MLockLazy specifies that a mapping is memory-locked, as by mlock() or
	// similar. Pages in the mapping should be kept resident in physical memory
	// once they have been made resident due to e.g. a page fault.
	//
	// As of this writing, MLockLazy does not cause memory-locking to be
	// requested from the host; in fact, it has virtually no effect, except for
	// interactions between mlocked pages and other syscalls.
	//
	// MLockLazy is analogous to Linux's VM_LOCKED | VM_LOCKONFAULT.
	MLockLazy
)

Note that the ordering of MLockModes is significant; see mm.MemoryManager.defMLockMode.
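One way the ordering could matter, offered here as an assumption rather than a description of mm.MemoryManager, is that modes are compared numerically so that a default mode acts as a floor. The constants below are local copies, and `applyDefault` is a hypothetical helper.

```go
package main

import "fmt"

// MLockMode is a local copy of memmap.MLockMode for illustration.
type MLockMode int

const (
	MLockNone MLockMode = iota // zero value: no locking
	MLockEager
	MLockLazy
)

// applyDefault is a hypothetical helper: it raises mode to at least def by
// relying on the numeric order of the constants.
func applyDefault(mode, def MLockMode) MLockMode {
	if mode < def {
		return def
	}
	return mode
}

func main() {
	fmt.Println(applyDefault(MLockNone, MLockEager) == MLockEager)
	fmt.Println(applyDefault(MLockLazy, MLockEager) == MLockLazy)
}
```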

type MMapOpts

type MMapOpts struct {
	// Length is the length of the mapping.
	Length uint64

	// MappingIdentity controls the lifetime of Mappable, and provides
	// properties of the mapping shown in /proc/[pid]/maps. If MMapOpts is used
	// to successfully create a memory mapping, a reference is taken on
	// MappingIdentity.
	MappingIdentity MappingIdentity

	// Mappable is the Mappable to be mapped. If Mappable is nil, the mapping
	// is anonymous. If Mappable is not nil, it must remain valid as long as a
	// reference is held on MappingIdentity.
	Mappable Mappable

	// Offset is the offset into Mappable to map. If Mappable is nil, Offset is
	// ignored.
	Offset uint64

	// Addr is the suggested address for the mapping.
	Addr hostarch.Addr

	// Fixed specifies whether this is a fixed mapping (it must be located at
	// Addr).
	Fixed bool

	// Unmap specifies whether existing mappings in the range being mapped may
	// be replaced. If Unmap is true, Fixed must be true.
	Unmap bool

	// If Map32Bit is true, all addresses in the created mapping must fit in a
	// 32-bit integer. (Note that the "end address" of the mapping, i.e. the
	// address of the first byte *after* the mapping, need not fit in a 32-bit
	// integer.) Map32Bit is ignored if Fixed is true.
	Map32Bit bool

	// Perms is the set of permissions to be applied to this mapping.
	Perms hostarch.AccessType

	// MaxPerms limits the set of permissions that may ever apply to this
	// mapping. If Mappable is not nil, all memmap.Translations returned by
	// Mappable.Translate must support all accesses in MaxPerms.
	//
	// Preconditions: MaxPerms should be an effective AccessType, as
	// access cannot be limited beyond effective AccessTypes.
	MaxPerms hostarch.AccessType

	// Private is true if writes to the mapping should be propagated to a copy
	// that is exclusive to the MemoryManager.
	Private bool

	// GrowsDown is true if the mapping should be automatically expanded
	// downward on guard page faults.
	GrowsDown bool

	// Stack is equivalent to MAP_STACK, which has no mandatory semantics in
	// Linux.
	Stack bool

	// PlatformEffect controls the synchronous effect of this call on the
	// underlying platform.AddressSpace.
	PlatformEffect MMapPlatformEffect

	// MLockMode specifies the memory locking behavior of the mapping.
	MLockMode MLockMode

	// Name is the name used for the mapping in /proc/[pid]/maps. If Name is
	// empty, MappingIdentity.MappedName() will be used instead.
	//
	// TODO(jamieliu): Replace entirely with MappingIdentity?
	Name string

	// NameMut controls the effect of prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME)
	// on this mapping.
	NameMut NameMut

	// Force means to skip validation checks of Addr and Length. It can be
	// used to create special mappings below mm.layout.MinAddr and
	// mm.layout.MaxAddr. It has to be used with caution.
	//
	// If Force is true, Unmap and Fixed must be true.
	Force bool

	// If RequirePlatformEffect is false, PlatformEffect is best-effort;
	// failures to create mappings in the platform.AddressSpace are silently
	// ignored. If RequirePlatformEffect is true, failure to create mappings in
	// the platform.AddressSpace causes MMap() to fail. (If PlatformEffect is
	// PlatformEffectDefault, RequirePlatformEffect is ignored.)
	RequirePlatformEffect bool

	// SentryOwnedContent indicates that the sentry exclusively controls the
	// underlying memory backing the mapping, so the memory content is
	// guaranteed not to be modified outside the sentry's purview.
	SentryOwnedContent bool
}

MMapOpts specifies a request to create a memory mapping.
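The field comments state a few constraints between flags: Unmap requires Fixed, Force requires both Unmap and Fixed, and Map32Bit constrains every address in the mapping but not the end address. The sketch below checks them; `mmapFlags`, `validate`, and `fitsIn32Bits` are hypothetical names and are not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// mmapFlags is a hypothetical subset of memmap.MMapOpts holding only the
// flags whose combinations are constrained.
type mmapFlags struct {
	Fixed bool
	Unmap bool
	Force bool
}

// validate checks the flag combinations stated in the MMapOpts comments.
func validate(o mmapFlags) error {
	if o.Unmap && !o.Fixed {
		return errors.New("Unmap requires Fixed")
	}
	if o.Force && !(o.Unmap && o.Fixed) {
		return errors.New("Force requires Unmap and Fixed")
	}
	return nil
}

// fitsIn32Bits reports whether every address in [addr, addr+length) fits in
// a 32-bit integer. Only the last byte is checked, so the end address itself
// may be 1<<32. It assumes length > 0 and no uint64 overflow.
func fitsIn32Bits(addr, length uint64) bool {
	return addr+length-1 <= 0xFFFFFFFF
}

func main() {
	fmt.Println(validate(mmapFlags{Fixed: true, Unmap: true, Force: true}))
	fmt.Println(validate(mmapFlags{Unmap: true}))
	fmt.Println(fitsIn32Bits(0xFFFFF000, 0x1000)) // end address is 1<<32
	fmt.Println(fitsIn32Bits(0xFFFFF000, 0x2000))
}
```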

type MMapPlatformEffect

type MMapPlatformEffect uint8

MMapPlatformEffect is the type of MMapOpts.PlatformEffect.

const (
	// PlatformEffectDefault indicates that no specific behavior is requested
	// from the platform.
	PlatformEffectDefault MMapPlatformEffect = iota

	// PlatformEffectPopulate indicates that platform mappings should be
	// established for all pages in the mapping.
	PlatformEffectPopulate

	// PlatformEffectCommit is like PlatformEffectPopulate, but also requests
	// that the platform eagerly commit resources to the mapping, as in
	// platform.AddressSpace.MapFile(precommit=true).
	PlatformEffectCommit
)

The constants above are the possible values for MMapOpts.PlatformEffect.

type Mappable

type Mappable interface {
	// AddMapping notifies the Mappable of a mapping from addresses ar in ms to
	// offsets [offset, offset+ar.Length()) in this Mappable.
	//
	// The writable flag indicates whether the backing data for a Mappable can
	// be modified through the mapping. Effectively, this means a shared mapping
	// where Translate may be called with at.Write == true. This is a property
	// established at mapping creation and must remain constant throughout the
	// lifetime of the mapping.
	//
	// Preconditions: offset+ar.Length() does not overflow.
	AddMapping(ctx context.Context, ms MappingSpace, ar hostarch.AddrRange, offset uint64, writable bool) error

	// RemoveMapping notifies the Mappable of the removal of a mapping from
	// addresses ar in ms to offsets [offset, offset+ar.Length()) in this
	// Mappable.
	//
	// Preconditions:
	//	* offset+ar.Length() does not overflow.
	//	* The removed mapping must exist. writable must match the
	//		corresponding call to AddMapping.
	RemoveMapping(ctx context.Context, ms MappingSpace, ar hostarch.AddrRange, offset uint64, writable bool)

	// CopyMapping notifies the Mappable of an attempt to copy a mapping in ms
	// from srcAR to dstAR. For most Mappables, this is equivalent to
	// AddMapping. Note that it is possible that srcAR.Length() != dstAR.Length(),
	// and also that srcAR.Length() == 0.
	//
	// CopyMapping is only called when a mapping is copied within a given
	// MappingSpace; it is analogous to Linux's vm_operations_struct::mremap.
	//
	// Preconditions:
	//	* offset+srcAR.Length() and offset+dstAR.Length() do not overflow.
	//	* The mapping at srcAR must exist. writable must match the
	//		corresponding call to AddMapping.
	CopyMapping(ctx context.Context, ms MappingSpace, srcAR, dstAR hostarch.AddrRange, offset uint64, writable bool) error

	// Translate returns the Mappable's current mappings for at least the range
	// of offsets specified by required, and at most the range of offsets
	// specified by optional. at is the set of access types that may be
	// performed using the returned Translations. If not all required offsets
	// are translated, it returns a non-nil error explaining why.
	//
	// Translations are valid until invalidated by a callback to
	// MappingSpace.Invalidate or until the caller removes its mapping of the
	// translated range. Mappable implementations must ensure that at least one
	// reference is held on all pages in a File that may be the result
	// of a valid Translation.
	//
	// Preconditions:
	//	* required.Length() > 0.
	//	* optional.IsSupersetOf(required).
	//	* required and optional must be page-aligned.
	//	* The caller must have established a mapping for all of the queried
	//		offsets via a previous call to AddMapping.
	//	* The caller is responsible for ensuring that calls to Translate
	//		synchronize with invalidation.
	//
	// Postconditions: See CheckTranslateResult.
	Translate(ctx context.Context, required, optional MappableRange, at hostarch.AccessType) ([]Translation, error)

	// InvalidateUnsavable requests that the Mappable invalidate Translations
	// that cannot be preserved across save/restore.
	//
	// Invariant: InvalidateUnsavable never races with concurrent calls to any
	// other Mappable methods.
	InvalidateUnsavable(ctx context.Context) error
}

Mappable represents a memory-mappable object, a mutable mapping from uint64 offsets to (File, uint64 File offset) pairs.

See mm/mm.go for Mappable's place in the lock order.

All Mappable methods have the following preconditions:

  • hostarch.AddrRanges and MappableRanges must be non-empty (Length() != 0).
  • hostarch.Addrs and Mappable offsets must be page-aligned.
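The range preconditions listed for Translate (non-empty required range, optional a superset of required, page alignment) can be checked mechanically. The sketch below uses a local `MappableRange` stand-in, a hypothetical `checkTranslateArgs` helper, and an assumed 4096-byte page size; it does not reproduce CheckTranslateResult, which validates the results rather than the arguments.

```go
package main

import "fmt"

// pageSize is an assumed page size; the real value comes from hostarch.
const pageSize = 4096

// MappableRange is a local stand-in for memmap.MappableRange: [Start, End).
type MappableRange struct{ Start, End uint64 }

// Length returns the number of offsets in mr, or 0 if mr is inverted.
func (mr MappableRange) Length() uint64 {
	if mr.End < mr.Start {
		return 0
	}
	return mr.End - mr.Start
}

// IsSupersetOf reports whether mr contains every offset in other.
func (mr MappableRange) IsSupersetOf(other MappableRange) bool {
	return mr.Start <= other.Start && other.End <= mr.End
}

// aligned reports whether both bounds of mr are page-aligned.
func aligned(mr MappableRange) bool {
	return mr.Start%pageSize == 0 && mr.End%pageSize == 0
}

// checkTranslateArgs is a hypothetical helper that verifies the range
// preconditions of Mappable.Translate.
func checkTranslateArgs(required, optional MappableRange) error {
	if required.Length() == 0 {
		return fmt.Errorf("required range %+v is empty", required)
	}
	if !optional.IsSupersetOf(required) {
		return fmt.Errorf("optional %+v is not a superset of required %+v", optional, required)
	}
	if !aligned(required) || !aligned(optional) {
		return fmt.Errorf("ranges %+v and %+v must be page-aligned", required, optional)
	}
	return nil
}

func main() {
	required := MappableRange{pageSize, 2 * pageSize}
	fmt.Println(checkTranslateArgs(required, MappableRange{0, 4 * pageSize}))
	fmt.Println(checkTranslateArgs(required, MappableRange{0, pageSize}))
}
```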

type MappingIdentity

type MappingIdentity interface {
	// IncRef increments the MappingIdentity's reference count.
	IncRef()

	// DecRef decrements the MappingIdentity's reference count.
	DecRef(ctx context.Context)

	// MappedName returns the application-visible name shown in
	// /proc/[pid]/maps.
	MappedName(ctx context.Context) string

	// DeviceID returns the device number shown in /proc/[pid]/maps.
	DeviceID() uint64

	// InodeID returns the inode number shown in /proc/[pid]/maps.
	InodeID() uint64

	// Msync has the same semantics as fs.FileOperations.Fsync(ctx,
	// int64(mr.Start), int64(mr.End-1), fs.SyncData).
	// (fs.FileOperations.Fsync() takes an inclusive end, but mr.End is
	// exclusive, hence mr.End-1.) It is defined rather than Fsync so that
	// implementors don't need to depend on the fs package for fs.SyncType.
	Msync(ctx context.Context, mr MappableRange) error
}

MappingIdentity controls the lifetime of a Mappable, and provides information about the Mappable for /proc/[pid]/maps. It is distinct from Mappable because all Mappables that are coherent must compare equal to support the implementation of shared futexes, but different MappingIdentities may represent the same Mappable, in the same way that multiple fs.Files may represent the same fs.Inode. (This similarity is not coincidental; fs.File implements MappingIdentity, and some fs.InodeOperations implement Mappable.)

type MappingOfRange

type MappingOfRange struct {
	MappingSpace MappingSpace
	AddrRange    hostarch.AddrRange
	Writable     bool
}

MappingOfRange represents a mapping of a MappableRange.

+stateify savable

func (MappingOfRange) String

func (r MappingOfRange) String() string

String implements fmt.Stringer.String.

type MappingSpace

type MappingSpace interface {
	// Invalidate is called to notify the MappingSpace that values returned by
	// previous calls to Mappable.Translate for offsets mapped by addresses in
	// ar are no longer valid.
	//
	// Invalidate must not take any locks preceding mm.MemoryManager.activeMu
	// in the lock order.
	//
	// Preconditions:
	//	* ar.Length() != 0.
	//	* ar must be page-aligned.
	Invalidate(ar hostarch.AddrRange, opts InvalidateOpts)
}

MappingSpace represents a mutable mapping from hostarch.Addrs to (Mappable, uint64 offset) pairs.

type MappingsOfRange

type MappingsOfRange map[MappingOfRange]struct{}

MappingsOfRange is the value type of MappingSet, and represents the set of all mappings of the corresponding MappableRange.

Using a map offers O(1) lookups in RemoveMapping and mappingSetFunctions.Merge.
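The set-as-map idiom can be sketched with local stand-ins. Here the `MappingSpace` field is a plain string rather than the real interface, which is a simplification; a Go map key only needs to be comparable, and a struct of comparable fields qualifies.

```go
package main

import "fmt"

// AddrRange is a local stand-in for hostarch.AddrRange.
type AddrRange struct{ Start, End uint64 }

// MappingOfRange mirrors the shape of memmap.MappingOfRange. MappingSpace is
// a string here instead of an interface value, purely for illustration.
type MappingOfRange struct {
	MappingSpace string
	AddrRange    AddrRange
	Writable     bool
}

// MappingsOfRange is a set of mappings: the empty struct values take no space.
type MappingsOfRange map[MappingOfRange]struct{}

func main() {
	set := MappingsOfRange{}
	m := MappingOfRange{"mm1", AddrRange{0x1000, 0x2000}, true}

	set[m] = struct{}{} // insert
	_, ok := set[m]     // constant-time membership test
	fmt.Println(ok, len(set))

	delete(set, m) // constant-time removal, as RemoveMapping would need
	fmt.Println(len(set))
}
```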

type NameMut

type NameMut uint8

NameMut is the type of MMapOpts.NameMut.

const (
	// NameMutDisallowed indicates that PR_SET_VMA_ANON_NAME should fail.
	NameMutDisallowed NameMut = iota

	// NameMutAnon indicates that PR_SET_VMA_ANON_NAME should succeed, and
	// treat the mapping as private anonymous memory.
	NameMutAnon

	// NameMutAnonShmem indicates that PR_SET_VMA_ANON_NAME should succeed, and
	// treat the mapping as shared anonymous memory.
	NameMutAnonShmem
)

The constants above are the possible values for MMapOpts.NameMut.

type NoBufferedIOFallback

type NoBufferedIOFallback struct{}

NoBufferedIOFallback implements File.BufferReadAt() and BufferWriteAt() for implementations of File for which MapInternal() never returns BufferedIOFallbackErr.

func (NoBufferedIOFallback) BufferReadAt

func (NoBufferedIOFallback) BufferReadAt(off uint64, dst []byte) (uint64, error)

BufferReadAt implements File.BufferReadAt.

func (NoBufferedIOFallback) BufferWriteAt

func (NoBufferedIOFallback) BufferWriteAt(off uint64, src []byte) (uint64, error)

BufferWriteAt implements File.BufferWriteAt.
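Embedding promotes both methods onto the embedding type, which is how a File implementation could pick them up. In the sketch below every type is a local stand-in; what the real NoBufferedIOFallback methods do when called is not stated above, so returning `errNoFallback` is purely an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoFallback is a hypothetical error used by this sketch only.
var errNoFallback = errors.New("buffered I/O fallback not supported")

// NoBufferedIOFallback is a local stand-in for memmap.NoBufferedIOFallback.
type NoBufferedIOFallback struct{}

// BufferReadAt always fails in this sketch (assumed behavior).
func (NoBufferedIOFallback) BufferReadAt(off uint64, dst []byte) (uint64, error) {
	return 0, errNoFallback
}

// BufferWriteAt always fails in this sketch (assumed behavior).
func (NoBufferedIOFallback) BufferWriteAt(off uint64, src []byte) (uint64, error) {
	return 0, errNoFallback
}

// bufferedIO is the buffered-I/O part of the File interface.
type bufferedIO interface {
	BufferReadAt(off uint64, dst []byte) (uint64, error)
	BufferWriteAt(off uint64, src []byte) (uint64, error)
}

// alwaysMappableFile is a hypothetical File implementation whose MapInternal
// never returns BufferedIOFallbackErr, so it embeds NoBufferedIOFallback.
type alwaysMappableFile struct {
	NoBufferedIOFallback
	name string
}

func main() {
	// The embedded methods satisfy the interface without extra code.
	var f bufferedIO = alwaysMappableFile{name: "example"}
	n, err := f.BufferReadAt(0, make([]byte, 8))
	fmt.Println(n, err)
}
```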

type Translation

type Translation struct {
	// Source is the translated range in the Mappable.
	Source MappableRange

	// File is the mapped file.
	File File

	// Offset is the offset into File at which this Translation begins.
	Offset uint64

	// Perms is the set of permissions for which platform.AddressSpace.MapFile
	// and platform.AddressSpace.MapInternal on this Translation are permitted.
	Perms hostarch.AccessType
}

Translations are returned by Mappable.Translate.

func (Translation) FileRange

func (t Translation) FileRange() FileRange

FileRange returns the FileRange represented by t.
