ring0

package
v0.0.0-...-ba09d25

This package is not in the latest version of its module.
Published: Dec 29, 2021 | License: Apache-2.0, MIT | Imports: 11 | Imported by: 0

Documentation

Overview

Package ring0 provides basic operating system-level stubs.

Constants

const (
	// KernelFlagsSet should always be set in the kernel.
	KernelFlagsSet = _RFLAGS_RESERVED

	// UserFlagsSet are always set in userspace.
	//
	// _RFLAGS_IOPL is a set of two bits and it shows the I/O privilege
	// level. The Current Privilege Level (CPL) of the task must be less
	// than or equal to the IOPL in order for the task or program to access
	// I/O ports.
	//
	// Here, _RFLAGS_IOPL0 is used only to determine whether the task is
	// running in kernel or user mode. In user mode, the CPL is always 3,
	// so the IOPL value has no effect as long as it is below the CPL.
	//
	// We need one bit that always differs between user and kernel modes.
	// Note that even though we have KernelFlagsClear, some of those flags
	// can still be observed in kernel mode. This can happen when the Go
	// runtime switches to a goroutine that was saved in host mode. On
	// restore, the popf instruction is used to restore flags, which means
	// that all flags the goroutine had in host mode will be restored in
	// kernel mode.
	//
	// _RFLAGS_IOPL0 is never set in host and kernel modes and we always set
	// it in the user mode. So if this flag is set, the task is running in
	// the user mode and if it isn't set, the task is running in the kernel
	// mode.
	UserFlagsSet = _RFLAGS_RESERVED | _RFLAGS_IF | _RFLAGS_IOPL0

	// KernelFlagsClear should always be clear in the kernel.
	KernelFlagsClear = _RFLAGS_STEP | _RFLAGS_IF | _RFLAGS_IOPL | _RFLAGS_AC | _RFLAGS_NT

	// UserFlagsClear are always cleared in userspace.
	UserFlagsClear = _RFLAGS_NT | _RFLAGS_IOPL1
)
const (
	SegmentDescriptorAccess     SegmentDescriptorFlags = 1 << 8  // Access bit (always set).
	SegmentDescriptorWrite                             = 1 << 9  // Write permission.
	SegmentDescriptorExpandDown                        = 1 << 10 // Grows down, not used.
	SegmentDescriptorExecute                           = 1 << 11 // Execute permission.
	SegmentDescriptorSystem                            = 1 << 12 // Zero => system, 1 => user code/data.
	SegmentDescriptorPresent                           = 1 << 15 // Present.
	SegmentDescriptorAVL                               = 1 << 20 // Available.
	SegmentDescriptorLong                              = 1 << 21 // Long mode.
	SegmentDescriptorDB                                = 1 << 22 // 16 or 32-bit.
	SegmentDescriptorG                                 = 1 << 23 // Granularity: page or byte.
)

SegmentDescriptorFlag declarations.
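
As an illustration of how these flag bits combine, the following is a minimal sketch that redeclares a few of the constants locally (lowercase names, so it does not depend on the package) and ORs them together. The particular combination in `exampleCode64Flags` is a hypothetical choice for a present, executable, long-mode segment; it is not claimed to be what UserCodeSegment64 or KernelCodeSegment actually use.

```go
package main

import "fmt"

// segmentDescriptorFlags is a local stand-in for SegmentDescriptorFlags.
type segmentDescriptorFlags uint32

// A subset of the flag bits from the listing above, redeclared locally.
const (
	segAccess  segmentDescriptorFlags = 1 << 8  // Access bit.
	segWrite                          = 1 << 9  // Write permission.
	segExecute                        = 1 << 11 // Execute permission.
	segSystem                         = 1 << 12 // 1 => user code/data.
	segPresent                        = 1 << 15 // Present.
	segLong                           = 1 << 21 // Long mode.
)

// has reports whether every bit in mask is set in f.
func (f segmentDescriptorFlags) has(mask segmentDescriptorFlags) bool {
	return f&mask == mask
}

// exampleCode64Flags returns a hypothetical flag set for a 64-bit code
// segment. This is an assumption for illustration only.
func exampleCode64Flags() segmentDescriptorFlags {
	return segAccess | segExecute | segSystem | segPresent | segLong
}

func main() {
	f := exampleCode64Flags()
	fmt.Printf("flags      = %#x\n", uint32(f))
	fmt.Println("executable =", f.has(segExecute))
	fmt.Println("writable   =", f.has(segWrite))
}
```

Because only the first constant in the listing carries an explicit type, the remaining ones are untyped constants; they take on the flags type when combined with a typed value, as in the expression above.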

Variables

var (
	// UserspaceSize is the total size of userspace.
	UserspaceSize = uintptr(1) << (VirtualAddressBits() - 1)

	// MaximumUserAddress is the largest possible user address.
	MaximumUserAddress = (UserspaceSize - 1) & ^uintptr(hostarch.PageSize-1)

	// KernelStartAddress is the starting kernel address.
	KernelStartAddress = ^uintptr(0) - (UserspaceSize - 1)
)
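
To make the arithmetic in these definitions concrete, here is a small worked sketch. It assumes 48 virtual address bits and a 4 KiB page size (both are assumptions; the package obtains them from VirtualAddressBits() and hostarch.PageSize), and it uses uint64 rather than uintptr so the result does not depend on the build target.

```go
package main

import "fmt"

// pageSize is an assumed 4 KiB page size, standing in for hostarch.PageSize.
const pageSize = 4096

// layout mirrors the three variable definitions above for a given number
// of virtual address bits.
func layout(virtualAddressBits uint32) (userspaceSize, maxUserAddr, kernelStart uint64) {
	// Userspace occupies the lower half of the virtual address space.
	userspaceSize = uint64(1) << (virtualAddressBits - 1)
	// The largest user address, rounded down to a page boundary.
	maxUserAddr = (userspaceSize - 1) &^ uint64(pageSize-1)
	// The kernel occupies the same-sized region at the top of the
	// 64-bit address space.
	kernelStart = ^uint64(0) - (userspaceSize - 1)
	return
}

func main() {
	size, maxAddr, kstart := layout(48) // 48 bits is an assumption.
	fmt.Printf("UserspaceSize      = %#x\n", size)
	fmt.Printf("MaximumUserAddress = %#x\n", maxAddr)
	fmt.Printf("KernelStartAddress = %#x\n", kstart)
}
```

With 48 bits, userspace ends just below 0x800000000000 and the kernel region begins at 0xffff800000000000, which matches the two canonical halves of the amd64 address space.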

Functions

func AddrOfStart

func AddrOfStart() uintptr

AddrOfStart returns the address of the CPU entrypoint.

The following start conditions must be satisfied:

  • AX should contain the CPU pointer.
  • c.GDT() should be loaded as the GDT.
  • c.IDT() should be loaded as the IDT.
  • c.CR0() should be the current CR0 value.
  • c.CR3() should be set to the kernel PageTables.
  • c.CR4() should be the current CR4 value.
  • c.EFER() should be the current EFER value.

The CPU state will be set to c.Registers().

In Go 1.17+, Go references to assembly functions resolve to an ABIInternal wrapper function rather than the function itself. We must reference from assembly to get the ABI0 (i.e., primary) address.

func Emit

func Emit(w io.Writer)

Emit prints architecture-specific offsets.

func Halt

func Halt()

Halt halts execution.

func HaltAndWriteFSBase

func HaltAndWriteFSBase(regs *arch.Registers)

HaltAndWriteFSBase halts execution. On resume, it sets FS_BASE from the value in regs.

func Init

func Init(featureSet *cpuid.FeatureSet)

Init sets function pointers based on architectural features.

This must be called prior to using ring0.

func IsCanonical

func IsCanonical(addr uint64) bool

IsCanonical indicates whether addr is canonical per the amd64 spec.
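
For reference, an address is canonical on amd64 when all bits above the highest implemented virtual address bit are copies of that bit. The sketch below is one way to express that check and is not necessarily how the package implements it; the `bits` parameter and the function name are hypothetical, and the example values assume 48 implemented bits.

```go
package main

import "fmt"

// isCanonical reports whether addr is canonical for the given number of
// implemented virtual address bits: sign-extending the low bits of addr
// must reproduce addr itself.
func isCanonical(addr uint64, bits uint) bool {
	shift := 64 - bits
	// Shift the top implemented bit into the sign position, then shift
	// back arithmetically so that it is replicated into the upper bits.
	return uint64(int64(addr<<shift)>>shift) == addr
}

func main() {
	for _, addr := range []uint64{
		0x00007fffffffffff, // top of the lower half
		0x0000800000000000, // first non-canonical address
		0xffff7fffffffffff, // last non-canonical address
		0xffff800000000000, // bottom of the upper half
	} {
		fmt.Printf("%#016x canonical=%v\n", addr, isCanonical(addr, 48))
	}
}
```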

func IsKernelFlags

func IsKernelFlags(rflags uint64) bool

IsKernelFlags returns true if rflags corresponds to kernel mode.

go:nosplit
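
The UserFlagsSet comment describes the idea behind this check: the low IOPL bit is set only in user mode. A minimal sketch of that test follows. The bit positions are the architectural RFLAGS positions (reserved bit 1, IF bit 9, IOPL bits 12 and 13); the lowercase names are local stand-ins, and the body of the package's own IsKernelFlags is not shown on this page, so treat this as an assumption about its logic.

```go
package main

import "fmt"

// Architectural RFLAGS bit positions, redeclared locally for the sketch.
const (
	rflagsReserved uint64 = 1 << 1  // Always-set reserved bit.
	rflagsIF       uint64 = 1 << 9  // Interrupt enable flag.
	rflagsIOPL0    uint64 = 1 << 12 // Low bit of the I/O privilege level.
)

// userFlagsSet mirrors the UserFlagsSet definition above.
const userFlagsSet = rflagsReserved | rflagsIF | rflagsIOPL0

// isKernelFlags treats a clear IOPL0 bit as the kernel-mode marker,
// following the description in the UserFlagsSet comment.
func isKernelFlags(rflags uint64) bool {
	return rflags&rflagsIOPL0 == 0
}

func main() {
	fmt.Println("user flags   ->", isKernelFlags(userFlagsSet))
	fmt.Println("kernel flags ->", isKernelFlags(rflagsReserved))
}
```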

func PhysicalAddressBits

func PhysicalAddressBits() uint32

PhysicalAddressBits returns the number of bits available for physical addresses.

FIXME(b/69382326): This should use the cpuid passed to Init.

func ReadCR2

func ReadCR2() uintptr

ReadCR2 reads the current CR2 value.

func RestoreKernelFPState

func RestoreKernelFPState()

RestoreKernelFPState restores the Sentry floating point state.

func SetCPUIDFaulting

func SetCPUIDFaulting(on bool) bool

SetCPUIDFaulting sets CPUID faulting per the boolean value.

True is returned if faulting could be set.

func VirtualAddressBits

func VirtualAddressBits() uint32

VirtualAddressBits returns the number of bits available for virtual addresses.

Note that sign-extension semantics apply to the highest order bit.

FIXME(b/69382326): This should use the cpuid passed to Init.

Types

type CPU

type CPU struct {

	// CPUArchState is architecture-specific state.
	CPUArchState
	// contains filtered or unexported fields
}

CPU is the per-CPU struct.

func (*CPU) CR0

func (c *CPU) CR0() uint64

CR0 returns the CPU's CR0 value.

func (*CPU) CR4

func (c *CPU) CR4() uint64

CR4 returns the CPU's CR4 value.

func (*CPU) ClearErrorCode

func (c *CPU) ClearErrorCode()

ClearErrorCode resets the error code.

func (*CPU) EFER

func (c *CPU) EFER() uint64

EFER returns the CPU's EFER value.

func (*CPU) ErrorCode

func (c *CPU) ErrorCode() (value uintptr, user bool)

ErrorCode returns the last error code.

The returned boolean indicates whether the error code corresponds to the last user error or not. If it does not, then fault information must be ignored. This is generally the result of a kernel fault while servicing a user fault.

func (*CPU) FloatingPointState

func (c *CPU) FloatingPointState() *fpu.State

FloatingPointState returns the kernel floating point state.

This is explicitly safe to call during KernelException and KernelSyscall.

func (*CPU) GDT

func (c *CPU) GDT() (uint64, uint16)

GDT returns the CPU's GDT base and limit.

func (*CPU) IDT

func (c *CPU) IDT() (uint64, uint16)

IDT returns the CPU's IDT base and limit.

func (*CPU) Init

func (c *CPU) Init(k *Kernel, cpuID int, hooks Hooks)

Init initializes a new CPU.

Init allows embedding in other objects.

func (*CPU) Registers

func (c *CPU) Registers() *arch.Registers

Registers returns a modifiable copy of the kernel registers.

This is explicitly safe to call during KernelException and KernelSyscall.

func (*CPU) StackTop

func (c *CPU) StackTop() uint64

StackTop returns the kernel's stack address.

func (*CPU) SwitchToUser

func (c *CPU) SwitchToUser(switchOpts SwitchOpts) (vector Vector)

SwitchToUser performs either a sysret or an iret.

The return value is the vector that interrupted execution.

This function will not split the stack. Callers will probably want to call runtime.entersyscall (and pair with a call to runtime.exitsyscall) prior to calling this function.

When this is done, this region is quite sensitive to things like system calls. After calling entersyscall, any memory used must already have been allocated, and no function calls without go:nosplit are permitted. Any calls made here are protected appropriately (e.g. IsCanonical and CR3).

Also note that this function transitively depends on the compiler generating code that uses IP-relative addressing instead of absolute addresses. That's the case for amd64, but may not be the case for other architectures.

Precondition: the Rip, Rsp, Fs and Gs registers must be canonical.

+checkescape:all

func (*CPU) TSS

func (c *CPU) TSS() (uint64, uint16, *SegmentDescriptor)

TSS returns the CPU's TSS base, limit and value.

type CPUArchState

type CPUArchState struct {
	// contains filtered or unexported fields
}

CPUArchState contains CPU-specific arch state.

type Gate64

type Gate64 struct {
	// contains filtered or unexported fields
}

Gate64 is a 64-bit task, trap, or interrupt gate.

type Hooks

type Hooks interface {
	// KernelSyscall is called for kernel system calls.
	//
	// Return from this call will restore registers and return to the kernel: the
	// registers must be modified directly.
	//
	// If this function is not provided, a kernel exception results in halt.
	//
	// This must be go:nosplit, as this will be on the interrupt stack.
	// Closures are permitted, as the pointer to the closure frame is not
	// passed on the stack.
	KernelSyscall()

	// KernelException handles an exception during kernel execution.
	//
	// Return from this call will restore registers and return to the kernel: the
	// registers must be modified directly.
	//
	// If this function is not provided, a kernel exception results in halt.
	//
	// This must be go:nosplit, as this will be on the interrupt stack.
	// Closures are permitted, as the pointer to the closure frame is not
	// passed on the stack.
	KernelException(Vector)
}

Hooks are hooks for kernel functions.
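
The shape of a Hooks implementation can be sketched as follows. The `Vector` and `Hooks` declarations are local copies so that the example builds on its own, and `countingHooks` is a hypothetical type that only records what it sees; a real implementation would typically inspect or modify the CPU registers, which this sketch does not attempt.

```go
package main

import "fmt"

// Vector and Hooks are local copies of the declarations on this page.
type Vector uintptr

type Hooks interface {
	KernelSyscall()
	KernelException(Vector)
}

// pageFault is vector 14, redeclared locally for the example.
const pageFault Vector = 14

// countingHooks is a hypothetical implementation that only keeps counters.
type countingHooks struct {
	syscalls   int
	exceptions int
	lastVector Vector
}

// KernelSyscall records a kernel system call. It is marked nosplit
// because the interface requires hooks to run on the interrupt stack.
//
//go:nosplit
func (h *countingHooks) KernelSyscall() {
	h.syscalls++
}

// KernelException records the vector of a kernel exception.
//
//go:nosplit
func (h *countingHooks) KernelException(v Vector) {
	h.exceptions++
	h.lastVector = v
}

// Compile-time check that countingHooks satisfies Hooks.
var _ Hooks = (*countingHooks)(nil)

func main() {
	h := &countingHooks{}
	var hooks Hooks = h
	hooks.KernelException(pageFault)
	hooks.KernelSyscall()
	fmt.Println(h.exceptions, h.lastVector, h.syscalls)
}
```

The bodies avoid allocation and calls to functions that may split the stack, in keeping with the go:nosplit requirement stated in the interface comments.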

type Kernel

type Kernel struct {
	// PageTables are the kernel pagetables; this must be provided.
	PageTables *pagetables.PageTables

	KernelArchState
}

Kernel is a global kernel object.

This contains global state, shared by multiple CPUs.

func (*Kernel) EntryRegions

func (k *Kernel) EntryRegions() map[uintptr]uintptr

EntryRegions returns the set of kernel entry regions (must be mapped).

func (*Kernel) Init

func (k *Kernel) Init(maxCPUs int)

Init initializes a new kernel.

type KernelArchState

type KernelArchState struct {
	// contains filtered or unexported fields
}

KernelArchState contains architecture-specific state.

type SegmentDescriptor

type SegmentDescriptor struct {
	// contains filtered or unexported fields
}

SegmentDescriptor is a segment descriptor.

var (
	UserCodeSegment32 SegmentDescriptor
	UserDataSegment   SegmentDescriptor
	UserCodeSegment64 SegmentDescriptor
	KernelCodeSegment SegmentDescriptor
	KernelDataSegment SegmentDescriptor
)

Standard segments.

func (*SegmentDescriptor) Base

func (d *SegmentDescriptor) Base() uint32

Base returns the descriptor's base linear address.

func (*SegmentDescriptor) DPL

func (d *SegmentDescriptor) DPL() int

DPL returns the descriptor privilege level.

func (*SegmentDescriptor) Flags

func (d *SegmentDescriptor) Flags() SegmentDescriptorFlags

Flags returns descriptor flags.

func (*SegmentDescriptor) Limit

func (d *SegmentDescriptor) Limit() uint32

Limit returns the descriptor size.

type SegmentDescriptorFlags

type SegmentDescriptorFlags uint32

SegmentDescriptorFlags are typed flags within a descriptor.

type Selector

type Selector uint16

Selector is a segment Selector.

const (
	Kcode   Selector = segKcode << 3
	Kdata   Selector = segKdata << 3
	Ucode32 Selector = (segUcode32 << 3) | 3
	Udata   Selector = (segUdata << 3) | 3
	Ucode64 Selector = (segUcode64 << 3) | 3
	Tss     Selector = segTss << 3
)

Selectors.
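
The selector constants above follow the x86 selector layout: the descriptor table index sits in bits 3 to 15, bit 2 is the table indicator, and the low two bits are the requested privilege level (RPL). A minimal sketch of encoding and decoding that layout is shown here. The index value 4 is a made-up example, since the segKcode, segUcode64 and related index constants are not shown on this page.

```go
package main

import "fmt"

// selector is a local stand-in for the Selector type.
type selector uint16

// makeSelector builds a GDT selector from a table index and an RPL.
func makeSelector(index uint16, rpl uint8) selector {
	return selector(index<<3) | selector(rpl&3)
}

// index returns the descriptor table index encoded in the selector.
func (s selector) index() uint16 { return uint16(s >> 3) }

// rpl returns the requested privilege level (0 for kernel, 3 for user).
func (s selector) rpl() uint8 { return uint8(s & 3) }

func main() {
	// Index 4 is a hypothetical value chosen for illustration.
	user := makeSelector(4, 3)
	kernel := makeSelector(4, 0)
	fmt.Printf("user:   %#x index=%d rpl=%d\n", uint16(user), user.index(), user.rpl())
	fmt.Printf("kernel: %#x index=%d rpl=%d\n", uint16(kernel), kernel.index(), kernel.rpl())
}
```

This is why the user selectors in the listing are ORed with 3 while the kernel selectors are not.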

type SwitchArchOpts

type SwitchArchOpts struct {
	// UserPCID is the application PCID to be used on switch,
	// assuming that PCIDs are supported.
	//
	// Per pagetables_x86.go, a zero PCID implies a flush.
	UserPCID uint16

	// KernelPCID is the kernel PCID to be used on return,
	// assuming that PCIDs are supported.
	//
	// Per pagetables_x86.go, a zero PCID implies a flush.
	KernelPCID uint16
}

SwitchArchOpts are embedded in SwitchOpts.

type SwitchOpts

type SwitchOpts struct {
	// Registers are the user register state.
	Registers *arch.Registers

	// FloatingPointState is a byte pointer where floating point state is
	// saved and restored.
	FloatingPointState *fpu.State

	// PageTables are the application page tables.
	PageTables *pagetables.PageTables

	// Flush indicates that a TLB flush should be forced on switch.
	Flush bool

	// FullRestore indicates that an iret-based restore should be used.
	FullRestore bool

	// SwitchArchOpts are architecture-specific options.
	SwitchArchOpts
}

SwitchOpts are passed to the Switch function.

type TaskState64

type TaskState64 struct {
	// contains filtered or unexported fields
}

TaskState64 is a 64-bit task state structure.

type Vector

type Vector uintptr

Vector is an exception vector.

const (
	DivideByZero Vector = iota
	Debug
	NMI
	Breakpoint
	Overflow
	BoundRangeExceeded
	InvalidOpcode
	DeviceNotAvailable
	DoubleFault
	CoprocessorSegmentOverrun
	InvalidTSS
	SegmentNotPresent
	StackSegmentFault
	GeneralProtectionFault
	PageFault

	X87FloatingPointException
	AlignmentCheck
	MachineCheck
	SIMDFloatingPointException
	VirtualizationException
	SecurityException = 0x1e
	SyscallInt80      = 0x80
)

Exception vectors.
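
The numeric values of these vectors follow from iota. The sketch below redeclares the list locally to show the resulting numbers. It assumes that the blank line after PageFault in the listing stands for a skipped entry (the reserved vector 15), which is consistent with the x87 floating-point exception being vector 16 on x86; the rendered listing itself does not show that entry.

```go
package main

import "fmt"

// vector is a local stand-in for the Vector type.
type vector uintptr

const (
	divideByZero vector = iota
	debug
	nmi
	breakpoint
	overflow
	boundRangeExceeded
	invalidOpcode
	deviceNotAvailable
	doubleFault
	coprocessorSegmentOverrun
	invalidTSS
	segmentNotPresent
	stackSegmentFault
	generalProtectionFault
	pageFault
	_ // Assumed placeholder for the reserved vector 15.
	x87FloatingPointException
	alignmentCheck
	machineCheck
	simdFloatingPointException
	virtualizationException
	securityException = 0x1e
	syscallInt80      = 0x80
)

func main() {
	fmt.Println("generalProtectionFault    =", generalProtectionFault)
	fmt.Println("pageFault                 =", pageFault)
	fmt.Println("x87FloatingPointException =", x87FloatingPointException)
	fmt.Println("virtualizationException   =", virtualizationException)
}
```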

const (
	Syscall Vector = _NR_INTERRUPTS
)

System call vectors.

Directories

Binary gen_offsets is a helper for generating offset headers.
Package pagetables provides a generic implementation of pagetables.
