bytecode

package
v0.1.207 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 20, 2025 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package bytecode contains a custom bytecode compiler and virtual machine.

Initially, an Evy program is turned into a sequence of tokens by the [lexer]. Then, the parser generates an Abstract Syntax Tree (AST) from the tokens. The Compiler of this package walks the AST and writes the instructions to custom bytecode. For more on the AST refer to the parser package documentation.

The VM can read the bytecode and execute the encoded instructions, providing a runtime similar to the [evaluator]. The virtual machine is a straight-forward stack implementation that does not use any additional registers.

Index

Constants

View Source
const (
	// JumpPlaceholder is used as a placeholder operand value in OpJump
	// and OpJumpOnFalse.
	JumpPlaceholder = 9999
)
View Source
const (
	// StackSize defines an upper limit for the size of the stack.
	StackSize = 2048
)

Variables

View Source
var (
	// ErrInternal and errors wrapping ErrInternal report internal
	// errors of the VM that should not occur during normal
	// program execution.
	ErrInternal = errors.New("internal error")
	// ErrPanic and errors wrapping ErrPanic report runtime errors, such
	// as an index out of bounds or a stack overflow.
	ErrPanic = errors.New("user error")
	// ErrUnknownOpcode is returned when an unknown opcode is encountered.
	ErrUnknownOpcode = fmt.Errorf("%w: unknown opcode", ErrInternal)
)
View Source
var (
	// ErrUndefinedVar is returned when a variable name cannot
	// be resolved in the symbol table.
	ErrUndefinedVar = fmt.Errorf("%w: undefined variable", ErrPanic)
	// ErrUnknownOperator is returned when an operator cannot
	// be resolved.
	ErrUnknownOperator = fmt.Errorf("%w: unknown operator", ErrInternal)
	// ErrUnsupportedExpression is returned when an expression is not
	// supported by the compiler, this indicates an error in the compiler
	// itself, as all parseable evy expressions should be supported.
	ErrUnsupportedExpression = fmt.Errorf("%w: unsupported expression", ErrInternal)
)
View Source
var (
	// ErrBounds reports an index out of bounds in an array or string.
	ErrBounds = fmt.Errorf("%w: index out of bounds", ErrPanic)
	// ErrIndexValue reports an index value error if the index is not an integer, e.g. 1.1.
	ErrIndexValue = fmt.Errorf("%w: index not an integer", ErrPanic)
	// ErrMapKey reports that no value was found for a specific key
	// used in a map index.
	ErrMapKey = fmt.Errorf("%w: no value for map key", ErrPanic)
	// ErrSlice reports an invalid slice where start index > end index.
	ErrSlice = fmt.Errorf("%w: invalid slice", ErrPanic)
)
View Source
var (
	// ErrStackOverflow is returned when the stack exceeds its size limit.
	ErrStackOverflow = fmt.Errorf("%w: stack overflow", ErrPanic)
	// ErrDivideByZero is returned when a division by zero would
	// produce an invalid result. In Golang, floating point division
	// by zero produces +Inf, and modulo by zero produces NaN.
	ErrDivideByZero = fmt.Errorf("%w: division by zero", ErrPanic)
	// ErrBadRepetition is returned when the right-hand side of the array
	// repetition operator is invalid; i.e. negative or not an integer.
	ErrBadRepetition = fmt.Errorf("%w: bad repetition count", ErrPanic)
)

Functions

func Make

func Make(op Opcode, operands ...int) ([]byte, error)

Make assembles and returns a single bytecode instruction out of an Opcode and an optional list of operands. An error will be returned if there is no definition for the provided Opcode.

func ReadOperands

func ReadOperands(def *OpDefinition, ins Instructions) ([]int, int)

ReadOperands will read from the provided Instructions based on the width of the provided OpDefinition. It returns the operands and the read offset inside the Instructions.

func ReadUint16

func ReadUint16(ins Instructions) uint16

ReadUint16 is exposed to allow reading without performing a Lookup to get an OpDefinition to pass to ReadOperands.

Types

type Bytecode

type Bytecode struct {
	Constants    []value
	Instructions Instructions
	GlobalCount  int
	LocalCount   int
}

Bytecode represents raw evy bytecode.

type Compiler

type Compiler struct {
	// contains filtered or unexported fields
}

Compiler is responsible for turning a parsed evy program into bytecode.

func NewCompiler

func NewCompiler() *Compiler

NewCompiler returns a new compiler.

func (*Compiler) Bytecode

func (c *Compiler) Bytecode() *Bytecode

Bytecode renders the compiler instructions into Bytecode.

func (*Compiler) Compile

func (c *Compiler) Compile(node parser.Node) error

Compile accepts an AST node and renders it to bytecode internally.

type Instructions

type Instructions []byte

Instructions represents raw bytecode, which is composed of opcodes and an optional number of operands after each opcode.

func (Instructions) String

func (ins Instructions) String() string

String prints opcodes and their associated operands.

type OpDefinition

type OpDefinition struct {
	Name          string
	OperandWidths []int
}

OpDefinition defines a name and expected operand width for each OpCode.

func Lookup

func Lookup(op Opcode) (*OpDefinition, error)

Lookup returns an OpDefinition for the value of an Opcode, any unknown value will result in an error.

type Opcode

type Opcode byte

Opcode defines the type of operation to be performed when reading the bytecode.

const (
	// OpConstant defines a constant that will be referred to by index
	// in the bytecode.
	OpConstant Opcode = iota
	// OpGetGlobal retrieves a symbol from the global symbol table at the
	// specified index.
	OpGetGlobal
	// OpSetGlobal adds a symbol to the specified index in the global
	// symbol table.
	OpSetGlobal
	// OpDrop pops and discards the top N elements of the stack.
	OpDrop
	// OpGetLocal retrieves a symbol from the current local-scoped
	// symbol table at the specified index. This index also corresponds
	// to its location on the VM stack.
	OpGetLocal
	// OpSetLocal adds a symbol to the specified index in the current
	// local-scoped symbol table. This index also corresponds to its
	// location on the VM stack.
	OpSetLocal
	// OpAdd instructs the virtual machine to perform an addition.
	OpAdd
	// OpSubtract instructs the virtual machine to perform a subtraction.
	OpSubtract
	// OpMultiply instructs the virtual machine to perform a multiplication.
	OpMultiply
	// OpDivide instructs the virtual machine to perform a division.
	OpDivide
	// OpModulo instructs the virtual machine to perform a modulo operation.
	// Modulo returns the remainder of dividing the left side of an expression
	// by the right.
	OpModulo
	// OpTrue represents the boolean literal true.
	OpTrue
	// OpFalse represents the boolean literal false.
	OpFalse
	// OpNot represents the not unary operator, which performs logical
	// negation on a boolean operand.
	OpNot
	// OpMinus represents the minus unary operator, which negates the
	// value of a numeric operand.
	OpMinus
	// OpEqual represents the equality operator.
	OpEqual
	// OpNotEqual represents the inequality operator.
	OpNotEqual
	// OpNumLessThan represents a less than operator for numeric
	// operands.
	OpNumLessThan
	// OpNumLessThanEqual represents a less than or equal operator for
	// numeric operands.
	OpNumLessThanEqual
	// OpNumGreaterThan represents a greater than operator for numeric
	// operands.
	OpNumGreaterThan
	// OpNumGreaterThanEqual represents a greater than or equal operator
	// for numeric operands.
	OpNumGreaterThanEqual
	// OpStringLessThan represents a less than operator for string
	// operands. Strings are compared using lexicographic order.
	OpStringLessThan
	// OpStringLessThanEqual represents a less than or equal operator for
	// string operands. Strings are compared using lexicographic order.
	OpStringLessThanEqual
	// OpStringGreaterThan represents a greater than operator for string
	// operands. Strings are compared using lexicographic order.
	OpStringGreaterThan
	// OpStringGreaterThanEqual represents a greater than or equal operator
	// for string operands. Strings are compared using lexicographic order.
	OpStringGreaterThanEqual
	// OpStringConcatenate represents a + operator used to concatenate two
	// strings.
	OpStringConcatenate
	// OpArray represents an array literal, the operand N is the length of
	// the array, and instructs the vm to pop N elements from the stack.
	OpArray
	// OpArrayConcatenate represents a + operator used to concatenate two
	// arrays.
	OpArrayConcatenate
	// OpArrayRepeat represents a * operator used to repeat an array a
	// number of times.
	OpArrayRepeat
	// OpMap represents a map literal, the operand N is the length of
	// the map. The keys and values are read sequentially from the
	// stack (e.g. k1, v1, k2, v2...).
	OpMap
	// OpIndex represents an index operator used on an array, map or
	// string variable.
	OpIndex
	// OpSetIndex represents an index operator on the left hand side of
	// an assignment.
	OpSetIndex
	// OpSlice represents a slice operator used on an array or string. The top
	// three elements are read from the stack and are the end index, start index
	// and the data structure being operated on. OpNone is used as a default
	// when the start or end are not specified by the user program.
	OpSlice
	// OpNone represents an unspecified value used where values are
	// optional and unspecified such as the start and end value of a
	// slice index, or the values in a step range.
	OpNone
	// OpJump will force the virtual machine to jump to the instruction
	// address within its operand.
	OpJump
	// OpJumpOnFalse will pop the top element of the stack as a
	// boolean and evaluate it. It will jump to the instruction address
	// in its operand if the condition evaluates to false.
	OpJumpOnFalse
	// OpStepRange represents a range over a numeric start, stop and step.
	// It has one operand that specifies if the loop range assigns to a
	// loop variable.
	OpStepRange
	// OpIterRange represents a range over an iterable structure (a string,
	// array or map). It has one operand that specifies if the loop range
	// assigns to a loop variable.
	OpIterRange
)

type Symbol

type Symbol struct {
	Name  string
	Scope SymbolScope
	Index int
}

Symbol is a variable inside an evy program.

type SymbolScope

type SymbolScope string

SymbolScope defines a type of scope that a symbol can be defined inside.

const (
	// GlobalScope is the top level scope of an evy program.
	GlobalScope SymbolScope = "GLOBAL"
	// LocalScope is any local scope in an evy program, it is distinct
	// from the GlobalScope.
	LocalScope SymbolScope = "LOCAL"
)

type SymbolTable

type SymbolTable struct {
	// contains filtered or unexported fields
}

SymbolTable translates the variables of an Evy program to memory locations. It maps variable names (or symbols) encountered during compilation to their corresponding index-addressed memory locations, which are used at run time. This enables efficient access and manipulation of variables by the Evy virtual machine.

Global variables, which are declared at the top level of a program and outside of any blocks, are stored in a dedicated global memory space. All other variables are considered local variables and are stored on the program stack.

Index mapping for symbols involves tracking variable counts within each scope of the Evy program. As new variables are defined within a scope, the corresponding index is incremented (see SymbolTable.Define method). Upon entering a new block scope, a dedicated SymbolTable is created and pushed onto a stack (SymbolTable.Push method) to manage variables within that block. When exiting the block scope, the SymbolTable is popped off the stack (SymbolTable.Pop method), and the maximum index value is propagated back to the parent scope. This allows the maximum number of symbols in scope at the same time to be known so we do not over-allocate space for them at run time.

func NewSymbolTable

func NewSymbolTable() *SymbolTable

NewSymbolTable returns a new SymbolTable.

func (*SymbolTable) Define

func (s *SymbolTable) Define(name string) Symbol

Define adds a symbol definition to the table or returns an already defined symbol with the same name.

func (*SymbolTable) Pop added in v0.1.180

func (s *SymbolTable) Pop() *SymbolTable

Pop returns the outer symbol table of s and updates the outer symbol table's nested max index. This accumulates the maximum index of all nested scopes into the parent scope so we know how much storage is needed for all the variables of a scope and its children.

func (*SymbolTable) Push added in v0.1.180

func (s *SymbolTable) Push() *SymbolTable

Push nests a new symbol table under s. The index for the nested table starts at the index of the outer symbol table if the outer symbol table is not the global symbol table. Otherwise the index starts at zero.

func (*SymbolTable) Resolve

func (s *SymbolTable) Resolve(name string) (Symbol, bool)

Resolve returns the Symbol with the specified name and true, or if there is no matching symbol, it recurses to the outer symbol table. If there is no outer symbol table, false is returned.

type VM

type VM struct {
	// contains filtered or unexported fields
}

VM is responsible for executing evy programs from bytecode.

func NewVM

func NewVM(bytecode *Bytecode) *VM

NewVM returns a new VM.

func (*VM) Run

func (vm *VM) Run() error

Run executes the provided bytecode instructions in order, any error will stop the execution.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL