Documentation ¶
Overview ¶
Package asm provides utility functions to assemble and disassemble Ngaro VM code.
Supported assembler mnemonics:
TOS is the value on top of the data stack. NOS is the next value on the data stack. Instructions with a check mark in the "arg" column expect an argument in the cell following them.

    opcode  asm    alias  arg  stack  description
    ------  -----  -----  ---  -----  ------------------------------------------------------------------------
    0       nop                       no-op
    1       lit           ✓    -n     push the value in the following memory location to the data stack
    2       dup                n-nn   duplicate TOS
    3       drop               n-     drop TOS
    4       swap               xy-yx  swap TOS and NOS
    5       push               n-     push TOS to address stack
    6       pop                -n     pop value on top of address stack and place it on TOS
    7       loop          ✓    n-?    decrement TOS. If >0 jump to address in next cell, else drop TOS and do nothing
    8       jump   jmp    ✓           jump to address in next cell
    9       ;      ret                return: pop address from address stack, add 1 and jump to it
    10      >jump  jgt    ✓    xy-    jump to address in next cell if NOS > TOS
    11      <jump  jlt    ✓    xy-    jump to address in next cell if NOS < TOS
    12      !jump  jne    ✓    xy-    jump to address in next cell if NOS != TOS
    13      =jump  jeq    ✓    xy-    jump to address in next cell if NOS == TOS
    14      @                  a-n    fetch: get the value at the address on TOS and place it on TOS
    15      !                  na-    store: store the value in NOS at address in TOS
    16      +      add         xy-z   add NOS to TOS and place result on TOS
    17      -      sub         xy-z   subtract TOS from NOS and place result on TOS
    18      *      mul         xy-z   multiply NOS with TOS and place result on TOS
    19      /mod   div         xy-rq  divide NOS by TOS and place remainder in NOS, quotient in TOS
    20      and                xy-z   do a logical and of NOS and TOS and place result on TOS
    21      or                 xy-z   do a logical or of NOS and TOS and place result on TOS
    22      xor                xy-z   do a logical xor of NOS and TOS and place result on TOS
    23      <<     shl         xy-z   do a logical left shift of NOS by TOS and place result on TOS
    24      >>     asr         xy-z   do an arithmetic right shift of NOS by TOS and place result on TOS
    25      0;     0ret        n-?    ZeroExit: if TOS is 0, drop it and do a return, else do nothing
    26      1+     inc         n-n    increment TOS
    27      1-     dec         n-n    decrement TOS
    28      in                 p-n    I/O in (see Ngaro VM spec)
    29      out                np-    I/O out (see Ngaro VM spec)
    30      wait               ?-     I/O wait (see Ngaro VM spec)
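As a rough illustration of how the table can be read, the sketch below maps a handful of mnemonics and aliases to their opcode numbers and reports whether an opcode takes its argument from the following cell. The names opcodes, lookup and hasArg are hypothetical and only a subset of the table is included; this is not the package's internal representation.

```go
package main

import "fmt"

// opcodes is a hypothetical lookup table holding a subset of the
// mnemonics and aliases listed in the table above.
var opcodes = map[string]int{
	"nop": 0, "lit": 1, "dup": 2, "drop": 3, "swap": 4,
	"loop": 7,
	"jump": 8, "jmp": 8,
	";": 9, "ret": 9,
	"+": 16, "add": 16,
	"1+": 26, "inc": 26,
}

// lookup returns the opcode for a mnemonic or alias, if it is known.
func lookup(tok string) (int, bool) {
	op, ok := opcodes[tok]
	return op, ok
}

// hasArg reports whether the opcode expects an argument in the next cell,
// i.e. the check-marked rows: lit, loop, jump and the conditional jumps.
func hasArg(op int) bool {
	return op == 1 || (op >= 7 && op <= 13 && op != 9)
}

func main() {
	for _, tok := range []string{"jmp", "jump", "ret", "1+", "foo"} {
		if op, ok := lookup(tok); ok {
			fmt.Printf("%-5s opcode %2d, takes argument: %v\n", tok, op, hasArg(op))
		} else {
			fmt.Printf("%-5s not a mnemonic\n", tok)
		}
	}
}
```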
Comments:
Comments are placed between parentheses, i.e. '(' and ')'. The body of the comment must be separated from the enclosing parentheses by white space.
Some valid comments:
    ( this is a valid comment )
    ( this is a rather long
      multiline comment )
The following are invalid comments:

    (this will be seen by the parser as label "(this" and will not work )
    ( comments may ( not be nested ) here, the parser will complain trying to resolve "here," as a label )
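The comment rules above can be sketched with a small white-space tokenizer. This is a minimal illustration, assuming that a comment starts at a stand-alone "(" token and ends at the next stand-alone ")" token; stripComments is a hypothetical helper, not a function of this package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// stripComments splits src at white space and drops every token between a
// stand-alone "(" and the next stand-alone ")". Tokens such as "(this" are
// ordinary tokens, and comments do not nest.
func stripComments(src string) ([]string, error) {
	var out []string
	inComment := false
	for _, tok := range strings.Fields(src) {
		switch {
		case inComment:
			// Only a stand-alone ")" closes the comment.
			if tok == ")" {
				inComment = false
			}
		case tok == "(":
			inComment = true
		default:
			out = append(out, tok)
		}
	}
	if inComment {
		return out, errors.New("unterminated comment")
	}
	return out, nil
}

func main() {
	toks, err := stripComments("nop ( a valid comment ) dup (this is not a comment )")
	fmt.Println(toks, err)
}
```

With this model, the nested-comment example ends at the first ")" and leaves "here," in the token stream, which is why the parser then tries to resolve it as a label.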
Literals and label/const identifiers:
The parser behaves almost like a Forth parser: input is split at white space (space, tab or new line) into tokens. The parser then does the following:
If a token can be converted to a Go integer (see strconv.ParseInt), it will be converted to an integer literal.
If it is a Go character literal between single quotes, it will be converted to the corresponding integer literal. Watch out with Unicode characters: they will be converted to the proper rune (int32), but they are not natively supported by the VM I/O code.
If a token is the name of a defined constant, it will be replaced internally by the constant's value and can be used anywhere an integer literal is expected.
Then name resolution applies:
if an instruction is expected, the token is looked up in the assembler mnemonics and if no match is found, it is considered to be a label.
if an argument is expected, the token is always considered a label.
You may therefore define label or constant names that look unusual (at least to Go programmers), such as "2dup", "(weird" or "end-weird)". Also, more than one instruction may appear on the same line, and comments can be placed anywhere between instructions.
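The literal-detection order described above can be sketched as follows. This is a minimal illustration, not the package's parser: parseLiteral and the consts map are hypothetical, and a 32-bit cell size is assumed for strconv.ParseInt.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseLiteral tries to turn a token into an integer value in the order
// described above: integer literal, character literal, named constant.
// consts stands in for the table of .equ definitions (an assumption).
func parseLiteral(tok string, consts map[string]int64) (int64, bool) {
	// Base 0 lets ParseInt accept decimal, 0x hex and 0-prefixed octal.
	if n, err := strconv.ParseInt(tok, 0, 32); err == nil {
		return n, true
	}
	// Character literal between single quotes, e.g. 'a' or '\u2033'.
	if len(tok) >= 3 && tok[0] == '\'' && tok[len(tok)-1] == '\'' {
		r, _, tail, err := strconv.UnquoteChar(tok[1:len(tok)-1], '\'')
		if err == nil && tail == "" {
			return int64(r), true
		}
	}
	if v, ok := consts[tok]; ok {
		return v, true
	}
	// Anything else is left to name resolution (mnemonic or label).
	return 0, false
}

func main() {
	consts := map[string]int64{"SOMECONST": 42}
	for _, tok := range []string{"0x27", "0666", "'a'", "SOMECONST", "2dup"} {
		if v, ok := parseLiteral(tok, consts); ok {
			fmt.Printf("%-10s literal %d\n", tok, v)
		} else {
			fmt.Printf("%-10s instruction or label\n", tok)
		}
	}
}
```

A token such as "2dup" fails integer conversion as a whole, which is why it remains available as a label name.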
Implicit "lit":
Where the parser is expecting an instruction, integer literals, character literals and constants will be compiled with an implicit "lit":
    lit 42
    42      ( will compile as "lit 42", just like above )
    ( like ) 'a' ( compiles as ) lit 'a' ( which in fact compiles as ) lit 97
Labels:
Labels are defined by prefixing them with a colon (:) and can be used as an address in any lit, jump or loop instruction (without the ':' prefix). For example:

    foo         ( forward references are ok. This will be compiled as a call to foo )
    lit foo     ( this will compile as lit <address of foo>. This is actually the only
                  way to place the address of a label on the stack. )
    :foo        ( foo defined here )
        nop ;
    :bar nop    ( label definitions can be grouped with other instructions on the same line )
        ;
    :foobar nop ; ( we can actually place any number of instructions on the same line )
Local labels:
Local labels work in the same way as in the GNU assembler. They are defined as a colon followed by a sequence of digits (e.g. :007, :0, :42). Although they can be defined multiple times, the compiler internally assigns them a unique name of the form N·counter (the middle character is '\u00b7'). References to such labels must be suffixed with either a '-' (meaning a backward reference to the last definition of this label), or a '+' (meaning a forward reference to the next definition of this label). For example, in the following code:

    :1  jump 1+   ( not to be confused with the '1+' mnemonic. Here it means next occurrence of :1 )
    :2  jump 1-
    :1  jump 2+
    :2  jump 1-
the labels will be internally converted to:
    :1·1 jump 1·2
    :2·1 jump 1·1
    :1·2 jump 2·2
    :2·2 jump 1·2
As a consequence, you should not use or define labels of the form N·N where N is any non-empty sequence of digits. This also prevents the definition of labels of the form N+ or N- because they will not be addressable.
Please note that the parser does not prevent you from using or defining labels with the same name as instructions either. The only caveat, besides confusing yourself, is that you will not be able to use implicit calls to such labels:

    :drop 'D' 1 1 out 0 0 out wait   ( print 'D' )
          drop ;                     ( this will not loop forever, drop will be compiled as opcode 3, not a call )

    drop         ( still opcode 3 )
    .dat drop    ( will compile an implicit call to our custom drop )
Assembler directives:
The assembler supports the following directives:
.equ <identifier> <value>
Defines a constant value. <identifier> can be any valid identifier (any combination of letters, symbols, digits and punctuation). The value must be an integer value, named constant or character literal. Constants must be defined before being used. Constants can be redefined; the compiler will always use the last assigned value.
.org <value>
Will place the next instruction at the address specified by the given integer literal or named constant.
.dat <value>
Will compile the specified integer value, named constant or character literal as-is (i.e. with no implicit "lit"). This is primarily used for data storage structures:

    :table  .dat 65
            .dat 'B'
The cells at addresses table+0 and table+1 will contain 65 and 66 respectively.
.opcode <identifier> <value>
Defines a custom opcode. <identifier> can be any valid identifier (any combination of letters, symbols, digits and punctuation). The value must be an integer value, named constant or character literal. Custom opcodes must be defined before being used. They can be redefined; the compiler will always use the last assigned value. Default opcodes can also be redefined (think override) with this directive, so it should be used with caution.
For example, suppose that we have a VM implementation that maps opcode -42 to a function that computes the square root of the number on top of the data stack:
    .opcode sqrt -42
    lit 49 sqrt      ( this compiles as .dat -42 )
    7 !jump error

Note that there is no mechanism to tell the assembler that a given custom opcode expects an argument from the next memory location (like lit or jump). Should you need to implement this type of opcode, constant and integer arguments would have to be prefixed with a .dat directive. For example, a compare instruction would look like:

    .opcode cmp -1   ( compares TOS with value in next memory location )
    cmp 0            ( Wrong: would compile as ".dat -1 lit 0" )
    cmp .dat 0       ( Correct: will compile as ".dat -1 0" )
Example (Locals) ¶
Demonstrates use of local labels
    package main

    import (
    	"fmt"
    	"os"
    	"strings"

    	"github.com/db47h/ngaro/asm"
    )

    func main() {
    	code := `
    		:1  jump 1+
    		:2  jump 1-
    		:1  jump 2+
    		:2  jump 1-
    		`

    	img, err := asm.Assemble("locals", strings.NewReader(code))
    	if err != nil {
    		fmt.Println(err)
    		return
    	}

    	// Disassemble returns the position of the next opcode.
    	for pc := 0; pc < len(img); {
    		fmt.Printf("% 4d\t", pc)
    		pc = asm.Disassemble(img, pc, os.Stdout)
    		fmt.Println()
    	}
    }

Output:

       0	jump 4
       2	jump 0
       4	jump 6
       6	jump 4
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Assemble ¶
Assemble compiles assembly read from the supplied io.Reader and returns the resulting image and error if any.
The name parameter is used only in error messages to name the source of the error. If the io.Reader is a file, name should be the file name.
The returned error, if not nil, can safely be type-asserted to an ErrAsm value, which will contain up to 10 entries.
Example ¶
Shows off some of the assembler features.
    package main

    import (
    	"fmt"
    	"os"
    	"strings"

    	"github.com/db47h/ngaro/asm"
    )

    func main() {
    	code := `
    		( this is a comment. brackets must be separated by spaces )

    		( a constant definition. Does not generate any code on its own )
    		.equ SOMECONST 42

    		nop
    		123             ( implicit literal )
    		SOMECONST       ( const literal )
    		drop
    		drop
    		foo             ( implicit call )
    		pop
    		lit table       ( address of table )
    		'x'             ( char literal, compiles as lit 'x' )

    		.org 32         ( set compilation address )

    		:foo 42 bar drop ;
    		:bar 1+ ;       ( several instructions on the same line )

    		.opcode sqrt -1 ( test custom opcode )
    		sqrt            ( should compile like .dat -1 )

    		:table          ( data structure )
    		.dat -100       ( will appear in the disassembly as "call -100" )
    		.dat 0666       ( octal )
    		.dat 0x27       ( hex )
    		.dat '\u2033'   ( unicode char )
    		.dat SOMECONST
    		.dat foo        ( address of some label )
    		`

    	img, err := asm.Assemble("raw_string", strings.NewReader(code))
    	if err != nil {
    		fmt.Println(err)
    		return
    	}

    	for pc := 0; pc < len(img); {
    		fmt.Printf("% 4d\t", pc)
    		pc = asm.Disassemble(img, pc, os.Stdout)
    		fmt.Println()
    	}
    }

Output:

       0	nop
       1	123
       3	42
       5	drop
       6	drop
       7	call 32
       8	pop
       9	40
      11	120
      13	nop
      14	nop
      15	nop
      16	nop
      17	nop
      18	nop
      19	nop
      20	nop
      21	nop
      22	nop
      23	nop
      24	nop
      25	nop
      26	nop
      27	nop
      28	nop
      29	nop
      30	nop
      31	nop
      32	42
      34	call 37
      35	drop
      36	;
      37	1+
      38	;
      39	call -1
      40	call -100
      41	call 438
      42	call 39
      43	call 8243
      44	call 42
      45	call 32
func Disassemble ¶
Disassemble disassembles the cells in the given slice at position pc to the specified io.Writer and returns the position of the next valid opcode.
Example ¶
Disassemble is pretty straightforward. Here we disassemble a hand-crafted Fibonacci function.

    package main

    import (
    	"fmt"
    	"os"
    	"strings"

    	"github.com/db47h/ngaro/asm"
    )

    func main() {
    	fibS := `
    		:fib
    			push 0 1 pop    ( like [ 0 1 ] dip )
    			jump 1+         ( jump forward to the next :1 )
    		:0  push            ( local label )
    			dup push + pop swap pop
    		:1  loop 0-         ( local label back )
    			swap drop ;
    			lit             ( lit deliberately unterminated at end of image for testing purposes )
    		`

    	fib, err := asm.Assemble("fib", strings.NewReader(fibS))
    	if err != nil {
    		fmt.Println(err)
    		return
    	}

    	// Each call prints one instruction and returns the position of the next one.
    	for pc := 0; pc < len(fib); {
    		fmt.Printf("% 4d\t", pc)
    		pc = asm.Disassemble(fib, pc, os.Stdout)
    		fmt.Printf("\n")
    	}
    }

Output:

       0	push
       1	0
       3	1
       5	pop
       6	jump 15
       8	push
       9	dup
      10	push
      11	+
      12	pop
      13	swap
      14	pop
      15	loop 8
      17	swap
      18	drop
      19	;
      20	???