Documentation ¶
Overview ¶
Package xeddata provides utilities to work with XED datafiles.
Main features:
- Fundamental XED enumerations (CPU modes, operand sizes, ...)
- XED objects and their components
- XED datafiles reader (see below)
- Utility functions like ExpandStates
The set of supported file formats is the minimum required to generate x86.csv from XED tables:
- states - simple macro substitutions used in patterns
- widths - mappings from width names to their size
- element-types - XED xtype information
- objects - XED objects that constitute "the tables"
Collectively, those files are called "datafiles".
Terminology is borrowed from XED itself. Where appropriate, x86csv names are provided as an alternative.
"$XED/foo/bar.txt" notation is used to specify a path to "foo/bar.txt" file under local XED source repository folder.
The default usage scheme:
- Open "XED database" to load required metadata.
- Read XED file with objects definitions.
- Operate on XED objects.
See example_test.go for complete examples.
Intel XED must be built before its datafiles can be used, because this package expects the "all" versions, which are the final concatenated versions of the datafiles. If "$XED/obj/dgen/" does not contain the relevant files, then either this documentation is stale or your XED is not built.
For examples of "XED objects", see "testdata/xed_objects.txt".
Intel XED https://github.com/intelxed/xed provides all the documentation needed to understand the datafiles. The "$XED/misc/engineering-notes.txt" file is particularly useful. For convenience, the most important notes are spread across package comments.
Tested with XED 088c48a2efa447872945168272bcd7005a7ddd91.
Index ¶
- Variables
- func ExpandStates(db *Database, s string) string
- func WalkInsts(xedPath string, visit func(*Inst)) error
- type AddressSizeMode
- type CPUMode
- type Database
- type Inst
- type Object
- type Operand
- type OperandSizeMode
- type OperandVisibility
- type PatternSet
- func (pset PatternSet) Index(keys ...string) int
- func (pset PatternSet) Is(k string) bool
- func (pset PatternSet) Match(keyval ...string) string
- func (pset PatternSet) MatchOrDefault(defaultValue string, keyval ...string) string
- func (pset PatternSet) Replace(oldKey, newKey string)
- func (pset PatternSet) String() string
- type Reader
Variables ¶
var PatternAliases = map[string]string{
"VEX": "VEXVALID=1",
"EVEX": "VEXVALID=2",
"XOP": "VEXVALID=3",
"MemOnly": "MOD!=3",
"RegOnly": "MOD=3",
}
PatternAliases is an extendable map of pattern key aliases. It maps a human-readable key to a XED property.
Used in PatternSet.Is.
Functions ¶
func ExpandStates ¶
ExpandStates returns a copy of s where all state macros are expanded. This requires db "states" to be loaded.
Example ¶
This example shows how to use ExpandStates and its effects.
package main

import (
	"fmt"
	"log"
	"strings"

	"golang.org/x/arch/x86/xeddata"
)

func main() {
	const xedPath = "testdata/xedpath"

	input := strings.NewReader(`
{
ICLASS: VEXADD
CPL: 3
CATEGORY: ?
EXTENSION: ?
ATTRIBUTES: AT_A AT_B

PATTERN: _M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_128 _M_MAP_0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS: REG0=XMM_R():w:width_dq:fword64 REG1=XMM_N():r:width_dq:fword64 MEM0:r:width_dq:fword64

PATTERN: _M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_128 _M_MAP_0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS: REG0=XMM_R():w:width_dq:fword64 REG1=XMM_N():r:width_dq:fword64 REG2=XMM_B():r:width_dq:fword64

PATTERN: _M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_256 _M_MAP_0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS: REG0=YMM_R():w:qq:fword64 REG1=YMM_N():r:qq:fword64 MEM0:r:qq:fword64

PATTERN: _M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_256 _M_MAP_0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS: REG0=YMM_R():w:qq:fword64 REG1=YMM_N():r:qq:fword64 REG2=YMM_B():r:qq:fword64
}`)

	objects, err := xeddata.NewReader(input).ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	db, err := xeddata.NewDatabase(xedPath)
	if err != nil {
		log.Fatal(err)
	}

	for _, o := range objects {
		for _, inst := range o.Insts {
			fmt.Printf("old: %q\n", inst.Pattern)
			fmt.Printf("new: %q\n", xeddata.ExpandStates(db, inst.Pattern))
		}
	}
}
Output:

old: "_M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_128 _M_MAP_0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()"
new: "VEXVALID=1 0x58 VEX_PREFIX=1 VL=0 MAP=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()"
old: "_M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_128 _M_MAP_0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]"
new: "VEXVALID=1 0x58 VEX_PREFIX=1 VL=0 MAP=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn]"
old: "_M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_256 _M_MAP_0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()"
new: "VEXVALID=1 0x58 VEX_PREFIX=1 VL=1 MAP=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()"
old: "_M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_256 _M_MAP_0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]"
new: "VEXVALID=1 0x58 VEX_PREFIX=1 VL=1 MAP=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn]"
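The substitution visible in the output above can be approximated with a plain token lookup. The following is a minimal sketch using only the standard library, not the package's implementation: the `states` table and the `expandStates` helper are hypothetical, and the table entries are copied by hand from the example output rather than loaded from "all-state.txt".

```go
package main

import (
	"fmt"
	"strings"
)

// states is a hypothetical, hand-written macro table. The entries mirror
// the old/new pairs from the ExpandStates example output.
var states = map[string]string{
	"_M_VV_TRUE":  "VEXVALID=1",
	"_M_VEX_P_66": "VEX_PREFIX=1",
	"_M_VLEN_128": "VL=0",
	"_M_VLEN_256": "VL=1",
	"_M_MAP_0F":   "MAP=1",
}

// expandStates replaces every whitespace-separated token that names a
// state macro with its expansion and leaves all other tokens untouched.
func expandStates(pattern string) string {
	tokens := strings.Fields(pattern)
	for i, tok := range tokens {
		if expansion, ok := states[tok]; ok {
			tokens[i] = expansion
		}
	}
	return strings.Join(tokens, " ")
}

func main() {
	fmt.Println(expandStates("_M_VV_TRUE 0x58 _M_VEX_P_66 _M_VLEN_128 _M_MAP_0F MOD[0b11] MOD=3 REG[rrr] RM[nnn]"))
}
```

Matching whole tokens instead of substrings keeps a macro name from being rewritten when it happens to be a prefix of another token.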
Types ¶
type AddressSizeMode ¶
type AddressSizeMode int
AddressSizeMode describes address size mode (67H prefix).
const (
	AddrSize16 AddressSizeMode = iota
	AddrSize32
	AddrSize64
)
Possible address size modes. XED calls it ASZ.
func (AddressSizeMode) String ¶
func (asz AddressSizeMode) String() string
String returns asz bit size string. Panics on illegal enumerations.
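The pattern of an iota-based enumeration whose String method panics on out-of-range values can be sketched as follows. The `addrSize` type is a stand-in, and the returned strings ("16", "32", "64") are an assumption about what "bit size string" means, not a statement of the package's exact output.

```go
package main

import "fmt"

// addrSize is a stand-in for an iota-based size mode enumeration.
type addrSize int

const (
	addrSize16 addrSize = iota
	addrSize32
	addrSize64
)

// sizeNames is indexed by enumeration value (assumed spelling).
var sizeNames = [...]string{"16", "32", "64"}

// String returns the bit size as a string and panics on values
// outside the declared range.
func (asz addrSize) String() string {
	if asz < addrSize16 || asz > addrSize64 {
		// int(asz) avoids calling String recursively while formatting.
		panic(fmt.Sprintf("illegal address size mode: %d", int(asz)))
	}
	return sizeNames[asz]
}

func main() {
	fmt.Println(addrSize32) // Println formats the value through String
}
```

Panicking here treats an invalid enumeration value as a programming error rather than as a condition the caller is expected to handle.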
type Database ¶
type Database struct {
// contains filtered or unexported fields
}
Database holds information that is required to properly handle XED datafiles.
func NewDatabase ¶
NewDatabase returns a Database that loads everything it can find in xedPath. A missing lookup file is not an error, but an error while parsing a found file is.
Lookup:
"$xedPath/all-state.txt" => db.LoadStates()
"$xedPath/all-widths.txt" => db.LoadWidths()
"$xedPath/all-element-types.txt" => db.LoadXtypes()
$xedPath is the interpolated value of the function argument.
The call NewDatabase("") is valid and returns an empty database. Load methods can be used to read lookup files one-by-one.
func (*Database) LoadStates ¶
LoadStates reads XED states definitions from r and updates db. "states" are simple macro substitutions without parameters. See "$XED/obj/dgen/all-state.txt".
func (*Database) LoadWidths ¶
LoadWidths reads XED widths definitions from r and updates db. "widths" are 16/32/64 bit mode type sizes. See "$XED/obj/dgen/all-widths.txt".
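A width table of this kind can be pictured as a map from a width name to one size per operand size mode. The sketch below is illustrative only: the `opSize` type, the `widths` table, and the `widthSize` helper are hypothetical, and the two entries are typed in by hand from the descriptions of "w" (word, 16 bits) and "v" (16, 32 or 64 bits) elsewhere in this documentation.

```go
package main

import "fmt"

// opSize is a stand-in for an operand size mode enumeration.
type opSize int

const (
	opSize16 opSize = iota
	opSize32
	opSize64
)

// widths is a hypothetical table: width name -> size in bytes for the
// 16, 32 and 64 bit modes, in that order.
var widths = map[string][3]int{
	"w": {2, 2, 2}, // word: 16 bits in every mode
	"v": {2, 4, 8}, // meta-width: 16, 32 or 64 bits depending on mode
}

// widthSize returns the size in bytes of the named width in mode m.
// The boolean result is false for unknown width names.
func widthSize(name string, m opSize) (int, bool) {
	sizes, ok := widths[name]
	if !ok {
		return 0, false
	}
	return sizes[m], true
}

func main() {
	n32, _ := widthSize("v", opSize32)
	n64, _ := widthSize("v", opSize64)
	fmt.Printf("32/64bit width: %d/%d bytes\n", n32, n64)
}
```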
func (*Database) LoadXtypes ¶
LoadXtypes reads XED xtypes definitions from r and updates db. "xtypes" are low-level XED type names. See "$XED/obj/dgen/all-element-types.txt". See "$XED/obj/dgen/all-element-type-base.txt".
type Inst ¶
type Inst struct {
	// Object that contains properties that are shared with multiple
	// Inst objects.
	*Object

	// Index is the position inside XED object.
	// Object.Insts[Index] returns this inst.
	Index int

	// Pattern is the sequence of bits and nonterminals used to
	// decode/encode an instruction.
	// Example: "0x0F 0x28 no_refining_prefix MOD[0b11] MOD=3 REG[rrr] RM[nnn]".
	Pattern string

	// Operands are instruction arguments, typically registers,
	// memory operands and pseudo-resources. Separated by space.
	// Example: "MEM0:rcw:b REG0=GPR8_R():r REG1=XED_REG_AL:rcw:SUPP".
	Operands string

	// Iform is a name for the pattern that starts with the
	// iclass and bakes in the operands. If omitted, XED
	// tries to generate one. We often add custom suffixes
	// to these to disambiguate certain combinations.
	// Example: "MOVAPS_XMMps_XMMps_0F28".
	//
	// Optional.
	Iform string
}
Inst represents a single instruction template.
Some templates contain expandable (macro) patterns and operands, which means that a single template can express more than one real instruction.
type Object ¶
type Object struct {
	// Iclass is instruction class name (opcode).
	// Iclass alone is not enough to uniquely identify machine instructions.
	// Example: "PSRLW".
	Iclass string

	// Disasm is substituted name when a simple conversion
	// from iclass is inappropriate.
	// Never combined with DisasmIntel or DisasmATTSV.
	// Example: "syscall".
	//
	// Optional.
	Disasm string

	// DisasmIntel is like Disasm, but with Intel syntax.
	// If present, usually comes with DisasmATTSV.
	// Example: "jmp far".
	//
	// Optional.
	DisasmIntel string

	// DisasmATTSV is like Disasm, but with AT&T/SysV syntax.
	// If present, usually comes with DisasmIntel.
	// Example: "ljmp".
	//
	// Optional.
	DisasmATTSV string

	// Attributes describes name set for bits in the binary attributes field.
	// Example: "NOP X87_CONTROL NOTSX".
	//
	// Optional. If not present, zero attribute set is implied.
	Attributes string

	// Uname is unique name used for deleting / replacing instructions.
	//
	// Optional. Provided for completeness, mostly useful for XED internal usage.
	Uname string

	// CPL is instruction current privilege level restriction.
	// Can have value of "0" or "3".
	CPL string

	// Category is an ad-hoc categorization of instructions.
	// Example: "SEMAPHORE".
	Category string

	// Extension is an ad-hoc grouping of instructions.
	// If no ISASet is specified, this is used instead.
	// Example: "3DNOW"
	Extension string

	// Exceptions is an exception set name.
	// Example: "SSE_TYPE_7".
	//
	// Optional. Empty exception category generally means that
	// instruction generates no exceptions.
	Exceptions string

	// ISASet is a name for the group of instructions that
	// introduced this feature.
	// Example: "I286PROTECTED".
	//
	// Older objects only defined Extension field.
	// Newer objects may contain both Extension and ISASet fields.
	// For some objects Extension==ISASet.
	// Both fields are required to do precise CPUID-like decisions.
	//
	// Optional.
	ISASet string

	// Flags describes read/written flag bit values.
	// Example: "MUST [ of-u sf-u af-u pf-u cf-mod ]".
	//
	// Optional. If not present, flags are neither read nor written.
	Flags string

	// A hopefully useful comment.
	//
	// Optional.
	Comment string

	// The object revision.
	//
	// Optional.
	Version string

	// RealOpcode marks unstable (not in SDM yet) instructions with "N".
	// Normally, always "Y" or not present at all.
	//
	// Optional.
	RealOpcode string

	// Insts are concrete instruction templates that are derived from containing Object.
	// Inst contains fields PATTERN, OPERANDS, IFORM in enc/dec instruction.
	Insts []*Inst
}
An Object is a single "dec/enc-instruction" XED object from datafiles.
Field names and their comments are borrowed from Intel XED engineering notes (see "$XED/misc/engineering-notes.txt").
Field values are always trimmed (i.e. no leading/trailing whitespace).
Missing optional members are expressed with an empty string.
An Object contains multiple Inst elements, each representing a concrete instruction with its encoding pattern and operands description.
func (*Object) HasAttribute ¶
HasAttribute reports whether o has an attribute with the specified name. The check is done at the "word" level, so substrings of attribute names will not match.
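A word-level check of this kind can be sketched with strings.Fields. The `hasAttribute` helper below is a hypothetical free function written for illustration, not the method's actual implementation; the attribute string is the example value from the Object.Attributes field comment.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAttribute reports whether the space-separated attributes string
// contains name as a whole word. Substrings of a word do not match.
func hasAttribute(attributes, name string) bool {
	for _, attr := range strings.Fields(attributes) {
		if attr == name {
			return true
		}
	}
	return false
}

func main() {
	const attrs = "NOP X87_CONTROL NOTSX"
	fmt.Println(hasAttribute(attrs, "X87_CONTROL")) // whole word
	fmt.Println(hasAttribute(attrs, "X87"))         // only a substring
}
```

A plain strings.Contains call would report "X87" as present, which is exactly the false positive the word-level comparison avoids.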
type Operand ¶
type Operand struct {
	// Name is an ID with optional nonterminal name part.
	//
	// Possible values: "REG0=GPRv_B", "REG1", "MEM0", ...
	//
	// If nonterminal part is present, name
	// can be split into LHS and RHS with the NameLHS and NameRHS methods.
	Name string

	// Action describes argument types.
	//
	// Possible values: "r", "w", "rw", "cr", "cw", "crw".
	// Optional "c" prefix represents conditional access.
	Action string

	// Width descriptor. It can express simple width like "w" (word, 16bit)
	// or meta-width like "v", which corresponds to {16, 32, 64} bits.
	//
	// Possible values: "", "q", "ds", "dq", ...
	// Optional.
	Width string

	// Xtype holds XED-specific type information.
	//
	// Possible values: "", "f64", "i32", ...
	// Optional.
	Xtype string

	// Attributes serves as container for all other properties.
	//
	// Possible values:
	//   EVEX.b context {
	//     TXT=ZEROSTR  - zeroing
	//     TXT=SAESTR   - suppress all exceptions
	//     TXT=ROUNDC   - rounding
	//     TXT=BCASTSTR - broadcasting
	//   }
	//   MULTISOURCE4 - 4FMA multi-register operand.
	//
	// Optional. For most operands, it's nil.
	Attributes map[string]bool

	// Visibility tells if operand is explicit, implicit or suppressed.
	Visibility OperandVisibility
}
Operand holds data that is encoded inside instruction's "OPERANDS" field.
Use the NewOperand function to decode operand fields into an Operand object.
Example ¶
This example shows how to handle Inst "OPERANDS" field.
package main

import (
	"fmt"
	"log"
	"strings"

	"golang.org/x/arch/x86/xeddata"
)

func main() {
	const xedPath = "testdata/xedpath"

	input := strings.NewReader(`
{
ICLASS: ADD_N_TIMES # Like IMUL
CPL: 3
CATEGORY: BINARY
EXTENSION: BASE
ISA_SET: I86
FLAGS: MUST [ of-mod sf-u zf-u af-u pf-u cf-mod ]

PATTERN: 0xAA MOD[mm] MOD!=3 REG[0b101] RM[nnn] MODRM()
OPERANDS: MEM0:r:width_v REG0=AX:rw:SUPP REG1=DX:w:SUPP
}`)

	objects, err := xeddata.NewReader(input).ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	db, err := xeddata.NewDatabase(xedPath)
	if err != nil {
		log.Fatal(err)
	}

	inst := objects[0].Insts[0] // Single instruction is enough for this example
	for i, rawOperand := range strings.Fields(inst.Operands) {
		operand, err := xeddata.NewOperand(db, rawOperand)
		if err != nil {
			log.Fatalf("parse operand #%d: %+v", i, err)
		}

		visibility := "implicit"
		if operand.IsVisible() {
			visibility = "explicit"
		}
		fmt.Printf("(%s) %s:\n", visibility, rawOperand)

		fmt.Printf("\tname: %q\n", operand.Name)
		if operand.IsVisible() {
			fmt.Printf("\t32/64bit width: %s/%s bytes\n",
				db.WidthSize(operand.Width, xeddata.OpSize32),
				db.WidthSize(operand.Width, xeddata.OpSize64))
		}
	}
}
Output:

(explicit) MEM0:r:width_v:
	name: "MEM0"
	32/64bit width: 4/8 bytes
(implicit) REG0=AX:rw:SUPP:
	name: "REG0=AX"
(implicit) REG1=DX:w:SUPP:
	name: "REG1=DX"
func NewOperand ¶
NewOperand decodes operand string.
See "$XED/pysrc/opnds.py" to learn about fields format and valid combinations.
Requires database with xtypes and widths info.
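To give a rough feel for the colon-separated operand layout, here is a deliberately simplified sketch. It assumes the first field is the name and the second is the action, recognizes only the "SUPP" marker seen in the examples, and leaves every other field undecoded. The `operand` type and `parseOperand` helper are hypothetical; the real rules (see "$XED/pysrc/opnds.py") are richer and need the widths and xtypes tables.

```go
package main

import (
	"fmt"
	"strings"
)

// operand is a simplified, hypothetical view of a decoded operand.
type operand struct {
	name       string
	action     string
	extra      []string // width, xtype and other fields, left undecoded
	suppressed bool     // set when the "SUPP" marker is present
}

// parseOperand splits s on ":" and classifies the resulting fields.
// It assumes the first two fields are always name and action.
func parseOperand(s string) (operand, error) {
	fields := strings.Split(s, ":")
	if len(fields) < 2 {
		return operand{}, fmt.Errorf("operand %q: want at least name and action", s)
	}
	op := operand{name: fields[0], action: fields[1]}
	for _, f := range fields[2:] {
		if f == "SUPP" {
			op.suppressed = true
			continue
		}
		op.extra = append(op.extra, f)
	}
	return op, nil
}

func main() {
	for _, raw := range strings.Fields("MEM0:r:width_v REG0=AX:rw:SUPP REG1=DX:w:SUPP") {
		op, err := parseOperand(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: action=%s extra=%v suppressed=%t\n", op.name, op.action, op.extra, op.suppressed)
	}
}
```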
func (*Operand) IsVisible ¶
IsVisible returns true for operands that are usually shown in syntax strings.
func (*Operand) NameLHS ¶
NameLHS returns left hand side part of the non-terminal name. Example: NameLHS("REG0=GPRv()") => "REG0".
func (*Operand) NameRHS ¶
NameRHS returns right hand side part of the non-terminal name. Example: NameRHS("REG0=GPRv()") => "GPRv()".
func (*Operand) NonterminalName ¶
NonterminalName returns true if op.Name consists of LHS and RHS parts.
RHS is non-terminal name lookup function expression. Example: "REG0=GPRv()" has "GPRv()" name lookup function.
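The LHS/RHS split can be sketched with strings.Cut (available since Go 1.18). The `splitName` helper is hypothetical, and returning the whole name as the LHS when no "=" is present is this sketch's choice, not documented behavior of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// splitName splits an operand name at the first "=".
// When there is no "=", lhs is the whole name, rhs is empty and ok is false.
func splitName(name string) (lhs, rhs string, ok bool) {
	return strings.Cut(name, "=")
}

func main() {
	for _, name := range []string{"REG0=GPRv()", "MEM0"} {
		lhs, rhs, ok := splitName(name)
		fmt.Printf("%q => lhs=%q rhs=%q split=%t\n", name, lhs, rhs, ok)
	}
}
```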
type OperandSizeMode ¶
type OperandSizeMode int
OperandSizeMode describes operand size mode (66H prefix).
const (
	OpSize16 OperandSizeMode = iota
	OpSize32
	OpSize64
)
Possible operand size modes. XED calls it OSZ.
func (OperandSizeMode) String ¶
func (osz OperandSizeMode) String() string
String returns osz bit size string. Panics on illegal enumerations.
type OperandVisibility ¶
type OperandVisibility int
OperandVisibility describes operand visibility in XED terms.
const (
	// VisExplicit is the default operand visibility.
	// An explicit operand is the "real" kind of operand that
	// is shown in syntax and can be specified by the programmer.
	VisExplicit OperandVisibility = iota

	// VisImplicit is for fixed arg (like EAX); usually shown in syntax.
	VisImplicit

	// VisSuppressed is like VisImplicit, but not shown in syntax.
	// In some very rare exceptions, they are also shown in syntax string.
	VisSuppressed

	// VisEcond is for encoder-only conditions. Can be ignored.
	VisEcond
)
type PatternSet ¶
PatternSet wraps instruction PATTERN properties providing set operations on them.
func NewPatternSet ¶
func NewPatternSet(pattern string) PatternSet
NewPatternSet decodes pattern string into PatternSet.
func (PatternSet) Index ¶
func (pset PatternSet) Index(keys ...string) int
Index returns the index in keys of the first key that pset contains. It returns -1 if pset contains none of the given keys.
func (PatternSet) Is ¶
func (pset PatternSet) Is(k string) bool
Is reports whether the set contains key k. Unlike a direct pattern set lookup, it first checks whether PatternAliases[k] is available and, if so, uses it instead of k in the lookup.
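An alias-aware lookup of this kind can be sketched with a map-backed set. Everything below is a stand-in: the `patternSet` type, the `newPatternSet` and `is` helpers, and the assumption that a pattern decodes into one set entry per whitespace-separated token. The alias entries are copied from PatternAliases.

```go
package main

import (
	"fmt"
	"strings"
)

// patternSet is a hypothetical stand-in: a set of pattern properties.
type patternSet map[string]bool

// aliases maps human-readable keys to pattern properties.
var aliases = map[string]string{
	"VEX":     "VEXVALID=1",
	"MemOnly": "MOD!=3",
	"RegOnly": "MOD=3",
}

// newPatternSet stores each whitespace-separated token as a set member
// (an assumption about how patterns are decoded).
func newPatternSet(pattern string) patternSet {
	pset := make(patternSet)
	for _, tok := range strings.Fields(pattern) {
		pset[tok] = true
	}
	return pset
}

// is looks up the alias target when k is an alias, otherwise k itself.
func (pset patternSet) is(k string) bool {
	if target, ok := aliases[k]; ok {
		return pset[target]
	}
	return pset[k]
}

func main() {
	pset := newPatternSet("VEXVALID=1 0x58 VEX_PREFIX=1 VL=0 MAP=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn]")
	fmt.Println(pset.is("VEX"), pset.is("RegOnly"), pset.is("MemOnly"), pset.is("VL=0"))
}
```

Because the alias table is an ordinary exported-style map, callers can add their own readable keys without touching the lookup code.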
func (PatternSet) Match ¶
func (pset PatternSet) Match(keyval ...string) string
Match is like MatchOrDefault("", keyval...).
func (PatternSet) MatchOrDefault ¶
func (pset PatternSet) MatchOrDefault(defaultValue string, keyval ...string) string
MatchOrDefault returns the value associated with the first matching key. It returns defaultValue if no match is found.
Keyval structure can be described as {"k1", "v1", ..., "kN", "vN"}.
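The flat key/value argument convention can be sketched as follows. The `patternSet` type and the lowercase method names are hypothetical stand-ins, and silently ignoring a trailing key that has no value is this sketch's choice, not documented behavior.

```go
package main

import "fmt"

// patternSet is a hypothetical stand-in: a set of pattern properties.
type patternSet map[string]bool

// matchOrDefault walks keyval as {"k1", "v1", ..., "kN", "vN"} and returns
// the value paired with the first key found in the set.
// A trailing key without a value is ignored.
func (pset patternSet) matchOrDefault(defaultValue string, keyval ...string) string {
	for i := 0; i+1 < len(keyval); i += 2 {
		if pset[keyval[i]] {
			return keyval[i+1]
		}
	}
	return defaultValue
}

// match is matchOrDefault with an empty default value.
func (pset patternSet) match(keyval ...string) string {
	return pset.matchOrDefault("", keyval...)
}

func main() {
	pset := patternSet{"VEXVALID=1": true, "VL=1": true, "MOD=3": true}
	fmt.Println(pset.matchOrDefault("unknown", "VL=0", "128", "VL=1", "256"))
	fmt.Printf("%q\n", pset.match("MOD!=3", "mem"))
}
```

Passing pairs as a flat variadic list keeps call sites short when a single property selects among a few fixed strings.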
func (PatternSet) Replace ¶
func (pset PatternSet) Replace(oldKey, newKey string)
Replace inserts newKey if oldKey is defined. oldKey is removed if insertion is performed.
func (PatternSet) String ¶
func (pset PatternSet) String() string
String returns a printable representation of the pattern set. All properties are sorted.
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader reads enc/dec-instruction objects from XED datafile.
Example ¶
This example shows how to print raw XED objects using Reader. Objects are called "raw" because some of their fields may require additional transformations like macro (states) expansion.
package main

import (
	"fmt"
	"log"
	"strings"

	"golang.org/x/arch/x86/xeddata"
)

func main() {
	const xedPath = "testdata/xedpath"

	input := strings.NewReader(`
{
ICLASS: VEXADD
EXCEPTIONS: avx-type-zero
CPL: 2000
CATEGORY: AVX-Q
EXTENSION: AVX-Q
ATTRIBUTES: A B C
PATTERN: VV1 0x07 VL128 V66 V0F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS: REG0=XMM_R():w:width_dq:fword64 REG1=XMM_N():r:width_dq:fword64 MEM0:r:width_dq:fword64
}

{
ICLASS: COND_MOV_Z
CPL: 210
CATEGORY: MOV_IF_COND_MET
EXTENSION: BASE
ISA_SET: COND_MOV
FLAGS: READONLY [ zf-tst ]

PATTERN: 0x0F 0x4F MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM()
OPERANDS: REG0=GPRv_R():cw MEM0:r:width_v

PATTERN: 0x0F 0x4F MOD[0b11] MOD=3 REG[rrr] RM[nnn]
OPERANDS: REG0=GPRv_R():cw REG1=GPRv_B():r
}`)

	objects, err := xeddata.NewReader(input).ReadAll()
	if err != nil {
		log.Fatal(err)
	}

	for _, o := range objects {
		fmt.Printf("%s (%s):\n", o.Opcode(), o.Extension)
		for _, inst := range o.Insts {
			fmt.Printf("\t[%d] %s\n", inst.Index, inst.Operands)
		}
	}
}
Output:

VEXADD (AVX-Q):
	[0] REG0=XMM_R():w:width_dq:fword64 REG1=XMM_N():r:width_dq:fword64 MEM0:r:width_dq:fword64
COND_MOV_Z (BASE):
	[0] REG0=GPRv_R():cw MEM0:r:width_v
	[1] REG0=GPRv_R():cw REG1=GPRv_B():r