uast

package
v3.3.2
Published: Nov 14, 2019 License: GPL-3.0 Imports: 7 Imported by: 7

Documentation

Overview

Package uast defines a UAST (Universal Abstract Syntax Tree) representation and operations to manipulate it.

Constants

const (
	KeyType  = "@type"  // the type of UAST node (InternalType in v1)
	KeyToken = "@token" // token of the UAST node (Native and Annotated nodes only)
	KeyRoles = "@role"  // roles of UAST node (Annotated nodes only); for representations see RoleList
	KeyPos   = "@pos"   // positional information is stored in this field, see Positions
)

Special field keys for nodes.Object

const (
	// NS is a namespace for the UAST types.
	NS = "uast"

	// TypePosition is a node type for positional information in AST. See AsPosition.
	TypePosition = NS + ":Position"
	// TypePositions is a node type for a root node of positional information in AST. See AsPositions.
	TypePositions = NS + ":Positions"
	// TypeOperator is a node type for an operator AST node. See Operator.
	TypeOperator = NS + ":Operator"
	// KeyPosOff is a name for a Position object field that stores a byte offset.
	KeyPosOff = "offset"
	// KeyPosLine is a name for a Position object field that stores a source line.
	KeyPosLine = "line"
	// KeyPosCol is a name for a Position object field that stores a source column.
	KeyPosCol = "col"

	KeyStart = "start" // StartPosition
	KeyEnd   = "end"   // EndPosition
)

Variables

var (
	// ErrIncorrectType is returned when trying to load a generic UAST node into a Go value
	// of an incorrect type.
	ErrIncorrectType = errors.NewKind("incorrect object type: %q, expected: %q")
	// ErrTypeNotRegistered is returned when trying to create a UAST type that was not associated
	// with any Go type. See RegisterPackage.
	ErrTypeNotRegistered = errors.NewKind("type is not registered: %q")
)

Functions

func AllImportPaths added in v3.2.0

func AllImportPaths(root nodes.External) []string

AllImportPaths returns a list of all import paths in the UAST. Resulting import paths will be deduplicated and sorted.

Path elements of a QualifiedIdentifier import are joined by '/'. For example, the Java import "com.example.pkg" is listed as "com/example/pkg".
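The normalization described above (join name elements with '/', deduplicate, sort) can be sketched in isolation. This is a minimal sketch, not the package's implementation: importPaths is a hypothetical helper, and each import is assumed to be already reduced to its list of name elements rather than read from a real UAST.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// importPaths is a hypothetical stand-in for the normalization step only.
// Each import is given as the list of its name elements.
func importPaths(imports [][]string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, names := range imports {
		p := strings.Join(names, "/") // "com.example.pkg" -> "com/example/pkg"
		if _, ok := seen[p]; ok {
			continue // drop duplicates
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	paths := importPaths([][]string{
		{"java", "util", "List"},
		{"com", "example", "pkg"},
		{"java", "util", "List"},
	})
	fmt.Println(paths) // [com/example/pkg java/util/List]
}
```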

func ContentOf

func ContentOf(n nodes.Node) string

ContentOf returns any relevant string content of a node. It returns the Name for Identifiers, the Value for Strings, etc., and falls back to TokenOf for non-Semantic nodes.

The result may not exactly match the source file since values in Semantic nodes are normalized.

It returns an empty string if the node has no string content.

func HashNoPos

func HashNoPos(n nodes.External) nodes.Hash

HashNoPos hashes the node, but skips positional information.

func LookupType

func LookupType(typ string) (reflect.Type, bool)

LookupType finds a Go type corresponding to a specified UAST type.

It only returns types registered via RegisterPackage.

func NewObjectByType

func NewObjectByType(typ string) nodes.Object

func NewObjectByTypeOpt

func NewObjectByTypeOpt(typ string) (obj, opt nodes.Object)

func NewPositionalIterator

func NewPositionalIterator(root nodes.External) nodes.Iterator

NewPositionalIterator creates a new iterator that enumerates all object nodes, sorting them by positions in the source file. Nodes with no positions will be enumerated last.

func NewValue

func NewValue(typ string) (reflect.Value, error)

NewValue creates a new Go value corresponding to a specified UAST type.

It only creates types registered via RegisterPackage.

func NodeAs

func NodeAs(n nodes.External, dst interface{}) error

NodeAs loads a generic UAST node into the provided Go value.

It uses "uast" or "json" struct tags to get field names. Interface values will either be constructed from types that were registered in the uast package, or will be populated with raw UAST nodes.

It returns ErrIncorrectType in case of type mismatch.
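To illustrate the idea of tag-driven loading with a type check, here is a minimal sketch that does not use the uast package at all. It swaps in an encoding/json round trip as a stand-in technique. The helper loadIdentifier, the plain map used as a generic node, and the "uast:Identifier" type string are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Identifier mirrors the documented struct shape, without GenNode.
type Identifier struct {
	Name string `json:"Name"`
}

// loadIdentifier is a hypothetical helper, not part of the uast package.
// It checks the "@type" key first, then lets encoding/json map the
// remaining fields by their struct tags.
func loadIdentifier(n map[string]interface{}, dst *Identifier) error {
	const want = "uast:Identifier" // assumed type string
	if typ, _ := n["@type"].(string); typ != want {
		return fmt.Errorf("incorrect object type: %q, expected: %q", typ, want)
	}
	data, err := json.Marshal(n)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, dst)
}

func main() {
	n := map[string]interface{}{"@type": "uast:Identifier", "Name": "x"}
	var id Identifier
	if err := loadIdentifier(n, &id); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.Name) // x
}
```

The real NodeAs works on nodes.External values and returns ErrIncorrectType on a mismatch. The sketch only mirrors that contract.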

func RegisterPackage

func RegisterPackage(ns string, types ...interface{})

RegisterPackage registers a new UAST namespace and associates the concrete types of the specified values with it. All types should be in the same Go package. The name of each type is derived from its reflect.Type name.

Example:

type Node struct{}

func init() {
	// will register a UAST type "my:Node" associated with
	// a Node type from this package
	RegisterPackage("my", Node{})
}
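How a namespace plus a reflect.Type name can form the UAST type string is sketched below. This is a simplified, hypothetical registry (registerPackage, lookupType, and the registry map are names invented for the example), written under the assumption that the type string has the form "ns:Name". It is not the package's internal implementation.

```go
package main

import (
	"fmt"
	"reflect"
)

// registry maps "ns:Name" strings to Go types (hypothetical).
var registry = make(map[string]reflect.Type)

// registerPackage derives each type name from reflect.Type.Name and
// stores it under the given namespace.
func registerPackage(ns string, types ...interface{}) {
	for _, v := range types {
		rt := reflect.TypeOf(v)
		registry[ns+":"+rt.Name()] = rt
	}
}

// lookupType only finds types that were registered before.
func lookupType(typ string) (reflect.Type, bool) {
	rt, ok := registry[typ]
	return rt, ok
}

type Node struct{}

func main() {
	registerPackage("my", Node{})

	rt, ok := lookupType("my:Node")
	fmt.Println(rt, ok) // main.Node true

	_, ok = lookupType("my:Other")
	fmt.Println(ok) // false
}
```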

func RoleList

func RoleList(roles ...role.Role) nodes.Array

RoleList converts a set of roles into a list node.

func RolesOf

func RolesOf(n nodes.Node) role.Roles

RolesOf is a helper for getting node UAST roles (see KeyRoles). The function returns a nil roles array for non-object nodes like arrays and values.

func ToNode

func ToNode(o interface{}) (nodes.Node, error)

ToNode converts generic values returned by schema-less encodings such as JSON to Node objects. It also supports values registered via RegisterPackage.

func TokenOf

func TokenOf(n nodes.Node) string

TokenOf is a helper for getting node token (see KeyToken).

The token is an exact code snippet that represents a given AST node. It only works for primitive nodes like identifiers and string literals, and is only available in Native and Annotated parsing modes. For Semantic mode, see ContentOf.

It returns an empty string if the node is not an object, or there is no token.

func Tokens

func Tokens(n nodes.Node) []string

Tokens collects all tokens of the tree recursively (pre-order). See TokenOf.
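A pre-order collection visits a node before its children. The sketch below shows that traversal order on a simplified, hypothetical node type with an explicit token and child list. The real function works on nodes.Node values, so the tree shape here is an assumption of the example.

```go
package main

import "fmt"

// node is a simplified stand-in for a UAST node (hypothetical).
type node struct {
	token    string
	children []*node
}

// tokens collects non-empty tokens in pre-order: the node itself first,
// then each child subtree from left to right.
func tokens(n *node) []string {
	if n == nil {
		return nil
	}
	var out []string
	if n.token != "" {
		out = append(out, n.token)
	}
	for _, c := range n.children {
		out = append(out, tokens(c)...)
	}
	return out
}

func main() {
	tree := &node{children: []*node{
		{token: "a"},
		{token: "+", children: []*node{{token: "b"}}},
	}}
	fmt.Println(tokens(tree)) // [a + b]
}
```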

func TypeOf

func TypeOf(o interface{}) string

TypeOf returns the UAST type of a value.

If the value is a generic UAST node, the function returns the value of its KeyType.

If an object is registered as a UAST schema type, the function returns the associated type.

Types

type Alias

type Alias struct {
	GenNode
	// Name assigned to an entity.
	//
	// TODO: define a different node to handle QualifiedIdentifier as a name
	Name Identifier `json:"Name"`

	// A UAST node to assign a name to.
	Node Any `json:"Node"`
}

Alias provides a way to assign a permanent name to an entity, or give an alternative name.

An Alias is immutable; the only way to redefine it is to shadow it in a child scope.

What is considered an Alias:
- a name of a function in a function declaration;
- a name of a constant and its value;
- a name of a preprocessor macro and its substitution;
- variable declaration; // TODO: should point to some Variable node

Not considered an Alias:
- value assignments to a variable, even if it defines a variable;

type Any

type Any interface{}

Any is an alias type for any UAST node.

type Argument

type Argument struct {
	GenNode
	// Name is an optional name of an argument.
	Name *Identifier `json:"Name"`

	// Type is an optional type of an argument.
	Type Any `json:"Type"`

	// Init is an optional expression used to initialize the argument
	// in case no value is provided.
	Init Any `json:"Init"`

	// Variadic is set for the last argument of a function with a
	// variadic number of arguments.
	Variadic bool `json:"Variadic"`

	// MapVariadic is set for the last argument of a function that accepts a
	// map/dictionary value that is mapped to function arguments.
	MapVariadic bool `json:"MapVariadic"`

	// Receiver is set to true if an argument is a receiver of a method call.
	Receiver bool `json:"Receiver"`
}

Argument is a named argument or return value of a function.

type Block

type Block struct {
	GenNode
	Statements []Any `json:"Statements"`
}

Block is a logical code block. It groups multiple statements and enforces a sequential execution of these statements.

When the Block should be used:
- for function bodies;
- when the statement defines a new scope;

type Bool

type Bool struct {
	GenNode
	Value bool `json:"Value" uast:",content"`
}

Bool is a boolean literal.

type Comment

type Comment struct {
	GenNode

	// Block is set to true for block-style comments.
	//
	// TODO: should be a string similar to Format field in String literal;
	//       may have more than 2 possible values (line, block, doc?)
	Block bool `json:"Block"`

	// Text is an unescaped UTF8 string with the comment text.
	//
	// Drivers must trim any comment-related tokens as well as whitespaces and
	// stylistic characters at the beginning of each line. See Prefix, Suffix, Tab.
	//
	// Example:
	//    /*
	//     * some comment
	//     */
	//
	//    only "some comment" is considered a text
	Text string `json:"Text" uast:",content"`

	// Prefix is a set of whitespaces and stylistic characters that appear before
	// the first line of an actual comment text.
	//
	// Example:
	//    /*
	//     * some comment
	//     */
	//
	//    the "\n" after the "/*" token is considered a prefix
	Prefix string `json:"Prefix"`

	// Suffix is a set of whitespaces and stylistic characters that appear after
	// the last line of an actual comment text.
	//
	// Example:
	//    /*
	//     * some comment
	//     */
	//
	//    the "\n " before the "*/" token is considered a suffix
	Suffix string `json:"Suffix"`

	// Tab is a set of whitespace and stylistic characters that appears at the beginning
	// of each comment line, except the first one, which uses Prefix.
	//
	// Example:
	//    /*
	//     * some comment
	//     */
	//
	//    the " *" before the comment text is considered a tab
	//
	// TODO(dennwc): rename to Indent?
	Tab string `json:"Tab"`
}

Comment is a no-op node that can span multiple lines and provides a human-readable description for code around it.

TODO: currently some annotations are also considered a Comment; need to clarify this

type Function

type Function struct {
	GenNode
	// Type is a signature of a function. Should always be set.
	Type FunctionType `json:"Type"`

	// Body is an optional implementation of a function. It should point to a Block with
	// a set of statements. Each code path in those statements should end with return.
	//
	// TODO: we don't have return statements yet
	Body *Block `json:"Body"`
}

Function is a declaration of a function with a specific signature and implementation.

Name is not a part of function declaration. Use Alias as a parent node to specify the name of a function.

What is considered a Function:
- function declaration;
- anonymous functions;
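Because the name lives on a parent Alias rather than on the Function itself, a named declaration is a two-level structure. The following sketch shows that nesting with simplified local stand-ins for the documented types (no GenNode, no struct tags). functionName is a hypothetical helper added for the example.

```go
package main

import "fmt"

// Simplified local stand-ins for the documented node types;
// they are not the real uast definitions.
type Identifier struct{ Name string }

type Argument struct {
	Name     *Identifier
	Receiver bool
}

type FunctionType struct {
	Arguments []Argument
	Returns   []Argument
}

type Function struct {
	Type FunctionType
}

type Alias struct {
	Name Identifier
	Node interface{}
}

// functionName is a hypothetical helper: it reports the name of a function
// declaration, which is stored on the Alias, not on the Function.
func functionName(a Alias) (string, bool) {
	if _, ok := a.Node.(Function); !ok {
		return "", false
	}
	return a.Name.Name, true
}

func main() {
	// Roughly "func add(a, b)" with a single return of unspecified type.
	decl := Alias{
		Name: Identifier{Name: "add"},
		Node: Function{Type: FunctionType{
			Arguments: []Argument{
				{Name: &Identifier{Name: "a"}},
				{Name: &Identifier{Name: "b"}},
			},
			Returns: []Argument{{}},
		}},
	}
	name, ok := functionName(decl)
	fmt.Println(name, ok) // add true
}
```

An anonymous function would be the same Function value without the surrounding Alias.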

type FunctionGroup

type FunctionGroup Group

FunctionGroup is a special group node that joins multiple UAST nodes related to a function declaration.

FunctionGroup usually contains at least an Alias node that specifies the function name, and may contain additional nodes such as annotations, comments, and docs related to it.

See Function for more details about function declarations.

type FunctionType

type FunctionType struct {
	GenNode
	// Arguments is a set of arguments the function accepts.
	//
	// Methods defined on structures and classes must have the first argument
	// that corresponds to a method's receiver ("this" in most languages).
	Arguments []Argument `json:"Arguments"`

	// Returns is a set of values returned by a function.
	//
	// Languages with an implicit return should specify a single return with an
	// unspecified type.
	Returns []Argument `json:"Returns"`
}

FunctionType is a signature of a function.

type GenNode

type GenNode struct {
	Positions Positions `json:"@pos,omitempty"`
}

GenNode is embedded into every UAST node to store positional information.

type Group

type Group struct {
	GenNode
	// Nodes is a list of UAST nodes in a group.
	Nodes []Any `json:"Nodes"`
}

Group is a no-op UAST node that groups multiple nodes together.

Drivers may use it for grouping statements that are represented by a single statement in the native AST.

For example, a language may describe a way to define multiple variables in one statement. This statement should be split into separate UAST nodes that become children of a single Group.

Groups should never convey any semantic meaning.

type Identifier

type Identifier struct {
	GenNode
	// Name of an entity. Can be any valid UTF8 string.
	Name string `json:"Name" uast:",content"`
}

Identifier is a name of an entity.

What is considered an Identifier:
- variable, type, function names;
- builtin type names;
- package name consisting of a single name element;
- goto labels;

Not considered an Identifier:
- qualified names (see QualifiedIdentifier);
- path-like or url-like package names (see String);

func (Identifier) Roles

func (Identifier) Roles() []role.Role

Roles returns a list of UAST node roles that apply to this node.

type Import

type Import struct {
	GenNode
	// Path is a path of a module or package to load.
	//
	// May have a value of:
	// - String (specifies relative or absolute module path);
	// - Identifier (same as QualifiedIdentifier, but for one name element);
	// - QualifiedIdentifier (specifies a canonical module name);
	// - Alias (contains any of the above and defines a local package name within a file/scope);
	Path Any `json:"Path"`

	// All is set to true when the statement defines all exported symbols from
	// a module in the local scope (usually file).
	All    bool  `json:"All"`
	Names  []Any `json:"Names"`
	Target Scope `json:"Target"`
}

Import is a statement that can load other modules into the program or library.

This is a declarative import statement. Its position in the UAST does not affect how or when the module is imported; the side effects are executed only once a package is initialized.

This describes imports in Go, Java, and C#, for example.

For more specific types see RuntimeImport, RuntimeReImport, InlineImport.

type InlineImport

type InlineImport Import

InlineImport is a subset of the import statement that acts like a preprocessor: all statements in the imported module are copied into the position of the UAST node.

This describes #include in C and C++.

For other import types, see Import.

type Position

type Position struct {
	// Offset is the position as an absolute byte offset. It is a 0-based index.
	Offset uint32 `json:"offset"`
	// Line is the line number. It is a 1-based index.
	Line uint32 `json:"line"`
	// Col is the column number — the byte offset of the position relative to
	// a line. It is a 1-based index.
	Col uint32 `json:"col"`
}

Position represents a position in a source code file.

func AsPosition

func AsPosition(m nodes.Object) *Position

AsPosition transforms a generic AST node to a Position object.

func (Position) HasLineCol

func (p Position) HasLineCol() bool

HasLineCol checks if a position has a valid line-column pair.

func (Position) HasOffset

func (p Position) HasOffset() bool

HasOffset checks if a position has a valid offset value.

func (Position) Less

func (p Position) Less(p2 Position) bool

Less reports whether position p is strictly less than p2.

If both positions have offsets, they will be used for comparison. Otherwise, line-column pair will be used.

Invalid positions are sorted last.
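The comparison rules above can be sketched as a standalone function and used with sort.Slice. This is a minimal sketch on a local copy of the Position struct. The validity checks (a zero value means "no position", and offset 0 counts only when line and column are both 1) are assumptions of the example, not confirmed behavior of HasOffset or Valid.

```go
package main

import (
	"fmt"
	"sort"
)

// Position is a local copy of the documented struct, without json tags.
type Position struct {
	Offset, Line, Col uint32
}

// Assumption: the zero value carries no positional information.
func (p Position) valid() bool { return p != (Position{}) }

// Assumption: offset 0 is only trusted for the very first byte (1:1).
func (p Position) hasOffset() bool {
	return p.Offset != 0 || (p.Line == 1 && p.Col == 1)
}

// less orders by offset when both sides have one, otherwise by the
// line-column pair; invalid positions sort last.
func less(p, p2 Position) bool {
	switch {
	case !p.valid():
		return false
	case !p2.valid():
		return true
	case p.hasOffset() && p2.hasOffset():
		return p.Offset < p2.Offset
	case p.Line != p2.Line:
		return p.Line < p2.Line
	default:
		return p.Col < p2.Col
	}
}

func main() {
	ps := []Position{
		{Offset: 20, Line: 2, Col: 5},
		{}, // no positional information
		{Offset: 3, Line: 1, Col: 4},
	}
	sort.Slice(ps, func(i, j int) bool { return less(ps[i], ps[j]) })
	fmt.Println(ps) // [{3 1 4} {20 2 5} {0 0 0}]
}
```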

func (Position) ToObject

func (p Position) ToObject() nodes.Object

ToObject converts Position to a generic AST node.

func (Position) Valid

func (p Position) Valid() bool

Valid checks if position value is valid.

type Positions

type Positions map[string]Position

Positions is a container that stores all positional information for a UAST node.

The string key is the name of a position. For example, KeyStart is the start position of a node and KeyEnd is its end position. A driver may provide additional positional information for other tokens that the node consists of.
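To make the layout concrete, the sketch below stores a positions map under the "@pos" key of a generic node and prints it as JSON. It uses local copies of the documented types. The start helper returning nil for a missing key, the plain map used as a node, and the "uast:Identifier" type string are assumptions of the example. The real generic objects are additionally tagged with the TypePositions and TypePosition node types, which this sketch leaves out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, for illustration only.
type Position struct {
	Offset uint32 `json:"offset"`
	Line   uint32 `json:"line"`
	Col    uint32 `json:"col"`
}

type Positions map[string]Position

// start returns the "start" position, or nil when it is missing
// (the nil result is an assumption of this sketch).
func (p Positions) start() *Position {
	if pos, ok := p["start"]; ok {
		return &pos
	}
	return nil
}

func main() {
	// A generic node for a one-character identifier "x" at the start of a file.
	n := map[string]interface{}{
		"@type": "uast:Identifier",
		"Name":  "x",
		"@pos": Positions{
			"start": {Offset: 0, Line: 1, Col: 1},
			"end":   {Offset: 1, Line: 1, Col: 2},
		},
	}
	data, err := json.MarshalIndent(n, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```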

func PositionsOf

func PositionsOf(n nodes.Node) Positions

PositionsOf returns a complete positions map for the given UAST node. The function will return nil for non-object nodes like arrays and values. To get positions for these nodes, PositionsOf should be called on their parent node.

func (Positions) End

func (p Positions) End() *Position

End returns an end position of the node.

func (Positions) Keys

func (p Positions) Keys() []string

Keys returns a sorted slice of position names.

func (Positions) Start

func (p Positions) Start() *Position

Start returns a start position of the node.

func (Positions) ToObject

func (p Positions) ToObject() nodes.Object

ToObject converts a positions map to a generic UAST node.

type QualifiedIdentifier

type QualifiedIdentifier struct {
	GenNode
	// Names is a list of simple identifiers starting from a root level of hierarchy
	// and ending with leaf identifier. Names should not be empty.
	Names []Identifier `json:"Names"`
}

QualifiedIdentifier is a name of an entity that consists of multiple simple identifiers, organized in a hierarchy, similar to filesystem paths.

What is considered a QualifiedIdentifier:
- qualified names that consist of Identifier-like elements;

Not considered a QualifiedIdentifier:
- path-like or url-like package names (see String);
- selector expressions (a->b and a.b in C++);

type RuntimeImport

type RuntimeImport Import

RuntimeImport is a type of an import statement that imports a module only when an execution reaches this UAST node. The import side effects are executed only once, regardless of how many times a statement is reached.

This describes imports in PHP, Python and JS for example.

For other import types, see Import.

type RuntimeReImport

type RuntimeReImport RuntimeImport

RuntimeReImport is a subset of the RuntimeImport statement that re-executes an import and its side effects each time execution reaches the statement.

This describes imports in PHP and Python for example.

For other import types, see Import.

type Scope

type Scope = Any

Scope is a temporary definition of a scope semantic type.

type String

type String struct {
	GenNode
	// Value is a UTF8 string literal value.
	//
	// Drivers should remove any quotes and unescape the value according to the language rules.
	Value string `json:"Value" uast:",content"`

	// Format is an optional language-specific string that describes the format of the literal.
	//
	// This field can be empty for the most common string literal type of a specific language.
	// The priority is given to a one-line literal that escapes newline characters.
	//
	// TODO: define some well-known formats and maybe make it an enum
	Format string `json:"Format"`
}

String is an unescaped UTF8 string literal.

What is considered a String literal:
- escaped string literals;
- raw string literals;
- path-like or url-like package names;

Not considered a String literal:
- identifiers (see Identifier);
- qualified names (see QualifiedIdentifier);
- numeric and boolean literals;
- special regexp literals;

Directories

Path Synopsis
nodesproto
Package nodesproto is a generated protocol buffer package.
Package role is a generated protocol buffer package.