Documentation ¶
Overview ¶
Package uast is a generated protocol buffer package. It is generated from these files: gopkg.in/bblfsh/sdk.v1/uast/generated.proto It has these top-level messages: Node Position
Package uast defines a UAST (Universal Abstract Syntax Tree) representation and operations to manipulate it.
Index ¶
- Constants
- Variables
- func CyclomaticComplexity(n *Node) int
- func Pretty(n *Node, w io.Writer, includes IncludeFlag) error
- func Tokens(n *Node) []string
- type Hash
- type IncludeFlag
- type Node
- func (*Node) Descriptor() ([]byte, []int)
- func (n *Node) Hash() Hash
- func (n *Node) HashWith(includes IncludeFlag) Hash
- func (m *Node) Marshal() (dAtA []byte, err error)
- func (m *Node) MarshalTo(dAtA []byte) (int, error)
- func (*Node) ProtoMessage()
- func (m *Node) ProtoSize() (n int)
- func (m *Node) Reset()
- func (n *Node) String() string
- func (m *Node) Unmarshal(dAtA []byte) error
- type ObjectToNode
- type Path
- type PathIter
- type PathStepIter
- type Position
- func (*Position) Descriptor() ([]byte, []int)
- func (m *Position) Marshal() (dAtA []byte, err error)
- func (m *Position) MarshalTo(dAtA []byte) (int, error)
- func (*Position) ProtoMessage()
- func (m *Position) ProtoSize() (n int)
- func (m *Position) Reset()
- func (m *Position) String() string
- func (m *Position) Unmarshal(dAtA []byte) error
- type Role
Constants ¶
const ( // IncludeChildren includes all children of the node. IncludeChildren IncludeFlag = 1 // IncludeAnnotations includes UAST annotations. IncludeAnnotations = 2 // IncludePositions includes token positions. IncludePositions = 4 // IncludeTokens includes token contents. IncludeTokens = 8 // IncludeInternalType includes internal type. IncludeInternalType = 16 // IncludeProperties includes properties. IncludeProperties = 32 // IncludeOriginalAST includes all properties that are present // in the original AST. IncludeOriginalAST = IncludeChildren | IncludePositions | IncludeTokens | IncludeInternalType | IncludeProperties // IncludeAll includes all fields. IncludeAll = IncludeOriginalAST | IncludeAnnotations )
const ( // InternalRoleKey is a key string used in properties to hold the internal // role of a node in the AST, if any. InternalRoleKey = "internalRole" )
Variables ¶
var ( ErrInvalidLengthGenerated = fmt.Errorf("proto: negative length found during unmarshaling") ErrIntOverflowGenerated = fmt.Errorf("proto: integer overflow") )
var ( ErrEmptyAST = errors.NewKind("input AST was empty") ErrTwoTokensSameNode = errors.NewKind("token was already set (%s != %s)") ErrTwoTypesSameNode = errors.NewKind("internal type was already set (%s != %s)") ErrUnexpectedObject = errors.NewKind("expected object of type %s, got: %#v") ErrUnexpectedObjectSize = errors.NewKind("expected object of size %d, got %d") ErrUnsupported = errors.NewKind("unsupported: %s") )
var Role_name = map[int32]string{}/* 118 elements not displayed */
Role is the main UAST annotation. It indicates that a node in an AST can be interpreted as acting with certain language-independent role.
go:generate stringer -type=Role
var Role_value = map[string]int32{}/* 118 elements not displayed */
Functions ¶
func CyclomaticComplexity ¶
func CyclomaticComplexity(n *Node) int
This implementation uses the PMD implementation as a reference: the result is one, plus one for each of the following UAST roles found on any of the children: If | Case | For | [Do]While | Catch | Continue | And | Or | Xor | Goto. Important: since some languages allow code to be defined outside function definitions, this function does not check that the Node has the role FunctionDeclarationRole, so callers should check that themselves if the intent is to calculate the complexity of a function or method. If the children contain more than one function definition, the value is not averaged over the number of function declarations but returned as a total.
Practical implementations count tokens in the code, and they sometimes differ. For example, some count the switch "default" as an incrementor; some count all return statements except the last; some count "else" (which is wrong IMHO, though not for elifs: remember that the IfElse token in the UAST is really an Else, not an "else if"; an else-if would have a child If token); some count throw and finally while others count only catch; and so on.
Examples: PMD reference implementation: http://pmd.sourceforge.net/pmd-4.3.0/xref/net/sourceforge/pmd/rules/CyclomaticComplexity.html GMetrics: http://gmetrics.sourceforge.net/gmetrics-CyclomaticComplexityMetric.html Go: https://github.com/fzipp/gocyclo/blob/master/gocyclo.go#L214 SonarQube (include rules for many languages): https://docs.sonarqube.org/display/SONAR/Metrics+-+Complexity
IMPORTANT DISCLAIMER: McCabe's definition clearly specifies that boolean operations should increment the count by one for every boolean element when the language evaluates conditions with short-circuiting. Unfortunately, the current version of the UAST does not specify these language invariants, and we have not yet defined how boolean expressions will be represented (which would also need a tree transformation process in the pipeline that we lack). Lacking a better way of detecting, for all languages, which symbols or literals are part of a boolean expression, we count the boolean operators themselves. This should work for short-circuit infix languages, but not for prefix or infix languages that can evaluate more than two items with a single operator. (FIXME when both things are solved in the UAST definition and the SDK.)
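The counting rule described above can be sketched roughly as follows. This is not the package's implementation: `Role` and `Node` here are simplified local stand-ins for `uast.Role` and `uast.Node`, `incrementing` and `complexity` are hypothetical names, and counting each node at most once (and including the root node itself) is an assumption of the sketch.

```go
package main

import "fmt"

// Role and Node are simplified stand-ins for uast.Role and uast.Node
// (assumptions for this sketch; the real types carry more fields).
type Role int

const (
	If Role = iota + 1
	Case
	For
	While
	DoWhile
	Catch
	Continue
	And
	Or
	Xor
	Goto
	Statement
)

type Node struct {
	Roles    []Role
	Children []*Node
}

// incrementing lists the roles that add one to the complexity count.
var incrementing = map[Role]bool{
	If: true, Case: true, For: true, While: true, DoWhile: true,
	Catch: true, Continue: true, And: true, Or: true, Xor: true, Goto: true,
}

// countDecisionPoints returns how many nodes in the tree rooted at n
// carry at least one incrementing role. Each node counts at most once.
func countDecisionPoints(n *Node) int {
	if n == nil {
		return 0
	}
	count := 0
	for _, r := range n.Roles {
		if incrementing[r] {
			count++
			break
		}
	}
	for _, c := range n.Children {
		count += countDecisionPoints(c)
	}
	return count
}

// complexity is one plus the number of decision points found.
func complexity(n *Node) int {
	return 1 + countDecisionPoints(n)
}

func main() {
	tree := &Node{Roles: []Role{Statement}, Children: []*Node{
		{Roles: []Role{If}},
		{Roles: []Role{For}, Children: []*Node{{Roles: []Role{And}}}},
		{Roles: []Role{Statement}},
	}}
	fmt.Println(complexity(tree)) // 4
}
```

Because the total is not averaged, passing a node that contains several functions yields the sum of their decision points plus one.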
Types ¶
type IncludeFlag ¶
type IncludeFlag int64
IncludeFlag represents a set of fields to be included in a Hash or String.
func (IncludeFlag) Is ¶
func (f IncludeFlag) Is(of IncludeFlag) bool
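As an illustration of how the IncludeFlag constants combine as a bit set, here is a minimal, self-contained sketch. It redeclares the type and constants locally with the values listed under Constants; the helper `has` is hypothetical, and its "all bits set" semantics are an assumption that may differ from what `IncludeFlag.Is` actually does.

```go
package main

import "fmt"

// IncludeFlag mirrors the package's type for illustration only.
type IncludeFlag int64

// The values below are copied from the Constants section.
const (
	IncludeChildren     IncludeFlag = 1
	IncludeAnnotations  IncludeFlag = 2
	IncludePositions    IncludeFlag = 4
	IncludeTokens       IncludeFlag = 8
	IncludeInternalType IncludeFlag = 16
	IncludeProperties   IncludeFlag = 32

	IncludeOriginalAST = IncludeChildren | IncludePositions | IncludeTokens |
		IncludeInternalType | IncludeProperties
	IncludeAll = IncludeOriginalAST | IncludeAnnotations
)

// has reports whether every bit of want is set in f.
// Hypothetical helper: not necessarily the semantics of IncludeFlag.Is.
func has(f, want IncludeFlag) bool {
	return f&want == want
}

func main() {
	fmt.Println(int64(IncludeOriginalAST), int64(IncludeAll)) // 61 63
	fmt.Println(has(IncludeOriginalAST, IncludeAnnotations))  // false
	fmt.Println(has(IncludeAll, IncludeAnnotations))          // true
}
```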
type Node ¶
type Node struct { // InternalType is the internal type of the node in the AST, in the source // language. InternalType string `json:",omitempty"` // Properties are arbitrary, language-dependent, metadata of the // original AST. Properties map[string]string `json:",omitempty"` // Children are the children nodes of this node. Children []*Node `json:",omitempty"` // Token is the token content if this node represents a token from the // original source file. If it is empty, there is no token attached. Token string `json:",omitempty"` // StartPosition is the position where this node starts in the original // source code file. StartPosition *Position `json:",omitempty"` // EndPosition is the position where this node ends in the original // source code file. EndPosition *Position `json:",omitempty"` // Roles is a list of Role that this node has. It is a language-independent // annotation. Roles []Role `json:",omitempty"` }
Node is a node in a UAST.
func (*Node) Descriptor ¶
func (*Node) HashWith ¶
func (n *Node) HashWith(includes IncludeFlag) Hash
HashWith returns the hash of the node, computed with the given set of fields.
func (*Node) ProtoMessage ¶
func (*Node) ProtoMessage()
type ObjectToNode ¶ added in v1.1.0
type ObjectToNode struct { // InternalTypeKey is the name of the key that the native AST uses // to differentiate the type of the AST nodes. This internal key will then be // checkable in the AnnotationRules with the `HasInternalType` predicate. This // field is mandatory. InternalTypeKey string // OffsetKey is the key used in the native AST to indicate the absolute offset, // from the file start position, where the code mapped to the AST node starts. OffsetKey string // EndOffsetKey is the key used in the native AST to indicate the absolute offset, // from the file start position, where the code mapped to the AST node ends. EndOffsetKey string // LineKey is the key used in the native AST to indicate // the line number where the code mapped to the AST node starts. LineKey string // EndLineKey is the key used in the native AST to indicate // the line number where the code mapped to the AST node ends. EndLineKey string // ColumnKey is a key that indicates the column inside the line where the node starts. ColumnKey string // EndColumnKey is a key that indicates the column inside the line where the node ends. EndColumnKey string // TokenKeys establishes what properties (as in JSON // keys) in the native AST nodes can be mapped to Tokens in the UAST. If the // InternalTypeKey is the "type" of a node, the Token could be thought of as the // "value" representation; this could be a specific value for string/numeric // literals or the symbol name for others. E.g.: if a native AST represents a // numeric literal as: `{"ast_type": NumLiteral, "value": 2}` then you should // add `"value": true` to the TokenKeys map. Some native ASTs will use several // different fields as tokens depending on the node type; in that case, all should // be added to this map to ensure a correct UAST generation. TokenKeys map[string]bool // SpecificTokenKeys allows mapping specific nodes, by their internal type, to a // concrete field of the node. 
This can solve conflicts on nodes whose token is // represented by a very specific field or that have more than one of the fields specified in // TokenKeys. SpecificTokenKeys map[string]string // SyntheticTokens is a map of InternalType to string used to add // synthetic tokens to nodes depending on its InternalType; sometimes native ASTs just use an // InternalTypeKey for some node but we need to add a Token to the UAST node to // improve the representation. In this case we can add both the InternalKey and // what token it should generate. E.g.: an InternalTypeKey called "NullLiteral" in // Java should be mapped using this map to "null" adding `"NullLiteral": // "null"` to this map. SyntheticTokens map[string]string // PromotedPropertyLists allows converting some properties in the native AST with a list value // to its own node with the list elements as children. By default the UAST // generation will set as children of a node any uast.Node that hangs from any of the // original native AST node properties. In this process, the object key serving as // the parent is lost and its name is added as the "internalRole" key of the children. // This is usually fine since the InternalTypeKey of the parent AST node will // usually provide enough context and the node won't have any other children. This map // allows you to change this default behavior for specific nodes so the properties // are "promoted" to a new node (with an InternalTypeKey named "Parent.KeyName") // and the objects in its list will be shown in the UAST as children. 
E.g.: if you // have a native AST where an "If" node has the JSON keys "body", "orelse" and // "condition" each with its own list of children, you could add an entry to // PromotedPropertyLists like // // "If": {"body": true, "orelse": true, "condition": true}, // // In this case, the new nodes will have the InternalTypeKey "If.body", "If.orelse" // and "If.condition" and with these names you should be able to write specific // matching rules in the annotation.go file. PromotedPropertyLists map[string]map[string]bool // If this option is set, all properties mapped to a list will be promoted to their own node. Setting // this option to true will ignore the PromotedPropertyLists settings. PromoteAllPropertyLists bool // PromotedPropertyStrings allows converting some properties whose value is a string // in the native AST to a full node with the string value as Token like: // // "SomeKey": "SomeValue" // // that would be converted to a child node like: // // {"internalType": "SomeKey", "Token": "SomeValue"} PromotedPropertyStrings map[string]map[string]bool // TopLevelIsRootNode tells ToNode where to find the root node of // the AST. If true, the root will be its input argument. If false, // the root will be the value of the only key present in its input // argument. TopLevelIsRootNode bool }
ObjectToNode transforms trees that are represented as nested JSON objects, that is, an interface{} containing maps, slices, strings and integers. It converts that structure to a *Node.
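To make the roles of InternalTypeKey, TokenKeys and the "internalRole" property more concrete, here is a heavily simplified sketch of this kind of conversion. It is not the package's code: `converter`, `toNode` and the local `Node` type are hypothetical, positions, promoted properties and synthetic tokens are ignored, and keys are visited in sorted order only to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a reduced stand-in for uast.Node (assumption for this sketch).
type Node struct {
	InternalType string
	Token        string
	Properties   map[string]string
	Children     []*Node
}

// converter holds only two of the ObjectToNode options described above.
type converter struct {
	InternalTypeKey string
	TokenKeys       map[string]bool
}

// toNode converts a nested JSON-like object into a Node tree. Nested
// objects (or lists of objects) become children, and the key they hung
// from is recorded in the child's "internalRole" property.
func (c converter) toNode(obj map[string]interface{}) *Node {
	n := &Node{Properties: map[string]string{}}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic child order for the example
	for _, k := range keys {
		switch v := obj[k].(type) {
		case map[string]interface{}:
			child := c.toNode(v)
			child.Properties["internalRole"] = k
			n.Children = append(n.Children, child)
		case []interface{}:
			for _, item := range v {
				if m, ok := item.(map[string]interface{}); ok {
					child := c.toNode(m)
					child.Properties["internalRole"] = k
					n.Children = append(n.Children, child)
				}
			}
		default:
			s := fmt.Sprint(v)
			switch {
			case k == c.InternalTypeKey:
				n.InternalType = s
			case c.TokenKeys[k]:
				n.Token = s
			default:
				n.Properties[k] = s
			}
		}
	}
	return n
}

func main() {
	c := converter{
		InternalTypeKey: "ast_type",
		TokenKeys:       map[string]bool{"value": true, "id": true},
	}
	root := c.toNode(map[string]interface{}{
		"ast_type": "Assign",
		"target":   map[string]interface{}{"ast_type": "Name", "id": "x"},
		"value":    map[string]interface{}{"ast_type": "NumLiteral", "value": 2},
	})
	for _, ch := range root.Children {
		fmt.Println(ch.InternalType, ch.Token, ch.Properties["internalRole"])
	}
}
```

In this example the numeric literal's "value" key is listed in TokenKeys, so its content ends up as the child's Token, as the TokenKeys comment describes.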
func (*ObjectToNode) ToNode ¶ added in v1.1.0
func (c *ObjectToNode) ToNode(v interface{}) (*Node, error)
type Path ¶
type Path []*Node
Path represents a Node with its path in a tree. It is a slice with every token in the path, where the last one is the node itself. The empty path is the zero value (e.g. parent of the root node).
type PathIter ¶
type PathIter interface { // Next returns the next node path or nil if there are no more nodes. Next() Path }
PathIter iterates node paths.
type PathStepIter ¶
type PathStepIter interface { PathIter // If Step is called, children of the last node returned by Next() will // not be visited. Step() }
PathStepIter iterates node paths, optionally stepping to avoid visiting children of some nodes.
func NewOrderPathIter ¶
func NewOrderPathIter(p Path) PathStepIter
NewOrderPathIter creates an iterator that iterates all tree nodes (by default it will use preorder traversal but will switch to inorder or postorder if the Infix and Postfix roles are found).
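The Next/Step contract can be illustrated with a small stack-based iterator. This is only a sketch under stated assumptions: `preorderIter`, `newPreorderIter`, `node` and `visit` are hypothetical names, the local `Node` and `Path` types are reduced stand-ins, and only preorder traversal is shown (the switch to inorder or postorder for Infix and Postfix roles is not modelled).

```go
package main

import (
	"fmt"
	"strings"
)

// Node and Path are reduced stand-ins for the package's types.
type Node struct {
	InternalType string
	Children     []*Node
}

type Path []*Node

// preorderIter visits paths in preorder using an explicit stack.
type preorderIter struct {
	stack  []Path // paths waiting to be visited; the top is the last element
	last   Path   // path most recently returned by Next
	expand bool   // whether the children of last still have to be pushed
}

func newPreorderIter(root *Node) *preorderIter {
	it := &preorderIter{}
	if root != nil {
		it.stack = []Path{{root}}
	}
	return it
}

// Next returns the next path, or nil when the traversal is finished.
func (it *preorderIter) Next() Path {
	if it.expand {
		n := it.last[len(it.last)-1]
		// Push children in reverse so the first child is visited first.
		for i := len(n.Children) - 1; i >= 0; i-- {
			child := make(Path, len(it.last)+1)
			copy(child, it.last)
			child[len(it.last)] = n.Children[i]
			it.stack = append(it.stack, child)
		}
		it.expand = false
	}
	if len(it.stack) == 0 {
		return nil
	}
	it.last = it.stack[len(it.stack)-1]
	it.stack = it.stack[:len(it.stack)-1]
	it.expand = true
	return it.last
}

// Step prevents the children of the last returned node from being visited.
func (it *preorderIter) Step() { it.expand = false }

// node is a small constructor used to build example trees.
func node(name string, children ...*Node) *Node {
	return &Node{InternalType: name, Children: children}
}

// visit walks the tree and calls Step on nodes whose type equals skip.
func visit(root *Node, skip string) string {
	var seen []string
	it := newPreorderIter(root)
	for p := it.Next(); p != nil; p = it.Next() {
		n := p[len(p)-1]
		seen = append(seen, n.InternalType)
		if n.InternalType == skip {
			it.Step()
		}
	}
	return strings.Join(seen, " ")
}

func main() {
	tree := node("a", node("b", node("d")), node("c"))
	fmt.Println(visit(tree, ""))  // a b d c
	fmt.Println(visit(tree, "b")) // a b c
}
```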
type Position ¶
type Position struct { // Offset is the position as an absolute byte offset. It is a 0-based // index. Offset uint32 // Line is the line number. It is a 1-based index. Line uint32 // Col is the column number (the byte offset of the position relative to // a line). It is a 1-based index. Col uint32 }
Position represents a position in a source code file.
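The indexing conventions in the field comments (0-based Offset, 1-based Line and Col, with Col counted in bytes) can be checked with a short sketch. `positionAt` is a hypothetical helper, not part of the package, and the local `Position` type simply mirrors the struct above; treating only `\n` as a line terminator is an assumption.

```go
package main

import "fmt"

// Position mirrors the struct above for illustration.
type Position struct {
	Offset uint32 // 0-based absolute byte offset
	Line   uint32 // 1-based line number
	Col    uint32 // 1-based byte offset within the line
}

// positionAt computes the Position of the byte at offset off in src.
// Offsets outside the source are clamped to its bounds.
func positionAt(src string, off int) Position {
	if off < 0 {
		off = 0
	}
	if off > len(src) {
		off = len(src)
	}
	line, col := 1, 1
	for i := 0; i < off; i++ {
		if src[i] == '\n' {
			line++
			col = 1
		} else {
			col++
		}
	}
	return Position{Offset: uint32(off), Line: uint32(line), Col: uint32(col)}
}

func main() {
	src := "ab\ncd"
	fmt.Println(positionAt(src, 0)) // {0 1 1}
	fmt.Println(positionAt(src, 4)) // {4 2 2}
}
```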
func (*Position) Descriptor ¶
func (*Position) ProtoMessage ¶
func (*Position) ProtoMessage()
type Role ¶
type Role int16
Role is the main UAST annotation. It indicates that a node in an AST can be interpreted as acting with certain language-independent role.
const ( // Invalid Role is assigned as a zero value since protobuf enum definition must start at 0. Invalid Role = iota // Identifier is any form of identifier, used for variable names, functions, packages, etc. Identifier // Qualified is a kind of property identifiers may have, when it's composed // of multiple simple identifiers. Qualified // Operator is any form of operator. Operator // Binary is any form of binary operator, in contrast with unary operators. Binary // Unary is any form of unary operator, in contrast with binary operators. Unary // Left is a left hand side in a binary expression. Left // Right is a right hand side of a binary expression. Right // Infix should mark the nodes which are parents of expression nodes using infix notation, e.g.: a+b. // Nodes without Infix or Postfix mark are considered in prefix order by default. Infix // Postfix should mark the nodes which are parents of nodes using postfix notation, e.g.: ab+. // Nodes without Infix or Postfix mark are considered in prefix order by default. Postfix // Bitwise is any form of bitwise operation. Bitwise // Boolean is any form of boolean operation. Boolean // Unsigned is a form of unsigned operation. Unsigned // LeftShift is a left shift operation (i.e. `<<`, `rol`, etc.) LeftShift // RightShift is a right shift operation (i.e. `>>`, `ror`, etc.) RightShift // Or is an OR operation (i.e. `||`, `or`, `|`, etc.) Or // Xor is an exclusive OR operation (i.e. `~`, `^`, etc.) Xor // And is an AND operation (i.e. `&&`, `&`, `and`, etc.) And // Expression is a construct computed to produce some value. Expression // Statement is some action to be carried out. Statement // Equal is an equality predicate (i.e. `=`, `==`, etc.) Equal // Not is a negation operation. It may be used to annotate a complement of an operator. Not // LessThan is a comparison predicate that checks if the lhs value is smaller than the rhs value (i. e. `<`.) 
LessThan // LessThanOrEqual is a comparison predicate that checks if the lhs value is smaller or equal to the rhs value (i.e. `<=`.) LessThanOrEqual // GreaterThan is a comparison predicate that checks if the lhs value is greater than the rhs value (i. e. `>`.) GreaterThan // GreaterThanOrEqual is a comparison predicate that checks if the lhs value is greater than or equal to the rhs value (i.e. `>=`.) GreaterThanOrEqual // Identical is an identity predicate (i. e. `===`, `is`, etc.) Identical // Contains is a membership predicate that checks if the lhs value is a member of the rhs container (i.e. `in` in Python.) Contains // Increment is an arithmetic operator that increments a value (i. e. `++i`.) Increment // Decrement is an arithmetic operator that decrements a value (i. e. `--i`.) Decrement // Negative is an arithmetic operator that negates a value (i.e. `-x`.) Negative // Positive is an arithmetic operator that makes a value positive. It's usually redundant (i.e. `+x`.) Positive // Dereference is an operation that gets the actual value of a pointer or reference (i.e. `*x`.) Dereference // TakeAddress is an operation that gets the memory address of a value (i. e. `&x`.) TakeAddress // File is the root node of a single file AST. File // Add is an arithmetic operator (i.e. `+`.) Add // Substract is an arithmetic operator (i.e. `-`.) Substract // Multiply is an arithmetic operator (i.e. `*`.) Multiply // Divide is an arithmetic operator (i.e. `/`.) Divide // Modulo is an arithmetic operator (i.e. `%`, `mod`, etc.) Modulo // Package indicates a package level property. Package // Declaration is a construct to specify properties of an identifier. Declaration // Import indicates an import level property. Import // Pathname is a qualified name of some construct. Pathname // Alias is an alternative name for some construct. Alias // Function is a sequence of instructions packaged as a unit. Function // Body is a sequence of instructions in a block. 
Body // Name is an identifier used to reference a value. Name // Receiver is the target of a construct (message, function, etc.) Receiver // Argument is a variable used as input/output in a function. Argument // Value is an expression that cannot be evaluated any further. Value // ArgsList is a variable number of arguments (i.e. `...`, `Object...`, `*args`, etc.) ArgsList // Base is the parent type of which another type inherits. Base // Implements is the type (usually an interface) that another type implements. Implements // Instance is a concrete occurrence of an object. Instance // Subtype is a type that can be used to substitute another type. Subtype // Subpackage is a package that is below another package in the hierarchy. Subpackage // Module is a set of functionality grouped. Module // Friend is an access granter for some private resources. Friend // World is a set of every component. World // If is used for if-then[-else] statements or expressions. // An if-then tree will look like: // // If, Statement { // **[non-If nodes] { // If, Condition { // [...] // } // } // **[non-If* nodes] { // If, Then { // [...] // } // } // **[non-If* nodes] { // If, Else { // [...] // } // } // } // // The Else node is optional. The order of Condition, Then and // Else is not defined. If // Condition is a condition in an IfStatement or IfExpression. Condition // Then is the clause executed when the Condition is true. Then // Else is the clause executed when the Condition is false. Else // Switch is used to represent a broad range of switch flavors. An expression // is evaluated and then compared to the values returned by different // case expressions, executing a body associated to the first case that // matches. Similar constructions that go beyond expression comparison // (such as pattern matching in Scala's match) should not be annotated // with Switch. Switch // Case is a clause whose expression is compared with the condition. 
Case // Default is a clause that is called when no other clause matches. Default // For is a loop with an initialization, a condition, an update and a body. For // Initialization is the assignment of an initial value to a variable // (i.e. a for loop variable initialization.) Initialization // Update is the assignment of a new value to a variable // (i.e. a for loop variable update.) Update // Iterator is the element that iterates over something. Iterator // While is a loop construct with a condition and a body. While // DoWhile is a loop construct with a body and a condition. DoWhile // Break is a construct for early exiting a block. Break // Continue is a construct for continuation with the next iteration of a loop. Continue // Goto is an unconditional transfer of control statement. Goto // Block is a group of statements. If the source language has block scope, // it should be annotated both with Block and BlockScope. Block // Scope is a range in which a variable can be referred. Scope // Return is a return statement. It might have a child expression or not // as with naked returns in Go or return in void methods in Java. Return // Try is a statement for exception handling. Try // Catch is a clause to capture exceptions. Catch // Finally is a clause for a block executed after a block with exception handling. Finally // Throw is a statement that creates an exception. Throw // Assert checks if an expression is true and if it is not, it signals // an error/exception, possibly stopping the execution. Assert // Call is any call, whether it is a function, procedure, method or macro. // In its simplest form, a call will have a single child with a function // name (callee). Arguments are marked with Argument and Positional or Name. // In OO languages there is usually a Receiver too. Call // Callee is the callable being called. It might be the name of a // function or procedure, it might be a method, it might be a simple name // or qualified with a namespace. 
Callee // Positional is an element whose position has meaning (i.e. a positional argument in a call). Positional // Noop is a construct that does nothing. Noop // Literal is a literal value. Literal // Byte is a single-byte element. Byte // ByteString is a raw byte string. ByteString // Character is an encoded character. Character // List is a sequence. List // Map is a collection of key, value pairs. Map // Null is an empty value. Null // Number is a numeric value. This applies to any numeric value // whether it is integer or float, any base, scientific notation or not, // etc. Number // Regexp is a regular expression. Regexp // Set is a collection of values. Set // String is a sequence of characters. String // Tuple is a finite ordered sequence of elements. Tuple // Type is a classification of data. Type // Entry is a collection element. Entry // Key is the index value of a map. Key // Primitive is a language builtin. Primitive // Assignment is an assignment operator. Assignment // This represents the self-reference of an object instance in // one of its methods. This corresponds to the `this` keyword // (e.g. Java, C++, PHP), `self` (e.g. Smalltalk, Perl, Swift) and `Me` // (e.g. Visual Basic). This // Comment is a code comment. Comment // Documentation is a node that represents documentation of another node, // such as function or package. Documentation is usually in the form of // a string in certain position (e.g. Python docstring) or comment // (e.g. Javadoc, godoc). Documentation // Whitespace. Whitespace // Incomplete expresses that the semantic meaning of the node roles doesn't express // the full semantic information. Added in BIP-002. Incomplete // Unannotated will be automatically added by the SDK for nodes that did not receive // any annotations with the current version of the driver's `annotations.go` file. // Added in BIP-002. 
Unannotated // Visibility is an access granter role, usually together with a specifier role Visibility // Annotation is syntactic metadata Annotation // Anonymous is an unbound construct Anonymous // Enumeration is a distinct type that represents a set of named constants Enumeration // Arithmetic is a type of operation Arithmetic // Relational is a type of operation Relational // Variable is a symbolic name associated with a value Variable )
func (Role) EnumDescriptor ¶
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
ann | Package ann provides a DSL to annotate UAST. |