pointer

package
v0.0.0-...-f056e37 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 11, 2022 License: BSD-3-Clause Imports: 24 Imported by: 0

Documentation

Overview

Package pointer implements Andersen's analysis, an inclusion-based pointer analysis algorithm first described in (Andersen, 1994).

A pointer analysis relates every pointer expression in a whole program to the set of memory locations to which it might point. This information can be used to construct a call graph of the program that precisely represents the destinations of dynamic function and method calls. It can also be used to determine, for example, which pairs of channel operations operate on the same channel.

The package allows the client to request a set of expressions of interest for which the points-to information will be returned once the analysis is complete. In addition, the client may request that a callgraph is constructed. The example program in example_test.go demonstrates both of these features. Clients should not request more information than they need since it may increase the cost of the analysis significantly.

CLASSIFICATION

Our algorithm is INCLUSION-BASED: the points-to sets for x and y will be related by pts(y) ⊇ pts(x) if the program contains the statement y = x.

It is FLOW-INSENSITIVE: it ignores all control flow constructs and the order of statements in a program. It is therefore a "MAY ALIAS" analysis: its facts are of the form "P may/may not point to L", not "P must point to L".

It is FIELD-SENSITIVE: it builds separate points-to sets for distinct fields, such as x and y in struct { x, y *int }.

It is mostly CONTEXT-INSENSITIVE: most functions are analyzed once, so values can flow in at one call to the function and return out at another. Only some smaller functions are analyzed with consideration of their calling context.

It has a CONTEXT-SENSITIVE HEAP: objects are named by both allocation site and context, so the objects returned by two distinct calls to f:

func f() *T { return new(T) }

are distinguished up to the limits of the calling context.

It is a WHOLE PROGRAM analysis: it requires SSA-form IR for the complete Go program and summaries for native code.

See the (Hind, PASTE'01) survey paper for an explanation of these terms.

SOUNDNESS

The analysis is fully sound when invoked on pure Go programs that do not use reflection or unsafe.Pointer conversions. In other words, if there is any possible execution of the program in which pointer P may point to object O, the analysis will report that fact.

REFLECTION

By default, the "reflect" library is ignored by the analysis, as if all its functions were no-ops, but if the client enables the Reflection flag, the analysis will make a reasonable attempt to model the effects of calls into this library. However, this comes at a significant performance cost, and not all features of that library are yet implemented. In addition, some simplifying approximations must be made to ensure that the analysis terminates; for example, reflection can be used to construct an infinite set of types and values of those types, but the analysis arbitrarily bounds the depth of such types.

Most but not all reflection operations are supported. In particular, addressable reflect.Values are not yet implemented, so operations such as (reflect.Value).Set have no analytic effect.

UNSAFE POINTER CONVERSIONS

The pointer analysis makes no attempt to understand aliasing between the operand x and result y of an unsafe.Pointer conversion:

y = (*T)(unsafe.Pointer(x))

It is as if the conversion allocated an entirely new object:

y = new(T)

NATIVE CODE

The analysis cannot model the aliasing effects of functions written in languages other than Go, such as runtime intrinsics in C or assembly, or code accessed via cgo. The result is as if such functions are no-ops. However, various important intrinsics are understood by the analysis, along with built-ins such as append.

The analysis currently provides no way for users to specify the aliasing effects of native code.

------------------------------------------------------------------------

IMPLEMENTATION

The remaining documentation is intended for package maintainers and pointer analysis specialists. Maintainers should have a solid understanding of the referenced papers (especially those by H&L and PKH) before making making significant changes.

The implementation is similar to that described in (Pearce et al, PASTE'04). Unlike many algorithms which interleave constraint generation and solving, constructing the callgraph as they go, this implementation for the most part observes a phase ordering (generation before solving), with only simple (copy) constraints being generated during solving. (The exception is reflection, which creates various constraints during solving as new types flow to reflect.Value operations.) This improves the traction of presolver optimisations, but imposes certain restrictions, e.g. potential context sensitivity is limited since all variants must be created a priori.

TERMINOLOGY

A type is said to be "pointer-like" if it is a reference to an object. Pointer-like types include pointers and also interfaces, maps, channels, functions and slices.

We occasionally use C's x->f notation to distinguish the case where x is a struct pointer from x.f where is a struct value.

Pointer analysis literature (and our comments) often uses the notation dst=*src+offset to mean something different than what it means in Go. It means: for each node index p in pts(src), the node index p+offset is in pts(dst). Similarly *dst+offset=src is used for store constraints and dst=src+offset for offset-address constraints.

NODES

Nodes are the key datastructure of the analysis, and have a dual role: they represent both constraint variables (equivalence classes of pointers) and members of points-to sets (things that can be pointed at, i.e. "labels").

Nodes are naturally numbered. The numbering enables compact representations of sets of nodes such as bitvectors (or BDDs); and the ordering enables a very cheap way to group related nodes together. For example, passing n parameters consists of generating n parallel constraints from caller+i to callee+i for 0<=i<n.

The zero nodeid means "not a pointer". For simplicity, we generate flow constraints even for non-pointer types such as int. The pointer equivalence (PE) presolver optimization detects which variables cannot point to anything; this includes not only all variables of non-pointer types (such as int) but also variables of pointer-like types if they are always nil, or are parameters to a function that is never called.

Each node represents a scalar part of a value or object. Aggregate types (structs, tuples, arrays) are recursively flattened out into a sequential list of scalar component types, and all the elements of an array are represented by a single node. (The flattening of a basic type is a list containing a single node.)

Nodes are connected into a graph with various kinds of labelled edges: simple edges (or copy constraints) represent value flow. Complex edges (load, store, etc) trigger the creation of new simple edges during the solving phase.

OBJECTS

Conceptually, an "object" is a contiguous sequence of nodes denoting an addressable location: something that a pointer can point to. The first node of an object has a non-nil obj field containing information about the allocation: its size, context, and ssa.Value.

Objects include:

  • functions and globals;
  • variable allocations in the stack frame or heap;
  • maps, channels and slices created by calls to make();
  • allocations to construct an interface;
  • allocations caused by conversions, e.g. []byte(str).
  • arrays allocated by calls to append();

Many objects have no Go types. For example, the func, map and chan type kinds in Go are all varieties of pointers, but their respective objects are actual functions (executable code), maps (hash tables), and channels (synchronized queues). Given the way we model interfaces, they too are pointers to "tagged" objects with no Go type. And an *ssa.Global denotes the address of a global variable, but the object for a Global is the actual data. So, the types of an ssa.Value that creates an object is "off by one indirection": a pointer to the object.

The individual nodes of an object are sometimes referred to as "labels".

For uniformity, all objects have a non-zero number of fields, even those of the empty type struct{}. (All arrays are treated as if of length 1, so there are no empty arrays. The empty tuple is never address-taken, so is never an object.)

TAGGED OBJECTS

An tagged object has the following layout:

T          -- obj.flags ⊇ {otTagged}
v
...

The T node's typ field is the dynamic type of the "payload": the value v which follows, flattened out. The T node's obj has the otTagged flag.

Tagged objects are needed when generalizing across types: interfaces, reflect.Values, reflect.Types. Each of these three types is modelled as a pointer that exclusively points to tagged objects.

Tagged objects may be indirect (obj.flags ⊇ {otIndirect}) meaning that the value v is not of type T but *T; this is used only for reflect.Values that represent lvalues. (These are not implemented yet.)

ANALYSIS ABSTRACTION OF EACH TYPE

Variables of the following "scalar" types may be represented by a single node: basic types, pointers, channels, maps, slices, 'func' pointers, interfaces.

Pointers:

Nothing to say here, oddly.

Basic types (bool, string, numbers, unsafe.Pointer):

Currently all fields in the flattening of a type, including non-pointer basic types such as int, are represented in objects and values. Though non-pointer nodes within values are uninteresting, non-pointer nodes in objects may be useful (if address-taken) because they permit the analysis to deduce, in this example,

var s struct{ ...; x int; ... }
p := &s.x

that p points to s.x. If we ignored such object fields, we could only say that p points somewhere within s.

All other basic types are ignored. Expressions of these types have zero nodeid, and fields of these types within aggregate other types are omitted.

unsafe.Pointers are not modelled as pointers, so a conversion of an unsafe.Pointer to *T is (unsoundly) treated equivalent to new(T).

Channels:

An expression of type 'chan T' is a kind of pointer that points exclusively to channel objects, i.e. objects created by MakeChan (or reflection).

'chan T' is treated like *T. *ssa.MakeChan is treated as equivalent to new(T). *ssa.Send and receive (*ssa.UnOp(ARROW)) and are equivalent to store

and load.

Maps:

An expression of type 'map[K]V' is a kind of pointer that points exclusively to map objects, i.e. objects created by MakeMap (or reflection).

map K[V] is treated like *M where M = struct{k K; v V}. *ssa.MakeMap is equivalent to new(M). *ssa.MapUpdate is equivalent to *y=x where *y and x have type M. *ssa.Lookup is equivalent to y=x.v where x has type *M.

Slices:

A slice []T, which dynamically resembles a struct{array *T, len, cap int}, is treated as if it were just a *T pointer; the len and cap fields are ignored.

*ssa.MakeSlice is treated like new([1]T): an allocation of a

singleton array.

*ssa.Index on a slice is equivalent to a load. *ssa.IndexAddr on a slice returns the address of the sole element of the slice, i.e. the same address. *ssa.Slice is treated as a simple copy.

Functions:

An expression of type 'func...' is a kind of pointer that points exclusively to function objects.

A function object has the following layout:

identity         -- typ:*types.Signature; obj.flags ⊇ {otFunction}
params_0         -- (the receiver, if a method)
...
params_n-1
results_0
...
results_m-1

There may be multiple function objects for the same *ssa.Function due to context-sensitive treatment of some functions.

The first node is the function's identity node. Associated with every callsite is a special "targets" variable, whose pts() contains the identity node of each function to which the call may dispatch. Identity words are not otherwise used during the analysis, but we construct the call graph from the pts() solution for such nodes.

The following block of contiguous nodes represents the flattened-out types of the parameters ("P-block") and results ("R-block") of the function object.

The treatment of free variables of closures (*ssa.FreeVar) is like that of global variables; it is not context-sensitive. *ssa.MakeClosure instructions create copy edges to Captures.

A Go value of type 'func' (i.e. a pointer to one or more functions) is a pointer whose pts() contains function objects. The valueNode() for an *ssa.Function returns a singleton for that function.

Interfaces:

An expression of type 'interface{...}' is a kind of pointer that points exclusively to tagged objects. All tagged objects pointed to by an interface are direct (the otIndirect flag is clear) and concrete (the tag type T is not itself an interface type). The associated ssa.Value for an interface's tagged objects may be an *ssa.MakeInterface instruction, or nil if the tagged object was created by an instrinsic (e.g. reflection).

Constructing an interface value causes generation of constraints for all of the concrete type's methods; we can't tell a priori which ones may be called.

TypeAssert y = x.(T) is implemented by a dynamic constraint triggered by each tagged object O added to pts(x): a typeFilter constraint if T is an interface type, or an untag constraint if T is a concrete type. A typeFilter tests whether O.typ implements T; if so, O is added to pts(y). An untagFilter tests whether O.typ is assignable to T,and if so, a copy edge O.v -> y is added.

ChangeInterface is a simple copy because the representation of tagged objects is independent of the interface type (in contrast to the "method tables" approach used by the gc runtime).

y := Invoke x.m(...) is implemented by allocating contiguous P/R blocks for the callsite and adding a dynamic rule triggered by each tagged object added to pts(x). The rule adds param/results copy edges to/from each discovered concrete method.

(Q. Why do we model an interface as a pointer to a pair of type and value, rather than as a pair of a pointer to type and a pointer to value? A. Control-flow joins would merge interfaces ({T1}, {V1}) and ({T2}, {V2}) to make ({T1,T2}, {V1,V2}), leading to the infeasible and type-unsafe combination (T1,V2). Treating the value and its concrete type as inseparable makes the analysis type-safe.)

Type parameters:

Type parameters are not directly supported by the analysis. Calls to generic functions will be left as if they had empty bodies. Users of the package are expected to use the ssa.InstantiateGenerics builder mode when building code that uses or depends on code containing generics.

reflect.Value:

A reflect.Value is modelled very similar to an interface{}, i.e. as a pointer exclusively to tagged objects, but with two generalizations.

1. a reflect.Value that represents an lvalue points to an indirect (obj.flags ⊇ {otIndirect}) tagged object, which has a similar layout to an tagged object except that the value is a pointer to the dynamic type. Indirect tagged objects preserve the correct aliasing so that mutations made by (reflect.Value).Set can be observed.

Indirect objects only arise when an lvalue is derived from an rvalue by indirection, e.g. the following code:

type S struct { X T }
var s S
var i interface{} = &s    // i points to a *S-tagged object (from MakeInterface)
v1 := reflect.ValueOf(i)  // v1 points to same *S-tagged object as i
v2 := v1.Elem()           // v2 points to an indirect S-tagged object, pointing to s
v3 := v2.FieldByName("X") // v3 points to an indirect int-tagged object, pointing to s.X
v3.Set(y)                 // pts(s.X) ⊇ pts(y)

Whether indirect or not, the concrete type of the tagged object corresponds to the user-visible dynamic type, and the existence of a pointer is an implementation detail.

(NB: indirect tagged objects are not yet implemented)

2. The dynamic type tag of a tagged object pointed to by a reflect.Value may be an interface type; it need not be concrete.

This arises in code such as this:

tEface := reflect.TypeOf(new(interface{}).Elem() // interface{}
eface := reflect.Zero(tEface)

pts(eface) is a singleton containing an interface{}-tagged object. That tagged object's payload is an interface{} value, i.e. the pts of the payload contains only concrete-tagged objects, although in this example it's the zero interface{} value, so its pts is empty.

reflect.Type:

Just as in the real "reflect" library, we represent a reflect.Type as an interface whose sole implementation is the concrete type, *reflect.rtype. (This choice is forced on us by go/types: clients cannot fabricate types with arbitrary method sets.)

rtype instances are canonical: there is at most one per dynamic type. (rtypes are in fact large structs but since identity is all that matters, we represent them by a single node.)

The payload of each *rtype-tagged object is an *rtype pointer that points to exactly one such canonical rtype object. We exploit this by setting the node.typ of the payload to the dynamic type, not '*rtype'. This saves us an indirection in each resolution rule. As an optimisation, *rtype-tagged objects are canonicalized too.

Aggregate types:

Aggregate types are treated as if all directly contained aggregates are recursively flattened out.

Structs:

*ssa.Field y = x.f creates a simple edge to y from x's node at f's offset.

*ssa.FieldAddr y = &x->f requires a dynamic closure rule to create

simple edges for each struct discovered in pts(x).

The nodes of a struct consist of a special 'identity' node (whose type is that of the struct itself), followed by the nodes for all the struct's fields, recursively flattened out. A pointer to the struct is a pointer to its identity node. That node allows us to distinguish a pointer to a struct from a pointer to its first field.

Field offsets are logical field offsets (plus one for the identity node), so the sizes of the fields can be ignored by the analysis.

(The identity node is non-traditional but enables the distinction described above, which is valuable for code comprehension tools. Typical pointer analyses for C, whose purpose is compiler optimization, must soundly model unsafe.Pointer (void*) conversions, and this requires fidelity to the actual memory layout using physical field offsets.)

*ssa.Field y = x.f creates a simple edge to y from x's node at f's offset.

*ssa.FieldAddr y = &x->f requires a dynamic closure rule to create

simple edges for each struct discovered in pts(x).

Arrays:

We model an array by an identity node (whose type is that of the array itself) followed by a node representing all the elements of the array; the analysis does not distinguish elements with different indices. Effectively, an array is treated like struct{elem T}, a load y=x[i] like y=x.elem, and a store x[i]=y like x.elem=y; the index i is ignored.

A pointer to an array is pointer to its identity node. (A slice is also a pointer to an array's identity node.) The identity node allows us to distinguish a pointer to an array from a pointer to one of its elements, but it is rather costly because it introduces more offset constraints into the system. Furthermore, sound treatment of unsafe.Pointer would require us to dispense with this node.

Arrays may be allocated by Alloc, by make([]T), by calls to append, and via reflection.

Tuples (T, ...):

Tuples are treated like structs with naturally numbered fields. *ssa.Extract is analogous to *ssa.Field.

However, tuples have no identity field since by construction, they cannot be address-taken.

FUNCTION CALLS

There are three kinds of function call:

  1. static "call"-mode calls of functions.
  2. dynamic "call"-mode calls of functions.
  3. dynamic "invoke"-mode calls of interface methods.

Cases 1 and 2 apply equally to methods and standalone functions.

Static calls:

A static call consists three steps:

  • finding the function object of the callee;
  • creating copy edges from the actual parameter value nodes to the P-block in the function object (this includes the receiver if the callee is a method);
  • creating copy edges from the R-block in the function object to the value nodes for the result of the call.

A static function call is little more than two struct value copies between the P/R blocks of caller and callee:

callee.P = caller.P
caller.R = callee.R

Context sensitivity: Static calls (alone) may be treated context sensitively, i.e. each callsite may cause a distinct re-analysis of the callee, improving precision. Our current context-sensitivity policy treats all intrinsics and getter/setter methods in this manner since such functions are small and seem like an obvious source of spurious confluences, though this has not yet been evaluated.

Dynamic function calls:

Dynamic calls work in a similar manner except that the creation of copy edges occurs dynamically, in a similar fashion to a pair of struct copies in which the callee is indirect:

callee->P = caller.P
caller.R = callee->R

(Recall that the function object's P- and R-blocks are contiguous.)

Interface method invocation:

For invoke-mode calls, we create a params/results block for the callsite and attach a dynamic closure rule to the interface. For each new tagged object that flows to the interface, we look up the concrete method, find its function object, and connect its P/R blocks to the callsite's P/R blocks, adding copy edges to the graph during solving.

Recording call targets:

The analysis notifies its clients of each callsite it encounters, passing a CallSite interface. Among other things, the CallSite contains a synthetic constraint variable ("targets") whose points-to solution includes the set of all function objects to which the call may dispatch.

It is via this mechanism that the callgraph is made available. Clients may also elect to be notified of callgraph edges directly; internally this just iterates all "targets" variables' pts(·)s.

PRESOLVER

We implement Hash-Value Numbering (HVN), a pre-solver constraint optimization described in Hardekopf & Lin, SAS'07. This is documented in more detail in hvn.go. We intend to add its cousins HR and HU in future.

SOLVER

The solver is currently a naive Andersen-style implementation; it does not perform online cycle detection, though we plan to add solver optimisations such as Hybrid- and Lazy- Cycle Detection from (Hardekopf & Lin, PLDI'07).

It uses difference propagation (Pearce et al, SQC'04) to avoid redundant re-triggering of closure rules for values already seen.

Points-to sets are represented using sparse bit vectors (similar to those used in LLVM and gcc), which are more space- and time-efficient than sets based on Go's built-in map type or dense bit vectors.

Nodes are permuted prior to solving so that object nodes (which may appear in points-to sets) are lower numbered than non-object (var) nodes. This improves the density of the set over which the PTSs range, and thus the efficiency of the representation.

Partly thanks to avoiding map iteration, the execution of the solver is 100% deterministic, a great help during debugging.

FURTHER READING

Andersen, L. O. 1994. Program analysis and specialization for the C programming language. Ph.D. dissertation. DIKU, University of Copenhagen.

David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Efficient field-sensitive pointer analysis for C. In Proceedings of the 5th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering (PASTE '04). ACM, New York, NY, USA, 37-42. http://doi.acm.org/10.1145/996821.996835

David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Online Cycle Detection and Difference Propagation: Applications to Pointer Analysis. Software Quality Control 12, 4 (December 2004), 311-337. http://dx.doi.org/10.1023/B:SQJO.0000039791.93071.a2

David Grove and Craig Chambers. 2001. A framework for call graph construction algorithms. ACM Trans. Program. Lang. Syst. 23, 6 (November 2001), 685-746. http://doi.acm.org/10.1145/506315.506316

Ben Hardekopf and Calvin Lin. 2007. The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. In Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI '07). ACM, New York, NY, USA, 290-299. http://doi.acm.org/10.1145/1250734.1250767

Ben Hardekopf and Calvin Lin. 2007. Exploiting pointer and location equivalence to optimize pointer analysis. In Proceedings of the 14th international conference on Static Analysis (SAS'07), Hanne Riis Nielson and Gilberto Filé (Eds.). Springer-Verlag, Berlin, Heidelberg, 265-280.

Atanas Rountev and Satish Chandra. 2000. Off-line variable substitution for scaling points-to analysis. In Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation (PLDI '00). ACM, New York, NY, USA, 47-56. DOI=10.1145/349299.349310 http://doi.acm.org/10.1145/349299.349310

Example

This program demonstrates how to use the pointer analysis to obtain a conservative call-graph of a Go program. It also shows how to compute the points-to set of a variable, in this case, (C).f's ch parameter.

package main

import (
	"fmt"
	"sort"

	"github.com/peske/golang-x-tools/go/callgraph"
	"github.com/peske/golang-x-tools/go/loader"
	"github.com/peske/golang-x-tools/go/pointer"
	"github.com/peske/golang-x-tools/go/ssa"
	"github.com/peske/golang-x-tools/go/ssa/ssautil"
)

func main() {
	const myprog = `
package main

import "fmt"

type I interface {
	f(map[string]int)
}

type C struct{}

func (C) f(m map[string]int) {
	fmt.Println("C.f()")
}

func main() {
	var i I = C{}
	x := map[string]int{"one":1}
	i.f(x) // dynamic method call
}
`
	var conf loader.Config

	// Parse the input file, a string.
	// (Command-line tools should use conf.FromArgs.)
	file, err := conf.ParseFile("myprog.go", myprog)
	if err != nil {
		fmt.Print(err) // parse error
		return
	}

	// Create single-file main package and import its dependencies.
	conf.CreateFromFiles("main", file)

	iprog, err := conf.Load()
	if err != nil {
		fmt.Print(err) // type error in some package
		return
	}

	// Create SSA-form program representation.
	prog := ssautil.CreateProgram(iprog, ssa.InstantiateGenerics)
	mainPkg := prog.Package(iprog.Created[0].Pkg)

	// Build SSA code for bodies of all functions in the whole program.
	prog.Build()

	// Configure the pointer analysis to build a call-graph.
	config := &pointer.Config{
		Mains:          []*ssa.Package{mainPkg},
		BuildCallGraph: true,
	}

	// Query points-to set of (C).f's parameter m, a map.
	C := mainPkg.Type("C").Type()
	Cfm := prog.LookupMethod(C, mainPkg.Pkg, "f").Params[1]
	config.AddQuery(Cfm)

	// Run the pointer analysis.
	result, err := pointer.Analyze(config)
	if err != nil {
		panic(err) // internal error in pointer analysis
	}

	// Find edges originating from the main package.
	// By converting to strings, we de-duplicate nodes
	// representing the same function due to context sensitivity.
	var edges []string
	callgraph.GraphVisitEdges(result.CallGraph, func(edge *callgraph.Edge) error {
		caller := edge.Caller.Func
		if caller.Pkg == mainPkg {
			edges = append(edges, fmt.Sprint(caller, " --> ", edge.Callee.Func))
		}
		return nil
	})

	// Print the edges in sorted order.
	sort.Strings(edges)
	for _, edge := range edges {
		fmt.Println(edge)
	}
	fmt.Println()

	// Print the labels of (C).f(m)'s points-to set.
	fmt.Println("m may point to:")
	var labels []string
	for _, l := range result.Queries[Cfm].PointsTo().Labels() {
		label := fmt.Sprintf("  %s: %s", prog.Fset.Position(l.Pos()), l)
		labels = append(labels, label)
	}
	sort.Strings(labels)
	for _, label := range labels {
		fmt.Println(label)
	}

}
Output:

(main.C).f --> fmt.Println
main.init --> fmt.init
main.main --> (main.C).f

m may point to:
  myprog.go:18:21: makemap

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func CanHaveDynamicTypes

func CanHaveDynamicTypes(T types.Type) bool

CanHaveDynamicTypes reports whether the type T can "hold" dynamic types, i.e. is an interface (incl. reflect.Type) or a reflect.Value.

func CanPoint

func CanPoint(T types.Type) bool

CanPoint reports whether the type T is pointerlike, for the purposes of this analysis.

Types

type Config

type Config struct {
	// Mains contains the set of 'main' packages to analyze
	// Clients must provide the analysis with at least one
	// package defining a main() function.
	//
	// Non-main packages in the ssa.Program that are not
	// dependencies of any main package may still affect the
	// analysis result, because they contribute runtime types and
	// thus methods.
	//
	// TODO(adonovan): investigate whether this is desirable.
	//
	// Calls to generic functions will be unsound unless packages
	// are built using the ssa.InstantiateGenerics builder mode.
	Mains []*ssa.Package

	// Reflection determines whether to handle reflection
	// operators soundly, which is currently rather slow since it
	// causes constraint to be generated during solving
	// proportional to the number of constraint variables, which
	// has not yet been reduced by presolver optimisation.
	Reflection bool

	// BuildCallGraph determines whether to construct a callgraph.
	// If enabled, the graph will be available in Result.CallGraph.
	BuildCallGraph bool

	// The client populates Queries[v] or IndirectQueries[v]
	// for each ssa.Value v of interest, to request that the
	// points-to sets pts(v) or pts(*v) be computed.  If the
	// client needs both points-to sets, v may appear in both
	// maps.
	//
	// (IndirectQueries is typically used for Values corresponding
	// to source-level lvalues, e.g. an *ssa.Global.)
	//
	// The analysis populates the corresponding
	// Result.{Indirect,}Queries map when it creates the pointer
	// variable for v or *v.  Upon completion the client can
	// inspect that map for the results.
	//
	// TODO(adonovan): this API doesn't scale well for batch tools
	// that want to dump the entire solution.  Perhaps optionally
	// populate a map[*ssa.DebugRef]Pointer in the Result, one
	// entry per source expression.
	//
	Queries         map[ssa.Value]struct{}
	IndirectQueries map[ssa.Value]struct{}

	// If Log is non-nil, log messages are written to it.
	// Logging is extremely verbose.
	Log io.Writer
	// contains filtered or unexported fields
}

A Config formulates a pointer analysis problem for Analyze. It is only usable for a single invocation of Analyze and must not be reused.

func (*Config) AddExtendedQuery

func (c *Config) AddExtendedQuery(v ssa.Value, query string) (*Pointer, error)

AddExtendedQuery adds an extended, AST-based query on v to the analysis. The query, which must be a single Go expression, allows destructuring the value.

The query must operate on a variable named 'x', which represents the value, and result in a pointer-like object. Only a subset of Go expressions are permitted in queries, namely channel receives, pointer dereferences, field selectors, array/slice/map/tuple indexing and grouping with parentheses. The specific indices when indexing arrays, slices and maps have no significance. Indices used on tuples must be numeric and within bounds.

All field selectors must be explicit, even ones usually elided due to promotion of embedded fields.

The query 'x' is identical to using AddQuery. The query '*x' is identical to using AddIndirectQuery.

On success, AddExtendedQuery returns a Pointer to the queried value. This Pointer will be initialized during analysis. Using it before analysis has finished has undefined behavior.

Example:

// given v, which represents a function call to 'fn() (int, []*T)', and
// 'type T struct { F *int }', the following query will access the field F.
c.AddExtendedQuery(v, "x[1][0].F")

func (*Config) AddIndirectQuery

func (c *Config) AddIndirectQuery(v ssa.Value)

AddQuery adds v to Config.IndirectQueries. Precondition: CanPoint(v.Type().Underlying().(*types.Pointer).Elem()).

func (*Config) AddQuery

func (c *Config) AddQuery(v ssa.Value)

AddQuery adds v to Config.Queries. Precondition: CanPoint(v.Type()).

type Label

type Label struct {
	// contains filtered or unexported fields
}

A Label is an entity that may be pointed to by a pointer, map, channel, 'func', slice or interface.

Labels include:

  • functions
  • globals
  • tagged objects, representing interfaces and reflect.Values
  • arrays created by conversions (e.g. []byte("foo"), []byte(s))
  • stack- and heap-allocated variables (including composite literals)
  • channels, maps and arrays created by make()
  • intrinsic or reflective operations that allocate (e.g. append, reflect.New)
  • intrinsic objects, e.g. the initial array behind os.Args.
  • and their subelements, e.g. "alloc.y[*].z"

Labels are so varied that they defy good generalizations; some have no value, no callgraph node, or no position. Many objects have types that are inexpressible in Go: maps, channels, functions, tagged objects.

At most one of Value() or ReflectType() may return non-nil.

func (Label) Path

func (l Label) Path() string

Path returns the path to the subelement of the object containing this label. For example, ".x[*].y".

func (Label) Pos

func (l Label) Pos() token.Pos

Pos returns the position of this label, if known, zero otherwise.

func (Label) ReflectType

func (l Label) ReflectType() types.Type

ReflectType returns the type represented by this label if it is an reflect.rtype instance object or *reflect.rtype-tagged object.

func (Label) String

func (l Label) String() string

String returns the printed form of this label.

Examples: Object type:

x                                       (a variable)
(sync.Mutex).Lock                       (a function)
convert                                 (array created by conversion)
makemap                                 (map allocated via make)
makechan                                (channel allocated via make)
makeinterface                           (tagged object allocated by makeinterface)
<alloc in reflect.Zero>                 (allocation in instrinsic)
sync.Mutex                              (a reflect.rtype instance)
<command-line arguments>                (an intrinsic object)

Labels within compound objects have subelement paths:

x.y[*].z                                (a struct variable, x)
append.y[*].z                           (array allocated by append)
makeslice.y[*].z                        (array allocated via make)

TODO(adonovan): expose func LabelString(*types.Package, Label).

func (Label) Value

func (l Label) Value() ssa.Value

Value returns the ssa.Value that allocated this label's object, if any.

type Pointer

type Pointer struct {
	// contains filtered or unexported fields
}

A Pointer is an equivalence class of pointer-like values.

A Pointer doesn't have a unique type because pointers of distinct types may alias the same object.

func (Pointer) DynamicTypes

func (p Pointer) DynamicTypes() *typeutil.Map

DynamicTypes returns p.PointsTo().DynamicTypes().

func (Pointer) MayAlias

func (p Pointer) MayAlias(q Pointer) bool

MayAlias reports whether the receiver pointer may alias the argument pointer.

func (Pointer) PointsTo

func (p Pointer) PointsTo() PointsToSet

PointsTo returns the points-to set of this pointer.

func (Pointer) String

func (p Pointer) String() string

type PointsToSet

type PointsToSet struct {
	// contains filtered or unexported fields
}

A PointsToSet is a set of labels (locations or allocations).

func (PointsToSet) DynamicTypes

func (s PointsToSet) DynamicTypes() *typeutil.Map

If this PointsToSet came from a Pointer of interface kind or a reflect.Value, DynamicTypes returns the set of dynamic types that it may contain. (For an interface, they will always be concrete types.)

The result is a mapping whose keys are the dynamic types to which it may point. For each pointer-like key type, the corresponding map value is the PointsToSet for pointers of that type.

The result is empty unless CanHaveDynamicTypes(T).

func (PointsToSet) Intersects

func (s PointsToSet) Intersects(y PointsToSet) bool

Intersects reports whether this points-to set and the argument points-to set contain common members.

func (PointsToSet) Labels

func (s PointsToSet) Labels() []*Label

PointsTo returns the set of labels that this points-to set contains.

func (PointsToSet) String

func (s PointsToSet) String() string

type Result

type Result struct {
	CallGraph       *callgraph.Graph      // discovered call graph
	Queries         map[ssa.Value]Pointer // pts(v) for each v in Config.Queries.
	IndirectQueries map[ssa.Value]Pointer // pts(*v) for each v in Config.IndirectQueries.
	Warnings        []Warning             // warnings of unsoundness
}

A Result contains the results of a pointer analysis.

See Config for how to request the various Result components.

func Analyze

func Analyze(config *Config) (result *Result, err error)

Analyze runs the pointer analysis with the scope and options specified by config, and returns the (synthetic) root of the callgraph.

Pointer analysis of a transitively closed well-typed program should always succeed. An error can occur only due to an internal bug.

type Warning

type Warning struct {
	Pos     token.Pos
	Message string
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL