Documentation ¶
Overview ¶
Package inline implements inlining of Go function calls.
The client provides information about the caller and callee, including the source text, syntax tree, and type information, and the inliner returns the modified source file for the caller, or an error if the inlining operation is invalid (for example because the function body refers to names that are inaccessible to the caller).
Although this interface demands more information from the client than might seem necessary, it enables smoother integration with existing batch and interactive tools that have their own ways of managing the processes of reading, parsing, and type-checking packages. In particular, this package does not assume that the caller and callee belong to the same token.FileSet or types.Importer realms.
There are many aspects to a function call. It is the only construct that can simultaneously bind multiple variables of different explicit types, with implicit assignment conversions. (Neither var nor := declarations can do that.) It defines the scope of control labels, of return statements, and of defer statements. Arguments and results of function calls may be tuples even though tuples are not first-class values in Go, and a tuple-valued call expression may be "spread" across the argument list of a call or the operands of a return statement. All these unique features mean that in the general case, not everything that can be expressed by a function call can be expressed without one.
So, in general, inlining consists of modifying a function or method call expression f(a1, ..., an) so that the name of the function f is replaced ("literalized") by a literal copy of the function declaration, with free identifiers suitably modified to use the locally appropriate identifiers or perhaps constant argument values.
Inlining must not change the semantics of the call. Semantics preservation is crucial for clients such as codebase maintenance tools that automatically inline all calls to designated functions on a large scale. Such tools must not introduce subtle behavior changes. (Fully inlining a call is dynamically observable using reflection over the call stack, but this exception to the rule is explicitly allowed.)
In many cases it is possible to entirely replace ("reduce") the call by a copy of the function's body in which parameters have been replaced by arguments. The inliner supports a number of reduction strategies, and we expect this set to grow. Nonetheless, sound reduction is surprisingly tricky.
The inliner is in some ways like an optimizing compiler. A compiler is considered correct if it doesn't change the meaning of the program in translation from source language to target language. An optimizing compiler exploits the particulars of the input to generate better code, where "better" usually means more efficient. When a case is found in which it emits suboptimal code, the compiler is improved to recognize more cases, or more rules, and more exceptions to rules; this process has no end. Inlining is similar except that "better" code means tidier code. The baseline translation (literalization) is correct, but there are endless rules--and exceptions to rules--by which the output can be improved.
The following section lists some of the challenges, and ways in which they can be addressed.
All effects of the call argument expressions must be preserved, both in their number (they must not be eliminated or repeated), and in their order (both with respect to other arguments, and any effects in the callee function).
This must be the case even if the corresponding parameters are never referenced, are referenced multiple times, referenced in a different order from the arguments, or referenced within a nested function that may be executed an arbitrary number of times.
Currently, parameter replacement is not applied to arguments with effects, but with further analysis of the sequence of strict effects within the callee we could relax this constraint.
When not all parameters can be substituted by their arguments (e.g. due to possible effects), if the call appears in a statement context, the inliner may introduce a var declaration that declares the parameter variables (with the correct types) and assigns them to their corresponding argument values. The rest of the function body may then follow. For example, the call
f(1, 2)
to the function
func f(x, y int32) { stmts }
may be reduced to
{ var x, y int32 = 1, 2; stmts }.
There are many reasons why this is not always possible. For example, true parameters are statically resolved in the same scope, and are dynamically assigned their arguments in parallel; but each spec in a var declaration is statically resolved in sequence and dynamically executed in sequence, so earlier parameters may shadow references in later ones.
Even an argument expression as simple as ptr.x may not be referentially transparent, because another argument may have the effect of changing the value of ptr.
This constraint could be relaxed by some kind of alias or escape analysis that proves that ptr cannot be mutated during the call.
Although constants are referentially transparent, as a matter of style we do not wish to duplicate literals that are referenced multiple times in the body because this undoes proper factoring. Also, string literals may be arbitrarily large.
If the function body consists of statements other than just "return expr", in some contexts it may be syntactically impossible to reduce the call. Consider:
if x := f(); cond { ... }
Go has no equivalent to Lisp's progn or Rust's blocks, nor ML's let expressions (let param = arg in body); its closest equivalent is func(param){body}(arg). Reduction strategies must therefore consider the syntactic context of the call.
In such situations we could work harder to extract a statement context for the call, by transforming it to:
{ x := f(); if cond { ... } }
Similarly, without the equivalent of Rust-style blocks and first-class tuples, there is no general way to reduce a call to a function such as
func(params)(args)(results) { stmts; return expr }
to an expression such as
{ var params = args; stmts; expr }
or even a statement such as
results = { var params = args; stmts; expr }
Consequently the declaration and scope of the result variables, and the assignment and control-flow implications of the return statement, must be dealt with by cases.
A standalone call statement that calls a function whose body is "return expr" cannot be simply replaced by the body expression if it is not itself a call or channel receive expression; it is necessary to explicitly discard the result using "_ = expr".
Similarly, if the body is a call expression, only calls to some built-in functions with no result (such as copy or panic) are permitted as statements, whereas others (such as append) return a result that must be used, even if just by discarding.
If a parameter or result variable is updated by an assignment within the function body, it cannot always be safely replaced by a variable in the caller. For example, given
func f(a int) int { a++; return a }
The call y = f(x) cannot be replaced by { x++; y = x } because this would change the value of the caller's variable x. Only if the caller is finished with x is this safe.
A similar argument applies to parameter or result variables that escape: by eliminating a variable, inlining would change the identity of the variable that escapes.
If the function body uses 'defer' and the inlined call is not a tail-call, inlining may delay the deferred effects.
Because the scope of a control label is the entire function, a call cannot be reduced if the caller and callee have intersecting sets of control labels. (It is possible to α-rename any conflicting ones, but our colleagues building C++ refactoring tools report that, when tools must choose new identifiers, they generally do a poor job.)
Given
func f() uint8 { return 0 }
var x any = f()
reducing the call to var x any = 0 is unsound because it discards the implicit conversion to uint8. We may need to make each argument-to-parameter conversion explicit if the types differ. Assignments to variadic parameters may need to explicitly construct a slice.
An analogous problem applies to the implicit assignments in return statements:
func g() any { return f() }
Replacing the call f() with 0 would silently lose a conversion to uint8 and change the behavior of the program.
When inlining a call f(1, x, g()) where those parameters are unreferenced, we should be able to avoid evaluating 1 and x since they are pure and thus have no effect. But x may be the last reference to a local variable in the caller, so removing it would cause a compilation error. Parameter substitution must avoid making the caller's local variables unreferenced (or must be prepared to eliminate the declaration too---this is where an iterative framework for simplification would really help).
An expression such as s[i] may be valid if s and i are variables but invalid if either or both of them are constants. For example, a negative constant index s[-1] is always out of bounds, and even a non-negative constant index may be out of bounds depending on the particular string constant (e.g. "abc"[4]).
So, if a parameter participates in any expression that is subject to additional compile-time checks when its operands are constant, it may be unsafe to substitute that parameter by a constant argument value (#62664).
More complex callee functions are inlinable with more elaborate and invasive changes to the statements surrounding the call expression.
TODO(adonovan): future work:
Handle more of the above special cases by careful analysis, thoughtful factoring of the large design space, and thorough test coverage.
Compute precisely (not conservatively) when parameter substitution would remove the last reference to a caller local variable, and blank out the local instead of retreating from the substitution.
Afford the client more control such as a limit on the total increase in line count, or a refusal to inline using the general approach (replacing name by function literal). This could be achieved by returning metadata alongside the result and having the client conditionally discard the change.
Support inlining of generic functions, replacing type parameters by their instantiations.
Support inlining of calls to function literals ("closures"). But note that the existing algorithm makes widespread assumptions that the callee is a package-level function or method.
Eliminate explicit conversions of "untyped" literals inserted conservatively when they are redundant. For example, the conversion int32(1) is redundant when this value is used only as a slice index; but it may be crucial if it is used in x := int32(1) as it changes the type of x, which may have further implications. The conversions may also be important to the falcon analysis.
Allow non-'go' build systems such as Bazel/Blaze a chance to decide whether an import is accessible using logic other than "/internal/" path segments. This could be achieved by returning the list of added import paths instead of a text diff.
Inlining a function from another module may change the effective version of the Go language spec that governs it. We should probably make the client responsible for rejecting attempts to inline from newer callees to older callers, since there's no way for this package to access module versions.
Use an alternative implementation of the import-organizing operation that doesn't require operating on a complete file (and reformatting). Then return the results in a higher-level form as a set of import additions and deletions plus a single diff that encloses the call expression. This interface could perhaps be implemented atop imports.Process by post-processing its result to obtain the abstract import changes and discarding its formatted output.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Inline ¶
Inline inlines the called function (callee) into the function call (caller) and returns the updated, formatted content of the caller source file.
Inline does not mutate any public fields of Caller or Callee.
The log records the decision-making process.
TODO(adonovan): provide an API for clients that want structured output: a list of import additions and deletions plus one or more localized diffs (or even AST transformations, though ownership and mutation are tricky) near the call site.
Types ¶
type Callee ¶
type Callee struct {
// contains filtered or unexported fields
}
A Callee holds information about an inlinable function. Gob-serializable.
func AnalyzeCallee ¶
func AnalyzeCallee(logf func(string, ...any), fset *token.FileSet, pkg *types.Package, info *types.Info, decl *ast.FuncDecl, content []byte) (*Callee, error)
AnalyzeCallee analyzes a function that is a candidate for inlining and returns a Callee that describes it. The Callee object, which is serializable, can be passed to one or more subsequent calls to Inline, each with a different Caller.
This design allows separate analysis of callers and callees in the golang.org/x/tools/go/analysis framework: the inlining information about a callee can be recorded as a "fact".
The content should be the actual input to the compiler, not the apparent source file according to any //line directives that may be present within it.
type Caller ¶
type Caller struct { Fset *token.FileSet Types *types.Package Info *types.Info File *ast.File Call *ast.CallExpr Content []byte // source of file containing // contains filtered or unexported fields }
A Caller describes the function call and its enclosing context.
The client is responsible for populating this struct and passing it to Inline.