Documentation ¶
Index ¶
- func AnalyzeFunc(fn *ir.Func, canInline func(*ir.Func), budgetForFunc func(*ir.Func) int32, ...)
- func BudgetExpansion(maxBudget int32) int32
- func DumpFuncProps(fn *ir.Func, dumpfile string)
- func DumpInlCallSiteScores(profile *pgoir.Profile, ...)
- func Enabled() bool
- func EncodeCallSiteKey(cs *CallSite) string
- func GetCallSiteScore(fn *ir.Func, call *ir.CallExpr) (int, bool)
- func LargestNegativeScoreAdjustment(fn *ir.Func, props *FuncProps) int
- func LargestPositiveScoreAdjustment(fn *ir.Func) int
- func ScoreCalls(fn *ir.Func)
- func ScoreCallsCleanup()
- func SetupScoreAdjustments()
- func ShouldFoldIfNameConstant(n ir.Node, names []*ir.Name) bool
- func TearDown()
- func UnitTesting() bool
- func UpdateCallsiteTable(callerfn *ir.Func, n *ir.CallExpr, ic *ir.InlinedCallExpr)
- type ActualExprPropBits
- type CSPropBits
- type CallSite
- type CallSiteTab
- type FuncPropBits
- type FuncProps
- type ParamPropBits
- type ResultPropBits
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AnalyzeFunc ¶
func AnalyzeFunc(fn *ir.Func, canInline func(*ir.Func), budgetForFunc func(*ir.Func) int32, inlineMaxBudget int)
AnalyzeFunc computes function properties for fn and its contained closures, updating the global 'fpmap' table. It is assumed that "CanInline" has been run on fn and on the closures that feed directly into calls; other closures not directly called will also be checked for inlinability here in case they are returned as a result.
func BudgetExpansion ¶
BudgetExpansion returns the amount to relax/expand the base inlining budget when the new inliner is turned on; the inliner will add the returned value to the hairiness budget.
Background: with the new inliner, the score for a given callsite can be adjusted down by some amount due to heuristics; however, we won't know whether this is going to happen until much later, after the CanInline call. This function returns the amount to relax the budget initially (to allow for a large score adjustment); later on in RevisitInlinability we'll look at each individual function to demote it if needed.
func DumpFuncProps ¶
DumpFuncProps computes and caches function properties for the func 'fn', writing out a description of the previously computed set of properties to the file given in 'dumpfile'. Used for the "-d=dumpinlfuncprops=..." command line flag, intended for use primarily in unit testing.
func DumpInlCallSiteScores ¶
func DumpInlCallSiteScores(profile *pgoir.Profile, budgetCallback func(fn *ir.Func, profile *pgoir.Profile) (int32, bool))
DumpInlCallSiteScores is invoked by the inliner if the debug flag "-d=dumpinlcallsitescores" is set; it dumps out a human-readable summary of all (potentially) inlinable callsites in the package, along with info on call site scoring and the adjustments made to a given score. Here profile is the PGO profile in use (may be nil), budgetCallback is a callback that can be invoked to find out the original pre-adjustment hairiness limit for the function, and inlineHotMaxBudget is the constant of the same name used in the inliner. Sample output lines:
Score  Adjustment  Status    Callee                                                 CallerPos                  ScoreFlags
115    40          DEMOTED   cmd/compile/internal/abi.(*ABIParamAssignment).Offset  expand_calls.go:1679:14|6  panicPathAdj
76     -5n         PROMOTED  runtime.persistentalloc                                mcheckmark.go:48:45|3      inLoopAdj
201    0           --- PGO   unicode.DecodeRuneInString                             utf8.go:312:30|1
7      -5          --- PGO   internal/abi.Name.DataChecked                          type.go:625:22|0           inLoopAdj
In the dump above, "Score" is the final score calculated for the callsite, "Adjustment" is the amount added to or subtracted from the original hairiness estimate to form the score. "Status" shows whether anything changed with the site -- did the adjustment bump it down just below the threshold ("PROMOTED") or instead bump it above the threshold ("DEMOTED"); this will be blank ("---") if no threshold was crossed as a result of the heuristics. Note that "Status" also shows whether PGO was involved. "Callee" is the name of the function called, "CallerPos" is the position of the callsite, and "ScoreFlags" is a digest of the specific properties we used to make adjustments to callsite score via heuristics.
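The relationship between "Score", "Adjustment" and "Status" can be sketched as follows. This is an illustration only: the helper `status` is hypothetical and not part of the package, the budget of 80 is an assumed threshold, and the exact comparison the inliner performs may differ.

```go
package main

import "fmt"

// status is a hypothetical helper (not part of inlheur). It compares the
// pre-adjustment cost and the adjusted score against an assumed budget and
// reports whether the adjustment moved the callsite across the threshold.
func status(cost, adjustment, budget int) string {
	score := cost + adjustment
	switch {
	case cost > budget && score <= budget:
		return "PROMOTED" // adjustment brought the site under the threshold
	case cost <= budget && score > budget:
		return "DEMOTED" // adjustment pushed the site over the threshold
	}
	return "---" // no threshold was crossed
}

func main() {
	const budget = 80 // assumed threshold, for illustration only
	fmt.Println(status(81, -5, budget)) // PROMOTED
	fmt.Println(status(75, 40, budget)) // DEMOTED
	fmt.Println(status(201, 0, budget)) // ---
}
```

With the assumed budget of 80, the first two calls correspond to the PROMOTED (score 76) and DEMOTED (score 115) rows of the sample dump.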
func EncodeCallSiteKey ¶
func GetCallSiteScore ¶
GetCallSiteScore returns the previously calculated score for call within fn.
func LargestNegativeScoreAdjustment ¶
LargestNegativeScoreAdjustment tries to estimate the largest possible negative score adjustment that could be applied to a call of the function with the specified props. Example:
func foo() {                  func bar(x int, p *int) int {
   ...                          if x < 0 { *p = x }
}                               return 99
                              }
Function 'foo' above on the left has no interesting properties, so the most we'll adjust any call to it is the value for "call in loop". If the calculated cost of the function is 150, and the in-loop adjustment is 5 (for example), then there is not much point treating it as inlinable. On the other hand, "bar" has a param property (parameter "x" feeds unmodified to an "if" statement) and a return property (always returns same constant), meaning that a given call _could_ be rescored down by as much as 35 points; thus if the size of "bar" is 100 (for example) then there is at least a chance that scoring will enable inlining.
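The arithmetic in this example can be written out as a small sketch. The helper `couldFit` is hypothetical, and the budget of 80 is an assumed value chosen for illustration; the costs and adjustments are the example numbers from the text.

```go
package main

import "fmt"

// couldFit is a hypothetical helper (not part of inlheur): it reports
// whether the best-case score, i.e. the cost plus the largest possible
// negative adjustment, would fit within the budget.
func couldFit(cost, largestNegAdj, budget int) bool {
	return cost+largestNegAdj <= budget
}

func main() {
	const budget = 80 // assumed budget, for illustration only

	// foo: cost 150, only the in-loop adjustment (-5) can apply.
	fmt.Println(couldFit(150, -5, budget)) // false: 145 is still over budget

	// bar: cost 100, param and result properties allow up to -35.
	fmt.Println(couldFit(100, -35, budget)) // true: 65 fits the budget
}
```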
func LargestPositiveScoreAdjustment ¶
LargestPositiveScoreAdjustment tries to estimate the largest possible positive score adjustment that could be applied to a given callsite. At the moment we don't have very many positive score adjustments, so this is just hard-coded, not table-driven.
func ScoreCalls ¶
ScoreCalls assigns numeric scores to each of the callsites in function 'fn'; the lower the score, the more helpful we think it will be to inline.
Unlike a lot of the other inline heuristics machinery, callsite scoring can't be done as part of the CanInline call for a function, due to the fact that we may be working on a non-trivial SCC. So for example with this SCC:
func foo(x int) {           func bar(x int, f func()) {
  if x != 0 {                  f()
    bar(x, func(){})           foo(x-1)
  }                         }
}
We don't want to perform scoring for the 'foo' call in "bar" until after foo has been analyzed, but it's conceivable that CanInline might visit bar before foo for this SCC.
func ScoreCallsCleanup ¶
func ScoreCallsCleanup()
ScoreCallsCleanup resets the state of the callsite cache once ScoreCalls is done with a function.
func SetupScoreAdjustments ¶
func SetupScoreAdjustments()
SetupScoreAdjustments interprets the value of the -d=inlscoreadj debugging option, if set. The value of this flag is expected to be a series of "/"-separated clauses of the form adj1:value1. Example: -d=inlscoreadj=inLoopAdj:0/passConstToIfAdj:-99
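A minimal sketch of parsing such a value, assuming the "/"-separated adj:value clause form described above. The function `parseAdjustments` is hypothetical and only illustrates the format; it is not the package's own parser and does not validate adjustment names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseAdjustments is a hypothetical helper: it splits val into
// "/"-separated clauses and each clause into a name and an integer
// value separated by ":".
func parseAdjustments(val string) (map[string]int, error) {
	out := make(map[string]int)
	for _, clause := range strings.Split(val, "/") {
		name, num, ok := strings.Cut(clause, ":")
		if !ok {
			return nil, fmt.Errorf("clause %q: expected name:value", clause)
		}
		v, err := strconv.Atoi(num)
		if err != nil {
			return nil, fmt.Errorf("clause %q: bad value: %v", clause, err)
		}
		out[name] = v
	}
	return out, nil
}

func main() {
	adj, err := parseAdjustments("inLoopAdj:0/passConstToIfAdj:-99")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(adj["inLoopAdj"], adj["passConstToIfAdj"]) // 0 -99
}
```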
func ShouldFoldIfNameConstant ¶
ShouldFoldIfNameConstant analyzes expression tree 'e' to see whether it contains only combinations of simple references to all of the names in 'names' with selected constants + operators. The intent is to identify expressions that could be folded away to a constant if the value of 'n' were available. Return value is TRUE if 'e' does look foldable given the value of 'n', and given that 'e' actually makes reference to 'n'. Some examples where the type of "n" is int64, type of "s" is string, and type of "p" is *byte:
Simple?  Expr
yes      n<10
yes      n*n-100
yes      (n < 10 || n > 100) && (n >= 12 || n <= 99 || n != 101)
yes      s == "foo"
yes      p == nil
no       n<foo()
no       n<1 || n>m
no       float32(n)<1.0
no       *p == 1
no       1 + 100
no       1 / n
no       1 + unsafe.Sizeof(n)
To avoid complexities (e.g. nan, inf) we stay away from folding any floating point or complex operations (integers, bools, and strings only). We also try to be conservative about avoiding any operation that might result in a panic at runtime, e.g. for "n" with type int64:
1<<(n-9) < 100/(n<<9999)
we would return FALSE due to the negative shift count and/or potential divide by zero.
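The classification in the table can be approximated with a small syntactic check. The sketch below is not the compiler's implementation: it works on `go/ast` expressions rather than compiler IR, handles a single name, does no type checking, and the function `looksFoldable` is hypothetical. It rejects calls, conversions, dereferences, floating point literals, division, remainder and shifts, and requires at least one reference to the name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// looksFoldable is a hypothetical, purely syntactic approximation of the
// rules described above for a single parameter name.
func looksFoldable(src, name string) bool {
	e, err := parser.ParseExpr(src)
	if err != nil {
		return false
	}
	sawName := false
	var ok func(ast.Expr) bool
	ok = func(e ast.Expr) bool {
		switch x := e.(type) {
		case *ast.ParenExpr:
			return ok(x.X)
		case *ast.BasicLit:
			// Integers, chars and strings only; no float or imaginary literals.
			return x.Kind == token.INT || x.Kind == token.STRING || x.Kind == token.CHAR
		case *ast.Ident:
			switch x.Name {
			case name:
				sawName = true
				return true
			case "nil", "true", "false":
				return true
			}
			return false // reference to some other name
		case *ast.UnaryExpr:
			switch x.Op {
			case token.SUB, token.ADD, token.NOT, token.XOR:
				return ok(x.X)
			}
			return false
		case *ast.BinaryExpr:
			switch x.Op {
			case token.QUO, token.REM, token.SHL, token.SHR:
				return false // could panic at run time
			}
			return ok(x.X) && ok(x.Y)
		}
		return false // calls, conversions, dereferences, selectors, ...
	}
	return ok(e) && sawName
}

func main() {
	fmt.Println(looksFoldable("n*n-100", "n"))        // true
	fmt.Println(looksFoldable("p == nil", "p"))       // true
	fmt.Println(looksFoldable("n<1 || n>m", "n"))     // false
	fmt.Println(looksFoldable("1 + 100", "n"))        // false
	fmt.Println(looksFoldable("float32(n)<1.0", "n")) // false
}
```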
func TearDown ¶
func TearDown()
TearDown is invoked at the end of the main inlining pass; doing function analysis and call site scoring is unlikely to help a lot after this point, so nil out fpmap and other globals to reclaim storage.
func UnitTesting ¶
func UnitTesting() bool
func UpdateCallsiteTable ¶
UpdateCallsiteTable handles updating of callerfn's call site table after an inline has been carried out, i.e. the call at 'n' has been turned into the inlined call expression 'ic' within function callerfn. The chief thing of interest here is to make sure that any call nodes within 'ic' are added to the call site table for 'callerfn' and scored appropriately.
Types ¶
type ActualExprPropBits ¶
type ActualExprPropBits uint8
ActualExprPropBits describes a property of an actual expression (value passed to some specific func argument at a call site).
const (
	ActualExprConstant ActualExprPropBits = 1 << iota
	ActualExprIsConcreteConvIface
	ActualExprIsFunc
	ActualExprIsInlinableFunc
)
func (ActualExprPropBits) String ¶
func (i ActualExprPropBits) String() string
type CSPropBits ¶
type CSPropBits uint32
const (
	CallSiteInLoop CSPropBits = 1 << iota
	CallSiteOnPanicPath
	CallSiteInInitFunc
)
func (CSPropBits) String ¶
func (i CSPropBits) String() string
type CallSite ¶
type CallSite struct {
	Callee    *ir.Func
	Call      *ir.CallExpr
	Assign    ir.Node
	Flags     CSPropBits
	ArgProps  []ActualExprPropBits
	Score     int
	ScoreMask scoreAdjustTyp
	ID        uint
	// contains filtered or unexported fields
}
CallSite records useful information about a potentially inlinable (direct) function call. "Callee" is the target of the call, "Call" is the ir node corresponding to the call itself, "Assign" is the top-level assignment statement containing the call (if the call appears in the form of a top-level statement, e.g. "x := foo()"), "Flags" contains properties of the call that might be useful for making inlining decisions, "Score" is the final score assigned to the site, and "ID" is a numeric ID for the site within its containing function.
type CallSiteTab ¶
CallSiteTab is a table of call sites, keyed by call expr. Ideally it would be nice to key the table by src.XPos, but this results in collisions for calls on very long lines (the front end saturates column numbers at 255). We also wind up with many calls that share the same auto-generated pos.
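The collision problem can be illustrated with a toy model. Everything here is hypothetical: `pos` stands in for a source position, `call` stands in for a call expression node, and `saturate` mimics the column saturation at 255 described above.

```go
package main

import "fmt"

// pos is a hypothetical stand-in for a source position.
type pos struct{ line, col uint }

// call is a hypothetical stand-in for a call expression node.
type call struct{ callee string }

// saturate mimics a front end that caps column numbers at 255.
func saturate(col uint) uint {
	if col > 255 {
		return 255
	}
	return col
}

// tableSizes records the given calls (all on line 1, at the given columns)
// in a table keyed by position and in a table keyed by node pointer, and
// returns the number of entries in each.
func tableSizes(calls []*call, cols []uint) (byPos, byExpr int) {
	posTab := map[pos]*call{}
	exprTab := map[*call]bool{}
	for i, c := range calls {
		posTab[pos{line: 1, col: saturate(cols[i])}] = c
		exprTab[c] = true
	}
	return len(posTab), len(exprTab)
}

func main() {
	calls := []*call{{"foo"}, {"bar"}}
	byPos, byExpr := tableSizes(calls, []uint{300, 400})
	fmt.Println(byPos, byExpr) // 1 2: the position-keyed table lost an entry
}
```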
type FuncPropBits ¶
type FuncPropBits uint32
const (
	// Function always panics or invokes os.Exit() or a func that does
	// likewise.
	FuncPropNeverReturns FuncPropBits = 1 << iota
)
func (FuncPropBits) String ¶
func (i FuncPropBits) String() string
type FuncProps ¶
type FuncProps struct {
	Flags       FuncPropBits
	ParamFlags  []ParamPropBits // slot 0 receiver if applicable
	ResultFlags []ResultPropBits
}
FuncProps describes a set of function or method properties that may be useful for inlining heuristics. Here 'Flags' are properties that we think apply to the entire function; 'ParamFlags' are properties of specific function params (or the receiver), and 'ResultFlags' are properties we think will apply to values of specific results. Note that 'ParamFlags' includes an entry for the receiver if applicable, and does include entries for blank params; for a function such as "func foo(_ int, b byte, _ float32)" the length of ParamFlags will be 3.
func DeserializeFromString ¶
func (*FuncProps) SerializeToString ¶
type ParamPropBits ¶
type ParamPropBits uint32
const (
	// No info about this param
	ParamNoInfo ParamPropBits = 0

	// Parameter value feeds unmodified into a top-level interface
	// call (this assumes the parameter is of interface type).
	ParamFeedsInterfaceMethodCall ParamPropBits = 1 << iota

	// Parameter value feeds unmodified into an interface call that
	// may be conditional/nested and not always executed (this assumes
	// the parameter is of interface type).
	ParamMayFeedInterfaceMethodCall ParamPropBits = 1 << iota

	// Parameter value feeds unmodified into a top level indirect
	// function call (assumes parameter is of function type).
	ParamFeedsIndirectCall

	// Parameter value feeds unmodified into an indirect function call
	// that is conditional/nested (not guaranteed to execute). Assumes
	// parameter is of function type.
	ParamMayFeedIndirectCall

	// Parameter value feeds unmodified into a top level "switch"
	// statement or "if" statement simple expressions (see more on
	// "simple" expression classification below).
	ParamFeedsIfOrSwitch

	// Parameter value feeds unmodified into a "switch" or "if"
	// statement simple expressions (see more on "simple" expression
	// classification below), where the if/switch is
	// conditional/nested.
	ParamMayFeedIfOrSwitch
)
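Because ParamNoInfo occupies the first slot of the const block, iota is already 1 when the first real flag is declared, so the flag values start at 2 rather than 1. The sketch below uses a local copy of the declaration (the compiler-internal package itself cannot be imported by ordinary programs) to show the resulting values and a typical bit test; the variable `props` is a made-up example.

```go
package main

import "fmt"

// ParamPropBits and the constants below are a local copy of the
// declaration shown above, made only for this illustration.
type ParamPropBits uint32

const (
	ParamNoInfo                     ParamPropBits = 0
	ParamFeedsInterfaceMethodCall   ParamPropBits = 1 << iota // iota == 1 here
	ParamMayFeedInterfaceMethodCall ParamPropBits = 1 << iota
	ParamFeedsIndirectCall
	ParamMayFeedIndirectCall
	ParamFeedsIfOrSwitch
	ParamMayFeedIfOrSwitch
)

func main() {
	fmt.Println(ParamFeedsInterfaceMethodCall) // 2
	fmt.Println(ParamMayFeedIfOrSwitch)        // 64

	// Flags combine with | and are tested with &.
	props := ParamFeedsIfOrSwitch | ParamMayFeedIndirectCall
	fmt.Println(props&ParamFeedsIfOrSwitch != 0)   // true
	fmt.Println(props&ParamFeedsIndirectCall != 0) // false
}
```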
func (ParamPropBits) String ¶
func (i ParamPropBits) String() string
type ResultPropBits ¶
type ResultPropBits uint32
const (
	// No info about this result
	ResultNoInfo ResultPropBits = 0

	// This result always contains allocated memory.
	ResultIsAllocatedMem ResultPropBits = 1 << iota

	// This result is always a single concrete type that is
	// implicitly converted to interface.
	ResultIsConcreteTypeConvertedToInterface

	// Result is always the same non-composite compile time constant.
	ResultAlwaysSameConstant

	// Result is always the same function or closure.
	ResultAlwaysSameFunc

	// Result is always the same (potentially) inlinable function or closure.
	ResultAlwaysSameInlinableFunc
)
func (ResultPropBits) String ¶
func (i ResultPropBits) String() string
Source Files ¶
- actualexprpropbits_string.go
- analyze.go
- analyze_func_callsites.go
- analyze_func_flags.go
- analyze_func_params.go
- analyze_func_returns.go
- callsite.go
- cspropbits_string.go
- eclassify.go
- funcprop_string.go
- funcpropbits_string.go
- function_properties.go
- names.go
- parampropbits_string.go
- pstate_string.go
- resultpropbits_string.go
- score_callresult_uses.go
- scoreadjusttyp_string.go
- scoring.go
- serialize.go
- trace_off.go