Documentation ¶
Overview ¶
Package dataframe provides DataFrame which is a TraceSet with a calculated ParamSet and associated commit info.
Constants ¶
const (
	// DEFAULT_NUM_COMMITS is the number of commits in the DataFrame returned
	// from New().
	DEFAULT_NUM_COMMITS = 50

	MAX_SAMPLE_SIZE = 5000
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ColumnHeader ¶
type ColumnHeader struct {
	Offset    types.CommitNumber `json:"offset"`
	Timestamp TimestampSeconds   `json:"timestamp"`
}
ColumnHeader describes each column in a DataFrame.
func FromTimeRange ¶
func FromTimeRange(ctx context.Context, git perfgit.Git, begin, end time.Time, downsample bool) ([]*ColumnHeader, []types.CommitNumber, int, error)
FromTimeRange returns a slice of ColumnHeader and a slice of types.CommitNumber for the commits that fall in the given time range [begin, end).
The 'downsample' argument is meant to limit the number of commits returned to MAX_SAMPLE_SIZE, but it is currently ignored. TODO(jcgregorio) Remove downsample. The value for 'skip', the number of commits skipped, is also returned as the third return value.
func MergeColumnHeaders ¶
func MergeColumnHeaders(a, b []*ColumnHeader) ([]*ColumnHeader, map[int]int, map[int]int)
MergeColumnHeaders creates a merged header from the two given headers.
For example, {1,4,5} + {3,4} => {1,3,4,5}.
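The merge can be illustrated with a two-pointer walk over the two headers. This is a minimal sketch and not the package's implementation: `columnHeader`, `headers`, and `mergeHeaders` are hypothetical stand-ins, the inputs are assumed to be sorted by ascending offset, and the two returned maps are assumed to map an index in each input to the index of the same column in the merged header (the documentation above does not say what they contain).

```go
package main

import "fmt"

// columnHeader is a simplified, hypothetical stand-in for ColumnHeader.
type columnHeader struct {
	Offset    int32
	Timestamp int64
}

// headers builds a header from a list of commit offsets (illustration only).
func headers(offsets ...int32) []*columnHeader {
	ret := make([]*columnHeader, 0, len(offsets))
	for _, o := range offsets {
		ret = append(ret, &columnHeader{Offset: o})
	}
	return ret
}

// mergeHeaders merges two headers that are each sorted by ascending Offset.
// It returns the merged header and, for each input, a map from an index in
// that input to the index of the same column in the merged header.
// The meaning of the maps is an assumption made for this sketch.
func mergeHeaders(a, b []*columnHeader) ([]*columnHeader, map[int]int, map[int]int) {
	merged := []*columnHeader{}
	aMap := map[int]int{}
	bMap := map[int]int{}
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		switch {
		case j >= len(b) || (i < len(a) && a[i].Offset < b[j].Offset):
			// Only a has this column, or a's next column comes first.
			aMap[i] = len(merged)
			merged = append(merged, a[i])
			i++
		case i >= len(a) || b[j].Offset < a[i].Offset:
			// Only b has this column, or b's next column comes first.
			bMap[j] = len(merged)
			merged = append(merged, b[j])
			j++
		default:
			// Both inputs have this offset: emit a single column.
			aMap[i] = len(merged)
			bMap[j] = len(merged)
			merged = append(merged, a[i])
			i++
			j++
		}
	}
	return merged, aMap, bMap
}

func main() {
	merged, aMap, bMap := mergeHeaders(headers(1, 4, 5), headers(3, 4))
	for _, h := range merged {
		fmt.Print(h.Offset, " ")
	}
	fmt.Println()
	fmt.Println(aMap, bMap)
}
```

With index maps of this shape, a caller can copy each source trace value straight into its column in a merged row without searching the merged header again.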
type DataFrame ¶
type DataFrame struct {
	TraceSet types.TraceSet              `json:"traceset"`
	Header   []*ColumnHeader             `json:"header"`
	ParamSet paramtools.ReadOnlyParamSet `json:"paramset"`
	Skip     int                         `json:"skip"`
}
DataFrame stores Perf measurements in a table where each row is a Trace indexed by a structured key (see go/query), and each column is described by a ColumnHeader, which could be a commit or a trybot patch level.
Skip is the number of commits skipped to bring the DataFrame down to fewer than MAX_SAMPLE_SIZE commits. If Skip is zero then no commits were skipped.
The name DataFrame was gratuitously borrowed from R.
func Join ¶
Join creates a new DataFrame that is the union of 'a' and 'b'.
Join handles the case where a and b have data for different sets of commits, i.e. a.Header doesn't have to equal b.Header.
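A union of two frames with different headers can be sketched as follows. This is an illustration under stated assumptions and not the package's code: `frame`, `join`, and the `missing` sentinel are hypothetical, the header is reduced to a list of commit offsets, columns with no value are filled with the sentinel, and when both inputs have a value for the same key and commit the value from the second frame wins.

```go
package main

import (
	"fmt"
	"sort"
)

// missing is a hypothetical sentinel marking a column with no data.
const missing = float32(1e32)

// frame is a simplified, hypothetical stand-in for DataFrame: Offsets plays
// the role of Header and Traces the role of TraceSet.
type frame struct {
	Offsets []int32
	Traces  map[string][]float32
}

// join returns a new frame whose columns are the sorted union of the commit
// offsets in a and b and whose rows are the union of their traces.
func join(a, b *frame) *frame {
	// Collect the union of offsets, then sort and index them.
	col := map[int32]int{}
	offsets := []int32{}
	for _, f := range []*frame{a, b} {
		for _, o := range f.Offsets {
			if _, ok := col[o]; !ok {
				col[o] = 0 // Placeholder, the real index is set after sorting.
				offsets = append(offsets, o)
			}
		}
	}
	sort.Slice(offsets, func(i, j int) bool { return offsets[i] < offsets[j] })
	for i, o := range offsets {
		col[o] = i
	}

	out := &frame{Offsets: offsets, Traces: map[string][]float32{}}
	for _, f := range []*frame{a, b} {
		for key, values := range f.Traces {
			row, ok := out.Traces[key]
			if !ok {
				// New row: start with every column marked as missing.
				row = make([]float32, len(offsets))
				for i := range row {
					row[i] = missing
				}
				out.Traces[key] = row
			}
			// Copy each value into the column for its commit offset.
			for i, v := range values {
				if v != missing {
					row[col[f.Offsets[i]]] = v
				}
			}
		}
	}
	return out
}

func main() {
	a := &frame{
		Offsets: []int32{1, 4, 5},
		Traces:  map[string][]float32{",arch=x86,": {1, 2, 3}},
	}
	b := &frame{
		Offsets: []int32{3, 4},
		Traces:  map[string][]float32{",arch=arm,": {7, 8}},
	}
	joined := join(a, b)
	fmt.Println(joined.Offsets)
	fmt.Println(len(joined.Traces))
}
```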
func NewHeaderOnly ¶
func NewHeaderOnly(ctx context.Context, git perfgit.Git, begin, end time.Time, downsample bool) (*DataFrame, error)
NewHeaderOnly returns a DataFrame with a populated Header and no traces.
If 'downsample' is true then the number of commits returned is limited to MAX_SAMPLE_SIZE.
func (*DataFrame) BuildParamSet ¶
func (d *DataFrame) BuildParamSet()
BuildParamSet rebuilds d.ParamSet from the keys of d.TraceSet.
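Rebuilding a ParamSet from trace keys can be sketched as below. The sketch assumes structured keys of the form `,name=value,name=value,` and models the ParamSet as a plain `map[string][]string`; both are assumptions made for illustration, and `buildParamSet` is a hypothetical helper, not the method documented above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildParamSet collects every distinct value seen for each parameter name
// across the given structured keys. Keys are assumed to look like
// ",arch=x86,config=8888,"; malformed pairs are skipped.
func buildParamSet(keys []string) map[string][]string {
	seen := map[string]map[string]bool{}
	for _, key := range keys {
		for _, pair := range strings.Split(strings.Trim(key, ","), ",") {
			name, value, ok := strings.Cut(pair, "=")
			if !ok {
				continue
			}
			if seen[name] == nil {
				seen[name] = map[string]bool{}
			}
			seen[name][value] = true
		}
	}
	// Flatten the sets into sorted slices so the result is deterministic.
	ps := map[string][]string{}
	for name, values := range seen {
		for v := range values {
			ps[name] = append(ps[name], v)
		}
		sort.Strings(ps[name])
	}
	return ps
}

func main() {
	ps := buildParamSet([]string{
		",arch=x86,config=8888,",
		",arch=arm,config=8888,",
	})
	fmt.Println(ps["arch"], ps["config"])
}
```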
func (*DataFrame) Compress ¶
Compress returns a DataFrame with all columns that don't contain any data removed. If the DataFrame is already fully compressed then the original DataFrame is returned.
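The column-removal step can be sketched as follows. This is a minimal illustration, not the package's implementation: `frame`, `compress`, and the `missing` sentinel are hypothetical, the header is reduced to a list of commit offsets, and every trace is assumed to have one value per column.

```go
package main

import "fmt"

// missing is a hypothetical sentinel marking a column with no data.
const missing = float32(1e32)

// frame is a simplified, hypothetical stand-in for DataFrame.
type frame struct {
	Offsets []int32
	Traces  map[string][]float32
}

// compress drops every column in which all traces hold the missing sentinel.
// If no column can be dropped, the original frame is returned unchanged.
func compress(f *frame) *frame {
	// Find the columns that hold at least one real value.
	keep := []int{}
	for col := range f.Offsets {
		for _, values := range f.Traces {
			if values[col] != missing {
				keep = append(keep, col)
				break
			}
		}
	}
	if len(keep) == len(f.Offsets) {
		return f // Already fully compressed.
	}

	out := &frame{Traces: map[string][]float32{}}
	for _, col := range keep {
		out.Offsets = append(out.Offsets, f.Offsets[col])
	}
	for key, values := range f.Traces {
		row := make([]float32, 0, len(keep))
		for _, col := range keep {
			row = append(row, values[col])
		}
		out.Traces[key] = row
	}
	return out
}

func main() {
	f := &frame{
		Offsets: []int32{1, 2, 3},
		Traces: map[string][]float32{
			",name=a,": {1, missing, 3},
			",name=b,": {missing, missing, 5},
		},
	}
	c := compress(f)
	fmt.Println(c.Offsets)
	fmt.Println(compress(c) == c)
}
```

Returning the original pointer when nothing changes lets callers detect the no-op case cheaply, which matches the behaviour described above.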
func (*DataFrame) FilterOut ¶
func (d *DataFrame) FilterOut(f TraceFilter)
FilterOut removes traces from d.TraceSet if the filter function 'f' returns true for a trace.
FilterOut rebuilds the ParamSet to match the new set of traces once filtering is complete.
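The filtering step can be sketched as below. The signature of TraceFilter is not shown in this documentation, so `traceFilter` here is a hypothetical type that takes the trace values as a `[]float32`; `filterOut` and `allZero` are likewise illustrative, and the ParamSet rebuild that FilterOut performs afterwards is omitted.

```go
package main

import "fmt"

// traceFilter is a hypothetical stand-in for TraceFilter: it returns true if
// the trace should be removed.
type traceFilter func(tr []float32) bool

// filterOut deletes every trace for which f returns true. Deleting map
// entries while ranging over the map is safe in Go.
func filterOut(traces map[string][]float32, f traceFilter) {
	for key, tr := range traces {
		if f(tr) {
			delete(traces, key)
		}
	}
}

// allZero is an example filter that removes traces containing only zeros.
func allZero(tr []float32) bool {
	for _, v := range tr {
		if v != 0 {
			return false
		}
	}
	return true
}

func main() {
	traces := map[string][]float32{
		",config=8888,": {1.5, 2.0},
		",config=565,":  {0, 0},
	}
	filterOut(traces, allZero)
	fmt.Println(len(traces))
}
```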
func (*DataFrame) Slice ¶
Slice returns a DataFrame that contains a subset of the current DataFrame: the 'size' points starting at 'offset'. Note that the returned data is composed of slices of the original data, not copies, so the returned DataFrame must not be altered.
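The warning about slices rather than copies can be demonstrated with a small sketch. The signature of Slice is not shown in this documentation, so `frame` and `slice` below are hypothetical: 'offset' is treated as a column index, an error is returned for an out-of-range request, and every trace is assumed to have one value per column.

```go
package main

import "fmt"

// frame is a simplified, hypothetical stand-in for DataFrame.
type frame struct {
	Offsets []int32
	Traces  map[string][]float32
}

// slice returns the 'size' columns starting at column index 'offset'.
// The result shares its backing arrays with f; nothing is copied.
func slice(f *frame, offset, size int) (*frame, error) {
	if offset < 0 || size < 0 || offset+size > len(f.Offsets) {
		return nil, fmt.Errorf("range [%d, %d) is out of bounds", offset, offset+size)
	}
	out := &frame{
		Offsets: f.Offsets[offset : offset+size],
		Traces:  map[string][]float32{},
	}
	for key, values := range f.Traces {
		out.Traces[key] = values[offset : offset+size]
	}
	return out, nil
}

func main() {
	f := &frame{
		Offsets: []int32{10, 11, 12, 13},
		Traces:  map[string][]float32{",name=a,": {1, 2, 3, 4}},
	}
	sub, err := slice(f, 1, 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(sub.Offsets, sub.Traces[",name=a,"])

	// Writing through the sub-frame also changes the original, which is why
	// the returned frame must be treated as read-only.
	sub.Traces[",name=a,"][0] = 99
	fmt.Println(f.Traces[",name=a,"][1])
}
```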
type DataFrameBuilder ¶
type DataFrameBuilder interface {
	// NewFromQueryAndRange returns a populated DataFrame of the traces that match
	// the given time range [begin, end) and the passed in query, or a non-nil
	// error if the traces can't be retrieved. The 'progress' callback is called
	// periodically as the query is processed.
	NewFromQueryAndRange(ctx context.Context, begin, end time.Time, q *query.Query, downsample bool, progress progress.Progress) (*DataFrame, error)

	// NewFromKeysAndRange returns a populated DataFrame of the traces that match
	// the given set of 'keys' over the range of [begin, end). The 'progress'
	// callback is called periodically as the query is processed.
	NewFromKeysAndRange(ctx context.Context, keys []string, begin, end time.Time, downsample bool, progress progress.Progress) (*DataFrame, error)

	// NewNFromQuery returns a populated DataFrame of condensed traces of N data
	// points ending at the given 'end' time that match the given query.
	NewNFromQuery(ctx context.Context, end time.Time, q *query.Query, n int32, progress progress.Progress) (*DataFrame, error)

	// NewNFromKeys returns a populated DataFrame of condensed traces of N data
	// points ending at the given 'end' time for the given keys.
	NewNFromKeys(ctx context.Context, end time.Time, keys []string, n int32, progress progress.Progress) (*DataFrame, error)

	// NumMatches returns the number of traces that will match the query.
	NumMatches(ctx context.Context, q *query.Query) (int64, error)

	// PreflightQuery returns the number of traces that will match the query and
	// a refined ParamSet to use for further queries. The referenceParamSet
	// should be a ParamSet that includes all the Params that could appear in a
	// query. For example, the ParamSet managed by ParamSetRefresher.
	PreflightQuery(ctx context.Context, q *query.Query, referenceParamSet paramtools.ReadOnlyParamSet) (int64, paramtools.ParamSet, error)
}
DataFrameBuilder is an interface for things that construct DataFrames.
type TimestampSeconds ¶
type TimestampSeconds int64
TimestampSeconds represents a timestamp in seconds from the Unix epoch.
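Converting between TimestampSeconds and the standard library's time.Time can be sketched as below. The type is redeclared locally so the example stands alone, and the helpers `fromTime` and `toTime` are hypothetical names, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// TimestampSeconds mirrors the documented type: seconds since the Unix epoch.
type TimestampSeconds int64

// fromTime truncates t to whole seconds since the Unix epoch.
func fromTime(t time.Time) TimestampSeconds {
	return TimestampSeconds(t.Unix())
}

// toTime converts the timestamp back to a time.Time in UTC.
func (ts TimestampSeconds) toTime() time.Time {
	return time.Unix(int64(ts), 0).UTC()
}

func main() {
	ts := fromTime(time.Date(2023, time.November, 14, 22, 13, 20, 0, time.UTC))
	fmt.Println(int64(ts))
	fmt.Println(ts.toTime().Format(time.RFC3339))
}
```

Any sub-second precision in the original time.Time is lost in the conversion, since the type stores whole seconds only.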
type TraceFilter ¶
TraceFilter is a function type that should return true if trace 'tr' should be removed from a DataFrame. It is used in FilterOut.