Documentation ¶
Overview ¶
Package sqltracestore implements a tracestore.TraceStore on top of SQL. We'll look at the SQL schema to explain how SQLTraceStore maps traces into an SQL database.
We store the name of every source file that has been ingested in the SourceFiles table so that other tables can refer to it by the shorter 64-bit source_file_id.
SourceFiles (
    source_file_id INT PRIMARY KEY DEFAULT unique_rowid(),
    source_file TEXT UNIQUE NOT NULL
)
Each trace name, which is a structured key (see /infra/go/query) of the form ,key1=value1,key2=value2,..., is stored either as the md5 hash of the trace name, i.e. trace_id = md5(trace_name), or as the series of key=value pairs that make up the params of the key.
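As a minimal sketch of the trace_id = md5(trace_name) relationship, the following computes the 16-byte digest of a structured key and formats it as a hex BYTES literal. The function names traceIDFromName and sqlBytesLiteral are hypothetical and are not part of this package.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// traceIDFromName returns the 16-byte md5 digest of a structured trace name.
// Hypothetical helper: only the md5 relationship comes from the text above.
func traceIDFromName(traceName string) [md5.Size]byte {
	return md5.Sum([]byte(traceName))
}

// sqlBytesLiteral formats a trace id as a hex BYTES literal, i.e. `\x`
// followed by 32 hex digits.
func sqlBytesLiteral(id [md5.Size]byte) string {
	return `\x` + hex.EncodeToString(id[:])
}

func main() {
	id := traceIDFromName(",arch=x86,config=8888,")
	fmt.Println(len(id), sqlBytesLiteral(id))
}
```

Because the digest has a fixed size, every trace_id takes 16 bytes regardless of how long the trace name is.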
When we store the values of each trace in the TraceValues table, we use the trace_id and the commit_number as the primary key. We store not only the value but also the id of the source file that the value came from.
CREATE TABLE IF NOT EXISTS TraceValues (
    trace_id BYTES,      -- Id of the trace name from TraceIDS.
    commit_number INT,   -- A types.CommitNumber.
    val REAL,            -- The floating point measurement.
    source_file_id INT,  -- Id of the source filename, from SourceFiles.
    PRIMARY KEY (trace_id, commit_number)
);
Just using this table we can construct some useful queries. For example we can count the number of traces in a single tile, in this case the 0th tile in a system with a tileSize of 256:
SELECT COUNT(DISTINCT trace_id) FROM TraceValues WHERE commit_number >= 0 AND commit_number < 256;
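The commit bounds in the query above follow directly from the tile number and the tileSize. The sketch below computes the half-open range for any tile. It uses plain int32 values in place of types.TileNumber and types.CommitNumber, and tileCommitRange is a hypothetical name.

```go
package main

import "fmt"

// tileCommitRange returns the half-open commit range [begin, end) covered by
// the given tile. Hypothetical helper; it assumes non-negative tile numbers.
func tileCommitRange(tile, tileSize int32) (begin, end int32) {
	begin = tile * tileSize
	return begin, begin + tileSize
}

func main() {
	begin, end := tileCommitRange(0, 256)
	fmt.Printf("commit_number >= %d AND commit_number < %d\n", begin, end)
	// prints: commit_number >= 0 AND commit_number < 256
}
```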
The Postings table is our inverted index for looking up which trace ids contain which key=value pairs. For a good introduction to postings and search, see https://www.tbray.org/ongoing/When/200x/2003/06/18/HowSearchWorks.
Remember that each trace name is a structured key of the form ,arch=x86,config=8888,..., and that traces may come and go over time, i.e. we may stop running a test or start running new tests. To make searching for traces efficient we need to be aware of how the set of trace ids changes over time. The answer is to break our store into Tiles, i.e. blocks of commits of tileSize length, and to keep an inverted index of the trace ids for each Tile.
In the table below we store a key_value which is the literal "key=value" part of a trace name, along with the tile_number and the md5 trace_id. Note that tile_number is just int(commitNumber/tileSize).
CREATE TABLE IF NOT EXISTS Postings (
    -- A types.TileNumber.
    tile_number INT,
    -- A key value pair from a structured key, e.g. "config=8888".
    key_value STRING NOT NULL,
    -- md5(trace_name)
    trace_id BYTES,
    PRIMARY KEY (tile_number, key_value, trace_id)
);
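To illustrate how one trace name fans out into Postings rows, here is a rough sketch that splits a structured key into its key=value pairs and pairs each with the tile number and the md5 trace id. The posting type and postingsForTrace are hypothetical, and the real ingestion code may batch and deduplicate rows differently.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strings"
)

// posting mirrors one row of the Postings table (hypothetical type).
type posting struct {
	TileNumber int32
	KeyValue   string
	TraceID    [md5.Size]byte
}

// postingsForTrace returns one posting per key=value pair in traceName for
// the tile that contains commitNumber.
func postingsForTrace(traceName string, commitNumber, tileSize int32) []posting {
	tile := commitNumber / tileSize // tile_number = int(commitNumber/tileSize)
	id := md5.Sum([]byte(traceName))
	var rows []posting
	for _, kv := range strings.Split(traceName, ",") {
		if kv == "" {
			continue // Skip the empty pieces from the leading and trailing commas.
		}
		rows = append(rows, posting{TileNumber: tile, KeyValue: kv, TraceID: id})
	}
	return rows
}

func main() {
	for _, p := range postingsForTrace(",arch=x86,config=8888,", 300, 256) {
		fmt.Printf("%d %s %x\n", p.TileNumber, p.KeyValue, p.TraceID)
	}
}
```

Every pair from the same trace shares one trace_id, which is what lets a query intersect the id sets of several key=value pairs.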
Finally, to make it fast to turn UI queries into SQL queries we store the ParamSet representing all the trace names in the Tile.
CREATE TABLE IF NOT EXISTS ParamSets (
    tile_number INT,
    param_key STRING,
    param_value STRING,
    PRIMARY KEY (tile_number, param_key, param_value),
    INDEX (tile_number DESC)
);
For example, to build a ParamSet for a tile:
SELECT param_key, param_value FROM ParamSets WHERE tile_number=0;
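The rows returned by that query can be folded into a key-to-values map. The sketch below does so with a plain map[string][]string standing in for the ParamSet type, which is an assumption; paramRow and buildParamSet are hypothetical names, and the rows are supplied in memory rather than read from a database.

```go
package main

import (
	"fmt"
	"sort"
)

// paramRow is one (param_key, param_value) row from ParamSets (hypothetical).
type paramRow struct{ Key, Value string }

// buildParamSet groups values by key. Rows are unique per tile because of the
// table's primary key, so no deduplication is done here.
func buildParamSet(rows []paramRow) map[string][]string {
	ps := map[string][]string{}
	for _, r := range rows {
		ps[r.Key] = append(ps[r.Key], r.Value)
	}
	for k := range ps {
		sort.Strings(ps[k]) // Sort for a stable, readable result.
	}
	return ps
}

func main() {
	rows := []paramRow{{"config", "8888"}, {"arch", "x86"}, {"config", "565"}}
	fmt.Println(buildParamSet(rows))
	// prints: map[arch:[x86] config:[565 8888]]
}
```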
To find the most recent tile:
SELECT tile_number FROM ParamSets ORDER BY tile_number DESC LIMIT 1;
To query for traces we first find the trace_ids of all the traces that would match the given query on a tile.
SELECT encode(trace_id, 'hex')
FROM Postings
WHERE key_value IN ('config=8888', 'config=565')
    AND tile_number = 0
INTERSECT
SELECT encode(trace_id, 'hex')
FROM Postings
WHERE key_value IN ('arch=x86', 'arch=risc-v')
    AND tile_number = 0;
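A query of this shape can be generated mechanically: one SELECT per key, with that key's accepted values in the IN list, joined by INTERSECT. The following is a sketch under stated assumptions, not the package's actual query builder. buildPostingsQuery is a hypothetical name, the query is modelled as a map from key to accepted values, and the $N placeholder style assumes a PostgreSQL-compatible driver.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildPostingsQuery returns the SQL text and its positional arguments.
// $1 is always the tile number; later placeholders are key=value strings.
// An empty query yields an empty SQL string.
func buildPostingsQuery(tileNumber int32, q map[string][]string) (string, []interface{}) {
	keys := make([]string, 0, len(q))
	for k := range q {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Deterministic output regardless of map order.

	args := []interface{}{tileNumber}
	selects := make([]string, 0, len(keys))
	for _, k := range keys {
		placeholders := make([]string, 0, len(q[k]))
		for _, v := range q[k] {
			args = append(args, k+"="+v)
			placeholders = append(placeholders, fmt.Sprintf("$%d", len(args)))
		}
		selects = append(selects, fmt.Sprintf(
			"SELECT encode(trace_id, 'hex') FROM Postings WHERE key_value IN (%s) AND tile_number = $1",
			strings.Join(placeholders, ", ")))
	}
	return strings.Join(selects, "\nINTERSECT\n"), args
}

func main() {
	sql, args := buildPostingsQuery(0, map[string][]string{
		"config": {"8888", "565"},
		"arch":   {"x86", "risc-v"},
	})
	fmt.Println(sql)
	fmt.Println(args...)
}
```

Values for the same key are ORed together through IN, while different keys are ANDed together through INTERSECT. Passing values as arguments rather than splicing them into the SQL text avoids quoting problems.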
Then once you have all the trace_ids, load the values from the TraceValues table.
SELECT trace_id, commit_number, val
FROM TraceValues
WHERE tracevalues.commit_number >= 0
    AND tracevalues.commit_number < 256
    AND tracevalues.trace_id IN (
        '\xfe385b159ff55dca481069805e5ff050',
        '\x277262a9236d571883d47dab102070bc'
    );
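Each returned row holds a single point, so the caller still has to assemble the rows into per-trace slices of tileSize length. The sketch below shows one way to do that. It keys traces by trace id for simplicity, and the valueRow type, the buildTraceSet function and the missingValue sentinel are all assumptions made for illustration.

```go
package main

import "fmt"

// missingValue marks commits with no stored measurement. The value chosen
// here is a hypothetical sentinel.
const missingValue float32 = 1e32

// valueRow is one (trace_id, commit_number, val) row (hypothetical type).
type valueRow struct {
	TraceID      string
	CommitNumber int32
	Val          float32
}

// buildTraceSet places each value at its offset within the tile, i.e.
// commit_number minus the first commit number of the tile.
func buildTraceSet(rows []valueRow, tile, tileSize int32) map[string][]float32 {
	begin := tile * tileSize
	traces := map[string][]float32{}
	for _, r := range rows {
		offset := r.CommitNumber - begin
		if offset < 0 || offset >= tileSize {
			continue // Ignore rows that fall outside the requested tile.
		}
		t, ok := traces[r.TraceID]
		if !ok {
			t = make([]float32, tileSize)
			for i := range t {
				t[i] = missingValue
			}
			traces[r.TraceID] = t
		}
		t[offset] = r.Val
	}
	return traces
}

func main() {
	rows := []valueRow{{"a", 4, 1.5}, {"a", 6, 2.5}, {"b", 7, 3}}
	fmt.Println(buildTraceSet(rows, 1, 4))
}
```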
Look in migrations/cdb.sql for more examples of raw queries using a simple example dataset.
Index ¶
- type SQLTraceStore
- func (s *SQLTraceStore) ClearOrderedParamSetCache()
- func (s *SQLTraceStore) CommitNumberOfTileStart(commitNumber types.CommitNumber) types.CommitNumber
- func (s *SQLTraceStore) GetLastNSources(ctx context.Context, traceID string, n int) ([]tracestore.Source, error)
- func (s *SQLTraceStore) GetLatestTile(ctx context.Context) (types.TileNumber, error)
- func (s *SQLTraceStore) GetParamSet(ctx context.Context, tileNumber types.TileNumber) (paramtools.ReadOnlyParamSet, error)
- func (s *SQLTraceStore) GetSource(ctx context.Context, commitNumber types.CommitNumber, traceName string) (string, error)
- func (s *SQLTraceStore) GetTraceIDsBySource(ctx context.Context, sourceFilename string, tileNumber types.TileNumber) ([]string, error)
- func (s *SQLTraceStore) OffsetFromCommitNumber(commitNumber types.CommitNumber) int32
- func (s *SQLTraceStore) QueryTraces(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (types.TraceSet, []provider.Commit, error)
- func (s *SQLTraceStore) QueryTracesIDOnly(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (<-chan paramtools.Params, error)
- func (s *SQLTraceStore) ReadTraces(ctx context.Context, tileNumber types.TileNumber, traceNames []string) (types.TraceSet, []provider.Commit, error)
- func (s *SQLTraceStore) ReadTracesForCommitRange(ctx context.Context, traceNames []string, beginCommit types.CommitNumber, ...) (types.TraceSet, []provider.Commit, error)
- func (s *SQLTraceStore) StartBackgroundMetricsGathering()
- func (s *SQLTraceStore) TileNumber(commitNumber types.CommitNumber) types.TileNumber
- func (s *SQLTraceStore) TileSize() int32
- func (s *SQLTraceStore) TraceCount(ctx context.Context, tileNumber types.TileNumber) (int64, error)
- func (s *SQLTraceStore) WriteTraces(ctx context.Context, commitNumber types.CommitNumber, ...) error
Types ¶
type SQLTraceStore ¶
type SQLTraceStore struct {
// contains filtered or unexported fields
}
SQLTraceStore implements tracestore.TraceStore backed by an SQL database.
func New ¶
func New(db pool.Pool, datastoreConfig config.DataStoreConfig) (*SQLTraceStore, error)
New returns a new *SQLTraceStore.
We presume all migrations have been run against db before this function is called.
func (*SQLTraceStore) ClearOrderedParamSetCache ¶
func (s *SQLTraceStore) ClearOrderedParamSetCache()
ClearOrderedParamSetCache is only used for tests.
func (*SQLTraceStore) CommitNumberOfTileStart ¶
func (s *SQLTraceStore) CommitNumberOfTileStart(commitNumber types.CommitNumber) types.CommitNumber
CommitNumberOfTileStart implements the tracestore.TraceStore interface.
func (*SQLTraceStore) GetLastNSources ¶
func (s *SQLTraceStore) GetLastNSources(ctx context.Context, traceID string, n int) ([]tracestore.Source, error)
GetLastNSources implements the tracestore.TraceStore interface.
func (*SQLTraceStore) GetLatestTile ¶
func (s *SQLTraceStore) GetLatestTile(ctx context.Context) (types.TileNumber, error)
GetLatestTile implements the tracestore.TraceStore interface.
func (*SQLTraceStore) GetParamSet ¶
func (s *SQLTraceStore) GetParamSet(ctx context.Context, tileNumber types.TileNumber) (paramtools.ReadOnlyParamSet, error)
GetParamSet implements the tracestore.TraceStore interface.
func (*SQLTraceStore) GetSource ¶
func (s *SQLTraceStore) GetSource(ctx context.Context, commitNumber types.CommitNumber, traceName string) (string, error)
GetSource implements the tracestore.TraceStore interface.
func (*SQLTraceStore) GetTraceIDsBySource ¶
func (s *SQLTraceStore) GetTraceIDsBySource(ctx context.Context, sourceFilename string, tileNumber types.TileNumber) ([]string, error)
GetTraceIDsBySource implements the tracestore.TraceStore interface.
func (*SQLTraceStore) OffsetFromCommitNumber ¶
func (s *SQLTraceStore) OffsetFromCommitNumber(commitNumber types.CommitNumber) int32
OffsetFromCommitNumber implements the tracestore.TraceStore interface.
func (*SQLTraceStore) QueryTraces ¶
func (s *SQLTraceStore) QueryTraces(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (types.TraceSet, []provider.Commit, error)
QueryTraces implements the tracestore.TraceStore interface.
func (*SQLTraceStore) QueryTracesIDOnly ¶
func (s *SQLTraceStore) QueryTracesIDOnly(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (<-chan paramtools.Params, error)
QueryTracesIDOnly implements the tracestore.TraceStore interface.
func (*SQLTraceStore) ReadTraces ¶
func (s *SQLTraceStore) ReadTraces(ctx context.Context, tileNumber types.TileNumber, traceNames []string) (types.TraceSet, []provider.Commit, error)
ReadTraces implements the tracestore.TraceStore interface.
func (*SQLTraceStore) ReadTracesForCommitRange ¶
func (s *SQLTraceStore) ReadTracesForCommitRange(ctx context.Context, traceNames []string, beginCommit types.CommitNumber, endCommit types.CommitNumber) (types.TraceSet, []provider.Commit, error)
ReadTracesForCommitRange implements the tracestore.TraceStore interface.
func (*SQLTraceStore) StartBackgroundMetricsGathering ¶
func (s *SQLTraceStore) StartBackgroundMetricsGathering()
StartBackgroundMetricsGathering runs continuously in the background and gathers metrics related to param sets in the database.
func (*SQLTraceStore) TileNumber ¶
func (s *SQLTraceStore) TileNumber(commitNumber types.CommitNumber) types.TileNumber
TileNumber implements the tracestore.TraceStore interface.
func (*SQLTraceStore) TileSize ¶
func (s *SQLTraceStore) TileSize() int32
TileSize implements the tracestore.TraceStore interface.
func (*SQLTraceStore) TraceCount ¶
func (s *SQLTraceStore) TraceCount(ctx context.Context, tileNumber types.TileNumber) (int64, error)
TraceCount implements the tracestore.TraceStore interface.
func (*SQLTraceStore) WriteTraces ¶
func (s *SQLTraceStore) WriteTraces(ctx context.Context, commitNumber types.CommitNumber, params []paramtools.Params, values []float32, ps paramtools.ParamSet, source string, _ time.Time) error
WriteTraces implements the tracestore.TraceStore interface.