sqltracestore

package
v0.0.0-...-90fa48b
Published: Jan 11, 2025 License: BSD-3-Clause Imports: 30 Imported by: 0

Documentation

Overview

Package sqltracestore implements a tracestore.TraceStore on top of SQL. We'll look at the SQL schema to explain how SQLTraceStore maps traces into an SQL database.

We store the name of every source file that has been ingested in the SourceFiles table so we can use the shorter 64-bit source_file_id in other tables.

CREATE TABLE IF NOT EXISTS SourceFiles (
    source_file_id INT PRIMARY KEY DEFAULT unique_rowid(),
    source_file TEXT UNIQUE NOT NULL
);

Each trace name, which is a structured key (see /infra/go/query) of the form ,key1=value1,key2=value2,..., is stored either as the md5 hash of the trace name, i.e. trace_id = md5(trace_name), or as the series of key=value pairs that make up the params of the key.
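
As a minimal sketch of the trace_id = md5(trace_name) mapping, the Go program below hashes a structured key with the standard library. The helper names traceIDFromName and traceIDHex are hypothetical and are not part of this package's API.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// traceIDFromName returns the 16-byte md5 digest of a structured trace name.
// Hypothetical helper: it only mirrors trace_id = md5(trace_name).
func traceIDFromName(traceName string) [md5.Size]byte {
	return md5.Sum([]byte(traceName))
}

// traceIDHex returns the digest as a 32-character hex string, the same
// shape that encode(trace_id, 'hex') yields in SQL.
func traceIDHex(traceName string) string {
	id := traceIDFromName(traceName)
	return hex.EncodeToString(id[:])
}

func main() {
	fmt.Println(traceIDHex(",arch=x86,config=8888,"))
}
```

The digest is a fixed 16 bytes regardless of how many key=value pairs the trace name contains, which keeps the key columns in the tables that reference it compact.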

When we store the values of each trace in the TraceValues table, we use the trace_id and the commit_number as the primary key. We store not only the value but also the id of the source file that the value came from.

CREATE TABLE IF NOT EXISTS TraceValues (
    trace_id BYTES,        -- Id of the trace name from TraceIDS.
    commit_number INT,     -- A types.CommitNumber.
    val REAL,              -- The floating point measurement.
    source_file_id INT,    -- Id of the source filename, from SourceFiles.
    PRIMARY KEY (trace_id, commit_number)
);

Using just this table we can construct some useful queries. For example, we can count the number of traces in a single tile, in this case the 0th tile in a system with a tileSize of 256:

SELECT
    COUNT(DISTINCT trace_id)
FROM
    TraceValues
WHERE
    commit_number >= 0 AND commit_number < 256;
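
The bounds in the WHERE clause follow from the tile number and tileSize. The sketch below assumes that tile n covers the half-open commit range [n*tileSize, (n+1)*tileSize), which matches the 0th-tile example above. The function name tileCommitRange is hypothetical.

```go
package main

import "fmt"

// tileCommitRange returns the half-open commit range [begin, end) covered
// by a tile, assuming tiles are consecutive blocks of tileSize commits
// starting at commit 0. Hypothetical helper, not part of this package.
func tileCommitRange(tile, tileSize int32) (begin, end int32) {
	begin = tile * tileSize
	return begin, begin + tileSize
}

func main() {
	begin, end := tileCommitRange(0, 256)
	// Render the bounds in the same shape as the WHERE clause above.
	fmt.Printf("commit_number >= %d AND commit_number < %d\n", begin, end)
}
```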

The Postings table is our inverted index for looking up which trace ids contain which key=value pairs. For a good introduction to postings and search, see https://www.tbray.org/ongoing/When/200x/2003/06/18/HowSearchWorks.

Remember that each trace name is a structured key of the form ,arch=x86,config=8888,..., and that traces may come and go over time, i.e. we may stop running a test or start running new tests. To make searching for traces efficient we need to be aware of how those trace ids change over time. The answer is to break our store into Tiles, i.e. blocks of commits of tileSize length, and to keep an inverted index of the trace ids for each Tile.

In the table below we store a key_value which is the literal "key=value" part of a trace name, along with the tile_number and the md5 trace_id. Note that tile_number is just int(commitNumber/tileSize).

CREATE TABLE IF NOT EXISTS Postings (
    -- A types.TileNumber.
    tile_number INT,
    -- A key value pair from a structured key, e.g. "config=8888".
    key_value STRING NOT NULL,
    -- md5(trace_name)
    trace_id BYTES,
    PRIMARY KEY (tile_number, key_value, trace_id)
);
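
To illustrate what ends up in Postings, the sketch below derives the rows for one trace name at one commit: one row per key=value pair, all sharing the same tile_number and md5 trace_id. The posting struct and postingsForTrace are hypothetical names, and the parsing assumes the simple comma-delimited form shown above rather than the validation done in /infra/go/query.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strings"
)

// posting mirrors one row of the Postings table (hypothetical type).
type posting struct {
	TileNumber int32
	KeyValue   string
	TraceID    [md5.Size]byte
}

// postingsForTrace splits a structured key such as ",arch=x86,config=8888,"
// into its key=value pairs and returns one posting per pair.
func postingsForTrace(traceName string, commitNumber, tileSize int32) []posting {
	tile := commitNumber / tileSize // int(commitNumber/tileSize)
	id := md5.Sum([]byte(traceName))
	var rows []posting
	for _, kv := range strings.Split(traceName, ",") {
		if kv == "" {
			continue // Skip the empty pieces around the leading and trailing commas.
		}
		rows = append(rows, posting{TileNumber: tile, KeyValue: kv, TraceID: id})
	}
	return rows
}

func main() {
	for _, p := range postingsForTrace(",arch=x86,config=8888,", 300, 256) {
		fmt.Printf("tile=%d key_value=%q trace_id=%x\n", p.TileNumber, p.KeyValue, p.TraceID)
	}
}
```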

Finally, to make it fast to turn UI queries into SQL queries we store the ParamSet representing all the trace names in the Tile.

CREATE TABLE IF NOT EXISTS ParamSets (
    tile_number INT,
    param_key STRING,
    param_value STRING,
    PRIMARY KEY (tile_number, param_key, param_value),
    INDEX (tile_number DESC)
);

For example, to build a ParamSet for a tile:

SELECT
    param_key, param_value
FROM
    ParamSets
WHERE
    tile_number=0;
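
The rows returned by this query can be folded into a key-to-values map. The sketch below uses a plain map[string][]string as a stand-in for the ParamSet type. The paramRow struct and buildParamSet are hypothetical, and the rows are hard-coded rather than read from a database.

```go
package main

import (
	"fmt"
	"sort"
)

// paramRow mirrors one (param_key, param_value) result row (hypothetical type).
type paramRow struct {
	Key, Value string
}

// buildParamSet groups rows by key. The primary key on ParamSets makes each
// (key, value) pair unique within a tile, so no de-duplication is done here.
func buildParamSet(rows []paramRow) map[string][]string {
	ps := map[string][]string{}
	for _, r := range rows {
		ps[r.Key] = append(ps[r.Key], r.Value)
	}
	for k := range ps {
		sort.Strings(ps[k]) // Stable ordering for display.
	}
	return ps
}

func main() {
	ps := buildParamSet([]paramRow{
		{"config", "8888"},
		{"arch", "x86"},
		{"config", "565"},
	})
	fmt.Println(ps)
}
```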

To find the most recent tile:

SELECT
    tile_number
FROM
    ParamSets
ORDER BY
    tile_number DESC LIMIT 1;

To query for traces we first find the trace_ids of all the traces that would match the given query on a tile.

SELECT
    encode(trace_id, 'hex')
FROM
    Postings
WHERE
    key_value IN ('config=8888', 'config=565')
    AND tile_number = 0
INTERSECT
SELECT
    encode(trace_id, 'hex')
FROM
    Postings
WHERE
    key_value IN ('arch=x86', 'arch=risc-v')
    AND tile_number = 0;
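
A query of this shape can be generated from a map of keys to acceptable values: one SELECT per key, joined with INTERSECT. The sketch below is an illustration and not the query builder this package uses. buildPostingsQuery is a hypothetical name, and the $N placeholder syntax assumes a PostgreSQL-style driver.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildPostingsQuery returns a parameterized SQL statement and its arguments.
// $1 is always the tile number; each key=value pair gets its own placeholder.
// Keys with no values are ignored in this sketch.
func buildPostingsQuery(tile int32, params map[string][]string) (string, []interface{}) {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Deterministic output.

	args := []interface{}{tile}
	var selects []string
	for _, k := range keys {
		if len(params[k]) == 0 {
			continue
		}
		var placeholders []string
		for _, v := range params[k] {
			args = append(args, k+"="+v)
			placeholders = append(placeholders, fmt.Sprintf("$%d", len(args)))
		}
		selects = append(selects, fmt.Sprintf(
			"SELECT encode(trace_id, 'hex') FROM Postings WHERE key_value IN (%s) AND tile_number = $1",
			strings.Join(placeholders, ", ")))
	}
	return strings.Join(selects, " INTERSECT "), args
}

func main() {
	q, args := buildPostingsQuery(0, map[string][]string{
		"config": {"8888", "565"},
		"arch":   {"x86", "risc-v"},
	})
	fmt.Println(q)
	fmt.Println(args)
}
```

Values within a key are ORed through IN, while different keys are ANDed through INTERSECT, which matches the semantics of the hand-written query above.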

Then once you have all the trace_ids, load the values from the TraceValues table.

SELECT
    trace_id,
    commit_number,
    val
FROM
    TraceValues
WHERE
    tracevalues.commit_number >= 0
    AND tracevalues.commit_number < 256
    AND tracevalues.trace_id IN (
        '\xfe385b159ff55dca481069805e5ff050',
        '\x277262a9236d571883d47dab102070bc'
    );
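
The (trace_id, commit_number, val) rows then need to be arranged into one fixed-length slice per trace. The sketch below assumes each trace holds tileSize values indexed by the commit's offset within the tile, and it uses NaN to mark commits with no stored value. That sentinel is an assumption made for illustration, and valueRow and assembleTraces are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// valueRow mirrors one result row of the TraceValues query (hypothetical type).
type valueRow struct {
	TraceID      string
	CommitNumber int32
	Val          float32
}

// assembleTraces places each value at its offset within the tile and leaves
// NaN where no row exists. Rows outside the tile are skipped.
func assembleTraces(rows []valueRow, tile, tileSize int32) map[string][]float32 {
	begin := tile * tileSize
	missing := float32(math.NaN())
	traces := map[string][]float32{}
	for _, r := range rows {
		offset := r.CommitNumber - begin
		if offset < 0 || offset >= tileSize {
			continue
		}
		t, ok := traces[r.TraceID]
		if !ok {
			t = make([]float32, tileSize)
			for i := range t {
				t[i] = missing
			}
			traces[r.TraceID] = t
		}
		t[offset] = r.Val
	}
	return traces
}

func main() {
	traces := assembleTraces([]valueRow{
		{"fe385b159ff55dca481069805e5ff050", 0, 1.5},
		{"fe385b159ff55dca481069805e5ff050", 2, 2.5},
	}, 0, 256)
	fmt.Println(traces["fe385b159ff55dca481069805e5ff050"][:4])
}
```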

Look in migrations/cdb.sql for more examples of raw queries using a simple example dataset.

Types

type SQLTraceStore

type SQLTraceStore struct {
	// contains filtered or unexported fields
}

SQLTraceStore implements tracestore.TraceStore backed by an SQL database.

func New

func New(db pool.Pool, datastoreConfig config.DataStoreConfig) (*SQLTraceStore, error)

New returns a new *SQLTraceStore.

We presume all migrations have been run against db before this function is called.

func (*SQLTraceStore) ClearOrderedParamSetCache

func (s *SQLTraceStore) ClearOrderedParamSetCache()

ClearOrderedParamSetCache is only used for tests.

func (*SQLTraceStore) CommitNumberOfTileStart

func (s *SQLTraceStore) CommitNumberOfTileStart(commitNumber types.CommitNumber) types.CommitNumber

CommitNumberOfTileStart implements the tracestore.TraceStore interface.

func (*SQLTraceStore) GetLastNSources

func (s *SQLTraceStore) GetLastNSources(ctx context.Context, traceID string, n int) ([]tracestore.Source, error)

GetLastNSources implements the tracestore.TraceStore interface.

func (*SQLTraceStore) GetLatestTile

func (s *SQLTraceStore) GetLatestTile(ctx context.Context) (types.TileNumber, error)

GetLatestTile implements the tracestore.TraceStore interface.

func (*SQLTraceStore) GetParamSet

func (s *SQLTraceStore) GetParamSet(ctx context.Context, tileNumber types.TileNumber) (paramtools.ReadOnlyParamSet, error)

GetParamSet implements the tracestore.TraceStore interface.

func (*SQLTraceStore) GetSource

func (s *SQLTraceStore) GetSource(ctx context.Context, commitNumber types.CommitNumber, traceName string) (string, error)

GetSource implements the tracestore.TraceStore interface.

func (*SQLTraceStore) GetTraceIDsBySource

func (s *SQLTraceStore) GetTraceIDsBySource(ctx context.Context, sourceFilename string, tileNumber types.TileNumber) ([]string, error)

GetTraceIDsBySource implements the tracestore.TraceStore interface.

func (*SQLTraceStore) OffsetFromCommitNumber

func (s *SQLTraceStore) OffsetFromCommitNumber(commitNumber types.CommitNumber) int32

OffsetFromCommitNumber implements the tracestore.TraceStore interface.

func (*SQLTraceStore) QueryTraces

func (s *SQLTraceStore) QueryTraces(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (types.TraceSet, []provider.Commit, error)

QueryTraces implements the tracestore.TraceStore interface.

func (*SQLTraceStore) QueryTracesIDOnly

func (s *SQLTraceStore) QueryTracesIDOnly(ctx context.Context, tileNumber types.TileNumber, q *query.Query) (<-chan paramtools.Params, error)

QueryTracesIDOnly implements the tracestore.TraceStore interface.

func (*SQLTraceStore) ReadTraces

func (s *SQLTraceStore) ReadTraces(ctx context.Context, tileNumber types.TileNumber, traceNames []string) (types.TraceSet, []provider.Commit, error)

ReadTraces implements the tracestore.TraceStore interface.

func (*SQLTraceStore) ReadTracesForCommitRange

func (s *SQLTraceStore) ReadTracesForCommitRange(ctx context.Context, traceNames []string, beginCommit types.CommitNumber, endCommit types.CommitNumber) (types.TraceSet, []provider.Commit, error)

ReadTracesForCommitRange implements the tracestore.TraceStore interface.

func (*SQLTraceStore) StartBackgroundMetricsGathering

func (s *SQLTraceStore) StartBackgroundMetricsGathering()

StartBackgroundMetricsGathering runs continuously in the background and gathers metrics related to param sets in the database.

func (*SQLTraceStore) TileNumber

func (s *SQLTraceStore) TileNumber(commitNumber types.CommitNumber) types.TileNumber

TileNumber implements the tracestore.TraceStore interface.

func (*SQLTraceStore) TileSize

func (s *SQLTraceStore) TileSize() int32

TileSize implements the tracestore.TraceStore interface.

func (*SQLTraceStore) TraceCount

func (s *SQLTraceStore) TraceCount(ctx context.Context, tileNumber types.TileNumber) (int64, error)

TraceCount implements the tracestore.TraceStore interface.

func (*SQLTraceStore) WriteTraces

func (s *SQLTraceStore) WriteTraces(ctx context.Context, commitNumber types.CommitNumber, params []paramtools.Params, values []float32, ps paramtools.ParamSet, source string, _ time.Time) error

WriteTraces implements the tracestore.TraceStore interface.
