Documentation ¶
Overview ¶
Package bq implements the export of datastore objects to bigquery.
See go/swarming/bq for more information.
Index ¶
Constants ¶
const (
	// TaskRequests exports model.TaskRequest to the `task_requests` table.
	TaskRequests = "task_requests"

	// BotEvents exports model.BotEvent to the `bot_events` table.
	BotEvents = "bot_events"

	// TaskRunResults exports model.TaskRunResults to the `task_results_run`
	// table.
	TaskRunResults = "task_results_run"

	// TaskResultSummaries exports model.TaskResultSummaries to the
	// `task_results_summary` table.
	TaskResultSummaries = "task_results_summary"
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AbstractFetcher ¶
type AbstractFetcher interface {
	// Descriptor is the proto descriptor of the BQ row protobuf message.
	Descriptor() *descriptorpb.DescriptorProto

	// Fetch fetches entities, converts them to BQ rows and flushes the result.
	Fetch(ctx context.Context, start time.Time, duration time.Duration, flushers []*Flusher) error
}
AbstractFetcher is implemented by all Fetcher[E, R].
func BotEventsFetcher ¶
func BotEventsFetcher() AbstractFetcher
BotEventsFetcher makes a fetcher that can produce BotEvent BQ rows.
func TaskRequestFetcher ¶
func TaskRequestFetcher() AbstractFetcher
TaskRequestFetcher makes a fetcher that can produce TaskRequest BQ rows.
func TaskResultSummariesFetcher ¶
func TaskResultSummariesFetcher() AbstractFetcher
TaskResultSummariesFetcher makes a fetcher that can produce TaskResult BQ rows.
func TaskRunResultsFetcher ¶
func TaskRunResultsFetcher() AbstractFetcher
TaskRunResultsFetcher makes a fetcher that can produce TaskResult BQ rows.
type ExportOp ¶
type ExportOp struct {
	BQClient    *managedwriter.Client   // the BigQuery client to use
	PSClient    *pubsub.PublisherClient // the PubSub client to use
	OperationID string                  // ID of this particular export operation
	TableID     string                  // full table name to write results into
	Topic       string                  // if set, export rows to this PubSub topic
	Fetcher     AbstractFetcher         // fetches data and converts it to [][]byte
	// contains filtered or unexported fields
}
ExportOp can fetch data from datastore and write it to BigQuery and PubSub.
It can fail with a transient error and be retried. On a retry it will attempt to finish the previous export (if possible).
May publish duplicate rows to PubSub on retries, since PubSub doesn't support committed writes in the same way BigQuery does.
type ExportSchedule ¶
type ExportSchedule struct {
	// ID is the BQ table name where events are exported to.
	//
	// It is one of the constants defined above, e.g. TaskRequests.
	ID string `gae:"$id"`

	// NextExport is a timestamp to start exporting events from.
	//
	// It is updated by scheduleExportTasks once it schedules TQ tasks that do
	// actual export.
	NextExport time.Time `gae:",noindex"`
	// contains filtered or unexported fields
}
ExportSchedule stores the per-table timestamp to start exporting events from.
type ExportState ¶
type ExportState struct {
	// ID is taskspb.ExportInterval.OperationId of the corresponding export task.
	//
	// This ID is derived based on the destination table name and the time
	// interval being exported.
	ID string `gae:"$id"`

	// WriteStreamName is the name of the BigQuery WriteStream being exported to.
	//
	// Used when restarting the export task after transient errors to check if
	// the stream is committed.
	WriteStreamName string `gae:",noindex"`

	// ExpireAt is a timestamp when this entity can be deleted.
	//
	// Used in the Cloud Datastore TTL policy.
	ExpireAt time.Time `gae:",noindex"`
	// contains filtered or unexported fields
}
ExportState stores the BQ WriteStream associated with an export task.
It exists only if the write stream is already finalized and either already committed or about to be committed. The purpose of this entity is to make sure that if an export task operation is retried, it doesn't accidentally export the same data again.
type Fetcher ¶
Fetcher knows how to fetch Datastore data that should be exported to BQ.
`E` is the entity struct to fetch (e.g. model.TaskResultSummary) and `R` is the proto message representing the row (e.g. *bqpb.TaskResult).
func (*Fetcher[E, R]) Descriptor ¶
func (f *Fetcher[E, R]) Descriptor() *descriptorpb.DescriptorProto
Descriptor is the proto descriptor of the row protobuf message.
func (*Fetcher[E, R]) Fetch ¶
func (f *Fetcher[E, R]) Fetch(ctx context.Context, start time.Time, duration time.Duration, flushers []*Flusher) error
Fetch fetches entities, converts them to BQ rows and flushes the result.
Visits entities in the range `[start, start+duration)`. Calls the flushers' `Flush` callbacks when it wants to send a batch of rows to BQ.
type Flusher ¶
type Flusher struct {
	// CountThreshold is how many rows to buffer before flushing them.
	CountThreshold int

	// ByteThreshold is how many bytes to buffer before flushing them.
	ByteThreshold int

	// Marshal converts a row proto to a raw byte message.
	Marshal func(proto.Message) ([]byte, error)

	// Flush flushes a batch of converted messages.
	Flush func([][]byte) error
	// contains filtered or unexported fields
}
Flusher knows how to serialize rows and flush a bunch of them.