Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// MaxDelaySeconds is the maximum number of seconds to randomly wait in
	// response to BigQuery errors.
	MaxDelaySeconds = 60

	// QueryRetries is the maximum number of times to retry a query.
	QueryRetries = 2

	// ErrCorrupt may be returned by a processor implementation if the file
	// content should be considered corrupt and not included in the output archive.
	ErrCorrupt = errors.New("file content is corrupt")
)
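The way MaxDelaySeconds and QueryRetries might interact can be sketched as a retry loop with a random wait between attempts. This is only an illustration, not the package's implementation: `queryFunc`, `sleepFunc`, `runWithRetries`, and `simulate` are hypothetical names, and the sketch assumes that "retry" means attempts made in addition to the first one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// Local stand-ins for the package-level variables documented above.
var (
	MaxDelaySeconds = 60
	QueryRetries    = 2
)

// queryFunc is a hypothetical stand-in for one query attempt.
type queryFunc func(ctx context.Context) error

// sleepFunc is a hypothetical hook so callers can replace the real wait.
type sleepFunc func(ctx context.Context, d time.Duration) error

// sleep waits for d or until ctx is cancelled, whichever comes first.
func sleep(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
	defer t.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-t.C:
		return nil
	}
}

// runWithRetries calls run once and, on error, retries up to QueryRetries
// more times (an assumption about what "retry" means). Before each retry it
// waits a random whole number of seconds in [0, MaxDelaySeconds).
// It assumes MaxDelaySeconds > 0, because rand.Intn panics otherwise.
func runWithRetries(ctx context.Context, run queryFunc, wait sleepFunc) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		if err = run(ctx); err == nil || attempts > QueryRetries {
			return attempts, err
		}
		d := time.Duration(rand.Intn(MaxDelaySeconds)) * time.Second
		if werr := wait(ctx, d); werr != nil {
			return attempts, werr
		}
	}
}

// simulate runs a query that fails the given number of times before
// succeeding, and skips the real wait so it returns immediately.
func simulate(failures int) (int, error) {
	calls := 0
	flaky := func(ctx context.Context) error {
		calls++
		if calls <= failures {
			return errors.New("transient query error")
		}
		return nil
	}
	noWait := func(ctx context.Context, d time.Duration) error { return nil }
	return runWithRetries(context.Background(), flaky, noWait)
}

func main() {
	attempts, err := simulate(2)
	fmt.Println(attempts, err)
}
```

In real use, `sleep` would be passed as the `wait` argument. Injecting it keeps the loop testable without waiting up to a minute per retry.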
Functions ¶
This section is empty.
Types ¶
type Manager ¶
type Manager[Row any] struct {
	Jobs              *jobs.Client
	Process           Processor[Row]
	QueryClient       query.Querier
	Query             string
	RetryQueryOnError bool
}
Manager uses a Processor to act on every result returned by the Querier. Its Row type parameter is the type of the query result rows, and the same type parameterizes its Processor.
func (*Manager[Row]) ProcessDate ¶
ProcessDate processes all archives found on a given date.
type Processor ¶
type Processor[Row any] interface {
	// Init sets up the processor for processing the given date, e.g. downloading daily databases.
	Init(ctx context.Context, date string)

	// Source creates a new archive source to read archive files to process.
	Source(ctx context.Context, row Row) *archive.Source

	// File processes the given file content. File should only return ErrCorrupt
	// if the content is corrupt. If the file content cannot be processed for other
	// reasons, then return the original data with no error.
	File(h *tar.Header, b []byte) ([]byte, error)

	// Finish concludes an archive after all files have been processed.
	Finish(ctx context.Context, out *archive.Target) error
}
A Processor is used by the process Manager to act on the content of every file of every row archive. Processor uses a type parameter for the specific query row type.
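The error contract of File is the subtle part of this interface, so a minimal sketch of just that method may help. It only illustrates the documented rule (ErrCorrupt for corrupt content, the original bytes with a nil error for anything else that cannot be processed). The `compactJSON` type, the `processOne` helper, the ".json" suffix check, and the choice to treat an empty file as corrupt are all assumptions made for the example. Init, Source, and Finish are omitted because they depend on the archive package.

```go
package main

import (
	"archive/tar"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// ErrCorrupt is a local stand-in for the package-level sentinel documented above.
var ErrCorrupt = errors.New("file content is corrupt")

// compactJSON is a hypothetical processor. Only File is sketched here.
type compactJSON struct{}

// File removes insignificant whitespace from JSON files.
func (compactJSON) File(h *tar.Header, b []byte) ([]byte, error) {
	if len(b) == 0 {
		// Assumption for this sketch: an empty file counts as corrupt.
		return nil, ErrCorrupt
	}
	if !strings.HasSuffix(h.Name, ".json") {
		// Not a file this processor handles: pass it through unchanged.
		return b, nil
	}
	var out bytes.Buffer
	if err := json.Compact(&out, b); err != nil {
		// Unprocessable but not treated as corrupt: return the
		// original data with no error, as the interface requires.
		return b, nil
	}
	return out.Bytes(), nil
}

// processOne is a hypothetical helper that builds a tar header for the
// given name and runs File on the content.
func processOne(name, content string) (string, error) {
	out, err := compactJSON{}.File(&tar.Header{Name: name}, []byte(content))
	return string(out), err
}

func main() {
	files := []struct{ name, content string }{
		{"a.json", "{ \"a\": 1 }"},
		{"b.json", ""},
		{"c.txt", "hello"},
		{"d.json", "{bad"},
	}
	for _, f := range files {
		out, err := processOne(f.name, f.content)
		fmt.Printf("%s: %q %v\n", f.name, out, err)
	}
}
```

Because File reports only ErrCorrupt as an error, a caller can compare against the sentinel with `errors.Is` to decide whether a file is left out of the output archive.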