bqx

package
v0.1.73
Warning

This package is not in the latest version of its module.
Published: Apr 24, 2024 License: Apache-2.0 Imports: 16 Imported by: 11

README

The bqx package

This package contains extensions for the bigquery library, to facilitate various common operations, notably query processing.

Documentation

Overview

Package bqx includes generally useful abstractions for simplifying interactions with bigquery. Production extensions should go here, but test facilities should go in a separate bqtest package.

Package bqx provides utilities and extensions for working with bigquery.

Index

Constants

This section is empty.

Variables

var (
	ErrInvalidProjectName = errors.New("Invalid project name")
	ErrInvalidDatasetName = errors.New("Invalid dataset name")
	ErrInvalidTableName   = errors.New("Invalid table name")
	ErrInvalidFQTable     = errors.New("Invalid fully qualified table name")
)

These errors are self-explanatory.

Functions

func Customize

func Customize(schema bigquery.Schema, subs map[string]bigquery.FieldSchema) bigquery.Schema

Customize recursively traverses a schema, substituting any fields that have a matching name in the provided map.
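
The substitution described above can be sketched with the standard library alone. This is not the bqx implementation: Field is a hypothetical stand-in for bigquery.FieldSchema, customize is an illustrative helper, and both matching on the bare field name at any depth and returning a copy rather than mutating the input are assumptions.

```go
package main

import "fmt"

// Field is a hypothetical stand-in for bigquery.FieldSchema, reduced to the
// members this sketch needs.
type Field struct {
	Name   string
	Type   string
	Schema []*Field // nested fields of a record; nil for leaf fields
}

// customize returns a copy of schema in which every field, at any depth,
// whose name is a key in subs is replaced by the mapped value.
// Illustrative only; the real Customize may differ in detail.
func customize(schema []*Field, subs map[string]Field) []*Field {
	if len(schema) == 0 {
		return nil
	}
	out := make([]*Field, 0, len(schema))
	for _, f := range schema {
		if sub, ok := subs[f.Name]; ok {
			s := sub // copy so each substituted field has its own address
			out = append(out, &s)
			continue
		}
		c := *f // shallow copy, then rebuild the nested schema
		c.Schema = customize(f.Schema, subs)
		out = append(out, &c)
	}
	return out
}

func main() {
	schema := []*Field{
		{Name: "meta", Type: "RECORD", Schema: []*Field{
			{Name: "ts", Type: "STRING"},
		}},
	}
	subs := map[string]Field{"ts": {Name: "ts", Type: "TIMESTAMP"}}

	out := customize(schema, subs)
	fmt.Println(out[0].Schema[0].Type)    // TIMESTAMP
	fmt.Println(schema[0].Schema[0].Type) // STRING (input left untouched)
}
```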

func CustomizeAppend

func CustomizeAppend(schema bigquery.Schema, additions map[string]*bigquery.FieldSchema) bigquery.Schema

CustomizeAppend recursively traverses a schema, appending the bigquery.FieldSchema to existing fields matching a name in the provided map.

func PrettyPrint

func PrettyPrint(schema bigquery.Schema, simplify bool) (string, error)

PrettyPrint generates a formatted JSON representation of a Schema. It simplifies the schema by removing zero-valued fields and compacting each field record onto a single line. It is intended for diagnostics and debugging, and is not suitable for production use.

func RemoveRequired

func RemoveRequired(schema bigquery.Schema) bigquery.Schema

RemoveRequired recursively traverses a schema, setting Required to false in all fields that are not fundamentally required by BigQuery.

func UpdateSchemaDescription

func UpdateSchemaDescription(schema bigquery.Schema, docs SchemaDoc) error

UpdateSchemaDescription walks each field in the given schema and assigns the Description field in place using values found in the given SchemaDoc.

func WalkSchema

func WalkSchema(schema bigquery.Schema, visit func(prefix []string, field *bigquery.FieldSchema) error) error

WalkSchema visits every FieldSchema object in the given schema by calling the visit function. The prefix is a path of field names from the top level to the current Field.

Types

type Dataset

type Dataset struct {
	*bigquery.Dataset // Exposes Dataset API directly.
	BqClient          *bigquery.Client
}

Dataset provides extensions to the bigquery Dataset object to streamline common actions. It encapsulates both the Client and the Dataset so that its methods need fewer arguments.

func NewDataset

func NewDataset(project, dataset string, clientOpts ...option.ClientOption) (Dataset, error)

NewDataset creates a Dataset for a project. Additional bigquery ClientOptions may optionally be passed as the final clientOpts arguments. This is useful for injecting a mock client or for testing credentials.

func (*Dataset) DestQuery

func (dsExt *Dataset) DestQuery(query string, dest *bigquery.Table, disposition bigquery.TableWriteDisposition) *bigquery.Query

DestQuery constructs a query with common Config settings for writing results to a table. If dest is nil, then this will create a DryRun query. TODO - should disposition be an opts... field instead?

func (*Dataset) ExecDestQuery

func (dsExt *Dataset) ExecDestQuery(q *bigquery.Query) (*bigquery.JobStatus, error)

ExecDestQuery executes a destination or dryrun query, and returns status or error.

func (Dataset) GetPartitionInfo

func (dsExt Dataset) GetPartitionInfo(table string, partition string) (PartitionInfo, error)

GetPartitionInfo provides basic information about a partition.

func (*Dataset) QueryAndParse

func (dsExt *Dataset) QueryAndParse(q string, structPtr interface{}) error

QueryAndParse executes a query that should return a single row, with all struct fields that match query columns filled in. The caller must pass in the *address* of an appropriate struct. TODO - extend this to also handle multirow results, by passing slice of structs.

func (*Dataset) ResultQuery

func (dsExt *Dataset) ResultQuery(query string, dryRun bool) *bigquery.Query

ResultQuery constructs a query with common QueryConfig settings for writing results to a table. Callers may generally need to change the WriteDisposition.

type PDT

type PDT struct {
	Project string `json:",omitempty"`
	Dataset string `json:",omitempty"`
	Table   string `json:",omitempty"`
}

PDT contains a bigquery project, dataset, and table name.
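
The `json:",omitempty"` tags keep the Go field names as JSON keys and drop empty fields from the output. The sketch below copies the PDT declaration locally so that it compiles without importing bqx; encode is a hypothetical helper, and the project and dataset names are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PDT is a local copy of the declaration above, reproduced so that this
// sketch is self-contained.
type PDT struct {
	Project string `json:",omitempty"`
	Dataset string `json:",omitempty"`
	Table   string `json:",omitempty"`
}

// encode returns the JSON form of p.
func encode(p PDT) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	// Table is empty, so it is omitted from the encoded object.
	s, err := encode(PDT{Project: "my-project", Dataset: "my_dataset"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"Project":"my-project","Dataset":"my_dataset"}
}
```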

func ParsePDT

func ParsePDT(fq string) (PDT, error)

ParsePDT parses and validates a fully qualified bigquery table name of the form project.dataset.table. None of the elements needs to exist, but all must conform to the corresponding naming restrictions.

func (PDT) CreateTable

func (pdt PDT) CreateTable(ctx context.Context, client *bigquery.Client, schema bigquery.Schema, description string,
	partitioning *bigquery.TimePartitioning, clustering *bigquery.Clustering) error

CreateTable creates a new table, or fails if the table already exists. It also sets the time-partitioning and clustering fields when the corresponding arguments are non-nil. It returns an error if the dataset does not already exist, or if other errors are encountered.

func (PDT) UpdateTable

func (pdt PDT) UpdateTable(ctx context.Context, client *bigquery.Client, schema bigquery.Schema,
	partitioning *bigquery.TimePartitioning) error

UpdateTable will update an existing table. Returns error if the table doesn't already exist, or if the schema changes are incompatible.

type PartitionInfo

type PartitionInfo struct {
	PartitionID  string
	CreationTime time.Time
	LastModified time.Time
}

PartitionInfo provides basic information about a partition.

type SchemaDoc

type SchemaDoc map[string]map[string]string

SchemaDoc contains bigquery.Schema field Descriptions as read from an auxiliary source, such as YAML.

func NewSchemaDoc

func NewSchemaDoc(docs []byte) SchemaDoc

NewSchemaDoc attempts to parse the given bytes (for example, the contents of a YAML file) as a SchemaDoc. Errors are fatal.
