bigquery

v0.1.0
Published: Aug 3, 2016 License: Apache-2.0 Imports: 11 Imported by: 0

Overview

Package bigquery provides a client for the BigQuery service.

Note: This package is a work-in-progress. Backwards-incompatible changes should be expected.

Constants

const (
	BatchPriority       = "BATCH"
	InteractivePriority = "INTERACTIVE"
)

const Scope = "https://www.googleapis.com/auth/bigquery"

Variables

This section is empty.

Functions

This section is empty.

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client may be used to perform BigQuery operations.

func NewClient

func NewClient(ctx context.Context, projectID string, opts ...option.ClientOption) (*Client, error)

NewClient constructs a new Client which can perform BigQuery operations. Operations performed via the client are billed to the specified GCP project.

func (*Client) Copy

func (c *Client) Copy(ctx context.Context, dst Destination, src Source, options ...Option) (*Job, error)

Copy starts a BigQuery operation to copy data from a Source to a Destination.

func (*Client) CreateTable

func (c *Client) CreateTable(ctx context.Context, projectID, datasetID, tableID string, options ...CreateTableOption) (*Table, error)

CreateTable creates a table in the BigQuery service and returns a handle to it.

func (*Client) Dataset

func (c *Client) Dataset(id string) *Dataset

func (*Client) JobFromID

func (c *Client) JobFromID(ctx context.Context, id string) (*Job, error)

JobFromID creates a Job which refers to an existing BigQuery job. The job need not have been created by this package. For example, the job may have been created in the BigQuery console.

func (*Client) NewGCSReference

func (c *Client) NewGCSReference(uri ...string) *GCSReference

NewGCSReference constructs a reference to one or more Google Cloud Storage objects, which together constitute a data source or destination. In the simple case, a single URI in the form gs://bucket/object refers to a single GCS object. Data may also be split across multiple files, if multiple URIs or URIs containing wildcards are provided. Each URI may contain one '*' wildcard character, which (if present) must come after the bucket name. For more information about the treatment of wildcards and multiple URIs, see https://cloud.google.com/bigquery/exporting-data-from-bigquery#exportingmultiple
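The URI rules above can be checked before a reference is constructed. The following is a minimal sketch, not part of this package: validateGCSURI is a hypothetical helper, and it assumes that a URI must name both a bucket and an object.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateGCSURI is a hypothetical helper, not part of the bigquery package.
// It checks the rules described for NewGCSReference: the URI has the form
// gs://bucket/object, holds at most one '*', and any '*' follows the bucket.
func validateGCSURI(uri string) error {
	const prefix = "gs://"
	if !strings.HasPrefix(uri, prefix) {
		return errors.New("URI must start with gs://")
	}
	rest := strings.TrimPrefix(uri, prefix)
	slash := strings.Index(rest, "/")
	if slash <= 0 || slash == len(rest)-1 {
		// Assumption: both a bucket and an object name are required.
		return errors.New("URI must name both a bucket and an object")
	}
	bucket, object := rest[:slash], rest[slash+1:]
	if strings.Contains(bucket, "*") {
		return errors.New("wildcard must come after the bucket name")
	}
	if strings.Count(object, "*") > 1 {
		return errors.New("URI may contain at most one wildcard")
	}
	return nil
}

func main() {
	for _, uri := range []string{
		"gs://my-bucket/data.csv",
		"gs://my-bucket/shard-*.csv",
		"gs://my-*/data.csv",
		"gs://my-bucket/a*/b*.csv",
	} {
		fmt.Printf("%-28s %v\n", uri, validateGCSURI(uri))
	}
}
```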

func (*Client) OpenTable

func (c *Client) OpenTable(projectID, datasetID, tableID string) *Table

OpenTable creates a handle to an existing BigQuery table. If the table does not already exist, subsequent uses of the *Table will fail.

func (*Client) Read

func (c *Client) Read(ctx context.Context, src ReadSource, options ...ReadOption) (*Iterator, error)

Read fetches data from a ReadSource and returns the data via an Iterator.

type Compression

type Compression string

Compression is the type of compression to apply when writing data to Google Cloud Storage.

const (
	None Compression = "NONE"
	Gzip Compression = "GZIP"
)

type CreateTableOption

type CreateTableOption interface {
	// contains filtered or unexported methods
}

A CreateTableOption is an optional argument to CreateTable.

func TableExpiration

func TableExpiration(exp time.Time) CreateTableOption

TableExpiration returns a CreateTableOption that will cause the created table to be deleted after the expiration time.

func ViewQuery

func ViewQuery(query string) CreateTableOption

ViewQuery returns a CreateTableOption that causes the created table to be a virtual table defined by the supplied query. For more information see: https://cloud.google.com/bigquery/querying-data#views

type DataFormat

type DataFormat string

const (
	CSV             DataFormat = "CSV"
	Avro            DataFormat = "AVRO"
	JSON            DataFormat = "NEWLINE_DELIMITED_JSON"
	DatastoreBackup DataFormat = "DATASTORE_BACKUP"
)

type Dataset

type Dataset struct {
	// contains filtered or unexported fields
}

Dataset is a reference to a BigQuery dataset.

func (*Dataset) Create

func (d *Dataset) Create(ctx context.Context) error

func (*Dataset) ListTables

func (d *Dataset) ListTables(ctx context.Context) ([]*Table, error)

ListTables returns a list of all the tables contained in the Dataset.

type Destination

type Destination interface {
	// contains filtered or unexported methods
}

A Destination is a destination of data for the Copy function.

type Encoding

type Encoding string

Encoding specifies the character encoding of data to be loaded into BigQuery. See https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.encoding for more details about how this is used.

const (
	UTF_8      Encoding = "UTF-8"
	ISO_8859_1 Encoding = "ISO-8859-1"
)

type Error

type Error struct {
	// Mirrors bq.ErrorProto, but drops DebugInfo
	Location, Message, Reason string
}

An Error contains detailed information about a failed BigQuery operation.

func (Error) Error

func (e Error) Error() string

type FieldSchema

type FieldSchema struct {
	// The field name.
	// Must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_),
	// and must start with a letter or underscore.
	// The maximum length is 128 characters.
	Name string

	// A description of the field. The maximum length is 16,384 characters.
	Description string

	// Whether the field may contain multiple values.
	Repeated bool
	// Whether the field is required.  Ignored if Repeated is true.
	Required bool

	// The field data type.  If Type is Record, then this field contains a nested schema,
	// which is described by Schema.
	Type FieldType
	// Describes the nested schema if Type is set to Record.
	Schema Schema
}
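The naming constraints in the Name comment can be expressed as a small check. This is a sketch only: validFieldName and fieldNameRE are hypothetical names, and the package itself may validate names differently, or leave validation to the service.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fieldNameRE encodes the documented character rules: letters, digits and
// underscores only, starting with a letter or underscore.
var fieldNameRE = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// validFieldName is a hypothetical helper, not part of the bigquery package.
// It also enforces the documented maximum length of 128 characters.
func validFieldName(name string) bool {
	return len(name) <= 128 && fieldNameRE.MatchString(name)
}

func main() {
	names := []string{"user_id", "_tmp", "2fast", "has-dash", strings.Repeat("a", 129)}
	for _, n := range names {
		// Show only a prefix of long names to keep the output readable.
		shown := n
		if len(shown) > 12 {
			shown = shown[:12] + "..."
		}
		fmt.Printf("%-16s %v\n", shown, validFieldName(n))
	}
}
```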

type FieldType

type FieldType string

const (
	StringFieldType    FieldType = "STRING"
	IntegerFieldType   FieldType = "INTEGER"
	FloatFieldType     FieldType = "FLOAT"
	BooleanFieldType   FieldType = "BOOLEAN"
	TimestampFieldType FieldType = "TIMESTAMP"
	RecordFieldType    FieldType = "RECORD"
)

type GCSReference

type GCSReference struct {

	// FieldDelimiter is the separator for fields in a CSV file, used when loading or exporting data.
	// The default is ",".
	FieldDelimiter string

	// The number of rows at the top of a CSV file that BigQuery will skip when loading the data.
	SkipLeadingRows int64

	// SourceFormat is the format of the GCS data to be loaded into BigQuery.
	// Allowed values are: CSV, JSON, DatastoreBackup.  The default is CSV.
	SourceFormat DataFormat
	// Only used when loading data.
	Encoding Encoding

	// Quote is the value used to quote data sections in a CSV file.
	// The default quotation character is the double quote ("), which is used if both Quote and ForceZeroQuote are unset.
	// To specify that no character should be interpreted as a quotation character, set ForceZeroQuote to true.
	// Only used when loading data.
	Quote          string
	ForceZeroQuote bool

	// DestinationFormat is the format to use when writing exported files.
	// Allowed values are: CSV, Avro, JSON.  The default is CSV.
	// CSV is not supported for tables with nested or repeated fields.
	DestinationFormat DataFormat
	// Only used when writing data.  Default is None.
	Compression Compression
	// contains filtered or unexported fields
}

GCSReference is a reference to one or more Google Cloud Storage objects, which together constitute an input or output to a BigQuery operation.
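The Quote and ForceZeroQuote fields work as a pair because an empty Quote string cannot be told apart from an unset one in Go. The sketch below shows one way the effective quote character could be resolved. csvQuoting and effectiveQuote are hypothetical names, and letting ForceZeroQuote win when both fields are set is an assumption, since the comments above do not cover that case.

```go
package main

import "fmt"

// csvQuoting mirrors only the two quoting fields of GCSReference.
type csvQuoting struct {
	Quote          string
	ForceZeroQuote bool
}

// effectiveQuote returns the quote character implied by the two fields.
// Assumption: ForceZeroQuote takes precedence if Quote is also set.
func effectiveQuote(q csvQuoting) string {
	switch {
	case q.ForceZeroQuote:
		return "" // no character is treated as a quote
	case q.Quote != "":
		return q.Quote
	default:
		return `"` // documented default when both fields are unset
	}
}

func main() {
	fmt.Printf("%q\n", effectiveQuote(csvQuoting{}))
	fmt.Printf("%q\n", effectiveQuote(csvQuoting{Quote: "'"}))
	fmt.Printf("%q\n", effectiveQuote(csvQuoting{ForceZeroQuote: true}))
}
```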

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator provides access to the result of a BigQuery lookup. Next must be called before the first call to Get.

func (*Iterator) Err

func (it *Iterator) Err() error

Err returns the last error encountered by Next, or nil for no error.

func (*Iterator) Get

func (it *Iterator) Get(dst interface{}) error

Get loads the current row into dst, which must implement ValueLoader.

func (*Iterator) Next

func (it *Iterator) Next(ctx context.Context) bool

Next advances the Iterator to the next row, making that row available via the Get method. Next must be called before the first call to Get or Schema, and blocks until data is available. Next returns false when there are no more rows available, either because the end of the output was reached, or because there was an error (consult the Err method to determine which).
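The Next, Get and Err methods combine into a single loop shape: call Next until it returns false, call Get inside the loop, then consult Err once. The sketch below demonstrates that shape against fakeIterator, a hypothetical in-memory stand-in, so that it runs without a BigQuery connection. Its Get takes a *string rather than a ValueLoader purely for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// fakeIterator is a hypothetical stand-in for Iterator with the same
// Next/Get/Err method shape. It serves rows from memory.
type fakeIterator struct {
	rows    []string
	failAt  int // row index at which Next reports an error; -1 for never
	current int // index of the current row; -1 before the first Next
	err     error
}

func newFakeIterator(rows []string, failAt int) *fakeIterator {
	return &fakeIterator{rows: rows, failAt: failAt, current: -1}
}

func (it *fakeIterator) Next(ctx context.Context) bool {
	if it.err != nil {
		return false
	}
	if err := ctx.Err(); err != nil {
		it.err = err
		return false
	}
	next := it.current + 1
	if next == it.failAt {
		it.err = errors.New("simulated fetch failure")
		return false
	}
	if next >= len(it.rows) {
		return false // end of output, Err stays nil
	}
	it.current = next
	return true
}

func (it *fakeIterator) Get(dst *string) error {
	if it.current < 0 {
		return errors.New("Get called before Next")
	}
	*dst = it.rows[it.current]
	return nil
}

func (it *fakeIterator) Err() error { return it.err }

// readAll collects rows until Next returns false, then uses Err to tell
// a clean end of output apart from a failure.
func readAll(ctx context.Context, it *fakeIterator) ([]string, error) {
	var out []string
	for it.Next(ctx) {
		var row string
		if err := it.Get(&row); err != nil {
			return out, err
		}
		out = append(out, row)
	}
	return out, it.Err()
}

// run is a convenience wrapper used by main.
func run(rows []string, failAt int) ([]string, error) {
	return readAll(context.Background(), newFakeIterator(rows, failAt))
}

func main() {
	fmt.Println(run([]string{"a", "b", "c"}, -1))
	fmt.Println(run([]string{"a", "b", "c"}, 1))
}
```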

func (*Iterator) Schema

func (it *Iterator) Schema() (Schema, error)

Schema returns the schema of the result rows.

type Job

type Job struct {
	// contains filtered or unexported fields
}

A Job represents an operation which has been submitted to BigQuery for processing.

func (*Job) ID

func (j *Job) ID() string

func (*Job) Status

func (j *Job) Status(ctx context.Context) (*JobStatus, error)

Status returns the current status of the job. It fails if the Status could not be determined.

type JobStatus

type JobStatus struct {
	State State

	// All errors encountered during the running of the job.
	// Not all Errors are fatal, so errors here do not necessarily mean that the job has completed or was unsuccessful.
	Errors []*Error
	// contains filtered or unexported fields
}

JobStatus contains the current State of a job, and errors encountered while processing that job.

func (*JobStatus) Done

func (s *JobStatus) Done() bool

Done reports whether the job has completed. After Done returns true, the Err method will return an error if the job completed unsuccessfully.

func (*JobStatus) Err

func (s *JobStatus) Err() error

Err returns the error that caused the job to complete unsuccessfully (if any).
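Since Status reports a snapshot, callers typically poll until Done returns true and only then consult Err. The sketch below shows one possible polling loop. The jobStatus interface, fakeStatus, waitForJob and simulate are all hypothetical names, and the fetch function stands in for a call to (*Job).Status.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// jobStatus is the subset of *JobStatus behaviour the loop relies on.
type jobStatus interface {
	Done() bool
	Err() error
}

// fakeStatus is a hypothetical stand-in for *JobStatus.
type fakeStatus struct {
	done bool
	err  error
}

func (s fakeStatus) Done() bool { return s.done }
func (s fakeStatus) Err() error { return s.err }

var errJobFailed = errors.New("job failed")

// waitForJob polls fetch until the status reports Done, then returns the
// job's error (nil on success). It stops early if ctx is cancelled.
func waitForJob(ctx context.Context, fetch func(context.Context) (jobStatus, error), interval time.Duration) error {
	for {
		st, err := fetch(ctx)
		if err != nil {
			return err // the status itself could not be determined
		}
		if st.Done() {
			return st.Err()
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(interval):
		}
	}
}

// simulate runs waitForJob against a fake job that completes on the nth poll.
func simulate(n int, final error) (polls int, err error) {
	fetch := func(context.Context) (jobStatus, error) {
		polls++
		if polls < n {
			return fakeStatus{}, nil
		}
		return fakeStatus{done: true, err: final}, nil
	}
	err = waitForJob(context.Background(), fetch, time.Millisecond)
	return polls, err
}

func main() {
	fmt.Println(simulate(3, nil))
	fmt.Println(simulate(2, errJobFailed))
}
```

A fixed interval keeps the sketch short. A production loop would more likely back off between polls.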

type MultiError

type MultiError []error

A MultiError contains multiple related errors.

func (MultiError) Error

func (m MultiError) Error() string

type Option

type Option interface {
	// contains filtered or unexported methods
}

An Option is an optional argument to Copy.

func AllowJaggedRows

func AllowJaggedRows() Option

AllowJaggedRows returns an Option that causes missing trailing optional columns to be tolerated in CSV data. Missing values are treated as nulls.

func AllowLargeResults

func AllowLargeResults() Option

AllowLargeResults returns an Option that allows the query to produce arbitrarily large result tables. The destination must be a table. When using this option, queries will take longer to execute, even if the result set is small. For additional limitations, see https://cloud.google.com/bigquery/querying-data#largequeryresults

func AllowQuotedNewlines

func AllowQuotedNewlines() Option

AllowQuotedNewlines returns an Option that allows quoted data sections containing newlines in CSV data.

func CreateDisposition

func CreateDisposition(disp TableCreateDisposition) Option

func DestinationSchema

func DestinationSchema(schema Schema) Option

DestinationSchema returns an Option that specifies the schema to use when loading data into a new table. A DestinationSchema Option must be supplied when loading data from Google Cloud Storage into a non-existent table. Caveat: DestinationSchema is not required if the data being loaded is a datastore backup. schema must not be nil.

func DisableFlattenedResults

func DisableFlattenedResults() Option

DisableFlattenedResults returns an Option that prevents results being flattened. If this Option is not used, results from nested and repeated fields are flattened. DisableFlattenedResults implies AllowLargeResults. For more information, see https://cloud.google.com/bigquery/docs/data#nested

func DisableHeader

func DisableHeader() Option

DisableHeader returns an Option that disables the printing of a header row in exported data.

func DisableQueryCache

func DisableQueryCache() Option

DisableQueryCache returns an Option that prevents results being fetched from the query cache. If this Option is not used, results are fetched from the cache if they are available. The query cache is a best-effort cache that is flushed whenever tables in the query are modified. Cached results are only available when TableID is unspecified in the query's destination Table. For more information, see https://cloud.google.com/bigquery/querying-data#querycaching

func IgnoreUnknownValues

func IgnoreUnknownValues() Option

IgnoreUnknownValues returns an Option that causes values not matching the schema to be tolerated. Unknown values are ignored. For CSV this ignores extra values at the end of a line. For JSON this ignores named values that do not match any column name. If this Option is not used, records containing unknown values are treated as bad records. The MaxBadRecords Option can be used to customize how bad records are handled.

func JobID

func JobID(ID string) Option

JobID returns an Option that sets the job ID of a BigQuery job. If this Option is not used, a job ID is generated automatically.

func JobPriority

func JobPriority(priority string) Option

JobPriority returns an Option that causes a query to be scheduled with the specified priority. The default priority is InteractivePriority. For more information, see https://cloud.google.com/bigquery/querying-data#batchqueries

func MaxBadRecords

func MaxBadRecords(n int64) Option

MaxBadRecords returns an Option that sets the maximum number of bad records that will be ignored. If this maximum is exceeded, the operation will be unsuccessful.

func WriteDisposition

func WriteDisposition(disp TableWriteDisposition) Option

type PutMultiError

type PutMultiError []RowInsertionError

PutMultiError contains an error for each row which was not successfully inserted into a BigQuery table.

func (PutMultiError) Error

func (pme PutMultiError) Error() string

type Query

type Query struct {
	// The query to execute. See https://cloud.google.com/bigquery/query-reference for details.
	Q string

	// DefaultProjectID and DefaultDatasetID specify the dataset to use for unqualified table names in the query.
	// If DefaultProjectID is set, DefaultDatasetID must also be set.
	DefaultProjectID string
	DefaultDatasetID string
}

Query represents a query to be executed.

type ReadOption

type ReadOption interface {
	// contains filtered or unexported methods
}

A ReadOption is an optional argument to Read.

func RecordsPerRequest

func RecordsPerRequest(n int64) ReadOption

RecordsPerRequest returns a ReadOption that sets the number of records to fetch per request when streaming data from BigQuery.

func StartIndex

func StartIndex(i uint64) ReadOption

StartIndex returns a ReadOption that sets the zero-based index of the row to start reading from.

type ReadSource

type ReadSource interface {
	// contains filtered or unexported methods
}

A ReadSource is a source of data for the Read function.

type RowInsertionError

type RowInsertionError struct {
	InsertID string // The InsertID associated with the affected row.
	RowIndex int    // The 0-based index of the affected row in the batch of rows being inserted.
	Errors   MultiError
}

RowInsertionError contains all errors that occurred when attempting to insert a row.

func (*RowInsertionError) Error

func (e *RowInsertionError) Error() string

type Schema

type Schema []*FieldSchema

Schema describes the fields in a table or query result.

func InferSchema

func InferSchema(st interface{}) (Schema, error)

InferSchema tries to derive a BigQuery schema from the supplied struct value. NOTE: All fields in the returned Schema are configured to be required, unless the corresponding field in the supplied struct is a slice or array. InferSchema returns an error if the struct (including nested structs) contains any exported fields that are pointers or one of the following types: map, interface, complex64, complex128, func, chan. Future versions may handle these cases without error.
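To illustrate the rules above, here is a simplified reflection-based sketch. It is not the package's implementation: field, scalarType and inferFlat are hypothetical names, the mapping from Go kinds to BigQuery type names is an assumption, and nested structs (other than time.Time) are rejected to keep the example short.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// field is a cut-down stand-in for FieldSchema.
type field struct {
	Name     string
	Type     string
	Repeated bool
	Required bool
}

var timeType = reflect.TypeOf(time.Time{})

// scalarType maps a Go type to a BigQuery type name (assumed mapping).
func scalarType(t reflect.Type) (string, bool) {
	if t == timeType {
		return "TIMESTAMP", true
	}
	switch t.Kind() {
	case reflect.String:
		return "STRING", true
	case reflect.Bool:
		return "BOOLEAN", true
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return "INTEGER", true
	case reflect.Float32, reflect.Float64:
		return "FLOAT", true
	}
	return "", false
}

// inferFlat derives fields from the exported fields of a flat struct.
// Slices and arrays become repeated fields; everything else is required.
func inferFlat(st interface{}) ([]field, error) {
	t := reflect.TypeOf(st)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("want a struct value, got %T", st)
	}
	var out []field
	for i := 0; i < t.NumField(); i++ {
		sf := t.Field(i)
		if sf.PkgPath != "" {
			continue // unexported fields are skipped
		}
		ft, repeated := sf.Type, false
		if ft.Kind() == reflect.Slice || ft.Kind() == reflect.Array {
			ft, repeated = ft.Elem(), true
		}
		name, ok := scalarType(ft)
		if !ok {
			return nil, fmt.Errorf("field %s: unsupported type %s", sf.Name, sf.Type)
		}
		out = append(out, field{Name: sf.Name, Type: name, Repeated: repeated, Required: !repeated})
	}
	return out, nil
}

// event is a hypothetical application type.
type event struct {
	Name  string
	When  time.Time
	Tags  []string
	notes string
}

func main() {
	fields, err := inferFlat(event{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range fields {
		fmt.Printf("%+v\n", f)
	}
}
```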

type Source

type Source interface {
	// contains filtered or unexported methods
}

A Source is a source of data for the Copy function.

type State

type State int

State is one of a sequence of states that a Job progresses through as it is processed.

const (
	Pending State = iota
	Running
	Done
)
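Because the constants are declared with iota, they take the values 0, 1 and 2 in order, so the zero value of State is Pending and states can be compared with the usual ordering operators. The sketch below reproduces the declaration locally to show this. The String method is a hypothetical addition for readable output and is not declared in the listing above.

```go
package main

import "fmt"

// State and its constants are copied from the declaration above.
type State int

const (
	Pending State = iota
	Running
	Done
)

// String is a hypothetical convenience method, added here for printing only.
func (s State) String() string {
	switch s {
	case Pending:
		return "Pending"
	case Running:
		return "Running"
	case Done:
		return "Done"
	}
	return fmt.Sprintf("State(%d)", int(s))
}

func main() {
	for s := Pending; s <= Done; s++ {
		fmt.Printf("%d %v\n", int(s), s)
	}
	var zero State
	fmt.Println(zero == Pending, Running < Done)
}
```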

type Table

type Table struct {
	// ProjectID, DatasetID and TableID may be omitted if the Table is the destination for a query.
	// In this case the result will be stored in an ephemeral table.
	ProjectID string
	DatasetID string
	// TableID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_).
	// The maximum length is 1,024 characters.
	TableID string
	// contains filtered or unexported fields
}

A Table is a reference to a BigQuery table.

func (*Table) Delete

func (t *Table) Delete(ctx context.Context) error

Delete deletes the table.

func (*Table) FullyQualifiedName

func (t *Table) FullyQualifiedName() string

FullyQualifiedName returns the ID of the table in projectID:datasetID.tableID format.
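As an illustration of the projectID:datasetID.tableID format, the following sketch builds the same string from a local struct. tableRef and fullyQualifiedName are hypothetical names used so the example runs without the package.

```go
package main

import "fmt"

// tableRef mirrors the three exported identifier fields of Table.
type tableRef struct {
	ProjectID string
	DatasetID string
	TableID   string
}

// fullyQualifiedName formats the reference as projectID:datasetID.tableID.
func fullyQualifiedName(t tableRef) string {
	return fmt.Sprintf("%s:%s.%s", t.ProjectID, t.DatasetID, t.TableID)
}

func main() {
	t := tableRef{ProjectID: "my-project", DatasetID: "my_dataset", TableID: "my_table"}
	fmt.Println(fullyQualifiedName(t)) // prints my-project:my_dataset.my_table
}
```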

func (*Table) Metadata

func (t *Table) Metadata(ctx context.Context) (*TableMetadata, error)

Metadata fetches the metadata for the table.

func (*Table) NewUploader

func (t *Table) NewUploader(opts ...UploadOption) *Uploader

NewUploader returns an *Uploader that can be used to append rows to t.

func (*Table) Patch

func (t *Table) Patch() *TableMetadataPatch

Patch returns a *TableMetadataPatch, which can be used to modify specific Table metadata fields. In order to apply the changes, the TableMetadataPatch's Apply method must be called.

type TableCreateDisposition

type TableCreateDisposition string

TableCreateDisposition specifies the circumstances under which the destination table will be created. Default is CreateIfNeeded.

const (
	// The table will be created if it does not already exist.  Tables are created atomically on successful completion of a job.
	CreateIfNeeded TableCreateDisposition = "CREATE_IF_NEEDED"

	// The table must already exist and will not be automatically created.
	CreateNever TableCreateDisposition = "CREATE_NEVER"
)

type TableMetadata

type TableMetadata struct {
	Description string // The user-friendly description of this table.
	Name        string // The user-friendly name for this table.
	Schema      Schema
	View        string

	ID   string // An opaque ID uniquely identifying the table.
	Type TableType

	// The time when this table expires. If not set, the table will persist
	// indefinitely. Expired tables will be deleted and their storage reclaimed.
	ExpirationTime time.Time

	CreationTime     time.Time
	LastModifiedTime time.Time

	// The size of the table in bytes.
	// This does not include data that is being buffered during a streaming insert.
	NumBytes int64

	// The number of rows of data in this table.
	// This does not include data that is being buffered during a streaming insert.
	NumRows uint64
}

TableMetadata contains information about a BigQuery table.

type TableMetadataPatch

type TableMetadataPatch struct {
	// contains filtered or unexported fields
}

TableMetadataPatch represents a set of changes to a table's metadata.

func (*TableMetadataPatch) Apply

Apply applies the patch operation.

func (*TableMetadataPatch) Description

func (p *TableMetadataPatch) Description(desc string)

Description sets the table description.

func (*TableMetadataPatch) Name

func (p *TableMetadataPatch) Name(name string)

Name sets the table name.

type TableType

type TableType string

TableType is the type of table.

const (
	RegularTable TableType = "TABLE"
	ViewTable    TableType = "VIEW"
)

type TableWriteDisposition

type TableWriteDisposition string

TableWriteDisposition specifies how existing data in a destination table is treated. Default is WriteAppend.

const (
	// Data will be appended to any existing data in the destination table.
	// Data is appended atomically on successful completion of a job.
	WriteAppend TableWriteDisposition = "WRITE_APPEND"

	// Existing data in the destination table will be overwritten.
	// Data is overwritten atomically on successful completion of a job.
	WriteTruncate TableWriteDisposition = "WRITE_TRUNCATE"

	// Writes will fail if the destination table already contains data.
	WriteEmpty TableWriteDisposition = "WRITE_EMPTY"
)
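The three write dispositions can be modelled on an in-memory slice of rows. This is only an illustration of the documented semantics: applyWrite, writeDisposition and errNotEmpty are hypothetical names, and the real behaviour is implemented by the BigQuery service, not by client code like this.

```go
package main

import (
	"errors"
	"fmt"
)

// writeDisposition is a local copy of TableWriteDisposition's string values.
type writeDisposition string

const (
	writeAppend   writeDisposition = "WRITE_APPEND"
	writeTruncate writeDisposition = "WRITE_TRUNCATE"
	writeEmpty    writeDisposition = "WRITE_EMPTY"
)

var errNotEmpty = errors.New("destination table already contains data")

// applyWrite returns the rows a table would hold after a successful write,
// or an error and the unchanged rows if the write is refused.
func applyWrite(existing, incoming []string, disp writeDisposition) ([]string, error) {
	switch disp {
	case writeAppend, "": // WriteAppend is documented as the default
		return append(append([]string{}, existing...), incoming...), nil
	case writeTruncate:
		return append([]string{}, incoming...), nil
	case writeEmpty:
		if len(existing) > 0 {
			return existing, errNotEmpty
		}
		return append([]string{}, incoming...), nil
	}
	return existing, fmt.Errorf("unknown write disposition %q", disp)
}

func main() {
	existing := []string{"row1", "row2"}
	incoming := []string{"row3"}
	for _, d := range []writeDisposition{writeAppend, writeTruncate, writeEmpty} {
		rows, err := applyWrite(existing, incoming, d)
		fmt.Println(d, rows, err)
	}
}
```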

type Tables

type Tables []*Table

Tables is a group of tables. The tables may belong to differing projects or datasets.

type UploadOption

type UploadOption interface {
	// contains filtered or unexported methods
}

An UploadOption is an optional argument to NewUploader.

func SkipInvalidRows

func SkipInvalidRows() UploadOption

SkipInvalidRows returns an UploadOption that causes rows containing invalid data to be silently ignored. If this option is not used, an attempt to insert an invalid row causes the entire request to fail.

func TableTemplateSuffix

func TableTemplateSuffix(suffix string) UploadOption

A TableTemplateSuffix allows Uploaders to create tables automatically.

Experimental: this option is experimental and may be modified or removed in future versions, regardless of any other documented package stability guarantees.

When you specify a suffix, the table you upload data to will be used as a template for creating a new table, with the same schema, called <table> + <suffix>.

More information is available at https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables

func UploadIgnoreUnknownValues

func UploadIgnoreUnknownValues() UploadOption

UploadIgnoreUnknownValues returns an UploadOption that causes values not matching the schema to be ignored. If this option is not used, records containing such values are treated as invalid records.

type Uploader

type Uploader struct {
	// contains filtered or unexported fields
}

An Uploader does streaming inserts into a BigQuery table. It is safe for concurrent use.

func (*Uploader) Put

func (u *Uploader) Put(ctx context.Context, src interface{}) error

Put uploads one or more rows to the BigQuery service. src must implement ValueSaver or be a slice of ValueSavers. Put returns a PutMultiError if one or more rows failed to be uploaded. The PutMultiError contains a RowInsertionError for each failed row.
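A caller can recover the per-row details with a type assertion on the returned error. The sketch below uses local copies of the three error types so that it runs on its own. The Error message formats, fakePut and failedRows are hypothetical, and the assertion assumes the error is returned as a PutMultiError value rather than a pointer.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local copies of the package's error types; message formats are invented.
type MultiError []error

func (m MultiError) Error() string {
	msgs := make([]string, len(m))
	for i, e := range m {
		msgs[i] = e.Error()
	}
	return strings.Join(msgs, "; ")
}

type RowInsertionError struct {
	InsertID string
	RowIndex int
	Errors   MultiError
}

func (e *RowInsertionError) Error() string {
	return fmt.Sprintf("row %d: %v", e.RowIndex, e.Errors)
}

type PutMultiError []RowInsertionError

func (pme PutMultiError) Error() string {
	return fmt.Sprintf("%d row insertions failed", len(pme))
}

// fakePut is a hypothetical stand-in for Uploader.Put that rejects empty rows.
func fakePut(rows []string) error {
	var pme PutMultiError
	for i, r := range rows {
		if r == "" {
			pme = append(pme, RowInsertionError{RowIndex: i, Errors: MultiError{errors.New("empty row")}})
		}
	}
	if len(pme) > 0 {
		return pme
	}
	return nil // return an untyped nil so callers can compare against nil
}

// failedRows extracts the indexes of failed rows, or reports false if err
// is not a PutMultiError.
func failedRows(err error) ([]int, bool) {
	pme, ok := err.(PutMultiError)
	if !ok {
		return nil, false
	}
	idx := make([]int, 0, len(pme))
	for _, rie := range pme {
		idx = append(idx, rie.RowIndex)
	}
	return idx, true
}

func main() {
	err := fakePut([]string{"a", "", "c", ""})
	if idx, ok := failedRows(err); ok {
		fmt.Println("failed rows:", idx, "error:", err)
	}
}
```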

type Value

type Value interface{}

Value stores the contents of a single cell from a BigQuery result.

type ValueList

type ValueList []Value

ValueList converts a []Value to implement ValueLoader.

func (*ValueList) Load

func (vs *ValueList) Load(v []Value) error

Load stores a sequence of values in a ValueList.

type ValueLoader

type ValueLoader interface {
	Load(v []Value) error
}

ValueLoader stores a slice of Values representing a result row from a Read operation. See Iterator.Get for more information.

type ValueSaver

type ValueSaver interface {
	// Save returns a row to be inserted into a BigQuery table, represented
	// as a map from field name to Value.
	// If insertID is non-empty, BigQuery will use it to de-duplicate
	// insertions of this row on a best-effort basis.
	Save() (row map[string]Value, insertID string, err error)
}
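Any type with a matching Save method satisfies ValueSaver. The following minimal sketch implements it for a hypothetical application type named visit. Value and ValueSaver are redeclared locally so the example runs on its own, and the column names "path" and "hits" are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the declarations above.
type Value interface{}

type ValueSaver interface {
	Save() (row map[string]Value, insertID string, err error)
}

// visit is a hypothetical application type to be streamed into a table.
type visit struct {
	ID   string
	Path string
	Hits int
}

// Save maps the struct to a row keyed by column name and uses ID as the
// insertID, so that retried inserts of the same visit can be de-duplicated
// on a best-effort basis.
func (v visit) Save() (map[string]Value, string, error) {
	if v.ID == "" {
		return nil, "", errors.New("visit has no ID")
	}
	row := map[string]Value{
		"path": v.Path,
		"hits": v.Hits,
	}
	return row, v.ID, nil
}

// Compile-time check that visit implements ValueSaver.
var _ ValueSaver = visit{}

func main() {
	row, insertID, err := visit{ID: "v1", Path: "/index", Hits: 2}.Save()
	fmt.Println(row, insertID, err)
}
```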

A ValueSaver returns a row of data to be inserted into a table.

type ValuesSaver

type ValuesSaver struct {
	Schema Schema

	// If non-empty, BigQuery will use InsertID to de-duplicate insertions
	// of this row on a best-effort basis.
	InsertID string

	Row []Value
}

ValuesSaver implements ValueSaver for a slice of Values.

func (*ValuesSaver) Save

func (vls *ValuesSaver) Save() (map[string]Value, string, error)

Save implements ValueSaver.
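One plausible way for Save to build its result is to pair each Schema field name with the Row value at the same position. The sketch below does that with local stand-in types. It is an assumption about the behaviour, not the package's source, and the length check in particular is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for Value, FieldSchema and ValuesSaver.
type Value interface{}

type fieldSchema struct {
	Name string
}

type valuesSaver struct {
	Schema   []*fieldSchema
	InsertID string
	Row      []Value
}

// Save pairs schema field names with row values by position.
// Assumption: a length mismatch between Schema and Row is an error.
func (vls *valuesSaver) Save() (map[string]Value, string, error) {
	if len(vls.Schema) != len(vls.Row) {
		return nil, "", errors.New("schema and row have different lengths")
	}
	m := make(map[string]Value, len(vls.Row))
	for i, f := range vls.Schema {
		m[f.Name] = vls.Row[i]
	}
	return m, vls.InsertID, nil
}

func main() {
	vs := &valuesSaver{
		Schema:   []*fieldSchema{{Name: "name"}, {Name: "count"}},
		InsertID: "row-1",
		Row:      []Value{"widget", 3},
	}
	fmt.Println(vs.Save())
}
```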
