bigquery

package
v0.0.0-...-48dec30
Warning

This package is not in the latest version of its module.

Published: Aug 12, 2018 License: MIT Imports: 20 Imported by: 0

README

Google BigQuery Data source

Provides MySQL access to Google BigQuery. This opens up BigQuery to standard SQL clients, so tools that don't have a native BigQuery client can still query it.

# Assuming you are running locally. If you are instead in Google Cloud or Google Container Engine,
# you don't need the credentials or volume mount.
docker run -e "GOOGLE_APPLICATION_CREDENTIALS=/.config/gcloud/application_default_credentials.json" \
  -e "LOGGING=debug" \
  --rm -it \
  -p 4000:4000 \
  -v ~/.config/gcloud:/.config/gcloud \
  gcr.io/dataux-io/dataux:latest

# connect to dataux
mysql -h 127.0.0.1 -P4000


-- Create a new schema = "bq" with one source
-- a bigquery public dataset is the only source/tables
-- replace BIGQUERY_PROJECT with your billing account project

CREATE source `BIGQUERY_PROJECT` WITH {
    "type":"bigquery",
    "schema":"bq",
    "table_aliases" : {
       "bikeshare_stations" : "bigquery-public-data.san_francisco.bikeshare_stations"
    },
    "settings" : {
      "billing_project" : "BIGQUERY_PROJECT",
      "data_project" : "bigquery-public-data",
      "dataset" : "san_francisco"
    }
};

-- WITH Properties:
-- "schema":  Name of schema to attach this source to
-- "type":  Source type, must be a datasource registered in the registry (mongo, bigtable, etc)

CREATE source BIGQUERY_PROJECT WITH json_properties


use bq;

show tables;

select title, release_year AS year, locations from film_locations limit 10;

select count(*) as ct, landmark from bikeshare_stations GROUP BY landmark;

SELECT landmark from bikeshare_stations WHERE landmark like "Palo%";

select count(*) AS ct, landmark FROM bikeshare_stations GROUP BY landmark ORDER BY ct DESC LIMIT 1;


# Drop it when you are done if you want

drop schema bq;

Documentation

Overview

Package bigquery implements a data source (backend) that allows dataux to query Google BigQuery, so that BigQuery is available through the widely supported MySQL protocol.

Constants

const (
	// DataSourceLabel is public sourcetype for bigquery.
	DataSourceLabel = "bigquery"
)

Variables

var (
	// ErrNoSchema is an error that no schema could be found.
	ErrNoSchema = fmt.Errorf("No schema or configuration exists")

	// SchemaRefreshInterval is time between checking for schema changes
	SchemaRefreshInterval = time.Duration(time.Minute * 5)
)
var (
	// DefaultLimit is the default page size
	DefaultLimit = 5000

	// Timeout default for BigQuery queries
	Timeout = 10 * time.Second
)

Functions

This section is empty.

Types

type Mutator

type Mutator struct {
	// contains filtered or unexported fields
}

Mutator is a BigQuery mutator connection.

type ResultReader

type ResultReader struct {
	*exec.TaskBase

	Total int
	Req   *SqlToBQ
	// contains filtered or unexported fields
}

ResultReader implements result paging and reading.

func NewResultReader

func NewResultReader(req *SqlToBQ) *ResultReader

func (*ResultReader) Close

func (m *ResultReader) Close() error

func (*ResultReader) Run

func (m *ResultReader) Run() error

Run runs the Google BigQuery job.

type ResultReaderNext

type ResultReaderNext struct {
	*ResultReader
}

ResultReaderNext is a wrapper that lets us implement the sql/driver Next() interface, which differs from the qlbridge/datasource Next().

type RowVals

type RowVals struct {
	// contains filtered or unexported fields
}

func (*RowVals) Save

func (r *RowVals) Save() (map[string]bigquery.Value, string, error)

Save implements the ValueSaver interface.

type Source

type Source struct {
	// contains filtered or unexported fields
}

Source is a BigQuery datasource that provides Reads, Insert, Update, and Delete.

  • singleton shared instance
  • creates clients to bigquery (clients perform queries)
  • provides schema info about bigquery table/column-families

func (*Source) Close

func (m *Source) Close() error

func (*Source) DataSource

func (m *Source) DataSource() schema.Source

func (*Source) Init

func (m *Source) Init()

func (*Source) Open

func (m *Source) Open(tableName string) (schema.Conn, error)

func (*Source) Setup

func (m *Source) Setup(ss *schema.Schema) error

func (*Source) Table

func (m *Source) Table(table string) (*schema.Table, error)

func (*Source) Tables

func (m *Source) Tables() []string

type SqlToBQ

type SqlToBQ struct {
	*exec.TaskBase
	// contains filtered or unexported fields
}

SqlToBQ converts a SQL query into BigQuery read/write rows.

  • responsible for passing the query through if possible, or rewriting it if necessary

func NewSqlToBQ

func NewSqlToBQ(s *Source, t *schema.Table) *SqlToBQ

NewSqlToBQ creates a SQL AST -> BigQuery converter.

func (*SqlToBQ) CreateMutator

func (m *SqlToBQ) CreateMutator(pc interface{}) (schema.ConnMutator, error)

CreateMutator is part of the Mutator interface. It allows data sources to create a stateful mutation context for update/delete operations.

func (*SqlToBQ) Delete

func (m *SqlToBQ) Delete(key driver.Value) (int, error)

Delete deletes by row key.

func (*SqlToBQ) DeleteExpression

func (m *SqlToBQ) DeleteExpression(p interface{}, where expr.Node) (int, error)

DeleteExpression deletes by expression (where clause).

  • for where columns contained in Partition Keys we can push to BigQuery
  • for others we might have to do a select -> delete
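
The decision described in the two bullets can be sketched as a small function: if every column in the where clause is a partition key, the delete can be pushed down; otherwise fall back to select then delete. The names `chooseDeleteStrategy`, `pushDown`, and `selectThenDelete` are hypothetical, as is the `station_id` key column, and the actual method inspects an `expr.Node` rather than a list of column names.

```go
package main

import "fmt"

type deleteStrategy string

const (
	pushDown         deleteStrategy = "push-down"
	selectThenDelete deleteStrategy = "select-then-delete"
)

// chooseDeleteStrategy returns pushDown only when every where-clause column
// is a partition key; any other column forces the select -> delete fallback.
func chooseDeleteStrategy(whereCols []string, partitionKeys map[string]bool) deleteStrategy {
	for _, col := range whereCols {
		if !partitionKeys[col] {
			return selectThenDelete
		}
	}
	return pushDown
}

func main() {
	keys := map[string]bool{"station_id": true} // hypothetical partition key

	fmt.Println(chooseDeleteStrategy([]string{"station_id"}, keys))
	fmt.Println(chooseDeleteStrategy([]string{"station_id", "landmark"}, keys))
}
```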

func (*SqlToBQ) Put

func (m *SqlToBQ) Put(ctx context.Context, key schema.Key, val interface{}) (schema.Key, error)

Put is the interface for mutations (insert, update).

func (*SqlToBQ) PutMulti

func (m *SqlToBQ) PutMulti(ctx context.Context, keys []schema.Key, src interface{}) ([]schema.Key, error)

func (*SqlToBQ) WalkExecSource

func (m *SqlToBQ) WalkExecSource(p *plan.Source) (exec.Task, error)

WalkExecSource is an executor interface that allows this source to create its own execution Task so that it can push down as much as it can to BigQuery.

func (*SqlToBQ) WalkSourceSelect

func (m *SqlToBQ) WalkSourceSelect(planner plan.Planner, p *plan.Source) (plan.Task, error)

WalkSourceSelect is an interface implemented by this connection that allows the planner to push down as much logic into this source as possible.
