transfer

package
v1.4.4
Warning

This package is not in the latest version of its module.

Published: Oct 10, 2022 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func JSONDotAccessorToArrowAccessor

func JSONDotAccessorToArrowAccessor(str string) string

JSONDotAccessorToArrowAccessor converts Standard SQL's dot operator for accessing JSON fields into PostgreSQL's arrow operators. For example, `a.b.c` becomes `a->'b'->>'c'`.
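The conversion in the example can be sketched as follows. This is a minimal illustration, not the package's implementation: `dotToArrow` is a hypothetical name, and the sketch assumes the input is a plain dot-separated path with no quoting or escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// dotToArrow is a hypothetical stand-in for JSONDotAccessorToArrowAccessor.
// It assumes path is a plain dot-separated accessor such as "a.b.c".
func dotToArrow(path string) string {
	parts := strings.Split(path, ".")
	if len(parts) < 2 {
		// A bare column name has no JSON field access to rewrite.
		return path
	}
	var b strings.Builder
	b.WriteString(parts[0])
	// Intermediate keys use ->, which returns a JSON value.
	for _, key := range parts[1 : len(parts)-1] {
		b.WriteString("->'" + key + "'")
	}
	// The last key uses ->>, which returns the value as text.
	b.WriteString("->>'" + parts[len(parts)-1] + "'")
	return b.String()
}

func main() {
	fmt.Println(dotToArrow("a.b.c"))
}
```

The final key uses `->>` because PostgreSQL's `->` returns JSON while `->>` returns text, which is usually what a comparison or a SELECT column needs.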

Types

type BigQueryClient

type BigQueryClient struct {
	// contains filtered or unexported fields
}

BigQueryClient interacts with an instance of BigQuery.

func NewBigQueryClient

func NewBigQueryClient(ctx context.Context, config BigQueryConfig) (*BigQueryClient, error)

NewBigQueryClient creates a new BigQueryClient.

func (*BigQueryClient) GetDataAfterDatetime

func (bqc *BigQueryClient) GetDataAfterDatetime(dataset, table, dateField, datetime string, bqSchema *BigQuerySchema) (*bigquery.RowIterator, error)

GetDataAfterDatetime gets all data after the specified datetime. The Go client library for BigQuery pages through results automatically, so this function can be called without worrying about how much data is returned. See https://cloud.google.com/bigquery/docs/paging-results.
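To show what automatic paging means for the caller, here is a standard-library-only sketch. `pagedIterator`, `collect`, and `errDone` are hypothetical stand-ins: in the real client library the caller loops over `Next` on the `*bigquery.RowIterator` until it returns `iterator.Done`, and page fetches happen behind that call.

```go
package main

import (
	"errors"
	"fmt"
)

// errDone plays the role of iterator.Done in the real client library.
var errDone = errors.New("no more items")

// pagedIterator is a hypothetical stand-in for *bigquery.RowIterator.
// It serves rows from a buffer and refills the buffer one page at a time.
type pagedIterator struct {
	pages   [][]string // pages not yet fetched
	buffer  []string   // rows of the current page not yet consumed
	fetches int        // number of page fetches performed so far
}

// Next returns the next row, fetching a new page only when the buffer is empty.
func (it *pagedIterator) Next() (string, error) {
	for len(it.buffer) == 0 {
		if len(it.pages) == 0 {
			return "", errDone
		}
		it.buffer, it.pages = it.pages[0], it.pages[1:]
		it.fetches++
	}
	row := it.buffer[0]
	it.buffer = it.buffer[1:]
	return row, nil
}

// collect drains the iterator row by row. The caller never handles pages.
func collect(it *pagedIterator) ([]string, error) {
	var rows []string
	for {
		row, err := it.Next()
		if errors.Is(err, errDone) {
			return rows, nil
		}
		if err != nil {
			return nil, err
		}
		rows = append(rows, row)
	}
}

func main() {
	it := &pagedIterator{pages: [][]string{{"r1", "r2"}, {"r3"}}}
	rows, err := collect(it)
	fmt.Println(rows, err, it.fetches)
}
```

In practice a caller would likely process each row as it arrives rather than collect everything into a slice, so that memory use stays bounded for large tables.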

func (*BigQueryClient) GetTableSchema

func (bqc *BigQueryClient) GetTableSchema(dataset, table string) (*BigQuerySchema, error)

GetTableSchema gets the schema for the specified BigQuery table. It returns a BigQuerySchema, which maps column names to BigQuery types.

func (*BigQueryClient) ListTables

func (bqc *BigQueryClient) ListTables() error

ListTables lists all tables in the BigQuery instance.

type BigQueryConfig

type BigQueryConfig struct {
	ProjectID string `yaml:"projectID"`
}

BigQueryConfig stores configuration needed to connect to the BigQuery instance.

type BigQuerySchema

type BigQuerySchema struct {
	// contains filtered or unexported fields
}

BigQuerySchema is a map of column names to BigQuery datatypes.

type Config

type Config struct {
	*YAMLConfig
}

Config holds the configuration for the transfer, dictated by the YAML file or environment variables.

func NewConfig

func NewConfig(yamlFile string) (*Config, error)

NewConfig creates a new Config.

type Logger

type Logger struct {
	*log.Logger
}

Logger provides per-table logging.

func NewLogger

func NewLogger(tableName string) *Logger

NewLogger returns a new Logger.

func (*Logger) Errorf

func (l *Logger) Errorf(format string, v ...interface{})

Errorf adds an "Error: " prefix to the format string.
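A per-table logger with a prefixed Errorf might look like the sketch below. The names `tableLogger`, `newTableLogger`, and `formatError` are hypothetical, and the bracketed table-name prefix and the explicit writer are assumptions made so the output is easy to inspect. The documentation does not say how the package formats its output.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
)

// tableLogger is a hypothetical stand-in for Logger: it embeds *log.Logger
// so all standard logging methods remain available.
type tableLogger struct {
	*log.Logger
}

// newTableLogger tags every line with the table name (an assumed format).
// Flags are 0 so no timestamp is written and output is deterministic.
func newTableLogger(w io.Writer, tableName string) *tableLogger {
	return &tableLogger{log.New(w, "["+tableName+"] ", 0)}
}

// Errorf prepends "Error: " to the format string before logging.
func (l *tableLogger) Errorf(format string, v ...interface{}) {
	l.Printf("Error: "+format, v...)
}

// formatError logs one error line into a buffer and returns what was written.
func formatError(tableName, format string, v ...interface{}) string {
	var buf bytes.Buffer
	newTableLogger(&buf, tableName).Errorf(format, v...)
	return buf.String()
}

func main() {
	fmt.Print(formatError("events", "insert failed after %d rows", 3))
}
```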

type PostgresClient

type PostgresClient struct {
	*pgxpool.Pool
	// contains filtered or unexported fields
}

PostgresClient interacts with an instance of PostgreSQL.

func NewPostgresClient

func NewPostgresClient(config PostgresConfig) (*PostgresClient, error)

NewPostgresClient creates a new PostgresClient.

func (*PostgresClient) CreateTableFromSchema

func (pc *PostgresClient) CreateTableFromSchema(tableName string, pgSchema *PostgresSchema) error

CreateTableFromSchema creates a new table from a PostgresSchema.
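As an illustration of turning a column-to-type map into DDL, here is a minimal sketch. `createTableSQL` and `quoteIdent` are hypothetical helpers, the plain `map[string]string` stands in for the unexported contents of PostgresSchema, and the `IF NOT EXISTS` clause is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// quoteIdent wraps an identifier in double quotes and doubles any embedded
// quotes, which is how PostgreSQL escapes quoted identifiers.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// createTableSQL builds a CREATE TABLE statement from a map of column names
// to PostgreSQL types. Type strings are assumed to come from trusted code.
func createTableSQL(tableName string, columns map[string]string) string {
	// Sort column names so the generated statement is deterministic.
	names := make([]string, 0, len(columns))
	for name := range columns {
		names = append(names, name)
	}
	sort.Strings(names)

	defs := make([]string, 0, len(names))
	for _, name := range names {
		defs = append(defs, quoteIdent(name)+" "+columns[name])
	}
	return fmt.Sprintf("CREATE TABLE IF NOT EXISTS %s (%s)",
		quoteIdent(tableName), strings.Join(defs, ", "))
}

func main() {
	fmt.Println(createTableSQL("events", map[string]string{
		"id":         "bigint",
		"created_at": "timestamptz",
	}))
}
```

Sorting the column names is a choice made for this sketch only. Go map iteration order is random, so without it the column order would vary between runs.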

func (*PostgresClient) GetExistingTableNames

func (pc *PostgresClient) GetExistingTableNames() ([]string, error)

GetExistingTableNames returns a list of public tables.

func (*PostgresClient) GetMostRecentEntry

func (pc *PostgresClient) GetMostRecentEntry(table, datetimeField string) (string, error)

GetMostRecentEntry returns the latest timestamp of an entry. If the table is empty, an empty string will be returned.

func (*PostgresClient) TableExists

func (pc *PostgresClient) TableExists(searchTable string) (bool, error)

TableExists returns whether a table with the given name exists. Table names are cached when the PostgresClient is initialized.
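Caching table names at initialization makes the lookup a local operation, but the cache is a snapshot. The sketch below shows one way such a cache could behave. `tableNameCache` and its methods are hypothetical, and whether the package updates its cache after creating a table is not stated in the documentation.

```go
package main

import "fmt"

// tableNameCache is a hypothetical snapshot of table names taken once,
// for example from the result of GetExistingTableNames.
type tableNameCache struct {
	names map[string]struct{}
}

// newTableNameCache copies the names into a set for constant-time lookups.
func newTableNameCache(existing []string) *tableNameCache {
	c := &tableNameCache{names: make(map[string]struct{}, len(existing))}
	for _, name := range existing {
		c.names[name] = struct{}{}
	}
	return c
}

// exists answers from the snapshot without querying the database.
func (c *tableNameCache) exists(name string) bool {
	_, ok := c.names[name]
	return ok
}

// add records a table created after the snapshot was taken, so later
// lookups do not report it as missing.
func (c *tableNameCache) add(name string) {
	c.names[name] = struct{}{}
}

func main() {
	cache := newTableNameCache([]string{"events", "users"})
	fmt.Println(cache.exists("events"), cache.exists("orders"))
	cache.add("orders")
	fmt.Println(cache.exists("orders"))
}
```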

type PostgresConfig

type PostgresConfig struct {
	DbHost string `yaml:"dbHost"`
	DbPort string `yaml:"dbPort"`
	DbUser string `yaml:"dbUser"`
	DbPass string `yaml:"dbPass"`
	DbName string `yaml:"dbName"`
}

PostgresConfig stores configuration needed to connect to the PostgreSQL instance.

type PostgresSchema

type PostgresSchema struct {
	// contains filtered or unexported fields
}

PostgresSchema is a map of column names to Postgres datatypes.

type TableConfig

type TableConfig struct {
	Datasets []struct {
		Name   string `yaml:"name"`
		Tables []struct {
			Name      string `yaml:"name"`
			DateField string `yaml:"dateField"`
		} `yaml:"tables"`
	} `yaml:"datasets"`
}

TableConfig stores configuration about which BigQuery datasets and tables to transfer to PostgreSQL.

type Transfer

type Transfer struct {
	// contains filtered or unexported fields
}

Transfer provides functions to transfer data from BigQuery to PostgreSQL.

func NewTransfer

func NewTransfer(bq *BigQueryClient, pg *PostgresClient, config *TableConfig) *Transfer

NewTransfer returns a new Transfer.

func (*Transfer) Run

func (t *Transfer) Run()

Run creates a tableTransfer goroutine for each table to be transferred.
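The one-goroutine-per-table pattern can be sketched with the standard library alone. `runAll` and `failOn` are hypothetical, and the sketch assumes the caller waits for every table to finish and collects one error per table. The documentation does not say whether Run blocks or how it reports failures.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// runAll starts one goroutine per table and waits for all of them.
// It returns the result of transferOne for each table.
func runAll(tables []string, transferOne func(table string) error) map[string]error {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex // guards results, which the goroutines share
		results = make(map[string]error, len(tables))
	)
	for _, table := range tables {
		wg.Add(1)
		// Pass table as an argument so each goroutine gets its own copy.
		go func(table string) {
			defer wg.Done()
			err := transferOne(table)
			mu.Lock()
			results[table] = err
			mu.Unlock()
		}(table)
	}
	wg.Wait()
	return results
}

// failOn returns a fake transfer function that fails for one table only.
func failOn(bad string) func(string) error {
	return func(table string) error {
		if table == bad {
			return errors.New("transfer failed for " + table)
		}
		return nil
	}
}

func main() {
	results := runAll([]string{"events", "users", "orders"}, failOn("users"))
	for _, table := range []string{"events", "users", "orders"} {
		fmt.Println(table, results[table])
	}
}
```

Because each table runs independently, a failure in one transfer does not stop the others in this sketch.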

func (*Transfer) RunContinuously

func (t *Transfer) RunContinuously(sleepAfterTransferInSecs int)

RunContinuously runs Transfer.Run in a loop, sleeping for sleepAfterTransferInSecs seconds between transfers.
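A run-then-sleep loop of this kind might be sketched as follows. `runContinuously` and `countRuns` are hypothetical. The `context.Context` parameter is an addition for this sketch so the loop can be stopped, whereas the documented method takes only a number of seconds, which would convert as `time.Duration(secs) * time.Second`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runContinuously calls run, then sleeps for sleepAfter, and repeats until
// ctx is cancelled. The sleep starts only after run has returned.
func runContinuously(ctx context.Context, run func(), sleepAfter time.Duration) {
	for ctx.Err() == nil {
		run()
		select {
		case <-ctx.Done():
			return
		case <-time.After(sleepAfter):
		}
	}
}

// countRuns drives the loop with a fake transfer that cancels the context
// after limit runs, and reports how many runs happened.
func countRuns(limit int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	runs := 0
	runContinuously(ctx, func() {
		runs++
		if runs == limit {
			cancel()
		}
	}, time.Millisecond)
	return runs
}

func main() {
	fmt.Println(countRuns(3))
}
```

Sleeping after each run, rather than running on a fixed schedule, means a slow transfer never overlaps with the next one.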

type YAMLConfig

type YAMLConfig struct {
	BigQuery BigQueryConfig `yaml:"bigQuery"`
	Postgres PostgresConfig `yaml:"postgres"`
	Transfer TableConfig    `yaml:"transfer"`
}

YAMLConfig stores the configuration of the application.
