duckdb

package
v0.50.0
Published: Oct 16, 2024 License: Apache-2.0 Imports: 38 Imported by: 0

README

The DuckDB driver carries a number of workarounds that can be removed as our logic changes or as DuckDB evolves:

  • DuckDB runs into a number of issues under concurrent reads and writes, for example the WAL file growing without bound. To work around this, we write every table to a separate .db file and attach it to the main db. Relevant issue: https://github.com/duckdb/duckdb/issues/9150 (note: it is not fully fixed as of writing). We call this external table storage.
  • DuckDB doesn't free storage when a table is dropped. We have also seen that it cannot reuse all of that space, so the db file keeps growing with every source refresh. The workaround above also solves this issue, since every new table is created in a new file.
  • DuckDB can sometimes run into internal errors, after which every new query fails. We need to reopen the db handles when this happens. See reopenDB() in runtime/drivers/duckdb/duckdb.go.
  • VARCHAR columns can take up more space and be inefficient to query because of how DuckDB's lightweight compression handles them. If the cardinality of such a column is low, we can convert it to an ENUM to improve performance. More details in this Notion doc: https://www.notion.so/rilldata/Converting-low-cardinality-VARCHAR-dimensions-to-ENUMs-a07ca0a26bca4338a6f941c2604e9f62?pvs=4
  • DuckDB views behave somewhat unusually if * is used in the view definition and the order of the columns in the underlying table changes. See Test_connection_ChangingOrder in runtime/drivers/duckdb/olap_crud_test.go for an example. To mitigate this, we expand * into an explicit list of all columns, in sorted order, in the view. See generateSelectQuery in runtime/drivers/duckdb/olap.go. Because this changes the order of the columns, we enable it only for cloud, since users may care about the original order while modelling locally.
  • We use allow_host_access as a proxy to check whether the runtime is local or cloud, which is a hack we would like to remove in the future.
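
The * expansion described in the list above can be illustrated with a minimal sketch. This is not the driver's generateSelectQuery; the helper names quoteIdent and selectSortedColumns are hypothetical, and the sketch assumes the column names have already been read from the underlying table. It only shows why listing columns in sorted order makes the view definition independent of the physical column order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// quoteIdent wraps a SQL identifier in double quotes and doubles any
// embedded double quotes. Hypothetical helper, not part of the package.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// selectSortedColumns builds a SELECT that names every column explicitly,
// in sorted order, instead of relying on *. Hypothetical helper.
func selectSortedColumns(table string, columns []string) string {
	// Copy first so the caller's slice is not reordered.
	sorted := append([]string(nil), columns...)
	sort.Strings(sorted)

	quoted := make([]string, len(sorted))
	for i, c := range sorted {
		quoted[i] = quoteIdent(c)
	}
	return fmt.Sprintf("SELECT %s FROM %s", strings.Join(quoted, ", "), quoteIdent(table))
}

func main() {
	// The same query is produced regardless of the physical column order.
	// Both lines print: SELECT "amount", "country", "ts" FROM "events"
	fmt.Println(selectSortedColumns("events", []string{"ts", "country", "amount"}))
	fmt.Println(selectSortedColumns("events", []string{"amount", "ts", "country"}))
}
```

The trade-off is the one noted above: the view no longer preserves the table's original column order, which is why the behaviour is limited to cloud.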

A few others are also listed in comments in the code.
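
As a rough illustration of the external table storage idea from the first item of the list above, the sketch below generates the SQL for attaching a per-table .db file and exposing its table in the main db through a view. The helper names (sqlIdent, sqlString, externalTableStatements), the file path, the alias, and the assumption that the table lives in the attached file's main schema under the same name are all hypothetical; the driver's actual naming and layout may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlIdent wraps a SQL identifier in double quotes, doubling embedded quotes.
func sqlIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// sqlString wraps a SQL string literal in single quotes, doubling embedded quotes.
func sqlString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// externalTableStatements returns the statements that attach a separate
// database file under the given alias and create a view in the main db
// that reads from the table stored in that file. Hypothetical helper.
func externalTableStatements(dbPath, alias, table string) []string {
	return []string{
		fmt.Sprintf("ATTACH %s AS %s", sqlString(dbPath), sqlIdent(alias)),
		fmt.Sprintf("CREATE OR REPLACE VIEW %s AS SELECT * FROM %s.main.%s",
			sqlIdent(table), sqlIdent(alias), sqlIdent(table)),
	}
}

func main() {
	// Path and alias are made up for the example.
	for _, stmt := range externalTableStatements("/data/orders/v2.db", "orders_v2", "orders") {
		fmt.Println(stmt)
	}
}
```

Because each refresh would write to a fresh file, the old file can simply be detached and deleted, which is how this approach also sidesteps the storage reuse problem.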

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewDuckDBToDuckDB added in v0.38.0

func NewDuckDBToDuckDB(to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func NewFileStoreToDuckDB added in v0.38.0

func NewFileStoreToDuckDB(from drivers.FileStore, to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func NewMotherduckToDuckDB added in v0.38.0

func NewMotherduckToDuckDB(from drivers.Handle, to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func NewObjectStoreToDuckDB added in v0.38.0

func NewObjectStoreToDuckDB(from drivers.ObjectStore, to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func NewSQLStoreToDuckDB added in v0.38.0

func NewSQLStoreToDuckDB(from drivers.SQLStore, to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func NewWarehouseToDuckDB added in v0.48.0

func NewWarehouseToDuckDB(from drivers.Warehouse, to drivers.OLAPStore, logger *zap.Logger) drivers.Transporter

func RowsToSchema added in v0.39.0

func RowsToSchema(r *sqlx.Rows) (*runtimev1.StructType, error)

Types

type Driver

type Driver struct {
	// contains filtered or unexported fields
}

func (Driver) Drop added in v0.27.0

func (d Driver) Drop(cfgMap map[string]any, logger *zap.Logger) error

func (Driver) HasAnonymousSourceAccess added in v0.30.0

func (d Driver) HasAnonymousSourceAccess(ctx context.Context, src map[string]any, logger *zap.Logger) (bool, error)

func (Driver) Open

func (d Driver) Open(instanceID string, cfgMap map[string]any, ac *activity.Client, logger *zap.Logger) (drivers.Handle, error)

func (Driver) Spec added in v0.30.0

func (d Driver) Spec() drivers.Spec

func (Driver) TertiarySourceConnectors added in v0.35.0

func (d Driver) TertiarySourceConnectors(ctx context.Context, src map[string]any, logger *zap.Logger) ([]string, error)

type ModelInputProperties added in v0.45.0

type ModelInputProperties struct {
	SQL  string `mapstructure:"sql"`
	Args []any  `mapstructure:"args"`
}

func (*ModelInputProperties) Validate added in v0.45.0

func (p *ModelInputProperties) Validate() error

type ModelOutputProperties added in v0.45.0

type ModelOutputProperties struct {
	Table               string                      `mapstructure:"table"`
	Materialize         *bool                       `mapstructure:"materialize"`
	UniqueKey           []string                    `mapstructure:"unique_key"`
	IncrementalStrategy drivers.IncrementalStrategy `mapstructure:"incremental_strategy"`
}

func (*ModelOutputProperties) Validate added in v0.45.0

func (p *ModelOutputProperties) Validate() error

type ModelResultProperties added in v0.45.0

type ModelResultProperties struct {
	Table         string `mapstructure:"table"`
	View          bool   `mapstructure:"view"`
	UsedModelName bool   `mapstructure:"used_model_name"`
}
