Documentation ¶
Overview ¶
Package odbc provides a DataSource which reads data from an ODBC connection. Rows are assigned to workers according to the provided configuration parameters: the total row count is divided into partitions of up to PartitionSize rows each, which are fetched via LIMIT/OFFSET.
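The LIMIT/OFFSET partitioning described above can be sketched as follows. This is a minimal illustration rather than the package's actual implementation: the helper name partitionQueries and the exact query text are assumptions, and the real DataSource may construct its queries differently.

```go
package main

import (
	"fmt"
	"strings"
)

// partitionQueries is a hypothetical helper (not part of the odbc package).
// It splits totalRows into chunks of at most partitionSize rows and returns
// one LIMIT/OFFSET query per chunk, selecting only the given columns.
func partitionQueries(table string, columns []string, totalRows, partitionSize int) []string {
	if partitionSize <= 0 {
		partitionSize = 1024 // mirrors the documented default PartitionSize
	}
	cols := strings.Join(columns, ", ")
	var queries []string
	for offset := 0; offset < totalRows; offset += partitionSize {
		q := fmt.Sprintf("SELECT %s FROM %s LIMIT %d OFFSET %d", cols, table, partitionSize, offset)
		queries = append(queries, q)
	}
	return queries
}

func main() {
	// Print one query per partition for a table of 2500 rows.
	for _, q := range partitionQueries("taxi_one_day", []string{"columnA", "columnB"}, 2500, 1024) {
		fmt.Println(q)
	}
}
```

The final partition may hold fewer rows than the others, because LIMIT only caps the number of rows returned.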
Only a subset of Sif ColumnTypes are supported by this DataSource, corresponding to the sql.NullX types provided by the Go sql package.
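For readers unfamiliar with the sql.NullX family, the sketch below shows how one of those types (sql.NullInt64) distinguishes a NULL column from a real value. It uses only the standard library; the helper name describe is hypothetical, and the sketch makes no claim about which Sif ColumnType maps to which sql.NullX type.

```go
package main

import (
	"database/sql"
	"fmt"
)

// describe is a hypothetical helper. It scans a raw driver value into a
// sql.NullInt64 and reports either the integer or "NULL".
func describe(raw interface{}) (string, error) {
	var n sql.NullInt64
	if err := n.Scan(raw); err != nil {
		return "", err
	}
	if !n.Valid {
		// Valid is false when the underlying column value was NULL.
		return "NULL", nil
	}
	return fmt.Sprintf("%d", n.Int64), nil
}

func main() {
	for _, raw := range []interface{}{int64(42), nil} {
		s, err := describe(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```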
The odbc DataSource utilizes github.com/alexbrainman/odbc under the hood, and requires an appropriately configured ODBC environment and driver. Configuration information can be found at https://github.com/alexbrainman/odbc/wiki. In a Linux environment, installing unixODBC and the corresponding development package [unixODBC-devel (dnf) or unixodbc-dev (apt)], along with an ODBC driver, is typically sufficient setup to utilize this package.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CreateDataFrame ¶
func CreateDataFrame(conf *DataSourceConf, schema sif.Schema) sif.DataFrame
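CreateDataFrame is a factory for DataSources. It returns a sif.DataFrame that reads from the table described by conf, using the supplied schema.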
Example (Mariadb) ¶
package main

import (
	"fmt"

	"github.com/go-sif/sif/coltype"
	"github.com/go-sif/sif/datasource/odbc"
	"github.com/go-sif/sif/schema"
)

func main() {
	// The provided Schema should correspond to column names in the
	// underlying database table. Only columns in this Schema will be
	// SELECTed from the underlying table.
	columnA := coltype.String("columnA", 32)
	columnB := coltype.Int16("columnB")
	schema, err := schema.CreateSchema(columnA, columnB)
	if err != nil {
		panic(err)
	}
	// Configure with an ODBC connection string,
	// along with a table name and maximum Partition size.
	// "driver" name should correspond to the name found
	// in square brackets in /etc/odbcinst.ini
	// Configuration options can be found here: https://mariadb.com/kb/en/creating-a-data-source-with-mariadb-connectorodbc/
	conn := fmt.Sprintf("driver=MariaDB;server=%s;database=%s;user=%s;password=%s;", "localhost", "db", "root", "password")
	conf := &odbc.DataSourceConf{
		DataSourceName: conn,
		TableName:      "taxi_one_day",
		PartitionSize:  1024 * 10,
	}
	odbc.CreateDataFrame(conf, schema) // returns a usable DataFrame
}
Example (Sqlite) ¶
package main

import (
	"fmt"

	"github.com/go-sif/sif/coltype"
	"github.com/go-sif/sif/datasource/odbc"
	"github.com/go-sif/sif/schema"
)

func main() {
	// The provided Schema should correspond to column names in the
	// underlying database table. Only columns in this Schema will be
	// SELECTed from the underlying table.
	columnA := coltype.String("columnA", 32)
	columnB := coltype.Int16("columnB")
	schema, err := schema.CreateSchema(columnA, columnB)
	if err != nil {
		panic(err)
	}
	// Configure with an ODBC connection string,
	// along with a table name and maximum Partition size.
	// "driver" name should correspond to the name found
	// in square brackets in /etc/odbcinst.ini
	conn := fmt.Sprintf("driver=SQLITE3;database=%s;", "path/to/database.sqlite3")
	conf := &odbc.DataSourceConf{
		DataSourceName: conn,
		TableName:      "taxi_one_day",
		PartitionSize:  1024 * 10,
	}
	odbc.CreateDataFrame(conf, schema) // returns a usable DataFrame
}
Types ¶
type DataSource ¶
type DataSource struct {
// contains filtered or unexported fields
}
DataSource is an ODBC connection which will supply data to be manipulated by a DataFrame
func (*DataSource) Analyze ¶
func (ds *DataSource) Analyze() (sif.PartitionMap, error)
Analyze returns a PartitionMap, describing how the source data should be divided into Partitions
func (*DataSource) DeserializeLoader ¶
func (ds *DataSource) DeserializeLoader(bytes []byte) (sif.PartitionLoader, error)
DeserializeLoader creates a PartitionLoader for this DataSource from a serialized representation
func (*DataSource) IsStreaming ¶
func (ds *DataSource) IsStreaming() bool
IsStreaming returns true iff this DataSource provides a continuous stream of data
type DataSourceConf ¶
type DataSourceConf struct {
	DataSourceName string // The ODBC DataSourceName to pass to sql.Open
	TableName      string // A database table to fetch rows from
	PartitionSize  int    // The maximum number of rows per Partition. Defaults to 1024.
}
DataSourceConf configures an odbc DataSource
type PartitionLoader ¶
type PartitionLoader struct {
// contains filtered or unexported fields
}
PartitionLoader is capable of loading partitions of data from a SQL query via ODBC
func (*PartitionLoader) GobDecode ¶
func (pl *PartitionLoader) GobDecode(in []byte) error
GobDecode deserializes a PartitionLoader
func (*PartitionLoader) GobEncode ¶
func (pl *PartitionLoader) GobEncode() ([]byte, error)
GobEncode serializes a PartitionLoader
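GobEncode and GobDecode are the methods of the standard gob.GobEncoder and gob.GobDecoder interfaces, which encoding/gob calls automatically when it meets a type that implements them. The sketch below shows that pattern with a hypothetical stand-in type named loader. The fields offset and limit and the helper roundTrip are assumptions made for illustration, since the real PartitionLoader's fields are unexported and not documented here.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// loader is a hypothetical stand-in for PartitionLoader.
type loader struct {
	offset int
	limit  int
}

// wire mirrors loader with exported fields, which gob requires.
type wire struct {
	Offset int
	Limit  int
}

// GobEncode serializes the unexported fields via the wire struct.
func (l *loader) GobEncode() ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(wire{Offset: l.offset, Limit: l.limit}); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// GobDecode restores the unexported fields from the wire struct.
func (l *loader) GobDecode(in []byte) error {
	var w wire
	if err := gob.NewDecoder(bytes.NewReader(in)).Decode(&w); err != nil {
		return err
	}
	l.offset, l.limit = w.Offset, w.Limit
	return nil
}

// roundTrip encodes l with gob and decodes the bytes into a fresh loader.
// gob invokes GobEncode and GobDecode itself.
func roundTrip(l *loader) (*loader, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(l); err != nil {
		return nil, err
	}
	out := &loader{}
	if err := gob.NewDecoder(&buf).Decode(out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(&loader{offset: 2048, limit: 1024})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.offset, out.limit)
}
```

Routing the state through a separate struct with exported fields keeps the loader's own fields private while still letting it travel between processes as bytes.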
func (*PartitionLoader) Load ¶
func (pl *PartitionLoader) Load(parser sif.DataSourceParser) (sif.PartitionIterator, error)
Load loads a partition of data from a SQL query via ODBC, returning a PartitionIterator over the result
func (*PartitionLoader) ToString ¶
func (pl *PartitionLoader) ToString() string
ToString returns a string representation of this PartitionLoader
type PartitionMap ¶
type PartitionMap struct {
// contains filtered or unexported fields
}
PartitionMap is an iterator producing a sequence of PartitionLoaders
func (*PartitionMap) HasNext ¶
func (pm *PartitionMap) HasNext() bool
HasNext returns true iff there is another PartitionLoader remaining
func (*PartitionMap) Next ¶
func (pm *PartitionMap) Next() sif.PartitionLoader
Next returns the next PartitionLoader in the sequence
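The HasNext/Next pair forms a conventional iterator: a caller keeps calling Next for as long as HasNext reports true. The sketch below demonstrates that loop with hypothetical stand-in types (rangeLoader, rangeMap) and a hypothetical helper (drain). These are not the package's own types, and the string format returned by ToString is an assumption.

```go
package main

import "fmt"

// rangeLoader is a hypothetical stand-in for a PartitionLoader that
// covers one LIMIT/OFFSET window.
type rangeLoader struct {
	offset int
	limit  int
}

// ToString returns a readable description of the window.
func (r rangeLoader) ToString() string {
	return fmt.Sprintf("LIMIT %d OFFSET %d", r.limit, r.offset)
}

// rangeMap is a hypothetical stand-in for PartitionMap. It yields one
// rangeLoader per window until all rows are covered.
type rangeMap struct {
	total int // total number of rows in the table
	size  int // maximum rows per partition; must be positive
	next  int // offset of the next window to hand out
}

// HasNext reports whether any rows remain uncovered.
func (m *rangeMap) HasNext() bool { return m.next < m.total }

// Next returns the loader for the next window and advances the iterator.
func (m *rangeMap) Next() rangeLoader {
	l := rangeLoader{offset: m.next, limit: m.size}
	m.next += m.size
	return l
}

// drain consumes the iterator and collects each loader's description.
func drain(m *rangeMap) []string {
	var out []string
	for m.HasNext() {
		out = append(out, m.Next().ToString())
	}
	return out
}

func main() {
	for _, s := range drain(&rangeMap{total: 2500, size: 1024}) {
		fmt.Println(s)
	}
}
```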