Documentation ¶
Overview ¶
Package source provides functionality for dealing with data sources.
Index ¶
- Constants
- func AbsLocation(loc string) string
- func IsSQLLocation(loc string) bool
- func LocationFileName(src *Source) (string, error)
- func ParseTableHandle(input string) (handle, table string, err error)
- func RedactLocation(loc string) string
- func ReservedHandles() []string
- func ShortLocation(loc string) string
- func SuggestHandle(typ Type, loc string, takenFn func(string) bool) (string, error)
- func TempDirFile(filename string) (dir string, f *os.File, cleanFn func() error, err error)
- func VerifyLegalHandle(handle string) error
- func VerifySetIntegrity(ss *Set) error
- type ColMetadata
- type DBVar
- type FileOpenFunc
- type Files
- func (fs *Files) AddStdin(f *os.File) error
- func (fs *Files) AddTypeDetectors(detectFns ...TypeDetectFunc)
- func (fs *Files) CleanupE(fn func() error)
- func (fs *Files) Close() error
- func (fs *Files) Open(src *Source) (io.ReadCloser, error)
- func (fs *Files) OpenFunc(src *Source) func() (io.ReadCloser, error)
- func (fs *Files) ReadAll(src *Source) ([]byte, error)
- func (fs *Files) Size(src *Source) (size int64, err error)
- func (fs *Files) Type(ctx context.Context, loc string) (Type, error)
- func (fs *Files) TypeStdin(ctx context.Context) (Type, error)
- type Metadata
- type Set
- func (s *Set) Active() *Source
- func (s *Set) Add(src *Source) error
- func (s *Set) Exists(handle string) bool
- func (s *Set) Get(handle string) (*Source, error)
- func (s *Set) Handles() []string
- func (s *Set) Items() []*Source
- func (s *Set) MarshalJSON() ([]byte, error)
- func (s *Set) MarshalYAML() (interface{}, error)
- func (s *Set) Remove(handle string) error
- func (s *Set) Scratch() *Source
- func (s *Set) SetActive(handle string) (*Source, error)
- func (s *Set) SetScratch(handle string) (*Source, error)
- func (s *Set) String() string
- func (s *Set) UnmarshalJSON(b []byte) error
- func (s *Set) UnmarshalYAML(unmarshal func(interface{}) error) error
- type Source
- type TableMetadata
- type Type
- type TypeDetectFunc
Constants ¶
const (
	// StdinHandle is the reserved handle for stdin pipe input.
	StdinHandle = "@stdin"

	// ActiveHandle is the reserved handle for the active source.
	// FIXME: it should be possible to use "@0" as the active handle, but
	// the SLQ grammar doesn't currently allow it. Possibly change this
	// value to "@0" after modifying the SLQ grammar.
	ActiveHandle = "@active"

	// ScratchHandle is the reserved handle for the scratch source.
	ScratchHandle = "@scratch"

	// JoinHandle is the reserved handle for the join db source.
	JoinHandle = "@join"

	// MonotableName is the table name used for "mono-table" drivers
	// such as CSV. Thus a source @address_csv will have its
	// data accessible via @address_csv.data.
	MonotableName = "data"
)
const TypeNone = Type("")
TypeNone is the zero value of driver.Type.
Variables ¶
This section is empty.
Functions ¶
func AbsLocation ¶
AbsLocation returns the absolute path of loc. That is, relative paths etc. in loc are resolved. If loc is not a file path, or it cannot be processed, loc is returned unmodified.
func IsSQLLocation ¶
IsSQLLocation returns true if source location loc seems to be a DSN for a SQL driver.
func LocationFileName ¶
LocationFileName returns the final component of the file/URL path.
func ParseTableHandle ¶
ParseTableHandle attempts to parse a SLQ source handle and/or table name. Surrounding whitespace is trimmed. Examples of valid input values are:
@handle.tblName
@handle
.tblName
func RedactLocation ¶ added in v0.15.0
RedactLocation returns a redacted version of the source location loc, with the password component (if any) of the location masked.
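As an illustration of password masking, the sketch below uses net/url from the standard library. It is an assumption that this resembles the package's behavior: redactLocation is a hypothetical helper, and the mask string "xxxxx" is whatever url.URL.Redacted produces, not necessarily what RedactLocation emits.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactLocation is a hypothetical stand-in for RedactLocation.
// Locations that do not parse as a URL, or that carry no password,
// are returned unmodified.
func redactLocation(loc string) string {
	u, err := url.Parse(loc)
	if err != nil || u.User == nil {
		return loc
	}
	if _, hasPassword := u.User.Password(); !hasPassword {
		return loc
	}
	// Redacted replaces the password component with "xxxxx".
	return u.Redacted()
}

func main() {
	fmt.Println(redactLocation("postgres://sakila:p_ssW0rd@localhost/sakila"))
	fmt.Println(redactLocation("data.xlsx"))
}
```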
func ReservedHandles ¶
func ReservedHandles() []string
ReservedHandles returns a slice of the handle names that are reserved for sq use.
func ShortLocation ¶
ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or user@host[:port]/db for a DSN.
func SuggestHandle ¶
SuggestHandle suggests a handle based on location and type. If typ is TypeNone, the type will be inferred from loc. The takenFn is used to determine if a suggested handle is free to be used (e.g. "@sakila_csv" -> "@sakila_csv_1", etc).
If the base name (derived from loc) contains illegal handle runes, those are replaced with underscore. If the handle would start with a number or underscore, it will be prefixed with "h" (for "handle"). Thus "123.xlsx" becomes "@h123_xlsx".
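The sanitizing and de-duplication rules described above can be sketched as follows. This is a simplified illustration, not the package's code: the hypothetical suggestHandle takes a base file name directly instead of a Type and location, and the set of legal runes is assumed to be letters, digits and underscore.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLegalRune reports whether r may appear in a handle (assumed rule set).
func isLegalRune(r rune) bool {
	return r == '_' ||
		(r >= 'a' && r <= 'z') ||
		(r >= 'A' && r <= 'Z') ||
		(r >= '0' && r <= '9')
}

// suggestHandle is a hypothetical, simplified stand-in for SuggestHandle.
func suggestHandle(baseName string, takenFn func(string) bool) string {
	// Replace illegal runes with underscore.
	var sb strings.Builder
	for _, r := range baseName {
		if isLegalRune(r) {
			sb.WriteRune(r)
		} else {
			sb.WriteByte('_')
		}
	}
	name := sb.String()

	// A handle must start with a letter: prefix "h" otherwise.
	if name == "" || name[0] == '_' || (name[0] >= '0' && name[0] <= '9') {
		name = "h" + name
	}

	// Append _1, _2, ... until takenFn reports the handle as free.
	candidate := "@" + name
	for i := 1; takenFn(candidate); i++ {
		candidate = "@" + name + "_" + strconv.Itoa(i)
	}
	return candidate
}

func main() {
	taken := map[string]bool{"@sakila_csv": true}
	takenFn := func(h string) bool { return taken[h] }

	fmt.Println(suggestHandle("123.xlsx", takenFn))
	fmt.Println(suggestHandle("sakila.csv", takenFn))
}
```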
func TempDirFile ¶
TempDirFile creates a new temporary file in a new temp dir, opens the file for reading and writing, and returns the resulting *os.File, as well as the parent dir. It is the caller's responsibility to close the file and remove the temp dir, which the returned cleanFn encapsulates.
func VerifyLegalHandle ¶
VerifyLegalHandle returns an error if handle is not an acceptable source handle value. Valid input must match:
\A[@][a-zA-Z][a-zA-Z0-9_]*$
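The pattern can be exercised directly with the standard regexp package. The verifyLegalHandle function below is a hypothetical sketch that wraps the documented pattern; the error text is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlePattern is the pattern quoted in the documentation above.
var handlePattern = regexp.MustCompile(`\A[@][a-zA-Z][a-zA-Z0-9_]*$`)

// verifyLegalHandle is a hypothetical stand-in for VerifyLegalHandle.
func verifyLegalHandle(handle string) error {
	if !handlePattern.MatchString(handle) {
		return fmt.Errorf("invalid handle %q", handle)
	}
	return nil
}

func main() {
	for _, h := range []string{"@sakila_csv", "@h123_xlsx", "@123", "sakila", "@_x"} {
		fmt.Printf("%-12s legal=%v\n", h, verifyLegalHandle(h) == nil)
	}
}
```

Note that the pattern requires a letter immediately after "@", which is why a handle derived from "123.xlsx" needs the "h" prefix.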
func VerifySetIntegrity ¶
VerifySetIntegrity verifies the internal state of ss. Typically this func is invoked after ss has been loaded from config, verifying that the config is not corrupt.
Types ¶
type ColMetadata ¶
type ColMetadata struct {
	Name         string    `json:"name"`
	Position     int64     `json:"position"`
	PrimaryKey   bool      `json:"primary_key"`
	BaseType     string    `json:"base_type"`
	ColumnType   string    `json:"column_type"`
	Kind         kind.Kind `json:"kind"`
	Nullable     bool      `json:"nullable"`
	DefaultValue string    `json:"default_value,omitempty"`
	Comment      string    `json:"comment,omitempty"`
}
ColMetadata models metadata for a particular column of a data source.
func (*ColMetadata) String ¶
func (c *ColMetadata) String() string
type DBVar ¶
DBVar models a key-value pair for driver config. REVISIT: maybe better named as SourceSetting or such?
type FileOpenFunc ¶
type FileOpenFunc func() (io.ReadCloser, error)
FileOpenFunc is a func that opens and returns a ReadCloser. The caller is responsible for closing the returned ReadCloser.
type Files ¶
type Files struct {
// contains filtered or unexported fields
}
Files is the centralized API for interacting with files.
Why does Files exist? There's a need for functionality to transparently get a Reader for remote or local files and, most importantly, an ability for multiple goroutines to read/sample a file while it's being read (mainly to "sample" the file type, e.g. to determine if it's an XLSX file etc.). Currently we use fscache under the hood for this, but our implementation is not satisfactory: in particular, the implementation currently requires that we read the entire source file into fscache before it's available to be read (which is awful if we're reading a long-running pipe from stdin). This entire thing needs to be revisited.
func (*Files) AddStdin ¶
AddStdin copies f to fs's cache: the stdin data in f is later accessible via fs.Open(src) where src.Handle is StdinHandle; f's type can be detected via TypeStdin. Note that f is closed by this method.
DESIGN: it's possible we'll ditch AddStdin and TypeStdin in some future version; this mechanism is a stopgap.
func (*Files) AddTypeDetectors ¶
func (fs *Files) AddTypeDetectors(detectFns ...TypeDetectFunc)
AddTypeDetectors adds type detectors.
func (*Files) Open ¶
func (fs *Files) Open(src *Source) (io.ReadCloser, error)
Open returns a new io.ReadCloser for src.Location. If src.Handle is StdinHandle, AddStdin must first have been invoked. The caller must close the reader.
func (*Files) OpenFunc ¶
func (fs *Files) OpenFunc(src *Source) func() (io.ReadCloser, error)
OpenFunc returns a func that invokes fs.Open for src.Location.
func (*Files) Size ¶
Size returns the file size of src.Location. This exists as a convenience function and something of a replacement for using os.Stat to get the file size.
type Metadata ¶
type Metadata struct {
	// Handle is the source handle.
	Handle string `json:"handle"`

	// Name is the base name of the source, e.g. the base filename
	// or DB name etc. For example, "sakila".
	Name string `json:"name"`

	// FQName is the full name of the data source, typically
	// including catalog/schema etc. For example, "sakila.public"
	FQName string `json:"name_fq"`

	// SourceType is the source driver type.
	SourceType Type `json:"driver"`

	// DBDriverType is the type of the underlying DB driver.
	// This is the same value as SourceType for SQL database types.
	DBDriverType Type `json:"db_driver"`

	// DBProduct is the DB product string, such as "PostgreSQL 9.6.17 on x86_64-pc-linux-gnu".
	DBProduct string `json:"db_product"`

	// DBVersion is the DB version.
	DBVersion string `json:"db_version"`

	// DBVars are configuration name-value pairs from the DB.
	DBVars []DBVar `json:"db_variables,omitempty"`

	// Location is the source location such as a DB connection string,
	// a file path, or a URL.
	Location string `json:"location"`

	// User is the username, if applicable.
	User string `json:"user,omitempty"`

	// Size is the physical size of the source in bytes, e.g. DB file size.
	Size int64 `json:"size"`

	// Tables is the metadata for each table in the source.
	Tables []*TableMetadata `json:"tables"`
}
Metadata holds metadata for a source.
func (*Metadata) TableNames ¶
TableNames is a convenience method that returns md's table names.
type Set ¶
type Set struct {
// contains filtered or unexported fields
}
Set is a set of sources. Typically it is loaded from config at the start of a run.
func (*Set) MarshalJSON ¶
MarshalJSON implements json.Marshaler.
func (*Set) MarshalYAML ¶
MarshalYAML implements yaml.Marshaler.
func (*Set) SetActive ¶
SetActive sets the active src, or unsets any active src if handle is empty. If handle does not exist, an error is returned.
func (*Set) SetScratch ¶
SetScratch sets the scratch src to handle.
func (*Set) UnmarshalJSON ¶
UnmarshalJSON implements json.Unmarshaler.
func (*Set) UnmarshalYAML ¶
UnmarshalYAML implements yaml.Unmarshaler.
type Source ¶
type Source struct {
	Handle   string          `yaml:"handle" json:"handle"`
	Type     Type            `yaml:"type" json:"type"`
	Location string          `yaml:"location" json:"location"`
	Options  options.Options `yaml:"options,omitempty" json:"options,omitempty"`
}
Source describes a data source.
func (*Source) RedactedLocation ¶
RedactedLocation returns s.Location, with the password component of the location masked.
func (*Source) ShortLocation ¶
ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or user@host[:port]/db for a DSN.
type TableMetadata ¶
type TableMetadata struct {
	// Name is the table name, such as "actor".
	Name string `json:"name"`

	// FQName is the fully-qualified name, such as "sakila.public.actor"
	FQName string `json:"name_fq,omitempty"`

	// TableType indicates if this is a "table" or "view". The value
	// is driver-independent. See DBTableType for the driver-dependent
	// value.
	TableType string `json:"table_type,omitempty"`

	// DBTableType indicates if this is a table or view, etc.
	// The value is driver-dependent, e.g. "BASE TABLE" or "VIEW" for postgres.
	DBTableType string `json:"table_type_db,omitempty"`

	// RowCount is the number of rows in the table.
	RowCount int64 `json:"row_count"`

	// Size is the physical size of the table in bytes. For a view, this
	// may be nil.
	Size *int64 `json:"size,omitempty"`

	// Comment is the comment for the table. Typically empty.
	Comment string `json:"comment,omitempty"`

	// Columns holds the metadata for the table's columns.
	Columns []*ColMetadata `json:"columns"`
}
TableMetadata models table (or view) metadata.
func TableFromSourceMetadata ¶ deprecated
func TableFromSourceMetadata(srcMeta *Metadata, tblName string) (*TableMetadata, error)
TableFromSourceMetadata returns TableMetadata whose name matches tblName.
Deprecated: Each driver should implement this correctly for a single table.
func (*TableMetadata) Column ¶
func (t *TableMetadata) Column(colName string) *ColMetadata
Column returns the named col or nil.
func (*TableMetadata) PKCols ¶
func (t *TableMetadata) PKCols() []*ColMetadata
PKCols returns a possibly empty slice of cols that are part of the table primary key.
func (*TableMetadata) String ¶
func (t *TableMetadata) String() string
type Type ¶
type Type string
Type is a source type, e.g. "mysql", "postgres", "csv", etc.
func DetectMagicNumber ¶
func DetectMagicNumber(ctx context.Context, log lg.Log, openFn FileOpenFunc) (detected Type, score float32, err error)
DetectMagicNumber is a TypeDetectFunc that uses an external pkg (h2non/filetype) to detect the "magic number" from the start of files.
type TypeDetectFunc ¶
type TypeDetectFunc func(ctx context.Context, log lg.Log, openFn FileOpenFunc) (detected Type, score float32, err error)
TypeDetectFunc interrogates a byte stream to determine the source driver type. A score is returned indicating the confidence that the driver type has been detected. A score <= 0 is failure, a score >= 1 is success; intermediate values indicate some level of confidence. An error is returned only if an IO problem occurred. The implementation gets access to the byte stream by invoking openFn, and is responsible for closing any reader it opens.