source

package
v0.40.0
Published: Jul 3, 2023 License: MIT Imports: 32 Imported by: 0

Documentation

Overview

Package source provides functionality for dealing with data sources.

Constants

const (
	// StdinHandle is the reserved handle for stdin pipe input.
	StdinHandle = "@stdin"

	// ActiveHandle is the reserved handle for the active source.
	// FIXME: it should be possible to use "@0" as the active handle, but
	//  the SLQ grammar doesn't currently allow it. Possibly change this
	//  value to "@0" after modifying the SLQ grammar.
	ActiveHandle = "@active"

	// ScratchHandle is the reserved handle for the scratch source.
	ScratchHandle = "@scratch"

	// JoinHandle is the reserved handle for the join db source.
	JoinHandle = "@join"

	// MonotableName is the table name used for "mono-table" drivers
	// such as CSV. Thus a source @address_csv will have its
	// data accessible via @address_csv.data.
	MonotableName = "data"
)
const (

	// RootGroup is the identifier for the default root group.
	RootGroup = "/"
)
const TypeNone = DriverType("")

TypeNone is the zero value of DriverType.

Variables

This section is empty.

Functions

func AbsLocation

func AbsLocation(loc string) string

AbsLocation returns the absolute path of loc. That is, relative paths etc. are resolved. If loc is not a file path or it cannot be processed, loc is returned unmodified.

func Contains added in v0.33.0

func Contains[S *Source | ~string](srcs []*Source, s S) bool

Contains returns true if srcs contains s, where s is a Source or a source handle.

func IsSQLLocation

func IsSQLLocation(loc string) bool

IsSQLLocation returns true if source location loc seems to be a DSN for a SQL driver.

func IsValidGroup added in v0.33.0

func IsValidGroup(group string) bool

IsValidGroup returns true if group is a valid group. Examples:

/
prod
prod/customer
prod/customer/pg

Note that "/" is a special case, representing the root group.

func IsValidHandle added in v0.33.0

func IsValidHandle(handle string) bool

IsValidHandle returns false if handle is not a valid handle.

See also: ValidHandle.

func LocationFileName

func LocationFileName(src *Source) (string, error)

LocationFileName returns the final component of the file/URL path.

func LocationWithPassword added in v0.18.0

func LocationWithPassword(loc, passw string) (string, error)

LocationWithPassword returns the location string with the password value set, overriding any existing password. If loc is not a URL (e.g. it's a file path), it is returned unmodified.
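
The behavior described above can be approximated with the standard library's net/url package. This is a minimal sketch, not the package's implementation: withPassword is a hypothetical helper, and the rule "no scheme or host means loc is a file path" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// withPassword is a hypothetical stand-in for LocationWithPassword.
// It sets (or overrides) the password component of a URL location.
func withPassword(loc, passw string) (string, error) {
	u, err := url.Parse(loc)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		// Assumption: without a scheme and host, loc is treated as a
		// file path and returned unmodified.
		return loc, nil
	}
	username := ""
	if u.User != nil {
		username = u.User.Username()
	}
	u.User = url.UserPassword(username, passw)
	return u.String(), nil
}

func main() {
	for _, loc := range []string{"postgres://alice:old@localhost/sakila", "/tmp/data.csv"} {
		got, err := withPassword(loc, "new")
		fmt.Println(got, err)
	}
}
```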

func ParseTableHandle

func ParseTableHandle(input string) (handle, table string, err error)

ParseTableHandle attempts to parse a SLQ source handle and/or table name. Surrounding whitespace is trimmed. Examples of valid input values are:

@handle.tblName
@handle
.tblName

func RedactGroup added in v0.34.0

func RedactGroup(g *Group)

RedactGroup modifies g, cloning each descendant Source, and setting the Source.Location field of each contained source to its redacted value.

func RedactLocation added in v0.15.0

func RedactLocation(loc string) string

RedactLocation returns a redacted version of the source location loc, with the password component (if any) of the location masked.
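
For URL-style locations, similar masking is available in the standard library via (*url.URL).Redacted, which replaces any password with "xxxxx". The sketch below uses it to illustrate the idea; maskPassword is a hypothetical helper, and the mask string that RedactLocation itself produces is not stated in this documentation.

```go
package main

import (
	"fmt"
	"net/url"
)

// maskPassword is a hypothetical stand-in for RedactLocation.
// Locations that do not parse as a URL, or that carry no password,
// are returned unchanged.
func maskPassword(loc string) string {
	u, err := url.Parse(loc)
	if err != nil || u.User == nil {
		return loc
	}
	if _, hasPassword := u.User.Password(); !hasPassword {
		return loc
	}
	// Redacted replaces the password with "xxxxx".
	return u.Redacted()
}

func main() {
	fmt.Println(maskPassword("postgres://alice:secret@localhost:5432/sakila"))
	fmt.Println(maskPassword("/path/to/data.csv"))
}
```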

func ReservedHandles

func ReservedHandles() []string

ReservedHandles returns a slice of the handle names that are reserved for sq use.

func ShortLocation

func ShortLocation(loc string) string

ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or for a DSN, user@host[:port]/db.

func Sort added in v0.33.0

func Sort(srcs []*Source)

Sort sorts a slice of sources by handle.

func SortGroups added in v0.33.0

func SortGroups(groups []*Group)

SortGroups sorts a slice of groups by name.

func SuggestHandle

func SuggestHandle(coll *Collection, typ DriverType, loc string) (string, error)

SuggestHandle suggests a handle based on location and type. If typ is TypeNone, the type will be inferred from loc. The collection coll is consulted to determine whether a suggested handle is free to be used; if it is taken, a numeric suffix is appended (e.g. "@csv/sakila" -> "@csv/sakila1", etc).

If the base name (derived from loc) contains illegal handle runes, those are replaced with underscore. If the handle starts with a number or underscore, it will be prefixed with "h" (for "handle"). Thus "123.xlsx" becomes "@h123_xlsx".
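
The sanitizing and de-duplication rules can be sketched as follows. Both helpers are hypothetical and not part of the package: sanitizeHandleName assumes the legal handle runes are ASCII letters, digits, and underscore (per the ValidHandle pattern), and uniqueHandle assumes the numeric suffix starts at 1 and increments until a free handle is found.

```go
package main

import (
	"fmt"
	"strconv"
)

// sanitizeHandleName is a hypothetical helper: it replaces illegal runes
// in base with '_' and prefixes "h" if the result starts with a digit
// or underscore.
func sanitizeHandleName(base string) string {
	out := make([]byte, 0, len(base))
	for _, r := range base {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			out = append(out, byte(r))
		default:
			out = append(out, '_')
		}
	}
	name := string(out)
	if name == "" || name[0] == '_' || (name[0] >= '0' && name[0] <= '9') {
		name = "h" + name
	}
	return "@" + name
}

// uniqueHandle is a hypothetical helper: it appends 1, 2, 3... to
// candidate until taken reports that the handle is free.
func uniqueHandle(candidate string, taken func(string) bool) string {
	if !taken(candidate) {
		return candidate
	}
	for i := 1; ; i++ {
		if next := candidate + strconv.Itoa(i); !taken(next) {
			return next
		}
	}
}

func main() {
	existing := map[string]bool{"@csv/sakila": true}
	taken := func(h string) bool { return existing[h] }

	fmt.Println(sanitizeHandleName("123.xlsx"))
	fmt.Println(uniqueHandle("@csv/sakila", taken))
}
```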

func Target added in v0.31.0

func Target(src *Source, tbl string) string

Target returns @handle.tbl. This is often used in log messages.

func TempDirFile

func TempDirFile(filename string) (dir, file string, err error)

TempDirFile creates a new temporary file in a new temp dir, opens the file for reading and writing, and then closes it. It's probably unnecessary to go through the ceremony of opening and closing the file, but maybe it's better to fail early. It is the caller's responsibility to remove the file and/or dir if desired.

func ValidGroup added in v0.33.0

func ValidGroup(group string) error

ValidGroup returns an error if group is not a valid group name.

func ValidHandle added in v0.33.0

func ValidHandle(handle string) error

ValidHandle returns an error if handle is not an acceptable source handle value. Valid input must match:

\A@([a-zA-Z][a-zA-Z0-9_]*)(/[a-zA-Z][a-zA-Z0-9_]*)*$

Examples:

@handle
@group/handle
@group/sub/sub2/handle

See also: IsValidHandle.
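
The pattern above can be exercised directly with Go's regexp package. This sketch checks the pattern only; matchesHandlePattern is a hypothetical helper, and ValidHandle may apply further checks that are not described here.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlePattern is the pattern quoted in the ValidHandle documentation.
var handlePattern = regexp.MustCompile(`\A@([a-zA-Z][a-zA-Z0-9_]*)(/[a-zA-Z][a-zA-Z0-9_]*)*$`)

// matchesHandlePattern is a hypothetical helper, not part of the package.
func matchesHandlePattern(handle string) bool {
	return handlePattern.MatchString(handle)
}

func main() {
	handles := []string{
		"@handle",
		"@group/handle",
		"@group/sub/sub2/handle",
		"handle",  // missing "@"
		"@1abc",   // starts with a digit
		"@group/", // trailing slash
	}
	for _, h := range handles {
		fmt.Printf("%-24s %v\n", h, matchesHandlePattern(h))
	}
}
```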

func VerifyIntegrity added in v0.34.0

func VerifyIntegrity(coll *Collection) (repaired bool, err error)

VerifyIntegrity verifies the internal state of coll. Typically this func is invoked after coll has been loaded from config, verifying that the config is not corrupt. If err is returned non-nil, repaired may be returned true to indicate that coll has been repaired and modified. The caller should save the config to persist the repair.

Types

type ColMetadata

type ColMetadata struct {
	Name         string    `json:"name"`
	Position     int64     `json:"position"`
	PrimaryKey   bool      `json:"primary_key"`
	BaseType     string    `json:"base_type"`
	ColumnType   string    `json:"column_type"`
	Kind         kind.Kind `json:"kind"`
	Nullable     bool      `json:"nullable"`
	DefaultValue string    `json:"default_value,omitempty"`
	Comment      string    `json:"comment,omitempty"`
}

ColMetadata models metadata for a particular column of a data source.

func (*ColMetadata) Clone added in v0.23.0

func (c *ColMetadata) Clone() *ColMetadata

Clone returns a deep copy of c. If c is nil, nil is returned.

func (*ColMetadata) String

func (c *ColMetadata) String() string

type Collection added in v0.34.0

type Collection struct {
	// contains filtered or unexported fields
}

Collection is a set of sources. Typically it is loaded from config at the start of a run. Collection's methods are safe for concurrent use.

func (*Collection) Active added in v0.34.0

func (c *Collection) Active() *Source

Active returns the active source, or nil if no active source.

func (*Collection) ActiveGroup added in v0.34.0

func (c *Collection) ActiveGroup() string

ActiveGroup returns the active group, which may be the root group, represented by "/".

func (*Collection) ActiveHandle added in v0.34.0

func (c *Collection) ActiveHandle() string

ActiveHandle returns the handle of the active source, or empty string if no active src.

func (*Collection) Add added in v0.34.0

func (c *Collection) Add(src *Source) error

Add adds src to c.

func (*Collection) Clone added in v0.34.0

func (c *Collection) Clone() *Collection

Clone returns a deep copy of c. If c is nil, nil is returned.

func (*Collection) Data added in v0.34.0

func (c *Collection) Data() any

Data returns the internal representation of the set data. This is a filthy hack so that the internal data can be passed directly to sq's colorizing json encoder (it can't handle colorization of values that implement json.Marshaler).

There are two long-term solutions here:

  1. The color encoder needs to be able to handle json.RawMessage.
  2. Refactor source.Collection so that it doesn't have this weird internal representation.

func (*Collection) Get added in v0.34.0

func (c *Collection) Get(handle string) (*Source, error)

Get gets the src with handle, or returns an error.

func (*Collection) Groups added in v0.34.0

func (c *Collection) Groups() []string

Groups returns the sorted set of groups, as defined via the handle names.

Given a set of handles:

@handle1
@group1/handle2
@group1/handle3
@group2/handle4
@group2/sub1/handle5
@group2/sub1/sub2/sub3/handle6

Then these groups will be returned.

/
group1
group2
group2/sub1
group2/sub1/sub2
group2/sub1/sub2/sub3

Note that the default or root group is represented by "/".

func (*Collection) Handles added in v0.34.0

func (c *Collection) Handles() []string

Handles returns a new slice containing the set of all source handles.

func (*Collection) HandlesInGroup added in v0.34.0

func (c *Collection) HandlesInGroup(group string) ([]string, error)

HandlesInGroup returns the set of handles in group.

func (*Collection) IsExistingGroup added in v0.34.0

func (c *Collection) IsExistingGroup(group string) bool

IsExistingGroup returns false if group does not exist.

func (*Collection) IsExistingSource added in v0.34.0

func (c *Collection) IsExistingSource(handle string) bool

IsExistingSource returns true if handle already exists in the set.

func (*Collection) MarshalJSON added in v0.34.0

func (c *Collection) MarshalJSON() ([]byte, error)

MarshalJSON implements json.Marshaler.

func (*Collection) MarshalYAML added in v0.34.0

func (c *Collection) MarshalYAML() (any, error)

MarshalYAML implements yaml.Marshaler.

func (*Collection) MoveHandleToGroup added in v0.34.0

func (c *Collection) MoveHandleToGroup(handle, toGroup string) (*Source, error)

MoveHandleToGroup renames handle so that it is in toGroup.

$ sq mv @prod/db production
@production/db

$ sq mv @prod/db /
@db
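
The renaming in the two sq mv examples above comes down to string manipulation on the handle. The sketch below shows only that step; moveToGroup is a hypothetical helper that does no validation and does not touch a Collection.

```go
package main

import (
	"fmt"
	"strings"
)

// moveToGroup is a hypothetical helper that computes the new handle
// when the source named by handle is moved into toGroup. An empty
// toGroup is treated like the root group "/" (an assumption).
func moveToGroup(handle, toGroup string) string {
	// The source's own name is the final path component of the handle.
	name := handle[strings.LastIndex(handle, "/")+1:]
	name = strings.TrimPrefix(name, "@")
	if toGroup == "/" || toGroup == "" {
		return "@" + name
	}
	return "@" + toGroup + "/" + name
}

func main() {
	fmt.Println(moveToGroup("@prod/db", "production"))
	fmt.Println(moveToGroup("@prod/db", "/"))
}
```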

func (*Collection) Remove added in v0.34.0

func (c *Collection) Remove(handle string) error

Remove removes from the set the src having handle.

func (*Collection) RemoveGroup added in v0.34.0

func (c *Collection) RemoveGroup(group string) ([]*Source, error)

RemoveGroup removes all sources that are children of group. The removed sources are returned. If group was the active group, the active group is set to "/" (root group).

func (*Collection) RenameGroup added in v0.34.0

func (c *Collection) RenameGroup(oldGroup, newGroup string) ([]*Source, error)

RenameGroup renames oldGroup to newGroup. Each affected source is returned. This effectively "moves" sources in oldGroup to newGroup, by renaming those sources.

func (*Collection) RenameSource added in v0.34.0

func (c *Collection) RenameSource(oldHandle, newHandle string) (*Source, error)

RenameSource renames oldHandle to newHandle. If the source was the active source, it remains so (under the new handle). If the source's group was the active group and oldHandle was the only member of the group, newHandle's group becomes the new active group.

func (*Collection) Scratch added in v0.34.0

func (c *Collection) Scratch() *Source

Scratch returns the scratch source, or nil.

func (*Collection) SetActive added in v0.34.0

func (c *Collection) SetActive(handle string, force bool) (*Source, error)

SetActive sets the active src, or unsets any active src if handle is empty (and thus returns nil,nil). If handle does not exist, an error is returned, unless arg force is true, in which case the returned *Source may be nil.

TODO: Revisit SetActive(force) mechanism. It's a hack that we shouldn't need.

func (*Collection) SetActiveGroup added in v0.34.0

func (c *Collection) SetActiveGroup(group string) error

SetActiveGroup sets the active group, returning an error if group does not exist.

func (*Collection) SetScratch added in v0.34.0

func (c *Collection) SetScratch(handle string) (*Source, error)

SetScratch sets the scratch src to handle. If handle is empty string, the scratch src is unset, and nil,nil is returned.

func (*Collection) Sources added in v0.34.0

func (c *Collection) Sources() []*Source

Sources returns a new slice containing the set's sources. It is safe to mutate the returned slice, but note that changes to the *Source elements themselves do take effect in the set's backing data.
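
The distinction between mutating the returned slice and mutating its elements follows from Go's slice-of-pointers semantics. The sketch below demonstrates it with simplified stand-in types; this Source and Collection are illustrative only and omit the locking and other fields of the real types.

```go
package main

import "fmt"

// Source and Collection are simplified stand-ins for illustration.
type Source struct{ Handle string }

type Collection struct{ srcs []*Source }

// Sources returns a new slice holding the same *Source pointers.
func (c *Collection) Sources() []*Source {
	out := make([]*Source, len(c.srcs))
	copy(out, c.srcs)
	return out
}

func main() {
	c := &Collection{srcs: []*Source{{Handle: "@a"}, {Handle: "@b"}}}

	got := c.Sources()
	got[0] = nil                 // affects only the returned slice
	got[1].Handle = "@renamed"   // affects the shared *Source

	fmt.Println(c.srcs[0] != nil)  // the backing slice still holds @a
	fmt.Println(c.srcs[1].Handle)  // the element change is visible
}
```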

func (*Collection) SourcesInGroup added in v0.34.0

func (c *Collection) SourcesInGroup(group string) ([]*Source, error)

SourcesInGroup returns all sources that are descendants of group. If group is "" or "/", all sources are returned.

func (*Collection) String added in v0.34.0

func (c *Collection) String() string

String returns a log/debug friendly representation.

func (*Collection) Tree added in v0.34.0

func (c *Collection) Tree(fromGroup string) (*Group, error)

Tree returns a new Group representing the structure of the set starting at fromGroup downwards. If fromGroup is empty, RootGroup is used. The Group structure is a snapshot of the Collection at the time Tree is invoked. Thus, any change to Collection structure is not reflected in the Group. However, the Source elements of Group are pointers back to the Collection elements, and thus changes to the fields of a Source are reflected in the Collection.

func (*Collection) UnmarshalJSON added in v0.34.0

func (c *Collection) UnmarshalJSON(b []byte) error

UnmarshalJSON implements json.Unmarshaler.

func (*Collection) UnmarshalYAML added in v0.34.0

func (c *Collection) UnmarshalYAML(unmarshal func(any) error) error

UnmarshalYAML implements yaml.Unmarshaler.

func (*Collection) Visit added in v0.34.0

func (c *Collection) Visit(fn func(src *Source) error) error

Visit visits each source.

type DriverDetectFunc added in v0.34.0

type DriverDetectFunc func(ctx context.Context, openFn FileOpenFunc) (detected DriverType, score float32, err error)

DriverDetectFunc interrogates a byte stream to determine the source driver type. A score is returned indicating the confidence that the driver type has been detected. A score <= 0 is failure, a score >= 1 is success; intermediate values indicate some level of confidence. An error is returned only if an IO problem occurred. The implementation gets access to the byte stream by invoking openFn, and is responsible for closing any reader it opens.

type DriverType added in v0.34.0

type DriverType string

DriverType is a driver type, e.g. "mysql", "postgres", "csv", etc.

func DetectMagicNumber

func DetectMagicNumber(ctx context.Context, openFn FileOpenFunc,
) (detected DriverType, score float32, err error)

DetectMagicNumber is a DriverDetectFunc that uses an external pkg (h2non/filetype) to detect the "magic number" from the start of files.

func (DriverType) String added in v0.34.0

func (t DriverType) String() string

String returns a log/debug-friendly representation.

type FileOpenFunc

type FileOpenFunc func() (io.ReadCloser, error)

FileOpenFunc is a func that opens and returns a ReadCloser. The caller is responsible for closing the returned ReadCloser.

type Files

type Files struct {
	// contains filtered or unexported fields
}

Files is the centralized API for interacting with files.

Why does Files exist? There's a need for functionality to transparently get a Reader for remote or local files and, most importantly, an ability for multiple goroutines to read/sample a file while it's being read (mainly to "sample" the file type, e.g. to determine if it's an XLSX file etc.). Currently we use fscache under the hood for this, but our implementation is not satisfactory: in particular, it requires that we read the entire source file into fscache before it's available to be read (which is awful if we're reading a long-running pipe from stdin). This entire thing needs to be revisited. Maybe Files even becomes a fs.FS.

func NewFiles

func NewFiles(ctx context.Context) (*Files, error)

NewFiles returns a new Files instance.

func (*Files) AddDriverDetectors added in v0.34.0

func (fs *Files) AddDriverDetectors(detectFns ...DriverDetectFunc)

AddDriverDetectors adds driver type detectors.

func (*Files) AddStdin

func (fs *Files) AddStdin(f *os.File) error

AddStdin copies f to fs's cache: the stdin data in f is later accessible via fs.Open(src) where src.Handle is StdinHandle; f's type can be detected via TypeStdin. Note that f is closed by this method.

REVISIT: it's possible we'll ditch AddStdin and TypeStdin in some future version; this mechanism is a stopgap.

func (*Files) CleanupE

func (fs *Files) CleanupE(fn func() error)

CleanupE adds fn to the cleanup sequence invoked by fs.Close.

func (*Files) Close

func (fs *Files) Close() error

Close closes any open resources.

func (*Files) DriverType added in v0.34.0

func (fs *Files) DriverType(ctx context.Context, loc string) (DriverType, error)

DriverType returns the driver type of loc.

func (*Files) Open

func (fs *Files) Open(src *Source) (io.ReadCloser, error)

Open returns a new io.ReadCloser for src.Location. If src.Handle is StdinHandle, AddStdin must first have been invoked. The caller must close the reader.

func (*Files) OpenFunc

func (fs *Files) OpenFunc(src *Source) func() (io.ReadCloser, error)

OpenFunc returns a func that invokes fs.Open for src.Location.

func (*Files) ReadAll

func (fs *Files) ReadAll(src *Source) ([]byte, error)

ReadAll is a convenience method to read the bytes of a source.

func (*Files) Size

func (fs *Files) Size(src *Source) (size int64, err error)

Size returns the file size of src.Location. This exists as a convenience function and something of a replacement for using os.Stat to get the file size.

func (*Files) TypeStdin

func (fs *Files) TypeStdin(ctx context.Context) (DriverType, error)

TypeStdin detects the type of stdin as previously added by AddStdin. An error is returned if AddStdin was not first invoked. If the type cannot be detected, TypeNone and nil are returned.

type Group added in v0.33.0

type Group struct {
	// Name is the group name. For the root group, this is source.RootGroup ("/").
	Name string `json:"name" yaml:"name"`

	// Active is true if this is the active group in the set.
	Active bool `json:"active" yaml:"active"`

	// Sources are the direct members of the group.
	Sources []*Source `json:"sources,omitempty" yaml:"sources,omitempty"`

	// Groups holds any subgroups.
	Groups []*Group `json:"groups,omitempty" yaml:"groups,omitempty"`
}

Group models the hierarchical group structure of a set.

func (*Group) AllGroups added in v0.33.0

func (g *Group) AllGroups() []*Group

AllGroups returns a new flattened slice of Groups containing g and any subgroups.

func (*Group) AllSources added in v0.33.0

func (g *Group) AllSources() []*Source

AllSources returns a new flattened slice of *Source containing all the sources in g and its descendants.

func (*Group) Counts added in v0.33.0

func (g *Group) Counts() (directSrc, totalSrc, directGroup, totalGroup int)

Counts returns counts for g.

- directSrc: direct source child members of g
- totalSrc: all source descendants of g
- directGroup: direct group child members of g
- totalGroup: all group descendants of g

If g is empty, {0,0,0,0} is returned.
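
The four counts can be computed with a simple recursive walk. This sketch uses trimmed-down stand-in types and a hypothetical counts function; it assumes the totals include the direct members.

```go
package main

import "fmt"

// Source and Group are simplified stand-ins for illustration.
type Source struct{ Handle string }

type Group struct {
	Name    string
	Sources []*Source
	Groups  []*Group
}

// counts is a hypothetical stand-in for (*Group).Counts.
// Assumption: totalSrc and totalGroup include the direct members.
func counts(g *Group) (directSrc, totalSrc, directGroup, totalGroup int) {
	if g == nil {
		return 0, 0, 0, 0
	}
	directSrc = len(g.Sources)
	directGroup = len(g.Groups)
	totalSrc, totalGroup = directSrc, directGroup
	for _, sub := range g.Groups {
		_, subSrc, _, subGroups := counts(sub)
		totalSrc += subSrc
		totalGroup += subGroups
	}
	return directSrc, totalSrc, directGroup, totalGroup
}

func main() {
	root := &Group{
		Name:    "/",
		Sources: []*Source{{Handle: "@db"}},
		Groups: []*Group{{
			Name:    "prod",
			Sources: []*Source{{Handle: "@prod/a"}, {Handle: "@prod/b"}},
			Groups: []*Group{{
				Name:    "prod/pg",
				Sources: []*Source{{Handle: "@prod/pg/c"}},
			}},
		}},
	}
	fmt.Println(counts(root))
}
```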

func (*Group) String added in v0.33.0

func (g *Group) String() string

String returns a log/debug friendly representation.

type Metadata

type Metadata struct {
	// Handle is the source handle.
	Handle string `json:"handle" yaml:"handle"`

	// Location is the source location such as a DB connection string,
	// a file path, or a URL.
	Location string `json:"location" yaml:"location"`

	// Name is the base name of the source, e.g. the base filename
	// or DB name etc. For example, "sakila".
	Name string `json:"name" yaml:"name"`

	// FQName is the full name of the data source, typically
	// including catalog/schema etc. For example, "sakila.public"
	FQName string `json:"name_fq" yaml:"name_fq"`

	// Schema is the schema name, for example "public".
	// This may be empty for some sources.
	Schema string `json:"schema,omitempty" yaml:"schema,omitempty"`

	// Driver is the source driver type.
	Driver DriverType `json:"driver" yaml:"driver"`

	// DBDriver is the type of the underlying DB driver.
	// This is the same value as Driver for SQL database types.
	DBDriver DriverType `json:"db_driver" yaml:"db_driver"`

	// DBProduct is the DB product string, such as "PostgreSQL 9.6.17 on x86_64-pc-linux-gnu".
	DBProduct string `json:"db_product" yaml:"db_product"`

	// DBVersion is the DB version.
	DBVersion string `json:"db_version" yaml:"db_version"`

	// User is the username, if applicable.
	User string `json:"user,omitempty" yaml:"user,omitempty"`

	// Size is the physical size of the source in bytes, e.g. DB file size.
	Size int64 `json:"size" yaml:"size"`

	// TableCount is the count of tables (excluding views).
	TableCount int64 `json:"table_count" yaml:"table_count"`

	// ViewCount is the count of views.
	ViewCount int64 `json:"view_count" yaml:"view_count"`

	// Tables is the metadata for each table/view in the source.
	Tables []*TableMetadata `json:"tables" yaml:"tables"`

	// DBProperties are name-value pairs from the DB.
	// Typically the value is a scalar such as integer or string, but
	// it can be a nested value such as map or array.
	DBProperties map[string]any `json:"db_properties,omitempty"`
}

Metadata holds metadata for a source.

func (*Metadata) Clone added in v0.23.0

func (md *Metadata) Clone() *Metadata

Clone returns a deep copy of md. If md is nil, nil is returned.

func (*Metadata) String

func (md *Metadata) String() string

func (*Metadata) Table added in v0.31.0

func (md *Metadata) Table(tblName string) *TableMetadata

Table returns the named table, or nil.

func (*Metadata) TableNames

func (md *Metadata) TableNames() []string

TableNames is a convenience method that returns md's table names.

type Source

type Source struct {
	// Handle is used to refer to a source, e.g. "@sakila".
	Handle string `yaml:"handle" json:"handle"`

	// Type is the driver type, e.g. postgres.Type.
	Type DriverType `yaml:"driver" json:"driver"`

	// Location is the source location, such as a DB connection URI,
	// or a file path.
	Location string `yaml:"location" json:"location"`

	// Options are additional params, typically empty.
	Options options.Options `yaml:"options,omitempty" json:"options,omitempty"`
}

Source describes a data source.

func RedactSources added in v0.33.0

func RedactSources(srcs ...*Source) []*Source

RedactSources returns a new slice, where each element is a clone of the input *Source with its location field redacted. This is useful for printing.

func (*Source) Clone added in v0.23.0

func (s *Source) Clone() *Source

Clone returns a deep copy of s. If s is nil, nil is returned.

func (*Source) Group added in v0.33.0

func (s *Source) Group() string

Group returns the source's group. If s is in the root group, the empty string is returned.

FIXME: For root group, should "/" be returned instead of empty string?

func (*Source) LogValue added in v0.31.0

func (s *Source) LogValue() slog.Value

LogValue implements slog.LogValuer.

func (*Source) RedactedLocation

func (s *Source) RedactedLocation() string

RedactedLocation returns s.Location, with the password component of the location masked.

func (*Source) ShortLocation

func (s *Source) ShortLocation() string

ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or for a DSN, user@host[:port]/db.

func (*Source) String

func (s *Source) String() string

String returns a log/debug-friendly representation.

type TableMetadata

type TableMetadata struct {
	// Name is the table name, such as "actor".
	Name string `json:"name" yaml:"name"`

	// FQName is the fully-qualified name, such as "sakila.public.actor"
	FQName string `json:"name_fq,omitempty" yaml:"name_fq,omitempty"`

	// TableType indicates if this is a "table" or "view". The value
	// is driver-independent. See DBTableType for the driver-dependent
	// value.
	TableType string `json:"table_type,omitempty" yaml:"table_type,omitempty"`

	// DBTableType indicates if this is a table or view, etc.
	// The value is driver-dependent, e.g. "BASE TABLE" or "VIEW" for postgres.
	DBTableType string `json:"table_type_db,omitempty" yaml:"table_type_db,omitempty"`

	// RowCount is the number of rows in the table.
	RowCount int64 `json:"row_count" yaml:"row_count"`

	// Size is the physical size of the table in bytes. For a view, this
	// may be nil.
	Size *int64 `json:"size,omitempty" yaml:"size,omitempty"`

	// Comment is the comment for the table. Typically empty.
	Comment string `json:"comment,omitempty" yaml:"comment,omitempty"`

	// Columns holds the metadata for the table's columns.
	Columns []*ColMetadata `json:"columns" yaml:"columns"`
}

TableMetadata models table (or view) metadata.

func TableFromSourceMetadata deprecated

func TableFromSourceMetadata(srcMeta *Metadata, tblName string) (*TableMetadata, error)

TableFromSourceMetadata returns TableMetadata whose name matches tblName.

Deprecated: Each driver should implement this correctly for a single table.

func (*TableMetadata) Clone added in v0.23.0

func (t *TableMetadata) Clone() *TableMetadata

Clone returns a deep copy of t. If t is nil, nil is returned.

func (*TableMetadata) Column

func (t *TableMetadata) Column(colName string) *ColMetadata

Column returns the named col or nil.

func (*TableMetadata) PKCols

func (t *TableMetadata) PKCols() []*ColMetadata

PKCols returns a possibly empty slice of cols that are part of the table primary key.

func (*TableMetadata) String

func (t *TableMetadata) String() string

Directories

Path Synopsis
Package fetcher provides a mechanism for fetching files from URLs.
