Documentation ¶
Index ¶
- Variables
- func RegisterStorageFormat(name string, format StorageFormat)
- func RegisterStorageMedia(name string, media StorageMedia)
- type DatumReader
- type DatumWriter
- type GCSMedia
- type LocalMedia
- type RecordIOFormat
- type ResourceSpec
- func (rc *ResourceSpec) DatumReader(ctx context.Context, shard int) (DatumReader, error)
- func (rc *ResourceSpec) DatumWriter(ctx context.Context, shard int) (DatumWriter, error)
- func (rc *ResourceSpec) HasSpec() bool
- func (rc *ResourceSpec) IOReader(ctx context.Context, shard int) (io.ReadCloser, error)
- func (rc *ResourceSpec) IOWriter(ctx context.Context, shard int) (io.WriteCloser, error)
- func (rc *ResourceSpec) ShardPath(shard int) string
- func (rc *ResourceSpec) Sharded() bool
- func (rc *ResourceSpec) String() string
- type StorageFormat
- type StorageMedia
- type TextFormat
Constants ¶
This section is empty.
Variables ¶
Functions ¶
func RegisterStorageFormat ¶
func RegisterStorageFormat(name string, format StorageFormat)
RegisterStorageFormat registers a StorageFormat under the given name. It should be called from init().
func RegisterStorageMedia ¶
func RegisterStorageMedia(name string, media StorageMedia)
RegisterStorageMedia registers a StorageMedia under the given name. It should be called from init().
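The register-in-init() pattern described above can be sketched as follows. This is a minimal illustration that does not use this package's code: `storageMedia` is a simplified stand-in for StorageMedia (the real interface also takes a context, a ResourceSpec, and a shard), and `registerMedia`, `lookupMedia`, `readAllFrom`, and `constMedia` are hypothetical names introduced only for the example.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// storageMedia is a simplified, hypothetical stand-in for StorageMedia.
type storageMedia interface {
	IOReader(path string) (io.ReadCloser, error)
}

// mediaRegistry maps a media name to its implementation.
var mediaRegistry = map[string]storageMedia{}

// registerMedia mimics the shape of RegisterStorageMedia (assumption:
// the real function stores the value in a package-level table like this).
func registerMedia(name string, media storageMedia) {
	mediaRegistry[name] = media
}

// lookupMedia returns the media registered under name, if any.
func lookupMedia(name string) (storageMedia, bool) {
	m, ok := mediaRegistry[name]
	return m, ok
}

// constMedia is a toy media that serves the same content for every path.
type constMedia struct{ content string }

func (c constMedia) IOReader(path string) (io.ReadCloser, error) {
	return io.NopCloser(strings.NewReader(c.content)), nil
}

// Registration happens in init(), so the media is available before main runs.
func init() {
	registerMedia("const", constMedia{content: "hello"})
}

// readAllFrom looks up a media by name and reads everything it serves.
func readAllFrom(name, path string) (string, error) {
	m, ok := lookupMedia(name)
	if !ok {
		return "", fmt.Errorf("media %q is not registered", name)
	}
	r, err := m.IOReader(path)
	if err != nil {
		return "", err
	}
	defer r.Close()
	data, err := io.ReadAll(r)
	return string(data), err
}

func main() {
	got, err := readAllFrom("const", "/const/anything")
	fmt.Println(got, err)
}
```

Registering from init() means that importing a package for its side effects is enough to make its format or media resolvable by name.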
Types ¶
type DatumReader ¶
type DatumWriter ¶
type GCSMedia ¶
type GCSMedia struct{}
Media: gs. Reads from and writes to Google Cloud Storage. Resource paths look like format:/gs/bucket-name/object-name. Sharding is supported using the recommended shard naming. EXPERIMENTAL: expect bugs.
func (GCSMedia) IOReader ¶
func (gm GCSMedia) IOReader(ctx context.Context, rc ResourceSpec, shard int) (io.ReadCloser, error)
func (GCSMedia) IOWriter ¶
func (gm GCSMedia) IOWriter(ctx context.Context, rc ResourceSpec, shard int) (io.WriteCloser, error)
type LocalMedia ¶
type LocalMedia struct{}
Media: local. Reads from and writes to the local file system. The special path names STDIN, STDOUT, and STDERR have their conventional meanings.
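One way the special path names could be handled on the write side is sketched below. This is an assumption-laden illustration, not LocalMedia's actual code: `openLocalWriter` and `nopWriteCloser` are hypothetical names, and the sketch assumes the special names are matched exactly against the path.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// nopWriteCloser wraps a writer with a no-op Close, so closing the returned
// handle does not close the process's standard streams.
type nopWriteCloser struct{ io.Writer }

func (nopWriteCloser) Close() error { return nil }

// openLocalWriter is a hypothetical helper: STDOUT and STDERR map to the
// standard streams, and any other path is created as a regular file.
func openLocalWriter(path string) (io.WriteCloser, error) {
	switch path {
	case "STDOUT":
		return nopWriteCloser{os.Stdout}, nil
	case "STDERR":
		return nopWriteCloser{os.Stderr}, nil
	}
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	w, err := openLocalWriter("STDOUT")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer w.Close()
	fmt.Fprintln(w, "written to standard output")
}
```

Wrapping the standard streams keeps callers free to call Close unconditionally, which is the usual contract for an io.WriteCloser.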
func (LocalMedia) IOReader ¶
func (lm LocalMedia) IOReader(ctx context.Context, rc ResourceSpec, shard int) (io.ReadCloser, error)
func (LocalMedia) IOWriter ¶
func (lm LocalMedia) IOWriter(ctx context.Context, rc ResourceSpec, shard int) (io.WriteCloser, error)
type RecordIOFormat ¶
type RecordIOFormat struct {
// contains filtered or unexported fields
}
Format: recordio, recordkv. Reads and stores data in the recordio format specified in github.com/kuangyh/recordio. recordkv stores each datum as two records: one for the key and one for the value. recordio ignores datum.Key.
func (RecordIOFormat) DatumReader ¶
func (rf RecordIOFormat) DatumReader(ctx context.Context, rc ResourceSpec, shard int) (DatumReader, error)
func (RecordIOFormat) DatumWriter ¶
func (rf RecordIOFormat) DatumWriter(ctx context.Context, rc ResourceSpec, shard int) (DatumWriter, error)
type ResourceSpec ¶
ResourceSpec specifies an external data source or destination in Saw.
func ParseResourcePath ¶
func ParseResourcePath(path string) (ResourceSpec, error)
A resource path has the format format:{path}{@numShards}?, where the @numShards suffix is optional. The format and media must already be registered. If the path starts with '/', the parser tries to map its first section to a registered media and falls back to the local file system if none matches. Do not name a media after a well-known UNIX root directory, as that would cause confusion.
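The grammar above can be illustrated with a small stand-alone parser. This is a sketch under stated assumptions, not the package's ParseResourcePath: `spec`, `parsePath`, and `knownMedia` are hypothetical names, the last '@' in the path is assumed to introduce the shard count, the media prefix is assumed to stay part of the stored path, and a shard count of 0 is used to mean "not sharded".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// spec is a hypothetical, simplified stand-in for ResourceSpec.
type spec struct {
	Format    string
	Media     string
	Path      string
	NumShards int // 0 means the resource is not sharded (assumption)
}

// knownMedia lists the media names assumed to be registered in this sketch.
var knownMedia = map[string]bool{"gs": true}

// parsePath parses "format:{path}{@numShards}?".
func parsePath(s string) (spec, error) {
	format, rest, ok := strings.Cut(s, ":")
	if !ok || format == "" || rest == "" {
		return spec{}, fmt.Errorf("malformed resource path %q", s)
	}
	out := spec{Format: format, Media: "local", Path: rest}

	// Optional "@numShards" suffix.
	if i := strings.LastIndex(rest, "@"); i >= 0 {
		n, err := strconv.Atoi(rest[i+1:])
		if err != nil || n <= 0 {
			return spec{}, fmt.Errorf("bad shard count in %q", s)
		}
		out.Path, out.NumShards = rest[:i], n
	}
	if out.Path == "" {
		return spec{}, fmt.Errorf("empty path in %q", s)
	}

	// A leading "/name/..." selects a media when name is registered;
	// otherwise the path is treated as a local file system path.
	if strings.HasPrefix(out.Path, "/") {
		first, _, _ := strings.Cut(out.Path[1:], "/")
		if knownMedia[first] {
			out.Media = first
		}
	}
	return out, nil
}

func main() {
	for _, p := range []string{"recordio:/gs/bucket/obj@4", "textio:/tmp/in.txt"} {
		s, err := parsePath(p)
		fmt.Printf("%+v %v\n", s, err)
	}
}
```

The fallback to local explains the naming warning: a media called "tmp" or "usr" would capture ordinary absolute paths.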
func (*ResourceSpec) DatumReader ¶
func (rc *ResourceSpec) DatumReader(ctx context.Context, shard int) (DatumReader, error)
DatumReader returns a DatumReader for the format specified in the ResourceSpec. It may or may not use the underlying IOReader for reading. If the specified format cannot be implemented on the specified media, ErrStorageFeatureNotSupported is returned.
func (*ResourceSpec) DatumWriter ¶
func (rc *ResourceSpec) DatumWriter(ctx context.Context, shard int) (DatumWriter, error)
DatumWriter returns a DatumWriter for the format specified in the ResourceSpec. It may or may not use the underlying IOWriter for writing. If the specified format cannot be implemented on the specified media, ErrStorageFeatureNotSupported is returned.
func (*ResourceSpec) IOReader ¶
func (rc *ResourceSpec) IOReader(ctx context.Context, shard int) (io.ReadCloser, error)
IOReader returns an io.ReadCloser for the media specified in the ResourceSpec that reads from the specified shard. It does not necessarily point to the local file system, or even to persistent storage (for example, it may be a consumer of a messaging system).
func (*ResourceSpec) IOWriter ¶
func (rc *ResourceSpec) IOWriter(ctx context.Context, shard int) (io.WriteCloser, error)
IOWriter returns an io.WriteCloser for the media specified in the ResourceSpec that writes to the specified shard. It does not necessarily point to the local file system, or even to persistent storage (for example, it may emit to a messaging system).
func (*ResourceSpec) ShardPath ¶
func (rc *ResourceSpec) ShardPath(shard int) string
For a sharded path, ShardPath returns {path}-{shardIndex}-of-{totalShards}. This is the recommended format when storing sharded data in a file system, but individual implementations can have their own rules.
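The recommended naming can be sketched with a small helper. `shardPath` is a hypothetical function, not the method above; it assumes zero-based shard indexes, plain decimal formatting without zero padding, and that an unsharded resource keeps its path unchanged.

```go
package main

import "fmt"

// shardPath builds "{path}-{shardIndex}-of-{totalShards}".
// Assumption: totalShards <= 0 means the resource is not sharded.
func shardPath(path string, shard, totalShards int) string {
	if totalShards <= 0 {
		return path
	}
	return fmt.Sprintf("%s-%d-of-%d", path, shard, totalShards)
}

func main() {
	// Print the file name of every shard of a three-shard resource.
	for i := 0; i < 3; i++ {
		fmt.Println(shardPath("/data/out", i, 3))
	}
}
```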
func (*ResourceSpec) Sharded ¶
func (rc *ResourceSpec) Sharded() bool
func (*ResourceSpec) String ¶
func (rc *ResourceSpec) String() string
type StorageFormat ¶
type StorageFormat interface {
	DatumReader(ctx context.Context, rc ResourceSpec, shard int) (DatumReader, error)
	DatumWriter(ctx context.Context, rc ResourceSpec, shard int) (DatumWriter, error)
}
StorageFormat specifies how to read and write datums from an underlying StorageMedia. StorageFormat implementations can be globally registered with RegisterStorageFormat().
type StorageMedia ¶
type StorageMedia interface {
	IOReader(ctx context.Context, rc ResourceSpec, shard int) (io.ReadCloser, error)
	IOWriter(ctx context.Context, rc ResourceSpec, shard int) (io.WriteCloser, error)
}
StorageMedia specifies how to read and write bytes from external storage. StorageMedia implementations can be globally registered with RegisterStorageMedia().
type TextFormat ¶
type TextFormat struct{}
Format: textio. Reads and writes data line by line. datum.Key is ignored.
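Line-oriented reading and writing of this kind can be sketched with bufio. The `datum` type and the `readLines`, `writeLines`, and `roundTrip` helpers below are hypothetical stand-ins, since the package's own datum type and its DatumReader and DatumWriter interfaces are not shown here.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// datum is a hypothetical stand-in for the package's datum type.
type datum struct {
	Key   string
	Value []byte
}

// readLines produces one datum per input line, leaving Key empty.
func readLines(r io.Reader) ([]datum, error) {
	var out []datum
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		// Copy the line: the Scanner reuses its buffer between calls.
		line := append([]byte(nil), sc.Bytes()...)
		out = append(out, datum{Value: line})
	}
	return out, sc.Err()
}

// writeLines writes each Value followed by a newline and ignores Key.
func writeLines(w io.Writer, data []datum) error {
	bw := bufio.NewWriter(w)
	for _, d := range data {
		if _, err := bw.Write(d.Value); err != nil {
			return err
		}
		if err := bw.WriteByte('\n'); err != nil {
			return err
		}
	}
	return bw.Flush()
}

// roundTrip reads text as lines and writes the lines back out.
func roundTrip(text string) (string, error) {
	data, err := readLines(strings.NewReader(text))
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := writeLines(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := roundTrip("first\nsecond\n")
	fmt.Printf("%q %v\n", out, err)
}
```

Note that bufio.Scanner rejects lines longer than its buffer limit (64 KiB by default), so a reader for very long lines would need Scanner.Buffer or bufio.Reader instead.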
func (TextFormat) DatumReader ¶
func (tf TextFormat) DatumReader(ctx context.Context, rc ResourceSpec, shard int) (DatumReader, error)
func (TextFormat) DatumWriter ¶
func (tf TextFormat) DatumWriter(ctx context.Context, rc ResourceSpec, shard int) (DatumWriter, error)