Documentation ¶
Overview ¶
Package walk provides interfaces and methods for walking Library of Congress (LoC) data files.
Index ¶
- func RegisterWalker(ctx context.Context, scheme string, f WalkerInitializeFunc) error
- func Schemes() []string
- type LocalWalkReader
- type NDJSONWalker
- func (w *NDJSONWalker) WalkFile(ctx context.Context, cb WalkCallbackFunction, uri string) error
- func (w *NDJSONWalker) WalkReader(ctx context.Context, cb WalkCallbackFunction, r io.Reader) error
- func (w *NDJSONWalker) WalkURIs(ctx context.Context, cb WalkCallbackFunction, uris ...string) error
- func (w *NDJSONWalker) WalkZipFile(ctx context.Context, cb WalkCallbackFunction, uri string) error
- type RemoteWalkReader
- type WalkCallbackFunction
- type WalkReader
- type Walker
- type WalkerInitializeFunc
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func RegisterWalker ¶
func RegisterWalker(ctx context.Context, scheme string, f WalkerInitializeFunc) error
RegisterWalker() associates 'scheme' with 'f' in an internal list of available `Walker` implementations.
Types ¶
type LocalWalkReader ¶ added in v0.2.0
type LocalWalkReader struct {
	WalkReader
	// contains filtered or unexported fields
}
type LocalWalkReader implements the `WalkReader` interface for files on a local disk.
func (*LocalWalkReader) Close ¶ added in v0.2.0
func (r *LocalWalkReader) Close() error
Close closes the underlying `os.File` instance for 'r'.
func (*LocalWalkReader) Read ¶ added in v0.2.0
func (r *LocalWalkReader) Read(p []byte) (int, error)
Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.
type NDJSONWalker ¶
type NDJSONWalker struct {
	Walker
	// contains filtered or unexported fields
}
type NDJSONWalker implements the `Walker` interface for NDJSON files.
func (*NDJSONWalker) WalkFile ¶
func (w *NDJSONWalker) WalkFile(ctx context.Context, cb WalkCallbackFunction, uri string) error
WalkFile() processes 'uri', dispatching each record to 'cb'.
func (*NDJSONWalker) WalkReader ¶
func (w *NDJSONWalker) WalkReader(ctx context.Context, cb WalkCallbackFunction, r io.Reader) error
WalkReader() processes each record in 'r' (which is expected to be a line-separated JSON document) and dispatches each record to 'cb'.
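Reading a line-separated JSON document and handing each record to a callback can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `recordFunc`, `walkNDJSON` and `countRecords` are hypothetical names, the real `WalkCallbackFunction` signature is not shown in this documentation, and the sketch is sequential whereas the package describes configurable workers.

```go
package main

import (
	"bufio"
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// recordFunc is a hypothetical callback type; the real WalkCallbackFunction
// signature is not shown in this documentation.
type recordFunc func(ctx context.Context, rec map[string]any) error

// walkNDJSON decodes one JSON object per line of r and passes it to cb.
// Blank lines are skipped and the first error stops the walk.
func walkNDJSON(ctx context.Context, r io.Reader, cb recordFunc) error {
	sc := bufio.NewScanner(r)
	// Allow lines up to 1 MiB; the Scanner default limit is 64 KiB.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	line := 0
	for sc.Scan() {
		line++
		if err := ctx.Err(); err != nil {
			return err
		}
		b := bytes.TrimSpace(sc.Bytes())
		if len(b) == 0 {
			continue
		}
		var rec map[string]any
		if err := json.Unmarshal(b, &rec); err != nil {
			return fmt.Errorf("line %d: %w", line, err)
		}
		if err := cb(ctx, rec); err != nil {
			return err
		}
	}
	return sc.Err()
}

// countRecords walks an NDJSON string and counts the records it contains.
func countRecords(s string) (int, error) {
	n := 0
	err := walkNDJSON(context.Background(), strings.NewReader(s),
		func(ctx context.Context, rec map[string]any) error {
			n++
			return nil
		})
	return n, err
}

func main() {
	n, err := countRecords("{\"id\":\"a\"}\n\n{\"id\":\"b\"}\n")
	fmt.Println(n, err)
}
```

Checking `ctx.Err()` on every line lets a cancelled context stop a long walk promptly.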
func (*NDJSONWalker) WalkURIs ¶
func (w *NDJSONWalker) WalkURIs(ctx context.Context, cb WalkCallbackFunction, uris ...string) error
WalkURIs() processes 'uris', dispatching each record to 'cb'. 'uris' is expected to be a list of compressed ('.zip') or uncompressed files on disk.
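One plausible way to separate compressed from uncompressed inputs is to look at the file extension. The sketch below assumes that approach; whether the package inspects extensions, file contents, or something else is not stated here, and `isZip` and `partition` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isZip reports whether uri looks like a zip archive, judged by its
// extension only (case-insensitive). This is an assumption for illustration.
func isZip(uri string) bool {
	return strings.EqualFold(filepath.Ext(uri), ".zip")
}

// partition splits uris into zip archives and everything else,
// preserving the input order within each group.
func partition(uris ...string) (zips, plain []string) {
	for _, u := range uris {
		if isZip(u) {
			zips = append(zips, u)
		} else {
			plain = append(plain, u)
		}
	}
	return zips, plain
}

func main() {
	zips, plain := partition("names.ndjson", "subjects.zip", "works.ZIP")
	fmt.Println(zips, plain)
}
```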
func (*NDJSONWalker) WalkZipFile ¶
func (w *NDJSONWalker) WalkZipFile(ctx context.Context, cb WalkCallbackFunction, uri string) error
WalkZipFile() decompresses 'uri' and processes each file (contained in the zip archive) dispatching each record to 'cb'.
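Iterating over the files in a zip archive can be done with the standard `archive/zip` package. The sketch below works on an in-memory archive so that it is self-contained; `walkZip`, `buildZip` and `totalBytes` are hypothetical helpers, and the package's own handling of archives may differ.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// walkZip opens each regular file in the archive held in data and passes
// its name and a reader over its uncompressed contents to fn.
func walkZip(data []byte, fn func(name string, r io.Reader) error) error {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return err
	}
	for _, f := range zr.File {
		if f.FileInfo().IsDir() {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return err
		}
		err = fn(f.Name, rc)
		rc.Close()
		if err != nil {
			return err
		}
	}
	return nil
}

// buildZip creates an in-memory archive for demonstration purposes.
func buildZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, body := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, body); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// totalBytes sums the uncompressed size of every file in the archive.
func totalBytes(data []byte) (int64, error) {
	var total int64
	err := walkZip(data, func(name string, r io.Reader) error {
		n, err := io.Copy(io.Discard, r)
		total += n
		return err
	})
	return total, err
}

func main() {
	data, err := buildZip(map[string]string{
		"a.ndjson": "{}\n",
		"b.ndjson": "{}\n{}\n",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(totalBytes(data))
}
```

In a real walker, the reader passed to `fn` would be handed to a line-oriented record reader rather than discarded.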
type RemoteWalkReader ¶ added in v0.2.0
type RemoteWalkReader struct {
	WalkReader
	// contains filtered or unexported fields
}
type RemoteWalkReader implements the `WalkReader` interface for files on a remote web server.
func (*RemoteWalkReader) Close ¶ added in v0.2.0
func (r *RemoteWalkReader) Close() error
Close is a no-op.
func (*RemoteWalkReader) Read ¶ added in v0.2.0
func (r *RemoteWalkReader) Read(p []byte) (int, error)
Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.
type WalkCallbackFunction ¶
type WalkCallbackFunction defines a user-specified callback function for processing a LoC data file.
type WalkReader ¶ added in v0.2.0
type WalkReader interface {
	// Read reads up to len(p) bytes into p. It returns the number of bytes
	// read (0 <= n <= len(p)) and any error encountered. Even if Read returns
	// n < len(p), it may use all of p as scratch space during the call. If
	// some data is available but not len(p) bytes, Read conventionally
	// returns what is available instead of waiting for more.
	Read(p []byte) (int, error)
	// ReadAt reads len(buf) bytes into buf starting at offset off.
	ReadAt([]byte, int64) (int, error)
	// Close closes any underlying file handles. It is implementation specific.
	Close() error
}
WalkReader is an interface which combines the `io.Reader`, `io.ReaderAt` and `io.Closer` interfaces for reading Library of Congress data files. This provides a common interface for reading local and remote data files regardless of whether or not they are compressed.
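The same method set can be expressed by embedding the three standard library interfaces. The sketch below uses hypothetical names (`readerAtCloser`, `memReader`, `newMemReader`, `tail`) and an in-memory source to show a type satisfying all three, with a no-op `Close`. Random access through `ReadAt` is what `zip.NewReader` in `archive/zip` requires, which is presumably why it is part of the interface.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readerAtCloser mirrors the method set described above by embedding the
// three standard interfaces. The name is hypothetical.
type readerAtCloser interface {
	io.Reader
	io.ReaderAt
	io.Closer
}

// memReader adapts an in-memory byte slice; Close is a no-op.
type memReader struct {
	*bytes.Reader
}

func (m memReader) Close() error { return nil }

// Compile-time check that memReader satisfies the combined interface.
var _ readerAtCloser = memReader{}

func newMemReader(b []byte) readerAtCloser {
	return memReader{bytes.NewReader(b)}
}

// tail returns the last n bytes of a source of the given size using ReadAt,
// which does not move the offset used by Read.
func tail(r readerAtCloser, size int64, n int) (string, error) {
	buf := make([]byte, n)
	if _, err := r.ReadAt(buf, size-int64(n)); err != nil && err != io.EOF {
		return "", err
	}
	return string(buf), nil
}

func main() {
	data := []byte("hello world")
	r := newMemReader(data)
	defer r.Close()
	s, err := tail(r, int64(len(data)), 5)
	fmt.Println(s, err)
}
```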
type Walker ¶
type Walker interface {
	// WalkURIs iterates (walks) LoC data files from one or more URIs.
	WalkURIs(context.Context, WalkCallbackFunction, ...string) error
	// WalkFile iterates (walks) a LoC data file on disk.
	WalkFile(context.Context, WalkCallbackFunction, string) error
	// WalkZipFile iterates (walks) a LoC zip-compressed data file on disk.
	WalkZipFile(context.Context, WalkCallbackFunction, string) error
	// WalkReader iterates (walks) LoC data from an `io.Reader` instance.
	WalkReader(context.Context, WalkCallbackFunction, io.Reader) error
}
type Walker defines an interface for iterating (walking) LoC data files from a variety of sources.
func NewNDJSONWalker ¶
NewNDJSONWalker creates a new instance that implements the `Walker` interface for NDJSON files configured by 'uri' which is expected to take the form of:
ndjson://?{PARAMETERS}
Where {PARAMETERS} may be:
* `?workers=` The maximum number of simultaneous workers for processing NDJSON records. Default is 100.
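Reading the `workers` parameter from a URI of this form can be sketched with `net/url` and `strconv`. The helper `parseWorkers` is hypothetical, and treating non-numeric or non-positive values as errors is an assumption; only the `ndjson` scheme, the parameter name and the default of 100 come from the description above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// defaultWorkers is the default stated in the documentation above.
const defaultWorkers = 100

// parseWorkers extracts the `workers` query parameter from a walker URI,
// falling back to the default when the parameter is absent.
func parseWorkers(uri string) (int, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return 0, err
	}
	if u.Scheme != "ndjson" {
		return 0, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	raw := u.Query().Get("workers")
	if raw == "" {
		return defaultWorkers, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid workers value %q: %w", raw, err)
	}
	if n < 1 {
		// Rejecting non-positive values is an assumption of this sketch.
		return 0, fmt.Errorf("workers must be positive, got %d", n)
	}
	return n, nil
}

func main() {
	fmt.Println(parseWorkers("ndjson://?workers=10"))
	fmt.Println(parseWorkers("ndjson://"))
}
```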