Documentation
Overview
Package textio contains transforms for reading and writing text files.
Index
- func Immediate(s beam.Scope, filename string) (beam.PCollection, error)
- func Read(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection
- func ReadAll(s beam.Scope, col beam.PCollection, opts ...ReadOptionFn) beam.PCollection
- func ReadAllSdf(s beam.Scope, col beam.PCollection) beam.PCollection (deprecated)
- func ReadSdf(s beam.Scope, glob string) beam.PCollection (deprecated)
- func ReadWithFilename(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection
- func Write(s beam.Scope, filename string, col beam.PCollection)
- type ReadOptionFn
Constants
This section is empty.
Variables
This section is empty.
Functions
func Immediate
func Immediate(s beam.Scope, filename string) (beam.PCollection, error)
Immediate reads a local file at pipeline construction time and embeds the data into an I/O-free pipeline source. It should be used for small files only.
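The idea of reading a file eagerly, so that nothing downstream needs I/O, can be sketched with the standard library alone. This is only an illustration under stated assumptions: loadLines and writeTemp are hypothetical helpers, not part of textio, and the sketch does not build a Beam pipeline.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// loadLines is a hypothetical helper, not part of textio. It reads the whole
// file up front and returns its lines without trailing newlines, so later
// steps can work purely from memory.
func loadLines(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var lines []string
	sc := bufio.NewScanner(f) // the default buffer limits one line to 64 KiB
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

// writeTemp is a hypothetical helper that creates a temporary file holding
// content and returns its path.
func writeTemp(content string) (string, error) {
	f, err := os.CreateTemp("", "textio-sketch-*.txt")
	if err != nil {
		return "", err
	}
	defer f.Close()
	_, err = f.WriteString(content)
	return f.Name(), err
}

func main() {
	path, err := writeTemp("alpha\nbeta\n")
	if err != nil {
		panic(err)
	}
	defer os.Remove(path)

	lines, err := loadLines(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(lines), lines) // prints: 2 [alpha beta]
}
```

Because the entire file is held in memory, an approach like this only suits small inputs, which matches the guidance given for Immediate.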
func Read
func Read(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection
Read reads a set of files indicated by the glob pattern and returns the lines as a PCollection<string>. The newlines are not part of the lines. Read accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the file. By default, the compression type is determined by the file extension.
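Extension-based detection can be sketched as follows. This is a standard-library illustration, not the package's implementation: the compression constants, resolve, readLines, and gzipBytes are hypothetical names, and treating only the ".gz" suffix as gzip is an assumption.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// compression is a hypothetical enum for this sketch.
type compression int

const (
	compressionAuto compression = iota
	compressionGzip
	compressionUncompressed
)

// resolve turns compressionAuto into a concrete choice based on the file name.
// Assumption: only the ".gz" suffix is treated as gzip.
func resolve(c compression, filename string) compression {
	if c != compressionAuto {
		return c
	}
	if strings.HasSuffix(filename, ".gz") {
		return compressionGzip
	}
	return compressionUncompressed
}

// readLines decodes data according to c and returns the lines without newlines.
func readLines(data []byte, filename string, c compression) ([]string, error) {
	var r io.Reader = bytes.NewReader(data)
	if resolve(c, filename) == compressionGzip {
		zr, err := gzip.NewReader(r)
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		r = zr
	}
	var lines []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

// gzipBytes compresses text in memory so the example needs no files.
func gzipBytes(text string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(text)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := gzipBytes("one\ntwo\n")
	if err != nil {
		panic(err)
	}
	lines, err := readLines(data, "input.txt.gz", compressionAuto)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines) // prints: [one two]
}
```

An explicit choice always takes precedence over the file name in this sketch, which mirrors the role of options such as ReadGzip and ReadUncompressed.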
func ReadAll
func ReadAll(s beam.Scope, col beam.PCollection, opts ...ReadOptionFn) beam.PCollection
ReadAll expands and reads the filenames given as globs by the incoming PCollection<string>. It returns the lines of all files as a single PCollection<string>. The newlines are not part of the lines. ReadAll accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the files. By default, the compression type is determined by the file extension.
func ReadAllSdf (deprecated)
func ReadAllSdf(s beam.Scope, col beam.PCollection) beam.PCollection
ReadAllSdf is a variation of ReadAll implemented via SplittableDoFn. This should result in increased performance with runners that support splitting.
Deprecated: Use ReadAll instead, which has been migrated to use this SDF implementation.
func ReadSdf (deprecated)
func ReadSdf(s beam.Scope, glob string) beam.PCollection
ReadSdf is a variation of Read implemented via SplittableDoFn. This should result in increased performance with runners that support splitting.
Deprecated: Use Read instead, which has been migrated to use this SDF implementation.
func ReadWithFilename
func ReadWithFilename(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection
ReadWithFilename reads a set of files indicated by the glob pattern and returns a PCollection<KV<string, string>> of each filename and line. The newlines are not part of the lines. ReadWithFilename accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the files. By default, the compression type is determined by the file extension.
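The shape of the output, one (filename, line) pair per line, can be illustrated without Beam. In this sketch, fileLine and linesWithFilename are hypothetical names standing in for a KV<string, string> element and the pairing step, and the file content is passed in as a string rather than read from disk.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fileLine is a hypothetical stand-in for one KV<string, string> element.
type fileLine struct {
	Filename string
	Line     string
}

// linesWithFilename pairs every line of content with the name of its file.
// Newlines are dropped from the lines.
func linesWithFilename(filename, content string) ([]fileLine, error) {
	var out []fileLine
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		out = append(out, fileLine{Filename: filename, Line: sc.Text()})
	}
	return out, sc.Err()
}

func main() {
	pairs, err := linesWithFilename("a.txt", "x\ny\n")
	if err != nil {
		panic(err)
	}
	for _, kv := range pairs {
		fmt.Printf("%s: %s\n", kv.Filename, kv.Line)
	}
}
```

Keeping the filename next to each line is useful when a glob matches many files and downstream steps need to know where a record came from.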
Types
type ReadOptionFn
type ReadOptionFn func(*readOption)
ReadOptionFn is a function that can be passed to Read, ReadAll, or ReadWithFilename to configure options for reading files.
func ReadAutoCompression
func ReadAutoCompression() ReadOptionFn
ReadAutoCompression specifies that the compression type of files should be auto-detected.
func ReadGzip
func ReadGzip() ReadOptionFn
ReadGzip specifies that files have been compressed using gzip.
func ReadUncompressed
func ReadUncompressed() ReadOptionFn
ReadUncompressed specifies that files have not been compressed.
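The signature func(*readOption) follows Go's functional options pattern. The sketch below re-declares the names locally to show how such options might be applied. The fields of readOption, the compression constants, and the applyOptions helper are assumptions made for illustration, since the real struct is unexported.

```go
package main

import "fmt"

// compression is a hypothetical enum for this sketch.
type compression int

const (
	compressionAuto compression = iota
	compressionGzip
	compressionUncompressed
)

// readOption is a local stand-in for the unexported options struct.
// Its single field is an assumption.
type readOption struct {
	compression compression
}

// ReadOptionFn mutates a readOption, matching the documented signature.
type ReadOptionFn func(*readOption)

// ReadAutoCompression selects auto-detection.
func ReadAutoCompression() ReadOptionFn {
	return func(o *readOption) { o.compression = compressionAuto }
}

// ReadGzip selects gzip.
func ReadGzip() ReadOptionFn {
	return func(o *readOption) { o.compression = compressionGzip }
}

// ReadUncompressed selects no compression.
func ReadUncompressed() ReadOptionFn {
	return func(o *readOption) { o.compression = compressionUncompressed }
}

// applyOptions is a hypothetical helper: it starts from the default
// (auto-detection) and applies each option in order, so the last one wins.
func applyOptions(opts ...ReadOptionFn) readOption {
	o := readOption{compression: compressionAuto}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := applyOptions(ReadGzip())
	fmt.Println(o.compression == compressionGzip) // prints: true
}
```

With this pattern, callers that pass no options get the default behavior, and new options can be added later without changing the function signatures that accept them.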