snippets

package
v0.0.40 Latest
Warning

This package is not in the latest version of its module.

Published: May 28, 2021. License: MIT. Imports: 15. Imported by: 0.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Extensions

type Extensions []string

Extensions is a list of file extensions. It is used to find and tokenize the snippet files in directories.

func (Extensions) ReadLines added in v0.0.9

func (e Extensions) ReadLines(dirs ...string) apoco.StreamFunc

ReadLines returns a stream function that reads the snippet files in the given directories (identified by the file extensions) and returns a stream of line tokens. The directories are read in parallel by GOMAXPROCS goroutines.

If an extension ends with `.txt`, one line is read from the text file (no confidences); if it ends with `.json`, Calamari's extended data format is assumed. Otherwise the file is read as a TSV file that is expected to contain a char (or a sequence thereof) and its confidence on each line.
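The extension rules above amount to a three-way dispatch. The sketch below illustrates that dispatch and a possible parser for a single TSV line. It is an illustration under assumptions, not the package's code: `formatFor`, `parseTSVLine` and `charConf` are hypothetical names, and the column layout (characters first, confidence last, separated by a tab) is assumed from the description rather than confirmed by it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// format names the three input formats described above.
type format int

const (
	formatTSV  format = iota // default: char(s) and confidence per line
	formatText               // .txt: one plain line, no confidences
	formatJSON               // .json: Calamari's extended data format
)

// formatFor picks the input format from a file extension.
// Hypothetical helper, not part of the snippets package.
func formatFor(ext string) format {
	switch {
	case strings.HasSuffix(ext, ".txt"):
		return formatText
	case strings.HasSuffix(ext, ".json"):
		return formatJSON
	default:
		return formatTSV
	}
}

// charConf holds one TSV record: a char (or a sequence of chars)
// and its confidence.
type charConf struct {
	chars string
	conf  float64
}

// parseTSVLine splits a line at its last tab. The assumed layout is
// "<chars>\t<confidence>"; the real files may differ.
func parseTSVLine(line string) (charConf, error) {
	i := strings.LastIndexByte(line, '\t')
	if i < 0 {
		return charConf{}, fmt.Errorf("missing tab in %q", line)
	}
	conf, err := strconv.ParseFloat(strings.TrimSpace(line[i+1:]), 64)
	if err != nil {
		return charConf{}, fmt.Errorf("bad confidence in %q: %w", line, err)
	}
	return charConf{chars: line[:i], conf: conf}, nil
}

func main() {
	for _, ext := range []string{".gt.txt", ".json", ".llocs"} {
		fmt.Println(ext, formatFor(ext))
	}
	cc, err := parseTSVLine("a\t0.93")
	fmt.Println(cc.chars, cc.conf, err)
}
```

The extensions `.gt.txt` and `.llocs` in `main` are sample inputs chosen for the sketch; the documentation only names the `.txt` and `.json` suffixes.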

func (Extensions) Tokenize

func (e Extensions) Tokenize(ctx context.Context, dirs ...string) apoco.StreamFunc

Tokenize is a helper function that combines ReadLines and TokenizeLines into one function. It is the same as calling `apoco.Pipe(ReadLines, TokenizeLines,...)`.
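To illustrate the idea of combining a reading stage and a tokenizing stage into one function, here is a minimal, self-contained sketch of a channel pipeline. It does not use the apoco API: `stage` is a simplified, hypothetical stand-in for `apoco.StreamFunc`, `pipe` stands in for `apoco.Pipe`, and whitespace splitting replaces the real tokenization and alignment.

```go
package main

import (
	"fmt"
	"strings"
)

// stage is a simplified, hypothetical stand-in for apoco.StreamFunc:
// it reads from in and writes to out.
type stage func(in <-chan string, out chan<- string)

// pipe runs each stage in its own goroutine, feeds the output of one
// stage into the next, and returns the output of the last stage.
func pipe(stages ...stage) <-chan string {
	if len(stages) == 0 {
		empty := make(chan string)
		close(empty)
		return empty
	}
	var in <-chan string // nil for the first stage, which only produces
	for _, s := range stages {
		out := make(chan string)
		go func(s stage, in <-chan string, out chan<- string) {
			defer close(out)
			s(in, out)
		}(s, in, out)
		in = out
	}
	return in
}

// readLines emits the given lines; it stands in for reading snippet files.
func readLines(lines ...string) stage {
	return func(_ <-chan string, out chan<- string) {
		for _, l := range lines {
			out <- l
		}
	}
}

// tokenizeLines splits each incoming line at whitespace.
func tokenizeLines() stage {
	return func(in <-chan string, out chan<- string) {
		for l := range in {
			for _, tok := range strings.Fields(l) {
				out <- tok
			}
		}
	}
}

// tokenize combines both stages into one call, mirroring how Tokenize
// is described as a combination of ReadLines and TokenizeLines.
func tokenize(lines ...string) <-chan string {
	return pipe(readLines(lines...), tokenizeLines())
}

func main() {
	for tok := range tokenize("ein Beispiel", "zwei Zeilen") {
		fmt.Println(tok)
	}
}
```

The sketch omits the context and error handling that a real stream function would need for cancellation.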

func (Extensions) TokenizeLines added in v0.0.9

func (e Extensions) TokenizeLines() apoco.StreamFunc

TokenizeLines returns a stream function that tokenizes and aligns line tokens.
