Documentation ¶
Overview ¶
Package for working with the plain, full text of corpus documents
Functions for retrieving text matches in parallel from plain text files that are either on the local file system or in a remote object store
Index ¶
Constants ¶
const (
SNIPPET_LEN = 200
)
Variables ¶
This section is empty.
Functions ¶
Types ¶
type DocMatch ¶
type DocMatch struct {
	PlainTextFile string
	MT            MatchingText
}
Details of best matching text for the query terms
type GCSLoader ¶
type GCSLoader struct {
// contains filtered or unexported fields
}
Implements the TextLoader interface; loads the text from Google Cloud Storage. Params:
Bucket - The base URL for the location of the plain text files
func NewGCSLoader ¶
Creates and initializes a new GCSLoader object
func (GCSLoader) GetMatching ¶
func (loader GCSLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
Gets the matching text from a file in the Google Cloud Storage bucket and finds the best match
type Job ¶
type Job struct {
// contains filtered or unexported fields
}
func (Job) Do ¶
func (job Job) Do(loader TextLoader, queryTerms []string)
A long-running operation that needs to be done in parallel
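Because Job's fields are unexported and Do returns nothing, this page does not show how jobs are scheduled or how results are collected. The sketch below shows one common Go pattern for this kind of fan-out: one goroutine per document, a buffered channel for results, and a sync.WaitGroup to wait for completion. The `result` type and `findAll` function are hypothetical stand-ins, not this package's API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// result is a hypothetical stand-in for DocMatch.
type result struct {
	File  string
	Match string
}

// findAll is a hypothetical helper. It searches every document in its own
// goroutine and reports the first query term found in each one.
func findAll(docs map[string]string, queryTerms []string) []result {
	// Buffered so that no goroutine blocks on send: each sends at most once.
	results := make(chan result, len(docs))
	var wg sync.WaitGroup
	for file, text := range docs {
		wg.Add(1)
		go func(file, text string) {
			defer wg.Done()
			for _, term := range queryTerms {
				if strings.Contains(text, term) {
					results <- result{File: file, Match: term}
					return
				}
			}
		}(file, text)
	}
	wg.Wait()
	close(results)

	var out []result
	for r := range results {
		out = append(out, r)
	}
	// Goroutines finish in arbitrary order, so sort for a stable result.
	sort.Slice(out, func(i, j int) bool { return out[i].File < out[j].File })
	return out
}

func main() {
	docs := map[string]string{"a.txt": "red fox", "b.txt": "blue sky"}
	for _, r := range findAll(docs, []string{"fox", "sky"}) {
		fmt.Println(r.File, r.Match)
	}
}
```

For a large corpus, a bounded worker pool would likely be preferable to one goroutine per document, since each job may open a file or make a network request.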
type LocalTextLoader ¶
type LocalTextLoader struct {
// contains filtered or unexported fields
}
Implements the TextLoader interface; loads the text from a local file mounted on the application server. Params:
corpusDir - The top level directory for the plain text files
func (LocalTextLoader) GetMatching ¶
func (loader LocalTextLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
Gets the matching text from a local file and finds the best match
type MatchingText ¶
Details of best matching text for the query terms
type TextLoader ¶
type TextLoader interface {
	// Get the document text
	// param:
	//   plainTextFile - file containing plain text of the document
	//   queryTerms - an array of query terms
	GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
}
Interface for plain text retrieval
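Since TextLoader has a single method, a third implementation (for example, an in-memory loader for tests) only needs to provide GetMatching. The sketch below is self-contained, so it redeclares the interface and uses a hypothetical MatchingText with made-up fields; the real MatchingText fields are not shown on this page. The `mapLoader` type and its first-term-found strategy are also assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// MatchingText is a hypothetical stand-in; the real type's fields are
// not listed in this documentation.
type MatchingText struct {
	Snippet string
	Match   string
}

// TextLoader mirrors the interface documented above.
type TextLoader interface {
	GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
}

// mapLoader is a hypothetical in-memory loader keyed by file name.
type mapLoader struct {
	docs map[string]string
}

// GetMatching returns the first query term that occurs in the document.
// This is a naive strategy, not the package's "best match" logic.
func (l mapLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error) {
	text, ok := l.docs[plainTextFile]
	if !ok {
		return MatchingText{}, errors.New("no such document: " + plainTextFile)
	}
	for _, term := range queryTerms {
		if strings.Contains(text, term) {
			return MatchingText{Snippet: text, Match: term}, nil
		}
	}
	// No term found: return the zero value rather than an error.
	return MatchingText{}, nil
}

func main() {
	var loader TextLoader = mapLoader{docs: map[string]string{"a.txt": "hello world"}}
	mt, err := loader.GetMatching("a.txt", []string{"world"})
	fmt.Println(mt.Match, err)
}
```

Code that accepts a TextLoader, such as Job.Do, could then be exercised without touching the file system or Google Cloud Storage.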