fulltext

package
v0.0.172
Warning

This package is not in the latest version of its module.

Published: Jul 4, 2023 | License: Apache-2.0 | Imports: 7 | Imported by: 0

Documentation

Overview

Package for working with the plain, full text of corpus documents

Functions for retrieving text matches in parallel from plain text files that are either on the local file system or in a remote object store.

Constants

const (
	SNIPPET_LEN = 200
)
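The constant suggests that snippets are capped at 200 characters. How the package trims a snippet is not documented here, so the following is only a minimal sketch under that assumption. The helper name snippetAround and the centering strategy are hypothetical, and snippetLen is a local stand-in for SNIPPET_LEN.

```go
package main

import "strings"

// snippetLen mirrors the documented SNIPPET_LEN constant.
const snippetLen = 200

// snippetAround is a hypothetical helper, not part of the package. It returns
// at most maxLen runes of text, roughly centered on the first occurrence of
// match. If match is absent, it returns the leading maxLen runes.
func snippetAround(text, match string, maxLen int) string {
	runes := []rune(text)
	if len(runes) <= maxLen {
		return text
	}
	start := 0
	if i := strings.Index(text, match); i >= 0 {
		// Convert the byte offset of the match to a rune offset.
		matchStart := len([]rune(text[:i]))
		matchLen := len([]rune(match))
		start = matchStart - (maxLen-matchLen)/2
		if start < 0 {
			start = 0
		}
	}
	// Keep the window inside the text.
	if start+maxLen > len(runes) {
		start = len(runes) - maxLen
	}
	return string(runes[start : start+maxLen])
}
```

Working on runes rather than bytes avoids cutting a multi-byte character in half at either end of the snippet.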

Functions

func GetMatches

func GetMatches(keys []string, queryTerms []string) map[string]DocMatch

Types

type DocMatch

type DocMatch struct {
	PlainTextFile string
	MT            MatchingText
}

Pairs a plain text file with the details of its best matching text for the query terms

type GCSLoader

type GCSLoader struct {
	// contains filtered or unexported fields
}

Implements the TextLoader interface; loads the text from Google Cloud Storage. Params:

bucket - The base URL for the location of the plain text files

func NewGCSLoader

func NewGCSLoader(bucket string) (GCSLoader, error)

Creates and initializes a new GCSLoader

func (GCSLoader) GetMatching

func (loader GCSLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)

Gets the matching text from a file in the Google Cloud Storage bucket and finds the best match

type Job

type Job struct {
	// contains filtered or unexported fields
}

func (Job) Do

func (job Job) Do(loader TextLoader, queryTerms []string)

Do is a long-running operation, so jobs need to be run in parallel
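Since the fields of Job and Result are unexported, the exact fan-out used by the package is not visible from this page. The following is a sketch of one common way to run such long operations in parallel with a fixed pool of goroutines. The names job, result, loaderFunc, and matchAll are hypothetical stand-ins, and MatchingText and TextLoader are local copies of the documented declarations so that the sketch compiles on its own.

```go
package main

import "sync"

// Local copies of the documented types.
type MatchingText struct {
	Snippet, LongestMatch string
	ExactMatch            bool
}

type TextLoader interface {
	GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
}

// loaderFunc is a hypothetical adapter that lets a plain function act as a
// TextLoader.
type loaderFunc func(plainTextFile string, queryTerms []string) (MatchingText, error)

func (f loaderFunc) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error) {
	return f(plainTextFile, queryTerms)
}

// job and result are hypothetical stand-ins for Job and Result.
type job struct {
	key, plainTextFile string
}

type result struct {
	key string
	mt  MatchingText
	err error
}

// matchAll runs GetMatching for every job on a fixed number of workers and
// collects the successful results into a map keyed by job key. Jobs that
// return an error are dropped from the map.
func matchAll(loader TextLoader, jobs []job, queryTerms []string, workers int) map[string]MatchingText {
	if workers < 1 {
		workers = 1
	}
	jobCh := make(chan job)
	resCh := make(chan result)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				mt, err := loader.GetMatching(j.plainTextFile, queryTerms)
				resCh <- result{key: j.key, mt: mt, err: err}
			}
		}()
	}
	// Feed the jobs, then close the results channel once all workers finish.
	go func() {
		for _, j := range jobs {
			jobCh <- j
		}
		close(jobCh)
		wg.Wait()
		close(resCh)
	}()
	out := make(map[string]MatchingText, len(jobs))
	for r := range resCh {
		if r.err == nil {
			out[r.key] = r.mt
		}
	}
	return out
}
```

A bounded pool keeps the number of concurrent file or network reads predictable, which matters when each read goes to a remote object store.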

type LocalTextLoader

type LocalTextLoader struct {
	// contains filtered or unexported fields
}

Implements the TextLoader interface; loads the text from a local file mounted on the application server. Params:

corpusDir - The top level directory for the plain text files

func (LocalTextLoader) GetMatching

func (loader LocalTextLoader) GetMatching(plainTextFile string,
	queryTerms []string) (MatchingText, error)

Gets the matching text from a local file and finds the best match
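To illustrate the idea of a loader that resolves file names against a corpus directory, here is a rough sketch. It is not the package's code: localLoader and longestTerm are hypothetical names, the matching rule (keep the longest single query term found in the file) is a simplifying assumption, and MatchingText is a local copy of the documented struct.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Local copy of the documented struct.
type MatchingText struct {
	Snippet, LongestMatch string
	ExactMatch            bool
}

// localLoader is a hypothetical stand-in for LocalTextLoader, whose fields
// are unexported.
type localLoader struct {
	corpusDir string
}

// longestTerm returns the longest query term that occurs in text, or "" if
// none does. This matching rule is an assumption made for the sketch.
func longestTerm(text string, queryTerms []string) string {
	longest := ""
	for _, term := range queryTerms {
		if len(term) > len(longest) && strings.Contains(text, term) {
			longest = term
		}
	}
	return longest
}

// GetMatching reads corpusDir/plainTextFile and reports the longest term found.
func (l localLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error) {
	path := filepath.Join(l.corpusDir, plainTextFile)
	data, err := os.ReadFile(path)
	if err != nil {
		return MatchingText{}, fmt.Errorf("reading %s: %w", path, err)
	}
	return MatchingText{LongestMatch: longestTerm(string(data), queryTerms)}, nil
}
```

Wrapping the read error with %w keeps the underlying cause available to callers that want to check for a missing file with errors.Is.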

type MatchingText

type MatchingText struct {
	Snippet, LongestMatch string
	ExactMatch            bool
}

Details of best matching text for the query terms
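The page does not say how LongestMatch and ExactMatch are computed. One plausible reading, sketched below purely as an assumption, is that the query is the sequence of terms joined by a separator, that ExactMatch means the whole joined query occurs in the text, and that LongestMatch is the longest contiguous run of terms that occurs. The helper bestMatch and its sep parameter are hypothetical, and MatchingText is a local copy of the documented struct.

```go
package main

import "strings"

// Local copy of the documented struct.
type MatchingText struct {
	Snippet, LongestMatch string
	ExactMatch            bool
}

// bestMatch is a hypothetical helper, not the package's implementation.
// It looks for the longest contiguous run of query terms, joined by sep,
// that occurs in text. Snippet is left empty; trimming is a separate concern.
func bestMatch(text string, queryTerms []string, sep string) MatchingText {
	var mt MatchingText
	n := len(queryTerms)
	for i := 0; i < n; i++ {
		// Try the longest run starting at term i first.
		for j := n; j > i; j-- {
			candidate := strings.Join(queryTerms[i:j], sep)
			if candidate == "" || len(candidate) <= len(mt.LongestMatch) {
				continue
			}
			if strings.Contains(text, candidate) {
				mt.LongestMatch = candidate
				break // shorter runs starting at i cannot be longer
			}
		}
	}
	mt.ExactMatch = mt.LongestMatch != "" &&
		mt.LongestMatch == strings.Join(queryTerms, sep)
	return mt
}
```

For text with no word delimiters, sep would be the empty string; for space-delimited text, a single space.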

type Result

type Result struct {
	// contains filtered or unexported fields
}

type TextLoader

type TextLoader interface {

	// Get the document text
	// Params:
	//   plainTextFile - file containing plain text of the document
	//   queryTerms - an array of query terms
	GetMatching(plainTextFile string,
		queryTerms []string) (MatchingText, error)
}

Interface for plain text retrieval
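Because the interface has a single method, callers can be exercised without a file system or object store by substituting a fake. The sketch below shows one way to do that. The fakeLoader type is hypothetical, and MatchingText and TextLoader are local copies of the documented declarations so that the example compiles on its own.

```go
package main

import "fmt"

// Local copies of the documented types.
type MatchingText struct {
	Snippet, LongestMatch string
	ExactMatch            bool
}

type TextLoader interface {
	GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error)
}

// fakeLoader is a hypothetical in-memory TextLoader that returns a canned
// MatchingText per file name and ignores the query terms.
type fakeLoader map[string]MatchingText

func (f fakeLoader) GetMatching(plainTextFile string, queryTerms []string) (MatchingText, error) {
	mt, ok := f[plainTextFile]
	if !ok {
		return MatchingText{}, fmt.Errorf("no canned match for %q", plainTextFile)
	}
	return mt, nil
}

// Compile-time check that fakeLoader satisfies TextLoader.
var _ TextLoader = fakeLoader{}
```

The blank-identifier assignment makes the compiler report a mismatch if the method set ever drifts from the interface.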
