Documentation ¶
Overview ¶
Package for scanning the corpus collections
Index ¶
- func GetOutfileMap(loader CorpusLoader) (*map[string]CorpusEntry, error)
- func IsExcluded(excluded map[string]bool, text string) bool
- func LoadExcluded(file io.Reader) (*map[string]bool, error)
- func ReadIntroFile(r io.Reader) string
- func ReadText(r io.Reader) string
- type CollectionEntry
- type CorpusConfig
- type CorpusEntry
- type CorpusLoader
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetOutfileMap ¶
func GetOutfileMap(loader CorpusLoader) (*map[string]CorpusEntry, error)
GetOutfileMap gets a map of corpus entries keyed by output (HTML) file name.
Param:
loader: the CorpusLoader that supplies the corpus entries
Returns
A map whose keys are the output file names
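The keying described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `corpusEntry` and `outfileMap` are hypothetical local names, and the sketch assumes that `GlossFile` holds the output (HTML) file name.

```go
package main

import "fmt"

// corpusEntry mirrors the documented CorpusEntry fields for this sketch only.
type corpusEntry struct {
	RawFile, GlossFile, Title, ColTitle, ColFile string
}

// outfileMap indexes entries by GlossFile, which is assumed here to be the
// output (HTML) file name. A later entry with the same key replaces an
// earlier one.
func outfileMap(entries []corpusEntry) map[string]corpusEntry {
	m := make(map[string]corpusEntry, len(entries))
	for _, e := range entries {
		m[e.GlossFile] = e
	}
	return m
}

func main() {
	entries := []corpusEntry{
		{RawFile: "a.txt", GlossFile: "a.html", Title: "Text A"},
		{RawFile: "b.txt", GlossFile: "b.html", Title: "Text B"},
	}
	m := outfileMap(entries)
	fmt.Println(m["b.html"].RawFile) // prints "b.txt"
}
```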
func IsExcluded ¶
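func IsExcluded(excluded map[string]bool, text string) bool
Tests whether the string should be excluded from corpus analysis.
Parameters:
excluded: the set of excluded strings
text: the string to be tested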
func ReadIntroFile ¶
func ReadIntroFile(r io.Reader) string
Reads a text file introducing the collection. The file should be a plain text file. HTML breaks will be added for line breaks.
Parameter r: a reader over the text introducing the collection
Types ¶
type CollectionEntry ¶
type CollectionEntry struct {
CollectionFile, GlossFile, Title, Summary, Intro, DateUpdated, Corpus string
CorpusEntries []CorpusEntry
AnalysisFile, Format, Date, Genre string
}
type CorpusConfig ¶
type CorpusConfig struct {
CorpusDataDir string
CorpusDir string
Excluded map[string]bool
ProjectHome string
// contains filtered or unexported fields
}
CorpusConfig encapsulates parameters for corpus configuration
func NewFileCorpusConfig ¶ added in v0.0.22
func NewFileCorpusConfig(corpusDataDir, corpusDir string, excluded map[string]bool, projectHome string) CorpusConfig
NewFileCorpusConfig creates a new CorpusConfig struct
type CorpusEntry ¶
type CorpusEntry struct {
RawFile, GlossFile, Title, ColTitle, ColFile string
}
An entry in a collection
type CorpusLoader ¶
type CorpusLoader interface {
// Method to get the corpus configuration
GetConfig() CorpusConfig
// Method to get a single entry in a collection
// Param:
// fName: The file name of the collection
// Returns
// A CollectionEntry encapsulating the collection or an error
GetCollectionEntry(fName string) (*CollectionEntry, error)
// Method to load the entries in a collection
// Param:
// fName: A file name containing the entries in the collection
// colTitle: The title of the collection
LoadCollection(fName, colTitle string) (*[]CorpusEntry, error)
// Method to load the collections in a corpus from the default file
LoadCollections() (*[]CollectionEntry, error)
// Method to load the collections in a corpus
// Parameter:
// r: to read the listing of the collections
LoadCorpus(r io.Reader) (*[]CollectionEntry, error)
// Method to read the contents of a corpus entry
// Parameter:
// srcFile: the source file to read the text from
ReadText(srcFile string) (string, error)
}
Interface for loading a corpus with hierarchical collections of documents
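To illustrate the hierarchy (a corpus holds collections, and each collection holds entries), the sketch below walks it through a trimmed copy of the interface. Everything here is hypothetical: `loader` keeps only two of the documented methods, `memLoader` is an in-memory stand-in for a real loader, and the local structs carry just the fields the walk needs.

```go
package main

import "fmt"

// Local mirrors of the documented types, reduced to the fields used below.
type corpusEntry struct{ RawFile, GlossFile, Title string }
type collectionEntry struct{ CollectionFile, Title string }

// loader is a trimmed, hypothetical subset of the documented CorpusLoader.
type loader interface {
	LoadCollections() (*[]collectionEntry, error)
	LoadCollection(fName, colTitle string) (*[]corpusEntry, error)
}

// memLoader serves collections and entries from memory instead of files.
type memLoader struct {
	collections []collectionEntry
	entries     map[string][]corpusEntry // keyed by collection file name
}

func (m memLoader) LoadCollections() (*[]collectionEntry, error) {
	return &m.collections, nil
}

func (m memLoader) LoadCollection(fName, colTitle string) (*[]corpusEntry, error) {
	entries, ok := m.entries[fName]
	if !ok {
		return nil, fmt.Errorf("collection %q (%s) not found", fName, colTitle)
	}
	return &entries, nil
}

// listTitles visits every collection, then every entry in it, and returns
// strings of the form "collection title: entry title".
func listTitles(l loader) ([]string, error) {
	cols, err := l.LoadCollections()
	if err != nil {
		return nil, err
	}
	var titles []string
	for _, col := range *cols {
		entries, err := l.LoadCollection(col.CollectionFile, col.Title)
		if err != nil {
			return nil, err
		}
		for _, e := range *entries {
			titles = append(titles, col.Title+": "+e.Title)
		}
	}
	return titles, nil
}

func main() {
	l := memLoader{
		collections: []collectionEntry{{CollectionFile: "col1.csv", Title: "Collection One"}},
		entries: map[string][]corpusEntry{
			"col1.csv": {{Title: "Entry A"}, {Title: "Entry B"}},
		},
	}
	titles, err := listTitles(l)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, t := range titles {
		fmt.Println(t)
	}
}
```

Coding against an interface in this way lets a test substitute an in-memory loader for one that reads files.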
func NewFileCorpusLoader ¶ added in v0.0.22
func NewFileCorpusLoader(corpusConfig CorpusConfig) CorpusLoader
NewFileCorpusLoader gets the default kind of CorpusLoader