Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var ErrLoad = errors.New("unable to load config")
ErrLoad indicates the config for a job could not be loaded successfully.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	// WorkPath is an exact path to the working directory to store intermediate
	// and final outputs. All child paths are assumed to be subdirectories of
	// this path.
	WorkPath string `yaml:"workPath"`

	// Jobs is a map of directory names representing jobs to the configuration
	// for those jobs. Should be retrieved individually with GetJob.
	//
	// The job name (the key in this map) is the argument which should be passed
	// after the configuration argument.
	Jobs map[string]*Job `yaml:"jobs"`
}
type Extract ¶
type Extract struct {
	// WorkPath is an exact path to the working directory all other paths are relative to.
	// Inherits from the parent configuration if unset.
	WorkPath string `yaml:"workPath"`

	// ArticlesPath is the filepath to the pages-articles-multistream
	// dump of Wikipedia. Must be compressed as the associated Index points
	// to specific bytes of the compressed format.
	ArticlesPath string `yaml:"articlesPath"`

	// IndexPath is the filepath to the index corresponding to the dump.
	// Accepts bz2 or uncompressed.
	IndexPath string `yaml:"indexPath"`

	// Namespaces is the list of namespace IDs to include articles from.
	Namespaces []int `yaml:"namespaces"`

	// OutPath is the filepath to store the database of extracted articles.
	OutPath string `yaml:"outPath"`
}
func (*Extract) GetArticlesPath ¶
func (*Extract) GetIndexPath ¶
func (*Extract) GetOutPath ¶
func (*Extract) GetWorkPath ¶
func (*Extract) SetWorkPath ¶
type Job ¶
type Job struct {
	// SubCommand is the subCommand of wikopticon to run. Should correspond
	// one-to-one with a configuration type to unmarshall to. So the subCommand
	// "extract" should map to the "config.Extract" type.
	SubCommand string `yaml:"subCommand"`

	// Settings are the configuration used for a job.
	// Should be unmarshalled into a real config object with unmarshall.
	Settings map[string]interface{} `yaml:"settings"`
}
Job represents a generic task to run with all specified configuration.