Documentation ¶
Overview ¶
Package issuefinder sets up a process for finding all issues across the filesystem and live sites so that other tools can get fairly comprehensive information: where in the workflow an issue resides, which batches contain a certain LCCN, which issues have duplicates, etc.
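As a rough sketch of the intended flow (the issuefinder.New constructor, the import path, and every path, hostname, and org code below are placeholders assumed for illustration, not part of this page's documented API; ranging over f.Issues assumes schema.IssueList is a slice of *schema.Issue):

package main

import (
	"fmt"
	"log"

	// Placeholder import path; adjust to the project's real module path.
	"example.com/project/issuefinder"
)

func main() {
	// Assumed constructor; this page only documents Deserialize as a way to
	// obtain a Finder.
	f := issuefinder.New()

	// Gather issues from a couple of sources; signatures match the index above.
	if _, err := f.FindSFTPIssues("/mnt/sftp", "oru"); err != nil {
		log.Fatalf("unable to search SFTP uploads: %s", err)
	}
	if _, err := f.FindWebBatches("https://live.example.org", "/var/cache/batches"); err != nil {
		log.Fatalf("unable to search live batches: %s", err)
	}

	// The FindXXX methods aggregate as they go, so the combined issue list and
	// the per-issue namespace lookup are ready to use.
	counts := make(map[issuefinder.Namespace]int)
	for _, issue := range f.Issues {
		counts[f.IssueNamespace[issue]]++
	}
	fmt.Println("issues per namespace:", counts)
}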
Index ¶
- type Finder
- func (f *Finder) Aggregate()
- func (f *Finder) FindInProcessIssues() (*Searcher, error)
- func (f *Finder) FindSFTPIssues(path, orgCode string) (*Searcher, error)
- func (f *Finder) FindScannedIssues(path string) (*Searcher, error)
- func (f *Finder) FindWebBatches(hostname, cachePath string) (*Searcher, error)
- func (f *Finder) Serialize(outFilename string) error
- type Namespace
- type Searcher
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Finder ¶
type Finder struct {
	Searchers map[Namespace]*Searcher
	Batches   []*schema.Batch
	Titles    schema.TitleList
	Issues    schema.IssueList
	Errors    apperr.List

	// This little var helps us answer the age-old question: for a given unique
	// issue, where is it in the workflow?
	IssueNamespace map[*schema.Issue]Namespace
}
Finder groups all the searchers together, allowing for aggregation of issue, title, and batch data from all sources while keeping the groups separate for the specific use-cases (e.g., SFTP issues shouldn't be scanned when trying to figure out if a given issue is live). A Finder doesn't have any critical context on its own, and can be reproduced from data stored in its Searchers.
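For instance, a tool that needs the per-source breakdown (rather than the aggregated lists) can walk Searchers directly; this sketch assumes schema.IssueList is a slice type so len applies:

// Report how many issues each search location produced, keeping sources separate
for ns, s := range f.Searchers {
	fmt.Printf("namespace %v (%s): %d issues\n", ns, s.Location, len(s.Issues))
}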
func Deserialize ¶
Deserialize attempts to read and deserialize the given filename into a Finder, returning the Finder if successful, or nil and an error otherwise
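A sketch of the cache round trip with Serialize; the single-filename parameter for Deserialize is assumed from the description above, and the cache path is a placeholder:

// Write the finder's state to disk so a later run can skip re-searching
if err := f.Serialize("/var/cache/finder.cache"); err != nil {
	log.Fatalf("unable to serialize finder: %s", err)
}

// Later, or in another process, restore it
f2, err := issuefinder.Deserialize("/var/cache/finder.cache")
if err != nil {
	log.Fatalf("unable to deserialize finder: %s", err)
}

// len assumes schema.IssueList is a slice type
fmt.Println("restored", len(f2.Issues), "issues")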
func (*Finder) Aggregate ¶
func (f *Finder) Aggregate()
Aggregate puts all searchers' data into the Finder for global use. This must be called if batches, issues, titles, or errors are added to a searcher directly (rather than via FindXXX methods).
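A sketch of the manual case Aggregate exists for; extraIssue stands in for a *schema.Issue built elsewhere (constructing one isn't covered on this page), and append assumes schema.IssueList is a slice:

// A searcher previously returned by one of the FindXXX calls (placeholder args)
sftp, err := f.FindSFTPIssues("/mnt/sftp", "oru")
if err != nil {
	log.Fatalf("unable to search SFTP uploads: %s", err)
}

// Add data to the searcher directly rather than via a FindXXX method...
sftp.Issues = append(sftp.Issues, extraIssue)

// ...then re-aggregate so f.Issues, f.Titles, f.Batches, and f.IssueNamespace
// reflect the change
f.Aggregate()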
func (*Finder) FindInProcessIssues ¶
FindInProcessIssues creates and runs a searcher for in-process issues (issues which are in the workflow directory and have been indexed), aggregates its data, and returns any errors encountered
func (*Finder) FindSFTPIssues ¶
FindSFTPIssues creates and runs an SFTP Searcher, aggregates its data, and returns any errors encountered
func (*Finder) FindScannedIssues ¶
FindScannedIssues creates and runs a scanned-issue Searcher, aggregates its data, and returns any errors encountered
func (*Finder) FindWebBatches ¶
FindWebBatches creates and runs a website batch Searcher, aggregates its data, and returns any errors encountered
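Each of the FindXXX methods follows the same pattern: the returned *Searcher holds just that source's results, while the Finder's aggregated fields are updated as well. A sketch with placeholder paths (len assumes schema.IssueList is a slice type):

// In-house scans waiting for processing
scanned, err := f.FindScannedIssues("/mnt/news/scans")
if err != nil {
	log.Fatalf("unable to search scanned issues: %s", err)
}

// Issues already indexed in the workflow database
inProcess, err := f.FindInProcessIssues()
if err != nil {
	log.Fatalf("unable to search in-process issues: %s", err)
}

fmt.Println(len(scanned.Issues), "scanned,", len(inProcess.Issues), "in process")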
type Namespace ¶
type Namespace uint8
Namespace is a special type of identifier for searchers to have well-defined namespacing for the different types of searches / locations. If two different locations need the same namespace, a different top-level Finder should be used (e.g., finding "web" issues on live versus staging)
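Since the Namespace values themselves aren't exported on this page (the Constants section above is empty), the live-versus-staging case is handled purely by keeping two Finders. A sketch, again assuming an issuefinder.New constructor and using placeholder hostnames and cache paths:

// One Finder per environment keeps the "web" namespace unambiguous
live := issuefinder.New()
staging := issuefinder.New()

if _, err := live.FindWebBatches("https://live.example.org", "/var/cache/live"); err != nil {
	log.Fatalf("unable to search live batches: %s", err)
}
if _, err := staging.FindWebBatches("https://staging.example.org", "/var/cache/staging"); err != nil {
	log.Fatalf("unable to search staging batches: %s", err)
}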
type Searcher ¶
type Searcher struct {
	Namespace Namespace
	Location  string
	Issues    schema.IssueList
	Batches   []*schema.Batch
	Titles    schema.TitleList

	// Errors is the list of errors which aren't specific to something like an
	// issue or a batch; e.g., a bad MARC Org Code directory in the scan path
	Errors apperr.List
	// contains filtered or unexported fields
}
Searcher is the central component of the issuefinder package, running the filesystem and web queries and providing an API to get the results
func NewSearcher ¶
NewSearcher instantiates a Searcher on its own. It typically isn't needed, but can be useful for specific one-off scripts
func (*Searcher) FindInProcessIssues ¶
FindInProcessIssues aggregates all issues which have been indexed in the database
func (*Searcher) FindSFTPIssues ¶
FindSFTPIssues aggregates all the uploaded born-digital PDFs
func (*Searcher) FindScannedIssues ¶
FindScannedIssues aggregates all the in-house scans waiting for processing
func (*Searcher) FindWebBatches ¶
FindWebBatches reads through the JSON from the batch API URL (using the Searcher's Location as the web root) and grabs the "next" page until there is no next page. Each batch is then read from the JSON cache path, or read from the site and cached. The disk cache speeds up the tool's future runs, since only what's been batched since a prior run has to be requested.
As with other searches, this returns an error only on unexpected behaviors, like the site not responding.
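The behavior described above is a standard paged-API walk with a per-batch disk cache. The following is an illustrative sketch of that general pattern only, not the package's implementation; the listing endpoint, JSON field names, and cache layout are all assumptions:

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

// getBody downloads url and returns the response body
func getBody(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	cachePath := "/var/cache/batches"              // placeholder cache dir
	url := "https://live.example.org/batches.json" // placeholder listing URL

	// Walk the listing pages until there is no "next" page
	for url != "" {
		var page struct {
			Batches []struct {
				Name string `json:"name"`
				URL  string `json:"url"`
			} `json:"batches"`
			Next string `json:"next"`
		}
		data, err := getBody(url)
		if err == nil {
			err = json.Unmarshal(data, &page)
		}
		if err != nil {
			fmt.Println("error reading batch list:", err)
			return
		}

		// Each batch's own JSON comes from the disk cache when present, so later
		// runs only hit the site for batches added since the last run
		for _, b := range page.Batches {
			fname := filepath.Join(cachePath, b.Name+".json")
			if _, err := os.Stat(fname); err == nil {
				continue // already cached
			}
			detail, err := getBody(b.URL)
			if err != nil {
				fmt.Println("error reading batch", b.Name+":", err)
				return
			}
			if err := os.WriteFile(fname, detail, 0644); err != nil {
				fmt.Println("error caching batch", b.Name+":", err)
				return
			}
		}
		url = page.Next
	}
}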