Documentation ¶
Overview ¶
Package issuefinder sets up a process for finding all issues across the filesystem and live sites so that other tools can get fairly comprehensive information: where in the workflow an issue resides, which batches contain a certain LCCN, which issues have duplicates, etc.
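As a rough sketch of the intended flow (the issuefinder.New constructor, the import path, and every path, hostname, and org code below are placeholders assumed for illustration, not part of this page's documented API; ranging over f.Issues assumes schema.IssueList is a slice of *schema.Issue):

package main

import (
	"fmt"
	"log"

	// Placeholder import path; adjust to the project's real module path.
	"example.com/project/issuefinder"
)

func main() {
	// Assumed constructor; this page only documents Deserialize as a way to
	// obtain a Finder.
	f := issuefinder.New()

	// Gather issues from a couple of sources; signatures match the index above.
	if _, err := f.FindSFTPIssues("/mnt/sftp", "oru"); err != nil {
		log.Fatalf("unable to search SFTP uploads: %s", err)
	}
	if _, err := f.FindWebBatches("https://live.example.org", "/var/cache/batches"); err != nil {
		log.Fatalf("unable to search live batches: %s", err)
	}

	// The FindXXX methods aggregate as they go, so the combined issue list and
	// the per-issue namespace lookup are ready to use.
	counts := make(map[issuefinder.Namespace]int)
	for _, issue := range f.Issues {
		counts[f.IssueNamespace[issue]]++
	}
	fmt.Println("issues per namespace:", counts)
}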
Index ¶
- type Finder
- func (f *Finder) Aggregate()
- func (f *Finder) FindInProcessIssues() (*Searcher, error)
- func (f *Finder) FindSFTPIssues(path, orgCode string) (*Searcher, error)
- func (f *Finder) FindScannedIssues(path string) (*Searcher, error)
- func (f *Finder) FindWebBatches(hostname, cachePath string) (*Searcher, error)
- func (f *Finder) Serialize(outFilename string) error
- type Namespace
- type Searcher
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Finder ¶
type Finder struct {
	Searchers map[Namespace]*Searcher
	Batches   []*schema.Batch
	Titles    schema.TitleList
	Issues    schema.IssueList
	Errors    apperr.List

	// This little var helps us answer the age-old question: for a given unique
	// issue, where is it in the workflow?
	IssueNamespace map[*schema.Issue]Namespace
}
Finder groups all the searchers together, allowing for aggregation of issue, title, and batch data from all sources while keeping the groups separate for the specific use-cases (e.g., SFTP issues shouldn't be scanned when trying to figure out if a given issue is live). A Finder doesn't have any critical context on its own, and can be reproduced from data stored in its Searchers.
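For instance, a tool that needs the per-source breakdown (rather than the aggregated lists) can walk Searchers directly; this sketch assumes schema.IssueList is a slice type so len applies:

// Report how many issues each search location produced, keeping sources separate
for ns, s := range f.Searchers {
	fmt.Printf("namespace %v (%s): %d issues\n", ns, s.Location, len(s.Issues))
}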
func Deserialize ¶
Deserialize attempts to read and deserialize the given filename into a Finder, returning the Finder if successful, or nil and an error otherwise
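A sketch of the cache round trip with Serialize; the single-filename parameter for Deserialize is assumed from the description above, and the cache path is a placeholder:

// Write the finder's state to disk so a later run can skip re-searching
if err := f.Serialize("/var/cache/finder.cache"); err != nil {
	log.Fatalf("unable to serialize finder: %s", err)
}

// Later, or in another process, restore it
f2, err := issuefinder.Deserialize("/var/cache/finder.cache")
if err != nil {
	log.Fatalf("unable to deserialize finder: %s", err)
}

// len assumes schema.IssueList is a slice type
fmt.Println("restored", len(f2.Issues), "issues")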
func (*Finder) Aggregate ¶
func (f *Finder) Aggregate()
Aggregate puts all searchers' data into the Finder for global use. This must be called if batches, issues, titles, or errors are added to a searcher directly (rather than via FindXXX methods).
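A sketch of the manual case Aggregate exists for; extraIssue stands in for a *schema.Issue built elsewhere (constructing one isn't covered on this page), and append assumes schema.IssueList is a slice:

// A searcher previously returned by one of the FindXXX calls (placeholder args)
sftp, err := f.FindSFTPIssues("/mnt/sftp", "oru")
if err != nil {
	log.Fatalf("unable to search SFTP uploads: %s", err)
}

// Add data to the searcher directly rather than via a FindXXX method...
sftp.Issues = append(sftp.Issues, extraIssue)

// ...then re-aggregate so f.Issues, f.Titles, f.Batches, and f.IssueNamespace
// reflect the change
f.Aggregate()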
func (*Finder) FindInProcessIssues ¶
FindInProcessIssues creates and runs a searcher for in-process issues (issues which are in the workflow directory and have been indexed), aggregates its data, and returns any errors encountered
func (*Finder) FindSFTPIssues ¶
FindSFTPIssues creates and runs an SFTP Searcher, aggregates its data, and returns any errors encountered
func (*Finder) FindScannedIssues ¶
FindScannedIssues creates and runs a scanned-issue Searcher, aggregates its data, and returns any errors encountered
func (*Finder) FindWebBatches ¶
FindWebBatches creates and runs a website batch Searcher, aggregates its data, and returns any errors encountered
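Each of the FindXXX methods follows the same pattern: the returned *Searcher holds just that source's results, while the Finder's aggregated fields are updated as well. A sketch with placeholder paths (len assumes schema.IssueList is a slice type):

// In-house scans waiting for processing
scanned, err := f.FindScannedIssues("/mnt/news/scans")
if err != nil {
	log.Fatalf("unable to search scanned issues: %s", err)
}

// Issues already indexed in the workflow database
inProcess, err := f.FindInProcessIssues()
if err != nil {
	log.Fatalf("unable to search in-process issues: %s", err)
}

fmt.Println(len(scanned.Issues), "scanned,", len(inProcess.Issues), "in process")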
type Namespace ¶
type Namespace uint8
Namespace is a special type of identifier for searchers to have well-defined namespacing for the different types of searches / locations. If two different locations need the same namespace, a different top-level Finder should be used (e.g., finding "web" issues on live versus staging)
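Since the Namespace values themselves aren't exported on this page (the Constants section above is empty), the live-versus-staging case is handled purely by keeping two Finders. A sketch, again assuming an issuefinder.New constructor and using placeholder hostnames and cache paths:

// One Finder per environment keeps the "web" namespace unambiguous
live := issuefinder.New()
staging := issuefinder.New()

if _, err := live.FindWebBatches("https://live.example.org", "/var/cache/live"); err != nil {
	log.Fatalf("unable to search live batches: %s", err)
}
if _, err := staging.FindWebBatches("https://staging.example.org", "/var/cache/staging"); err != nil {
	log.Fatalf("unable to search staging batches: %s", err)
}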
type Searcher ¶
type Searcher struct {
	Namespace Namespace
	Location  string
	Issues    schema.IssueList
	Batches   []*schema.Batch
	Titles    schema.TitleList

	// Errors is the list of errors which aren't specific to something like an
	// issue or a batch; e.g., a bad MARC Org Code directory in the scan path
	Errors apperr.List
	// contains filtered or unexported fields
}
Searcher is the central component of the issuefinder package, running the filesystem and web queries and providing an API to get the results
func NewSearcher ¶
NewSearcher instantiates a Searcher on its own. It typically isn't needed, but can be useful for specific one-off scripts
func (*Searcher) FindInProcessIssues ¶
FindInProcessIssues aggregates all issues which have been indexed in the database
func (*Searcher) FindSFTPIssues ¶
FindSFTPIssues aggregates all the uploaded born-digital PDFs
func (*Searcher) FindScannedIssues ¶
FindScannedIssues aggregates all the in-house scans waiting for processing
func (*Searcher) FindWebBatches ¶
FindWebBatches reads through the JSON from the batch API URL (using the Searcher's Location as the web root) and grabs the "next" page until there is no next page. Each batch is then read from the JSON cache path, or read from the site and cached. The disk cache speeds up the tool's future runs, since only what's been batched since a prior run has to be requested.
As with other searches, this returns an error only on unexpected behaviors, like the site not responding.
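The behavior described above is a standard paged-API walk with a per-batch disk cache. The following is an illustrative sketch of that general pattern only, not the package's implementation; the listing endpoint, JSON field names, and cache layout are all assumptions:

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

// getBody downloads url and returns the response body
func getBody(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	cachePath := "/var/cache/batches"              // placeholder cache dir
	url := "https://live.example.org/batches.json" // placeholder listing URL

	// Walk the listing pages until there is no "next" page
	for url != "" {
		var page struct {
			Batches []struct {
				Name string `json:"name"`
				URL  string `json:"url"`
			} `json:"batches"`
			Next string `json:"next"`
		}
		data, err := getBody(url)
		if err == nil {
			err = json.Unmarshal(data, &page)
		}
		if err != nil {
			fmt.Println("error reading batch list:", err)
			return
		}

		// Each batch's own JSON comes from the disk cache when present, so later
		// runs only hit the site for batches added since the last run
		for _, b := range page.Batches {
			fname := filepath.Join(cachePath, b.Name+".json")
			if _, err := os.Stat(fname); err == nil {
				continue // already cached
			}
			detail, err := getBody(b.URL)
			if err != nil {
				fmt.Println("error reading batch", b.Name+":", err)
				return
			}
			if err := os.WriteFile(fname, detail, 0644); err != nil {
				fmt.Println("error caching batch", b.Name+":", err)
				return
			}
		}
		url = page.Next
	}
}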