Documentation
Overview ¶
Package schema houses simple data types for titles, issues, batches, etc. Types which live here are generally meant to be very general-case rather than trying to hold all possible information for all possible use cases.
Index ¶
- Constants
- func CondensedDate(rawDate string) string
- func IssueDateEdition(rawDate string, edition int) string
- func IssueKey(lccn, rawDate string, edition int) string
- func TrimCommonPrefixes(s string) string
- type Batch
- type DuplicateIssueError
- type File
- type Issue
- func (i *Issue) CheckDupes(lookup *Lookup)
- func (i *Issue) DateEdition() string
- func (i *Issue) ErrDuped(dupe *Issue)
- func (i *Issue) ErrFolderContents(extra string)
- func (i *Issue) ErrInvalidFolderName(extra string)
- func (i *Issue) ErrNoFiles()
- func (i *Issue) ErrReadFailure(err error)
- func (i *Issue) ErrTooNew(hours int)
- func (i *Issue) FindFiles()
- func (i *Issue) IsLive() bool
- func (i *Issue) Key() string
- func (i *Issue) LastModified() time.Time
- func (i *Issue) TSV() string
- func (i *Issue) WorkflowIdentification() string
- type IssueError
- type IssueList
- type IssueMap
- type Key
- type Lookup
- type Title
- type TitleList
- type WorkflowStep
Constants ¶
const (
	// WSNil should only be used to indicate a workflow step is irrelevant or else unset
	WSNil                    WorkflowStep = ""
	WSSFTP                                = "SFTPUpload"
	WSScan                                = "ScanUpload"
	WSAwaitingProcessing                  = "AwaitingProcessing"
	WSAwaitingPageReview                  = "AwaitingPageReview"
	WSReadyForMetadataEntry               = "ReadyForMetadataEntry"
	WSAwaitingMetadataReview              = "AwaitingMetadataReview"
	WSReadyForMETSXML                     = "ReadyForMETSXML"
	WSReadyForBatching                    = "ReadyForBatching"
	WSInProduction                        = "InProduction"
)
All possible statuses an issue could have
Functions ¶
func CondensedDate ¶
CondensedDate returns the date in a consistent format for use in issue keys and TSV output
func IssueDateEdition ¶
IssueDateEdition returns the combination of condensed date (no hyphens) and two-digit edition number for use in issue keys and other places we need the "local" unique string
func IssueKey ¶
IssueKey centralizes the generation of our unique "key" for an issue using the lccn + date + edition
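As a rough illustration of how the three helpers above could compose, here is a minimal, self-contained sketch. The lowercase function names are hypothetical stand-ins, not the package's source. The "/" separator between LCCN and date-edition and the YYYY-MM-DD input format are both assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// condensedDate drops hyphens from a raw date (assumed to be YYYY-MM-DD).
func condensedDate(rawDate string) string {
	return strings.ReplaceAll(rawDate, "-", "")
}

// issueDateEdition appends a zero-padded, two-digit edition number to the
// condensed date.
func issueDateEdition(rawDate string, edition int) string {
	return fmt.Sprintf("%s%02d", condensedDate(rawDate), edition)
}

// issueKey joins LCCN and date-edition. The "/" separator is an assumption
// made for this sketch.
func issueKey(lccn, rawDate string, edition int) string {
	return lccn + "/" + issueDateEdition(rawDate, edition)
}

func main() {
	// Hypothetical LCCN and date, for illustration only.
	fmt.Println(issueKey("sn12345678", "2017-05-09", 1)) // sn12345678/2017050901
}
```

Keeping key generation in one place means every caller that compares or indexes issues agrees on the same string form.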
func TrimCommonPrefixes ¶
TrimCommonPrefixes strips "The", "A", and "An" from the string if they're at the beginning, and removes leading spaces
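The behavior described above might look roughly like the following sketch. It is not the package's implementation: removing only a single leading article, matching case-insensitively, and requiring a space after the article (so "Another" is left alone) are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// trimCommonPrefixes removes leading spaces and then at most one leading
// article ("The", "A", or "An"). Whole-word, case-insensitive matching is an
// assumption of this sketch.
func trimCommonPrefixes(s string) string {
	s = strings.TrimLeft(s, " ")
	for _, article := range []string{"the ", "an ", "a "} {
		if len(s) >= len(article) && strings.EqualFold(s[:len(article)], article) {
			return strings.TrimLeft(s[len(article):], " ")
		}
	}
	return s
}

func main() {
	fmt.Println(trimCommonPrefixes("The Daily News")) // Daily News
	fmt.Println(trimCommonPrefixes("Another Day"))    // Another Day
}
```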
Types ¶
type Batch ¶
type Batch struct {
	// MARCOrgCode tells us the organization responsible for the images in the batch
	MARCOrgCode string

	// A batch's keyword is normally short, such as "horsetail", but our in-house
	// batches have much longer keywords to ensure uniqueness
	Keyword string

	// Usually 1, but I've seen "_ver02" batches occasionally
	Version int

	// Issues links the issues which are part of this batch
	Issues IssueList

	// Location is where this batch can be found, either a URL or filesystem path
	Location string

	Errors apperr.List
}
Batch represents high-level batch information
func ParseBatchname ¶
ParseBatchname creates a Batch by splitting up the full name string
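The exact name format is not documented here. The sketch below assumes names shaped like batch_<org>_<keyword>_ver<NN>, which fits the MARCOrgCode, Keyword, and Version fields and the "_ver02" remark above. The type, function name, and error wording are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// batch holds only the fields this sketch parses out of the name.
type batch struct {
	MARCOrgCode string
	Keyword     string
	Version     int
}

// parseBatchname splits a name assumed to look like
// "batch_<org>_<keyword>_ver<NN>". Keywords containing underscores are not
// handled by this sketch.
func parseBatchname(fullname string) (*batch, error) {
	parts := strings.Split(fullname, "_")
	if len(parts) != 4 || parts[0] != "batch" || !strings.HasPrefix(parts[3], "ver") {
		return nil, fmt.Errorf("batch name %q: want batch_<org>_<keyword>_ver<NN>", fullname)
	}
	version, err := strconv.Atoi(strings.TrimPrefix(parts[3], "ver"))
	if err != nil {
		return nil, fmt.Errorf("batch name %q: invalid version: %w", fullname, err)
	}
	return &batch{MARCOrgCode: parts[1], Keyword: parts[2], Version: version}, nil
}

func main() {
	b, err := parseBatchname("batch_oru_horsetail_ver01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(b.MARCOrgCode, b.Keyword, b.Version) // oru horsetail 1
}
```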
type DuplicateIssueError ¶
type DuplicateIssueError struct {
	*IssueError
	Location string
	Name     string
	IsLive   bool
}
DuplicateIssueError implements apperr.Error for duped issue situations, and holds onto extra information for figuring out how to handle the dupe
type Issue ¶
type Issue struct {
	MARCOrgCode string
	Title       *Title
	RawDate     string // This is the date as seen on the filesystem when the issue was uploaded
	Edition     int
	Batch       *Batch
	Files       []*File
	Errors      apperr.List

	// Location is where this issue can be found, either a URL or filesystem path
	Location     string
	WorkflowStep WorkflowStep
	// contains filtered or unexported fields
}
Issue is an extremely basic encapsulation of an issue's high-level data
func (*Issue) CheckDupes ¶
CheckDupes centralizes the logic for seeing if an issue has a duplicate in a given lookup, adding a duplication error if there is a dupe and that dupe is considered to be more "canonical" than this issue. For example, if there's an issue in the metadata entry stage and another in the SFTP upload, the upload is considered the dupe, not the one in metadata entry.
func (*Issue) DateEdition ¶
DateEdition returns the combination of condensed date (no hyphens) and two-digit edition number for use in issue keys and other places we need the "local" unique string
func (*Issue) ErrFolderContents ¶
ErrFolderContents tells us the issue's files on disk are invalid in some way
func (*Issue) ErrInvalidFolderName ¶
ErrInvalidFolderName adds an Error for invalid folder name formats
func (*Issue) ErrNoFiles ¶
func (i *Issue) ErrNoFiles()
ErrNoFiles adds an error stating the issue folder is empty
func (*Issue) ErrReadFailure ¶
ErrReadFailure indicates the issue's folder wasn't able to be read
func (*Issue) ErrTooNew ¶
ErrTooNew adds an error for issues which are too new to be processed. hours should be set to the minimum number of hours an issue should be untouched before being considered "safe".
func (*Issue) FindFiles ¶
func (i *Issue) FindFiles()
FindFiles clears the issue's file list and then reads everything in the issue directory, appending it to the now-empty list. This will silently fail when the issue's location is invalid, not readable, or isn't an absolute path beginning with "/". This is only meant for issues already discovered on the filesystem.
func (*Issue) IsLive ¶
IsLive returns true if the issue both has a batch *and* the batch appears to be on the live site
func (*Issue) LastModified ¶
LastModified tells us when *any* change happened in an issue's folder. This will return a meaningless value on live issues.
func (*Issue) TSV ¶
TSV gives us something which can be used to uniquely identify all aspects of this issue's data for reporting and/or data verification
func (*Issue) WorkflowIdentification ¶
WorkflowIdentification returns a human-readable explanation of where an issue currently is in the workflow - currently used for adding to "likely duplicate of ..."
type IssueError ¶
IssueError implements apperr.Error and forms the base for all issue errors
func (*IssueError) Error ¶
func (e *IssueError) Error() string
func (*IssueError) Message ¶
func (e *IssueError) Message() string
Message returns the long, human-friendly error message
func (*IssueError) Propagate ¶
func (e *IssueError) Propagate() bool
Propagate returns whether the error should flag the object's parent as also having an error
type Key ¶
Key defines the precise issue (or subset of issues) we want to find. Note that the structure here is very specific to this issue finder, so we don't expect (or even want) reuse.
func ParseSearchKey ¶
ParseSearchKey attempts to read the given string, returning an error if the string isn't a valid search key, otherwise returning the parsed Key
type Lookup ¶
type Lookup struct {
	sync.RWMutex

	// Issue lets us find issues by key; we should usually have only one
	// issue per key, but the live site could have something that's still sitting
	// in the "ready for ingest" area, or the page backup area.
	Issue IssueMap

	// IssueNoEdition is a lookup containing all issues for a given partial
	// key, where the partial key contains everything except an Issue edition
	IssueNoEdition IssueMap

	// IssueNoDay looks up issues without day number or edition
	IssueNoDay IssueMap

	// IssueNoMonth looks up issues without month, day number, or edition
	IssueNoMonth IssueMap

	// IssueNoYear looks up issues without any date information
	IssueNoYear IssueMap
}
Lookup aggregates issue lists to create very granularly searchable data
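To make the idea of progressively coarser partial keys concrete, here is a minimal sketch of a lock-guarded, multi-level index. Everything in it is an assumption made for illustration: the key formats, the level constants, the add and find methods, and the use of one array of maps rather than five named fields.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// issue carries only what this sketch needs to build keys.
type issue struct {
	LCCN    string
	RawDate string // assumed to be YYYY-MM-DD
	Edition int
}

// Granularity levels, from the full key down to LCCN only.
const (
	levelFull = iota
	levelNoEdition
	levelNoDay
	levelNoMonth
	levelNoYear
	levelCount
)

// lookup indexes each issue under one key per granularity level.
type lookup struct {
	sync.RWMutex
	levels [levelCount]map[string][]*issue
}

func newLookup() *lookup {
	l := &lookup{}
	for n := range l.levels {
		l.levels[n] = make(map[string][]*issue)
	}
	return l
}

// add stores the issue under every level's key while holding the write lock.
func (l *lookup) add(i *issue) error {
	d := strings.ReplaceAll(i.RawDate, "-", "")
	if len(d) != 8 {
		return fmt.Errorf("raw date %q is not YYYY-MM-DD", i.RawDate)
	}
	keys := [levelCount]string{
		fmt.Sprintf("%s/%s%02d", i.LCCN, d, i.Edition),
		i.LCCN + "/" + d,
		i.LCCN + "/" + d[:6],
		i.LCCN + "/" + d[:4],
		i.LCCN,
	}
	l.Lock()
	defer l.Unlock()
	for n, k := range keys {
		l.levels[n][k] = append(l.levels[n][k], i)
	}
	return nil
}

// find returns all issues stored under key at the given level.
func (l *lookup) find(level int, key string) []*issue {
	if level < 0 || level >= levelCount {
		return nil
	}
	l.RLock()
	defer l.RUnlock()
	return l.levels[level][key]
}

func main() {
	l := newLookup()
	_ = l.add(&issue{LCCN: "sn1", RawDate: "2017-05-09", Edition: 1})
	_ = l.add(&issue{LCCN: "sn1", RawDate: "2017-05-10", Edition: 1})
	fmt.Println(len(l.find(levelNoDay, "sn1/201705"))) // 2
}
```

Indexing at insert time trades memory for constant-time lookups at any granularity, which suits repeated searches over a mostly static set of issues.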
type Title ¶
type Title struct {
	LCCN               string
	Name               string
	PlaceOfPublication string
	Errors             apperr.List

	// Issues contains the list of issues associated with a single title; though
	// this can be derived by iterating over all the issues, it's useful to store
	// them here, too
	Issues IssueList

	// Location is where the title was found on disk or web; not actual Title metadata
	Location string
	// contains filtered or unexported fields
}
Title is a publisher's information, unique per LCCN
func (*Title) GenericTitle ¶
GenericTitle returns a title with the same generic information, but none of the data which is tied to a specific title on the filesystem or website: location and issue list
type TitleList ¶
type TitleList []*Title
TitleList is a simple slice of titles for easier built-in sorting and identifying a unique list of all titles
func (TitleList) SortByName ¶
func (list TitleList) SortByName()
SortByName sorts the titles by their name, using location and lccn when names are the same
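A multi-field sort of this kind is commonly written with sort.Slice, as in the sketch below. The types are hypothetical reductions of Title and TitleList. Breaking ties by location first and LCCN second, and comparing names byte-wise (so case matters), are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// title carries only the fields used for ordering.
type title struct {
	LCCN     string
	Name     string
	Location string
}

type titleList []*title

// sortByName orders titles by name, falling back to location and then LCCN
// when names are equal.
func (list titleList) sortByName() {
	sort.Slice(list, func(a, b int) bool {
		x, y := list[a], list[b]
		if x.Name != y.Name {
			return x.Name < y.Name
		}
		if x.Location != y.Location {
			return x.Location < y.Location
		}
		return x.LCCN < y.LCCN
	})
}

func main() {
	list := titleList{
		{LCCN: "sn2", Name: "Gazette", Location: "/b"},
		{LCCN: "sn1", Name: "Gazette", Location: "/a"},
		{LCCN: "sn3", Name: "Daily News", Location: "/c"},
	}
	list.sortByName()
	for _, t := range list {
		fmt.Println(t.Name, t.Location, t.LCCN)
	}
}
```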
type WorkflowStep ¶
type WorkflowStep string
WorkflowStep describes the location within the workflow where an issue can exist. This is basically a more comprehensive list than what's in the database, in order to capture every possible location: live batches, SFTPed issues awaiting processing, etc.