finc

package
v0.2.1 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 29, 2024 License: GPL-3.0 Imports: 10 Imported by: 0

Documentation

Overview

Copyright 2015 by Leipzig University Library, http://ub.uni-leipzig.de
                  The Finc Authors, http://finc.info
                  Martin Czygan, <martin.czygan@uni-leipzig.de>

This file is part of some open source application.

Some open source application is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Some open source application is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with Foobar. If not, see <http://www.gnu.org/licenses/>.

@license GPL-3.0+ <http://spdx.org/licenses/GPL-3.0+>

TODO.

Constants

const (
	IntermediateSchemaRecordType = "is"
	AIRecordType                 = "ai"
	IntermediateSchemaVersion    = "0.9"
)

Variables

var (
	SubjectMapping = assetutil.MustLoadStringSliceMap("assets/finc/subjects.json")
	LanguageMap    = assetutil.MustLoadStringMap("assets/finc/iso-639-3-language.json")
	AIAccessFacet  = "Electronic Resources"

	FormatDe105  = assetutil.MustLoadStringMap("assets/finc/formats/de105.json")
	FormatDe14   = assetutil.MustLoadStringMap("assets/finc/formats/de14.json")
	FormatDe15   = assetutil.MustLoadStringMap("assets/finc/formats/de15.json")
	FormatDe520  = assetutil.MustLoadStringMap("assets/finc/formats/de520.json")
	FormatDe540  = assetutil.MustLoadStringMap("assets/finc/formats/de540.json")
	FormatDeCh1  = assetutil.MustLoadStringMap("assets/finc/formats/dech1.json")
	FormatDed117 = assetutil.MustLoadStringMap("assets/finc/formats/ded117.json")
	FormatDeGla1 = assetutil.MustLoadStringMap("assets/finc/formats/degla1.json")
	FormatDel152 = assetutil.MustLoadStringMap("assets/finc/formats/del152.json")
	FormatDel189 = assetutil.MustLoadStringMap("assets/finc/formats/del189.json")
	FormatDeZi4  = assetutil.MustLoadStringMap("assets/finc/formats/dezi4.json")
	FormatDeZwi2 = assetutil.MustLoadStringMap("assets/finc/formats/dezwi2.json")
	FormatFinc   = assetutil.MustLoadStringMap("assets/finc/formats/finc.json")
	FormatNrw    = assetutil.MustLoadStringMap("assets/finc/formats/nrw.json")
)

var (
	NotAssigned     = "" // was "not assigned", refs #7092
	NonAlphaNumeric = regexp.MustCompile("/[^A-Za-z0-9]+/")
)

var AuthorReplacer = strings.NewReplacer(
	"AUTHOR INDEX", "",
	"AUTHOR Index", "",
	"Anonymous", "",
	"Author Index", "",
	"No authorship indicated", "",
	"Not Available, Not Available", "",
	"anonym", "",
	"keine Angabe", "",
)

AuthorReplacer is a special cleaner for author names.
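A minimal sketch of how such a replacer behaves, using only the standard library. The replacement pairs are copied from the declaration above; the `cleanAuthor` helper and the sample names are hypothetical and not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// authorReplacer mirrors the pairs documented for finc.AuthorReplacer.
var authorReplacer = strings.NewReplacer(
	"AUTHOR INDEX", "",
	"AUTHOR Index", "",
	"Anonymous", "",
	"Author Index", "",
	"No authorship indicated", "",
	"Not Available, Not Available", "",
	"anonym", "",
	"keine Angabe", "",
)

// cleanAuthor is a hypothetical helper: it applies the replacer and trims
// any whitespace left behind by a removed placeholder.
func cleanAuthor(name string) string {
	return strings.TrimSpace(authorReplacer.Replace(name))
}

func main() {
	for _, name := range []string{"Anonymous", "keine Angabe", "Doe, Jane"} {
		fmt.Printf("%q -> %q\n", name, cleanAuthor(name))
	}
}
```

Placeholder values collapse to the empty string, while ordinary names pass through unchanged. Matching is case-sensitive and works on substrings.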

Functions

This section is empty.

Types

type Author

type Author struct {
	ID           string `json:"x.id,omitempty"`
	Name         string `json:"rft.au,omitempty"`
	LastName     string `json:"rft.aulast,omitempty"`
	FirstName    string `json:"rft.aufirst,omitempty"`
	Initial      string `json:"rft.auinit,omitempty"`
	FirstInitial string `json:"rft.auinit1,omitempty"`
	MiddleName   string `json:"rft.auinitm,omitempty"`
	Suffix       string `json:"rft.ausuffix,omitempty"`

	// Organization or corporation that is the author or creator of the book;
	// "Mellon Foundation", for example. (Table 14: Z39.88-2004 Matrix
	// Constraint Definition of KEV Metadata Format for "book", Excerpt,
	// https://groups.niso.org/apps/group_public/download.php/14833/z39_88_2004_r2010.pdf#page=55).
	Corporate string `json:"rft.aucorp,omitempty"`
}

Author represents an author, "inspired" by OpenURL.
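To illustrate the OpenURL-style keys, here is a small sketch that serializes a trimmed local copy of the struct. The field names and tags follow the declaration above; the `encodeAuthor` helper and the sample values are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Author is a trimmed local copy of finc.Author, keeping the documented tags.
type Author struct {
	Name      string `json:"rft.au,omitempty"`
	LastName  string `json:"rft.aulast,omitempty"`
	FirstName string `json:"rft.aufirst,omitempty"`
	Corporate string `json:"rft.aucorp,omitempty"`
}

// encodeAuthor is a hypothetical helper that returns the JSON form of an
// author, or an empty string if marshaling fails.
func encodeAuthor(a Author) string {
	b, err := json.Marshal(a)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A personal author: only the populated fields are emitted.
	fmt.Println(encodeAuthor(Author{LastName: "Doe", FirstName: "Jane"}))
	// A corporate author uses rft.aucorp instead of the name fields.
	fmt.Println(encodeAuthor(Author{Corporate: "Mellon Foundation"}))
}
```

Because every field carries `omitempty`, empty name parts do not appear in the output.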

func (*Author) String

func (author *Author) String() string

String returns a formatted author string. TODO(miku): make this complete.

type Exporter

type Exporter interface {
	// Export turns an intermediate schema into bytes. Lower level
	// representation than ExportSchema.Convert. Allows JSON, XML, Marc,
	// Formeta and other formats.
	Export(is IntermediateSchema, withFullrecord bool) ([]byte, error)
}

Exporter defines a basic export method that serializes an intermediate schema.
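A sketch of what satisfying this interface could look like. The `IntermediateSchema` type here is a heavily trimmed stand-in, and `titleOnly` and `exportString` are hypothetical names introduced for illustration; none of them are exporters shipped with the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// IntermediateSchema is a trimmed stand-in for finc.IntermediateSchema.
type IntermediateSchema struct {
	ID           string `json:"finc.id,omitempty"`
	ArticleTitle string `json:"rft.atitle,omitempty"`
}

// Exporter repeats the documented interface.
type Exporter interface {
	Export(is IntermediateSchema, withFullrecord bool) ([]byte, error)
}

// titleOnly is a hypothetical exporter that emits only ID and title as JSON.
// It ignores the withFullrecord flag, as Formeta does with its second argument.
type titleOnly struct{}

func (titleOnly) Export(is IntermediateSchema, _ bool) ([]byte, error) {
	return json.Marshal(map[string]string{"id": is.ID, "title": is.ArticleTitle})
}

// exportString runs any Exporter and returns the result as a string.
func exportString(e Exporter, is IntermediateSchema) string {
	b, err := e.Export(is, false)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	var e Exporter = titleOnly{}
	fmt.Println(exportString(e, IntermediateSchema{ID: "ai-1", ArticleTitle: "Hello"}))
}
```

Since the interface returns raw bytes, the same call site can serve JSON, XML or any other serialization.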

type Formeta

type Formeta struct{}

func (*Formeta) Export

func (s *Formeta) Export(is IntermediateSchema, _ bool) ([]byte, error)

type IntermediateSchema

type IntermediateSchema struct {
	Format          string   `json:"finc.format,omitempty"`
	MegaCollections []string `json:"finc.mega_collection,omitempty"`
	ID              string   `json:"finc.id,omitempty"`
	RecordID        string   `json:"finc.record_id,omitempty"`
	SourceID        string   `json:"finc.source_id,omitempty"`

	Database     string `json:"ris.db,omitempty"`
	DataProvider string `json:"ris.dp,omitempty"`
	RefType      string `json:"ris.type,omitempty"`

	ArticleNumber string `json:"rft.artnum,omitempty"`
	ArticleTitle  string `json:"rft.atitle,omitempty"`

	BookTitle    string   `json:"rft.btitle,omitempty"`
	Chronology   string   `json:"rft.chron,omitempty"`
	Edition      string   `json:"rft.edition,omitempty"`
	EISBN        []string `json:"rft.eisbn,omitempty"`
	EISSN        []string `json:"rft.eissn,omitempty"`
	EndPage      string   `json:"rft.epage,omitempty"`
	Genre        string   `json:"rft.genre,omitempty"`
	ISBN         []string `json:"rft.isbn,omitempty"`
	ISSN         []string `json:"rft.issn,omitempty"`
	Issue        string   `json:"rft.issue,omitempty"`
	JournalTitle string   `json:"rft.jtitle,omitempty"`
	PageCount    string   `json:"rft.tpages,omitempty"`
	Pages        string   `json:"rft.pages,omitempty"`
	Part         string   `json:"rft.part,omitempty"`
	Places       []string `json:"rft.place,omitempty"`
	Publishers   []string `json:"rft.pub,omitempty"`
	Quarter      string   `json:"rft.quarter,omitempty"`

	// TODO(miku): we do not need both dates
	RawDate string    `json:"rft.date,omitempty"`
	Date    time.Time `json:"x.date,omitempty"`

	Season     string `json:"rft.ssn,omitempty"`
	Series     string `json:"rft.series,omitempty"`
	ShortTitle string `json:"rft.stitle,omitempty"`
	StartPage  string `json:"rft.spage,omitempty"`
	Volume     string `json:"rft.volume,omitempty"`

	Abstract  string   `json:"abstract,omitempty"`
	Authors   []Author `json:"authors,omitempty"`
	DOI       string   `json:"doi,omitempty"`
	Languages []string `json:"languages,omitempty"`
	URL       []string `json:"url,omitempty"`
	Version   string   `json:"version,omitempty"`

	ArticleSubtitle string   `json:"x.subtitle,omitempty"`
	Fulltext        string   `json:"x.fulltext,omitempty"`
	Headings        []string `json:"x.headings,omitempty"`
	Subjects        []string `json:"x.subjects,omitempty"`
	Type            string   `json:"x.type,omitempty"`

	// Indicator can hold update related information, e.g. in GBI the filedate
	Indicator string `json:"x.indicator,omitempty"`
	// Packages can hold set information, e.g. in GBI the licenced package or GBI database
	Packages []string `json:"x.packages,omitempty"`
	// Labels can carry a list of marks for a given record, e.g. ISILs
	Labels []string `json:"x.labels,omitempty"`

	// OpenAccess, refs. #8986, prototype
	OpenAccess bool     `json:"x.oa,omitempty"`
	License    []string `json:"x.license,omitempty"`

	// Footnote, via solr schema, refs #13653
	Footnotes []string `json:"x.footnotes,omitempty"`
}

IntermediateSchema abstracts and collects the values of various input formats. The goal is to simplify further processing by using a single format, from which subsequent artifacts can be derived, e.g. records for solr indices. This format can be viewed as a catch-all format. The dotted notation hints at the origin of the field, e.g. OpenURL, RIS, finc.

Notes on the format:

* The x namespace is experimental.
* RawDate must be in ISO8601 (YYYY-MM-DD) format.
* Version is mandatory.
* Headings and Subjects are not bound to any format yet.
* Use plural for slices, if possible.

TODO(miku): Clean up naming and date parsing.
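The notes above can be sketched with a trimmed local copy of the struct. The `record` type keeps a handful of the documented fields and tags; the `newRecord` and `encode` helpers, the identifier `ai-49-123` and the other sample values are assumptions for this example, not package API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// intermediateSchemaVersion repeats the value of finc.IntermediateSchemaVersion.
const intermediateSchemaVersion = "0.9"

// record is a trimmed local copy of finc.IntermediateSchema.
type record struct {
	ID           string    `json:"finc.id,omitempty"`
	SourceID     string    `json:"finc.source_id,omitempty"`
	ArticleTitle string    `json:"rft.atitle,omitempty"`
	RawDate      string    `json:"rft.date,omitempty"`
	Date         time.Time `json:"x.date,omitempty"`
	Version      string    `json:"version,omitempty"`
}

// newRecord is a hypothetical constructor. It rejects dates that are not in
// YYYY-MM-DD form and always sets the mandatory version.
func newRecord(id, sourceID, title, rawDate string) (record, error) {
	t, err := time.Parse("2006-01-02", rawDate)
	if err != nil {
		return record{}, fmt.Errorf("rft.date must be YYYY-MM-DD: %w", err)
	}
	return record{
		ID:           id,
		SourceID:     sourceID,
		ArticleTitle: title,
		RawDate:      rawDate,
		Date:         t,
		Version:      intermediateSchemaVersion,
	}, nil
}

// encode returns the JSON form of a record, or an empty string on error.
func encode(r record) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	r, err := newRecord("ai-49-123", "49", "An example title", "2015-06-01")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(encode(r))
}
```

Note that `omitempty` has no effect on a `time.Time` value in `encoding/json`, so a record without a parsed date still emits an `x.date` key holding the zero time.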

func NewIntermediateSchema

func NewIntermediateSchema() *IntermediateSchema

NewIntermediateSchema creates a new intermediate schema document with the current version.

func (*IntermediateSchema) AbstractCleaned added in v0.1.372

func (is *IntermediateSchema) AbstractCleaned() string

AbstractCleaned returns the abstract with HTML tags removed, refs #19964.

func (*IntermediateSchema) Allfields

func (is *IntermediateSchema) Allfields() string

Allfields returns a combination of various fields.

func (*IntermediateSchema) ISBNList

func (is *IntermediateSchema) ISBNList() []string

ISBNList returns a deduplicated list of all ISBN and EISBN.

func (*IntermediateSchema) ISSNList

func (is *IntermediateSchema) ISSNList() []string

ISSNList returns a deduplicated list of all ISSN and EISSN.
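One way to build such a deduplicated list is sketched below. The `uniqueStrings` helper is hypothetical, and keeping first-seen order is an assumption; the documentation does not state how the package orders its results. The identifiers are placeholders, not real ISSNs.

```go
package main

import "fmt"

// uniqueStrings is a hypothetical helper that merges several slices and
// drops repeated values, keeping the first occurrence of each.
func uniqueStrings(lists ...[]string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, list := range lists {
		for _, v := range list {
			if _, ok := seen[v]; ok {
				continue // already emitted
			}
			seen[v] = struct{}{}
			out = append(out, v)
		}
	}
	return out
}

func main() {
	issn := []string{"1234-5678", "2345-6789"}
	eissn := []string{"2345-6789", "3456-7890"}
	// The value shared by both slices appears only once in the result.
	fmt.Println(uniqueStrings(issn, eissn))
}
```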

func (*IntermediateSchema) Imprint

func (is *IntermediateSchema) Imprint() (s string)

Imprint returns the traditional imprint statement, corresponding to MARC 260 subfields a, b and c.

func (*IntermediateSchema) SortableAuthor

func (is *IntermediateSchema) SortableAuthor() string

SortableAuthor is loosely based on getSortableAuthor in SOLRMARC.

func (*IntermediateSchema) SortableTitle

func (is *IntermediateSchema) SortableTitle() string

SortableTitle is loosely based on getSortableTitle in SOLRMARC.

type Solr5Vufind3

type Solr5Vufind3 struct {
	AuthorFacet          []string `json:"author_facet,omitempty"`
	AuthorCorporate      []string `json:"author_corporate,omitempty"`
	Authors              []string `json:"author,omitempty"`
	AuthorSort           string   `json:"author_sort,omitempty"`
	SecondaryAuthors     []string `json:"author2,omitempty"`
	Allfields            string   `json:"allfields,omitempty"`
	DOI                  []string `json:"doi_str_mv,omitempty"` // recommended via https://vufind.org/wiki/development:architecture:solr_index_schema
	Edition              string   `json:"edition,omitempty"`
	FacetAvail           []string `json:"facet_avail"`
	FincClassFacet       []string `json:"finc_class_facet,omitempty"`
	FincClassMv          []string `json:"fincclass_txtF_mv,omitempty"` // refs #25035, #17265
	Footnotes            []string `json:"footnote,omitempty"`
	Formats              []string `json:"format,omitempty"`
	Fullrecord           string   `json:"fullrecord,omitempty"`
	Fulltext             string   `json:"fulltext,omitempty"`
	HierarchyParentTitle []string `json:"hierarchy_parent_title,omitempty"`
	ID                   string   `json:"id,omitempty"`
	Institutions         []string `json:"institution,omitempty"`
	Imprint              string   `json:"imprint,omitempty"`
	ImprintStrMv         []string `json:"imprint_str_mv,omitempty"`
	ISSN                 []string `json:"issn,omitempty"`
	ISSNStrMv            []string `json:"issn_str_mv,omitempty"` // refs. #21393
	ISBN                 []string `json:"isbn,omitempty"`
	ISBNStrMv            []string `json:"isbn_str_mv,omitempty"` // refs. #21393
	Languages            []string `json:"language,omitempty"`
	MegaCollections      []string `json:"mega_collection,omitempty"`
	MatchStr             string   `json:"match_str"`    // do not omit, refs. #21403#note-15
	MatchStrMv           []string `json:"match_str_mv"` // do not omit, refs. #21403#note-15
	PublishDateSort      string   `json:"publishDateSort,omitempty"`
	Publishers           []string `json:"publisher,omitempty"`
	RecordID             string   `json:"record_id,omitempty"`
	RecordType           string   `json:"recordtype,omitempty"`
	RecordFormat         string   `json:"record_format,omitempty"`
	Series               []string `json:"series,omitempty"`
	SourceID             string   `json:"source_id,omitempty"`
	Subtitle             string   `json:"title_sub,omitempty"`
	Title                string   `json:"title,omitempty"`
	TitleFull            string   `json:"title_full,omitempty"`
	TitleShort           string   `json:"title_short,omitempty"`
	TitleSort            string   `json:"title_sort,omitempty"`
	Topics               []string `json:"topic,omitempty"`
	URL                  []string `json:"url,omitempty"`
	PublishDate          []string `json:"publishDate,omitempty"`
	Physical             []string `json:"physical,omitempty"`
	Description          string   `json:"description,omitempty"`
	Collections          []string `json:"collection,omitempty"` // index/wiki/Kollektionsfacette

	ContainerIssue     string `json:"container_issue,omitempty"`
	ContainerStartPage string `json:"container_start_page,omitempty"`
	ContainerTitle     string `json:"container_title,omitempty"`
	ContainerVolume    string `json:"container_volume,omitempty"`

	FormatDe105  []string `json:"format_de105,omitempty"`
	FormatDe14   []string `json:"format_de14,omitempty"`
	FormatDe15   []string `json:"format_de15,omitempty"`
	FormatDe520  []string `json:"format_de520,omitempty"`
	FormatDe540  []string `json:"format_de540,omitempty"`
	FormatDeCh1  []string `json:"format_dech1,omitempty"`
	FormatDed117 []string `json:"format_ded117,omitempty"`
	FormatDeGla1 []string `json:"format_degla1,omitempty"`
	FormatDel152 []string `json:"format_del152,omitempty"`
	FormatDel189 []string `json:"format_del189,omitempty"`
	FormatDeZi4  []string `json:"format_dezi4,omitempty"`
	FormatDeZwi2 []string `json:"format_dezwi2,omitempty"`
	FormatFinc   []string `json:"format_finc,omitempty"`
	FormatNrw    []string `json:"format_nrw,omitempty"`
	BranchNrw    string   `json:"branch_nrw,omitempty"` // refs #11605
}

Solr5Vufind3 is the basic solr 5 schema as of 2016-04-14. It is based on VuFind 3. Same as Solr5Vufind3v12, but with fullrecord field, refs. #8031. TODO(martin): add support for classfinc.toml

func (*Solr5Vufind3) Export

func (s *Solr5Vufind3) Export(is IntermediateSchema, withFullrecord bool) ([]byte, error)

Export fulfills the finc.Exporter interface, so we can plug this into cmd/span-export. Takes an intermediate schema and returns serialized JSON.
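The struct tags above mix fields with and without `omitempty`; the fields marked "do not omit" are always present in the serialized JSON. The sketch below shows the effect on a trimmed local copy of the struct. The `doc` type, the `encodeDoc` helper and the sample ID are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// doc is a trimmed local copy of finc.Solr5Vufind3, keeping the documented tags.
type doc struct {
	ID         string   `json:"id,omitempty"`
	Title      string   `json:"title,omitempty"`
	FacetAvail []string `json:"facet_avail"`
	MatchStr   string   `json:"match_str"`
	MatchStrMv []string `json:"match_str_mv"`
}

// encodeDoc returns the JSON form of a doc, or an empty string on error.
func encodeDoc(d doc) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Title is empty and tagged omitempty, so it is dropped. The three
	// untagged fields are emitted even though they hold zero values.
	fmt.Println(encodeDoc(doc{ID: "ai-1"}))
}
```

With `encoding/json`, an empty string without `omitempty` serializes as `""` and a nil slice as `null`, so consumers of the output should expect both forms for the always-present keys.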

type StrippedSchema

type StrippedSchema struct {
	DOI      string   `json:"doi"`
	Labels   []string `json:"x.labels"`
	SourceID string   `json:"finc.source_id"`
}

StrippedSchema is a snippet of an IntermediateSchema.
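Because the struct reuses three of the IntermediateSchema JSON keys, a full record can be decoded into it and all other fields are ignored. A minimal sketch follows; the `parseStripped` helper and the sample record (DOI, labels, IDs) are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StrippedSchema repeats the documented declaration.
type StrippedSchema struct {
	DOI      string   `json:"doi"`
	Labels   []string `json:"x.labels"`
	SourceID string   `json:"finc.source_id"`
}

// parseStripped is a hypothetical helper that decodes one JSON record.
// Keys without a matching field are skipped by encoding/json.
func parseStripped(line string) (StrippedSchema, error) {
	var s StrippedSchema
	err := json.Unmarshal([]byte(line), &s)
	return s, err
}

func main() {
	line := `{"finc.id":"ai-1","finc.source_id":"49","doi":"10.1000/example","x.labels":["DE-15","DE-14"],"rft.atitle":"ignored"}`
	s, err := parseStripped(line)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(s.SourceID, s.DOI, s.Labels)
}
```

Decoding only these fields avoids allocating the full record when a task needs just the DOI, labels and source ID.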
