Documentation ¶
Overview ¶
Copyright 2015 by Leipzig University Library, http://ub.uni-leipzig.de The Finc Authors, http://finc.info Martin Czygan, <martin.czygan@uni-leipzig.de>
This file is part of some open source application.
Some open source application is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Some open source application is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with Foobar. If not, see <http://www.gnu.org/licenses/>.
@license GPL-3.0+ <http://spdx.org/licenses/GPL-3.0+>
+build linux darwin
Package oaimi implements a few helpers to mirror OAI repositories. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability.
This project aims to make it simple to create a local, single file view of the repository metadata. It comes with a command line tool, called `oaimi`.
Basic usage:
$ oaimi http://digitalcommons.unmc.edu/do/oai/ > metadata.xml
Copyright 2012, Google Inc. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.
Index ¶
- Constants
- Variables
- func WriteFileAtomic(filename string, data []byte, perm os.FileMode) error
- type BatchingClient
- type CachingClient
- type Client
- type HttpRequestDoer
- type Identify
- type ListIdentifiers
- type ListMetadataFormats
- type ListRecords
- type ListSets
- type MaybeCompressedFile
- type OAIError
- type RepositoryInfo
- type Request
- type Response
- type TimeShiftFunc
- type Window
- type WriterClient
Constants ¶
const CompressThreshold = 1024
const Version = "0.2.10"
Version is the current version of the package.
Variables ¶
var (
	ErrFileNotWriteable = errors.New("not opened for writing")
	ErrFileNotReadable  = errors.New("not opened for reading")
)
var (
	ErrNoEndpoint         = errors.New("an endpoint is required")
	ErrNoVerb             = errors.New("no verb")
	ErrBadVerb            = errors.New("bad verb")
	ErrCannotCreatePath   = errors.New("cannot create path")
	ErrNoHost             = errors.New("no host")
	ErrMissingFromOrUntil = errors.New("missing from or until")
	// ErrTooManyRequests might be encountered with broken resumptiontoken implementations.
	ErrTooManyRequests = errors.New("too many requests")
	// Verbose logs actions
	Verbose = false
	// UserAgent to use for requests
	UserAgent = fmt.Sprintf("oaimi/%s (https://github.com/miku/oaimi)", Version)
	// DefaultEarliestDate is used, if the repository does not supply one.
	DefaultEarliestDate = time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	// CutoffDate is used, if the repository reports some earliest date, but which looks unrealistic like year 0007.
	CutoffDate = time.Date(1458, 1, 1, 0, 0, 0, 0, time.UTC)
	// DefaultFormat should be supported by most endpoints.
	DefaultFormat = "oai_dc"
	// DefaultCacheDir
	DefaultCacheDir = ".oaimicache"
	// DefaultClient should suffice for most use cases.
	DefaultClient = NewClient()
	// OAIVerbMap (4. Protocol Requests and Responses)
	OAIVerbMap = map[string]bool{
		"Identify":            true,
		"ListIdentifiers":     true,
		"ListSets":            true,
		"ListMetadataFormats": true,
		"ListRecords":         true,
		"GetRecord":           true,
	}
)
Functions ¶
Types ¶
type BatchingClient ¶ added in v0.2.1
type BatchingClient struct {
	// MaxRequests, zero means no limit. Default of 1024 will prevent endless
	// loop due to broken resumptionToken implementations (e.g.
	// http://goo.gl/KFb9iM).
	MaxRequests int
	// contains filtered or unexported fields
}
BatchingClient takes a single OAI request but will do more than one HTTP request to fulfill it, if necessary.
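The batching behaviour described above can be sketched without the package itself. The following is a minimal illustration, not the library's implementation: `page`, `fetchFunc`, `harvest` and `fakePages` are hypothetical names, and the loop only assumes that a server signals the last page with an empty resumption token.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooManyRequests mirrors the idea behind ErrTooManyRequests; the name
// is local to this sketch.
var errTooManyRequests = errors.New("too many requests")

// page is a hypothetical stand-in for one OAI response: some records plus
// an optional resumption token pointing at the next page.
type page struct {
	records []string
	token   string
}

// fetchFunc is a hypothetical callback that performs one HTTP request.
type fetchFunc func(token string) (page, error)

// harvest follows resumption tokens until the server returns an empty
// token. A maxRequests of zero means no limit.
func harvest(fetch fetchFunc, maxRequests int) ([]string, error) {
	var all []string
	token := ""
	for n := 1; ; n++ {
		if maxRequests > 0 && n > maxRequests {
			return all, errTooManyRequests
		}
		p, err := fetch(token)
		if err != nil {
			return all, err
		}
		all = append(all, p.records...)
		if p.token == "" {
			return all, nil
		}
		token = p.token
	}
}

// fakePages simulates a repository that answers in three pages.
func fakePages(token string) (page, error) {
	switch token {
	case "":
		return page{[]string{"a", "b"}, "t1"}, nil
	case "t1":
		return page{[]string{"c"}, "t2"}, nil
	default:
		return page{[]string{"d"}, ""}, nil
	}
}

func main() {
	records, err := harvest(fakePages, 1024)
	fmt.Println(records, err)

	// A server that never clears its token is stopped by the cap.
	broken := func(string) (page, error) { return page{nil, "same"}, nil }
	_, err = harvest(broken, 5)
	fmt.Println(err)
}
```

The request cap is what keeps a broken resumptionToken implementation from looping forever; with a cap of zero the second call in `main` would never return.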
func NewBatchingClient ¶ added in v0.2.1
func NewBatchingClient() BatchingClient
NewBatchingClient returns a client that batches HTTP requests and uses a resilient HTTP client.
type CachingClient ¶ added in v0.2.1
type CachingClient struct {
	// RootTag is an optional root element.
	RootTag string
	// NameSpaces allow to add custom XML namespace declarations to the root element.
	NameSpaces map[string]string
	// CacheDir stores the directory, where all the downloads go.
	CacheDir string
	// contains filtered or unexported fields
}
CachingClient will write XML to a given writer. This client encapsulates cache logic which helps to make subsequent requests fast. A root element is optional.
func NewCachingClient ¶ added in v0.2.1
func NewCachingClient(w io.Writer) CachingClient
NewCachingClient creates a new client, with a default location for cached files. All XML responses will be written to the given io.Writer.
func NewCachingClientDir ¶ added in v0.2.1
func NewCachingClientDir(w io.Writer, dir string) CachingClient
NewCachingClientDir creates a new client that stores cached files in the given directory. All XML responses will be written to the given io.Writer.
func (CachingClient) Do ¶ added in v0.2.1
func (c CachingClient) Do(req Request) error
Do executes a given request. If the request is not yet cached, the content is retrieved and persisted. Requests are internally split up into weekly windows to reduce load and to reduce latency in case of errors.
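Splitting a date range into weekly windows can be sketched as follows. This is an illustration only: `window`, `weeklyWindows` and `date` are hypothetical names, and how the package aligns or bounds its windows is not documented here.

```go
package main

import (
	"fmt"
	"time"
)

// window is a hypothetical time span with a start and an end.
type window struct {
	From, Until time.Time
}

// date is a small helper that builds a UTC midnight timestamp.
func date(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

// weeklyWindows splits the span from..until into consecutive windows of
// at most seven days. The last window is shortened to end at until.
func weeklyWindows(from, until time.Time) []window {
	var ws []window
	for start := from; start.Before(until); {
		end := start.AddDate(0, 0, 7)
		if end.After(until) {
			end = until
		}
		ws = append(ws, window{From: start, Until: end})
		start = end
	}
	return ws
}

func main() {
	for _, w := range weeklyWindows(date(2015, 1, 1), date(2015, 1, 20)) {
		fmt.Println(w.From.Format("2006-01-02"), w.Until.Format("2006-01-02"))
	}
}
```

With small windows, a failed request only forces a retry of one week of data, and each finished window can be cached on its own.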
func (CachingClient) RequestCacheDir ¶ added in v0.2.1
func (c CachingClient) RequestCacheDir(req Request) (string, error)
RequestCacheDir returns the cache directory for a given request.
type Client ¶ added in v0.2.1
type Client struct {
// contains filtered or unexported fields
}
Client is a simple client that can turn an OAI request into an OAI response.
func NewClient ¶ added in v0.2.1
func NewClient() Client
NewClient creates a default client with a resilient HTTP client.
func NewClientDoer ¶ added in v0.2.1
func NewClientDoer(doer HttpRequestDoer) Client
NewClientDoer creates a new OAI client with a user-supplied HTTP client, e.g. pester.Client, http.DefaultClient.
type HttpRequestDoer ¶ added in v0.2.1
HttpRequestDoer lets us use pester, DefaultClient or other HTTP client implementations interchangeably.
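The interface definition is not shown on this page. A common way to make HTTP clients interchangeable is a one-method interface matching `http.Client.Do`; the sketch below assumes that shape. `requestDoer`, `staticDoer`, `countingDoer` and `fetch` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// requestDoer is an assumed shape for an interface like HttpRequestDoer:
// anything with a Do method matching the one on http.Client.
type requestDoer interface {
	Do(req *http.Request) (*http.Response, error)
}

// staticDoer answers every request with a canned body, without any network.
type staticDoer struct{ body string }

func (s staticDoer) Do(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Body:       io.NopCloser(strings.NewReader(s.body)),
		Request:    req,
	}, nil
}

// countingDoer wraps another doer and counts the requests passing through.
type countingDoer struct {
	next  requestDoer
	count int
}

func (c *countingDoer) Do(req *http.Request) (*http.Response, error) {
	c.count++
	return c.next.Do(req)
}

// fetch performs a GET through any doer and returns the response body.
func fetch(d requestDoer, url string) (string, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	resp, err := d.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	d := &countingDoer{next: staticDoer{body: "<OAI-PMH/>"}}
	body, err := fetch(d, "http://example.org/oai")
	fmt.Println(body, err, d.count)
}
```

Because `*http.Client` also has a matching `Do` method, `http.DefaultClient` could be passed to `fetch` in place of `staticDoer`.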
type Identify ¶ added in v0.2.4
type Identify struct {
	Name              string `xml:"repositoryName,omitempty" json:"name,omitempty"`
	URL               string `xml:"baseURL,omitempty" json:"url,omitempty"`
	Version           string `xml:"protocolVersion,omitempty" json:"version,omitempty"`
	AdminEmail        string `xml:"adminEmail,omitempty" json:"email,omitempty"`
	EarliestDatestamp string `xml:"earliestDatestamp,omitempty" json:"earliest,omitempty"`
	DeletePolicy      string `xml:"deletedRecord,omitempty" json:"delete,omitempty"`
	Granularity       string `xml:"granularity,omitempty" json:"granularity,omitempty"`
	Description       struct {
		Friends    []string `xml:"friends>baseURL,omitempty" json:"friends,omitempty"`
		Identifier struct {
			Scheme               string `xml:"scheme,omitempty" json:"scheme,omitempty"`
			RepositoryIdentifier string `xml:"repositoryIdentifier,omitempty" json:"repositoryIdentifier,omitempty"`
			Delimiter            string `xml:"delimiter,omitempty" json:"delimiter,omitempty"`
			SampleIdentifier     string `xml:"sampleIdentifier,omitempty" json:"sampleIdentifier,omitempty"`
		} `xml:"oai-identifier,omitempty" json:"identifier,omitempty"`
	} `xml:"description,omitempty" json:"description,omitempty"`
}
Identify response.
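To show how such struct tags map onto an Identify response, here is a small sketch using only `encoding/xml`. The types `identify` and `response` are reduced local copies with a subset of the tags above, `parseIdentify` is a hypothetical helper, and the sample document is made up for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// identify is a reduced local copy of the Identify fields shown above.
type identify struct {
	Name              string `xml:"repositoryName"`
	URL               string `xml:"baseURL"`
	Version           string `xml:"protocolVersion"`
	EarliestDatestamp string `xml:"earliestDatestamp"`
	Granularity       string `xml:"granularity"`
}

// response is a reduced envelope holding only what this sketch needs.
type response struct {
	Date     string   `xml:"responseDate"`
	Identify identify `xml:"Identify"`
}

// sample is a made-up Identify response for an imaginary repository.
const sample = `<OAI-PMH>
  <responseDate>2015-01-01T00:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <earliestDatestamp>2001-01-01</earliestDatestamp>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>`

// parseIdentify decodes the Identify element from a response document.
func parseIdentify(data []byte) (identify, error) {
	var r response
	if err := xml.Unmarshal(data, &r); err != nil {
		return identify{}, err
	}
	return r.Identify, nil
}

func main() {
	id, err := parseIdentify([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.Name, id.Version, id.EarliestDatestamp)
}
```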
type ListIdentifiers ¶ added in v0.2.4
type ListIdentifiers struct {
	Header []header        `xml:"header"`
	Token  resumptionToken `xml:"resumptionToken"`
}
ListIdentifiers response.
type ListMetadataFormats ¶ added in v0.2.4
type ListMetadataFormats struct {
	xml.Name `xml:"ListMetadataFormats" json:"formats"`
	Formats  []struct {
		Prefix string `xml:"metadataPrefix" json:"prefix"`
		Schema string `xml:"schema" json:"schema"`
	} `xml:"metadataFormat" json:"format"`
}
ListMetadataFormats response.
type ListRecords ¶ added in v0.2.4
type ListRecords struct {
	Records []struct {
		Header   header `xml:"header"`
		Metadata struct {
			Verbatim string `xml:",innerxml"`
		} `xml:"metadata"`
	} `xml:"record"`
	Token resumptionToken `xml:"resumptionToken"`
}
ListRecords response.
type ListSets ¶ added in v0.2.4
type ListSets struct {
	Sets []struct {
		Spec        string `xml:"setSpec" json:"spec,omitempty"`
		Name        string `xml:"setName" json:"name,omitempty"`
		Description string `xml:"setDescription>dc>description" json:"description,omitempty"`
	} `xml:"set" json:"set"`
	Token resumptionToken `xml:"resumptionToken"`
}
ListSets response.
type MaybeCompressedFile ¶ added in v0.2.9
type MaybeCompressedFile struct {
// contains filtered or unexported fields
}
func CreateMaybeCompressedFile ¶ added in v0.2.9
func CreateMaybeCompressedFile(filename string) *MaybeCompressedFile
CreateMaybeCompressedFile creates a file that may be compressed if a certain amount of data is written to it.
func OpenMaybeCompressedFile ¶ added in v0.2.9
func OpenMaybeCompressedFile(filename string) (*MaybeCompressedFile, error)
OpenMaybeCompressedFile returns a file that may be transparently decompressed on the fly.
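One common way to read data that may or may not be compressed is to peek at the first two bytes and look for the gzip magic number. The sketch below shows that technique on in-memory data. It is an assumption for illustration and not necessarily how MaybeCompressedFile detects compression; `maybeGunzip`, `gzipBytes` and `readString` are hypothetical helpers.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// maybeGunzip returns a reader that yields decompressed data if the input
// starts with the gzip magic bytes (0x1f 0x8b), and the raw data otherwise.
func maybeGunzip(r io.Reader) (io.Reader, error) {
	br := bufio.NewReader(r)
	magic, err := br.Peek(2)
	if err == nil && magic[0] == 0x1f && magic[1] == 0x8b {
		return gzip.NewReader(br)
	}
	// Short or plain input: hand back the buffered reader unchanged.
	return br, nil
}

// gzipBytes compresses a string in memory, for demonstration purposes.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

// readString reads possibly compressed data and returns it as plain text.
func readString(data []byte) (string, error) {
	r, err := maybeGunzip(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	plain, _ := readString([]byte("<record/>"))
	packed, _ := readString(gzipBytes("<record/>"))
	fmt.Println(plain == packed)
}
```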
func (*MaybeCompressedFile) Close ¶ added in v0.2.9
func (f *MaybeCompressedFile) Close() error
func (*MaybeCompressedFile) Name ¶ added in v0.2.9
func (f *MaybeCompressedFile) Name() string
type RepositoryInfo ¶ added in v0.1.9
type RepositoryInfo struct {
	Endpoint string              `json:"endpoint,omitempty"`
	Elapsed  float64             `json:"elapsed,omitempty"`
	About    Identify            `json:"about,omitempty"`
	Formats  ListMetadataFormats `json:"formats,omitempty"`
	Sets     ListSets            `json:"sets,omitempty"`
	Errors   []error             `json:"errors,omitempty"`
}
RepositoryInfo holds some information about the repository.
func AboutEndpoint ¶ added in v0.2.4
func AboutEndpoint(endpoint string, timeout time.Duration) (*RepositoryInfo, error)
AboutEndpoint returns information about a repository. Execution time is limited by the given timeout.
func (RepositoryInfo) MarshalJSON ¶ added in v0.2.4
func (ri RepositoryInfo) MarshalJSON() ([]byte, error)
MarshalJSON formats the RepositoryInfo a bit terser than the default serialization.
type Request ¶
type Request struct {
	Endpoint        string
	Verb            string
	From            time.Time
	Until           time.Time
	Set             string
	Prefix          string
	Identifier      string
	ResumptionToken string
}
Request can hold any parameter that you want to send to an OAI server.
func (*Request) URL ¶
URL returns the absolute URL for a given request. Catches basic errors like missing endpoint or bad verb.
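The validation and URL assembly described here can be sketched with `net/url`. This is not the package's implementation: `request` is a reduced local type, `buildURL` is a hypothetical function, and only the query parameter names `verb`, `metadataPrefix` and `set` come from the OAI-PMH protocol.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Local error values that mirror the kinds of errors listed under Variables.
var (
	errNoEndpoint = errors.New("an endpoint is required")
	errNoVerb     = errors.New("no verb")
	errBadVerb    = errors.New("bad verb")
)

// verbs lists the six OAI-PMH verbs, in the same way as OAIVerbMap.
var verbs = map[string]bool{
	"Identify":            true,
	"ListIdentifiers":     true,
	"ListSets":            true,
	"ListMetadataFormats": true,
	"ListRecords":         true,
	"GetRecord":           true,
}

// request is a reduced, hypothetical version of Request.
type request struct {
	Endpoint, Verb, Prefix, Set string
}

// buildURL checks for a missing endpoint or a bad verb, then encodes the
// remaining parameters into the query string.
func buildURL(r request) (string, error) {
	if r.Endpoint == "" {
		return "", errNoEndpoint
	}
	if r.Verb == "" {
		return "", errNoVerb
	}
	if !verbs[r.Verb] {
		return "", errBadVerb
	}
	u, err := url.Parse(r.Endpoint)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("verb", r.Verb)
	if r.Prefix != "" {
		q.Set("metadataPrefix", r.Prefix)
	}
	if r.Set != "" {
		q.Set("set", r.Set)
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key.
	return u.String(), nil
}

func main() {
	fmt.Println(buildURL(request{Endpoint: "http://example.org/oai", Verb: "ListRecords", Prefix: "oai_dc"}))
	fmt.Println(buildURL(request{Endpoint: "http://example.org/oai", Verb: "Nope"}))
}
```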
func (*Request) UseDefaults ¶ added in v0.2.1
func (r *Request) UseDefaults()
UseDefaults will fill in default values for From, Until and Prefix if they are missing.
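A minimal sketch of filling in missing values follows. Which defaults the package actually picks is an assumption here: the sketch uses a fixed earliest date for From, the supplied current time for Until, and `oai_dc` for Prefix. `request` and `useDefaults` are hypothetical local names.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed defaults, modelled on DefaultEarliestDate and DefaultFormat.
var (
	defaultEarliestDate = time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	defaultFormat       = "oai_dc"
)

// request is a reduced, hypothetical version of Request.
type request struct {
	From, Until time.Time
	Prefix      string
}

// useDefaults fills in only the fields that are still at their zero value.
// The current time is passed in so that the function stays deterministic.
func (r *request) useDefaults(now time.Time) {
	if r.From.IsZero() {
		r.From = defaultEarliestDate
	}
	if r.Until.IsZero() {
		r.Until = now
	}
	if r.Prefix == "" {
		r.Prefix = defaultFormat
	}
}

func main() {
	r := request{Prefix: "marcxml"}
	r.useDefaults(time.Now())
	// Prefix was set explicitly, so it is kept; From and Until are filled in.
	fmt.Println(r.From.Format("2006-01-02"), r.Until.IsZero(), r.Prefix)
}
```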
type Response ¶
type Response struct {
	xml.Name `xml:"response"`
	Date     string `xml:"responseDate"`
	Request  struct {
		Verb     string `xml:"verb,attr"`
		Endpoint string `xml:",chardata"`
	} `xml:"request,omitempty"`
	Error struct {
		Code    string `xml:"code,attr"`
		Message string `xml:",chardata"`
	} `xml:"error"`
	ListIdentifiers     ListIdentifiers     `xml:"ListIdentifiers,omitempty"`
	ListMetadataFormats ListMetadataFormats `xml:"ListMetadataFormats,omitempty" json:"sets"`
	ListSets            ListSets            `xml:"ListSets,omitempty" json:"sets"`
	ListRecords         ListRecords         `xml:"ListRecords,omitempty"`
	Identify            Identify            `xml:"Identify,omitempty" json:"identity,omitempty"`
}
Response can hold most answers to a request to an OAI server.
type WriterClient ¶ added in v0.2.1
type WriterClient struct {
	// RootTag is used as synthetic root element.
	RootTag string
	// MaxRequests, zero means no limit. Default of 4096 will prevent endless
	// loop due to broken resumptionToken implementations (e.g.
	// http://goo.gl/KFb9iM). Zero means no limit.
	MaxRequests int
	// contains filtered or unexported fields
}
WriterClient can execute requests, but writes results to a given writer.
func NewWriterClient ¶ added in v0.2.1
func NewWriterClient(w io.Writer) WriterClient
func (WriterClient) Do ¶ added in v0.2.1
func (c WriterClient) Do(req Request) error
Do will execute a request and write all XML to the writer.
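Writing several XML responses into one stream needs a single enclosing element for the result to be well-formed, which is the role of a synthetic root tag. The sketch below illustrates the idea with plain strings. `writeWrapped` and `wrapToString` are hypothetical helpers, not part of the package, and the real client may emit the root element differently.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// writeWrapped writes all fragments to w. If rootTag is not empty, the
// fragments are enclosed in a synthetic <rootTag>...</rootTag> element.
func writeWrapped(w io.Writer, rootTag string, fragments []string) error {
	if rootTag != "" {
		if _, err := fmt.Fprintf(w, "<%s>", rootTag); err != nil {
			return err
		}
	}
	for _, f := range fragments {
		if _, err := io.WriteString(w, f); err != nil {
			return err
		}
	}
	if rootTag != "" {
		if _, err := fmt.Fprintf(w, "</%s>", rootTag); err != nil {
			return err
		}
	}
	return nil
}

// wrapToString is a convenience wrapper that collects the output in memory.
func wrapToString(rootTag string, fragments []string) (string, error) {
	var sb strings.Builder
	err := writeWrapped(&sb, rootTag, fragments)
	return sb.String(), err
}

func main() {
	out, err := wrapToString("collection", []string{"<record>1</record>", "<record>2</record>"})
	fmt.Println(out, err)
}
```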