Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrBlobNotFound can be used for unfetchable blobs.
	ErrBlobNotFound   = errors.New("blob not found")
	ErrBackendsFailed = errors.New("all backends failed")
)
Functions ¶
This section is empty.
Types ¶
type BlobServer ¶
type BlobServer struct {
BaseURL string
}
BlobServer implements access to a running microblob instance.
type FetchGroup ¶
type FetchGroup struct {
Backends []Fetcher
}
FetchGroup runs an index data fetch operation in a cascade over several backends.
func (*FetchGroup) Fetch ¶
func (g *FetchGroup) Fetch(id string) ([]byte, error)
Fetch constructs a URL from a template and retrieves the blob.
func (*FetchGroup) Ping ¶
func (g *FetchGroup) Ping() error
Ping is a healthcheck. Solr typically responds with 404 on the URL without any handler; http://localhost:8085/solr/biblio/admin/ping
type Pinger ¶
type Pinger interface {
Ping() error
}
Pinger performs a simple health check.
type Response ¶
type Response struct {
	ID        string            `json:"id"`
	DOI       string            `json:"doi"`
	Citing    []json.RawMessage `json:"citing,omitempty"`
	Cited     []json.RawMessage `json:"cited,omitempty"`
	Unmatched struct {
		Citing []json.RawMessage `json:"citing,omitempty"`
		Cited  []json.RawMessage `json:"cited,omitempty"`
	} `json:"unmatched"`
	Extra struct {
		Took                 float64 `json:"took"`
		UnmatchedCitingCount int     `json:"unmatched_citing_count"`
		UnmatchedCitedCount  int     `json:"unmatched_cited_count"`
		CitingCount          int     `json:"citing_count"`
		CitedCount           int     `json:"cited_count"`
		Cached               bool    `json:"cached"`
	} `json:"extra"`
}
Response contains a subset of index data fused with citation data. Citing and cited documents are unparsed. For unmatched docs, we only transmit the DOI, e.g. as {"doi": "10.123/123"}.
type Server ¶
type Server struct {
	IdentifierDatabase *sqlx.DB
	OciDatabase        *sqlx.DB
	IndexData          Fetcher
	// Router to register routes on.
	Router *mux.Router
	// StopWatch is a builtin, simplistic tracer.
	StopWatchEnabled bool
	// Cache related configuration. We only want to cache expensive requests,
	// e.g. requests that take longer than CacheTriggerDuration to compute.
	CacheEnabled           bool
	CacheTriggerDuration   time.Duration
	CacheDefaultExpiration time.Duration
	CacheTTL               time.Duration
	// contains filtered or unexported fields
}
Server wraps three data sources required for index and citation data fusion. The IdentifierDatabase maps a local identifier (e.g. 0-1238201) to a DOI, the OciDatabase contains citing and cited relationships from the OCI/COCI citation corpus, and IndexData allows fetching a metadata blob from a service, e.g. a key-value store like microblob, sqlite3, solr, elasticsearch or an in-memory store.
TODO: The server should be able to work with multiple Fetcher instances, e.g. to roll over to a new version or to use one for different data stores.
server
  |
  v
fetcher
  |
  |_________ ....
  v         |
fetcher[main] `-> fetcher[ai]
  |                |
  v                v
db[main]         db[ai]
(daily)          (monthly)
type SolrBlob ¶
type SolrBlob struct {
BaseURL string
}
SolrBlob implements access to a running Solr instance. The base URL would be something like http://localhost/solr/biblio (i.e. without the select part of the path).
func (*SolrBlob) Ping ¶
Ping is a healthcheck. Solr typically responds with 404 on the URL without any handler; http://localhost:8085/solr/biblio/admin/ping
type SqliteBlob ¶
SqliteBlob serves index documents from a sqlite database.
type StopWatch ¶
StopWatch allows to record events over time and render them in a pretty table. Example log output (via stopwatch.LogTable()).
2021/09/29 17:22:40 timings for hTHc
> XVlB    0    0s              0.00    started query for: ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTIxMC9qYy4yMDExLTAzODU
> XVlB    1    134.532µs       0.00    found doi for id: 10.1210/jc.2011-0385
> XVlB    2    67.918529ms     0.24    found 0 outbound and 4628 inbound edges
> XVlB    3    32.293723ms     0.12    mapped 4628 dois back to ids
> XVlB    4    3.358704ms      0.01    recorded unmatched ids
> XVlB    5    68.636671ms     0.25    fetched 2567 blob from index data store
> XVlB    6    105.771005ms    0.38    encoded JSON
> XVlB    -    -               -       -
> XVlB    S    278.113164ms    1.00    total
By default, a stopwatch is disabled, which means all functions will be noops. Use SetEnabled to toggle the mode.
func (*StopWatch) LogTable ¶
func (s *StopWatch) LogTable()
LogTable writes a table using standard library log facilities.
func (*StopWatch) SetEnabled ¶
SetEnabled enables or disables the stopwatch. If disabled, any call will be a noop.
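The disabled-by-default behavior can be illustrated with a small stand-in type. This is only a sketch of the pattern, not the StopWatch implementation: miniWatch, its fields, and the Record method are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// entry is one recorded event.
type entry struct {
	at  time.Time
	msg string
}

// miniWatch is a hypothetical stand-in for a stopwatch whose zero value
// is disabled.
type miniWatch struct {
	enabled bool
	entries []entry
}

// SetEnabled enables or disables recording.
func (w *miniWatch) SetEnabled(v bool) { w.enabled = v }

// Record stores an event, or does nothing while the watch is disabled.
func (w *miniWatch) Record(msg string) {
	if !w.enabled {
		return
	}
	w.entries = append(w.entries, entry{at: time.Now(), msg: msg})
}

func main() {
	var w miniWatch
	w.Record("ignored") // noop: the zero value is disabled
	w.SetEnabled(true)
	w.Record("started query")
	fmt.Println(len(w.entries))
}
```

Making the disabled state the zero value lets callers leave tracing calls in place permanently and pay almost nothing when tracing is off.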