Sibyl
Search endpoint and indexer for Deichman Public Library.
🗒 Detailed documentation | 🌴 Schema
Purpose
- Provide an easy-to-use search endpoint for the applications
- Index data from multiple source systems
- Contain (business) rules for ranking content
Relations to other components
- Search API used by deichman.no, Fuge and Tjenestekatalog
- Euler sends requests to Sibyl when data is changed
- Reads data from Koha DB
- Reads data from MariaDB SQL DB
  - Used as a queue and cache
  - Same instance as Koha is running on, but different database
- Reads data from Euler
Technology used
- Go (for the API and indexing orchestration)
- ElasticSearch as a search engine
Technical
- Changes are queued in memory while awaiting processing
  - The queue is volatile, so queued changes are lost if Sibyl shuts down
  - NOTE: Verify this; there are some mechanisms for persisting the queue in a database.
API
Status
GET /_status : status endpoint for queue length and cache size
Index admin
GET /admin/setup_indexes : list supported indices
POST /admin/setup_indexes : (re-)create all indices
POST /admin/setup_index/{name} : (re-)create given index
GET /admin/index_from_cache : populate all indices from cache
GET /admin/index_from_cache/{name} : populate given index from cache
POST /admin/build_autocomplete : build auto-complete index
Cache update (including reindexing)
GET /resource : update cache for all resources and queue for indexing
POST /resource : update cache for given resources and queue for indexing
DELETE /resource : delete given resources from cache and index
GET /resource/{type} : update cache for given resource type and queue for indexing
Search
GET /search/authority : search authority index
GET /search/branch : search branch index
GET /search/campaign : search campaign index
GET /search/page : list all pages
GET /search/editorialcontent/{type} : search for editorial content
GET /search/service : search service index
GET /search/library_event : search library_event index
GET /search/publication : search publication index
POST /search/publicationByIDs : search publications by IDs
POST /search/{type}/_search : proxy to ES
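Since `POST /search/{type}/_search` proxies to ES, a client can presumably send an Elasticsearch query DSL body to it. The sketch below builds such a request with Go's standard library. The base URL `http://sibyl.example:8080`, the `title` field, and the helper name `newProxySearchRequest` are assumptions for illustration; check the schema for the real field names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newProxySearchRequest builds a POST /search/{type}/_search request whose
// body is an Elasticsearch query DSL document encoded as JSON.
func newProxySearchRequest(baseURL, indexType string, query map[string]any) (*http.Request, error) {
	endpoint := strings.TrimRight(baseURL, "/") +
		"/search/" + url.PathEscape(indexType) + "/_search"

	body, err := json.Marshal(query)
	if err != nil {
		return nil, fmt.Errorf("encode query: %w", err)
	}

	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// A standard Elasticsearch "match" query; the field name is hypothetical.
	query := map[string]any{
		"query": map[string]any{
			"match": map[string]any{"title": "ibsen"},
		},
	}

	req, err := newProxySearchRequest("http://sibyl.example:8080", "publication", query)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// Pass req to an http.Client's Do method to actually send it.
}
```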
Autocomplete
GET /search/autocomplete : autocomplete suggestions based on db data
Environment variables
INTERNAL_URL_ELASTICSEARCH
- ElasticSearch URL used for search queries and indexing
INTERNAL_URL_EULER
- Euler URL used when fetching data to index
SIBYL_INDEX_WORKERS
- Number of goroutines processing indexing requests in parallel
SIBYL_INDEXER_FREQ
- Frequency at which Sibyl checks if the queue contains any documents that need indexing
SIBYL_KOHA_COLLECTOR_FREQ
- Frequency at which Sibyl fetches biblio data from Koha
SIBYL_PROCESSING_MAX_IDLE
- Max idle time between ad hoc indexing requests before Sibyl starts processing
SPARQL_ENDPOINT
- SPARQL endpoint used when fetching data to index