github.com/gnames/bhlindex
bhlindex
command
Version: v0.10.0
Published: Jun 23, 2020
License: MIT, MIT
Imports: 1
Imported by: 0
README
Files in this directory contain a binary release of the bhlindex tool. The bhlindex
tool finds and records scientific names occurring on more than 50 million pages
collected by the Biodiversity Heritage Library.
Requirements
Laptop or server with >= 8GB of system memory and 500GB of free disk space.
SSD storage is recommended.
An empty PostgreSQL server with a user that is able to create new databases.
Biodiversity Heritage Library text files. For testing purposes, use the files
located in the testdata directory of this project.
You have to set up environment variables that configure access to the BHL files
and the database server.
POSTGRES_DB
: database created for bhlindex
POSTGRES_HOST
: IP address or hostname where the PostgreSQL database is installed
POSTGRES_USER
: user that has access to POSTGRES_DB
BHL_DIR
: root of the BHL directory, which contains $BHL_DIR/ocr/bhl1, $BHL_DIR/ocr/bhl2, etc.
PREF_SOURCES
: IDs of data sources from http://resolver.globalnames.org/data_sources.
They have to be a list of integers separated by commas, for example
PREF_SOURCES=1,2,3
The variables with values for the development
environment can be found in the .env.dev file. To export the variables
into bash or zsh:
source .env.dev
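For illustration, an env file of this kind might look like the sketch below. Every value is a placeholder assumption and is not copied from the project's .env.dev. The export keyword is what makes the variables visible to child processes started from the same shell.

```shell
# Hypothetical env file for bhlindex; all values are placeholders.
export POSTGRES_DB=bhlindex        # database created for bhlindex
export POSTGRES_HOST=localhost     # IP address or hostname of the PostgreSQL server
export POSTGRES_USER=postgres      # user with access to POSTGRES_DB
export BHL_DIR=/opt/bhl            # must contain ocr/bhl1, ocr/bhl2, etc.
export PREF_SOURCES=1,2,3          # comma-separated integer data source IDs
```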
The password for the Postgres user should either be empty or set via a
.pgpass file.
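As a rough sketch of the second option, a PostgreSQL password file holds one entry per line in the form hostname:port:database:username:password. The helper name write_pgpass and all values in the usage example are assumptions made for illustration. PostgreSQL's libpq ignores the file on Unix systems unless access for group and others is removed, which is why the helper also tightens the permissions.

```shell
# write_pgpass is a hypothetical helper, not part of bhlindex.
# Arguments: $1 = target file, $2 = host, $3 = port,
#            $4 = database, $5 = user, $6 = password
write_pgpass() {
  # Append one entry in the hostname:port:database:username:password format.
  printf '%s:%s:%s:%s:%s\n' "$2" "$3" "$4" "$5" "$6" >> "$1"
  # Restrict the file to its owner so that libpq-style clients accept it.
  chmod 600 "$1"
}
```

A call could look like write_pgpass "$HOME/.pgpass" localhost 5432 bhlindex postgres secret, with the placeholder values replaced by those of the actual server.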
Usage
To check the GitHub commit version and the date of compilation, use
./bhlindex version
To create the index execute
./production.sh
If you want to read the environment variables from a file
source /dir/to/env_file && ./production.sh
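Because a missing or malformed variable may only surface once indexing has started, a pre-flight check can be useful. The function below is a minimal sketch under stated assumptions: check_bhlindex_env is a hypothetical helper that is not shipped with bhlindex, and it verifies only that the five variables described above are non-empty and that PREF_SOURCES is a comma-separated list of integers.

```shell
# check_bhlindex_env is a hypothetical helper, not part of bhlindex.
# It returns 0 when the environment looks usable and 1 otherwise.
check_bhlindex_env() {
  rc=0
  for name in POSTGRES_DB POSTGRES_HOST POSTGRES_USER BHL_DIR PREF_SOURCES; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      rc=1
    fi
  done
  # PREF_SOURCES must be integers separated by commas, for example 1,2,3.
  if [ -n "${PREF_SOURCES:-}" ] &&
     ! printf '%s\n' "$PREF_SOURCES" | grep -Eq '^[0-9]+(,[0-9]+)*$'; then
    echo "PREF_SOURCES must be a comma-separated list of integers" >&2
    rc=1
  fi
  return "$rc"
}
```

With the function defined in the current shell, check_bhlindex_env && ./production.sh starts the indexing only when the check passes.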