Skyhub
Skyhub provides a way of accessing the libgen scimag / sci-hub torrent archive
on a one-off basis. It stands up what looks like a local copy of sci-hub: when
you visit a page, it downloads the relevant article from the torrents and then
serves it up.
Library Documentation
Installation
Quick Start
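# make sure go v1.14+ is installed and $GOPATH/bin is in your PATH
# (on Go 1.18 or newer, `go get` no longer installs binaries; use `go install` with the same arguments)
cd /tmp
GO111MODULE=on go get -v github.com/frrad/skyhub/cmd/skyhub@master
mkdir -p "$HOME/skyhub"
# printf expands \n in every shell; plain echo does not (e.g. in bash)
printf 'DOI,ID\n10.7554/elife.32822,70494267\n' > "$HOME/skyhub/index.csv"
skyhub &
LINK="http://localhost:5000/by-doi/10.7554/elife.32822"
open "$LINK" || xdg-open "$LINK"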
http://localhost:5000/by-doi/10.7554/elife.32822
Slow Start
Skyhub expects to find a file called index.csv in your ~/skyhub directory,
containing a list of DOI,scihub_id pairs ordered by DOI. The full uncompressed
index weighs in at ~3G, but any subset of it should work. For instance, the
one-line index we created in the Quick Start:
DOI,ID
10.7554/elife.32822,70494267
works just fine if you only want to be able to access this DOI.
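If you want a small hand-picked subset instead, the sketch below writes a list of pairs in the layout shown above (a DOI,ID header followed by rows sorted by DOI) and looks a DOI up with a binary search. It is only an illustration, not skyhub's own code: the helper names are made up, the second DOI is a placeholder, and it assumes that "ordered by DOI" means plain string ordering, which may not match what skyhub expects for every DOI.

```python
import bisect
import csv
import io


def write_index(pairs, out):
    """Write (doi, scihub_id) pairs as CSV rows sorted by DOI, after a DOI,ID header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["DOI", "ID"])
    # Assumption: plain string ordering is the ordering skyhub wants.
    for doi, scihub_id in sorted(pairs):
        writer.writerow([doi, scihub_id])


def load_index(src):
    """Read the CSV back into a list of (doi, scihub_id) tuples, skipping the header."""
    reader = csv.reader(src)
    next(reader)  # header row: DOI,ID
    return [(row[0], row[1]) for row in reader if row]


def lookup(rows, doi):
    """Binary-search the sorted rows for a DOI; return its ID or None."""
    dois = [r[0] for r in rows]
    i = bisect.bisect_left(dois, doi)
    if i < len(dois) and dois[i] == doi:
        return rows[i][1]
    return None


# Example: the Quick Start pair plus one made-up placeholder entry.
buf = io.StringIO()
write_index(
    [
        ("10.7554/elife.32822", "70494267"),
        ("10.1000/example", "1"),  # hypothetical DOI and ID, for illustration only
    ],
    buf,
)
rows = load_index(io.StringIO(buf.getvalue()))
print(lookup(rows, "10.7554/elife.32822"))
```

To produce a real file, pass an open file handle for ~/skyhub/index.csv to write_index in place of the StringIO buffer.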
To build a more complete index, see this repo, which has a Makefile for
building the full index from the libgen database dump.
By default, skyhub will attempt to use its bundled torrents.toml file to load
torrents by their infohash. However, having the actual .torrent files can make
loading new torrents much faster. Skyhub will prefer to use any .torrent files
it finds in your ~/skyhub/torrentfiles directory. You can download all
available torrents by navigating to ~/skyhub/torrentfiles and running make.
You can find information about skyhub while it's running by visiting the
/status endpoint.
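If you would rather poke at a running instance from a script than from a browser, something like the following may help. It is a minimal sketch: the helper names are invented, the base address is the one from the Quick Start, and percent-encoding the DOI while leaving "/" alone is an assumption about what the by-doi route accepts. The format of the /status response is not documented here, so the sketch just returns the raw bytes.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # address used in the Quick Start


def by_doi_url(doi, base=BASE_URL):
    """Build the by-doi URL for a DOI, keeping "/" unescaped as in the Quick Start."""
    return f"{base}/by-doi/{urllib.parse.quote(doi, safe='/')}"


def status_url(base=BASE_URL):
    """Build the URL of the /status endpoint."""
    return f"{base}/status"


def fetch(url, timeout=60.0):
    """Return the raw response body; raises urllib.error.URLError if skyhub is not running."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


print(by_doi_url("10.7554/elife.32822"))
print(status_url())
```

With skyhub running, fetch(status_url()) returns whatever the status page serves, and fetch(by_doi_url("10.7554/elife.32822")) returns the article once it has been pulled from the torrents, which can take a while on first access.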
How it works
You can do random access inside zip files as long as you have the archive's
metadata. So, in order to retrieve a paper, you can just download two 16MB
"chunks" from the torrent. The first contains the metadata for the zip file
you want. The offset info in that metadata identifies the second chunk, which
contains the actual paper.
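The idea can be tried out on a toy archive. The sketch below is not skyhub's implementation: it uses Python's zipfile module on an in-memory zip of dummy files, wraps the bytes in a made-up TrackedFile class that records which byte ranges get read, and maps those ranges onto fixed-size chunks. The chunk size is shrunk to 64 KiB so the demo stays small; treating "16MB" as 16 MiB is also an assumption.

```python
import io
import zipfile

CHUNK_SIZE = 16 * 1024 * 1024  # "16MB" from the text; exact value is an assumption
DEMO_CHUNK = 64 * 1024         # much smaller chunk so the toy archive spans several


class TrackedFile:
    """Wraps a bytes buffer and records every byte range that is read."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.ranges = []

    def seek(self, offset, whence=0):
        return self._buf.seek(offset, whence)

    def tell(self):
        return self._buf.tell()

    def seekable(self):
        return True

    def read(self, n=-1):
        start = self._buf.tell()
        out = self._buf.read(n)
        self.ranges.append((start, start + len(out)))
        return out


def chunks_touched(ranges, chunk_size):
    """Map byte ranges to the set of chunk indices they overlap."""
    touched = set()
    for start, end in ranges:
        if end > start:
            touched.update(range(start // chunk_size, (end - 1) // chunk_size + 1))
    return touched


def build_archive(n_members=200, member_size=4096):
    """Create an in-memory zip of dummy 'papers', stored uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for i in range(n_members):
            zf.writestr(f"paper-{i:04d}.pdf", bytes([i % 256]) * member_size)
    return buf.getvalue()


def read_member(archive, name, chunk_size):
    """Read one member; return (data, metadata_chunks, data_chunks)."""
    fh = TrackedFile(archive)
    with zipfile.ZipFile(fh) as zf:
        # Opening the archive reads the central directory at the end of the file.
        meta_chunks = chunks_touched(fh.ranges, chunk_size)
        fh.ranges.clear()
        # Reading a member seeks to the offset stored in that directory.
        data = zf.read(name)
        data_chunks = chunks_touched(fh.ranges, chunk_size)
    return data, meta_chunks, data_chunks


archive = build_archive()
data, meta_chunks, data_chunks = read_member(archive, "paper-0100.pdf", DEMO_CHUNK)
total_chunks = -(-len(archive) // DEMO_CHUNK)  # ceiling division
print(f"touched {len(meta_chunks | data_chunks)} of {total_chunks} chunks")
```

In the torrent setting, the recorded ranges would correspond to the pieces that have to be downloaded; everything else in the archive is never touched.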