document-benchmark
This is a Go application (originally written by Dvir Volk) that supports reading, indexing, and searching documents with two search engines, RediSearch and Elasticsearch, using the following datasets:
- Wikipedia Abstract Data Dumps: page abstracts from the English-language Wikipedia:Database dumps. This use case generates 3 TEXT fields per document.
- pmc: full-text benchmark with academic papers from PMC (PubMed Central).
Getting Started
Download standalone binaries (no Golang needed)
If you don't have Go on your machine and just want to use the prebuilt binaries, you can download them from:
https://github.com/RediSearch/RediSearchBenchmark/releases/latest
Here's a bash script to download the binary and try it:
wget -c https://github.com/RediSearch/RediSearchBenchmark/releases/latest/download/document-benchmark-$(uname -mrs | awk '{ print tolower($1) }')-$(dpkg --print-architecture).tar.gz -O - | tar -xz
# give it a try
./document-benchmark --help
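The one-liner above relies on dpkg --print-architecture, which is only available on Debian-based systems. As a rough sketch for other systems, the architecture suffix could be derived from uname -m instead. The helper name deb_arch is hypothetical, only two common architectures are mapped, and it is an assumption that the release page actually publishes an archive for your OS and architecture under this naming scheme.

```shell
# Hypothetical helper: map `uname -m` output to the architecture names that
# `dpkg --print-architecture` would print. Only two common cases are covered.
deb_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: ${1:-$(uname -m)}" >&2; return 1 ;;
  esac
}

# Build the archive name the same way the one-liner above does:
# lowercased kernel name, then the architecture suffix.
os="$(uname -s | tr '[:upper:]' '[:lower:]')"
echo "document-benchmark-${os}-$(deb_arch).tar.gz"
```

If the printed name matches an asset on the releases page, it can be substituted into the wget URL above in place of the two command substitutions.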
Installation in a Go environment
The easiest way to get and install the benchmark utility in a Go environment is to fetch it with go get and then build it with make:
# Fetch this repo
go get github.com/RediSearch/RediSearchBenchmark
cd $GOPATH/src/github.com/RediSearch/RediSearchBenchmark
make
Try it out
To try it out locally, we can use Docker to spin up both a Redis and an Elasticsearch environment:
sudo sysctl -w vm.max_map_count=262144
docker run -d -p 9200:9200 -p 9300:9300 -e "ELASTIC_PASSWORD=password" docker.elastic.co/elasticsearch/elasticsearch:8.3.3
docker run -d -p 6379:6379 redis/redis-stack:edge
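Both containers start in the background, so the ports may not accept connections right away. A minimal sketch of a readiness check, assuming bash (it relies on bash's /dev/tcp pseudo-device) and using a hypothetical helper name wait_for_port, could look like this:

```shell
# Hypothetical helper: wait until a TCP port accepts connections.
# Usage: wait_for_port HOST PORT [TRIES]   (one attempt per second, default 30)
# Relies on bash's /dev/tcp pseudo-device, so run it with bash.
wait_for_port() {
  host="$1"; port="$2"; tries="${3:-30}"
  while [ "$tries" -gt 0 ]; do
    # The subshell opens the connection and closes it again on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    # Pause between attempts, but not after the last one.
    if [ "$tries" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
```

For the setup above this would be called as wait_for_port 127.0.0.1 6379 and wait_for_port 127.0.0.1 9200. An open port only shows that the process is listening; Elasticsearch in particular may need more time before it can serve requests.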
- Retrieve the Wikipedia dataset (the steps below populate each engine with 100000 documents):
wget https://s3.amazonaws.com/benchmarks.redislabs/redisearch/datasets/enwiki-abstract/enwiki-latest-abstract.xml
- Populate into RediSearch:
./bin/document-benchmark -hosts "127.0.0.1:6379" -engine redis -file enwiki-latest-abstract.xml -maxdocs 100000
- Populate into Elasticsearch:
./bin/document-benchmark -hosts "https://127.0.0.1:9200" -engine elastic -password "password" -file enwiki-latest-abstract.xml -maxdocs 100000
- Run the RediSearch benchmark:
./bin/document-benchmark -hosts "127.0.0.1:6379" -engine redis -benchmark search -file enwiki-latest-abstract.xml
- Run the Elasticsearch benchmark:
./bin/document-benchmark -hosts "https://127.0.0.1:9200" -engine elastic -password "password" -file enwiki-latest-abstract.xml -benchmark search