# pure-kv

Easy-to-use embedded in-memory key-value store with TTL support and an RPC interface.
## Features

- no third-party dependencies;
- uses a sharded concurrent hash map internally;
- a time-to-live (TTL) can be set per key;
- can be used both as an embedded store and as a remote RPC server (an RPC client is provided);
- the full state can be dumped to disk and restored back.
## Install

```
go get github.com/gasparian/pure-kv
```
## Run server

```
make
./purekv --port 6666 \
         --shards 32 \
         --ttl_garbage_timeout 60000
```
## Usage

The only thing you need to know about pure-kv: it supports just set/get/del operations with an optional time-to-live, and it stores byte slices only. So serialize your data first, with gob for example, and put the resulting bytes into the store.
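Purely as an illustration of that workflow, here is a minimal sketch of encoding a struct with gob into a byte slice (the `Event` struct is hypothetical; the comments note where `Set`/`Get` from the API below would be used):

```go
package main

import (
	"bytes"
	"encoding/gob"
	"log"
)

// Event is a hypothetical struct used only to illustrate serialization.
type Event struct {
	ID   int
	Name string
}

func main() {
	// Encode the struct into a byte slice with gob;
	// buf.Bytes() is what you would pass to store.Set.
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(Event{ID: 1, Name: "login"}); err != nil {
		log.Fatal(err)
	}
	raw := buf.Bytes() // e.g. store.Set("event:1", raw, 0)

	// Later, decode the bytes returned by store.Get back into the struct.
	var decoded Event
	if err := gob.NewDecoder(bytes.NewReader(raw)).Decode(&decoded); err != nil {
		log.Fatal(err)
	}
	log.Printf("decoded: %+v", decoded)
}
```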
### Embedded

You can use the store directly in your Go code. Here is an example:
```go
import (
    "log"
    "time"

    "github.com/gasparian/pure-kv/pkg/purekv"
)

// Create a new store
store := purekv.NewStore(
    32,    // number of shards for the concurrent map
    20000, // TTL garbage collector timeout in ms;
           // setting 0 turns off key removal entirely
)
defer store.Close() // needed to finalize background work

// Set a key with a TTL of 1 s
err := store.Set("someKey", []byte{'a'}, 1000)
if err != nil {
    log.Fatal(err)
}

// Get the current size of the store
storeSize := store.Size()
log.Printf("Number of keys in store: %v", storeSize)

// Get the value that has been set before
val, err := store.Get("someKey")
if err != nil {
    log.Fatal(err)
}
log.Printf("Value got from store: %v", val)

// Check that the key became stale
time.Sleep(time.Duration(1) * time.Second)
ok := store.Has("someKey")
if ok {
    log.Fatal("Key should already be cleaned up")
}

// Setting a key with a TTL of 0 means that it
// will live until you delete it manually
store.Set("anotherKey", []byte{'a'}, 0)
store.Del("anotherKey")

// Check that it's not present anymore
ok = store.Has("anotherKey")
if ok {
    log.Fatal("Key `anotherKey` should not be present")
}

// Dump the store to disk
path := "/path/to/db/dump"
err = store.Dump(path)
if err != nil {
    log.Fatal(err)
}

// Load the store back
storeLoaded, err := purekv.Load(path)
if err != nil {
    log.Fatal(err)
}
defer storeLoaded.Close()
log.Printf("Store loaded successfully, size: %v", storeLoaded.Size())

// Drop all entries in the store and check it
err = store.DelAll()
if err != nil {
    log.Fatal(err)
}
storeSize = store.Size()
if storeSize != 0 {
    log.Fatal("Store should not contain entries after deletion")
}
```
### RPC

Run the server:
```go
package main

import (
    pkv "github.com/gasparian/pure-kv/internal/server"
)

func main() {
    srv := pkv.InitServer(
        6666,  // port
        32,    // number of shards for the concurrent map
        60000, // TTL garbage collector timeout in ms
    )
    srv.Start() // <-- blocks
}
```
Client usage example:
```go
package main

import (
    "log"

    pkv "github.com/gasparian/pure-kv/internal/client"
)

func main() {
    // Create a client instance, providing the server address and a timeout in ms
    cli := pkv.New("0.0.0.0:6666", 500)
    err := cli.Open()
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()

    // Create a new key-value pair with a TTL of 1 s
    err = cli.Set("someKey", []byte{'a'}, 1000)
    if err != nil {
        log.Fatal(err)
    }

    // Get the value that has been set
    tmpVal, ok := cli.Get("someKey")
    if !ok {
        log.Fatal("Can't get the value")
    }
    log.Println(tmpVal)

    // Get the number of elements in the store
    size := cli.Size()
    log.Println(size)

    // Asynchronously delete the value
    err = cli.Del("someKey")
    if err != nil {
        log.Fatal(err)
    }

    // Recreate the entire db, deleting everything
    cli.DelAll()
}
```
## Tests

You can run tests for a particular package or leave the argument empty to run all the tests:

```
make test
```
Run the store benchmark:

```
make benchmark
```

The main idea of this benchmark is to see how concurrent access and the number of shards in the store affect its performance.
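As an illustration of that idea only (this is not the repository's actual benchmark code), such a comparison could be sketched with Go's `testing` package, varying the shard count and running sets in parallel:

```go
package purekv_test

import (
	"strconv"
	"testing"

	"github.com/gasparian/pure-kv/pkg/purekv"
)

// BenchmarkConcurrentSet runs concurrent Set calls against stores
// with different shard counts to compare throughput.
func BenchmarkConcurrentSet(b *testing.B) {
	for _, shards := range []int{1, 4, 8, 32} {
		b.Run("shards="+strconv.Itoa(shards), func(b *testing.B) {
			store := purekv.NewStore(shards, 0) // TTL cleanup disabled
			defer store.Close()
			val := []byte{'a'}
			b.ResetTimer()
			b.RunParallel(func(pb *testing.PB) {
				i := 0
				for pb.Next() {
					_ = store.Set("key"+strconv.Itoa(i%100), val, 0)
					i++
				}
			})
		})
	}
}
```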
## Benchmark results

I ran the benchmark on my laptop with the following configuration:

```
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
gomaxprocs: 12
```
Below you can find graphs for a simulation of 100 serial/concurrent sets/gets.

The first case is with TTL turned off:

The second case is with TTL enabled:
In both cases you can see an improvement from concurrent access, but the effect of adding more shards to the concurrent map fades quickly, after roughly 4-8 shards.
The largest improvement belongs to concurrent sets: >30% for the experiment without key TTL and >200% for the experiment with TTL (at a maximum of 32 shards).
At the same time, "gets" look more "stable": a single shard (which is effectively just a map guarded by a mutex) already does the job.
The methodology of the experiment is simple and far from ideal, of course; larger-scale experiments could show a bigger improvement.
## Things to improve

Currently, keys with expiry times are stored in a min-heap, and this heap is updated with new entries through channels, one per shard. On store start-up a number of goroutines equal to the number of shards is spawned, which may not be the best solution (see the code here). On the other hand, by default these goroutines just wait for a new entry on the channel, which arrives on the next Set operation when a TTL has been set; this could become a problem under heavy write load, where these goroutines could take up more and more CPU time.
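For reference, a simplified sketch of this pattern (illustrative only, not the actual implementation; all names and types are made up):

```go
package ttlsketch

import (
	"container/heap"
	"time"
)

// expEntry is an illustrative item: a key and its expiry time.
type expEntry struct {
	key      string
	expireAt time.Time
}

// expHeap is a min-heap ordered by expiry time.
type expHeap []expEntry

func (h expHeap) Len() int            { return len(h) }
func (h expHeap) Less(i, j int) bool  { return h[i].expireAt.Before(h[j].expireAt) }
func (h expHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *expHeap) Push(x interface{}) { *h = append(*h, x.(expEntry)) }
func (h *expHeap) Pop() interface{} {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// ttlWorker is a per-shard goroutine: it receives new entries on a channel
// (sent by Set when a TTL is given) and periodically pops expired keys,
// deleting them from the shard via the provided del callback.
func ttlWorker(entries <-chan expEntry, del func(key string), tick time.Duration) {
	h := &expHeap{}
	ticker := time.NewTicker(tick)
	defer ticker.Stop()
	for {
		select {
		case e, ok := <-entries:
			if !ok {
				return
			}
			heap.Push(h, e)
		case now := <-ticker.C:
			for h.Len() > 0 && (*h)[0].expireAt.Before(now) {
				del(heap.Pop(h).(expEntry).key)
			}
		}
	}
}
```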