buckets

package
v0.0.0-...-147f0cf
Published: Oct 27, 2023 License: MIT Imports: 6 Imported by: 0

README

Bucket Store

The bucketstore provides embedded and non-embedded key-value persistence for Hub services, with a focus on simplicity. It has a standardized key-value API and can support multiple backends.

The bucketstore is not a general-purpose database. It is intended to be simple to use and to meet basic storage needs. If you need multiple indexes, this store is not the right choice.

Concept

The API of the bucket store follows this concept:

open store -> read/write bucket -> iterate using cursor

Where:

  • Store is a database instance per client. Client being a service that needs persistence.
  • Bucket is a collection of key-value pairs in the store. Supported operations are get (multiple), set (multiple), and delete.
  • Cursor is an iterator over a bucket. It can move to the first, last, next, or previous key, and seek to a specific key.

That is all there is to it. No magic.
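The flow above can be sketched with a toy in-memory store. The types memStore and memBucket and the functions openStore, Iterate, and walkthrough are hypothetical names made up for this illustration; they are not part of the bucketstore package, which has its own backends and a cursor interface.

```go
package main

import (
	"fmt"
	"sort"
)

// memStore and memBucket are hypothetical in-memory stand-ins,
// not types from the bucketstore package.
type memBucket struct{ data map[string][]byte }

type memStore struct{ buckets map[string]*memBucket }

// openStore stands in for creating and opening a backend.
func openStore() *memStore {
	return &memStore{buckets: make(map[string]*memBucket)}
}

// GetBucket returns the bucket with the given ID and creates it if needed.
func (s *memStore) GetBucket(id string) *memBucket {
	b, found := s.buckets[id]
	if !found {
		b = &memBucket{data: make(map[string][]byte)}
		s.buckets[id] = b
	}
	return b
}

// Set stores a copy of value under key.
func (b *memBucket) Set(key string, value []byte) {
	b.data[key] = append([]byte(nil), value...)
}

// Iterate visits all pairs in key order, which is the job of a cursor.
func (b *memBucket) Iterate(visit func(key string, value []byte)) {
	keys := make([]string, 0, len(b.data))
	for k := range b.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		visit(k, b.data[k])
	}
}

// walkthrough runs the three steps and returns the pairs in iteration order.
func walkthrough() []string {
	store := openStore()                 // 1. open store
	bucket := store.GetBucket("thing-1") // 2. read/write bucket
	bucket.Set("temperature", []byte("21.5"))
	bucket.Set("humidity", []byte("40"))
	var lines []string
	bucket.Iterate(func(k string, v []byte) { // 3. iterate in key order
		lines = append(lines, k+"="+string(v))
	})
	return lines
}

func main() {
	for _, line := range walkthrough() {
		fmt.Println(line)
	}
}
```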

Backends

Short description of the supported backends.

Note that the implementation hasn't been optimized for performance and that the default settings are being used. Pebble and BoltDB in particular have many capabilities and tricks that might significantly improve read/write performance.

kvbtree

The kvbtree backend is an embedded in-memory store using a btree as a store per client. Data is serialized and persisted to one file per client. Data is periodically written to disk after modifications are made.

This backend is exceptionally fast and is the fastest of the available backends for both reading and writing. A read or a write takes less than 1 usec per record, so a rate of a million reads or writes per second is possible.

All the data is kept in memory, so the capacity depends on available memory, much like Redis. Data is not compressed. When the store is updated, a background process periodically takes a snapshot (shallow copy) and writes it to disk. It first writes to a temporary file and, when that succeeds, renames the temporary file into place. This avoids the risk of data corruption.
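The write-to-temporary-file-then-rename step can be sketched as follows. This is a minimal illustration using only the Go standard library. The function names writeSnapshot and roundTrip, the JSON encoding, and the file names are assumptions made for the example and do not describe how kvbtree actually serializes its data.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeSnapshot serializes the snapshot to a temporary file in the same
// directory as path, and renames it to path only after the write succeeded.
// A reader therefore sees either the old file or the complete new one.
func writeSnapshot(path string, snapshot map[string][]byte) error {
	data, err := json.Marshal(snapshot) // encoding is an assumption of this sketch
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), "snapshot-*.tmp")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	if _, err = tmp.Write(data); err == nil {
		err = tmp.Sync() // flush to disk before the rename
	}
	if closeErr := tmp.Close(); err == nil {
		err = closeErr
	}
	if err != nil {
		os.Remove(tmpName) // leave the previous store file untouched
		return err
	}
	if err = os.Rename(tmpName, path); err != nil {
		os.Remove(tmpName)
	}
	return err
}

// roundTrip writes two snapshots to the same file and reads the last one back.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "kvbtree-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "client1.json")
	if err := writeSnapshot(path, map[string][]byte{"key1": []byte("old")}); err != nil {
		return "", err
	}
	if err := writeSnapshot(path, map[string][]byte{"key1": []byte("value1")}); err != nil {
		return "", err
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	restored := map[string][]byte{}
	if err := json.Unmarshal(raw, &restored); err != nil {
		return "", err
	}
	return string(restored["key1"]), nil
}

func main() {
	value, err := roundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println("restored key1 =", value)
}
```

Creating the temporary file in the same directory as the target keeps the rename on one filesystem, which is what makes the replacement a single step.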

This store is best suited for a limited amount of data, bounded by available memory, that is frequently read and updated. The recommended data limit is 100MB. Testing has shown

pebble

The pebble backend is CockroachDB's persistence layer. It is all around awesome and probably a bit overkill.

It is also an embedded database, which explains its very high speed. While kvbtree is around 5-10 times faster, pebble is still very fast, with both reading and writing taking approx 1-2 usec per record (see BucketBench_test.go for details).

Pebble's data size is pretty much limited only by the available disk space. With 1TB of disk you can store 1TB without suffering too much of a performance penalty (although this has only been tested with about 10 million records). Data is compressed, so actual disk usage is likely to be less.

This store is best suited for large amounts of data. For example, the time series data of the history store.

bolts

The boltDB (bbolt implementation) backend is a solid transactional embedded database.

Its read speed is close to that of pebble. Read speed does tend to suffer for large amounts of data, in the order of 1 million records or more.

However, writing is rather slow, at about 5 msec per write transaction. Write speed can be greatly increased by using SetMultiple. For example, writing 1000 key-value pairs takes less than twice as long as writing a single key-value pair. Just as with reading, writing gets noticeably slower when reaching a high number of records.
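The reason batching helps is that the cost is paid per write transaction rather than per key. The sketch below only counts transactions to make that visible. The type txCountingBucket and the function countTransactions are hypothetical and do not use bbolt; the Set and SetMultiple method names follow the IBucket interface of this package.

```go
package main

import "fmt"

// txCountingBucket is a hypothetical stand-in for a transactional bucket.
// It only counts write transactions and is not the bbolt backend.
type txCountingBucket struct {
	data map[string][]byte
	txs  int
}

func newTxCountingBucket() *txCountingBucket {
	return &txCountingBucket{data: make(map[string][]byte)}
}

// Set writes one pair in its own transaction.
func (b *txCountingBucket) Set(key string, value []byte) error {
	b.txs++
	b.data[key] = append([]byte(nil), value...)
	return nil
}

// SetMultiple writes all pairs in a single transaction.
func (b *txCountingBucket) SetMultiple(docs map[string][]byte) error {
	b.txs++
	for k, v := range docs {
		b.data[k] = append([]byte(nil), v...)
	}
	return nil
}

// countTransactions writes n pairs one at a time and then as one batch,
// and returns the number of transactions each approach used.
func countTransactions(n int) (single int, batched int) {
	docs := make(map[string][]byte, n)
	for i := 0; i < n; i++ {
		docs[fmt.Sprintf("key-%04d", i)] = []byte("value")
	}
	one := newTxCountingBucket()
	for k, v := range docs {
		_ = one.Set(k, v)
	}
	batch := newTxCountingBucket()
	_ = batch.SetMultiple(docs)
	return one.txs, batch.txs
}

func main() {
	single, batched := countTransactions(1000)
	fmt.Printf("Set: %d transactions, SetMultiple: %d transaction\n", single, batched)
}
```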

BBolt's data size is also limited to available disk space. Data isn't compressed.

This store is best suited for compatibility with other BoltDB databases or tools.

mongo - abandoned

The MongoDB backend is not complete. One of the main stumbling blocks is that MongoDB only has a forward iterator. In addition, MongoDB is a standalone server while the other options are embedded databases. The performance of MongoDB will therefore not be able to compete with the others.

redis - not planned

Redis is not an embedded store and requires external setup and maintenance. It is out of scope for this application.
That said, it has a well-defined interface and superb performance, so it can be considered if a use case comes up.

Documentation

Overview

Package bucketstore is a storage library for use by the services. The bucket store is primarily used by the state service which provides this store as a service to multiple clients. This package defines an API to use the store with several implementations.

Constants

const (
	BackendKVBTree = "kvbtree" // fastest and best for small to medium amounts of data (dependent on available memory)
	BackendBBolt   = "bbolt"   // slow on writes but otherwise a good choice
	BackendPebble  = "pebble"  // a good middle ground between performance and memory
)

Available embedded bucket store implementations with low memory overhead
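As a rough guide, the README's recommendations can be turned into a small selection helper. This is only a sketch: chooseBackend and kvbtreeLimitBytes are hypothetical names, the constants are re-declared locally so the example is self-contained, and the 100MB threshold is the README's recommended limit for kvbtree rather than a hard rule.

```go
package main

import "fmt"

// Local copies of the package constants, so this sketch compiles on its own.
const (
	BackendKVBTree = "kvbtree"
	BackendBBolt   = "bbolt"
	BackendPebble  = "pebble"
)

// kvbtreeLimitBytes reflects the README's recommended 100MB limit for kvbtree.
// Treating it as a strict cut-off is an assumption of this sketch.
const kvbtreeLimitBytes = 100 * 1000 * 1000

// chooseBackend is a hypothetical helper that follows the README's advice:
// bbolt for BoltDB compatibility, kvbtree for small data sets that fit
// in memory, and pebble for large amounts of data.
func chooseBackend(expectedBytes int64, needBoltCompat bool) string {
	switch {
	case needBoltCompat:
		return BackendBBolt
	case expectedBytes <= kvbtreeLimitBytes:
		return BackendKVBTree
	default:
		return BackendPebble
	}
}

func main() {
	fmt.Println(chooseBackend(10*1000*1000, false))       // small, frequently updated state
	fmt.Println(chooseBackend(50*1000*1000*1000, false))  // large history data
	fmt.Println(chooseBackend(10*1000*1000, true))        // must stay BoltDB compatible
}
```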

Variables

This section is empty.

Functions

This section is empty.

Types

type BucketStoreInfo

type BucketStoreInfo struct {
	// DataSize contains the size of data in the store or bucket.
	// -1 if not available.
	DataSize int64

	// Engine describes the storage engine of the store, eg kvbtree, bbolt, pebble
	Engine string

	// The store or bucket identifier, eg thingID, appID
	Id string

	// NrRecords holds the number of records in the store or bucket.
	// -1 if not available.
	NrRecords int64
}

BucketStoreInfo holds information about the bucket or the store.

type ClientCursors

type ClientCursors []IBucketCursor

type CursorCache

type CursorCache struct {
	// contains filtered or unexported fields
}

CursorCache manages a set of cursors that can be addressed remotely by key. Intended for servers that let remote clients iterate a cursor in the bucket store.

Added cursors are stored in a map by generated key along with some metadata such as its expiry and optionally the bucket that was allocated to use the iterator. The key is passed back to the client for use during iterations. The client must release the cursor when done.

To prevent memory leaks due to not releasing a cursor, cursors are given a limited non-used lifespan, after which they are removed. The default is 1 minute.

Cursors are linked to their owner to prevent 'accidental' use by others. A cursor can only be used if the client's ID matches that of the cursor owner.
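The key, owner, and lifespan checks described above can be sketched with a much smaller cache. The types miniCache and entry and the function outcomes are hypothetical and simplified: the key is a plain counter here, whereas a real cache would hand out a hard-to-guess key, and there is no background expiry loop.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"sync"
	"time"
)

// entry holds a cursor with the metadata described above (illustrative names).
type entry struct {
	cursor   interface{}
	ownerID  string
	lastUsed time.Time
	lifespan time.Duration
}

// miniCache is a simplified, hypothetical cursor cache.
type miniCache struct {
	mu      sync.Mutex
	entries map[string]*entry
	nextKey int
}

func newMiniCache() *miniCache {
	return &miniCache{entries: make(map[string]*entry)}
}

// Add stores the cursor and returns the key that is handed to the client.
func (c *miniCache) Add(cursor interface{}, ownerID string, lifespan time.Duration) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nextKey++
	key := strconv.Itoa(c.nextKey) // simplification: not a random key
	c.entries[key] = &entry{cursor: cursor, ownerID: ownerID, lastUsed: time.Now(), lifespan: lifespan}
	return key
}

// Get returns the cursor if it exists, belongs to clientID, and has not expired.
func (c *miniCache) Get(key, clientID string, updateLastUsed bool) (interface{}, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, found := c.entries[key]
	if !found {
		return nil, errors.New("cursor not found")
	}
	if e.ownerID != clientID {
		return nil, errors.New("cursor belongs to a different owner")
	}
	if time.Since(e.lastUsed) > e.lifespan {
		return nil, errors.New("cursor has expired")
	}
	if updateLastUsed {
		e.lastUsed = time.Now() // the lifespan starts over
	}
	return e.cursor, nil
}

// outcomes shows the three cases: the owner, another client, and an expired cursor.
func outcomes() (ownerErr, otherErr, expiredErr error) {
	cache := newMiniCache()
	key := cache.Add("cursor-1", "client-a", time.Minute)
	_, ownerErr = cache.Get(key, "client-a", true)
	_, otherErr = cache.Get(key, "client-b", false)
	shortKey := cache.Add("cursor-2", "client-a", time.Millisecond)
	time.Sleep(20 * time.Millisecond)
	_, expiredErr = cache.Get(shortKey, "client-a", false)
	return ownerErr, otherErr, expiredErr
}

func main() {
	ownerErr, otherErr, expiredErr := outcomes()
	fmt.Println("owner:", ownerErr)
	fmt.Println("other client:", otherErr)
	fmt.Println("after lifespan:", expiredErr)
}
```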

func NewCursorCache

func NewCursorCache() *CursorCache

func (*CursorCache) Add

func (cc *CursorCache) Add(
	cursor IBucketCursor, bucket IBucket, clientID string, lifespan time.Duration) string

Add adds a cursor to the tracker and returns its key

cursor is the object holding the cursor
bucket instance created specifically for this cursor. optional.
clientID of the owner
lifespan of the cursor after last use

func (*CursorCache) Get

func (cc *CursorCache) Get(
	cursorKey string, clientID string, updateLastUsed bool) (cursor IBucketCursor, err error)

Get returns the cursor with the given key. An error is returned if the cursor is not found, has expired, or belongs to a different owner.

cursorKey obtained with Add()
clientID requesting the cursor
updateLastUsed resets the lifespan of the cursor to start now

func (*CursorCache) GetCursorsByOwner

func (cc *CursorCache) GetCursorsByOwner(ownerID string) []*CursorInfo

GetCursorsByOwner returns a list of cursors that are owned by a client. Intended to remove cursors whose owner has disconnected. It is up to the user to remove and release the cursor.

func (*CursorCache) GetExpiredCursors

func (cc *CursorCache) GetExpiredCursors() []*CursorInfo

GetExpiredCursors returns a list of cursors that have expired. It is up to the user to remove and release the cursor.

func (*CursorCache) Release

func (cc *CursorCache) Release(clientID string, cursorKey string) error

Release releases the cursor and removes the cursor from the tracker. If a bucket was included it will be closed as well.

func (*CursorCache) Start

func (cc *CursorCache) Start()

Start starts a background loop to remove expired cursors

func (*CursorCache) Stop

func (cc *CursorCache) Stop()

Stop stops the background auto-expiry loop if it is running.

type CursorInfo

type CursorInfo struct {
	Key string
	// optional bucket instance this cursor operates on
	// if provided it will be released with the cursor
	Bucket IBucket
	// the stored cursor
	Cursor IBucketCursor
	// clientID of the cursor owner
	OwnerID string
	// last use of the cursor
	LastUsed time.Time
	// lifespan of cursor after last use
	Lifespan time.Duration
}

type IBucket

type IBucket interface {

	// Close the bucket and release its resources
	// If commit is true and transactions are supported then this commits the transaction.
	// use false to rollback the transaction. For readonly buckets commit returns an error
	Close() error

	// Cursor creates a new bucket cursor for iterating the bucket
	// cursor.Release must be called after use to release any read transactions
	// 	ctx with the cursor application context, intended for including a
	// filter context when reading the cursor.
	Cursor(ctx context.Context) (cursor IBucketCursor, err error)

	// Delete removes the key-value pair from the bucket store
	// Returns nil if the key is deleted or doesn't exist.
	// Returns an error if the key cannot be deleted.
	Delete(key string) (err error)

	// Get returns the document for the given key
	// Returns nil and an error if the key isn't found in the bucket or the database cannot be read
	Get(key string) (value []byte, err error)

	// GetMultiple returns a batch of documents with existing keys
	// if a key does not exist it will not be included in the result.
	// An error is returned if the database cannot be read.
	GetMultiple(keys []string) (keyValues map[string][]byte, err error)

	// ID returns the bucket's ID
	ID() string

	// Info returns the bucket information, when available
	Info() *BucketStoreInfo

	// Set sets a document with the given key
	// This stores a copy of value.
	// An error is returned if either the bucketID or the key is empty
	Set(key string, value []byte) error

	// SetMultiple sets multiple documents in a batch update
	// This stores a copy of docs.
	// If the transaction fails an error is returned and no changes are made.
	SetMultiple(docs map[string][]byte) (err error)
}

IBucket defines the interface to a store key-value bucket
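The behaviour documented in the interface comments can be illustrated with a toy in-memory bucket: Set stores a copy, Get returns an error for a missing key, GetMultiple leaves out missing keys, and Delete of a missing key returns nil. The type memBucket and the function semantics are hypothetical and implement only part of IBucket.

```go
package main

import (
	"errors"
	"fmt"
)

// memBucket is a hypothetical in-memory bucket, used only for illustration.
type memBucket struct {
	data map[string][]byte
}

func newMemBucket() *memBucket {
	return &memBucket{data: make(map[string][]byte)}
}

// Set stores a copy of value, so later changes by the caller do not leak in.
func (b *memBucket) Set(key string, value []byte) error {
	if key == "" {
		return errors.New("key is empty")
	}
	b.data[key] = append([]byte(nil), value...)
	return nil
}

// Get returns nil and an error if the key isn't found.
func (b *memBucket) Get(key string) ([]byte, error) {
	value, found := b.data[key]
	if !found {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return value, nil
}

// GetMultiple returns only the keys that exist.
func (b *memBucket) GetMultiple(keys []string) (map[string][]byte, error) {
	result := make(map[string][]byte)
	for _, key := range keys {
		if value, found := b.data[key]; found {
			result[key] = value
		}
	}
	return result, nil
}

// Delete returns nil whether or not the key existed.
func (b *memBucket) Delete(key string) error {
	delete(b.data, key)
	return nil
}

// semantics exercises the documented behaviour and reports what it observed.
func semantics() (stored string, found int, deleteErr error, getErr error) {
	bucket := newMemBucket()
	value := []byte("on")
	_ = bucket.Set("switch", value)
	value[0] = 'X' // mutating the caller's slice must not affect the stored copy
	got, _ := bucket.Get("switch")
	docs, _ := bucket.GetMultiple([]string{"switch", "missing"})
	deleteErr = bucket.Delete("missing")
	_, getErr = bucket.Get("missing")
	return string(got), len(docs), deleteErr, getErr
}

func main() {
	stored, found, deleteErr, getErr := semantics()
	fmt.Println(stored, found, deleteErr, getErr)
}
```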

type IBucketCursor

type IBucketCursor interface {
	// BucketID is the ID of the bucket this cursor iterates
	BucketID() string

	// Context returns the cursor application context
	Context() context.Context

	// First positions the cursor at the first key in the ordered list
	// valid is false if the bucket is empty
	First() (key string, value []byte, valid bool)

	// Last positions the cursor at the last key in the ordered list
	// valid is false if the bucket is empty
	Last() (key string, value []byte, valid bool)

	// Next moves the cursor to the next key from the current cursor
	// First() or Seek must have been called first.
	// valid is false if the iterator has reached the end and no valid value is returned.
	Next() (key string, value []byte, valid bool)

	// NextN moves the cursor to the next N places from the current cursor
	// and return a map with the N key-value pairs.
	// If the iterator reaches the end it returns the remaining items and itemsRemaining is false
	// If the cursor is already at the end, the resulting map is empty and itemsRemaining is also false.
	// Intended to speed up with batch iterations over rpc.
	NextN(steps uint) (docs map[string][]byte, itemsRemaining bool)

	// Prev moves the cursor to the previous key from the current cursor
	// Last() or Seek must have been called first.
	// valid is false if the iterator has reached the beginning and no valid value is returned.
	Prev() (key string, value []byte, valid bool)

	// PrevN moves the cursor back N places from the current cursor and returns a map with
	// the N key-value pairs.
	// Intended to speed up with batch iterations over rpc.
	// If the iterator reaches the beginning it returns the remaining items and itemsRemaining is false
	// If the cursor is already at the beginning, the resulting map is empty and itemsRemaining is also false.
	PrevN(steps uint) (docs map[string][]byte, itemsRemaining bool)

	// Release closes the cursor and releases its resources.
	// This invalidates all values obtained from the cursor
	Release()

	// Seek positions the cursor at the given searchKey and corresponding value.
	// If the key is not found, the next key is returned.
	// valid is false if the iterator has reached the end and no valid value is returned.
	Seek(searchKey string) (key string, value []byte, valid bool)
}

IBucketCursor provides the prev/next cursor based iterator on a range
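The First/Next loop, the "next key if not found" behaviour of Seek, and batch reads with NextN can be sketched with a cursor over a sorted key slice. The type sliceCursor and the function tour are hypothetical. The NextN shown here is one possible reading of the interface comments, not the behaviour of any of the backends.

```go
package main

import (
	"fmt"
	"sort"
)

// sliceCursor is a hypothetical cursor over an in-memory, sorted key list.
type sliceCursor struct {
	keys []string
	data map[string][]byte
	pos  int
}

func newSliceCursor(data map[string][]byte) *sliceCursor {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return &sliceCursor{keys: keys, data: data}
}

// current returns the pair at the cursor position; valid is false past the end.
func (c *sliceCursor) current() (string, []byte, bool) {
	if c.pos < 0 || c.pos >= len(c.keys) {
		return "", nil, false
	}
	key := c.keys[c.pos]
	return key, c.data[key], true
}

func (c *sliceCursor) First() (string, []byte, bool) {
	c.pos = 0
	return c.current()
}

func (c *sliceCursor) Next() (string, []byte, bool) {
	if c.pos < len(c.keys) {
		c.pos++
	}
	return c.current()
}

// Seek positions the cursor at searchKey, or at the next key if it is absent.
func (c *sliceCursor) Seek(searchKey string) (string, []byte, bool) {
	c.pos = sort.SearchStrings(c.keys, searchKey)
	return c.current()
}

// NextN collects up to steps pairs following the current position.
func (c *sliceCursor) NextN(steps uint) (map[string][]byte, bool) {
	docs := make(map[string][]byte)
	for i := uint(0); i < steps; i++ {
		key, value, valid := c.Next()
		if !valid {
			return docs, false
		}
		docs[key] = value
	}
	return docs, true
}

// tour iterates all keys, seeks a missing key, and reads a batch.
func tour() (order string, seekKey string, pastEnd bool, batch int, remaining bool) {
	cursor := newSliceCursor(map[string][]byte{"a": []byte("1"), "c": []byte("3"), "e": []byte("5")})
	for key, _, valid := cursor.First(); valid; key, _, valid = cursor.Next() {
		order += key
	}
	seekKey, _, _ = cursor.Seek("b") // "b" is absent, so the next key is returned
	_, _, pastEnd = cursor.Seek("z") // nothing at or after "z"
	cursor.First()
	docs, remaining := cursor.NextN(5) // only two pairs follow the first key
	return order, seekKey, pastEnd, len(docs), remaining
}

func main() {
	fmt.Println(tour())
}
```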

type IBucketStore

type IBucketStore interface {
	// GetBucket returns a bucket to use.
	// This creates the bucket if it doesn't exist.
	// Use bucket.Close() to close the bucket and release its resources.
	GetBucket(bucketID string) (bucket IBucket)

	// Close the store and release its resources
	Close() error

	// Open the store
	Open() error
}

IBucketStore defines the interface to a simple key-value embedded bucket store.

  • organizes data into buckets
  • open/close buckets as a transaction, if transactions are available
  • get/set single or multiple key/value pairs
  • delete key/value
  • cursor based seek and iteration

Streaming data into a bucket is not supported.

Various implementations are available for the services to use.

TODO: add refcount for multiple consumers of the store so it can be closed when done.

Directories

Path Synopsis
Package kvbtree
Package mongohs with MongoDB based history mongoClient. This implements the HistoryStore.proto API
