rosedb

README

What is ROSEDB

rosedb is a lightweight, fast and reliable key/value storage engine based on the Bitcask storage model.

The design of Bitcask was inspired, in part, by log-structured filesystems and log file merging.

Status

rosedb is well tested and ready for production use. Several projects are using rosedb in production as a storage engine.

Didn't find the feature you want? Feel free to open an issue or a PR; we are in active development.

Design overview

RoseDB's log files use wal (Write Ahead Log) as the backend: append-only files with a block cache.

wal: https://github.com/rosedblabs/wal

Key features

Strengths

Low latency per item read or written. This is due to the write-once, append-only nature of Bitcask database files.
High throughput, especially when writing an incoming stream of random items. Write operations to RoseDB generally saturate I/O and disk bandwidth, which is a good thing from a performance perspective. This saturation occurs for two reasons: (1) data written to RoseDB doesn't need to be ordered on disk, and (2) the log-structured design of Bitcask allows for minimal disk head movement during writes.
Ability to handle datasets larger than RAM without degradation. Access to data in RoseDB involves a direct lookup in an in-memory index data structure. This makes finding data very efficient, even when datasets are very large.
Single seek to retrieve any value. RoseDB's in-memory index of keys points directly to locations on disk where the data lives. RoseDB never uses more than one disk seek to read a value, and sometimes even that isn't necessary due to filesystem caching done by the operating system.
Predictable lookup and insert performance. For the reasons listed above, read operations from RoseDB have fixed, predictable behavior. This is also true of writes, because a write operation requires, at most, one seek to the end of the current open file followed by an append to that file.
Fast, bounded crash recovery. Crash recovery is easy and fast with RoseDB because its files are append-only and write-once. The only items that may be lost are partially written records at the tail of the last file that was opened for writes. Recovery only needs to review those records and verify CRC data to ensure that the data is consistent.
Easy backup. In most systems, backup can be very complicated. RoseDB simplifies this process due to its append-only, write-once disk format. Any utility that archives or copies files in disk-block order will properly back up or copy a RoseDB database.
Batch operations which guarantee atomicity, consistency, and durability. RoseDB supports batch operations which are atomic, consistent, and durable. New writes in a batch are cached in memory before committing. If the batch commits successfully, all writes in the batch are persisted to disk; if it fails, all writes in the batch are discarded.
Support for forward and backward iterators. The iterator is based on the in-memory index of keys, which points directly to locations on disk where the data lives, making iteration very efficient even when datasets are very large.
Support for key watch. You can get a notification if keys change in the db.
Support for key expiry. You can set an expire time for keys.

Weaknesses

Keys must fit in memory. RoseDB keeps all keys in memory at all times, which means that your system must have enough memory to contain your entire keyspace, plus additional space for other operational components and operating-system-resident filesystem buffer space.
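
As a rough, illustrative estimate (the per-key overhead figure is an assumption, not from the RoseDB docs): with 100 million keys averaging 32 bytes each and roughly 64 bytes of in-memory index metadata per key, the keyspace alone would need about 100M × (32 + 64) bytes ≈ 9.6 GB of RAM, before counting filesystem buffer space.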

Getting Started

Basic operations

package main

import "github.com/rosedblabs/rosedb/v2"

func main() {
	// specify the options
	options := rosedb.DefaultOptions
	options.DirPath = "/tmp/rosedb_basic"

	// open a database
	db, err := rosedb.Open(options)
	if err != nil {
		panic(err)
	}
	defer func() {
		_ = db.Close()
	}()

	// set a key
	err = db.Put([]byte("name"), []byte("rosedb"))
	if err != nil {
		panic(err)
	}

	// get a key
	val, err := db.Get([]byte("name"))
	if err != nil {
		panic(err)
	}
	println(string(val))

	// delete a key
	err = db.Delete([]byte("name"))
	if err != nil {
		panic(err)
	}
}

Batch operations

	// create a batch
	batch := db.NewBatch(rosedb.DefaultBatchOptions)

	// set a key
	_ = batch.Put([]byte("name"), []byte("rosedb"))

	// get a key
	val, _ := batch.Get([]byte("name"))
	println(string(val))

	// delete a key
	_ = batch.Delete([]byte("name"))

	// commit the batch
	_ = batch.Commit()

See the examples for more details.

Community

You are welcome to join the Slack channel and Discussions to connect with RoseDB developers and other users.

Contributors

Documentation

Index

Constants

const (
	B  = 1
	KB = 1024 * B
	MB = 1024 * KB
	GB = 1024 * MB
)

Variables

var (
	ErrKeyIsEmpty      = errors.New("the key is empty")
	ErrKeyNotFound     = errors.New("key not found in database")
	ErrDatabaseIsUsing = errors.New("the database directory is used by another process")
	ErrReadOnlyBatch   = errors.New("the batch is read only")
	ErrBatchCommitted  = errors.New("the batch is committed")
	ErrBatchRollbacked = errors.New("the batch is rollbacked")
	ErrDBClosed        = errors.New("the database is closed")
	ErrMergeRunning    = errors.New("the merge operation is running")
	ErrWatchDisabled   = errors.New("the watch is disabled")
)
var DefaultBatchOptions = BatchOptions{
	Sync:     true,
	ReadOnly: false,
}
var DefaultOptions = Options{
	DirPath:           tempDBDir(),
	SegmentSize:       1 * GB,
	BlockCache:        0,
	Sync:              false,
	BytesPerSync:      0,
	WatchQueueSize:    0,
	AutoMergeCronExpr: "",
}

Functions

This section is empty.

Types

type Batch

type Batch struct {
	// contains filtered or unexported fields
}

Batch represents a batch of operations on the database. If readonly is true, you can only read data from the batch with the Get method; an error will be returned if you try to use the Put or Delete method.

If readonly is false, you can use the Put and Delete methods to write data to the batch. The data will be written to the database when you call the Commit method.

Batch is not a transaction: it does not guarantee isolation. But it does guarantee atomicity, consistency, and durability (if the Sync option is true).

You must call the Commit method to commit the batch, otherwise the DB will be locked.
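
A minimal sketch of a read-only batch, based on the Batch API documented on this page:

	// a read-only batch can only serve reads; Put or Delete would
	// return ErrReadOnlyBatch
	batch := db.NewBatch(rosedb.BatchOptions{Sync: true, ReadOnly: true})

	val, err := batch.Get([]byte("name"))
	if err != nil {
		panic(err)
	}
	println(string(val))

	// Commit must still be called so the DB lock is released
	_ = batch.Commit()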

func (*Batch) Commit

func (b *Batch) Commit() error

Commit commits the batch. If the batch is readonly or empty, it will return directly.

It will iterate the pendingWrites and write the data to the database, then write a record to indicate the end of the batch to guarantee atomicity. Finally, it will write the index.

func (*Batch) Delete

func (b *Batch) Delete(key []byte) error

Delete marks a key for deletion in the batch.

func (*Batch) Exist

func (b *Batch) Exist(key []byte) (bool, error)

Exist checks if the key exists in the database.

func (*Batch) Expire added in v2.3.2

func (b *Batch) Expire(key []byte, ttl time.Duration) error

Expire sets the ttl of the key.

func (*Batch) Get

func (b *Batch) Get(key []byte) ([]byte, error)

Get retrieves the value associated with a given key from the batch.

func (*Batch) Persist added in v2.3.3

func (b *Batch) Persist(key []byte) error

Persist removes the ttl of the key.

func (*Batch) Put

func (b *Batch) Put(key []byte, value []byte) error

Put adds a key-value pair to the batch for writing.

func (*Batch) PutWithTTL added in v2.3.1

func (b *Batch) PutWithTTL(key []byte, value []byte, ttl time.Duration) error

PutWithTTL adds a key-value pair with ttl to the batch for writing.

func (*Batch) Rollback added in v2.2.1

func (b *Batch) Rollback() error

Rollback discards an uncommitted batch instance. The discard operation will clear the buffered data and release the lock.
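
A short sketch of the commit-or-rollback pattern, assuming the batch API above (key and value are illustrative):

	batch := db.NewBatch(rosedb.DefaultBatchOptions)
	if err := batch.Put([]byte("key"), []byte("value")); err != nil {
		// discard the buffered writes and release the DB lock
		_ = batch.Rollback()
		panic(err)
	}
	if err := batch.Commit(); err != nil {
		panic(err)
	}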

func (*Batch) TTL added in v2.3.2

func (b *Batch) TTL(key []byte) (time.Duration, error)

TTL returns the ttl of the key.

type BatchOptions

type BatchOptions struct {
	// Sync has the same semantics as Options.Sync.
	Sync bool
	// ReadOnly specifies whether the batch is read only.
	ReadOnly bool
}

BatchOptions specifies the options for creating a batch.

type DB

type DB struct {
	// contains filtered or unexported fields
}

DB represents a ROSEDB database instance. It is built on the bitcask model, which is a log-structured storage. It uses WAL to write data, and uses an in-memory index to store the keys and the positions of the data in the WAL; the index is rebuilt when the database is opened.

The main advantage of ROSEDB is that it is very fast to write, read, and delete data, because it only needs one disk IO to complete a single operation.

But since all keys and their positions (the index) must be stored in memory, the total data size is limited by the memory size.

So if your memory can almost hold all the keys, ROSEDB is the perfect storage engine for you.

func Open

func Open(options Options) (*DB, error)

Open a database with the specified options. If the database directory does not exist, it will be created automatically.

Multiple processes cannot use the same database directory at the same time, otherwise Open will return ErrDatabaseIsUsing.

It will open the wal files in the database directory and load the index from them, returning the DB instance or an error if any.
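
A minimal sketch of guarding against a directory that is already locked by another process, using the documented ErrDatabaseIsUsing variable (errors is the standard library package):

	db, err := rosedb.Open(options)
	if err != nil {
		if errors.Is(err, rosedb.ErrDatabaseIsUsing) {
			// another process holds the lock on this directory
		}
		panic(err)
	}
	defer func() {
		_ = db.Close()
	}()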

func (*DB) Ascend added in v2.3.0

func (db *DB) Ascend(handleFn func(k []byte, v []byte) (bool, error))

Ascend calls handleFn for each key/value pair in the db in ascending order.
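
A short iteration sketch; judging from the signature, returning false from handleFn presumably stops the iteration early (an assumption, since this page does not state it):

	// print every key/value pair in ascending key order
	db.Ascend(func(k []byte, v []byte) (bool, error) {
		println(string(k), string(v))
		return true, nil // return false (or an error) to stop early
	})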

func (*DB) AscendGreaterOrEqual added in v2.3.1

func (db *DB) AscendGreaterOrEqual(key []byte, handleFn func(k []byte, v []byte) (bool, error))

AscendGreaterOrEqual calls handleFn for each key/value pair in the db with keys greater than or equal to the given key.

func (*DB) AscendKeys added in v2.3.2

func (db *DB) AscendKeys(pattern []byte, filterExpired bool, handleFn func(k []byte) (bool, error))

AscendKeys calls handleFn for each key in the db in ascending order. Since the expiry time is stored in the value, set the filterExpired parameter to true if you want to filter out expired keys. Note that this affects performance, because the value of each key must be read to determine whether it has expired.
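
A keys-only scan sketch; passing a nil pattern to match all keys is an assumption, since the pattern syntax is not documented on this page:

	// list all keys, skipping expired ones (slower, since values
	// must be read to check expiry)
	db.AscendKeys(nil, true, func(k []byte) (bool, error) {
		println(string(k))
		return true, nil
	})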

func (*DB) AscendRange added in v2.3.1

func (db *DB) AscendRange(startKey, endKey []byte, handleFn func(k []byte, v []byte) (bool, error))

AscendRange calls handleFn for each key/value pair in the db within the range [startKey, endKey] in ascending order.

func (*DB) Close

func (db *DB) Close() error

Close the database: close all data files and release the file lock. It sets the closed flag to true. The DB instance cannot be used after closing.

func (*DB) Delete

func (db *DB) Delete(key []byte) error

Delete the specified key from the database. Actually, it will open a new batch and commit it. You can think of the batch as having only one Delete operation.

func (*DB) DeleteExpiredKeys added in v2.3.2

func (db *DB) DeleteExpiredKeys(timeout time.Duration) error

DeleteExpiredKeys scans the entire index in ascending order to delete expired keys. It is a time-consuming operation, so a timeout must be specified to prevent the DB from being unavailable for too long.

func (*DB) Descend added in v2.3.0

func (db *DB) Descend(handleFn func(k []byte, v []byte) (bool, error))

Descend calls handleFn for each key/value pair in the db in descending order.

func (*DB) DescendKeys added in v2.3.2

func (db *DB) DescendKeys(pattern []byte, filterExpired bool, handleFn func(k []byte) (bool, error))

DescendKeys calls handleFn for each key in the db in descending order. Since the expiry time is stored in the value, set the filterExpired parameter to true if you want to filter out expired keys. Note that this affects performance, because the value of each key must be read to determine whether it has expired.

func (*DB) DescendLessOrEqual added in v2.3.1

func (db *DB) DescendLessOrEqual(key []byte, handleFn func(k []byte, v []byte) (bool, error))

DescendLessOrEqual calls handleFn for each key/value pair in the db with keys less than or equal to the given key.

func (*DB) DescendRange added in v2.3.1

func (db *DB) DescendRange(startKey, endKey []byte, handleFn func(k []byte, v []byte) (bool, error))

DescendRange calls handleFn for each key/value pair in the db within the range [startKey, endKey] in descending order.

func (*DB) Exist

func (db *DB) Exist(key []byte) (bool, error)

Exist checks if the specified key exists in the database. Actually, it will open a new batch and commit it. You can think of the batch as having only one Exist operation.

func (*DB) Expire added in v2.3.2

func (db *DB) Expire(key []byte, ttl time.Duration) error

Expire sets the ttl of the key.

func (*DB) Get

func (db *DB) Get(key []byte) ([]byte, error)

Get the value of the specified key from the database. Actually, it will open a new batch and commit it. You can think of the batch as having only one Get operation.

func (*DB) Merge added in v2.2.0

func (db *DB) Merge(reopenAfterDone bool) error

Merge merges all the data files in the database. It will iterate all the data files, find the valid data, and rewrite the data to the new data file.

The merge operation may be very time-consuming when the database is large, so it is recommended to perform it when the database is idle.

If reopenAfterDone is true, the original file will be replaced by the merge file, and db's index will be rebuilt after the merge completes.
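
A sketch of triggering a merge manually, reusing the documented ErrMergeRunning variable:

	// reclaim space from deleted and overwritten records; best run
	// while the database is idle
	if err := db.Merge(true); err != nil && err != rosedb.ErrMergeRunning {
		panic(err)
	}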

func (*DB) NewBatch

func (db *DB) NewBatch(options BatchOptions) *Batch

NewBatch creates a new Batch instance.

func (*DB) Persist added in v2.3.3

func (db *DB) Persist(key []byte) error

Persist removes the ttl of the key. If the key does not exist or has expired, it will return ErrKeyNotFound.

func (*DB) Put

func (db *DB) Put(key []byte, value []byte) error

Put a key-value pair into the database. Actually, it will open a new batch and commit it. You can think of the batch as having only one Put operation.

func (*DB) PutWithTTL added in v2.3.1

func (db *DB) PutWithTTL(key []byte, value []byte, ttl time.Duration) error

PutWithTTL puts a key-value pair with a ttl into the database. Actually, it will open a new batch and commit it. You can think of the batch as having only one PutWithTTL operation.
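
A small ttl sketch combining PutWithTTL, TTL, and Persist (the key, value, and duration are illustrative):

	// write a key that expires in one minute
	err := db.PutWithTTL([]byte("session"), []byte("token"), time.Minute)
	if err != nil {
		panic(err)
	}

	// inspect the remaining ttl
	ttl, err := db.TTL([]byte("session"))
	if err != nil {
		panic(err)
	}
	println(ttl.String())

	// remove the ttl so the key no longer expires
	_ = db.Persist([]byte("session"))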

func (*DB) Stat

func (db *DB) Stat() *Stat

Stat returns the statistics of the database.

func (*DB) Sync

func (db *DB) Sync() error

Sync all data files to the underlying storage.

func (*DB) TTL added in v2.3.2

func (db *DB) TTL(key []byte) (time.Duration, error)

TTL gets the ttl of the key.

func (*DB) Watch added in v2.2.2

func (db *DB) Watch() (<-chan *Event, error)

type Event added in v2.2.2

type Event struct {
	Action  WatchActionType
	Key     []byte
	Value   []byte
	BatchId uint64
}

Event is the event that occurs when the database is modified. It is used to synchronize the database's watch.
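
A watch sketch, assuming (per the WatchQueueSize option below) that watch is enabled by setting a queue size greater than 0:

	// enable watch by giving the event queue a nonzero size
	options := rosedb.DefaultOptions
	options.WatchQueueSize = 1000

	db, err := rosedb.Open(options)
	if err != nil {
		panic(err)
	}

	eventCh, err := db.Watch()
	if err != nil {
		panic(err)
	}
	go func() {
		// the channel presumably stops delivering events once the
		// database is closed
		for event := range eventCh {
			switch event.Action {
			case rosedb.WatchActionPut:
				println("put:", string(event.Key))
			case rosedb.WatchActionDelete:
				println("delete:", string(event.Key))
			}
		}
	}()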

type IndexRecord

type IndexRecord struct {
	// contains filtered or unexported fields
}

IndexRecord is the index record of the key. It contains the key, the record type, and the position of the record in the wal. It is only used at startup to rebuild the index.

type LogRecord

type LogRecord struct {
	Key     []byte
	Value   []byte
	Type    LogRecordType
	BatchId uint64
	Expire  int64
}

LogRecord is the log record of a key/value pair. It contains the key, the value, the record type, and the batch id. It will be encoded to a byte slice and written to the wal.

func (*LogRecord) IsExpired added in v2.3.2

func (lr *LogRecord) IsExpired(now int64) bool

IsExpired checks whether the log record is expired.

type LogRecordType

type LogRecordType = byte

LogRecordType is the type of the log record.

const (
	// LogRecordNormal is the normal log record type.
	LogRecordNormal LogRecordType = iota
	// LogRecordDeleted is the deleted log record type.
	LogRecordDeleted
	// LogRecordBatchFinished is the batch finished log record type.
	LogRecordBatchFinished
)

type Options

type Options struct {
	// DirPath specifies the directory path where the WAL segment files will be stored.
	DirPath string

	// SegmentSize specifies the maximum size of each segment file in bytes.
	SegmentSize int64

	// BlockCache specifies the size of the block cache in number of bytes.
	// A block cache is used to store recently accessed data blocks, improving read performance.
	// If BlockCache is set to 0, no block cache will be used.
	BlockCache uint32

	// Sync is whether to synchronize writes through os buffer cache and down onto the actual disk.
	// Setting sync is required for durability of a single write operation, but also results in slower writes.
	//
	// If false, and the machine crashes, then some recent writes may be lost.
	// Note that if it is just the process that crashes (machine does not) then no writes will be lost.
	//
	// In other words, Sync being false has the same semantics as a write
	// system call. Sync being true means write followed by fsync.
	Sync bool

	// BytesPerSync specifies the number of bytes to write before calling fsync.
	BytesPerSync uint32

	// WatchQueueSize specifies the cache length of the watch queue.
	// If the size is greater than 0, watch is enabled.
	WatchQueueSize uint64

	// AutoMergeCronExpr enables automatic merges.
	// An auto merge is triggered when the cron expression is satisfied.
	// The expression follows standard cron syntax,
	// e.g. "0 0 * * *" means merge at 00:00:00 every day.
	// An optional seconds field is also supported;
	// with it, the expression looks like "0/10 * * * * *" (every 10 seconds).
	// When auto merge is enabled, the db will be closed and reopened after the merge is done.
	// Do not set this schedule too frequently, as it will affect performance.
	// Refer to https://en.wikipedia.org/wiki/Cron
	AutoMergeCronExpr string
}

Options specifies the options for opening a database.
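
A sketch of customizing options before Open (the path, sizes, and schedule are illustrative, not recommendations):

	options := rosedb.DefaultOptions
	options.DirPath = "/tmp/rosedb_custom"  // hypothetical directory
	options.SegmentSize = 512 * rosedb.MB   // roll to a new segment file at 512 MB
	options.Sync = true                     // fsync every write for durability
	options.AutoMergeCronExpr = "0 0 * * *" // auto merge daily at midnight

	db, err := rosedb.Open(options)
	if err != nil {
		panic(err)
	}
	defer func() {
		_ = db.Close()
	}()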

type Stat

type Stat struct {
	// Total number of keys
	KeysNum int
	// Total disk size of database directory
	DiskSize int64
}

Stat represents the statistics of the database.
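
A quick usage sketch for Stat:

	// report the key count and the on-disk size of the database directory
	stat := db.Stat()
	println("keys:", stat.KeysNum, "disk bytes:", stat.DiskSize)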

type WatchActionType added in v2.2.2

type WatchActionType = byte
const (
	WatchActionPut WatchActionType = iota
	WatchActionDelete
)

type Watcher added in v2.2.2

type Watcher struct {
	// contains filtered or unexported fields
}

Watcher temporarily stores event information as it is generated, until it is synchronized to the DB's watch.

If the queue overflows, the oldest event will be removed, even if it hasn't been read yet.

func NewWatcher added in v2.2.2

func NewWatcher(capacity uint64) *Watcher

Directories

Path Synopsis
examples
ttl
