Documentation ¶
Overview ¶
Package badger implements an embeddable, simple and fast key-value database, written in pure Go. It is designed to be highly performant for both reads and writes simultaneously. Badger uses Multi-Version Concurrency Control (MVCC), and supports transactions. It runs transactions concurrently, with serializable snapshot isolation guarantees.
Badger uses an LSM tree along with a value log to separate keys from values, hence reducing both write amplification and the size of the LSM tree. This allows LSM tree to be served entirely from RAM, while the values are served from SSD.
Usage ¶
Badger has the following main types: DB, Txn, Item and Iterator. DB contains keys that are associated with values. It must be opened with the appropriate options before it can be accessed.
All operations happen inside a Txn. Txn represents a transaction, which can be read-only or read-write. Read-only transactions can read values for a given key (which are returned inside an Item), or iterate over a set of key-value pairs using an Iterator (which are returned as Item type values as well). Read-write transactions can also update and delete keys from the DB.
See the examples for more usage details.
Index ¶
- Constants
- Variables
- type DB
- func (db *DB) Backup(w io.Writer, since uint64) (uint64, error)
- func (db *DB) Close() (err error)
- func (db *DB) Load(r io.Reader) error
- func (db *DB) NewTransaction(update bool) *Txn
- func (db *DB) PurgeOlderVersions() error
- func (db *DB) PurgeVersionsBelow(key []byte, ts uint64) error
- func (db *DB) RunValueLogGC(discardRatio float64) error
- func (db *DB) Update(fn func(txn *Txn) error) error
- func (db *DB) View(fn func(txn *Txn) error) error
- type Item
- type Iterator
- type IteratorOptions
- type ManagedDB
- type Manifest
- type Options
- type Txn
- func (txn *Txn) Commit(callback func(error)) error
- func (txn *Txn) CommitAt(commitTs uint64, callback func(error)) error
- func (txn *Txn) Delete(key []byte) error
- func (txn *Txn) Discard()
- func (txn *Txn) Get(key []byte) (item *Item, rerr error)
- func (txn *Txn) NewIterator(opt IteratorOptions) *Iterator
- func (txn *Txn) Set(key, val []byte) error
- func (txn *Txn) SetWithMeta(key, val []byte, meta byte) error
- func (txn *Txn) SetWithTTL(key, val []byte, dur time.Duration) error
Examples ¶
Constants ¶
const (
// ManifestFilename is the filename for the manifest file.
ManifestFilename = "MANIFEST"
)
Variables ¶
var ( // ErrInvalidDir is returned when Badger cannot find the directory // from where it is supposed to load the key-value store. ErrInvalidDir = errors.New("Invalid Dir, directory does not exist") // ErrValueLogSize is returned when opt.ValueLogFileSize option is not within the valid // range. ErrValueLogSize = errors.New("Invalid ValueLogFileSize, must be between 1MB and 2GB") // ErrKeyNotFound is returned when key isn't found on a txn.Get. ErrKeyNotFound = errors.New("Key not found") // ErrTxnTooBig is returned if too many writes are fit into a single transaction. ErrTxnTooBig = errors.New("Txn is too big to fit into one request") // ErrConflict is returned when a transaction conflicts with another transaction. This can happen if // the read rows had been updated concurrently by another transaction. ErrConflict = errors.New("Transaction Conflict. Please retry") // ErrReadOnlyTxn is returned if an update function is called on a read-only transaction. ErrReadOnlyTxn = errors.New("No sets or deletes are allowed in a read-only transaction") // ErrDiscardedTxn is returned if a previously discarded transaction is re-used. ErrDiscardedTxn = errors.New("This transaction has been discarded. Create a new one") // ErrEmptyKey is returned if an empty key is passed on an update function. ErrEmptyKey = errors.New("Key cannot be empty") // ErrRetry is returned when a log file containing the value is not found. // This usually indicates that it may have been garbage collected, and the // operation needs to be retried. ErrRetry = errors.New("Unable to find log file. Please retry") // ErrThresholdZero is returned if threshold is set to zero, and value log GC is called. // In such a case, GC can't be run. ErrThresholdZero = errors.New( "Value log GC can't run because threshold is set to zero") // ErrNoRewrite is returned if a call for value log GC doesn't result in a log file rewrite. ErrNoRewrite = errors.New( "Value log GC attempt didn't result in any cleanup") // ErrRejected is returned if a value log GC is called either while another GC is running, or // after DB::Close has been called. ErrRejected = errors.New("Value log GC request rejected") // ErrInvalidRequest is returned if the user request is invalid. ErrInvalidRequest = errors.New("Invalid request") // ErrManagedTxn is returned if the user tries to use an API which isn't // allowed due to external management of transactions, when using ManagedDB. ErrManagedTxn = errors.New( "Invalid API request. Not allowed to perform this action using ManagedDB") // ErrInvalidDump if a data dump made previously cannot be loaded into the database. ErrInvalidDump = errors.New("Data dump cannot be read") )
var DefaultIteratorOptions = IteratorOptions{ PrefetchValues: true, PrefetchSize: 100, Reverse: false, AllVersions: false, }
DefaultIteratorOptions contains default options when iterating over Badger key-value stores.
var DefaultOptions = Options{ DoNotCompact: false, LevelOneSize: 256 << 20, LevelSizeMultiplier: 10, TableLoadingMode: options.LoadToRAM, MaxLevels: 7, MaxTableSize: 64 << 20, NumCompactors: 3, NumLevelZeroTables: 5, NumLevelZeroTablesStall: 10, NumMemtables: 5, SyncWrites: true, ValueLogFileSize: 1 << 30, ValueThreshold: 20, }
DefaultOptions sets a list of recommended options for good performance. Feel free to modify these to suit your needs.
Functions ¶
This section is empty.
Types ¶
type DB ¶ added in v0.9.0
type DB struct { sync.RWMutex // Guards list of inmemory tables, not individual reads and writes. // contains filtered or unexported fields }
DB provides the various functions required to interact with Badger. DB is thread-safe.
func Open ¶ added in v0.9.0
Open returns a new DB object.
Example ¶
dir, err := ioutil.TempDir("", "badger") if err != nil { log.Fatal(err) } defer os.RemoveAll(dir) opts := DefaultOptions opts.Dir = dir opts.ValueDir = dir db, err := Open(opts) if err != nil { log.Fatal(err) } defer db.Close() err = db.View(func(txn *Txn) error { _, err := txn.Get([]byte("key")) // We expect ErrKeyNotFound fmt.Println(err) return nil }) if err != nil { log.Fatal(err) } txn := db.NewTransaction(true) // Read-write txn err = txn.Set([]byte("key"), []byte("value")) if err != nil { log.Fatal(err) } err = txn.Commit(nil) if err != nil { log.Fatal(err) } err = db.View(func(txn *Txn) error { item, err := txn.Get([]byte("key")) if err != nil { return err } val, err := item.Value() if err != nil { return err } fmt.Printf("%s\n", string(val)) return nil }) if err != nil { log.Fatal(err) }
Output: Key not found value
func (*DB) Backup ¶ added in v0.9.0
Backup dumps a protobuf-encoded list of all entries in the database into the given writer, that are newer than the specified version. It returns a timestamp indicating when the entries were dumped which can be passed into a later invocation to generate an incremental dump, of entries that have been added/modified since the last invocation of DB.Backup()
This can be used to backup the data in a database at a given point in time.
func (*DB) Close ¶ added in v0.9.0
Close closes a DB. It's crucial to call it to ensure all the pending updates make their way to disk.
func (*DB) Load ¶ added in v0.9.0
Load reads a protobuf-encoded list of all entries from a reader and writes them to the database. This can be used to restore the database from a backup made by calling DB.Dump().
DB.Load() should be called on a database that is not running any other concurrent transactions while it is running.
func (*DB) NewTransaction ¶ added in v0.9.0
NewTransaction creates a new transaction. Badger supports concurrent execution of transactions, providing serializable snapshot isolation, avoiding write skews. Badger achieves this by tracking the keys read and at Commit time, ensuring that these read keys weren't concurrently modified by another transaction.
For read-only transactions, set update to false. In this mode, we don't track the rows read for any changes. Thus, any long running iterations done in this mode wouldn't pay this overhead.
Running transactions concurrently is OK. However, a transaction itself isn't thread safe, and should only be run serially. It doesn't matter if a transaction is created by one goroutine and passed down to other, as long as the Txn APIs are called serially.
When you create a new transaction, it is absolutely essential to call Discard(). This should be done irrespective of what the update param is set to. Commit API internally runs Discard, but running it twice wouldn't cause any issues.
txn := db.NewTransaction(false) defer txn.Discard() // Call various APIs.
func (*DB) PurgeOlderVersions ¶ added in v0.9.0
PurgeOlderVersions deletes older versions of all keys.
This function could be called prior to doing garbage collection to clean up older versions that are no longer needed. The caller must make sure that there are no long-running read transactions running before this function is called, otherwise they will not work as expected.
func (*DB) PurgeVersionsBelow ¶ added in v0.9.0
PurgeVersionsBelow will delete all versions of a key below the specified version
func (*DB) RunValueLogGC ¶ added in v0.9.0
RunValueLogGC would trigger a value log garbage collection with no guarantees that a call would result in a space reclaim. Every run would in the best case rewrite only one log file. So, repeated calls may be necessary.
The way it currently works is that it would randomly pick up a value log file, and sample it. If the sample shows that we can discard at least discardRatio space of that file, it would be rewritten. Else, an ErrNoRewrite error would be returned indicating that the GC didn't result in any file rewrite.
We recommend setting discardRatio to 0.5, thus indicating that a file be rewritten if half the space can be discarded. This results in a lifetime value log write amplification of 2 (1 from original write + 0.5 rewrite + 0.25 + 0.125 + ... = 2). Setting it to higher value would result in fewer space reclaims, while setting it to a lower value would result in more space reclaims at the cost of increased activity on the LSM tree. discardRatio must be in the range (0.0, 1.0), both endpoints excluded, otherwise an ErrInvalidRequest is returned.
Only one GC is allowed at a time. If another value log GC is running, or DB has been closed, this would return an ErrRejected.
Note: Every time GC is run, it would produce a spike of activity on the LSM tree.
type Item ¶ added in v0.9.0
type Item struct {
// contains filtered or unexported fields
}
Item is returned during iteration. Both the Key() and Value() output is only valid until iterator.Next() is called.
func (*Item) EstimatedSize ¶ added in v0.9.0
EstimatedSize returns approximate size of the key-value pair.
This can be called while iterating through a store to quickly estimate the size of a range of key-value pairs (without fetching the corresponding values).
func (*Item) ExpiresAt ¶ added in v1.0.0
ExpiresAt returns a Unix time value indicating when the item will be considered expired. 0 indicates that the item will never expire.
func (*Item) Key ¶ added in v0.9.0
Key returns the key.
Key is only valid as long as item is valid, or transaction is valid. If you need to use it outside its validity, please copy it.
func (*Item) UserMeta ¶ added in v0.9.0
UserMeta returns the userMeta set by the user. Typically, this byte, optionally set by the user is used to interpret the value.
type Iterator ¶
type Iterator struct {
// contains filtered or unexported fields
}
Iterator helps iterating over the KV pairs in a lexicographically sorted order.
func (*Iterator) Close ¶
func (it *Iterator) Close()
Close would close the iterator. It is important to call this when you're done with iteration.
func (*Iterator) Item ¶
Item returns pointer to the current key-value pair. This item is only valid until it.Next() gets called.
func (*Iterator) Next ¶
func (it *Iterator) Next()
Next would advance the iterator by one. Always check it.Valid() after a Next() to ensure you have access to a valid it.Item().
func (*Iterator) Rewind ¶
func (it *Iterator) Rewind()
Rewind would rewind the iterator cursor all the way to zero-th position, which would be the smallest key if iterating forward, and largest if iterating backward. It does not keep track of whether the cursor started with a Seek().
func (*Iterator) Seek ¶
Seek would seek to the provided key if present. If absent, it would seek to the next smallest key greater than provided if iterating in the forward direction. Behavior would be reversed is iterating backwards.
func (*Iterator) ValidForPrefix ¶
ValidForPrefix returns false when iteration is done or when the current key is not prefixed by the specified prefix.
type IteratorOptions ¶
type IteratorOptions struct { // Indicates whether we should prefetch values during iteration and store them. PrefetchValues bool // How many KV pairs to prefetch while iterating. Valid only if PrefetchValues is true. PrefetchSize int Reverse bool // Direction of iteration. False is forward, true is backward. AllVersions bool // Fetch all valid versions of the same key. }
IteratorOptions is used to set options when iterating over Badger key-value stores.
This package provides DefaultIteratorOptions which contains options that should work for most applications. Consider using that as a starting point before customizing it for your own needs.
type ManagedDB ¶ added in v0.9.0
type ManagedDB struct {
*DB
}
ManagedDB allows end users to manage the transactions themselves. Transaction start and commit timestamps are set by end-user.
This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.
WARNING: This is an experimental feature and may be changed significantly in a future release. So please proceed with caution.
func OpenManaged ¶ added in v0.9.0
OpenManaged returns a new ManagedDB, which allows more control over setting transaction timestamps.
This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.
func (*ManagedDB) NewTransaction ¶ added in v0.9.0
NewTransaction overrides DB.NewTransaction() and panics when invoked. Use NewTransactionAt() instead.
func (*ManagedDB) NewTransactionAt ¶ added in v0.9.0
NewTransactionAt follows the same logic as DB.NewTransaction(), but uses the provided read timestamp.
This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.
type Manifest ¶
type Manifest struct { Levels []levelManifest Tables map[uint64]tableManifest // Contains total number of creation and deletion changes in the manifest -- used to compute // whether it'd be useful to rewrite the manifest. Creations int Deletions int }
Manifest represnts the contents of the MANIFEST file in a Badger store.
The MANIFEST file describes the startup state of the db -- all LSM files and what level they're at.
It consists of a sequence of ManifestChangeSet objects. Each of these is treated atomically, and contains a sequence of ManifestChange's (file creations/deletions) which we use to reconstruct the manifest at startup.
func ReplayManifestFile ¶
ReplayManifestFile reads the manifest file and constructs two manifest objects. (We need one immutable copy and one mutable copy of the manifest. Easiest way is to construct two of them.) Also, returns the last offset after a completely read manifest entry -- the file must be truncated at that point before further appends are made (if there is a partial entry after that). In normal conditions, truncOffset is the file size.
type Options ¶
type Options struct { // 1. Mandatory flags // ------------------- // Directory to store the data in. Should exist and be writable. Dir string // Directory to store the value log in. Can be the same as Dir. Should // exist and be writable. ValueDir string // 2. Frequently modified flags // ----------------------------- // Sync all writes to disk. Setting this to true would slow down data // loading significantly. SyncWrites bool // How should LSM tree be accessed. TableLoadingMode options.FileLoadingMode // 3. Flags that user might want to review // ---------------------------------------- // The following affect all levels of LSM tree. MaxTableSize int64 // Each table (or file) is at most this size. LevelSizeMultiplier int // Equals SizeOf(Li+1)/SizeOf(Li). MaxLevels int // Maximum number of levels of compaction. // If value size >= this threshold, only store value offsets in tree. ValueThreshold int // Maximum number of tables to keep in memory, before stalling. NumMemtables int // The following affect how we handle LSM tree L0. // Maximum number of Level 0 tables before we start compacting. NumLevelZeroTables int // If we hit this number of Level 0 tables, we will stall until L0 is // compacted away. NumLevelZeroTablesStall int // Maximum total size for L1. LevelOneSize int64 // Size of single value log file. ValueLogFileSize int64 // Number of compaction workers to run concurrently. NumCompactors int // 4. Flags for testing purposes // ------------------------------ DoNotCompact bool // Stops LSM tree from compactions. // contains filtered or unexported fields }
Options are params for creating DB object.
This package provides DefaultOptions which contains options that should work for most applications. Consider using that as a starting point before customizing it for your own needs.
type Txn ¶ added in v0.9.0
type Txn struct {
// contains filtered or unexported fields
}
Txn represents a Badger transaction.
func (*Txn) Commit ¶ added in v0.9.0
Commit commits the transaction, following these steps:
1. If there are no writes, return immediately.
2. Check if read rows were updated since txn started. If so, return ErrConflict.
3. If no conflict, generate a commit timestamp and update written rows' commit ts.
4. Batch up all writes, write them to value log and LSM tree.
5. If callback is provided, Badger will return immediately after checking for conflicts. Writes to the database will happen in the background. If there is a conflict, an error will be returned and the callback will not run. If there are no conflicts, the callback will be called in the background upon successful completion of writes or any error during write.
If error is nil, the transaction is successfully committed. In case of a non-nil error, the LSM tree won't be updated, so there's no need for any rollback.
func (*Txn) CommitAt ¶ added in v0.9.0
CommitAt commits the transaction, following the same logic as Commit(), but at the given commit timestamp. This will panic if not used with ManagedDB.
This is only useful for databases built on top of Badger (like Dgraph), and can be ignored by most users.
func (*Txn) Delete ¶ added in v0.9.0
Delete deletes a key. This is done by adding a delete marker for the key at commit timestamp. Any reads happening before this timestamp would be unaffected. Any reads after this commit would see the deletion.
func (*Txn) Discard ¶ added in v0.9.0
func (txn *Txn) Discard()
Discard discards a created transaction. This method is very important and must be called. Commit method calls this internally, however, calling this multiple times doesn't cause any issues. So, this can safely be called via a defer right when transaction is created.
NOTE: If any operations are run on a discarded transaction, ErrDiscardedTxn is returned.
func (*Txn) Get ¶ added in v0.9.0
Get looks for key and returns corresponding Item. If key is not found, ErrKeyNotFound is returned.
func (*Txn) NewIterator ¶ added in v0.9.0
func (txn *Txn) NewIterator(opt IteratorOptions) *Iterator
NewIterator returns a new iterator. Depending upon the options, either only keys, or both key-value pairs would be fetched. The keys are returned in lexicographically sorted order. Using prefetch is highly recommended if you're doing a long running iteration. Avoid long running iterations in update transactions.
Example ¶
dir, err := ioutil.TempDir("", "badger") if err != nil { log.Fatal(err) } defer os.RemoveAll(dir) opts := DefaultOptions opts.Dir = dir opts.ValueDir = dir db, err := Open(opts) if err != nil { log.Fatal(err) } defer db.Close() bkey := func(i int) []byte { return []byte(fmt.Sprintf("%09d", i)) } bval := func(i int) []byte { return []byte(fmt.Sprintf("%025d", i)) } txn := db.NewTransaction(true) // Fill in 1000 items n := 1000 for i := 0; i < n; i++ { err := txn.Set(bkey(i), bval(i)) if err != nil { log.Fatal(err) } } err = txn.Commit(nil) if err != nil { log.Fatal(err) } opt := DefaultIteratorOptions opt.PrefetchSize = 10 // Iterate over 1000 items var count int err = db.View(func(txn *Txn) error { it := txn.NewIterator(opt) for it.Rewind(); it.Valid(); it.Next() { count++ } return nil }) if err != nil { log.Fatal(err) } fmt.Printf("Counted %d elements", count)
Output: Counted 1000 elements
func (*Txn) Set ¶ added in v0.9.0
Set adds a key-value pair to the database.
It will return ErrReadOnlyTxn if update flag was set to false when creating the transaction.
func (*Txn) SetWithMeta ¶ added in v1.0.0
SetWithMeta adds a key-value pair to the database, along with a metadata byte. This byte is stored alongside the key, and can be used as an aid to interpret the value or store other contextual bits corresponding to the key-value pair.
func (*Txn) SetWithTTL ¶ added in v1.0.0
SetWithTTL adds a key-value pair to the database, along with a time-to-live (TTL) setting. A key stored with with a TTL would automatically expire after the time has elapsed , and be eligible for garbage collection.
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
cmd
|
|
badger_info
badger_info Usage: badger_info --dir x [--value-dir y] This command prints information about the badger key-value store.
|
badger_info Usage: badger_info --dir x [--value-dir y] This command prints information about the badger key-value store. |
Package protos is a generated protocol buffer package.
|
Package protos is a generated protocol buffer package. |