golmdb

v0.0.0-...-ea88c29
Published: Nov 13, 2023 License: BSD-3-Clause Imports: 13 Imported by: 3

README

GoLMDB

This is a high-level binding to LMDB.

go get wellquite.org/golmdb@latest

This binding uses cgo and so to build, you'll need a working cgo environment: a supported C compiler suite, alongside the LMDB library and headers. LMDB is extremely widely used and is available on "all" platforms, so it shouldn't be difficult to get it on your platform. This binding has been predominately developed with version 0.9.29, which is the current version at the time of writing.

There are several Go bindings to LMDB available. All of them (that I can find) are fairly low-level and tend to mirror the C API into Go. This provides a lot of flexibility, but it leaves a lot of work to do too.

This binding is high-level. It does not attempt to support all the features of LMDB, nor expose the full low-level LMDB API. It provides:

  • batching of updates: read-write transactions will be batched together automatically up to some limit set by a parameter. This allows you to control (to some extent) the trade-off between latency and throughput: smaller batches will result in more fsyncs going on, but may reduce latency; larger batches may increase latency, but there will be fewer fsyncs.
  • automatic resizing: LMDB returns an error if its database file fills up. LMDB also has an API to increase the size of its database file. In general LMDB recommends starting with a huge file size for its database, and relying on the underlying filesystem supporting sparse files, which most modern file systems do. However, there's still always the risk that you end up putting in more data than you thought you would, so this binding automatically copes and increases the size when necessary.
  • minimal copy of data from Go to C and back again. In most cases, Puts of a key-value pair can be written directly to disk without further copies being taken. Reads can access the data on disk with just a single copy (i.e. the copy managed by the OS as part of the mmap) provided care is taken to not use the data beyond the lifetime of the transaction.

Many of the more advanced flags from LMDB are still available, for example flags to turn off syncing. These flags can make updates/writes appear to be much faster. But they are also foot-guns: they can be used safely in certain circumstances, but they're highly likely to blow your foot off and destroy your data if you're not careful. This binding will not save you from yourself! Refer back to the original LMDB docs if you're in any doubt.

There are a few more words about this over on my blog

Documentation

Constants

Environment flags.

NB WriteMap is not exported because it's incompatible with nested transactions, and this binding internally relies on nested txns.

See http://www.lmdb.tech/doc/group__mdb__env.html and http://www.lmdb.tech/doc/group__mdb.html#ga32a193c6bf4d7d5c5d579e71f22e9340

const (
	ReverseKey = DatabaseFlag(C.MDB_REVERSEKEY)
	DupSort    = DatabaseFlag(C.MDB_DUPSORT)
	IntegerKey = DatabaseFlag(C.MDB_INTEGERKEY)
	DupFixed   = DatabaseFlag(C.MDB_DUPFIXED)
	IntegerDup = DatabaseFlag(C.MDB_INTEGERDUP)
	ReverseDup = DatabaseFlag(C.MDB_REVERSEDUP)
	Create     = DatabaseFlag(C.MDB_CREATE)
)

Database flags

The default (0 flags) is for the key to be of variable length, and to be sorted lexicographically ascending. Each key can only have one value. Note that no key can be longer than 511 bytes; changing this limit requires a custom compilation of LMDB itself.

ReverseKey sets the key to be of variable length as default, but sorted lexicographically descending.

IntegerKey makes the keys be interpreted as integers, most likely 32-bit unsigned ints on 32-bit systems, and 64-bit unsigned ints on 64-bit systems. The keys must all be of the same size. LMDB interprets these unsigned ints (for the purpose of sorting and searching) in "native endianness". The authors of Go, Rob Pike in particular, don't like this concept (https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html), so it's up to you to know that x86 is Little Endian, and that you should therefore be using `binary.LittleEndian` to do the encoding. If you're running code on a Big Endian architecture then you'll need to use `binary.BigEndian` instead.
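
As a minimal sketch of the encoding advice above, assuming a little-endian host (such as x86-64) and 64-bit keys. The helper names encodeKey and decodeKey are hypothetical and are not part of golmdb.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeKey is a hypothetical helper: it encodes id as 8 little-endian
// bytes, matching the native byte order of x86-64. On a big-endian host
// you would use binary.BigEndian instead.
func encodeKey(id uint64) []byte {
	key := make([]byte, 8)
	binary.LittleEndian.PutUint64(key, id)
	return key
}

// decodeKey reverses encodeKey.
func decodeKey(key []byte) uint64 {
	return binary.LittleEndian.Uint64(key)
}

func main() {
	key := encodeKey(258) // 258 = 0x0102, so the low byte (2) comes first
	fmt.Println(key, decodeKey(key))
}
```

Every key produced this way has the same length, which is what IntegerKey requires.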

DupSort says that a single key can have multiple values. Note when using this that the key _and_ the value together must be less than 511 bytes.
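
A pre-flight size check can catch oversized entries before they reach LMDB. The sketch below takes the limits quoted in the text at face value for a default LMDB build; checkSizes, maxKeySize and errTooLong are hypothetical names, not part of golmdb.

```go
package main

import (
	"errors"
	"fmt"
)

// maxKeySize is the limit quoted in the text for a default LMDB build.
const maxKeySize = 511

var errTooLong = errors.New("key (or key plus value) exceeds the LMDB size limit")

// checkSizes is a hypothetical pre-flight check. For plain databases only
// the key is limited. For DupSort databases the text says the key and the
// value together must be less than 511 bytes.
func checkSizes(key, val []byte, dupSort bool) error {
	if len(key) > maxKeySize {
		return errTooLong
	}
	if dupSort && len(key)+len(val) >= maxKeySize {
		return errTooLong
	}
	return nil
}

func main() {
	key := make([]byte, 300)
	val := make([]byte, 300)
	fmt.Println(checkSizes(key, val, false)) // plain database: only the key counts
	fmt.Println(checkSizes(key, val, true))  // DupSort: key and value count together
}
```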

With DupSort, you can add other flags:

IntegerDup says that the values should be treated as unsigned ints. Note that this must be used in combination with DupSort, and it does not imply IntegerKey: the keys can still be variable-length byte slices if you wish.

Similarly, ReverseDup and DupFixed affect the sorting of values within a common key, and must be used in combination with DupSort, but do not imply anything about the nature of the keys.

See also http://www.lmdb.tech/doc/group__mdb__dbi__open.html

const (
	NoOverwrite = PutFlag(C.MDB_NOOVERWRITE)
	NoDupData   = PutFlag(C.MDB_NODUPDATA)
	Current     = PutFlag(C.MDB_CURRENT)

	Append    = PutFlag(C.MDB_APPEND)
	AppendDup = PutFlag(C.MDB_APPENDDUP)
)

Put flags

See http://www.lmdb.tech/doc/group__mdb__put.html

const (
	KeyExist        = LMDBError(C.MDB_KEYEXIST)
	NotFound        = LMDBError(C.MDB_NOTFOUND)
	PageNotFound    = LMDBError(C.MDB_PAGE_NOTFOUND)
	Corrupted       = LMDBError(C.MDB_CORRUPTED)
	PanicMDB        = LMDBError(C.MDB_PANIC)
	VersionMismatch = LMDBError(C.MDB_VERSION_MISMATCH)
	Invalid         = LMDBError(C.MDB_INVALID)
	MapFull         = LMDBError(C.MDB_MAP_FULL)
	DBsFull         = LMDBError(C.MDB_DBS_FULL)
	ReadersFull     = LMDBError(C.MDB_READERS_FULL)
	TLSFull         = LMDBError(C.MDB_TLS_FULL)
	TxnFull         = LMDBError(C.MDB_TXN_FULL)
	CursorFull      = LMDBError(C.MDB_CURSOR_FULL)
	PageFull        = LMDBError(C.MDB_PAGE_FULL)
	MapResized      = LMDBError(C.MDB_MAP_RESIZED)
	Incompatible    = LMDBError(C.MDB_INCOMPATIBLE)
	BadRSlot        = LMDBError(C.MDB_BAD_RSLOT)
	BadTxt          = LMDBError(C.MDB_BAD_TXN)
	BadValSize      = LMDBError(C.MDB_BAD_VALSIZE)
	BadDBI          = LMDBError(C.MDB_BAD_DBI)
)

Return codes

KeyExist and NotFound are return codes you may well encounter and expect to deal with in application code. The rest of them probably indicate something has gone terribly wrong.
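
One way to act on this split is to classify errors before deciding what to do. The sketch below uses a local stand-in type, lmdbError, so that it compiles without cgo. With the real package you would compare against golmdb.NotFound and golmdb.KeyExist instead. The classify helper is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// lmdbError is a local stand-in for golmdb.LMDBError. The numeric values
// mirror the constants in LMDB's lmdb.h.
type lmdbError int

const (
	keyExist lmdbError = -30799 // MDB_KEYEXIST
	notFound lmdbError = -30798 // MDB_NOTFOUND
)

func (e lmdbError) Error() string { return fmt.Sprintf("lmdb error %d", int(e)) }

// classify sorts an error into the cases application code usually cares
// about: success, an expected condition, or something more serious.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, notFound):
		return "missing"
	case errors.Is(err, keyExist):
		return "duplicate"
	default:
		return "fatal"
	}
}

func main() {
	wrapped := fmt.Errorf("lookup user: %w", notFound)
	fmt.Println(classify(nil), classify(wrapped), classify(lmdbError(-30792)))
}
```

Using errors.Is rather than == means the check still works if a caller has wrapped the error with %w.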

See http://www.lmdb.tech/doc/group__errors.html

const Version = C.MDB_VERSION_STRING

The version of LMDB that has been linked against.

Types

type DBRef

type DBRef C.MDB_dbi

A single LMDB database can contain several top-level named "databases". These can be created and accessed by using the DBRef() method on ReadOnlyTxn and ReadWriteTxn. The DBRef is a reference to such a named top-level "database". They cannot be nested further, and you ideally only want to use a handful of these.

See http://www.lmdb.tech/doc/group__mdb.html#gac08cad5b096925642ca359a6d6f0562a

type DatabaseFlag

type DatabaseFlag C.uint

Used in calls to ReadOnlyTxn.DBRef()

type EnvironmentFlag

type EnvironmentFlag C.uint

Used in calls to NewLMDB() and NewManagedLMDB()

type LMDBClient

type LMDBClient struct {
	*actors.ClientBase
	// contains filtered or unexported fields
}

A client to the whole LMDB database. The client allows you to run Views (read-only transactions), Updates (read-write transactions), and close/terminate the database. A single client is safe for any number of go-routines to use concurrently.

func NewLMDB

func NewLMDB(log zerolog.Logger, path string, mode fs.FileMode, numReaders, numDBs uint, flags EnvironmentFlag, batchSize uint) (*LMDBClient, error)

NewLMDB opens an LMDB database at the given path, creating it if necessary, and returns a client to that LMDB database.

NoTLS is always added to the flags automatically. Passing 0 for flags is a perfectly sensible default. Using NoReadAhead will probably help if you expect your dataset to grow large (especially larger than RAM).

If the flags include ReadOnly then the database is opened in read-only mode, and all calls to Update will immediately return an error. When opening with ReadOnly, the database must already exist.

If the flags do not include ReadOnly then the database will be created if necessary. An actor will be spawned to run and batch Update transactions. The actor will use the batchSize parameter to control the maximum number of Update transactions that get batched together. This is a maximum: if the actor has received some smaller number of Update transactions and there are no further Update transactions queued up, then it'll run and commit what it's received without further delay. A reasonable starting value for batchSize is the number of go-routines that could concurrently submit Update transactions.
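
The batching rule described above can be modelled with a channel and a non-blocking receive. This is an illustration of the rule only, not golmdb's implementation. collectBatch is a hypothetical name and the jobs are plain ints standing in for Update transactions.

```go
package main

import "fmt"

// collectBatch is an illustrative stand-in, not golmdb's code. It blocks
// for the first job, then takes more without blocking until the queue is
// momentarily empty or max jobs have been gathered.
func collectBatch(queue <-chan int, max int) []int {
	batch := []int{<-queue}
	for len(batch) < max {
		select {
		case job := <-queue:
			batch = append(batch, job)
		default:
			return batch // nothing else queued: run what we have
		}
	}
	return batch
}

func main() {
	queue := make(chan int, 8)
	for i := 1; i <= 5; i++ {
		queue <- i
	}
	fmt.Println(collectBatch(queue, 3)) // capped by max
	fmt.Println(collectBatch(queue, 3)) // queue drained before reaching max
}
```

With five queued jobs and a maximum of three, the first batch is full and the second holds the remaining two, so a quiet queue never waits for a batch to fill.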

func NewManagedLMDB

func NewManagedLMDB(manager actors.ManagerClient, path string, mode fs.FileMode, numReaders, numDBs uint, flags EnvironmentFlag, batchSize uint) (*LMDBClient, error)

NewManagedLMDB opens an LMDB database at the given path, creating it if necessary, and returns a client to that LMDB database.

This is the same as NewLMDB, with the exception that the spawned actor (if it is spawned) is spawned as a child of the manager, rather than an unmanaged stand-alone actor.

func (*LMDBClient) Copy

func (self *LMDBClient) Copy(path string, compact bool) error

Copy the entire database to a new path, optionally compacting it.

See http://www.lmdb.tech/doc/group__mdb.html#ga3bf50d7793b36aaddf6b481a44e24244

This can be done with the database in use: it allows you to take backups of the dataset without stopping anything. However, as the docs note, this is essentially a read-only transaction to read the entire database and copy it out. If that takes a long time (because it's a large database) and there are updates to the database going on at the same time, then the original database can grow in size due to needing to keep the old data around so that the read-only transaction doing the copy sees a consistent snapshot of the entire database.

func (*LMDBClient) Sync

func (self *LMDBClient) Sync(force bool) error

Manually sync the database to disk.

See http://www.lmdb.tech/doc/group__mdb.html#ga85e61f05aa68b520cc6c3b981dba5037

Unless you're using MapAsync or NoSync or NoMetaSync flags when opening the LMDB database, you do not need to worry about calling this. If you are using any of those flags then LMDB will not be syncing data to disk on every transaction commit, which raises the possibility of data loss or corruption in the event of a crash or unexpected exit. Nevertheless, those flags are sometimes useful, for example when rapidly loading a data set into the database. An explicit call to Sync is then needed to flush everything through onto disk.

func (*LMDBClient) TerminateSync

func (self *LMDBClient) TerminateSync()

Terminates the actor for Update transactions (if it's running), and then shuts down the LMDB database.

You must make sure that all concurrently running transactions have finished before you call this method: this method will not wait for concurrent View transactions to finish (or prevent new ones from starting), and it will not wait for calls to Update to complete. It is your responsibility to make sure all users of the client are finished and shutdown before calling TerminateSync.

Note that this does not call mdb_env_sync. So if you've opened the database with NoSync or NoMetaSync or MapAsync then you will need to call Sync() before TerminateSync(); the Sync in TerminateSync merely refers to the fact this method is synchronous - it'll only return once the actor has fully terminated and the LMDB database has been closed.
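
The ordering above (Sync, then TerminateSync) can be wrapped in a small helper. In this sketch the syncTerminator interface, shutdown and fakeDB are all hypothetical. Only the two method signatures are taken from the documentation of *LMDBClient. The fake exists so that the example runs without LMDB.

```go
package main

import "fmt"

// syncTerminator lists the two methods used here, with the signatures
// documented for *LMDBClient. The interface itself is hypothetical.
type syncTerminator interface {
	Sync(force bool) error
	TerminateSync()
}

// shutdown flushes to disk first and then closes. If the flush fails it
// still closes, but reports the error to the caller.
func shutdown(db syncTerminator) error {
	err := db.Sync(true) // force the flush even if syncing was disabled
	db.TerminateSync()
	return err
}

// fakeDB records call order so the sketch runs without LMDB.
type fakeDB struct{ calls []string }

func (f *fakeDB) Sync(force bool) error {
	f.calls = append(f.calls, "Sync")
	return nil
}

func (f *fakeDB) TerminateSync() {
	f.calls = append(f.calls, "TerminateSync")
}

func main() {
	db := &fakeDB{}
	fmt.Println(shutdown(db), db.calls)
}
```

Whether to close after a failed Sync is a judgment call. This sketch closes anyway and leaves the decision about the error to the caller.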

func (*LMDBClient) Update

func (self *LMDBClient) Update(fun func(rwtxn *ReadWriteTxn) error) error

Run an Update: a read-write transaction. The fun will not be run in the current go-routine.

If the fun returns a nil error, then the transaction will be committed. If the fun returns any non-nil error then the transaction will be aborted. Any non-nil error returned by the fun is returned from this method.

If the fun is run and returns a nil error, then it may still be run more than once. In this case, its transaction will be aborted (and a fresh transaction created), before it is re-run. I.e. the fun will never see the state of the database *after* it has already been run.

If the fun is run and returns a non-nil error then it will not be re-run.

Only a single Update transaction can run at a time; golmdb will manage this for you. An Update transaction can proceed concurrently with one or more View transactions.

Nested transactions are not supported.
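
Because the fun may be run more than once, any state it captures from outside the transaction should be reset at the start of each attempt. The sketch below models this without a database. runTwice is a hypothetical stand-in that simply re-runs a successful callback once, and collectIDs is likewise illustrative.

```go
package main

import "fmt"

// runTwice is a stand-in for a client that re-runs the callback once, as
// the text says Update may. It is not golmdb code and touches no database.
func runTwice(fun func() error) error {
	if err := fun(); err != nil {
		return err // a non-nil error is never retried
	}
	return fun()
}

// collectIDs gathers results into a captured slice. Resetting the slice at
// the top of the callback keeps a re-run from duplicating entries.
func collectIDs(run func(func() error) error) ([]int, error) {
	var ids []int
	err := run(func() error {
		ids = ids[:0] // discard anything left by an earlier attempt
		for i := 1; i <= 3; i++ {
			ids = append(ids, i)
		}
		return nil
	})
	return ids, err
}

func main() {
	ids, err := collectIDs(runTwice)
	fmt.Println(ids, err)
}
```

Without the reset line the slice would end up with six entries after two runs. The database itself is protected from this by the abort and fresh transaction, but captured Go variables are not.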

func (*LMDBClient) View

func (self *LMDBClient) View(fun func(rotxn *ReadOnlyTxn) error) (err error)

Run a View: a read-only transaction. The fun will be run in the current go-routine. Multiple calls to View can proceed concurrently. If there are write transactions going on concurrently, they may cause MapFull errors. If this happens, all read transactions will be interrupted and aborted, and will automatically be restarted.

As this is a read-only transaction, the transaction is aborted no matter what the fun returns. The error that the fun returns is returned from this method.

Nested transactions are not supported.

type LMDBError

type LMDBError C.int

An LMDB error. See the Return Codes in the Constants section.

func (LMDBError) Error

func (self LMDBError) Error() string

type PutFlag

type PutFlag C.uint

Used in calls to ReadWriteTxn.Put(), ReadWriteTxn.PutDupSort(), Cursor.Put(), and Cursor.PutDupSort()

type ReadOnlyCursor

type ReadOnlyCursor struct {
	// contains filtered or unexported fields
}

Cursors allow you to walk over a database, or over sections of one.

func (*ReadOnlyCursor) Close

func (self *ReadOnlyCursor) Close()

Close the current cursor.

You should call Close() on each cursor before the end of the transaction in which it was created.

func (*ReadOnlyCursor) Count

func (self *ReadOnlyCursor) Count() (count uint64, err error)

Only for DupSort. Return the number of values with the current key.

func (*ReadOnlyCursor) Current

func (self *ReadOnlyCursor) Current() (key, val []byte, err error)

Get the current key-value pair of the cursor.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) First

func (self *ReadOnlyCursor) First() (key, val []byte, err error)

Move to the first key-value pair of the database.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) FirstInSameKey

func (self *ReadOnlyCursor) FirstInSameKey() (val []byte, err error)

Only for DupSort. Move to the first key-value pair without changing the current key.

Do not write into the returned val byte slice. Doing so will cause a segfault.

func (*ReadOnlyCursor) Last

func (self *ReadOnlyCursor) Last() (key, val []byte, err error)

Move to the last key-value pair of the database.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) LastInSameKey

func (self *ReadOnlyCursor) LastInSameKey() (val []byte, err error)

Only for DupSort. Move to the last key-value pair without changing the current key.

Do not write into the returned val byte slice. Doing so will cause a segfault.

func (*ReadOnlyCursor) Next

func (self *ReadOnlyCursor) Next() (key, val []byte, err error)

Move to the next key-value pair.

For DupSort databases, move to the next value of the current key, if there is one, otherwise the first value of the next key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.
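
A full scan is typically a First followed by repeated calls to Next. The sketch below assumes that running off the end of the data is reported as NotFound, as in LMDB itself. That is an assumption about this binding rather than something stated above. The walker interface, walk, errNotFound and sliceCursor are hypothetical. Only the First and Next signatures come from the documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for golmdb.NotFound.
var errNotFound = errors.New("not found")

// walker holds the two cursor methods used below, with the signatures
// documented for ReadOnlyCursor.
type walker interface {
	First() (key, val []byte, err error)
	Next() (key, val []byte, err error)
}

// walk visits every pair in order. It assumes the end of the data is
// reported as errNotFound, and treats any other error as a failure.
func walk(c walker, visit func(key, val []byte)) error {
	key, val, err := c.First()
	for err == nil {
		visit(key, val)
		key, val, err = c.Next()
	}
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

// sliceCursor is an in-memory fake so the sketch runs without LMDB.
type sliceCursor struct {
	pairs [][2]string
	pos   int
}

func (s *sliceCursor) at(i int) ([]byte, []byte, error) {
	if i < 0 || i >= len(s.pairs) {
		return nil, nil, errNotFound
	}
	s.pos = i
	return []byte(s.pairs[i][0]), []byte(s.pairs[i][1]), nil
}

func (s *sliceCursor) First() (key, val []byte, err error) { return s.at(0) }
func (s *sliceCursor) Next() (key, val []byte, err error)  { return s.at(s.pos + 1) }

func main() {
	c := &sliceCursor{pairs: [][2]string{{"a", "1"}, {"b", "2"}}}
	err := walk(c, func(k, v []byte) { fmt.Printf("%s=%s\n", k, v) })
	fmt.Println(err)
}
```

With a real cursor, the visit function should only read the slices it is given, and should copy anything it wants to keep beyond the transaction.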

func (*ReadOnlyCursor) NextInSameKey

func (self *ReadOnlyCursor) NextInSameKey() (key, val []byte, err error)

Only for DupSort. Move to the next key-value pair, but only if the key is the same as the current key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) NextKey

func (self *ReadOnlyCursor) NextKey() (key, val []byte, err error)

Only for DupSort. Move to the first key-value pair of the next key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) Prev

func (self *ReadOnlyCursor) Prev() (key, val []byte, err error)

Move to the previous key-value pair.

For DupSort databases, move to the previous value of the current key, if there is one, otherwise the last value of the previous key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) PrevInSameKey

func (self *ReadOnlyCursor) PrevInSameKey() (key, val []byte, err error)

Only for DupSort. Move to the previous key-value pair, but only if the key is the same as the current key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) PrevKey

func (self *ReadOnlyCursor) PrevKey() (key, val []byte, err error)

Only for DupSort. Move to the last key-value pair of the previous key.

Do not write into the returned key or val byte slices. Doing so will cause a segfault.

func (*ReadOnlyCursor) SeekExactKey

func (self *ReadOnlyCursor) SeekExactKey(key []byte) (val []byte, err error)

Move to the key-value pair indicated by the given key.

If the exact key doesn't exist, returns NotFound.

For DupSort databases, move to the first value of the given key.

Do not write into the returned val byte slice. Doing so will cause a segfault.

func (*ReadOnlyCursor) SeekExactKeyAndValue

func (self *ReadOnlyCursor) SeekExactKeyAndValue(keyIn, valIn []byte) (err error)

Only for DupSort. Move to the key-value pair indicated.

If the exact key-value pair doesn't exist, return NotFound.

func (*ReadOnlyCursor) SeekGreaterThanOrEqualKey

func (self *ReadOnlyCursor) SeekGreaterThanOrEqualKey(keyIn []byte) (keyOut, val []byte, err error)

Move to the key-value pair indicated by the given key.

If the exact key doesn't exist, move to the nearest key greater than the given key.

Do not write into the returned keyOut or val byte slices. Doing so will cause a segfault.
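
The "greater than or equal" positioning can be modelled on an ascending slice of keys with a binary search. This is only a model of the semantics under the default lexicographic ordering, not how LMDB or golmdb implement it. seekGE is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seekGE models "seek greater than or equal" over an ascending slice of
// keys. It returns the index of the first key >= target, or -1 if there
// is none (where a real cursor would presumably report NotFound).
func seekGE(keys [][]byte, target []byte) int {
	i := sort.Search(len(keys), func(i int) bool {
		return bytes.Compare(keys[i], target) >= 0
	})
	if i == len(keys) {
		return -1
	}
	return i
}

func main() {
	keys := [][]byte{[]byte("apple"), []byte("cherry"), []byte("plum")}
	fmt.Println(seekGE(keys, []byte("cherry"))) // exact match
	fmt.Println(seekGE(keys, []byte("banana"))) // nearest greater key
	fmt.Println(seekGE(keys, []byte("zebra")))  // nothing at or after target
}
```

Both "cherry" and "banana" land on index 1, which is why the method returns keyOut: the caller needs it to tell an exact hit from a nearest-greater hit.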

func (*ReadOnlyCursor) SeekGreaterThanOrEqualKeyAndValue

func (self *ReadOnlyCursor) SeekGreaterThanOrEqualKeyAndValue(keyIn, valIn []byte) (valOut []byte, err error)

Only for DupSort. Move to the key-value pair indicated.

If the exact key-value pair doesn't exist, move to the nearest value in the same key greater than the given value. I.e. this will not move to a greater key, only a greater value.

If there is no such value within the current key, return NotFound.

type ReadOnlyTxn

type ReadOnlyTxn struct {
	// contains filtered or unexported fields
}

func (*ReadOnlyTxn) DBRef

func (self *ReadOnlyTxn) DBRef(name string, flags DatabaseFlag) (DBRef, error)

DBRef gets a reference to a named database within the LMDB. If you provide the flag Create then it'll be created if it doesn't already exist (provided you're in an Update transaction).

If you call this from an Update and it succeeds, then once that txn commits, the DBRef can be used by other transactions (both Updates and Views) until it is terminated/closed.

If you call this from a View and it succeeds, then the DBRef is only valid until the end of that View transaction.

See http://www.lmdb.tech/doc/group__mdb.html#gac08cad5b096925642ca359a6d6f0562a

func (*ReadOnlyTxn) Get

func (self *ReadOnlyTxn) Get(db DBRef, key []byte) ([]byte, error)

Get the value corresponding to the key from the database.

The returned bytes are owned by the database. Do not modify them. They are valid only until a subsequent update operation, or the end of the transaction. If you need the value around longer than that, you must take a copy.
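
Taking a copy is a one-liner, but it is easy to forget. In the sketch below, cloneBytes is a hypothetical helper, and txnOwned merely stands in for a slice returned by Get. No database is involved.

```go
package main

import "fmt"

// cloneBytes returns a copy that does not share memory with b, so it stays
// usable after the transaction that produced b has ended.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	// txnOwned stands in for a slice returned by Get inside a transaction.
	txnOwned := []byte("value")
	kept := cloneBytes(txnOwned)

	// Simulate the underlying memory changing after the transaction ends.
	txnOwned[0] = 'X'
	fmt.Println(string(kept))
}
```

Converting to a string inside the transaction, as in string(val), also produces an independent copy.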

See http://www.lmdb.tech/doc/group__mdb.html#ga8bf10cd91d3f3a83a34d04ce6b07992d

func (*ReadOnlyTxn) NewCursor

func (self *ReadOnlyTxn) NewCursor(db DBRef) (*ReadOnlyCursor, error)

Create a new read-only cursor.

You should call Close() on each cursor before the end of the transaction. The exact rules for cursor lifespans are more complex, and are documented at http://www.lmdb.tech/doc/group__mdb.html#ga9ff5d7bd42557fd5ee235dc1d62613aa but it's simplest if you treat each cursor as scoped to the lifespan of its transaction, and you explicitly Close() each cursor before the end of the transaction.

See http://www.lmdb.tech/doc/group__mdb.html#ga9ff5d7bd42557fd5ee235dc1d62613aa

type ReadWriteCursor

type ReadWriteCursor struct {
	ReadOnlyCursor
}

A ReadWriteCursor extends ReadOnlyCursor with methods for mutating the database.

func (*ReadWriteCursor) Delete

func (self *ReadWriteCursor) Delete(flags PutFlag) error

Delete the key-value pair at the cursor.

The only possible flag is NoDupData which is only for DupSort databases, and means "delete all values for the current key".

See http://www.lmdb.tech/doc/group__mdb.html#ga26a52d3efcfd72e5bf6bd6960bf75f95

type ReadWriteTxn

type ReadWriteTxn struct {
	ReadOnlyTxn
}

A ReadWriteTxn extends ReadOnlyTxn with methods for mutating the database.

func (*ReadWriteTxn) Delete

func (self *ReadWriteTxn) Delete(db DBRef, key, val []byte) error

Delete a key-value pair from the database.

The val is only necessary if you're using DupSort. If not, it's fine to use nil as val.

See http://www.lmdb.tech/doc/group__mdb.html#gab8182f9360ea69ac0afd4a4eaab1ddb0

func (*ReadWriteTxn) Drop

func (self *ReadWriteTxn) Drop(db DBRef) error

Drop the database. Not only are all key-value pairs removed from the database, but the database itself is removed, which means calling DBRef(name,0) will fail: the database will need to be recreated before it can be used again.

See http://www.lmdb.tech/doc/group__mdb.html#gab966fab3840fc54a6571dfb32b00f2db

func (*ReadWriteTxn) Empty

func (self *ReadWriteTxn) Empty(db DBRef) error

Empty the database. All key-value pairs are removed from the database.

See http://www.lmdb.tech/doc/group__mdb.html#gab966fab3840fc54a6571dfb32b00f2db

func (*ReadWriteTxn) NewCursor

func (self *ReadWriteTxn) NewCursor(db DBRef) (*ReadWriteCursor, error)

Create a new read-write cursor.

You should call Close() on each cursor before the end of the transaction. The exact rules for cursor lifespans are more complex, and are documented at http://www.lmdb.tech/doc/group__mdb.html#ga9ff5d7bd42557fd5ee235dc1d62613aa but it's simplest if you treat each cursor as scoped to the lifespan of its transaction, and you explicitly Close() each cursor before the end of the transaction.

See http://www.lmdb.tech/doc/group__mdb.html#ga9ff5d7bd42557fd5ee235dc1d62613aa

func (*ReadWriteTxn) Put

func (self *ReadWriteTxn) Put(db DBRef, key, val []byte, flags PutFlag) error

Put a key-value pair into the database.

See http://www.lmdb.tech/doc/group__mdb.html#ga4fa8573d9236d54687c61827ebf8cac0
