Documentation ¶
Overview ¶
Package dbm (experimental/WIP) implements a simple database engine, a hybrid of a hierarchical[1] and/or a key-value one[2].
A dbm database stores arbitrary data in named multidimensional arrays and/or named flat Files. It aims for a small DB footprint rather than for access speed. Dbm was written for a project running on an embedded ARM Linux system.
Experimental release notes ¶
This is an experimental release. However, it is now nearly feature complete.
Key collating that respects a client-supplied locale is not yet implemented. It is planned for when exp/locale materializes. Because of this, the dbm API doesn't yet allow defining anything other than the default collating of keys. At least some sort of client-defined collating will be incorporated after the Go 1.1 release.
No serious attempts to profile and/or improve performance were made (TODO).
WARNING: THE DBM API IS SUBJECT TO CHANGE.
WARNING: THE DBM FILE FORMAT IS SUBJECT TO CHANGE.
WARNING: NOT READY FOR USE IN PRODUCTION.
Targeted use cases ¶
At the moment, using disk-based dbm DBs with 2PC/WAL/recovery enabled is supposed to be safe (modulo any unknown bugs).
Concurrent access ¶
All of the dbm API is (intended to be) safe for concurrent use by multiple goroutines. However, data races stemming from, for example, one goroutine seeing a value in a tree and another deleting it before the first one gets back to process it, must be handled outside of dbm. Still, any CRUD operations, as in this data race example, are atomic and safe per se and will not corrupt the structural integrity of the database. Uncoordinated updates of a DB may corrupt its semantic and/or schema integrity, though. Failed DB updates not performed within a structural transaction may corrupt the DB.
Also please note that passing racy arguments to an otherwise concurrent safe API makes that API act racy as well.
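The check-then-act race described above can be illustrated without dbm at all. The following is a minimal sketch, assuming a hypothetical `store` type whose individual operations are atomic (standing in for a dbm array) and a hypothetical `claimer` that adds the client-side coordination; none of these names are part of the dbm API.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a dbm array: every method is atomic
// on its own, but a sequence of calls is not.
type store struct {
	mu sync.Mutex
	m  map[string]int
}

func (s *store) get(k string) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.m[k]
	return v, ok
}

func (s *store) del(k string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, k)
}

// claimer adds coordination outside the store: the get-then-delete sequence
// runs under one lock, so only one goroutine can see and consume a value.
type claimer struct {
	mu sync.Mutex
	s  *store
}

func (c *claimer) claim(k string) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.s.get(k)
	if ok {
		c.s.del(k)
	}
	return v, ok
}

// race starts n goroutines claiming the same key and reports how many won.
func race(n int) int {
	c := &claimer{s: &store{m: map[string]int{"job": 42}}}
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		wins int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, ok := c.claim("job"); ok {
				mu.Lock()
				wins++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return wins
}

func main() {
	fmt.Println("goroutines that claimed the value:", race(8))
}
```

Without the outer lock in `claimer`, several goroutines could observe the value before any of them deletes it, even though each single `get` and `del` stays safe.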
Scalars ¶
Keys and values of an Array are multi-valued and every value must be a "scalar". Types called "scalar" are:
    nil (the typeless one)
    bool
    all integral types: [u]int8, [u]int16, [u]int32, [u]int, [u]int64
    all floating point types: float32, float64
    all complex types: complex64, complex128
    []byte (64kB max)
    string (64kB max)
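The list of scalar types maps naturally onto a Go type switch. This is a hedged sketch rather than dbm's own check: `isScalar` and `maxLen` are hypothetical names, and the exact byte limit is an assumption, since the documentation only says "64kB max".

```go
package main

import "fmt"

// maxLen is an assumed limit; the documentation only states "64kB max".
const maxLen = 64 << 10

// isScalar reports whether v belongs to one of the scalar types listed
// above. It is an illustration, not the check dbm performs internally.
func isScalar(v interface{}) bool {
	switch x := v.(type) {
	case nil, bool,
		int8, int16, int32, int, int64,
		uint8, uint16, uint32, uint, uint64,
		float32, float64,
		complex64, complex128:
		return true
	case []byte:
		return len(x) <= maxLen
	case string:
		return len(x) <= maxLen
	}
	return false
}

func main() {
	for _, v := range []interface{}{nil, true, 42, 3.14, "text", []int{1}, struct{}{}} {
		fmt.Printf("%T scalar: %v\n", v, isScalar(v))
	}
}
```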
Collating ¶
Values in an Array are always ordered in the collating order of the respective keys. For details about the collating order please see lldb.Collate. There's a plan for a mechanism respecting user-supplied locale applied to string collating, but the required API differences call for a whole different package perhaps emerging in the future.
Multidimensional sparse arrays ¶
A multidimensional array can have many subscripts. Each subscript must be one of the types below:
    nil (typeless)
    bool
    int int8 int16 int32 int64
    uint byte uint8 uint16 uint32 uint64
    float32 float64
    complex64 complex128
    []byte
    string
The "outer" ordering is: nil, bool, number, []byte, string. In other words, nil is "smaller" than anything else except another nil, numbers collate before []byte, []byte values collate before strings, etc.
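The "outer" ordering can be pictured as a rank assigned to each type class. The sketch below only ranks classes; it does not attempt the ordering within a class, and it is not the implementation of lldb.Collate. The function name `class` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// class returns the rank of v's type class in the "outer" ordering:
// nil < bool < number < []byte < string. Unsupported types get -1.
func class(v interface{}) int {
	switch v.(type) {
	case nil:
		return 0
	case bool:
		return 1
	case int, int8, int16, int32, int64,
		uint, uint8, uint16, uint32, uint64,
		float32, float64, complex64, complex128:
		return 2
	case []byte:
		return 3
	case string:
		return 4
	}
	return -1
}

func main() {
	items := []interface{}{"z", []byte("a"), 3.5, true, nil, 7}
	// Order by class only; items of the same class keep their input order.
	sort.SliceStable(items, func(i, j int) bool {
		return class(items[i]) < class(items[j])
	})
	for _, it := range items {
		fmt.Printf("class %d: %v\n", class(it), it)
	}
}
```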
By using single item subscripts the multidimensional array "degrades" to a plain key-value map. As the arrays are named, both models can coexist in the same database. Dbm arrays are modeled after those of MUMPS[3], so the acronym is for DB/M instead of Data Base Manager[4]. For a more detailed discussion of multidimensional arrays please see [5]. Below are some examples from the same source, rewritten and/or modified for dbm. Note: Error values and error checking are not present in the examples below.
This is a MUMPS statement
^Stock("slip dress", 4, "blue", "floral") = 3
This is its dbm equivalent
db.Set(3, "Stock", "slip dress", 4, "blue", "floral")
Dump of "Stock"
    "slip dress", 4, "blue", "floral" → 3
    ----
    db.Get("Stock", "slip dress", 4, "blue", "floral") → 3
Or for the same effect:
    stock := db.Array("Stock")
    stock.Set(3, "slip dress", 4, "blue", "floral")
Dump of "Stock"
    "slip dress", 4, "blue", "floral" → 3
    ----
    db.Get("Stock", "slip dress", 4, "blue", "floral") → 3
    stock.Get("slip dress", 4, "blue", "floral") → 3
Or
    blueDress := db.Array("Stock", "slip dress", 4, "blue")
    blueDress.Set(3, "floral")
Dump of "Stock"
    "slip dress", 4, "blue", "floral" → 3
    ----
    db.Get("Stock", "slip dress", 4, "blue", "floral") → 3
    blueDress.Get("floral") → 3
Similarly:
    invoiceNum := 314159
    customer := "Google"
    when := time.Now().UnixNano()
    parts := []struct{ num, qty, price int }{
        {100001, 2, 300},
        {100004, 5, 600},
    }
    invoice := db.Array("Invoice")
    invoice.Set(when, invoiceNum, "Date")
    invoice.Set(customer, invoiceNum, "Customer")
    invoice.Set(len(parts), invoiceNum, "Items") // # of Items in the invoice
    for i, part := range parts {
        invoice.Set(part.num, invoiceNum, "Items", i, "Part")
        invoice.Set(part.qty, invoiceNum, "Items", i, "Quantity")
        invoice.Set(part.price, invoiceNum, "Items", i, "Price")
    }
Dump of "Invoice"
    314159, "Customer" → "Google"
    314159, "Date" → 1363864307518685049
    314159, "Items" → 2
    314159, "Items", 0, "Part" → 100001
    314159, "Items", 0, "Price" → 300
    314159, "Items", 0, "Quantity" → 2
    314159, "Items", 1, "Part" → 100004
    314159, "Items", 1, "Price" → 600
    314159, "Items", 1, "Quantity" → 5
    ----
    db.Get("Invoice", invoiceNum, "Customer") → customer
    db.Get("Invoice", invoiceNum, "Date") → when
    ...
    invoice.Get(invoiceNum, "Customer") → customer
    invoice.Get(invoiceNum, "Date") → when
    invoice.Get(invoiceNum, "Items") → len(parts)
    invoice.Get(invoiceNum, "Items", 0, "Part") → parts[0].num
    invoice.Get(invoiceNum, "Items", 0, "Quantity") → parts[0].qty
    invoice.Get(invoiceNum, "Items", 0, "Price") → parts[0].price
    invoice.Get(invoiceNum, "Items", 1, "Part") → parts[1].num
    ...
Or for the same effect
    invoice := db.Array("Invoice", invoiceNum)
    invoice.Set(when, "Date")
    invoice.Set(customer, "Customer")
    items := invoice.Array("Items")
    items.Set(len(parts)) // # of Items in the invoice
    for i, part := range parts {
        items.Set(part.num, i, "Part")
        items.Set(part.qty, i, "Quantity")
        items.Set(part.price, i, "Price")
    }
Dump of "Invoice"
    314159, "Customer" → "Google"
    314159, "Date" → 1363865032036475263
    314159, "Items" → 2
    314159, "Items", 0, "Part" → 100001
    314159, "Items", 0, "Price" → 300
    314159, "Items", 0, "Quantity" → 2
    314159, "Items", 1, "Part" → 100004
    314159, "Items", 1, "Price" → 600
    314159, "Items", 1, "Quantity" → 5
    ----
    db.Get("Invoice", invoiceNum, "Customer") → customer
    ...
    invoice.Get("Customer") → customer
    invoice.Get("Date") → when
    items.Get() → len(parts)
    items.Get(0, "Part") → parts[0].num
    items.Get(0, "Quantity") → parts[0].qty
    items.Get(0, "Price") → parts[0].price
    items.Get(1, "Part") → parts[1].num
    ...
Values are not limited to a single item. The DB "schema" used above can be changed to use a "record" for the invoice item details:
    invoice := db.Array("Invoice", invoiceNum)
    invoice.Set(when, "Date")
    invoice.Set(customer, "Customer")
    items := invoice.Array("Items")
    items.Set(len(parts)) // # of Items in the invoice
    for i, part := range parts {
        items.Set([]interface{}{part.num, part.qty, part.price}, i)
    }
Dump of "Invoice"
    314159, "Customer" → "Google"
    314159, "Date" → 1363865958506983228
    314159, "Items" → 2
    314159, "Items", 0 → []interface{}{100001, 2, 300}
    314159, "Items", 1 → []interface{}{100004, 5, 600}
    ----
    items.Get() → len(parts)
    items.Get(0) → []interface{}{parts[0].num, parts[0].qty, parts[0].price}
    items.Get(1) → []interface{}{parts[1].num, parts[1].qty, parts[1].price}
    ...
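The equivalent forms in the examples above all rest on one idea: a subtree reference is the same array seen through a fixed prefix of subscripts. The toy model below sketches that idea with a plain map. The `tree` type and its key encoding are hypothetical stand-ins and share nothing with dbm's storage format; error returns are omitted here as well.

```go
package main

import (
	"fmt"
	"strings"
)

// tree is a toy model of a named array: one shared map plus the subscript
// prefix that selects a subtree. It is not how dbm stores data.
type tree struct {
	data   map[string]interface{}
	prefix []interface{}
}

func newTree() *tree { return &tree{data: map[string]interface{}{}} }

// full returns the prefix followed by the given subscripts.
func (t *tree) full(subscripts []interface{}) []interface{} {
	return append(append([]interface{}{}, t.prefix...), subscripts...)
}

// key encodes a subscript list as a map key. Go-syntax formatting is enough
// for a demo, although it does not distinguish e.g. int(4) from int64(4).
func (t *tree) key(subscripts []interface{}) string {
	all := t.full(subscripts)
	parts := make([]string, len(all))
	for i, s := range all {
		parts[i] = fmt.Sprintf("%#v", s)
	}
	return strings.Join(parts, ", ")
}

// Array returns a view of the same data with a longer prefix.
func (t *tree) Array(subscripts ...interface{}) *tree {
	return &tree{data: t.data, prefix: t.full(subscripts)}
}

func (t *tree) Set(value interface{}, subscripts ...interface{}) {
	t.data[t.key(subscripts)] = value
}

func (t *tree) Get(subscripts ...interface{}) interface{} {
	return t.data[t.key(subscripts)]
}

func main() {
	stock := newTree()
	blueDress := stock.Array("slip dress", 4, "blue")
	blueDress.Set(3, "floral")

	// Both references address the same stored value.
	fmt.Println(stock.Get("slip dress", 4, "blue", "floral"))
	fmt.Println(blueDress.Get("floral"))
}
```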
Naming issues ¶
Array and File names can be any string value, including an empty string or a non UTF-8 string. Names are limited in size to approximately 64 kB. For compatibility with future dbm versions and/or with other dbm based products, it is recommended to use only array names which are a valid and exported[6] Go identifier or rooted names.
Rooted names ¶
A rooted name is a pathname beginning with a slash ('/'). The base name of such a path should be (by recommendation) again a valid and exported Go identifier.
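The naming recommendation can be checked mechanically with the standard library. This is a sketch under the assumption that "valid and exported Go identifier" means what go/token.IsIdentifier and go/ast.IsExported report; `recommendedName` is a hypothetical helper, not part of dbm, and dbm itself accepts any name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/token"
	"path"
	"strings"
)

// exportedIdent reports whether s is a valid, exported Go identifier.
func exportedIdent(s string) bool {
	return token.IsIdentifier(s) && ast.IsExported(s)
}

// recommendedName reports whether name follows the recommendation: either
// an exported Go identifier, or a rooted name whose base name is one.
func recommendedName(name string) bool {
	if strings.HasPrefix(name, "/") {
		return exportedIdent(path.Base(name))
	}
	return exportedIdent(name)
}

func main() {
	for _, n := range []string{"Stock", "stock", "/app/Invoice", "/app/invoice", ""} {
		fmt.Printf("%q recommended: %v\n", n, recommendedName(n))
	}
}
```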
Name spaces ¶
Arrays namespace and Files namespace are disjoint. Entities in any namespace having a rooted name with prefix '/tmp/' are removed from the DB automatically on Open.
Access denied errors ¶
Attempts to mutate Arrays or Files, or to perform any other forbidden action, return lldb.ErrPERM.
ACID Finite State Machine ¶
For Options.ACID == ACIDFull and GracePeriod != 0 the state transition table for transaction collecting is:
    +------------+-----------------+---------------+-----------------+
    |\  Event    |                 |               |                 |
    | \--------\ |     enter       |     leave     |     timeout     |
    |   State   \|                 |               |                 |
    +------------+-----------------+---------------+-----------------+
    | idle       | BeginUpdate     | panic         | panic           |
    |            | nest = 1        |               |                 |
    |            | start timer     |               |                 |
    |            | S = collecting  |               |                 |
    +------------+-----------------+---------------+-----------------+
    | collecting | nest++          | nest--        | S = collecting- |
    |            |                 | if nest == 0  |     triggered   |
    |            |                 |     S = idle- |                 |
    |            |                 |         armed |                 |
    +------------+-----------------+---------------+-----------------+
    | idle-      | nest = 1        | panic         | EndUpdate       |
    | armed      | S = collecting- |               | S = idle        |
    |            |     armed       |               |                 |
    +------------+-----------------+---------------+-----------------+
    | collecting-| nest++          | nest--        | S = collecting- |
    | armed      |                 | if nest == 0  |     triggered   |
    |            |                 |     S = idle- |                 |
    |            |                 |         armed |                 |
    +------------+-----------------+---------------+-----------------+
    | collecting-| nest++          | nest--        | panic           |
    | triggered  |                 | if nest == 0  |                 |
    |            |                 |     EndUpdate |                 |
    |            |                 |     S = idle  |                 |
    +------------+-----------------+---------------+-----------------+

    'enter': Invoking any DB state mutating operation.
    'leave': Returning from any DB state mutating operation.
NOTE: The collecting "interval" can be modified by invoking db.BeginUpdate and db.EndUpdate.
References ¶
Links from the above godocs.
    [1]: http://en.wikipedia.org/wiki/Hierarchical_database_model
    [2]: http://en.wikipedia.org/wiki/NoSQL#Key.E2.80.93value_store
    [3]: http://en.wikipedia.org/wiki/MUMPS
    [4]: http://en.wikipedia.org/wiki/Dbm
    [5]: http://www.intersystems.com/cache/technology/techguide/cache_tech-guide_02.html
    [6]: http://golang.org/pkg/go/ast/#IsExported
Index ¶
- Constants
- type Array
- func (a *Array) Array(subscripts ...interface{}) (r Array, err error)
- func (a *Array) Clear(subscripts ...interface{}) (err error)
- func (a *Array) Delete(subscripts ...interface{}) (err error)
- func (a *Array) Dump(w io.Writer) (err error)
- func (a *Array) Enumerator(asc bool) (en *Enumerator, err error)
- func (a *Array) Get(subscripts ...interface{}) (value interface{}, err error)
- func (a *Array) Inc(delta int64, subscripts ...interface{}) (val int64, err error)
- func (a *Array) Set(value interface{}, subscripts ...interface{}) (err error)
- func (a *Array) Slice(from, to []interface{}) (s *Slice, err error)
- func (a *Array) Tree() (tr *lldb.BTree, err error)
- type Bits
- type DB
- func (db *DB) Array(array string, subscripts ...interface{}) (a Array, err error)
- func (db *DB) Arrays() (a Array, err error)
- func (db *DB) BeginUpdate() (err error)
- func (db *DB) Clear(array string, subscripts ...interface{}) (err error)
- func (db *DB) Close() (err error)
- func (db *DB) Delete(array string, subscripts ...interface{}) (err error)
- func (db *DB) EndUpdate() (err error)
- func (db *DB) File(name string) (f File, err error)
- func (db *DB) Files() (a Array, err error)
- func (db *DB) Get(array string, subscripts ...interface{}) (value interface{}, err error)
- func (db *DB) HttpDir(root string) http.FileSystem
- func (db *DB) Inc(delta int64, array string, subscripts ...interface{}) (val int64, err error)
- func (db *DB) IsMem() bool
- func (db *DB) Name() string
- func (db *DB) PeakWALSize() int64
- func (db *DB) RemoveArray(array string) (err error)
- func (db *DB) RemoveFile(file string) (err error)
- func (db *DB) Rollback() (err error)
- func (db *DB) Set(value interface{}, array string, subscripts ...interface{}) (err error)
- func (db *DB) Size() (sz int64, err error)
- func (db *DB) Slice(array string, subscripts, from, to []interface{}) (s *Slice, err error)
- func (db *DB) Sync() (err error)
- func (db *DB) Verify(log func(error) bool, stats *lldb.AllocStats) (err error)
- type Enumerator
- type File
- func (f *File) Bits() *Bits
- func (f *File) Name() string
- func (f *File) PunchHole(off, size int64) (err error)
- func (f *File) ReadAt(b []byte, off int64) (n int, err error)
- func (f *File) ReadFrom(r io.Reader) (n int64, err error)
- func (f *File) Size() (sz int64, err error)
- func (f *File) Truncate(size int64) (err error)
- func (f *File) WriteAt(b []byte, off int64) (n int, err error)
- func (f *File) WriteTo(w io.Writer) (n int64, err error)
- type Options
- type Slice
Constants ¶
const (
	// BeginUpdate/EndUpdate/Rollback will be no-ops. All operations
	// updating a DB will be written immediately including partial updates
	// during operation's progress. If any update fails, the DB can become
	// unusable. The same applies to DB crashes and/or any other non clean
	// DB shutdown.
	ACIDNone = iota

	// Enable transactions. BeginUpdate/EndUpdate/Rollback will be
	// effective. All operations on the DB will be automatically performed
	// within a transaction. Operations will thus either succeed completely
	// or have no effect at all - they will be rolled back in case of any
	// error. If any update fails the DB will not be corrupted. DB crashes
	// and/or any other non clean DB shutdown may still render the DB
	// unusable.
	ACIDTransactions

	// Enable durability. Same as ACIDTransactions plus enables 2PC and
	// WAL. Updates to the DB will be first made permanent in a WAL and
	// only after that reflected in the DB. A DB will automatically recover
	// from crashes and/or any other non clean DB shutdown. Only the last
	// uncommitted transaction (the transaction in progress at the moment
	// of a crash) can get lost.
	//
	// NOTE: Options.GracePeriod may extend the span of a single
	// transaction to a batch of multiple transactions.
	//
	// NOTE2: Non zero GracePeriod requires GOMAXPROCS > 1 to work. Dbm
	// checks GOMAXPROCS in such case and if the value is 1 it
	// automatically sets GOMAXPROCS = 2.
	ACIDFull
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Array ¶
type Array struct {
// contains filtered or unexported fields
}
Array is a reference to a subtree of an array.
func MemArray ¶
MemArray returns an Array associated with a subtree of an anonymous array, determined by subscripts. MemArrays are resource limited as they are completely held in memory and are not automatically persisted.
func (*Array) Array ¶
Array returns an object associated with a subtree of array 'a', determined by subscripts.
func (*Array) Dump ¶
Dump outputs a human readable dump of a to w. Intended use is only for examples or debugging. Some type information is lost in the rendering, for example a float value '17.' and an integer value '17' may both output as '17'.
Note: Dump will lock the database until finished.
func (*Array) Enumerator ¶
func (a *Array) Enumerator(asc bool) (en *Enumerator, err error)
Enumerator returns a "raw" enumerator of the whole array. It's initially positioned on the first (asc is true) or last (asc is false) subscripts/value pair in the array.
This method is safe for concurrent use by multiple goroutines.
func (*Array) Get ¶
Get returns the value at subscripts in subtree 'a', or nil if no such value exists.
func (*Array) Inc ¶
Inc atomically increments the value at subscripts by delta and returns the new value. If the value doesn't exist before calling Inc or if the value is not an integer then the value is considered to be zero.
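The "considered to be zero" rule can be sketched with a mutex-guarded map. The `counters` type is a hypothetical stand-in, and treating only int64 as "an integer" is an assumption of this sketch, not a statement about which stored types dbm accepts.

```go
package main

import (
	"fmt"
	"sync"
)

// counters is a hypothetical stand-in for an array holding mixed values.
type counters struct {
	mu sync.Mutex
	m  map[string]interface{}
}

// inc adds delta to the value at key and returns the new value. A missing
// value, or one that is not an int64 (assumption of this sketch), counts
// as zero and is overwritten by the result.
func (c *counters) inc(delta int64, key string) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	n, _ := c.m[key].(int64) // n stays 0 when the assertion fails
	n += delta
	c.m[key] = n
	return n
}

func main() {
	c := &counters{m: map[string]interface{}{"hits": int64(5), "label": "text"}}
	fmt.Println(c.inc(2, "hits"))    // existing integer
	fmt.Println(c.inc(1, "missing")) // no previous value
	fmt.Println(c.inc(3, "label"))   // previous value is not an integer
}
```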
func (*Array) Set ¶
Set sets the value at subscripts in subtree 'a'. Any previous value, if it existed, is overwritten by the new one.
type Bits ¶
type Bits struct {
// contains filtered or unexported fields
}
Bits is a File with a bit-manipulation set of methods. It can be useful as e.g. a bitmap index[1].
Mutating or reading single bits in a disk file is not a fast operation. Bits include a memory cache improving sequential scan/access by Get. The cache is coherent with writes/updates but _is not_ coherent with other Bits instances of the same underlying File. It is thus recommended to share a single *Bits instance between all writers and readers of the same bit file. Concurrent overlapping updates are safe, but the order of their execution is unspecified and they may even interleave. Coordination in the dbm client is needed in such case.
[1]: http://en.wikipedia.org/wiki/Bitmap_index
type DB ¶
type DB struct {
// contains filtered or unexported fields
}
func Create ¶
Create creates the named DB file with mode 0666 (before umask). The file must not already exist. If successful, methods on the returned DB can be used for I/O; the associated file descriptor has mode os.O_RDWR. If there is an error, it will be of type *os.PathError.
For the meaning of opts please see documentation of Options.
func CreateMem ¶
CreateMem creates an in-memory DB not backed by a disk file. Memory DBs are resource limited as they are completely held in memory and are not automatically persisted.
For the meaning of opts please see documentation of Options.
func CreateTemp ¶
CreateTemp creates a new temporary DB in the directory dir with a basename beginning with prefix and name ending in suffix. If dir is the empty string, CreateTemp uses the default directory for temporary files (see os.TempDir). Multiple programs calling CreateTemp simultaneously will not choose the same file name for the DB. The caller can use Name() to find the pathname of the DB file. It is the caller's responsibility to remove the file when no longer needed.
For the meaning of opts please see documentation of Options.
func Open ¶
Open opens the named DB file for reading/writing. If successful, methods on the returned DB can be used for I/O; the associated file descriptor has mode os.O_RDWR. If there is an error, it will be of type *os.PathError.
For the meaning of opts please see documentation of Options.
func (*DB) Array ¶
Array returns an Array associated with a subtree of array, determined by subscripts.
func (*DB) Arrays ¶
Arrays returns a read-only meta array which registers other arrays by name as its keys. The associated values are meaningless but non-nil if the value exists.
func (*DB) BeginUpdate ¶
BeginUpdate increments a "nesting" counter (initially zero). Every call to BeginUpdate must be eventually "balanced" by exactly one of EndUpdate or Rollback. Calls to BeginUpdate may nest.
func (*DB) Close ¶
Close closes the DB, rendering it unusable for I/O. It returns an error, if any. Failing to call Close before exiting a program can render the DB unusable or, in case of using WAL/2PC, the last committed transaction may get lost.
Close is idempotent.
func (*DB) EndUpdate ¶
EndUpdate decrements the "nesting" counter. If it's zero after that then assume the "storage" has reached structural integrity (after a batch of partial updates). Invocation of an unbalanced EndUpdate is an error.
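The nesting rule shared by BeginUpdate and EndUpdate can be sketched as a plain counter. The `updater` type and its `commits` field are hypothetical; the sketch only shows when the outermost level closes and when a call is unbalanced, and it leaves out Rollback.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnbalanced = errors.New("unbalanced EndUpdate")

// updater is a hypothetical model of the nesting counter.
type updater struct {
	nest    int // current nesting depth
	commits int // times the outermost level was closed
}

// BeginUpdate opens one more nesting level.
func (u *updater) BeginUpdate() { u.nest++ }

// EndUpdate closes one level. Closing the outermost level is the point at
// which the storage is assumed to be structurally consistent again.
func (u *updater) EndUpdate() error {
	if u.nest == 0 {
		return errUnbalanced
	}
	u.nest--
	if u.nest == 0 {
		u.commits++
	}
	return nil
}

func main() {
	u := &updater{}
	u.BeginUpdate()
	u.BeginUpdate() // nested
	fmt.Println(u.EndUpdate(), u.commits)
	fmt.Println(u.EndUpdate(), u.commits)
	fmt.Println(u.EndUpdate()) // one EndUpdate too many
}
```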
func (*DB) Files ¶
Files returns a read-only meta array which registers all Files in the DB by name as its keys. The associated values are meaningless but non-nil if the value exists.
func (*DB) HttpDir ¶
func (db *DB) HttpDir(root string) http.FileSystem
HttpDir returns an object implementing http.FileSystem using the DB file system restricted to a specific directory tree.
'root' must be an absolute path beginning with '/'.
func (*DB) Inc ¶
Inc atomically increments the value at subscripts of array by delta and returns the new value. If the value doesn't exist before calling Inc or if the value is not an integer then the value is considered to be zero.
func (*DB) PeakWALSize ¶
PeakWALSize reports the maximum size WAL has ever used.
func (*DB) RemoveArray ¶
RemoveArray removes array from the DB.
func (*DB) RemoveFile ¶
RemoveFile removes file from the DB.
func (*DB) Rollback ¶
Rollback cancels and undoes the innermost pending update level (if transactions are enabled). Rollback decrements the "nesting" counter. Invocation of an unbalanced Rollback is an error.
func (*DB) Set ¶
Set sets the value at subscripts in array. Any previous value, if it existed, is overwritten by the new one.
func (*DB) Slice ¶
Slice returns a new Slice of array, with a subscripts range of [from, to]. If from is nil it works as 'from lowest existing key'. If to is nil it works as 'to highest existing key'.
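The inclusive range with open ends can be sketched over a sorted list of integer keys. This is an analogy only: real subscripts are multi-valued and ordered by the collating rules, and `between` is a hypothetical helper where a nil pointer plays the role of a nil from or to.

```go
package main

import (
	"fmt"
	"sort"
)

// between returns the keys k with *from <= k <= *to from a sorted slice.
// A nil bound is open: nil from means "from the lowest existing key",
// nil to means "to the highest existing key".
func between(keys []int, from, to *int) []int {
	lo, hi := 0, len(keys)
	if from != nil {
		// First index whose key is >= *from.
		lo = sort.SearchInts(keys, *from)
	}
	if to != nil {
		// First index whose key is > *to, so *to itself is included.
		hi = sort.Search(len(keys), func(i int) bool { return keys[i] > *to })
	}
	if lo > hi {
		return nil
	}
	return keys[lo:hi]
}

func main() {
	keys := []int{10, 20, 30, 40}
	a, b := 15, 30
	fmt.Println(between(keys, &a, &b))
	fmt.Println(between(keys, nil, &b))
	fmt.Println(between(keys, &a, nil))
}
```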
func (*DB) Sync ¶
Sync commits the current contents of the DB file to stable storage. Typically, this means flushing the file system's in-memory copy of recently written data to disk.
NOTE: There's no good reason to invoke Sync if db uses 2PC/WAL (see Options.ACID).
func (*DB) Verify ¶
Verify attempts to find any structural errors in DB wrt the organization of it as defined by lldb.Allocator. 'bitmap' is a scratch pad for necessary bookkeeping and will grow to at most DB size/128 (0.78%). Any problems found are reported to 'log' except non verify related errors like disk read failures etc. If 'log' returns false or the error doesn't allow verification to (reliably) continue, the verification process is stopped and an error is returned from the Verify function. Passing a nil log works like providing a log function always returning false. Any non-structural errors, like for instance Filer read errors, are NOT reported to 'log', but returned as Verify's return value, because Verify cannot proceed in such cases. Verify returns nil only if it fully completed verifying DB without detecting any error.
It is recommended to limit the number of reported problems by returning false from 'log' after reaching some limit. A huge and corrupted DB can produce an overwhelming error report dataset.
The verifying process will scan the whole DB at least 3 times (a trade between processing space and time consumed). It doesn't read the content of free blocks above the head/tail info bytes. If the 3rd phase detects lost free space, then a 4th scan (a faster one) is performed to precisely report all of them.
Statistics are returned via 'stats' if non nil. The statistics are valid only if Verify succeeded, i.e. it didn't report anything to log and it returned a nil error.
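The recommendation to cap the number of reported problems can be met with a small closure of the shape `func(error) bool`, matching the log parameter in the Verify signature. The sketch below is self-contained: `limitedLog` and `simulate` are hypothetical names, and `simulate` merely imitates a verifier that stops once the callback returns false.

```go
package main

import "fmt"

// limitedLog returns a callback that records every reported problem in
// sink and returns false once limit problems have been collected.
func limitedLog(limit int, sink *[]error) func(error) bool {
	return func(err error) bool {
		*sink = append(*sink, err)
		return len(*sink) < limit
	}
}

// simulate imitates a verifier that finds n problems and stops reporting
// as soon as the callback returns false. It returns how many were recorded.
func simulate(n, limit int) int {
	var problems []error
	log := limitedLog(limit, &problems)
	for i := 0; i < n; i++ {
		if !log(fmt.Errorf("problem %d", i)) {
			break
		}
	}
	return len(problems)
}

func main() {
	fmt.Println(simulate(10, 3)) // capped at the limit
	fmt.Println(simulate(2, 3))  // fewer problems than the limit
}
```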
type Enumerator ¶
type Enumerator struct {
// contains filtered or unexported fields
}
Enumerator provides a way to visit all K/V pairs in a DB/range.
func (*Enumerator) Next ¶
func (e *Enumerator) Next() (key, value []interface{}, err error)
Next returns the currently enumerated raw KV pair, if it exists and moves to the next KV in the key collation order. If there is no KV pair to return, err == io.EOF is returned.
This method is safe for concurrent use by multiple goroutines.
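The io.EOF convention leads to the usual Go iteration loop. The sketch below uses a hypothetical in-memory `enumerator` with the same Next signature as documented above; it is not the dbm type and does no locking.

```go
package main

import (
	"fmt"
	"io"
)

type pair struct{ key, value []interface{} }

// enumerator is a hypothetical in-memory stand-in with the documented
// Next signature.
type enumerator struct {
	items []pair
	pos   int
}

// Next returns the current pair and advances, or io.EOF when exhausted.
func (e *enumerator) Next() (key, value []interface{}, err error) {
	if e.pos >= len(e.items) {
		return nil, nil, io.EOF
	}
	p := e.items[e.pos]
	e.pos++
	return p.key, p.value, nil
}

// drain visits all remaining pairs and returns how many were seen.
// io.EOF marks the normal end and is not reported as an error.
func drain(e *enumerator) (int, error) {
	n := 0
	for {
		k, v, err := e.Next()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		fmt.Println(k, "→", v)
		n++
	}
}

func main() {
	e := &enumerator{items: []pair{
		{[]interface{}{314159, "Customer"}, []interface{}{"Google"}},
		{[]interface{}{314159, "Items"}, []interface{}{2}},
	}}
	n, err := drain(e)
	fmt.Println(n, err)
}
```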
func (*Enumerator) Prev ¶
func (e *Enumerator) Prev() (key, value []interface{}, err error)
Prev returns the currently enumerated raw KV pair, if it exists and moves to the previous KV in the key collation order. If there is no KV pair to return, err == io.EOF is returned.
This method is safe for concurrent use by multiple goroutines.
type File ¶
type File Array
File is a database blob with a file-like API. Values in Arrays are limited in size to about 64kB. To put a larger value into an Array, the value can be written to a File and the path stored in the Array instead of the too big value.
func (*File) PunchHole ¶
PunchHole deallocates space inside a "file" in the byte range starting at off and continuing for size bytes. The Filer size (as reported by `Size()`) does not change when hole punching, even when punching the end of a file off.
func (*File) ReadFrom ¶
ReadFrom is a helper to populate File's content from r. 'n' reports the number of bytes read from 'r'.
func (*File) WriteTo ¶
WriteTo is a helper to copy/persist File's content to w. If w is also an io.WriterAt then WriteTo may attempt to _not_ write any big, for some value of big, runs of zeros, i.e. it will attempt to punch holes, where possible, in `w` if that happens to be a freshly created or to zero length truncated OS file. 'n' reports the number of bytes written to 'w'.
type Options ¶
type Options struct {
	// See the ACID* constants documentation.
	ACID int

	// The write ahead log pathname. Applicable iff ACID == ACIDFull. May
	// be left empty in which case an unspecified pathname will be chosen,
	// which is computed from the DB name and which will be in the same
	// directory as the DB. Moving or renaming the DB while it is shut down
	// will break its connection to the automatically computed name.
	// Moving both the files (the DB and the WAL) into another directory
	// with no renaming is safe.
	//
	// On opening an existing DB the WAL file must exist if it should be
	// used. If it is of zero size then a clean shutdown of the DB is
	// assumed, otherwise an automatic DB recovery is performed.
	//
	// On creating a new DB the WAL file must not exist or it must be
	// empty. It's not safe to write to a non empty WAL file as it may
	// contain unprocessed DB recovery data.
	WAL string

	// Time to collect transactions before committing them into the WAL.
	// Applicable iff ACID == ACIDFull. All updates are held in memory
	// during the grace period so it should not be more than a few seconds
	// at most.
	//
	// Recommended value for GracePeriod is 1 second.
	//
	// NOTE: Using small GracePeriod values will make DB updates very slow.
	// Zero GracePeriod will make every single update a separate 2PC/WAL
	// transaction. Values smaller than about 100-200 milliseconds
	// (particularly for mechanical, rotational HDs) are not recommended
	// and they may not be always honored.
	GracePeriod time.Duration
	// contains filtered or unexported fields
}
Options are passed to the DB create/open functions to amend the behavior of those functions. The compatibility promise is the same as of struct types in the Go standard library - introducing changes can be made only by adding new exported fields, which is backward compatible as long as client code uses field names to assign values of imported struct types literals.
type Slice ¶
type Slice struct {
// contains filtered or unexported fields
}
Slice represents a slice of an Array.
func (*Slice) Do ¶
Do calls f for every subscripts-value pair in s in ascending collation order of the subscripts. Do returns a non nil error for general errors (e.g. a file read error). If f returns false or a non nil error then Do terminates and returns the value of error from f.
Note: f can get called with a subscripts-value pair which actually may no longer exist - if some other goroutine introduces such data race. Coordination required to avoid this situation, if applicable/desirable, must be provided by the client of dbm.