README
¶
Path Based Merkelized Radix Trie
TODOs
- Simplify trieview rootID tracking to only track the direct parent's rootID.
- Improve invariants around trieview commitment. Either:
- Guarantee atomicity of internal parent view commitments.
- Remove internal parent view commitments.
- Consider allowing a child view to commit into a parent view without committing to the base DB.
- Allow concurrent reads into the trieview.
- Remove
baseValuesCache
andbaseNodesCache
from the trieview. - Remove
parents
from the trieview.
- Remove
- Remove special casing around the root node from the physical structure of the hashed tree.
- Remove the implied prefix from the
dbNode
'schild
- Fix intermediate node eviction panic when encountering errors
- Analyze performance impact of needing to skip intermediate nodes when generating range and change proofs
- Consider moving nodes with values to a separate db prefix
- Replace naive concurrent hashing with a more optimized implementation
- Analyze performance of using database snapshots rather than in-memory history
- Improve intermediate node regeneration after ungraceful shutdown by reusing successfully written subtrees
Introduction
The Merkle Trie is a data structure that allows efficient and secure verification of the contents. It a combination of a Merkle Tree and a Radix Trie.
The trie contains Merkle Nodes
, which store key/value and children information.
Each Merkle Node
represents a key path into the trie. It stores the key, the value (if one exists), its ID, and the IDs of its children nodes. The children have keys that contain the current node's key path as a prefix, and the index of each child indicates the next nibble in that child's key. For example, if we have two nodes, Node 1 with key path 0x91A
and Node 2 with key path 0x91A4
, Node 2 is stored in index 0x4
of Node 1's children (since 0x4 is the first value after the common prefix).
To reduce the depth of nodes in the trie, a Merkle Node
utilizes path compression. Instead of having a long chain of nodes each containing only a single nibble of the key, we can "compress" the path by recording additional key information with each of a node's children. For example, if we have three nodes, Node 1 with key path 0x91A
, Node 2 with key path 0x91A4
, and Node 3 with key path 0x91A5132
, then Node 1 has a key of 0x91A
. Node 2 is stored at index 0x4
of Node 1's children since 4
is the next nibble in Node 2's key after skipping the common nibbles from Node 1's key. Node 3 is stored at index 0x5
of Node 1's children. Rather than have extra nodes for the remainder of Node 3's key, we instead store the rest of the key (132
) in Node 1's children info.
+-----------------------------------+
| Merkle Node |
| |
| ID: 0x0131 | an id representing the current node, derived from the node's value and all children ids
| Key: 0x91 | prefix of the key path, representing the location of the node in the trie
| Value: 0x00 | the value, if one exists, that is stored at the key path (pathPrefix + compressedPath)
| Children: | a map of children node ids for any nodes in the trie that have this node's key path as a prefix
| 0: [:0x00542F] | child 0 represents a node with key 0x910 with ID 0x00542F
| 1: [0x432:0xA0561C] | child 1 represents a node with key 0x911432 with ID 0xA0561C
| ... |
| 15: [0x9A67B:0x02FB093] | child 15 represents a node with key 0x91F9A67B with ID 0x02FB093
+-----------------------------------+
Design choices
Single node type
A Merkle Node
holds the IDs of its children, its value, as well as any path extension. This simplifies some logic and allows all of the data about a node to be loaded in a single database read. This trades off a small amount of storage efficiency (some fields may be nil
but are still stored for every node).
Locking
A trieView
is built atop another trie, which may be the underlying Database
or another trieView
.
It's important to guarantee atomicity/consistency of trie operations.
That is, if a view method is executing, the views/database underneath the view shouldn't be changing.
To prevent this, we need to use locking.
trieView
has a Mutex
named lock
that's held when its methods are executing.
Trie methods also grab the write lock
for all views that its built atop, and a read lock for the underlying Database
.
The exception is Commit
, which grabs a write lock for the Database
.
This is the only trieView
method that modifies the underlying Database
.
Locking the view stack ensures that while the method is executing, the underlying Database
doesn't change, and no view below it is committed.
To prevent deadlocks, trieView
and Database
never lock a view that is built atop itself.
That is, locking is always done from a view down to the underlying Database
, never the other way around.
In some of Database
's methods, we create a trieView
and call unexported methods on it without locking it.
We do so because the exported counterpart of the method read locks the Database
, which is already locked.
This pattern is safe because the Database
is locked, so no data under the view is changing, and nobody else has a reference to the view, so there can't be any concurrent access.
Database
has a RWMutex
named lock
. Its read operations don't store data in a map, so a read lock suffices for read operations.
trieView
's Commit
method explicitly grabs this lock.
Documentation
¶
Index ¶
- Constants
- Variables
- type ChangeProof
- type Config
- type Database
- func (db *Database) Close() error
- func (db *Database) CommitChangeProof(ctx context.Context, proof *ChangeProof) error
- func (db *Database) CommitRangeProof(ctx context.Context, start []byte, proof *RangeProof) error
- func (db *Database) Compact(start []byte, limit []byte) error
- func (db *Database) Delete(key []byte) error
- func (db *Database) Get(key []byte) ([]byte, error)
- func (db *Database) GetChangeProof(ctx context.Context, startRootID ids.ID, endRootID ids.ID, start []byte, ...) (*ChangeProof, error)
- func (db *Database) GetHistoricalView(ctx context.Context, rootID ids.ID) (ReadOnlyTrie, error)
- func (db *Database) GetMerkleRoot(_ context.Context) (ids.ID, error)
- func (db *Database) GetProof(ctx context.Context, key []byte) (*Proof, error)
- func (db *Database) GetRangeProof(ctx context.Context, start, end []byte, maxLength int) (*RangeProof, error)
- func (db *Database) GetRangeProofAtRoot(ctx context.Context, rootID ids.ID, start, end []byte, maxLength int) (*RangeProof, error)
- func (db *Database) GetValue(ctx context.Context, key []byte) ([]byte, error)
- func (db *Database) GetValues(ctx context.Context, keys [][]byte) ([][]byte, []error)
- func (db *Database) Has(k []byte) (bool, error)
- func (db *Database) HealthCheck(ctx context.Context) (interface{}, error)
- func (db *Database) Insert(ctx context.Context, k, v []byte) error
- func (db *Database) NewBatch() database.Batch
- func (db *Database) NewIterator() database.Iterator
- func (db *Database) NewIteratorWithPrefix(prefix []byte) database.Iterator
- func (db *Database) NewIteratorWithStart(start []byte) database.Iterator
- func (db *Database) NewIteratorWithStartAndPrefix(start, prefix []byte) database.Iterator
- func (db *Database) NewPreallocatedView(ctx context.Context, estimatedSize int) (TrieView, error)
- func (db *Database) NewView(ctx context.Context) (TrieView, error)
- func (db *Database) OnEviction(node *node)
- func (db *Database) Put(k, v []byte) error
- func (db *Database) Remove(ctx context.Context, key []byte) error
- type Decoder
- type Encoder
- type EncoderDecoder
- type KeyValue
- type Maybe
- type Proof
- type ProofNode
- type RangeProof
- type ReadOnlyTrie
- type SerializedPath
- func (s SerializedPath) AppendNibble(nibble byte) SerializedPath
- func (s SerializedPath) Equal(other SerializedPath) bool
- func (s SerializedPath) HasPrefix(prefix SerializedPath) bool
- func (s SerializedPath) HasStrictPrefix(prefix SerializedPath) bool
- func (s SerializedPath) NibbleVal(nibbleIndex int) byte
- type Trie
- type TrieView
Constants ¶
const EmptyPath path = ""
const NodeBranchFactor = 16
const (
RootPath = EmptyPath
)
Variables ¶
var ( ErrStartRootNotFound = errors.New("start root is not before end root in history") ErrRootIDNotPresent = errors.New("root id is not present in history") )
var ( ErrInvalidProof = errors.New("proof obtained an invalid root ID") ErrInvalidMaxLength = errors.New("expected max length to be > 0") ErrNonIncreasingValues = errors.New("keys sent are not in increasing order") ErrStateFromOutsideOfRange = errors.New("state key falls outside of the start->end range") ErrNonIncreasingProofNodes = errors.New("each proof node key must be a strict prefix of the next") ErrExtraProofNodes = errors.New("extra proof nodes in path") ErrDataInMissingRootProof = errors.New("there should be no state or deleted keys in a change proof that had a missing root") ErrNoMerkleProof = errors.New("empty key response must include merkle proof") ErrShouldJustBeRoot = errors.New("end proof should only contain root") ErrNoStartProof = errors.New("no start proof") ErrNoEndProof = errors.New("no end proof") ErrNoProof = errors.New("proof has no nodes") ErrProofNodeNotForKey = errors.New("the provided node has a key that is not a prefix of the specified key") )
var ( ErrCommitted = errors.New("view has been committed") ErrChangedBaseRoot = errors.New("the trie this view was based on has changed its root") ErrEditLocked = errors.New("view has been edit locked. Any view generated from this view would be corrupted by edits") ErrOddLengthWithValue = errors.New("the underlying db only supports whole number of byte keys, so cannot record changes with odd nibble length") ErrGetClosestNodeFailure = errors.New("GetClosestNode failed to return the closest node") ErrStartAfterEnd = errors.New("start key > end key") )
var (
Codec, Version = newCodec()
)
Functions ¶
This section is empty.
Types ¶
type ChangeProof ¶
type ChangeProof struct { // If false, the node that created this doesn't have // sufficient history to generate a change proof and // all other fields must be empty. // Otherwise at least one other field is non-empty. HadRootsInHistory bool // A proof that the smallest key in the requested range does/doesn't // exist in the trie with the requested start root. // Empty if no lower bound on the requested range was given. // Note that this may not be an entire proof -- nodes are omitted if // they are also in [EndProof]. StartProof []ProofNode // A proof that the largest key in [KeyValues] and [DeletedKeys] // does/doesn't exist in the trie with the requested start root. // Empty iff no upper bound on the requested range was given // and [KeyValues] and [DeletedKeys] are empty. EndProof []ProofNode // A subset of key-values that were added or had their values modified // between the requested start root (exclusive) and the requested // end root (inclusive). // Sorted by increasing key. KeyValues []KeyValue // A subset of keys that were removed from the trie between the requested // start root (exclusive) and the requested end root (inclusive). // Sorted by increasing key. DeletedKeys [][]byte }
func (*ChangeProof) Empty ¶
func (proof *ChangeProof) Empty() bool
func (*ChangeProof) Verify ¶
func (proof *ChangeProof) Verify( ctx context.Context, db *Database, start []byte, end []byte, expectedEndRootID ids.ID, ) error
Returns nil iff all of the following hold:
- [start] <= [end].
- [proof] is non-empty iff [proof.HadRootsInHistory].
- All keys in [proof.KeyValues] and [proof.DeletedKeys] are in [start, end].
- If [start] is empty, all keys are considered > [start].
- If [end] is empty, all keys are considered < [end].
- [proof.KeyValues] and [proof.DeletedKeys] are sorted in order of increasing key.
- [proof.StartProof] and [proof.EndProof] are well-formed.
- When the keys in [proof.KeyValues] are added to [db] and the keys in [proof.DeletedKeys] are removed from [db], the root ID of [db] is [expectedEndRootID].
Assumes [db.lock] isn't held.
type Config ¶
type Config struct { // The number of changes to the database that we store in memory in order to // serve change proofs. HistoryLength int ValueCacheSize int NodeCacheSize int // If [Reg] is nil, metrics are collected locally but not exported through // Prometheus. // This may be useful for testing. Reg prometheus.Registerer Tracer trace.Tracer }
type Database ¶
type Database struct {
// contains filtered or unexported fields
}
Can only be edited by committing changes from a trieView.
func (*Database) CommitChangeProof ¶
func (db *Database) CommitChangeProof(ctx context.Context, proof *ChangeProof) error
Commits the key/value pairs within the [proof] to the db.
func (*Database) CommitRangeProof ¶
Commits the key/value pairs within the [proof] to the db. [start] is the smallest key in the range this [proof] covers.
func (*Database) GetChangeProof ¶
func (db *Database) GetChangeProof( ctx context.Context, startRootID ids.ID, endRootID ids.ID, start []byte, end []byte, maxLength int, ) (*ChangeProof, error)
Returns a proof for a subset of the key/value changes in key range [start, end] that occurred between [startRootID] and [endRootID]. Returns at most [maxLength] key/value pairs.
func (*Database) GetHistoricalView ¶
Returns a view of the trie as it was when the merkle root was [rootID].
func (*Database) GetMerkleRoot ¶
Returns the ID of the root node of the merkle trie.
func (*Database) GetRangeProof ¶
func (db *Database) GetRangeProof( ctx context.Context, start, end []byte, maxLength int, ) (*RangeProof, error)
Returns a proof for the key/value pairs in this trie within the range [start, end].
func (*Database) GetRangeProofAtRoot ¶
func (db *Database) GetRangeProofAtRoot( ctx context.Context, rootID ids.ID, start, end []byte, maxLength int, ) (*RangeProof, error)
Returns a proof for the key/value pairs in this trie within the range [start, end] when the root of the trie was [rootID].
func (*Database) GetValue ¶
Get the value associated with [key]. Returns database.ErrNotFound if it doesn't exist.
func (*Database) HealthCheck ¶
func (*Database) NewIterator ¶
func (*Database) NewIteratorWithPrefix ¶
func (*Database) NewIteratorWithStart ¶
func (*Database) NewIteratorWithStartAndPrefix ¶
func (*Database) NewPreallocatedView ¶
Same as NewView except that the view will be preallocated to hold at least [estimatedSize] value changes at a time. If more changes are made, additional memory will be allocated. Assumes [db.lock] isn't held.
func (*Database) NewView ¶
Returns a new view on top of this trie. Changes made to the view will only be reflected in the original trie if Commit is called. Assumes [db.lock] isn't held.
func (*Database) OnEviction ¶
func (db *Database) OnEviction(node *node)
OnEviction persists intermediary nodes to [nodeDB] as they are evicted from [nodeCache]. Note that this is called by [db.nodeCache] with that cache's lock held, so the movement of the node from [db.nodeCache] to [db.nodeDB] is atomic. That is, as soon as [db.nodeCache] reports that it no longer has an evicted node, the node is guaranteed to be in [db.nodeDB].
type Encoder ¶
type Encoder interface { EncodeProof(version uint16, p *Proof) ([]byte, error) EncodeChangeProof(version uint16, p *ChangeProof) ([]byte, error) EncodeRangeProof(version uint16, p *RangeProof) ([]byte, error) // contains filtered or unexported methods }
TODO actually encode the version and remove version from the interface
type EncoderDecoder ¶
EncoderDecoder defines the interface needed by merkleDB to marshal and unmarshal relevant types.
type Maybe ¶
type Maybe[T any] struct { // contains filtered or unexported fields }
Maybe T = Some T | Nothing. A data wrapper that allows values to be something [Some T] or nothing Nothing. Maybe is used to wrap types: * That can't be represented by nil. * That use nil as a valid value instead of an indicator of a missing value. For more info see https://en.wikipedia.org/wiki/Option_type
type Proof ¶
type Proof struct { // Nodes in the proof path from root --> target key // (or node that would be where key is if it doesn't exist). // Must always be non-empty (i.e. have the root node). Path []ProofNode // This is a proof that [key] exists/doesn't exist. Key []byte }
An inclusion/exclustion proof of a key.
type RangeProof ¶
type RangeProof struct { // A proof that the smallest key in the requested range does/doesn't exist. // Note that this may not be an entire proof -- nodes are omitted if // they are also in [EndProof]. StartProof []ProofNode // A proof of the greatest key in [KeyValues], or, if this proof contains // no [KeyValues], just the root. // Empty if the request for this range proof gave no upper bound // on the range to fetch, unless this proof contains no [KeyValues] // and [StartProof] is empty. EndProof []ProofNode // This proof proves that the key-value pairs in [KeyValues] are in the trie. // Sorted by increasing key. KeyValues []KeyValue }
A proof that a given set of key-value pairs are in a trie.
func (*RangeProof) Verify ¶
func (proof *RangeProof) Verify( ctx context.Context, start []byte, end []byte, expectedRootID ids.ID, ) error
Returns nil iff all the following hold:
- [start] <= [end].
- [proof] is non-empty.
- All keys in [proof.KeyValues] are in the range [start, end].
- If [start] is empty, all keys are considered > [start].
- If [end] is empty, all keys are considered < [end].
- [proof.KeyValues] is sorted by increasing key.
- [proof.StartProof] and [proof.EndProof] are well-formed.
- One of the following holds:
- [end] and [proof.EndProof] are empty.
- [proof.StartProof], [start], [end], and [proof.KeyValues] are empty and [proof.EndProof] is just the root.
- [end] is non-empty and [proof.EndProof] is a valid proof of a key <= [end].
- [expectedRootID] is the root of the trie containing the given key-value pairs and start/end proofs.
type ReadOnlyTrie ¶
type ReadOnlyTrie interface { // get the value associated with the key // database.ErrNotFound if the key is not present GetValue(ctx context.Context, key []byte) ([]byte, error) // get the values associated with the keys // database.ErrNotFound if the key is not present GetValues(ctx context.Context, keys [][]byte) ([][]byte, []error) // get the merkle root of the Trie GetMerkleRoot(ctx context.Context) (ids.ID, error) // generate a proof of the value associated with a particular key, or a proof of its absence from the trie GetProof(ctx context.Context, bytesPath []byte) (*Proof, error) // generate a proof of up to maxLength smallest key/values with keys between start and end GetRangeProof(ctx context.Context, start, end []byte, maxLength int) (*RangeProof, error) // contains filtered or unexported methods }
Invariant: unexported methods (except lockStack) are only called when the trie's view stack is locked.
type SerializedPath ¶
SerializedPath contains a path from the trie. The trie branch factor is 16, so the path may contain an odd number of nibbles. If it did contain an odd number of nibbles, the last 4 bits of the last byte should be discarded.
func (SerializedPath) AppendNibble ¶
func (s SerializedPath) AppendNibble(nibble byte) SerializedPath
func (SerializedPath) Equal ¶
func (s SerializedPath) Equal(other SerializedPath) bool
func (SerializedPath) HasPrefix ¶
func (s SerializedPath) HasPrefix(prefix SerializedPath) bool
Returns true iff [prefix] is a prefix of [s] or equal to it.
func (SerializedPath) HasStrictPrefix ¶
func (s SerializedPath) HasStrictPrefix(prefix SerializedPath) bool
Returns true iff [prefix] is a prefix of [s] but not equal to it.
func (SerializedPath) NibbleVal ¶
func (s SerializedPath) NibbleVal(nibbleIndex int) byte
type Trie ¶
type Trie interface { ReadOnlyTrie // Delete a key from the Trie Remove(ctx context.Context, key []byte) error // Get a new view on top of this Trie NewPreallocatedView(ctx context.Context, estimatedChanges int) (TrieView, error) // Get a new view on top of this Trie NewView(ctx context.Context) (TrieView, error) // Insert a key/value pair into the Trie Insert(ctx context.Context, key, value []byte) error }
type TrieView ¶
type TrieView interface { Trie // Commit the changes from this Trie into the database. // Any views that this Trie is built on will also be committed, starting at // the oldest. Commit(ctx context.Context) error // contains filtered or unexported methods }
Invariant: unexported methods (except lockStack) are only called when the trie's view stack is locked.