merkledb

package
v1.9.11-rc.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 28, 2023 License: BSD-3-Clause Imports: 32 Imported by: 2

README

Path Based Merkelized Radix Trie

TODOs

  • Improve invariants around trieview commitment. Either:
    • Consider allowing a child view to commit into a parent view without committing to the base DB.
  • Allow concurrent reads into the trieview.
  • Remove special casing around the root node from the physical structure of the hashed tree.
  • Analyze performance impact of needing to skip intermediate nodes when generating range and change proofs
    • Consider moving nodes with values to a separate db prefix
  • Replace naive concurrent hashing with a more optimized implementation
  • Analyze performance of using database snapshots rather than in-memory history
  • Improve intermediate node regeneration after ungraceful shutdown by reusing successfully written subtrees

Introduction

The Merkle Trie is a data structure that allows efficient and secure verification of the contents. It is a combination of a Merkle Tree and a Radix Trie.

The trie contains Merkle Nodes, which store key/value and children information.

Each Merkle Node represents a key path into the trie. It stores the key, the value (if one exists), its ID, and the IDs of its children nodes. The children have keys that contain the current node's key path as a prefix, and the index of each child indicates the next nibble in that child's key. For example, if we have two nodes, Node 1 with key path 0x91A and Node 2 with key path 0x91A4, Node 2 is stored in index 0x4 of Node 1's children (since 0x4 is the first value after the common prefix).

To reduce the depth of nodes in the trie, a Merkle Node utilizes path compression. Instead of having a long chain of nodes each containing only a single nibble of the key, we can "compress" the path by recording additional key information with each of a node's children. For example, if we have three nodes, Node 1 with key path 0x91A, Node 2 with key path 0x91A4, and Node 3 with key path 0x91A5132, then Node 1 has a key of 0x91A. Node 2 is stored at index 0x4 of Node 1's children since 4 is the next nibble in Node 2's key after skipping the common nibbles from Node 1's key. Node 3 is stored at index 0x5 of Node 1's children. Rather than have extra nodes for the remainder of Node 3's key, we instead store the rest of the key (132) in Node 1's children info.

+-----------------------------------+
| Merkle Node                       | 
|                                   |
| ID: 0x0131                        |  an id representing the current node, derived from the node's value and all children ids
| Key: 0x91                         |  prefix of the key path, representing the location of the node in the trie
| Value: 0x00                       |  the value, if one exists, that is stored at the key path (pathPrefix + compressedPath)
| Children:                         |  a map of children node ids for any nodes in the trie that have this node's key path as a prefix 
|   0: [:0x00542F]                  |  child 0 represents a node with key 0x910 with ID 0x00542F
|   1: [0x432:0xA0561C]             |  child 1 represents a node with key 0x911432 with ID 0xA0561C
|   ...                             |
|   15: [0x9A67B:0x02FB093]         |  child 15 represents a node with key 0x91F9A67B with ID 0x02FB093
+-----------------------------------+ 

Design choices

Single node type

A Merkle Node holds the IDs of its children, its value, as well as any path extension. This simplifies some logic and allows all of the data about a node to be loaded in a single database read. This trades off a small amount of storage efficiency (some fields may be nil but are still stored for every node).

Validity

A trieView is built atop another trie, and that trie could change at any point. If it does, all descendants of the trie will be marked invalid before the edit of the trie occurs. If an operation is performed on an invalid trie, an ErrInvalid error will be returned instead of the expected result. When a view is committed, all of its sibling views (the views that share the same parent) are marked invalid and any child views of the view have their parent updated to exclude any committed views between them and the db.

Locking

A trieView is built atop another trie, which may be the underlying Database or another trieView. It's important to guarantee atomicity/consistency of trie operations. That is, if a view method is executing, the views/database underneath the view shouldn't be changing. To prevent this, we need to use locking.

trieView has a Mutex named lock that's held when most of its methods are executing. It also has a Mutex named invalidationLock that is held during methods that change the view's validity or tracking of child views' validity. Trie methods also grab the write lock for all views that its built atop, and a read lock for the underlying Database. The exception is Commit, which grabs a write lock for the Database. This is the only trieView method that modifies the underlying Database. Locking the view stack ensures that while the method is executing, the underlying Database doesn't change, and no view below it is committed.

To prevent deadlocks, trieView and Database never lock a view that is built atop itself. That is, locking is always done from a view down to the underlying Database, never the other way around. In some of Database's methods, we create a trieView and call unexported methods on it without locking it. We do so because the exported counterpart of the method read locks the Database, which is already locked. This pattern is safe because the Database is locked, so no data under the view is changing, and nobody else has a reference to the view, so there can't be any concurrent access. Additionally, any function that takes the invalidationLock should avoid taking the trieView.lock as this will likely trigger a deadlock as well.

Database has a RWMutex named lock. Its read operations don't store data in a map, so a read lock suffices for read operations. trieView's Commit method explicitly grabs this lock.

Documentation

Index

Constants

View Source
const (
	NodeBranchFactor = 16
	HashLength       = 32
)
View Source
const EmptyPath path = ""
View Source
const (
	RootPath = EmptyPath
)

Variables

View Source
var (
	ErrStartRootNotFound = errors.New("start root is not before end root in history")
	ErrRootIDNotPresent  = errors.New("root id is not present in history")
)
View Source
var (
	ErrInvalidProof                = errors.New("proof obtained an invalid root ID")
	ErrInvalidMaxLength            = errors.New("expected max length to be > 0")
	ErrNonIncreasingValues         = errors.New("keys sent are not in increasing order")
	ErrStateFromOutsideOfRange     = errors.New("state key falls outside of the start->end range")
	ErrNonIncreasingProofNodes     = errors.New("each proof node key must be a strict prefix of the next")
	ErrExtraProofNodes             = errors.New("extra proof nodes in path")
	ErrDataInMissingRootProof      = errors.New("there should be no state or deleted keys in a change proof that had a missing root")
	ErrNoMerkleProof               = errors.New("empty key response must include merkle proof")
	ErrShouldJustBeRoot            = errors.New("end proof should only contain root")
	ErrNoStartProof                = errors.New("no start proof")
	ErrNoEndProof                  = errors.New("no end proof")
	ErrNoProof                     = errors.New("proof has no nodes")
	ErrProofNodeNotForKey          = errors.New("the provided node has a key that is not a prefix of the specified key")
	ErrProofValueDoesntMatch       = errors.New("the provided value does not match the proof node for the provided key's value")
	ErrProofNodeHasUnincludedValue = errors.New("the provided proof has a value for a key within the range that is not present in the provided key/values")
)
View Source
var (
	ErrCommitted          = errors.New("view has been committed")
	ErrInvalid            = errors.New("the trie this view was based on has changed, rending this view invalid")
	ErrOddLengthWithValue = errors.New(
		"the underlying db only supports whole number of byte keys, so cannot record changes with odd nibble length",
	)
	ErrGetPathToFailure = errors.New("GetPathTo failed to return the closest node")
	ErrStartAfterEnd    = errors.New("start key > end key")
	ErrViewIsNotAChild  = errors.New("passed in view is required to be a child of the current view")
)
View Source
var (
	Codec, Version = newCodec()
)

Functions

This section is empty.

Types

type ChangeProof

type ChangeProof struct {
	// If false, the node that created this doesn't have
	// sufficient history to generate a change proof and
	// all other fields must be empty.
	// Otherwise at least one other field is non-empty.
	HadRootsInHistory bool
	// A proof that the smallest key in the requested range does/doesn't
	// exist in the trie with the requested start root.
	// Empty if no lower bound on the requested range was given.
	// Note that this may not be an entire proof -- nodes are omitted if
	// they are also in [EndProof].
	StartProof []ProofNode
	// A proof that the largest key in [KeyValues] and [DeletedKeys]
	// does/doesn't exist in the trie with the requested start root.
	// Empty iff no upper bound on the requested range was given
	// and [KeyValues] and [DeletedKeys] are empty.
	EndProof []ProofNode
	// A subset of key-values that were added or had their values modified
	// between the requested start root (exclusive) and the requested
	// end root (inclusive).
	// Sorted by increasing key.
	KeyValues []KeyValue
	// A subset of keys that were removed from the trie between the requested
	// start root (exclusive) and the requested end root (inclusive).
	// Sorted by increasing key.
	DeletedKeys [][]byte
}

func (*ChangeProof) Empty

func (proof *ChangeProof) Empty() bool

func (*ChangeProof) Verify

func (proof *ChangeProof) Verify(
	ctx context.Context,
	db *Database,
	start []byte,
	end []byte,
	expectedEndRootID ids.ID,
) error

Returns nil iff all of the following hold:

  • [start] <= [end].
  • [proof] is non-empty iff [proof.HadRootsInHistory].
  • All keys in [proof.KeyValues] and [proof.DeletedKeys] are in [start, end]. If [start] is empty, all keys are considered > [start]. If [end] is empty, all keys are considered < [end].
  • [proof.KeyValues] and [proof.DeletedKeys] are sorted in order of increasing key.
  • [proof.StartProof] and [proof.EndProof] are well-formed.
  • When the keys in [proof.KeyValues] are added to [db] and the keys in [proof.DeletedKeys] are removed from [db], the root ID of [db] is [expectedEndRootID].

Assumes [db.lock] isn't held.

type Config

type Config struct {
	// The number of changes to the database that we store in memory in order to
	// serve change proofs.
	HistoryLength  int
	ValueCacheSize int
	NodeCacheSize  int
	// If [Reg] is nil, metrics are collected locally but not exported through
	// Prometheus.
	// This may be useful for testing.
	Reg    prometheus.Registerer
	Tracer trace.Tracer
}

type Database

type Database struct {
	// contains filtered or unexported fields
}

Can only be edited by committing changes from a trieView.

func New

func New(ctx context.Context, db database.Database, config Config) (*Database, error)

New returns a new merkle database.

func (*Database) Close

func (db *Database) Close() error

func (*Database) CommitChangeProof

func (db *Database) CommitChangeProof(ctx context.Context, proof *ChangeProof) error

Commits the key/value pairs within the [proof] to the db.

func (*Database) CommitRangeProof

func (db *Database) CommitRangeProof(ctx context.Context, start []byte, proof *RangeProof) error

Commits the key/value pairs within the [proof] to the db. [start] is the smallest key in the range this [proof] covers.

func (*Database) Compact

func (db *Database) Compact(start []byte, limit []byte) error

func (*Database) Delete

func (db *Database) Delete(key []byte) error

func (*Database) Get

func (db *Database) Get(key []byte) ([]byte, error)

func (*Database) GetChangeProof

func (db *Database) GetChangeProof(
	ctx context.Context,
	startRootID ids.ID,
	endRootID ids.ID,
	start []byte,
	end []byte,
	maxLength int,
) (*ChangeProof, error)

Returns a proof for a subset of the key/value changes in key range [start, end] that occurred between [startRootID] and [endRootID]. Returns at most [maxLength] key/value pairs.

func (*Database) GetHistoricalView

func (db *Database) GetHistoricalView(ctx context.Context, rootID ids.ID) (ReadOnlyTrie, error)

Returns a view of the trie as it was when the merkle root was [rootID].

func (*Database) GetMerkleRoot

func (db *Database) GetMerkleRoot(_ context.Context) (ids.ID, error)

Returns the ID of the root node of the merkle trie.

func (*Database) GetProof

func (db *Database) GetProof(ctx context.Context, key []byte) (*Proof, error)

Returns a proof of the existence/non-existence of [key] in this trie.

func (*Database) GetRangeProof

func (db *Database) GetRangeProof(
	ctx context.Context,
	start,
	end []byte,
	maxLength int,
) (*RangeProof, error)

Returns a proof for the key/value pairs in this trie within the range [start, end].

func (*Database) GetRangeProofAtRoot

func (db *Database) GetRangeProofAtRoot(
	ctx context.Context,
	rootID ids.ID,
	start,
	end []byte,
	maxLength int,
) (*RangeProof, error)

Returns a proof for the key/value pairs in this trie within the range [start, end] when the root of the trie was [rootID].

func (*Database) GetValue

func (db *Database) GetValue(ctx context.Context, key []byte) ([]byte, error)

Get the value associated with [key]. Returns database.ErrNotFound if it doesn't exist.

func (*Database) GetValues

func (db *Database) GetValues(ctx context.Context, keys [][]byte) ([][]byte, []error)

func (*Database) Has

func (db *Database) Has(k []byte) (bool, error)

func (*Database) HealthCheck

func (db *Database) HealthCheck(ctx context.Context) (interface{}, error)

func (*Database) Insert

func (db *Database) Insert(ctx context.Context, k, v []byte) error

func (*Database) NewBatch

func (db *Database) NewBatch() database.Batch

func (*Database) NewIterator

func (db *Database) NewIterator() database.Iterator

func (*Database) NewIteratorWithPrefix

func (db *Database) NewIteratorWithPrefix(prefix []byte) database.Iterator

func (*Database) NewIteratorWithStart

func (db *Database) NewIteratorWithStart(start []byte) database.Iterator

func (*Database) NewIteratorWithStartAndPrefix

func (db *Database) NewIteratorWithStartAndPrefix(start, prefix []byte) database.Iterator

func (*Database) NewPreallocatedView

func (db *Database) NewPreallocatedView(ctx context.Context, estimatedSize int) (TrieView, error)

Returns a new view preallocated to hold at least [estimatedSize] value changes at a time. If more changes are made, additional memory will be allocated. The returned view is added to [db.childViews]. Assumes [db.lock] isn't held.

func (*Database) NewView

func (db *Database) NewView(ctx context.Context) (TrieView, error)

Returns a new view on top of this trie. Changes made to the view will only be reflected in the original trie if Commit is called. Assumes [db.lock] isn't held.

func (*Database) Put

func (db *Database) Put(k, v []byte) error

Inserts the key/value pair into the db.

func (*Database) Remove

func (db *Database) Remove(ctx context.Context, key []byte) error

type Decoder

type Decoder interface {
	DecodeProof(bytes []byte, p *Proof) (uint16, error)
	DecodeChangeProof(bytes []byte, p *ChangeProof) (uint16, error)
	DecodeRangeProof(bytes []byte, p *RangeProof) (uint16, error)
	// contains filtered or unexported methods
}

type Encoder

type Encoder interface {
	EncodeProof(version uint16, p *Proof) ([]byte, error)
	EncodeChangeProof(version uint16, p *ChangeProof) ([]byte, error)
	EncodeRangeProof(version uint16, p *RangeProof) ([]byte, error)
	// contains filtered or unexported methods
}

TODO actually encode the version and remove version from the interface

type EncoderDecoder

type EncoderDecoder interface {
	Encoder
	Decoder
}

EncoderDecoder defines the interface needed by merkleDB to marshal and unmarshal relevant types.

type KeyValue

type KeyValue struct {
	Key   []byte
	Value []byte
}

type Maybe

type Maybe[T any] struct {
	// contains filtered or unexported fields
}

Maybe T = Some T | Nothing. A data wrapper that allows values to be something [Some T] or nothing Nothing. Maybe is used to wrap types: * That can't be represented by nil. * That use nil as a valid value instead of an indicator of a missing value. For more info see https://en.wikipedia.org/wiki/Option_type

func Nothing

func Nothing[T any]() Maybe[T]

Returns a new Maybe[T] with no value.

func Some

func Some[T any](val T) Maybe[T]

Returns a new Maybe[T] with the value val.

func (Maybe[T]) IsNothing

func (m Maybe[T]) IsNothing() bool

Returns true iff [m] has a value.

func (Maybe[T]) Value

func (m Maybe[T]) Value() T

Returns the value of [m].

type Proof

type Proof struct {
	// Nodes in the proof path from root --> target key
	// (or node that would be where key is if it doesn't exist).
	// Must always be non-empty (i.e. have the root node).
	Path []ProofNode
	// This is a proof that [key] exists/doesn't exist.
	Key []byte

	// Nothing if [Key] isn't in the trie.
	// Otherwise the value corresponding to [Key].
	Value Maybe[[]byte]
}

An inclusion/exclustion proof of a key.

func (*Proof) Verify

func (proof *Proof) Verify(ctx context.Context, expectedRootID ids.ID) error

Returns nil if the trie given in [proof] has root [expectedRootID]. That is, this is a valid proof that [proof.Key] exists/doesn't exist in the trie with root [expectedRootID].

type ProofNode

type ProofNode struct {
	KeyPath SerializedPath
	// Nothing if this is an intermediate node.
	// The value in this node if its length < [HashLen].
	// The hash of the value in this node otherwise.
	ValueOrHash Maybe[[]byte]
	Children    map[byte]ids.ID
}

type RangeProof

type RangeProof struct {
	// A proof that the smallest key in the requested range does/doesn't exist.
	// Note that this may not be an entire proof -- nodes are omitted if
	// they are also in [EndProof].
	StartProof []ProofNode

	// A proof of the greatest key in [KeyValues], or, if this proof contains
	// no [KeyValues], just the root.
	// Empty if the request for this range proof gave no upper bound
	// on the range to fetch, unless this proof contains no [KeyValues]
	// and [StartProof] is empty.
	EndProof []ProofNode

	// This proof proves that the key-value pairs in [KeyValues] are in the trie.
	// Sorted by increasing key.
	KeyValues []KeyValue
}

A proof that a given set of key-value pairs are in a trie.

func (*RangeProof) Verify

func (proof *RangeProof) Verify(
	ctx context.Context,
	start []byte,
	end []byte,
	expectedRootID ids.ID,
) error

Returns nil iff all the following hold:

  • [start] <= [end].
  • [proof] is non-empty.
  • All keys in [proof.KeyValues] are in the range [start, end]. If [start] is empty, all keys are considered > [start]. If [end] is empty, all keys are considered < [end].
  • [proof.KeyValues] is sorted by increasing key.
  • [proof.StartProof] and [proof.EndProof] are well-formed.
  • One of the following holds: [end] and [proof.EndProof] are empty. [proof.StartProof], [start], [end], and [proof.KeyValues] are empty and [proof.EndProof] is just the root. [end] is non-empty and [proof.EndProof] is a valid proof of a key <= [end].
  • [expectedRootID] is the root of the trie containing the given key-value pairs and start/end proofs.

type ReadOnlyTrie

type ReadOnlyTrie interface {

	// get the value associated with the key
	// database.ErrNotFound if the key is not present
	GetValue(ctx context.Context, key []byte) ([]byte, error)

	// get the values associated with the keys
	// database.ErrNotFound if the key is not present
	GetValues(ctx context.Context, keys [][]byte) ([][]byte, []error)

	// get the merkle root of the Trie
	GetMerkleRoot(ctx context.Context) (ids.ID, error)

	// generate a proof of the value associated with a particular key, or a proof of its absence from the trie
	GetProof(ctx context.Context, bytesPath []byte) (*Proof, error)

	// generate a proof of up to maxLength smallest key/values with keys between start and end
	GetRangeProof(ctx context.Context, start, end []byte, maxLength int) (*RangeProof, error)
	// contains filtered or unexported methods
}

Invariant: unexported methods (except lockStack) are only called when the trie's view stack is locked.

type SerializedPath

type SerializedPath struct {
	NibbleLength int
	Value        []byte
}

SerializedPath contains a path from the trie. The trie branch factor is 16, so the path may contain an odd number of nibbles. If it did contain an odd number of nibbles, the last 4 bits of the last byte should be discarded.

func (SerializedPath) AppendNibble

func (s SerializedPath) AppendNibble(nibble byte) SerializedPath

func (SerializedPath) Equal

func (s SerializedPath) Equal(other SerializedPath) bool

func (SerializedPath) HasPrefix

func (s SerializedPath) HasPrefix(prefix SerializedPath) bool

Returns true iff [prefix] is a prefix of [s] or equal to it.

func (SerializedPath) HasStrictPrefix

func (s SerializedPath) HasStrictPrefix(prefix SerializedPath) bool

Returns true iff [prefix] is a prefix of [s] but not equal to it.

func (SerializedPath) NibbleVal

func (s SerializedPath) NibbleVal(nibbleIndex int) byte

type Trie

type Trie interface {
	ReadOnlyTrie

	// Delete a key from the Trie
	Remove(ctx context.Context, key []byte) error

	// Get a new view on top of this Trie
	NewPreallocatedView(ctx context.Context, estimatedChanges int) (TrieView, error)

	// Get a new view on top of this Trie
	NewView(ctx context.Context) (TrieView, error)

	// Insert a key/value pair into the Trie
	Insert(ctx context.Context, key, value []byte) error
	// contains filtered or unexported methods
}

type TrieView

type TrieView interface {
	Trie

	// Commit the changes from this Trie into the database.
	// Any views that this Trie is built on will also be committed, starting at
	// the oldest.
	CommitToDB(ctx context.Context) error
}

Invariant: unexported methods (except lockStack) are only called when the trie's view stack is locked.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL