Documentation ¶
Overview ¶
Package mast provides an immutable, versioned, diffable map implementation of a Merkle Search Tree (MST). Masts can be huge (not limited to memory). Masts can be stored in anything, like a filesystem, KV store, or blob store. Masts are designed to be safely concurrently accessed by multiple threads/hosts. And like Merkle DAGs, Masts are designed to be easy to cache and synchronize.
Uses ¶
- Efficient storage of multiple versions of materialized views
- Diffing of versions integrates CDC/streaming
- Efficient copy-on-write alternative to Go builtin map
What are MSTs ¶
Mast is an implementation of the structure described in the awesome paper, "Merkle Search Trees: Efficient State-Based CRDTs in Open Networks", by Alex Auvolat and François Taïani, 2019 (https://hal.inria.fr/hal-02303490/document).
MSTs are similar to persistent B-Trees, except node splits are chosen deterministically based on key and tree height, rather than node size, so MSTs eliminate the dependence on insertion/deletion order that B-Trees have for equivalence, making MSTs an interesting choice for conflict-free replicated data types (CRDTs). (MSTs are also simpler to implement than B-Trees, needing less complicated rebalancing, and no rotations.)
MSTs are like other Merkle structures in that two instances can easily be compared to confirm equality or find differences, since equal node hashes indicate equal contents.
Concurrency ¶
A Mast can be Clone()d for sharing between threads. Cloning creates a new version that can evolve independently, yet shares all the unmodified subtrees with its parent, and as such are relatively cheap to create.
Inspiration ¶
The immutable data types in Clojure, Haskell, ML and other functional languages really do make it easier to "reason about" systems; easier to test, provide a foundation to build more quickly on.
https://github.com/bodil/im-rs, "Blazing fast immutable collection datatypes for Rust", by Bodil Stokke, is an exemplar: the diff algorithm and use of property testing, are instructive, and Chunks and PoolRefs fill gaps in understanding of Rust's ownership model for library writers coming from GC'd languages.
Index ¶
- Constants
- Variables
- func DefaultKeyCompare(marshaler func(interface{}) ([]byte, error)) func(i, i2 interface{}) (int, error)
- func DefaultLayer(marshaler func(interface{}) ([]byte, error)) func(i interface{}, branchFactor uint) (uint8, error)
- type CreateRemoteOptions
- type Cursor
- func (c *Cursor) Backward(ctx context.Context) error
- func (c *Cursor) Ceil(ctx context.Context, key interface{}) error
- func (c *Cursor) Forward(ctx context.Context) error
- func (c *Cursor) Get() (interface{}, interface{}, bool)
- func (c *Cursor) Max(ctx context.Context) error
- func (c *Cursor) Min(ctx context.Context) error
- func (c *Cursor) String() string
- type Diff
- type DiffCursor
- type DiffType
- type Key
- type Mast
- func (m *Mast) BranchFactor() uint
- func (m *Mast) Clone(ctx context.Context) (Mast, error)
- func (m *Mast) Cursor(ctx context.Context) (*Cursor, error)
- func (m *Mast) Delete(ctx context.Context, key, value interface{}) error
- func (m *Mast) DiffIter(ctx context.Context, oldMast *Mast, ...) error
- func (m *Mast) DiffLinks(ctx context.Context, oldMast *Mast, ...) error
- func (m *Mast) Get(ctx context.Context, k, value interface{}) (bool, error)
- func (m *Mast) Height() uint8
- func (m *Mast) Insert(ctx context.Context, key, value interface{}) error
- func (m *Mast) IsDirty() bool
- func (m *Mast) Iter(ctx context.Context, f func(interface{}, interface{}) error) error
- func (m *Mast) MakeRoot(ctx context.Context) (*Root, error)
- func (m *Mast) SeekIter(ctx context.Context, k interface{}, f func(interface{}, interface{}) error) error
- func (m *Mast) Size() uint64
- func (m *Mast) StartDiff(ctx context.Context, oldMast *Mast) (*DiffCursor, error)
- type Node
- type NodeCache
- type Persist
- type RemoteConfig
- type Root
Examples ¶
Constants ¶
const DefaultBranchFactor = 16
DefaultBranchFactor is how many entries per node a tree will normally have.
Variables ¶
var ( V1Marshaler = nodeFormat("v1marshaler") V115Binary = nodeFormat("v1.1.5binary") )
var ( // ErrIterDone is the error returned by Iter and SeekIter to stop the iteration ErrIterDone = errors.New("iter done") )
var ErrNoMoreDiffs = errors.New("no more differences")
Functions ¶
func DefaultKeyCompare ¶ added in v1.2.13
Types ¶
type CreateRemoteOptions ¶
type CreateRemoteOptions struct { // BranchFactor, or number of entries per node. 0 means use DefaultBranchFactor. BranchFactor uint // NodeFormat, defaults to more-compact "v1.1.5binary" for new trees, can be set to "v1marshaler" to make nodes compatible with pre-v1.1.5 code. NodeFormat nodeFormat }
CreateRemoteOptions sets initial parameters for the tree, that would be painful to change after the tree has data.
type Cursor ¶ added in v1.2.11
type Cursor struct {
// contains filtered or unexported fields
}
Cursor can be used to seek around a tree.
func (*Cursor) Backward ¶ added in v1.2.11
Backward moves the cursor to the entry with th enext-smaller key.
func (*Cursor) Ceil ¶ added in v1.2.11
Ceil moves the cursor to the entry with the given key, or if not present, the entry with the next-larger key.
func (*Cursor) Forward ¶ added in v1.2.11
Forward moves the cursor to the entry with the next-larger key.
func (*Cursor) Get ¶ added in v1.2.11
Get returns the key and value of the entry at the cursor, if there is an entry, or !ok if there is no entry.
func (*Cursor) Max ¶ added in v1.2.11
Max moves the cursor to the largest key in the subtree under the current position.
type Diff ¶ added in v1.2.23
type Diff struct { Key interface{} Type DiffType OldValue interface{} NewValue interface{} }
type DiffCursor ¶ added in v1.2.23
type DiffCursor struct {
// contains filtered or unexported fields
}
type Key ¶
type Key interface { // Layer can deterministically compute its ideal layer (distance from leaves) in a tree with the given branch factor. Layer(branchFactor uint) uint8 // Order returns -1 if the argument is less than than this one, 1 if greater, and 0 if equal. Order(Key) int }
A Key has a sort order and deterministic maximum distance from leaves.
type Mast ¶
type Mast struct {
// contains filtered or unexported fields
}
Mast encapsulates data and parameters for the in-memory portion of a Merkle Search Tree.
func NewInMemory ¶
func NewInMemory() Mast
NewInMemory returns a new tree for use as an in-memory data structure (i.e. that isn't intended to be remotely persisted).
func (*Mast) BranchFactor ¶ added in v0.5.0
BranchFactor returns the ideal number of entries that are stored per node.
func (*Mast) Clone ¶ added in v0.3.4
Clone() returns a new Mast that shares all the source's data but can evolve independently (copy-on-write).
func (*Mast) Cursor ¶ added in v1.2.11
Cursor obtains a cursor set to the smallest value in the root node.
func (*Mast) DiffIter ¶
func (m *Mast) DiffIter( ctx context.Context, oldMast *Mast, f func(added, removed bool, key, addedValue, removedValue interface{}, ) (bool, error), ) error
DiffIter invokes the given callback for every entry that is different from the given tree. The iteration will stop if the callback returns keepGoing==false or an error. Callback invocation with added==removed==false signifies entries whose values have changed.
Example ¶
ctx := context.Background() v1 := NewInMemory() v1.Insert(ctx, 0, "foo") v1.Insert(ctx, 100, "asdf") v2, err := v1.Clone(ctx) if err != nil { panic(err) } v2.Insert(ctx, 0, "bar") v2.Delete(ctx, 100, "asdf") v2.Insert(ctx, 200, "qwerty") v2.DiffIter(ctx, &v1, func(added, removed bool, key, addedValue, removedValue interface{}) (bool, error) { if !added && !removed { fmt.Printf("changed '%v' from '%v' to '%v'\n", key, removedValue, addedValue) } else if removed { fmt.Printf("removed '%v' value '%v'\n", key, removedValue) } else if added { fmt.Printf("added '%v' value '%v'\n", key, addedValue) } return true, nil })
Output: changed '0' from 'foo' to 'bar' removed '100' value 'asdf' added '200' value 'qwerty'
func (*Mast) DiffLinks ¶
func (m *Mast) DiffLinks( ctx context.Context, oldMast *Mast, f func(removed bool, link interface{}) (bool, error), ) error
DiffLinks invokes the given callback for every node that is different from the given tree. The iteration will stop if the callback returns keepGoing==false or an error.
func (*Mast) Get ¶
Get gets the value of the entry with the given key and stores it at the given value pointer. Returns false if the tree doesn't contain the given key.
func (*Mast) Height ¶ added in v0.5.0
Height returns the number of levels between the leaves and root.
func (*Mast) IsDirty ¶ added in v0.5.0
IsDirty signifies that in-memory values have been Set() or merged that haven't been Save()d.
func (*Mast) Iter ¶
Iter iterates over the entries of a tree, invoking the given callback for every entry's key and value.
func (*Mast) MakeRoot ¶
MakeRoot makes a new persistent root, after ensuring all the changed nodes have been written to the persistent store.
func (*Mast) SeekIter ¶ added in v1.1.4
func (m *Mast) SeekIter(ctx context.Context, k interface{}, f func(interface{}, interface{}) error) error
Seekiter is similar to Iter, but the difference is to find the first position greater than or equal to the key and start the iteration
type Node ¶ added in v1.2.15
type Node struct { Key []interface{} Value []interface{} Link []interface{} `json:",omitempty"` }
type NodeCache ¶ added in v0.3.2
type NodeCache interface { // Add adds a freshly-persisted node to the cache. Add(key, value interface{}) // Contains indicates the node with the given key has already been persisted. Contains(key interface{}) bool // Get retrieves the already-deserialized node with the given hash, if cached. Get(key interface{}) (value interface{}, ok bool) }
NodeCache caches the immutable nodes from a remote storage source. It is also used to avoid re-storing nodes, so care should be taken to switch/invalidate NodeCache when the Persist is changed.
func NewNodeCache ¶ added in v0.3.2
NewNodeCache creates a new LRU-based node cache of the given size. One cache can be shared by any number of trees.
type Persist ¶
type Persist interface { // Store makes the given bytes accessible by the given name. The given string identity corresponds // to the content which is immutable (never modified). Store(context.Context, string, []byte) error // Load retrieves the previously-stored bytes by the given name. Load(context.Context, string) ([]byte, error) // NodeURLPrefix returns some string that identifies the // container this Persist uses, to enable NodeCaches to // distinguish identical nodes on different servers. NodeURLPrefix() string }
Persist is the interface for loading and storing (serialized) tree nodes. The given string identity corresponds to the content which is immutable (never modified).
func NewInMemoryStore ¶
func NewInMemoryStore() Persist
NewInMemoryStore provides a Persist that stores serialized nodes in a map, usually for testing.
type RemoteConfig ¶
type RemoteConfig struct { // KeysLike is an instance of the type keys will be deserialized as. KeysLike interface{} // KeysLike is an instance of the type values will be deserialized as. ValuesLike interface{} // StoreImmutablePartsWith is used to store and load serialized nodes. StoreImmutablePartsWith Persist // Unmarshal function, defaults to JSON Unmarshal func([]byte, interface{}) error // Marshal function, defaults to JSON Marshal func(interface{}) ([]byte, error) // UnmarshalerUsesRegisteredTypes indicates that the unmarshaler will know how to deserialize an // interface{} for a key/value in an entry. By default, JSON decoding doesn't do this, so is done // in two stages, the first to a JsonRawMessage, the second to the actual key/value type. UnmarshalerUsesRegisteredTypes bool // NodeCache caches deserialized nodes and may be shared across multiple trees. NodeCache NodeCache KeyCompare func(_, _ interface{}) (int, error) }
RemoteConfig controls how nodes are persisted and loaded.
type Root ¶
type Root struct { Link *string Size uint64 Height uint8 BranchFactor uint NodeFormat string `json:"NodeFormat,omitempty"` }
Root identifies a version of a tree whose nodes are accessible in the persistent store.
func NewRoot ¶
func NewRoot(remoteOptions *CreateRemoteOptions) *Root
NewRoot creates an empty tree whose nodes will be persisted remotely according to remoteOptions.