store

package

Version: v3.0.1-0...-844edf2 (latest) | Published: Jun 23, 2022 | License: Apache-2.0 | Imports: 7 | Imported by: 0

README ¶

Prefix-Sum B-Tree specification

This module implements a B-Tree suited to efficiently computing an arbitrary prefix sum over a data set, while still allowing that data to be updated efficiently.

The prefix sums for N elements x_1, x_2, ..., x_N, each with a weight field, form the sequence y_1, y_2, ..., y_N, where y_i = sum_{1 <= j <= i} x_j.weight. This data structure lets one edit, insert, and delete entries in the x sequence, and retrieve the prefix sum at any index, each in O(log(N)) state operations. (Note that in the Cosmos SDK stack, a state operation is itself liable to take O(log(N)) time.)

This is built for the use case where we have a series of data sorted by time, and each time has an associated weight. We want to find, very quickly, the total weight of all leaves with times less than or equal to t. The implementation itself is agnostic to which field the data is sorted by.

Data structure idea

The idea underlying this can be decomposed into two parts:

1. Do some extra (O(log(N))) work when modifying the data, to allow efficiently computing any prefix sum.
2. Allow data entries to be inserted and deleted while the data remains sorted.

The solution for 1. is to build a balanced tree on top of the data. Every inner node in the tree is augmented with an "accumulated weight" field, which contains the sum of the weights of all entries below it. Notice that when the weight of any leaf is updated, the weights of all nodes "above" that leaf can be updated efficiently, as there are only O(log(N)) such nodes. Furthermore, the augmented value at the root of the tree is the sum of all weights in the tree.

To query the jth prefix sum, first identify the path to the jth leaf in the tree. Keep a running tally of the "prefix sum thus far", initialized to the sum of all weights in the tree. Then, as you walk the path from the root to the jth leaf, whenever the node you step into has siblings to its right, subtract their accumulated weights from the running total. The total when you arrive at the leaf is the prefix sum.

Let's illustrate this with a binary tree.

[Figure: binary tree]

If we want the prefix sum for leaf 12, we compute it as 1.weight - 7.weight - 13.weight. (We subtract once for every left branch we take.)
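The leaf-12 computation can be reproduced in a few lines. The sketch below assumes the figure numbers nodes in heap order (root 1, children of node i at 2i and 2i+1, leaves 8 to 15), which is consistent with the formula above but is still an assumption; `buildAcc` and `prefixSum` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math/bits"
)

// buildAcc returns heap-ordered accumulated weights for a complete binary
// tree. len(weights) must be a power of two; leaf k is stored at index n+k.
func buildAcc(weights []uint64) []uint64 {
	n := len(weights)
	acc := make([]uint64, 2*n)
	copy(acc[n:], weights)
	for i := n - 1; i >= 1; i-- {
		acc[i] = acc[2*i] + acc[2*i+1]
	}
	return acc
}

// prefixSum walks from the root down to the given leaf index. It starts
// with the total weight and, each time the path steps into a left child,
// subtracts the accumulated weight of that child's right sibling.
func prefixSum(acc []uint64, leaf int) uint64 {
	sum := acc[1]
	for shift := bits.Len(uint(leaf)) - 2; shift >= 0; shift-- {
		node := leaf >> shift // next node on the root-to-leaf path
		if node%2 == 0 {      // left child: right sibling is node+1
			sum -= acc[node+1]
		}
	}
	return sum
}

func main() {
	acc := buildAcc([]uint64{1, 2, 3, 4, 5, 6, 7, 8}) // leaves 8..15
	fmt.Println(prefixSum(acc, 12))                   // 15
	fmt.Println(acc[1] - acc[7] - acc[13])            // 15, same as the formula
}
```

For leaf 12 the path is 1, 3, 6, 12. The step into 3 is a right branch, so nothing is subtracted there; the steps into 6 and 12 are left branches, so nodes 7 and 13 are subtracted.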

Notice that this solution works for any tree type. The cost of a query is proportional to (number of siblings) * (depth), so as long as the tree is parameterized to keep that product at O(log(N)), we maintain our definition of efficient.

Thus we immediately get a solution to 2. by using a tree that supports efficient inserts and deletions while still maintaining logarithmic depth and a constant number of siblings. We opt for a B+ tree, as it performs the task well and does not require rebalance operations (which can be costly in an adversarial environment).

Implementation Details

The B-Tree implementation under osmosis/store is designed specifically to allow efficient computation of an arbitrary prefix sum, with the underlying data being updatable as explained above.

Every leaf has a weight, and the address it is stored at in state is the key we want to sort by. Leaves are sorted by their byte-slice keys, and each branch node stores an accumulated weight for each of its children.

A node is referenced by a node struct, which is used internally.

type node struct {
    t Tree
    level uint8
    key []byte
}

A node struct is a pointer to the key-value pair stored under nodeKey(node.level, node.key).

A leaf node is simply a uint64 value stored under nodeKey(0, key). The key is a byte slice of arbitrary length.

type Leaf struct {
    Value uint64
}

A branch node consists of the keys and accumulated values of its child nodes.

type Branch struct {
    Children []Child    
}

type Child struct {
    Key []byte
    Acc uint64
}

The following constraints hold for all branch nodes:

For c in node.Branch().Children, the node corresponding to c is stored under nodeKey(node.level-1, c.Key).
For c in node.Branch().Children, c.Acc is the sum of c'.Acc over every c' in the Children of the node that c refers to. Where c' is a leaf node, use Leaf.Value in place of c'.Acc.
For c in node.Branch().Children, c.Key is greater than or equal to node.key and less than node.rightSibling().key.
No child appears in the Children of more than one node.

Example

Here is an example tree data:

- Level 2 nil
  - Level 1 0xaaaa 
    - Level 0 0xaaaa Value 10
    - Level 0 0xaaaa01 Value 20
    - Level 0 0xaabb Value 30
  - Level 1 0xbb44
    - Level 0 0xbb55 Value 100
    - Level 0 0xbe Value 200
  - Level 1 0xeeaaaa
    - Level 0 0xef1234 Value 300
    - Level 0 0xffff Value 400

The branch nodes will have the following children:

require.Equal(store.Get(nodeKey(2, nil)), Children{{0xaaaa, 60}, {0xbb44, 300}, {0xeeaaaa, 700}})
require.Equal(store.Get(nodeKey(1, 0xaaaa)), Children{{0xaaaa, 10}, {0xaaaa01, 20}, {0xaabb, 30}})
require.Equal(store.Get(nodeKey(1, 0xbb44)), Children{{0xbb55, 100}, {0xbe, 200}})
require.Equal(store.Get(nodeKey(1, 0xeeaaaa)), Children{{0xef1234, 300}, {0xffff, 400}})
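To make the example concrete, the sketch below loads the same branch nodes into an in-memory map and computes a prefix sum by descending from the root. It is an illustration under stated assumptions, not the package's implementation: `storeKey` and `memStore` are hypothetical stand-ins for nodeKey and the KVStore, and the query adds the accumulations of left siblings rather than subtracting right siblings from the total, which yields the same result.

```go
package main

import (
	"bytes"
	"fmt"
)

type Child struct {
	Key []byte
	Acc uint64
}

type Branch struct {
	Children []Child
}

// storeKey is a hypothetical stand-in for nodeKey(level, key).
type storeKey struct {
	level uint8
	key   string
}

// memStore is a hypothetical in-memory replacement for the KVStore.
type memStore map[storeKey]Branch

func exampleStore() memStore {
	return memStore{
		storeKey{2, ""}:             Branch{[]Child{{[]byte{0xaa, 0xaa}, 60}, {[]byte{0xbb, 0x44}, 300}, {[]byte{0xee, 0xaa, 0xaa}, 700}}},
		storeKey{1, "\xaa\xaa"}:     Branch{[]Child{{[]byte{0xaa, 0xaa}, 10}, {[]byte{0xaa, 0xaa, 0x01}, 20}, {[]byte{0xaa, 0xbb}, 30}}},
		storeKey{1, "\xbb\x44"}:     Branch{[]Child{{[]byte{0xbb, 0x55}, 100}, {[]byte{0xbe}, 200}}},
		storeKey{1, "\xee\xaa\xaa"}: Branch{[]Child{{[]byte{0xef, 0x12, 0x34}, 300}, {[]byte{0xff, 0xff}, 400}}},
	}
}

// prefixSum returns the total value of all leaves with keys <= target,
// starting from the branch stored at (level, key).
func prefixSum(s memStore, level uint8, key, target []byte) uint64 {
	var sum uint64
	for {
		children := s[storeKey{level, string(key)}].Children
		idx := -1 // index of the last child whose key is <= target
		for i, c := range children {
			if bytes.Compare(c.Key, target) <= 0 {
				idx = i
			}
		}
		if idx < 0 {
			return sum // every child here sorts after target
		}
		for _, c := range children[:idx] {
			sum += c.Acc // children entirely to the left of the path
		}
		if level == 1 {
			return sum + children[idx].Acc // children are leaves
		}
		level, key = level-1, children[idx].Key
	}
}

func main() {
	s := exampleStore()
	fmt.Println(prefixSum(s, 2, nil, []byte{0xbb, 0x55})) // 160
	fmt.Println(prefixSum(s, 2, nil, []byte{0xff, 0xff})) // 1060
}
```

The query for 0xbb55 adds 60 for the 0xaaaa subtree at the root, then 100 for the 0xbb55 leaf, giving 160.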

Documentation ¶

Index ¶

type Tree
- func NewTree(store store.KVStore, m uint8) Tree

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Tree ¶

type Tree struct {
	// contains filtered or unexported fields
}

Tree is an augmented B+ tree implementation. Each branch holds a key index slice of size m. Each key index marks the start (inclusive) of the corresponding child node's key range, and the end (exclusive) of the previous child node's key range. TODO: We should abstract out the leaves of this tree to allow more data aside from the accumulation value to go there.

func NewTree ¶

func NewTree(store store.KVStore, m uint8) Tree

func (Tree) DebugVisualize ¶

func (t Tree) DebugVisualize()

DebugVisualize prints the entire tree to stdout

func (Tree) Get ¶

func (t Tree) Get(key []byte) (res sdk.Int)

Get returns the (sdk.Int) accumulation value at a given leaf.

func (Tree) Iterator ¶

func (t Tree) Iterator(begin, end []byte) store.Iterator

func (Tree) PrefixSum ¶

func (t Tree) PrefixSum(key []byte) sdk.Int

PrefixSum returns the total weight of all leaves with keys less than or equal to the provided key.

func (Tree) Remove ¶

func (t Tree) Remove(key []byte)

func (Tree) ReverseIterator ¶

func (t Tree) ReverseIterator(begin, end []byte) store.Iterator

func (Tree) Set ¶

func (t Tree) Set(key []byte, acc sdk.Int)

func (Tree) SplitAcc ¶

func (t Tree) SplitAcc(key []byte) (sdk.Int, sdk.Int, sdk.Int)

func (Tree) SubsetAccumulation ¶

func (t Tree) SubsetAccumulation(start []byte, end []byte) sdk.Int

SubsetAccumulation returns the total value of all leaves with keys between start and end (inclusive of both ends). If start is nil, the range begins at the beginning of the tree. If end is nil, the range extends to the end of the tree.
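The documented range semantics can be pinned down with a naive linear-scan model, which may be useful as a test oracle. This is a sketch only: it does not call the package, `entry` and the lowercase function names are hypothetical, and `uint64` stands in for sdk.Int.

```go
package main

import (
	"bytes"
	"fmt"
)

// entry is a hypothetical in-memory leaf: a byte-slice key and its value.
type entry struct {
	key   []byte
	value uint64
}

// subsetAccumulation sums the values of entries whose keys lie in
// [start, end], both ends inclusive. A nil start means "from the first
// key"; a nil end means "through the last key".
func subsetAccumulation(entries []entry, start, end []byte) uint64 {
	var sum uint64
	for _, e := range entries {
		if start != nil && bytes.Compare(e.key, start) < 0 {
			continue
		}
		if end != nil && bytes.Compare(e.key, end) > 0 {
			continue
		}
		sum += e.value
	}
	return sum
}

// prefixSum sums the values of all entries with keys <= key.
func prefixSum(entries []entry, key []byte) uint64 {
	return subsetAccumulation(entries, nil, key)
}

func main() {
	// The leaves from the README example.
	leaves := []entry{
		{[]byte{0xaa, 0xaa}, 10},
		{[]byte{0xaa, 0xaa, 0x01}, 20},
		{[]byte{0xaa, 0xbb}, 30},
		{[]byte{0xbb, 0x55}, 100},
		{[]byte{0xbe}, 200},
		{[]byte{0xef, 0x12, 0x34}, 300},
		{[]byte{0xff, 0xff}, 400},
	}
	fmt.Println(subsetAccumulation(leaves, []byte{0xaa, 0xbb}, []byte{0xbe})) // 330
	fmt.Println(prefixSum(leaves, []byte{0xbb, 0x55}))                        // 160
	fmt.Println(subsetAccumulation(leaves, nil, nil))                         // 1060
}
```

The scan is O(N) per query; the tree exists to answer the same questions in O(log(N)) state operations.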

func (Tree) TotalAccumulatedValue ¶

func (t Tree) TotalAccumulatedValue() sdk.Int

TotalAccumulatedValue returns the sum of the weights for all leaves
