arbo

package module
v0.0.0-...-a7c0c5f Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 17, 2024 License: GPL-3.0 Imports: 21 Imported by: 16

README

arbo GoDoc Go Report Card Test

arbo: tree in Esperanto.

Note: There is a fork of this Go module as the tree/arbo package under https://github.com/vocdoni/vocdoni-node. The fork is modified so the tree does not store the orphan nodes and remains as small as the last state. This version of the tree will store all historical nodes, which depending on the use case can be useful or not.

MerkleTree implementation in Go. Compatible with the circomlib implementation of the MerkleTree. Specification: https://docs.iden3.io/publications/pdfs/Merkle-Tree.pdf and https://eprint.iacr.org/2018/955.

Main characteristics of arbo are:

  • Allows to define which hash function to use.
    • So for example, when working with zkSnarks the Poseidon hash function can be used, but when not, it can be used the Blake2b hash function, which has much faster computation time.
    • New hash functions can be plugged by just implementing the interface
  • Parallelizes computation by CPUs

AddBatch

The method tree.AddBatch is designed for the cases where there is a big amount of key-values to be added in the tree. It has the following characteristics:

  • Parallelizes by available CPUs
    • If the tree size is not too big (under the configured threshold):
      • Makes a copy of the tree in memory (VirtualTree)
      • The VirtualTree does not compute any hash, only the relations between the nodes of the tree
        • This step (computing the VirtualTree) is done in parallel in each available CPU until level log2(nCPU)
      • Once the VirtualTree is updated with all the new leafs (key-values) in each corresponent position, it computes all the hashes of each node until the root
        • In this way, each node hash is computed only once, while when adding many key-values using tree.Add method, most of the intermediate nodes will be recalculated each time that a new leaf is added
        • This step (computing all the hashes) is done in parallel in each available CPU
    • If the tree size is avobe the configured threshold:
      • Virtually splits the tree in n sub-trees, where n is the number of available CPUs
      • Each CPU adds the corresponent new leaves into each sub-tree (working in a db tx)
      • Once all sub-trees are updated, puts them together again to compute the new tree root

As result, the method tree.AddBatch goes way faster thant looping over tree.Add, and can compute the tree with parallelization, so as more available CPUs, faster will do the computation.

As an example, this is the benchmark for adding 10k leaves (with 4 CPU cores, AddBatch would get faster with more CPUs (powers of 2)):

Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz with 8GB of RAM
nCPU: 4, nLeafs: 10_000

Using Poseidon hash function:
(go) arbo.AddBatch:	436.866007ms
(go) arbo.Add loop:	5.341122678s
(go) iden3.Add loop:	8.581494317s
(js) circomlibjs:	2m09.351s

And, for example, if instead of using Poseidon hash function we use Blake2b, time is reduced to 80.862805ms.

Usage

// create new database
database, err := db.NewBadgerDB(c.TempDir())

// create new Tree with maxLevels=100 and Blake2b hash function
tree, err := arbo.NewTree(database, 100, arbo.HashFunctionBlake2b)

key := []byte("hello")
value := []byte("world")
// Add a key & value into the merkle tree
err = tree.Add(key, value)

// There are cases where multiple key-values (leafs) are going to be added to a
// Tree, for these cases is more efficient to use:
invalids, err := tree.AddBatch(keys, values)

// generate the merkle proof of a leaf by it's key
value, siblings, err := tree.GenProof(key)

// verify the proof
verified, err := arbo.CheckProof(tree.hashFunction, key, value, tree.Root(), siblings)
if !verified {
	fmt.Println("proof could not be verified")
}

// get the value of a leaf assigned to a key
gettedKey, gettedValue, err := tree.Get(key)

// update the value of a leaf assigned to a key
err = tree.Update(key, value)

// dump the tree (the leafs)
dump, err := tree.Dump(nil) // instead of nil, a root to start from can be used

// import the dump into a tree
err = tree.ImportDump(dump)

// print graphviz diagram of the tree
err = tree.PrintGraphviz(nil) // instead of nil, a root to start from can be used
Usage with SNARKs compatibility

Arbo is designed to be compatible with circom merkle tree's snark-friendly merkletree. The only change needed is the hash function used for the Tree, for example using the Poseidon hash function:

tree, err := arbo.NewTree(database, 32, arbo.HashFunctionPoseidon)

Be aware of the characteristics of this kind of hashes, such as using values inside the finite field used by the hash, and also the computation time.

The interface of arbo uses byte arrays, and for the case of these kind of hashes (that usually work directly with finite field elements), arbo expects those values to be represented by little-endian byte arrays. There is a helper method to convert a *big.Int to []byte using little-endian:

bLen := tree.HashFunction().Len()
kBigInt := big.NewInt(100)

// convert *big.Int to byte array
kBytes := arbo.BigIntToBytes(bLen, kBigInt)

// convert byte array to *big.Int
kBigInt2 := arbo.BytesToBigInt(kBytes)

Documentation

Overview

Package arbo implements a Merkle Tree compatible with the circomlib implementation of the MerkleTree, following the specification from https://docs.iden3.io/publications/pdfs/Merkle-Tree.pdf and https://eprint.iacr.org/2018/955.

Allows to define which hash function to use. So for example, when working with zkSnarks the Poseidon hash function can be used, but when not, it can be used the Blake2b hash function, which has much faster computation time.

Package arbo > vt.go implements the Virtual Tree, which computes a tree without computing any hash. With the idea of once all the leafs are placed in their positions, the hashes can be computed, avoiding computing a node hash more than one time.

Index

Constants

View Source
const (
	// PrefixValueLen defines the bytes-prefix length used for the Value
	// bytes representation stored in the db
	PrefixValueLen = 2

	// PrefixValueEmpty is used for the first byte of a Value to indicate
	// that is an Empty value
	PrefixValueEmpty = 0
	// PrefixValueLeaf is used for the first byte of a Value to indicate
	// that is a Leaf value
	PrefixValueLeaf = 1
	// PrefixValueIntermediate is used for the first byte of a Value to
	// indicate that is a Intermediate value
	PrefixValueIntermediate = 2
)

Variables

View Source
var (
	// BN254BaseField is the base field for the BN254 curve.
	BN254BaseField, _ = new(big.Int).SetString("21888242871839275222246405745257275088548364400416034343698204186575808495617", 10)
	// BLS12377BaseField is the base field for the BLS12377 curve.
	BLS12377BaseField, _ = new(big.Int).SetString("25825498262808887005865186224201665565126143020923472090132963926938185026661", 10)
)
View Source
var (
	// TypeHashSha256 represents the label for the HashFunction of Sha256
	TypeHashSha256 = []byte("sha256")
	// TypeHashPoseidon represents the label for the HashFunction of
	// Poseidon
	TypeHashPoseidon = []byte("poseidon")
	// TypeHashPoseidon represents the label for the HashFunction of
	// Poseidon
	TypeHashMultiPoseidon = []byte("multiposeidon")
	// TypeHashBlake2b represents the label for the HashFunction of Blake2b
	TypeHashBlake2b = []byte("blake2b")
	// TypeHashMiMC_BLS12_377 represents the label for the HashFunction of MiMC
	// over BLS12-377 curve
	TypeHashMiMC_BLS12_377 = []byte("mimc_bls12_377")
	// TypeHashMiMC_BN254 represents the label for the HashFunction of MiMC
	// over BN254 curve
	TypeHashMiMC_BN254 = []byte("mimc_bn254")

	// HashFunctionSha256 contains the HashSha256 struct which implements
	// the HashFunction interface
	HashFunctionSha256 HashSha256
	// HashFunctionPoseidon contains the HashPoseidon struct which implements
	// the HashFunction interface
	HashFunctionPoseidon HashPoseidon
	// HashFunctionMultiPoseidon contains the HashMultiPoseidon struct which implements
	// the HashFunction interface
	HashFunctionMultiPoseidon HashMultiPoseidon
	// HashFunctionBlake2b contains the HashBlake2b struct which implements
	// the HashFunction interface
	HashFunctionBlake2b HashBlake2b
	// HashFunctionMiMC_BLS12_377 contains the HashMiMC_BLS12_377 struct which
	// implements the HashFunction interface
	HashFunctionMiMC_BLS12_377 HashMiMC_BLS12_377
	// HashFunctionMiMC_BN254 contains the HashMiMC_BN254 struct which
	// implements the HashFunction interface
	HashFunctionMiMC_BN254 HashMiMC_BN254
)
View Source
var (
	// DefaultThresholdNLeafs defines the threshold number of leafs in the
	// tree that determines if AddBatch will work in memory or in disk.  It
	// is defined when calling NewTree, and if set to 0 it will work always
	// in disk.
	DefaultThresholdNLeafs = 65536

	// ErrKeyNotFound is used when a key is not found in the db neither in
	// the current db Batch.
	ErrKeyNotFound = fmt.Errorf("key not found")
	// ErrKeyAlreadyExists is used when trying to add a key as leaf to the
	// tree that already exists.
	ErrKeyAlreadyExists = fmt.Errorf("key already exists")
	// ErrInvalidValuePrefix is used when going down into the tree, a value
	// is read from the db and has an unrecognized prefix.
	ErrInvalidValuePrefix = fmt.Errorf("invalid value prefix")
	// ErrDBNoTx is used when trying to use Tree.dbPut but Tree.dbBatch==nil
	ErrDBNoTx = fmt.Errorf("dbPut error: no db Batch")
	// ErrMaxLevel indicates when going down into the tree, the max level is
	// reached
	ErrMaxLevel = fmt.Errorf("max level reached")
	// ErrMaxVirtualLevel indicates when going down into the tree, the max
	// virtual level is reached
	ErrMaxVirtualLevel = fmt.Errorf("max virtual level reached")
	// ErrSnapshotNotEditable indicates when the tree is a non writable
	// snapshot, thus can not be modified
	ErrSnapshotNotEditable = fmt.Errorf("snapshot tree can not be edited")
	// ErrTreeNotEmpty indicates when the tree was expected to be empty and
	// it is not
	ErrTreeNotEmpty = fmt.Errorf("tree is not empty")
)

Functions

func BigIntToBytes

func BigIntToBytes(blen int, bi *big.Int) []byte

BigIntToBytes converts a *big.Int into a byte array in Little-Endian

func BigToFF

func BigToFF(baseField, iv *big.Int) *big.Int

BigToFF function returns the finite field representation of the big.Int provided. It uses the curve scalar field to represent the provided number.

func BytesToBigInt

func BytesToBigInt(b []byte) *big.Int

BytesToBigInt converts a byte array in Little-Endian representation into *big.Int

func CheckProof

func CheckProof(hashFunc HashFunction, k, v, root, packedSiblings []byte) (bool, error)

CheckProof verifies the given proof. The proof verification depends on the HashFunction passed as parameter.

func PackSiblings

func PackSiblings(hashFunc HashFunction, siblings [][]byte) ([]byte, error)

PackSiblings packs the siblings into a byte array. [ 2 byte | 2 byte | L bytes | S * N bytes ] [ full length | bitmap length (L) | bitmap | N non-zero siblings ] Where the bitmap indicates if the sibling is 0 or a value from the siblings array. And S is the size of the output of the hash function used for the Tree. The 2 2-byte that define the full length and bitmap length, are encoded in little-endian.

func ReadIntermediateChilds

func ReadIntermediateChilds(b []byte) ([]byte, []byte)

ReadIntermediateChilds reads from a byte array the two childs keys

func ReadLeafValue

func ReadLeafValue(b []byte) ([]byte, []byte)

ReadLeafValue reads from a byte array the leaf key & value

func SwapEndianness

func SwapEndianness(b []byte) []byte

SwapEndianness swaps the order of the bytes in the byte slice.

func UnpackSiblings

func UnpackSiblings(hashFunc HashFunction, b []byte) ([][]byte, error)

UnpackSiblings unpacks the siblings from a byte array.

Types

type CircomVerifierProof

type CircomVerifierProof struct {
	Root     []byte   `json:"root"`
	Siblings [][]byte `json:"siblings"`
	OldKey   []byte   `json:"oldKey"`
	OldValue []byte   `json:"oldValue"`
	IsOld0   bool     `json:"isOld0"`
	Key      []byte   `json:"key"`
	Value    []byte   `json:"value"`
	Fnc      int      `json:"fnc"` // 0: inclusion, 1: non inclusion
}

CircomVerifierProof contains the needed data to check a Circom Verifier Proof inside a circom circuit. CircomVerifierProof allow to verify through a zkSNARK proof the inclusion/exclusion of a leaf in a tree.

func (CircomVerifierProof) MarshalJSON

func (cvp CircomVerifierProof) MarshalJSON() ([]byte, error)

MarshalJSON implements the JSON marshaler

type Config

type Config struct {
	Database        db.Database
	MaxLevels       int
	ThresholdNLeafs int
	HashFunction    HashFunction
}

Config defines the configuration for calling NewTree & NewTreeWithTx methods

type HashBlake2b

type HashBlake2b struct{}

HashBlake2b implements the HashFunction interface for the Blake2b hash

func (HashBlake2b) Hash

func (f HashBlake2b) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashBlake2b

func (HashBlake2b) Len

func (f HashBlake2b) Len() int

Len returns the length of the Hash output

func (HashBlake2b) Type

func (f HashBlake2b) Type() []byte

Type returns the type of HashFunction for the HashBlake2b

type HashFunction

type HashFunction interface {
	Type() []byte
	Len() int
	Hash(...[]byte) ([]byte, error)
}

HashFunction defines the interface that is expected for a hash function to be used in a generic way in the Tree.

type HashMiMC_BLS12_377

type HashMiMC_BLS12_377 struct{}

HashMiMC_BLS12_377 implements the HashFunction interface for the MiMC hash over the BLS12-377 curve

func (HashMiMC_BLS12_377) Hash

func (f HashMiMC_BLS12_377) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashMiMC_BLS12_377

func (HashMiMC_BLS12_377) Len

func (f HashMiMC_BLS12_377) Len() int

Len returns the length of the Hash output for the HashMiMC_BLS12_377

func (HashMiMC_BLS12_377) Type

func (f HashMiMC_BLS12_377) Type() []byte

Type returns the type of HashFunction for the HashMiMC_BLS12_377

type HashMiMC_BN254

type HashMiMC_BN254 struct{}

HashMiMC_BN254 implements the HashFunction interface for the MiMC hash over the BN254 curve

func (HashMiMC_BN254) Hash

func (f HashMiMC_BN254) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashMiMC_BN254

func (HashMiMC_BN254) Len

func (f HashMiMC_BN254) Len() int

Len returns the length of the Hash output for the HashMiMC_BN254

func (HashMiMC_BN254) Type

func (f HashMiMC_BN254) Type() []byte

Type returns the type of HashFunction for the HashMiMC_BN254

type HashMultiPoseidon

type HashMultiPoseidon struct{}

HashMultiPoseidon implements the HashFunction interface for the MultiPoseidon hash

func (HashMultiPoseidon) Hash

func (f HashMultiPoseidon) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashMultiPoseidon. It expects the byte arrays to be little-endian representations of big.Int values. Notably, if any input is longer than 32 bytes (f.Len()), it will split it into 32 bytes chunks and interpret each of them as a big.Int value. so Hash({[64]byte}) and Hash({[32]byte, [32]byte}) will yield the same result.

func (HashMultiPoseidon) Len

func (f HashMultiPoseidon) Len() int

Len returns the length of the Hash output

func (HashMultiPoseidon) Type

func (f HashMultiPoseidon) Type() []byte

Type returns the type of HashFunction for the HashMultiPoseidon

type HashPoseidon

type HashPoseidon struct{}

HashPoseidon implements the HashFunction interface for the Poseidon hash

func (HashPoseidon) Hash

func (f HashPoseidon) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashPoseidon. It expects the byte arrays to be little-endian representations of big.Int values.

func (HashPoseidon) Len

func (f HashPoseidon) Len() int

Len returns the length of the Hash output

func (HashPoseidon) Type

func (f HashPoseidon) Type() []byte

Type returns the type of HashFunction for the HashPoseidon

type HashSha256

type HashSha256 struct{}

HashSha256 implements the HashFunction interface for the Sha256 hash

func (HashSha256) Hash

func (f HashSha256) Hash(b ...[]byte) ([]byte, error)

Hash implements the hash method for the HashFunction HashSha256

func (HashSha256) Len

func (f HashSha256) Len() int

Len returns the length of the Hash output

func (HashSha256) Type

func (f HashSha256) Type() []byte

Type returns the type of HashFunction for the HashSha256

type Invalid

type Invalid struct {
	Index int
	Error error
}

Invalid is used when a key-value can not be added trough AddBatch, and contains the index of the key-value and the error.

type Tree

type Tree struct {
	sync.Mutex
	// contains filtered or unexported fields
}

Tree defines the struct that implements the MerkleTree functionalities

func NewTree

func NewTree(cfg Config) (*Tree, error)

NewTree returns a new Tree, if there is a Tree still in the given database, it will load it.

func NewTreeWithTx

func NewTreeWithTx(wTx db.WriteTx, cfg Config) (*Tree, error)

NewTreeWithTx returns a new Tree using the given db.WriteTx, which will not be ccommited inside this method, if there is a Tree still in the given database, it will load it.

func (*Tree) Add

func (t *Tree) Add(k, v []byte) error

Add inserts the key-value into the Tree. If the inputs come from a *big.Int, is expected that are represented by a Little-Endian byte array (for circom compatibility).

func (*Tree) AddBatch

func (t *Tree) AddBatch(keys, values [][]byte) ([]Invalid, error)

AddBatch adds a batch of key-values to the Tree. Returns an array containing the indexes of the keys failed to add. Supports empty values as input parameters, which is equivalent to 0 valued byte array.

func (*Tree) AddBatchWithTx

func (t *Tree) AddBatchWithTx(wTx db.WriteTx, keys, values [][]byte) ([]Invalid, error)

AddBatchWithTx does the same than the AddBatch method, but allowing to pass the db.WriteTx that is used. The db.WriteTx will not be committed inside this method.

func (*Tree) AddWithTx

func (t *Tree) AddWithTx(wTx db.WriteTx, k, v []byte) error

AddWithTx does the same than the Add method, but allowing to pass the db.WriteTx that is used. The db.WriteTx will not be committed inside this method.

func (*Tree) Dump

func (t *Tree) Dump(fromRoot []byte) ([]byte, error)

Dump exports all the Tree leafs in a byte array

func (*Tree) DumpWriter

func (t *Tree) DumpWriter(fromRoot []byte, w io.Writer) error

DumpWriter exports all the Tree leafs writing the bytes in the given Writer

func (*Tree) FillMissingEmptySiblings

func (t *Tree) FillMissingEmptySiblings(s [][]byte) [][]byte

FillMissingEmptySiblings adds the empty values to the array of siblings for the Tree number of max levels

func (*Tree) GenProof

func (t *Tree) GenProof(k []byte) ([]byte, []byte, []byte, bool, error)

GenProof generates a MerkleTree proof for the given key. The leaf value is returned, together with the packed siblings of the proof, and a boolean parameter that indicates if the proof is of existence (true) or not (false).

func (*Tree) GenProofWithTx

func (t *Tree) GenProofWithTx(rTx db.Reader, k []byte) ([]byte, []byte, []byte, bool, error)

GenProofWithTx does the same than the GenProof method, but allowing to pass the db.ReadTx that is used.

func (*Tree) GenerateCircomVerifierProof

func (t *Tree) GenerateCircomVerifierProof(k []byte) (*CircomVerifierProof, error)

GenerateCircomVerifierProof generates a CircomVerifierProof for a given key in the Tree

func (*Tree) Get

func (t *Tree) Get(k []byte) ([]byte, []byte, error)

Get returns the value in the Tree for a given key. If the key is not found, will return the error ErrKeyNotFound, and in the leafK & leafV parameters will be placed the data found in the tree in the leaf that was on the path going to the input key.

func (*Tree) GetNLeafs

func (t *Tree) GetNLeafs() (int, error)

GetNLeafs returns the number of Leafs of the Tree.

func (*Tree) GetNLeafsWithTx

func (t *Tree) GetNLeafsWithTx(rTx db.Reader) (int, error)

GetNLeafsWithTx does the same than the GetNLeafs method, but allowing to pass the db.ReadTx that is used.

func (*Tree) GetWithTx

func (t *Tree) GetWithTx(rTx db.Reader, k []byte) ([]byte, []byte, error)

GetWithTx does the same than the Get method, but allowing to pass the db.ReadTx that is used. If the key is not found, will return the error ErrKeyNotFound, and in the leafK & leafV parameters will be placed the data found in the tree in the leaf that was on the path going to the input key.

func (*Tree) Graphviz

func (t *Tree) Graphviz(w io.Writer, fromRoot []byte) error

Graphviz iterates across the full tree to generate a string Graphviz representation of the tree and writes it to w

func (*Tree) GraphvizFirstNLevels

func (t *Tree) GraphvizFirstNLevels(w io.Writer, fromRoot []byte, untilLvl int) error

GraphvizFirstNLevels iterates across the first NLevels of the tree to generate a string Graphviz representation of the first NLevels of the tree and writes it to w

func (*Tree) HashFunction

func (t *Tree) HashFunction() HashFunction

HashFunction returns Tree.hashFunction

func (*Tree) ImportDump

func (t *Tree) ImportDump(b []byte) error

ImportDump imports the leafs (that have been exported with the Dump method) in the Tree, reading them from the given byte array.

func (*Tree) ImportDumpReader

func (t *Tree) ImportDumpReader(r io.Reader) error

ImportDumpReader imports the leafs (that have been exported with the Dump method) in the Tree, reading them from the given reader.

func (*Tree) Iterate

func (t *Tree) Iterate(fromRoot []byte, f func([]byte, []byte)) error

Iterate iterates through the full Tree, executing the given function on each node of the Tree.

func (*Tree) IterateWithStop

func (t *Tree) IterateWithStop(fromRoot []byte, f func(int, []byte, []byte) bool) error

IterateWithStop does the same than Iterate, but with int for the current level, and a boolean parameter used by the passed function, is to indicate to stop iterating on the branch when the method returns 'true'.

func (*Tree) IterateWithStopWithTx

func (t *Tree) IterateWithStopWithTx(rTx db.Reader, fromRoot []byte,
	f func(int, []byte, []byte) bool) error

IterateWithStopWithTx does the same than the IterateWithStop method, but allowing to pass the db.ReadTx that is used.

func (*Tree) IterateWithTx

func (t *Tree) IterateWithTx(rTx db.Reader, fromRoot []byte, f func([]byte, []byte)) error

IterateWithTx does the same than the Iterate method, but allowing to pass the db.ReadTx that is used.

func (*Tree) PrintGraphviz

func (t *Tree) PrintGraphviz(fromRoot []byte) error

PrintGraphviz prints the output of Tree.Graphviz

func (*Tree) PrintGraphvizFirstNLevels

func (t *Tree) PrintGraphvizFirstNLevels(fromRoot []byte, untilLvl int) error

PrintGraphvizFirstNLevels prints the output of Tree.GraphvizFirstNLevels

func (*Tree) Root

func (t *Tree) Root() ([]byte, error)

Root returns the root of the Tree

func (*Tree) RootWithTx

func (t *Tree) RootWithTx(rTx db.Reader) ([]byte, error)

RootWithTx returns the root of the Tree using the given db.ReadTx

func (*Tree) SetRoot

func (t *Tree) SetRoot(root []byte) error

SetRoot sets the root to the given root

func (*Tree) SetRootWithTx

func (t *Tree) SetRootWithTx(wTx db.WriteTx, root []byte) error

SetRootWithTx sets the root to the given root using the given db.WriteTx

func (*Tree) Snapshot

func (t *Tree) Snapshot(fromRoot []byte) (*Tree, error)

Snapshot returns a read-only copy of the Tree from the given root

func (*Tree) Update

func (t *Tree) Update(k, v []byte) error

Update updates the value for a given existing key. If the given key does not exist, returns an error.

func (*Tree) UpdateWithTx

func (t *Tree) UpdateWithTx(wTx db.WriteTx, k, v []byte) error

UpdateWithTx does the same than the Update method, but allowing to pass the db.WriteTx that is used. The db.WriteTx will not be committed inside this method.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL