bmt

package
v0.0.0-...-0312305 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 3, 2022 License: BSD-3-Clause Imports: 4 Imported by: 0

Documentation

Overview

Package bmt implements Binary Merkle Tree hash. Binary Merkle Tree Hash is a hash function over arbitrary byte slices of limited size. The BMT hash is defined as H(header|bmt-root) where header is an 8-byte metadata prefix and bmt-root is the root hash of the binary merkle tree built over fixed size segments of the underlying chunk using any base hash function H (e.g., keccak 256 SHA3). The segment size is the same as the hash size of H. The number of segments on the base must be a power of 2 so that the resulting tree is balanced. Chunks with data shorter than the fixed size are hashed as if they had zero padding.

BMT hash is used as the chunk hash function in penguin which in turn is the basis for the 128 branching penguin hash used to represent files.

The BMT is optimal for providing compact inclusion proofs, i.e. prove that a segment is a substring of a chunk starting at a particular offset. The size of the underlying segments is fixed to the size of the base hash (called the resolution of the BMT hash), Using Keccak256 SHA3 hash is 32 bytes, the EVM word size to optimize for on-chain BMT verification as well as the hash size optimal for inclusion proofs in the merkle tree of the penguin hash.

Two implementations are provided:

RefHasher is optimized for code simplicity and meant as a reference implementation that is simple to understand

Hasher is optimized for speed taking advantage of concurrency with minimalistic concurrency control.

BMT Hasher implements the following interfaces:

- standard golang hash.Hash - synchronous, reusable

- io.Writer - synchronous left-to-right datawriter.

Index

Constants

View Source
const (
	SpanSize = 8
)

Variables

This section is empty.

Functions

func LengthToSpan

func LengthToSpan(length int64) []byte

LengthToSpan creates a binary data span size representation. It is required for calculating the BMT hash.

Types

type BaseHasherFunc

type BaseHasherFunc func() hash.Hash

BaseHasherFunc is a hash.Hash constructor function used for the base hash of the BMT. implemented by Keccak256 SHA3 sha3.NewLegacyKeccak256

type Conf

type Conf struct {
	// contains filtered or unexported fields
}

configuration

func NewConf

func NewConf(hasher BaseHasherFunc, segmentCount, capacity int) *Conf

type Hash

type Hash interface {
	hash.Hash

	// SetHeaderInt64 sets the header bytes of BMT hash to the little endian binary representation of the int64 argument.
	SetHeaderInt64(int64)

	// SetHeader sets the header bytes of BMT hash by copying the first 8 bytes of the argument.
	SetHeader([]byte)

	// Hash calculates the BMT hash of the buffer written so far and appends it to the argument
	Hash([]byte) ([]byte, error)

	// Capacity returns the maximum amount of bytes that will be processed by the implementation.
	Capacity() int
}

Hash provides the necessary extension of the hash interface to add the length-prefix of the BMT hash.

Any implementation should make it possible to generate a BMT hash using the hash.Hash interface only. However, the limitation will be that the Span of the BMT hash always must be limited to the amount of bytes actually written.

type Hasher

type Hasher struct {
	*Conf // configuration
	// contains filtered or unexported fields
}

Hasher is a reusable hasher for fixed maximum size chunks representing a BMT It reuses a pool of trees for amortised memory allocation and resource control, and supports order-agnostic concurrent segment writes and section (double segment) writes as well as sequential read and write.

The same hasher instance must not be called concurrently on more than one chunk.

The same hasher instance is synchronously reuseable.

Sum gives back the tree to the pool and guaranteed to leave the tree and itself in a state reusable for hashing a new chunk.

func (*Hasher) BlockSize

func (h *Hasher) BlockSize() int

BlockSize returns the optimal write size to the Hasher

func (*Hasher) Capacity

func (h *Hasher) Capacity() int

Capacity returns the maximum amount of bytes that will be processed by this hasher implementation. since BMT assumes a balanced binary tree, capacity it is always a power of 2

func (*Hasher) Hash

func (h *Hasher) Hash(b []byte) ([]byte, error)

Hash returns the BMT root hash of the buffer and an error using Hash presupposes sequential synchronous writes (io.Writer interface).

func (*Hasher) Reset

func (h *Hasher) Reset()

Reset prepares the Hasher for reuse

func (*Hasher) SetHeader

func (h *Hasher) SetHeader(span []byte)

SetHeader sets the metadata preamble to the span bytes given argument for the current hash operation.

func (*Hasher) SetHeaderInt64

func (h *Hasher) SetHeaderInt64(length int64)

SetHeaderInt64 sets the metadata preamble to the little endian binary representation of int64 argument for the current hash operation.

func (*Hasher) Size

func (h *Hasher) Size() int

Size returns the digest size of the hash

func (*Hasher) Sum

func (h *Hasher) Sum(b []byte) []byte

Sum returns the BMT root hash of the buffer, unsafe version of Hash

func (*Hasher) Write

func (h *Hasher) Write(b []byte) (int, error)

Write calls sequentially add to the buffer to be hashed, with every full segment calls processSection in a go routine.

type Pool

type Pool struct {
	*Conf // configuration
	// contains filtered or unexported fields
}

Pool provides a pool of trees used as resources by the BMT Hasher. A tree popped from the pool is guaranteed to have a clean state ready for hashing a new chunk.

func NewPool

func NewPool(c *Conf) *Pool

NewPool creates a tree pool with hasher, segment size, segment count and capacity it reuses free trees or creates a new one if capacity is not reached.

func (*Pool) Get

func (p *Pool) Get() *Hasher

Get returns a BMT hasher possibly reusing a tree from the pool

func (*Pool) Put

func (p *Pool) Put(h *Hasher)

Put is called after using a bmt hasher to return the tree to a pool for reuse

Directories

Path Synopsis
Package reference is a simple nonconcurrent reference implementation for hashsize segment based Binary Merkle tree hash on arbitrary but fixed maximum chunksize n where 0 <= n <= 4096 This implementation does not take advantage of any paralellisms and uses far more memory than necessary, but it is easy to see that it is correct.
Package reference is a simple nonconcurrent reference implementation for hashsize segment based Binary Merkle tree hash on arbitrary but fixed maximum chunksize n where 0 <= n <= 4096 This implementation does not take advantage of any paralellisms and uses far more memory than necessary, but it is easy to see that it is correct.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL