Documentation ¶
Overview ¶
Package bmt implements Binary Merkle Tree hash. Binary Merkle Tree Hash is a hash function over arbitrary byte slices of limited size. The BMT hash is defined as H(header|bmt-root) where header is an 8-byte metadata prefix and bmt-root is the root hash of the binary merkle tree built over fixed size segments of the underlying chunk using any base hash function H (e.g., keccak 256 SHA3). The segment size is the same as the hash size of H. The number of segments on the base must be a power of 2 so that the resulting tree is balanced. Chunks with data shorter than the fixed size are hashed as if they had zero padding.
BMT hash is used as the chunk hash function in penguin which in turn is the basis for the 128 branching penguin hash used to represent files.
The BMT is optimal for providing compact inclusion proofs, i.e. prove that a segment is a substring of a chunk starting at a particular offset. The size of the underlying segments is fixed to the size of the base hash (called the resolution of the BMT hash), Using Keccak256 SHA3 hash is 32 bytes, the EVM word size to optimize for on-chain BMT verification as well as the hash size optimal for inclusion proofs in the merkle tree of the penguin hash.
Two implementations are provided:
RefHasher is optimized for code simplicity and meant as a reference implementation that is simple to understand
Hasher is optimized for speed taking advantage of concurrency with minimalistic concurrency control.
BMT Hasher implements the following interfaces:
- standard golang hash.Hash - synchronous, reusable
- io.Writer - synchronous left-to-right datawriter.
Index ¶
- Constants
- func LengthToSpan(length int64) []byte
- type BaseHasherFunc
- type Conf
- type Hash
- type Hasher
- func (h *Hasher) BlockSize() int
- func (h *Hasher) Capacity() int
- func (h *Hasher) Hash(b []byte) ([]byte, error)
- func (h *Hasher) Reset()
- func (h *Hasher) SetHeader(span []byte)
- func (h *Hasher) SetHeaderInt64(length int64)
- func (h *Hasher) Size() int
- func (h *Hasher) Sum(b []byte) []byte
- func (h *Hasher) Write(b []byte) (int, error)
- type Pool
Constants ¶
const (
SpanSize = 8
)
Variables ¶
This section is empty.
Functions ¶
func LengthToSpan ¶
LengthToSpan creates a binary data span size representation. It is required for calculating the BMT hash.
Types ¶
type BaseHasherFunc ¶
BaseHasherFunc is a hash.Hash constructor function used for the base hash of the BMT. implemented by Keccak256 SHA3 sha3.NewLegacyKeccak256
type Conf ¶
type Conf struct {
// contains filtered or unexported fields
}
configuration
func NewConf ¶
func NewConf(hasher BaseHasherFunc, segmentCount, capacity int) *Conf
type Hash ¶
type Hash interface { hash.Hash // SetHeaderInt64 sets the header bytes of BMT hash to the little endian binary representation of the int64 argument. SetHeaderInt64(int64) // SetHeader sets the header bytes of BMT hash by copying the first 8 bytes of the argument. SetHeader([]byte) // Hash calculates the BMT hash of the buffer written so far and appends it to the argument Hash([]byte) ([]byte, error) // Capacity returns the maximum amount of bytes that will be processed by the implementation. Capacity() int }
Hash provides the necessary extension of the hash interface to add the length-prefix of the BMT hash.
Any implementation should make it possible to generate a BMT hash using the hash.Hash interface only. However, the limitation will be that the Span of the BMT hash always must be limited to the amount of bytes actually written.
type Hasher ¶
type Hasher struct { *Conf // configuration // contains filtered or unexported fields }
Hasher is a reusable hasher for fixed maximum size chunks representing a BMT It reuses a pool of trees for amortised memory allocation and resource control, and supports order-agnostic concurrent segment writes and section (double segment) writes as well as sequential read and write.
The same hasher instance must not be called concurrently on more than one chunk.
The same hasher instance is synchronously reuseable.
Sum gives back the tree to the pool and guaranteed to leave the tree and itself in a state reusable for hashing a new chunk.
func (*Hasher) Capacity ¶
Capacity returns the maximum amount of bytes that will be processed by this hasher implementation. since BMT assumes a balanced binary tree, capacity it is always a power of 2
func (*Hasher) Hash ¶
Hash returns the BMT root hash of the buffer and an error using Hash presupposes sequential synchronous writes (io.Writer interface).
func (*Hasher) SetHeader ¶
SetHeader sets the metadata preamble to the span bytes given argument for the current hash operation.
func (*Hasher) SetHeaderInt64 ¶
SetHeaderInt64 sets the metadata preamble to the little endian binary representation of int64 argument for the current hash operation.
type Pool ¶
type Pool struct { *Conf // configuration // contains filtered or unexported fields }
Pool provides a pool of trees used as resources by the BMT Hasher. A tree popped from the pool is guaranteed to have a clean state ready for hashing a new chunk.
func NewPool ¶
NewPool creates a tree pool with hasher, segment size, segment count and capacity it reuses free trees or creates a new one if capacity is not reached.
Directories ¶
Path | Synopsis |
---|---|
Package reference is a simple nonconcurrent reference implementation for hashsize segment based Binary Merkle tree hash on arbitrary but fixed maximum chunksize n where 0 <= n <= 4096 This implementation does not take advantage of any paralellisms and uses far more memory than necessary, but it is easy to see that it is correct.
|
Package reference is a simple nonconcurrent reference implementation for hashsize segment based Binary Merkle tree hash on arbitrary but fixed maximum chunksize n where 0 <= n <= 4096 This implementation does not take advantage of any paralellisms and uses far more memory than necessary, but it is easy to see that it is correct. |