README ¶
go-muhash
Warning: This is pre-alpha software. The code has not been audited and/or reviewed by anyone other than the author.
go-muhash implements a rolling hash function using a multiplicative hash.
it is based on a multiplicative group over Z=2^3072-1103717 using multiplication and division for adding/removing
elements from the hash function.
the current code is heavily based on: https://github.com/bitcoin/bitcoin/blob/a1fcceac69097a8e6540a6fd8121a5d53022528f/src/crypto/muhash.cpp
(written by Pieter Wuille, MIT licensed)
But uses BLAKE2B as the hash function and Go's standard library big.Int for fast modular inversions(GCD)
MuHash
is the public interface implementing Add/Remove elements functions, and a Finalize function to return a
final hash.
uint3072.go
is a go implementation of the multiplicative group
num3072.c/h
is a C implementation of the multiplicative group
num3072.go
is go bindings for the C imlementation
Ideally we will add Go Assembly implementations using SSE2/SSE4.1/AVX and will choose the correct one in runtime, this should also remove the cgo overhead.
Tests
./build_and_test.sh
will run all the tests and checks in this library.
./fuzz.sh
will run the fuzzer and put new corpus in the corpus
directory. by default, it will use go-fuzz
But if you run with LIBFUZZER=1 ./fuzz.sh
it will run it with libfuzzer
All the current corpus are checked in the unit test in fuzz_corpuses_test.go
(requires -tags=gofuzz
)
Documentation ¶
Overview ¶
Package muhash provides an implementation of a Multiplicative Hash, a cryptographic data structure that allows you to have a rolling hash function that you can add and remove elements from, without the need to re-serialize and re-hash the whole data set.
Index ¶
Constants ¶
const ( // HashSize of array used to store hashes. See Hash. HashSize = 32 // SerializedMuHashSize defines the length in bytes of SerializedMuHash SerializedMuHashSize = elementByteSize )
Variables ¶
var ( // EmptyMuHashHash is the hash of `NewMuHash().Finalize()` EmptyMuHashHash = Hash{0x54, 0x4e, 0xb3, 0x14, 0x2c, 0x0, 0xf, 0xa, 0xd2, 0xc7, 0x6a, 0xc4, 0x1f, 0x42, 0x22, 0xab, 0xba, 0xba, 0xbe, 0xd8, 0x30, 0xee, 0xaf, 0xee, 0x4b, 0x6d, 0xc5, 0x6b, 0x52, 0xd5, 0xca, 0xc0} )
Functions ¶
This section is empty.
Types ¶
type Hash ¶
Hash is a type encapsulating the result of hashing some unknown sized data. it typically represents Blake2b.
func (*Hash) AsArray ¶
AsArray is a helper function to returns a pointer to the underlying byte array.
type MuHash ¶
type MuHash struct {
// contains filtered or unexported fields
}
MuHash is a type used to create a Multiplicative Hash which is a rolling(homomorphic) hash that you can add and remove elements from and receive the same resulting hash as-if you never hashed them. Because of that the order of adding and removing elements doesn't matter. Use NewMuHash to initialize a MuHash, or DeserializeMuHash to parse a MuHash.
func DeserializeMuHash ¶
func DeserializeMuHash(serialized *SerializedMuHash) (*MuHash, error)
DeserializeMuHash will deserialize the MuHash that `Serialize()` serialized.
func NewMuHash ¶
func NewMuHash() *MuHash
NewMuHash return an empty initialized set. when finalized it should be equal to a finalized set with all elements removed.
func (*MuHash) Add ¶
Add hashes the data and adds it to the muhash. Supports arbitrary length data (subject to the underlying hash function(Blake2b) limits)
func (*MuHash) Combine ¶
Combine will add the MuHash together. Equivalent to manually adding all the data elements from one set to the other.
func (*MuHash) Finalize ¶
Finalize will return a hash(blake2b) of the multiset. Because the returned value is a hash of a multiset you cannot "Un-Finalize" it. If this is meant for storage then Serialize should be used instead.
func (*MuHash) Remove ¶
Remove hashes the data and removes it from the multiset. Supports arbitrary length data (subject to the underlying hash function(Blake2b) limits)
func (*MuHash) Reset ¶
func (mu *MuHash) Reset()
Reset clears the muhash from all data. Equivalent to creating a new empty set
func (*MuHash) Serialize ¶
func (mu *MuHash) Serialize() *SerializedMuHash
Serialize returns a serialized version of the MuHash. This is the only right way to serialize a multiset for storage. This MuHash is not finalized, this is meant for storage.
type SerializedMuHash ¶
type SerializedMuHash [SerializedMuHashSize]byte
SerializedMuHash is a is a byte array representing the storage representation of a MuHash
func (SerializedMuHash) String ¶
func (serialized SerializedMuHash) String() string
String returns the SerializedMultiSet as the hexadecimal string