Documentation ¶
Overview ¶
Package deephash hashes a Go value recursively, in a predictable order, without looping. The hash is only valid within the lifetime of a program. Users should not store the hash on disk or send it over the network. The hash is sufficiently strong and unique such that Hash(&x) == Hash(&y) is an appropriate replacement for x == y.
The definition of equality is identical to reflect.DeepEqual except:
- Floating-point values are compared based on the raw bits, which means that NaNs (with the same bit pattern) are treated as equal.
- time.Time are compared based on whether they are the same instant in time and also in the same zone offset. Monotonic measurements and zone names are ignored as part of the hash.
- netip.Addr are compared based on a shallow comparison of the struct.
WARNING: This package, like most of the tailscale.com Go module, should be considered Tailscale-internal; we make no API promises.
Cycle detection ¶
This package correctly handles cycles in the value graph, but in a way that is potentially pathological in some situations.
The algorithm for cycle detection operates by pushing a pointer onto a stack whenever deephash is visiting a pointer and popping the pointer from the stack after deephash is leaving the pointer. Before visiting a new pointer, deephash checks whether it has already been visited on the pointer stack. If so, it hashes the index of the pointer on the stack and avoids visiting the pointer.
This algorithm is guaranteed to detect cycles, but may expand pointers more often than a potential alternate algorithm that remembers all pointers ever visited in a map. The current algorithm uses O(D) memory, where D is the maximum depth of the recursion, while the alternate algorithm would use O(P) memory where P is all pointers ever seen, which can be a lot, and most of which may have nothing to do with cycles. Also, the alternate algorithm has to deal with challenges of producing deterministic results when pointers are visited in non-deterministic ways such as when iterating through a Go map. The stack-based algorithm avoids this challenge since the stack is always deterministic regardless of non-deterministic iteration order of Go maps.
To concretely see how this algorithm can be pathological, consider the following data structure:
var big *Item = ... // some large data structure that is slow to hash var manyBig []*Item for i := range 1000 { manyBig = append(manyBig, &big) } deephash.Hash(manyBig)
Here, the manyBig data structure is not even cyclic. We have the same big *Item being stored multiple times in a []*Item. When deephash hashes []*Item, it hashes each individual *Item not realizing that it had just done the computation earlier. To avoid the pathological situation, Item should implement SelfHasher and memoize attempts to hash itself.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func HasherForType ¶
HasherForType returns a hash that is specialized for the provided type.
HasherForType panics if the opts are invalid for the provided type.
Currently, at most one option can be provided (IncludeFields or ExcludeFields) and its type must match the type of T. Those restrictions may be removed in the future, along with documentation about their precedence when combined.
Types ¶
type Hasher ¶
type Hasher struct {
// contains filtered or unexported fields
}
Hasher is a value passed to [SelfHasher.Hash] that allow implementations to hash themselves in a structured manner.
func (Hasher) HashBytes ¶
HashBytes hashes a sequence of bytes b. The length of b is not explicitly hashed.
func (Hasher) HashString ¶
HashString hashes the string data of s The length of s is not explicitly hashed.
type Option ¶
type Option interface {
// contains filtered or unexported methods
}
Option is an optional argument to HasherForType.
func ExcludeFields ¶
ExcludeFields returns an option that modifies the hashing for T to include all struct fields of T except those provided in fields.
T must be a struct type, and must match the type of the value passed to HasherForType.
func IncludeFields ¶
IncludeFields returns an option that modifies the hashing for T to only include the named struct fields.
T must be a struct type, and must match the type of the value passed to HasherForType.
type SelfHasher ¶
type SelfHasher interface {
Hash(Hasher)
}
SelfHasher is implemented by types that can compute their own hash by writing values through the provided Hasher parameter. Implementations must not leak the provided Hasher.
If the implementation of SelfHasher recursively calls deephash.Hash, then infinite recursion is quite likely to occur. To avoid this, use a type definition to drop methods before calling deephash.Hash:
func (v *MyType) Hash(h deephash.Hasher) { v.hashMu.Lock() defer v.hashMu.Unlock() if v.dirtyHash { type MyTypeWithoutMethods MyType // type define MyType to drop Hash method v.dirtyHash = false // clear out dirty bit to avoid hashing over it v.hashSum = deephash.Sum{} // clear out hashSum to avoid hashing over it v.hashSum = deephash.Hash((*MyTypeWithoutMethods)(v)) } h.HashSum(v.hashSum) }
In the above example, we acquire a lock since it is possible that deephash is called in a concurrent manner, which implies that MyType.Hash may also be called in a concurrent manner. Whether this lock is necessary is application-dependent and left as an exercise to the reader. Also, the example assumes that dirtyHash is set elsewhere by application logic whenever a mutation is made to MyType that would alter the hash.