gcs

package module
v2.1.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 10, 2023 License: ISC Imports: 12 Imported by: 15

README

gcs

Build Status ISC License Doc

Package gcs provides an API for building and using Golomb-coded sets.

A Golomb-Coded Set (GCS) is a space-efficient probabilistic data structure that is used to test set membership with a tunable false positive rate while simultaneously preventing false negatives. In other words, items that are in the set will always match, but items that are not in the set will also sometimes match with the chosen false positive rate.

This package currently implements two different versions for backwards compatibility. Version 1 is deprecated and therefore should no longer be used.

Version 2 is the GCS variation that follows the specification details in DCP0005.

Version 2 sets do not permit empty items (data of zero length) to be added and are parameterized by the following:

  • A parameter B that defines the remainder code bit size
  • A parameter M that defines the false positive rate as 1/M
  • A key for the SipHash-2-4 function
  • The items to include in the set

A comprehensive suite of tests is provided to ensure proper functionality.

GCS use in Decred

GCS is used as a mechanism for storing, transmitting, and committing to per-block filters. Consensus-validating full nodes commit to a single filter for every block and serve the filter to SPV clients that match against the filter locally to determine if the block is potentially relevant. The required parameters for Decred are defined by the blockcf2 package.

For more details, see the Block Filters section of DCP0005.

Installation and Updating

$ go get -u github.com/decred/dcrd/gcs

License

Package blockchain is licensed under the copyfree ISC License.

Documentation

Overview

Package gcs provides an API for building and using a Golomb-coded set filter.

A Golomb-Coded Set (GCS) is a space-efficient probabilistic data structure that is used to test set membership with a tunable false positive rate while simultaneously preventing false negatives. In other words, items that are in the set will always match, but items that are not in the set will also sometimes match with the chosen false positive rate.

This package currently implements two different versions for backwards compatibility. Version 1 is deprecated and therefore should no longer be used.

Version 2 is the GCS variation that follows the specification details in DCP0005: https://github.com/decred/dcps/blob/master/dcp-0005/dcp-0005.mediawiki#golomb-coded-sets.

Version 2 sets do not permit empty items (data of zero length) to be added and are parameterized by the following:

* A parameter `B` that defines the remainder code bit size * A parameter `M` that defines the false positive rate as `1/M` * A key for the SipHash-2-4 function * The items to include in the set

Errors

Errors returned by this package are of type gcs.Error. This allows the caller to programmatically determine the specific error by examining the ErrorCode field of the type asserted gcs.Error while still providing rich error messages with contextual information. A convenience function named IsErrorCode is also provided to allow callers to easily check for a specific error code. See ErrorCode in the package documentation for a full list.

GCS use in Decred

GCS is used as a mechanism for storing, transmitting, and committing to per-block filters. Consensus-validating full nodes commit to a single filter for every block and serve the filter to SPV clients that match against the filter locally to determine if the block is potentially relevant. The required parameters for Decred are defined by the blockcf2 package.

For more details, see the the Block Filters section of DCP0005: https://github.com/decred/dcps/blob/master/dcp-0005/dcp-0005.mediawiki#block-filters

Index

Constants

View Source
const KeySize = 16

KeySize is the size of the byte array required for key material for the SipHash keyed hash function.

Variables

This section is empty.

Functions

func IsErrorCode

func IsErrorCode(err error, c ErrorCode) bool

IsErrorCode returns whether err is an Error with a matching error code.

func MakeHeaderForFilter

func MakeHeaderForFilter(filter *FilterV1, prevHeader *chainhash.Hash) chainhash.Hash

MakeHeaderForFilter makes a filter chain header for a filter, given the filter and the previous filter chain header.

Types

type Error

type Error struct {
	ErrorCode   ErrorCode // Describes the kind of error
	Description string    // Human readable description of the issue
}

Error identifies a filter-related error. The caller can use type assertions to access the ErrorCode field to ascertain the specific reason for the failure.

func (Error) Error

func (e Error) Error() string

Error satisfies the error interface and prints human-readable errors.

type ErrorCode

type ErrorCode int

ErrorCode identifies a kind of error.

const (
	// ErrNTooBig signifies that the filter can't handle N items.
	ErrNTooBig ErrorCode = iota

	// ErrPTooBig signifies that the filter can't handle `1/2**P`
	// collision probability.
	ErrPTooBig

	// ErrBTooBig signifies that the provided Golomb coding bin size is larger
	// than the maximum allowed value.
	ErrBTooBig

	// ErrMisserialized signifies a filter was misserialized and is missing the
	// N and/or P parameters of a serialized filter.
	ErrMisserialized
)

These constants are used to identify a specific RuleError.

func (ErrorCode) String

func (e ErrorCode) String() string

String returns the ErrorCode as a human-readable name.

type FilterV1

type FilterV1 struct {
	// contains filtered or unexported fields
}

FilterV1 describes an immutable filter that can be built from a set of data elements, serialized, deserialized, and queried in a thread-safe manner. The serialized form is compressed as a Golomb Coded Set (GCS) along with the number of members of the set. The hash function used is SipHash, a keyed function. The key used in building the filter is required in order to match filter values and is not included in the serialized form.

func FromBytesV1

func FromBytesV1(P uint8, d []byte) (*FilterV1, error)

FromBytesV1 deserializes a version 1 GCS filter from a known P and serialized filter as returned by Bytes().

func NewFilterV1

func NewFilterV1(P uint8, key [KeySize]byte, data [][]byte) (*FilterV1, error)

NewFilterV1 builds a new version 1 GCS filter with a collision probability of 1 / 2^P for the given key and data.

func (*FilterV1) Bytes

func (f *FilterV1) Bytes() []byte

Bytes returns the serialized format of the GCS filter which includes N, but does not include other parameters such as the false positive rate or the key.

func (*FilterV1) Hash

func (f *FilterV1) Hash() chainhash.Hash

Hash returns the BLAKE256 hash of the filter.

func (*FilterV1) Match

func (f *FilterV1) Match(key [KeySize]byte, data []byte) bool

Match checks whether a []byte value is likely (within collision probability) to be a member of the set represented by the filter.

func (*FilterV1) MatchAny

func (f *FilterV1) MatchAny(key [KeySize]byte, data [][]byte) bool

MatchAny checks whether any []byte value is likely (within collision probability) to be a member of the set represented by the filter faster than calling Match() for each value individually.

func (*FilterV1) N

func (f *FilterV1) N() uint32

N returns the size of the data set used to build the filter.

func (*FilterV1) P

func (f *FilterV1) P() uint8

P returns the filter's collision probability as a negative power of 2. For example, a collision probability of 1 / 2^20 is represented as 20.

type FilterV2

type FilterV2 struct {
	// contains filtered or unexported fields
}

FilterV2 describes an immutable filter that can be built from a set of data elements, serialized, deserialized, and queried in a thread-safe manner. The serialized form is compressed as a Golomb Coded Set (GCS) along with the number of members of the set. The hash function used is SipHash, a keyed function. The key used in building the filter is required in order to match filter values and is not included in the serialized form.

Version 2 filters differ from version 1 filters in four ways:

  1. Support for independently specifying the false positive rate and Golomb coding bin size which allows minimizing the filter size
  2. A faster (incompatible with version 1) reduction function
  3. A more compact serialization for the number of members in the set
  4. Deduplication of all hash collisions prior to reducing and serializing the deltas

func FromBytesV2

func FromBytesV2(B uint8, M uint64, d []byte) (*FilterV2, error)

FromBytesV2 deserializes a version 2 GCS filter from a known B, M, and serialized filter as returned by Bytes().

func NewFilterV2

func NewFilterV2(B uint8, M uint64, key [KeySize]byte, data [][]byte) (*FilterV2, error)

NewFilterV2 builds a new GCS filter with the provided tunable parameters that contains every item of the passed data as a member of the set.

B is the tunable bits parameter for constructing the filter that is used as the bin size in the underlying Golomb coding with a value of 2^B. The optimal value of B to minimize the size of the filter for a given false positive rate 1/M is floor(log_2(M) - 0.055256). The maximum allowed value for B is 32. An error will be returned for larger values.

M is the inverse of the target false positive rate for the filter. The optimal value of M to minimize the size of the filter for a given B is ceil(1.497137 * 2^B).

key is a key used in the SipHash function used to hash each data element prior to inclusion in the filter. This helps thwart would be attackers attempting to choose elements that intentionally cause false positives.

The general process for determining optimal parameters for B and M to minimize the size of the filter is to start with the desired false positive rate and calculate B per the aforementioned formula accordingly. Then, if the application permits the false positive rate to be varied, calculate the optimal value of M via the formula provided under the description of M.

func (*FilterV2) B

func (f *FilterV2) B() uint8

B returns the tunable bits parameter that was used to construct the filter. It represents the bin size in the underlying Golomb coding with a value of 2^B.

func (*FilterV2) Bytes

func (f *FilterV2) Bytes() []byte

Bytes returns the serialized format of the GCS filter which includes N, but does not include other parameters such as the false positive rate or the key.

func (*FilterV2) Hash

func (f *FilterV2) Hash() chainhash.Hash

Hash returns the BLAKE256 hash of the filter.

func (*FilterV2) Match

func (f *FilterV2) Match(key [KeySize]byte, data []byte) bool

Match checks whether a []byte value is likely (within collision probability) to be a member of the set represented by the filter.

func (*FilterV2) MatchAny

func (f *FilterV2) MatchAny(key [KeySize]byte, data [][]byte) bool

MatchAny checks whether any []byte value is likely (within collision probability) to be a member of the set represented by the filter faster than calling Match() for each value individually.

func (*FilterV2) N

func (f *FilterV2) N() uint32

N returns the size of the data set used to build the filter.

Directories

Path Synopsis
Package blockcf provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.
Package blockcf provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.
Package blockcf2 provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.
Package blockcf2 provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL