chunker

package
v0.15.5
Published: Aug 23, 2024 License: Apache-2.0, MIT Imports: 21 Imported by: 1

Documentation

Overview

Package chunker provides functionality for chunking ad entries generated from provider.MultihashIterator into an IPLD DAG. Given a multihash iterator, an EntriesChunker drains it, restructures the multihashes into an IPLD DAG and returns the root link to that DAG. Two DAG data structures are currently implemented: ChainChunker and HamtChunker. Additionally, CachedEntriesChunker can wrap either of the chunkers and provide LRU caching for the generated DAGs.

See: CachedEntriesChunker, ChainChunker, HamtChunker

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CachedEntriesChunker

type CachedEntriesChunker struct {
	// contains filtered or unexported fields
}

CachedEntriesChunker is an EntriesChunker that caches the generated chunks using an LRU cache. The chunks can be formatted as any DAG with two current implementations: HamtChunker and ChainChunker.

The DAGs are guaranteed to either be fully cached or not at all. If DAGs overlap, the smaller overlapping portion is not evicted unless all the DAGs that link to it are evicted.

The number of DAGs cached will be at most equal to the given capacity. The capacity is immutable. DAGs are evicted as needed if the capacity is reached.

See: NewCachedEntriesChunker.

func NewCachedEntriesChunker

func NewCachedEntriesChunker(ctx context.Context, ds datastore.Batching, capacity int, newChunker NewChunkerFunc, purge bool) (*CachedEntriesChunker, error)

NewCachedEntriesChunker instantiates a new CachedEntriesChunker backed by a given datastore.

The DAGs are generated with the given newChunker and are stored in an LRU cache. Once stored, the individual DAGs that make up the entries chain are retrievable in their raw binary form via CachedEntriesChunker.GetRawCachedChunk.

The shape of the DAGs is dictated by the underlying chunking logic, which is instantiated once via the newChunker function. See: NewHamtChunkerFunc, NewChainChunkerFunc.

The growth of the LRU cache is limited by the given capacity. The capacity specifies the number of complete DAGs that are cached, not the individual chunks within each chain. The actual storage consumed by the cache depends on: 1) the DAG shape determined by the underlying chunker, 2) the multihash length and 3) the capacity. For example, a fully populated cache with a chunk size of 16384, 128-bit (16-byte) multihashes and a capacity of 1024 will consume 256 MiB of space, i.e. 16384 * 1024 * 16 bytes.
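The arithmetic behind that figure can be checked with a small stdlib-only sketch. The cacheBytes helper is hypothetical and not part of this package; it assumes, as the example in the text does, one full chunk per cached chain, and it ignores IPLD encoding overhead and the caching metadata.

```go
package main

import "fmt"

// cacheBytes is a hypothetical helper that estimates the multihash payload
// held by a fully populated cache: capacity chains, each holding chunkSize
// multihashes of mhBits bits. Encoding overhead and metadata are ignored.
func cacheBytes(chunkSize, capacity, mhBits int) int64 {
	return int64(chunkSize) * int64(capacity) * int64(mhBits/8)
}

func main() {
	b := cacheBytes(16384, 1024, 128)
	fmt.Printf("%d bytes = %d MiB\n", b, b>>20) // 268435456 bytes = 256 MiB
}
```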

This implementation guarantees that for any given chain of entries, either the entire chain is cached, or it is not cached at all. When chains overlap, the overlapping portion of the chain is not evicted until the larger chain is evicted.
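One way to picture the all-or-nothing guarantee with overlapping chains is an LRU of whole chains combined with a per-chunk reference count. The sketch below is an in-memory illustration under that assumption, not the package's implementation; chainCache, chain, put, evictOldest and hasChunk are all hypothetical names.

```go
package main

import (
	"container/list"
	"fmt"
)

// chain is one cached unit: a root identifier plus the chunks it consists of.
type chain struct {
	root   string
	chunks []string
}

// chainCache is a hypothetical stand-in for the caching behaviour described
// above: chains are evicted as a whole, shared chunks outlive single chains.
type chainCache struct {
	capacity int
	order    *list.List               // front = most recently used chain
	chains   map[string]*list.Element // root -> list element holding *chain
	refs     map[string]int           // chunk ID -> number of chains using it
}

func newChainCache(capacity int) *chainCache {
	if capacity < 1 {
		capacity = 1 // the sketch assumes room for at least one chain
	}
	return &chainCache{
		capacity: capacity,
		order:    list.New(),
		chains:   make(map[string]*list.Element),
		refs:     make(map[string]int),
	}
}

// put caches a whole chain at once, evicting least recently used chains
// first if the capacity has been reached.
func (c *chainCache) put(root string, chunks []string) {
	if el, ok := c.chains[root]; ok {
		c.order.MoveToFront(el)
		return
	}
	for c.order.Len() >= c.capacity {
		c.evictOldest()
	}
	for _, id := range chunks {
		c.refs[id]++
	}
	c.chains[root] = c.order.PushFront(&chain{root: root, chunks: chunks})
}

// evictOldest removes one whole chain. A chunk is dropped only when no
// remaining chain references it.
func (c *chainCache) evictOldest() {
	el := c.order.Back()
	if el == nil {
		return
	}
	ch := c.order.Remove(el).(*chain)
	delete(c.chains, ch.root)
	for _, id := range ch.chunks {
		c.refs[id]--
		if c.refs[id] == 0 {
			delete(c.refs, id)
		}
	}
}

// hasChunk reports whether any cached chain still references the chunk.
func (c *chainCache) hasChunk(id string) bool { return c.refs[id] > 0 }

func main() {
	c := newChainCache(2)
	c.put("A", []string{"a1", "shared"})
	c.put("B", []string{"b1", "shared"})
	c.put("C", []string{"c1"}) // evicts chain A as a whole
	fmt.Println(c.hasChunk("a1"), c.hasChunk("shared")) // false true
}
```

In this sketch the chunk named "shared" survives the eviction of chain A because chain B still links to it, and it disappears only once B is evicted as well.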

Unless purge is set to true, upon instantiation the chunker restores its state from the datastore and prunes the datastore as needed. For example, if the given capacity is smaller than the number of chains present in the datastore, it evicts chains, in no particular order, until the capacity is respected.

The purge flag specifies whether any existing cache should be cleared on startup. If set, any existing cached chunks will be deleted from the datastore. Otherwise, the previously cached entries are restored.

Note that caching metadata of negligible size is persisted in addition to the chunks. The caching metadata is checked during restore to determine the root of cached chains and the number of overlapping chunks.

The context is only used to cancel a call to this function while it is accessing the datastore.

See: CachedEntriesChunker.Chunk, CachedEntriesChunker.GetRawCachedChunk.

func (*CachedEntriesChunker) Cap

func (ls *CachedEntriesChunker) Cap() int

Cap returns the maximum number of chained entries chunks this cache stores.

Note, the maximum number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

func (*CachedEntriesChunker) Chunk

func (ls *CachedEntriesChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)

Chunk chunks the multihashes supplied by the given mhi into a DAG and returns the link to root.

func (*CachedEntriesChunker) Clear

func (ls *CachedEntriesChunker) Clear(ctx context.Context) error

Clear purges all stored items from the CachedEntriesChunker.

func (*CachedEntriesChunker) Close

func (ls *CachedEntriesChunker) Close() error

Close syncs the backing datastore but does not close it. This is because cached entries chunker wraps an existing datastore and does not construct it, and the wrapped datastore may be in use elsewhere.

func (*CachedEntriesChunker) GetRawCachedChunk

func (ls *CachedEntriesChunker) GetRawCachedChunk(ctx context.Context, l ipld.Link) ([]byte, error)

GetRawCachedChunk gets the raw cached entry chunk for the given link, or nil if no such caching exists.

func (*CachedEntriesChunker) Len

func (ls *CachedEntriesChunker) Len() int

Len returns the number of chained entries chunks that are currently stored in cache.

Note, the number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

type ChainChunker

type ChainChunker struct {
	// contains filtered or unexported fields
}

ChainChunker chunks advertisement entries as a chained series of schema.EntryChunk nodes. See: NewChainChunker

func NewChainChunker

func NewChainChunker(ls *ipld.LinkSystem, chunkSize int) (*ChainChunker, error)

NewChainChunker instantiates a new chain chunker. Given a provider.MultihashIterator, the chunker drains all of its multihashes and stores them in the given link system as a chain of schema.EntryChunk nodes, where each chunk contains no more than chunkSize multihashes.

See: schema.EntryChunk.

func (*ChainChunker) Chunk

func (ls *ChainChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)

Chunk chunks all the multihashes returned by the given iterator into a chain of schema.EntryChunk nodes, where each chunk contains no more than chunkSize multihashes, and returns the link to the root chunk node.

See: schema.EntryChunk.
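The chunk-and-link idea can be sketched without any IPLD machinery. In the sketch below, entryChunk is a simplified stand-in for schema.EntryChunk and chunkChain is a hypothetical helper. The link direction (each new chunk points at the previously built one) and the rejection of chunk sizes below 1 are assumptions of the sketch rather than documented behaviour.

```go
package main

import "fmt"

// entryChunk is a simplified stand-in for schema.EntryChunk: a batch of
// entries plus an optional link to the next chunk in the chain.
type entryChunk struct {
	entries []string
	next    *entryChunk
}

// chunkChain groups entries into chunks of at most chunkSize entries and
// links them together. Each new chunk points at the previously built one,
// so the returned root is the chunk that was built last. An empty input
// yields a nil root and no error.
func chunkChain(entries []string, chunkSize int) (*entryChunk, error) {
	if chunkSize < 1 {
		return nil, fmt.Errorf("chunk size must be at least 1, got %d", chunkSize)
	}
	var root *entryChunk
	for start := 0; start < len(entries); start += chunkSize {
		end := start + chunkSize
		if end > len(entries) {
			end = len(entries)
		}
		root = &entryChunk{entries: entries[start:end], next: root}
	}
	return root, nil
}

func main() {
	root, err := chunkChain([]string{"a", "b", "c", "d", "e"}, 2)
	if err != nil {
		panic(err)
	}
	// Walk the chain from the root: prints [e], then [c d], then [a b].
	for c := root; c != nil; c = c.next {
		fmt.Println(c.entries)
	}
}
```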

type EntriesChunker

type EntriesChunker interface {
	// Chunk chunks multihashes supplied by a given provider.MultihashIterator into a chain of
	// schema.EntryChunk and returns the link of the chain root.
	// If the given iterator has no elements, this function returns a nil link with no error.
	Chunk(context.Context, provider.MultihashIterator) (ipld.Link, error)
}

EntriesChunker chunks multihashes supplied by a given provider.MultihashIterator into a chain of schema.EntryChunk.

type HamtChunker

type HamtChunker struct {
	// contains filtered or unexported fields
}

HamtChunker chunks advertisement entries as an IPLD HAMT data structure. See: NewHamtChunker.

func NewHamtChunker

func NewHamtChunker(ls *ipld.LinkSystem, hashAlg multicodec.Code, bitWidth, bucketSize int) (*HamtChunker, error)

NewHamtChunker instantiates a new HAMT chunker. Given a provider.MultihashIterator, the chunker drains all of its multihashes and stores them in the given link system as an IPLD HAMT ADL.

Only multicodec.Identity, multicodec.Sha2_256 and multicodec.Murmur3X64_64 are supported as hash algorithm. The bit-width and bucket size must be at least 3 and 1 respectively.
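The constraints above amount to a small validation step. The sketch below restates them with a hypothetical validateHamtParams helper; it identifies hash algorithms by their multicodec names instead of multicodec.Code values so that it stays stdlib-only, and its error messages are illustrative rather than the package's own.

```go
package main

import (
	"errors"
	"fmt"
)

// validateHamtParams is a hypothetical helper mirroring the documented
// constraints: a supported hash algorithm, a bit-width of at least 3 and a
// bucket size of at least 1.
func validateHamtParams(hashAlg string, bitWidth, bucketSize int) error {
	switch hashAlg {
	case "identity", "sha2-256", "murmur3-x64-64":
		// supported
	default:
		return fmt.Errorf("unsupported hash algorithm: %s", hashAlg)
	}
	if bitWidth < 3 {
		return errors.New("bit-width must be at least 3")
	}
	if bucketSize < 1 {
		return errors.New("bucket size must be at least 1")
	}
	return nil
}

func main() {
	fmt.Println(validateHamtParams("murmur3-x64-64", 3, 1)) // <nil>
	fmt.Println(validateHamtParams("sha2-512", 3, 1))
	fmt.Println(validateHamtParams("sha2-256", 2, 1))
}
```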

See:

func (*HamtChunker) Chunk

func (h *HamtChunker) Chunk(ctx context.Context, iterator provider.MultihashIterator) (ipld.Link, error)

Chunk drains all the multihashes in the given iterator, stores them as an IPLD HAMT ADL and returns the link to the root HAMT node.

The HAMT is used as a set where the keys in the map represent the multihashes and values are simply set to true.
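The set semantics can be illustrated with a plain Go map in which every value is true. This is only a flat, in-memory analogy for the key and value layout; it says nothing about how the HAMT nodes themselves are structured, and multihashSet is a hypothetical type.

```go
package main

import "fmt"

// multihashSet models the "key -> true" layout with an ordinary map keyed
// by the raw multihash bytes.
type multihashSet map[string]bool

// add records the multihash; adding the same multihash twice has no effect.
func (s multihashSet) add(mh []byte) { s[string(mh)] = true }

// has reports whether the multihash is in the set.
func (s multihashSet) has(mh []byte) bool { return s[string(mh)] }

func main() {
	s := multihashSet{}
	s.add([]byte{0x12, 0x20, 0x01})
	s.add([]byte{0x12, 0x20, 0x02})
	s.add([]byte{0x12, 0x20, 0x01}) // duplicate, set size is unchanged

	fmt.Println(len(s))                          // 2
	fmt.Println(s.has([]byte{0x12, 0x20, 0x02})) // true
	fmt.Println(s.has([]byte{0x12, 0x20, 0x03})) // false
}
```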

type NewChunkerFunc

type NewChunkerFunc func(ls *ipld.LinkSystem) (EntriesChunker, error)

NewChunkerFunc instantiates the core EntriesChunker to use for generating advertisement entries DAG.

func NewChainChunkerFunc

func NewChainChunkerFunc(chunkSize int) NewChunkerFunc

func NewHamtChunkerFunc

func NewHamtChunkerFunc(hashAlg multicodec.Code, bitWidth, bucketSize int) NewChunkerFunc
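NewChainChunkerFunc and NewHamtChunkerFunc carry no description of their own. Judging from their signatures, they plausibly return a closure that captures the chunker parameters and defers construction until a link system is available. The sketch below shows that factory pattern with hypothetical local stand-ins (linkSystem, entriesChunker, chainChunker); it is an assumption about the shape of the pattern, not the package's code.

```go
package main

import (
	"errors"
	"fmt"
)

// linkSystem and entriesChunker are hypothetical stand-ins for
// *ipld.LinkSystem and EntriesChunker.
type linkSystem struct{ name string }

type entriesChunker interface{ describe() string }

type chainChunker struct {
	ls        *linkSystem
	chunkSize int
}

func (c *chainChunker) describe() string {
	return fmt.Sprintf("chain chunker on %s, chunkSize=%d", c.ls.name, c.chunkSize)
}

func newChainChunker(ls *linkSystem, chunkSize int) (*chainChunker, error) {
	if chunkSize < 1 {
		return nil, errors.New("chunk size must be at least 1")
	}
	return &chainChunker{ls: ls, chunkSize: chunkSize}, nil
}

// newChunkerFunc mirrors the shape of NewChunkerFunc.
type newChunkerFunc func(ls *linkSystem) (entriesChunker, error)

// newChainChunkerFunc captures chunkSize now; the chunker itself is built
// later, once the caller supplies a link system.
func newChainChunkerFunc(chunkSize int) newChunkerFunc {
	return func(ls *linkSystem) (entriesChunker, error) {
		c, err := newChainChunker(ls, chunkSize)
		if err != nil {
			return nil, err
		}
		return c, nil
	}
}

func main() {
	factory := newChainChunkerFunc(16384)
	c, err := factory(&linkSystem{name: "mem"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.describe()) // chain chunker on mem, chunkSize=16384
}
```

A factory of this kind lets a caller such as NewCachedEntriesChunker choose the link system itself while the chunker parameters are fixed up front.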
