ingest

package
v0.7.3
Published: Jun 22, 2023 License: Apache-2.0, MIT Imports: 48 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var Selectors struct {
	// One selects a single node in a DAG.
	One ipld.Node
	// All selects all the nodes in a DAG recursively traversing all the edges with no recursion
	// limit.
	All ipld.Node
	// AdSequence selects a schema.Advertisement sequence with recursive edge exploration of
	// the PreviousID field.
	AdSequence ipld.Node
	// EntriesWithLimit selects schema.EntryChunk nodes, recursively traversing the edge at the
	// Next field with the given recursion limit.
	EntriesWithLimit func(limit selector.RecursionLimit) ipld.Node
}

Selectors captures a collection of IPLD selectors commonly used by the Ingester.

Functions

func Unfreeze added in v0.5.5

func Unfreeze(unfrozen map[peer.ID]cid.Cid, dstore datastore.Datastore) error

Types

type Ingester

type Ingester struct {
	// contains filtered or unexported fields
}

Ingester is a type that uses dagsync for the ingestion protocol.

## Advertisement Ingestion Constraints

1. If an Ad is processed, all older ads referenced by this ad (towards the start of the ad chain) have also been processed. For example, given some chain A <- B <- C, the indexer will never be in a state where it has indexed A and C but not B.

2. An indexer will index an Ad chain, but will not make any guarantees about consistency in the presence of multiple ad chains for a given provider. For example, if a provider publishes two ad chains at the same time, chain1 and chain2, the indexer will apply whichever chain it learns about first, then apply the other chain.

3. An indexer will not index the same Ad twice. An indexer will be resilient to restarts. If the indexer goes down and comes back up it should not break constraint 1.
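
Constraints 1 and 3 can be illustrated with a minimal, dependency-free sketch. It is not the Ingester's implementation: the `ad` type, the `pendingAds` helper, and the in-memory `store` and `processed` maps are hypothetical names introduced only for this example. The sketch walks back from the head of a chain until it reaches an ad that is already processed, refuses to proceed if an ad in between is missing, and returns the remaining ads oldest first.

```go
package main

import "fmt"

// ad is a hypothetical, simplified stand-in for an advertisement. It carries
// only its own ID and the ID of the previous ad ("" at the start of the chain).
type ad struct {
	ID         string
	PreviousID string
}

// pendingAds walks back from head along PreviousID until it reaches an ad that
// is already processed or the start of the chain. It returns the unprocessed
// ads ordered oldest first, or an error if the chain has a gap.
func pendingAds(store map[string]ad, head string, processed map[string]bool) ([]string, error) {
	var newestFirst []string
	for id := head; id != "" && !processed[id]; {
		a, ok := store[id]
		if !ok {
			// Indexing past a missing ad would violate constraint 1.
			return nil, fmt.Errorf("ad %q is missing, refusing to index a chain with a gap", id)
		}
		newestFirst = append(newestFirst, id)
		id = a.PreviousID
	}
	// Reverse so that older ads are applied before the ads that reference them.
	oldestFirst := make([]string, 0, len(newestFirst))
	for i := len(newestFirst) - 1; i >= 0; i-- {
		oldestFirst = append(oldestFirst, newestFirst[i])
	}
	return oldestFirst, nil
}

func main() {
	store := map[string]ad{
		"A": {ID: "A"},
		"B": {ID: "B", PreviousID: "A"},
		"C": {ID: "C", PreviousID: "B"},
	}
	processed := map[string]bool{"A": true} // A was indexed before a restart

	ids, err := pendingAds(store, "C", processed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, id := range ids {
		processed[id] = true // mark only after the ad is fully indexed
		fmt.Println("indexed", id)
	}
}
```

Because an ad is marked as processed only after it has been applied, and older ads are always applied first, repeating the walk after a restart resumes from the first unprocessed ad without skipping or repeating any.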

func NewIngester

func NewIngester(cfg config.Ingest, h host.Host, idxr indexer.Interface, reg *registry.Registry, ds, dsTmp datastore.Batching, options ...Option) (*Ingester, error)

NewIngester creates a new Ingester that uses a dagsync Subscriber to handle communication with providers.

func (*Ingester) Announce

func (ing *Ingester) Announce(ctx context.Context, nextCid cid.Cid, pubAddrInfo peer.AddrInfo) error

Announce sends an announce message directly to dagsync, instead of through pubsub.

func (*Ingester) Close

func (ing *Ingester) Close() error

Close stops the ingester and all its workers.

func (*Ingester) GetLatestSync

func (ing *Ingester) GetLatestSync(publisherID peer.ID) (cid.Cid, error)

GetLatestSync gets the latest CID synced for the peer.

func (*Ingester) MultihashesFromMirror added in v0.5.14

func (ing *Ingester) MultihashesFromMirror() uint64

func (*Ingester) RunWorkers

func (ing *Ingester) RunWorkers(n int)

func (*Ingester) Sync

func (ing *Ingester) Sync(ctx context.Context, peerInfo peer.AddrInfo, depth int, resync bool) (cid.Cid, error)

Sync syncs advertisements, up to the latest advertisement, from a publisher. Sync returns the final CID that was synced by the call to Sync.

Sync works by first fetching each advertisement from the specified peer starting at the most recent and traversing to the advertisement last seen by the indexer, or until the advertisement depth limit is reached. Then the entries in each advertisement are synced and the multihashes in each entry are indexed.

The selector used to sync the advertisement is controlled by the following parameters: depth and resync.

The depth argument specifies the recursion depth limit to use during sync. Its value may be less than -1 for no limit, 0 to use the indexer's configured value, or greater than 1 for an explicit limit.

The resync argument specifies whether to ignore the latest known advertisement that is already synced instead of stopping the traversal there. If set to true, the traversal will continue past it until either there are no more advertisements left or the recursion depth limit is reached.

The reference to the latest synced advertisement returned by GetLatestSync is only updated if the given depth is zero and resync is set to false. Otherwise, a custom selector with the given depth limit and stop link is constructed and used for traversal. See dagsync.Subscriber.Sync.
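
The interaction of depth and resync described above can be modeled with a small sketch. It is an illustration under stated assumptions, not the selector logic used by dagsync: the helpers `resolveDepth`, `adsToSync`, `updatesLatest`, and `summarize` are hypothetical, the chain is a plain map from an ad ID to its predecessor, and the sketch assumes that any negative depth means no limit and any positive depth is an explicit limit.

```go
package main

import "fmt"

// resolveDepth turns the depth argument into an effective limit.
// Assumption for this sketch: negative means no limit (returned as -1),
// 0 means the configured default (assumed positive), positive is explicit.
func resolveDepth(depth, configured int) int {
	switch {
	case depth < 0:
		return -1
	case depth == 0:
		return configured
	default:
		return depth
	}
}

// adsToSync walks from head toward the start of the chain, where prev maps an
// ad ID to its predecessor. It stops at lastSeen unless resync is true, and
// stops once limit ads were collected unless limit is negative.
func adsToSync(prev map[string]string, head, lastSeen string, limit int, resync bool) []string {
	var ids []string
	for id := head; id != ""; id = prev[id] {
		if limit >= 0 && len(ids) >= limit {
			break
		}
		if !resync && id == lastSeen {
			break
		}
		ids = append(ids, id)
	}
	return ids
}

// updatesLatest mirrors the documented rule: the latest-sync reference is
// only updated when depth is zero and resync is false.
func updatesLatest(depth int, resync bool) bool {
	return depth == 0 && !resync
}

// summarize formats the outcome of one hypothetical sync call.
func summarize(prev map[string]string, head, lastSeen string, depth, configured int, resync bool) string {
	ids := adsToSync(prev, head, lastSeen, resolveDepth(depth, configured), resync)
	return fmt.Sprintf("ads=%v updatesLatest=%t", ids, updatesLatest(depth, resync))
}

func main() {
	// Chain A <- B <- C <- D, where B is the last ad seen by the indexer.
	prev := map[string]string{"D": "C", "C": "B", "B": "A"}
	fmt.Println(summarize(prev, "D", "B", 0, 10, false)) // default depth, stop at B
	fmt.Println(summarize(prev, "D", "B", -1, 10, true)) // no limit, resync
	fmt.Println(summarize(prev, "D", "B", 3, 10, true))  // explicit limit, resync
}
```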

The Context argument controls the lifetime of the sync. Canceling it cancels the sync and causes the multihash channel to close without any data.
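
A caller might bound the lifetime of a sync with a timeout, as in the following sketch. To stay dependency-free it does not call Ingester.Sync: `syncFunc`, `syncWithTimeout`, `blockingSync`, `timedOut`, and `demo` are hypothetical names, and the simulated sync returns a string in place of a cid.Cid.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// syncFunc is a hypothetical stand-in for a call such as Ingester.Sync. It
// returns the final synced CID as a string to keep the sketch self-contained.
type syncFunc func(ctx context.Context) (string, error)

// syncWithTimeout bounds the lifetime of one sync by deriving a context that
// is canceled when the timeout elapses or when the parent is canceled.
func syncWithTimeout(parent context.Context, timeout time.Duration, run syncFunc) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel() // release the timer even when the sync finishes early
	c, err := run(ctx)
	if err != nil {
		return "", fmt.Errorf("sync did not complete: %w", err)
	}
	return c, nil
}

// timedOut reports whether err was caused by the deadline expiring.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// blockingSync simulates a sync that never finishes on its own and returns
// only once its context is canceled.
func blockingSync(ctx context.Context) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

// demo runs the simulated sync with a short timeout.
func demo() (string, error) {
	return syncWithTimeout(context.Background(), 20*time.Millisecond, blockingSync)
}

func main() {
	_, err := demo()
	fmt.Println("error:", err)
	fmt.Println("timed out:", timedOut(err))
}
```

Wrapping the error with `%w` keeps the cause inspectable, so the caller can distinguish a timeout from other sync failures with `errors.Is`.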

type Option added in v0.5.6

type Option func(*configIngest) error

Option is a function that sets a value in a config.
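
The shape of Option follows the functional options pattern. The sketch below shows how such options are typically defined and applied. It does not reproduce this package's internals: `settings`, `option`, `withWorkers`, and `newSettings` are hypothetical names, with `settings` standing in for the unexported configIngest.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the package's unexported config.
type settings struct {
	workers int
}

// option mirrors the shape of Option: a function that sets a value in the
// config and may reject an invalid value.
type option func(*settings) error

// withWorkers is a hypothetical option used only for illustration.
func withWorkers(n int) option {
	return func(s *settings) error {
		if n < 1 {
			return fmt.Errorf("workers must be at least 1, got %d", n)
		}
		s.workers = n
		return nil
	}
}

// newSettings starts from defaults and applies each option in order,
// stopping at the first error.
func newSettings(opts ...option) (settings, error) {
	s := settings{workers: 1}
	for _, opt := range opts {
		if err := opt(&s); err != nil {
			return settings{}, err
		}
	}
	return s, nil
}

func main() {
	s, err := newSettings(withWorkers(4))
	fmt.Println(s.workers, err)

	_, err = newSettings(withWorkers(0))
	fmt.Println(err)
}
```

A constructor that accepts a variadic list of options, as NewIngester does, can gain new settings later without changing its signature.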

func WithIndexCounts added in v0.5.6

func WithIndexCounts(ic *counter.IndexCounts) Option

WithIndexCounts configures counting indexes using an IndexCounts instance.
