ingest

package
v0.8.38 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 23, 2024 License: Apache-2.0, MIT Imports: 44 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Unfreeze added in v0.5.5

func Unfreeze(unfrozen map[peer.ID]cid.Cid, dstore datastore.Datastore) error

Types

type Ingester

type Ingester struct {
	// contains filtered or unexported fields
}

Ingester is a type that uses dagsync for the ingestion protocol.

## Advertisement Ingestion Constraints

1. If an Ad is processed, all older ads referenced by this ad (towards the start of the ad chain), have also been processes. For example, given some chain A <- B <- C, the indexer will never be in the state that it indexed A & C but not B.

2. An indexer will index an Ad chain, but will not make any guarantees about consistency in the presence of multiple ad chains for a given provider. For example if a provider publishes two ad chains at the same time chain1 and chain2 the indexer will apply whichever chain it learns about first, then apply the other chain.

3. An indexer will not index the same Ad twice. An indexer will be resilient to restarts. If the indexer goes down and comes back up it should not break constraint 1.

func NewIngester

func NewIngester(cfg config.Ingest, h host.Host, idxr indexer.Interface, reg *registry.Registry, ds, dsTmp datastore.Batching) (*Ingester, error)

NewIngester creates a new Ingester that uses a dagsync Subscriber to handle communication with providers.

func (*Ingester) Announce

func (ing *Ingester) Announce(ctx context.Context, nextCid cid.Cid, pubAddrInfo peer.AddrInfo) error

Announce sends an announce message to directly to dagsync, instead of through pubsub.

func (*Ingester) Close

func (ing *Ingester) Close() error

Close stops the ingester and all its workers.

func (*Ingester) GetAllIngestRates added in v0.8.0

func (ing *Ingester) GetAllIngestRates() map[string]rate.Rate

func (*Ingester) GetIngestRate added in v0.8.0

func (ing *Ingester) GetIngestRate(peerID peer.ID) (rate.Rate, bool)

func (*Ingester) GetLatestSync

func (ing *Ingester) GetLatestSync(publisherID peer.ID) (cid.Cid, error)

GetLatestSync gets the latest CID synced for the peer. If no error is returned, then returned CID is never cid.Undef.

func (*Ingester) MarkAdProcessed added in v0.8.10

func (ing *Ingester) MarkAdProcessed(publisher peer.ID, adCid cid.Cid) error

MarkAdProcessed explicitly marks an advertisement as processed. This is used to avoid ingesting this and previous advertisements.

func (*Ingester) MultihashesFromMirror added in v0.5.14

func (ing *Ingester) MultihashesFromMirror() uint64

func (*Ingester) RunWorkers

func (ing *Ingester) RunWorkers(n int)

func (*Ingester) Skip500EntriesError added in v0.8.19

func (ing *Ingester) Skip500EntriesError(skip bool)

func (*Ingester) Sync

func (ing *Ingester) Sync(ctx context.Context, peerInfo peer.AddrInfo, depth int, resync bool) (cid.Cid, error)

Sync syncs advertisements, up to the latest advertisement, from a publisher. Sync returns the final CID that was synced by the call to Sync.

Sync works by first fetching each advertisement from the specified peer starting at the most recent and traversing to the advertisement last seen by the indexer, or until the advertisement depth limit is reached. Then the entries in each advertisement are synced and the multihashes in each entry are indexed.

The depth argument specifies the recursion depth limit to use during sync. Its value may less than -1 for no limit, 0 to use the indexer's configured value, or greater than 1 for an explicit limit.

The resync argument specifies whether to stop the traversal at the latest known advertisement that is already synced. If set to true, the traversal will continue until either there are no more advertisements left or the recursion depth limit is reached.

The reference to the latest synced advertisement returned by GetLatestSync is only updated if the given depth is zero and resync is set to false.

The Context argument controls the lifetime of the sync. Canceling it cancels the sync and causes the multihash channel to close without any data.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL