bloomfilter

package
v1.119.15
Published: Jan 7, 2025 License: AGPL-3.0 Imports: 17 Imported by: 0

Documentation

Overview

Package bloomfilter contains the functions needed to run part of the garbage collection process.

The bloomfilter.Observer implements the ranged loop Observer interface, which lets it subscribe to the loop and receive information about every segment in the metabase db.

The bloomfilter.Observer is subscribed to a ranged loop instance to account for all existing segment pieces on storage nodes and to create "retain requests", each of which contains a bloom filter of all pieces that possibly exist on a storage node. With the ranged loop, segments can be processed in parallel to speed up the process.

After a full ranged loop iteration, the bloomfilter.Observer uploads those requests to the Storj bucket. A separate service from the storj/satellite/gc/sender package then downloads the bloom filters and sends them to the storage nodes.

This bloom filter service should be run only against an immutable database snapshot.

See storj/docs/design/garbage-collection.md for more info.
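The observer lifecycle described above (start, fork a partial per range, process ranges in parallel, join the partials) can be sketched as follows. This is a minimal, self-contained illustration and not the package's code: the `segment`, `partial`, and `observer` types and the `runLoop` driver are hypothetical stand-ins, and a plain set of piece IDs takes the place of a real bloom filter.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a hypothetical stand-in for a ranged loop segment: it only
// records which node holds which piece.
type segment struct {
	pieces map[string]string // node ID -> piece ID
}

// partial collects piece IDs per node for one range of segments.
type partial struct {
	perNode map[string]map[string]bool
}

// process records every piece of every segment in this partial's range.
func (p *partial) process(segments []segment) {
	for _, seg := range segments {
		for node, piece := range seg.pieces {
			if p.perNode[node] == nil {
				p.perNode[node] = map[string]bool{}
			}
			p.perNode[node][piece] = true
		}
	}
}

// observer holds the merged result of all partials.
type observer struct {
	perNode map[string]map[string]bool
}

// start resets the state at the beginning of a loop iteration.
func (o *observer) start() { o.perNode = map[string]map[string]bool{} }

// fork creates an independent partial, so ranges never share state.
func (o *observer) fork() *partial {
	return &partial{perNode: map[string]map[string]bool{}}
}

// join merges one partial into the observer.
func (o *observer) join(p *partial) {
	for node, pieces := range p.perNode {
		if o.perNode[node] == nil {
			o.perNode[node] = map[string]bool{}
		}
		for piece := range pieces {
			o.perNode[node][piece] = true
		}
	}
}

// runLoop processes each range in its own goroutine and joins the
// partials sequentially once all of them are done.
func runLoop(o *observer, ranges [][]segment) {
	o.start()
	partials := make([]*partial, len(ranges))
	var wg sync.WaitGroup
	for i, r := range ranges {
		partials[i] = o.fork()
		wg.Add(1)
		go func(p *partial, r []segment) {
			defer wg.Done()
			p.process(r)
		}(partials[i], r)
	}
	wg.Wait()
	for _, p := range partials {
		o.join(p)
	}
}

func main() {
	ranges := [][]segment{
		{{pieces: map[string]string{"node-a": "p1", "node-b": "p2"}}},
		{{pieces: map[string]string{"node-a": "p3"}}},
	}
	o := &observer{}
	runLoop(o, ranges)
	fmt.Println(len(o.perNode["node-a"]), len(o.perNode["node-b"])) // prints "2 1"
}
```

Because each partial owns its own maps, the parallel phase needs no locking; only the join step touches shared state, and it runs on a single goroutine.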

Constants

const LATEST = "LATEST"

LATEST is the name of the file that contains the most recently completed bloomfilter generation prefix.
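One way to read this constant: a marker object named LATEST is written only after a generation of bloom filters is complete, so a reader that resolves the prefix through it never sees a half-uploaded generation. The sketch below illustrates that idea with an in-memory map. The `bucket` type, the `publish` and `latestPrefix` helpers, and the prefix format are all assumptions made for illustration, not the package's API.

```go
package main

import "fmt"

// latest mirrors the value of the package's LATEST constant.
const latest = "LATEST"

// bucket is a hypothetical in-memory stand-in for an object store bucket.
type bucket map[string][]byte

// publish stores all objects of one generation under prefix and only then
// points the LATEST marker at that prefix.
func publish(b bucket, prefix string, objects map[string][]byte) {
	for name, data := range objects {
		b[prefix+"/"+name] = data
	}
	b[latest] = []byte(prefix)
}

// latestPrefix resolves the most recently completed generation, if any.
func latestPrefix(b bucket) (string, bool) {
	data, ok := b[latest]
	return string(data), ok
}

func main() {
	b := bucket{}
	publish(b, "2025-01-07", map[string][]byte{"0.zip": {1}})
	publish(b, "2025-01-14", map[string][]byte{"0.zip": {2}})
	prefix, _ := latestPrefix(b)
	fmt.Println(prefix) // prints "2025-01-14"
}
```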

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	RunOnce bool `help:"set if garbage collection bloom filter process should only run once then exit" default:"false"`

	UseSyncObserver bool `help:"whether to use test GC SyncObserver with ranged loop" default:"true"`

	// value for InitialPieces currently based on average pieces per node
	InitialPieces        int64       `` /* 139-byte string literal not displayed */
	FalsePositiveRate    float64     `help:"the false positive rate used for creating a garbage collection bloom filter" releaseDefault:"0.1" devDefault:"0.1"`
	MaxBloomFilterSize   memory.Size `help:"maximum size of a single bloom filter" default:"2m"`
	ExcludeExpiredPieces bool        `help:"do not include expired pieces into bloom filter" default:"true"`

	AccessGrant  string        `help:"Access Grant which will be used to upload bloom filters to the bucket" default:""`
	Bucket       string        `help:"Bucket which will be used to upload bloom filters" default:"" testDefault:"gc-queue"` // TODO do we need full location?
	ZipBatchSize int           `help:"how many bloom filters will be packed in a single zip" default:"40" testDefault:"2"`
	ExpireIn     time.Duration `` /* 130-byte string literal not displayed */
}

Config contains configurable values for garbage collection.
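To get a feel for how FalsePositiveRate, the expected piece count, and MaxBloomFilterSize interact, the sketch below applies the textbook bloom filter sizing formula, m = -n·ln(p)/(ln 2)² bits, and then caps the result. This is an illustration under assumptions: the package's actual sizing logic may differ, the helper names are hypothetical, and reading "2m" as 2,000,000 bytes is a guess.

```go
package main

import (
	"fmt"
	"math"
)

// optimalSizeBytes returns the textbook bloom filter size in bytes for n
// expected elements and false positive rate p: m = -n*ln(p)/(ln 2)^2 bits.
func optimalSizeBytes(n int64, p float64) int64 {
	bits := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	return int64(math.Ceil(bits / 8))
}

// cappedSizeBytes applies an upper bound to the optimal size. Once the cap
// is hit, the effective false positive rate rises above p.
func cappedSizeBytes(n int64, p float64, maxBytes int64) int64 {
	size := optimalSizeBytes(n, p)
	if size > maxBytes {
		return maxBytes
	}
	return size
}

func main() {
	const maxSize = 2_000_000 // assumption: "2m" read as 2,000,000 bytes
	const rate = 0.1          // the FalsePositiveRate default shown above
	for _, n := range []int64{100_000, 1_000_000, 10_000_000} {
		fmt.Printf("pieces=%d optimal=%dB capped=%dB\n",
			n, optimalSizeBytes(n, rate), cappedSizeBytes(n, rate, maxSize))
	}
}
```

At a rate of 0.1 the formula needs roughly 4.8 bits per piece, so under these assumptions a node with ten million pieces would exceed the cap and receive a filter with a higher effective false positive rate.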

type Observer added in v1.70.1

type Observer struct {
	// contains filtered or unexported fields
}

Observer implements a rangedloop observer to collect bloom filters for the garbage collection.

architecture: Observer

func NewObserver added in v1.70.1

func NewObserver(log *zap.Logger, config Config, overlay Overlay) *Observer

NewObserver creates a new instance of the gc rangedloop observer.

func (*Observer) Finish added in v1.70.1

func (obs *Observer) Finish(ctx context.Context) (err error)

Finish uploads the bloom filters.

func (*Observer) Fork added in v1.70.1

func (obs *Observer) Fork(ctx context.Context) (_ rangedloop.Partial, err error)

Fork creates a Partial to build bloom filters over a chunk of all the segments.

func (*Observer) Join added in v1.70.1

func (obs *Observer) Join(ctx context.Context, partial rangedloop.Partial) (err error)

Join merges the bloom filters gathered by each Partial.

func (*Observer) Start added in v1.70.1

func (obs *Observer) Start(ctx context.Context, startTime time.Time) (err error)

Start is called at the beginning of each segment loop.

func (*Observer) TestingCreationTime added in v1.114.5

func (obs *Observer) TestingCreationTime() time.Time

TestingCreationTime gets the creation time which will be used to set bloom filter CreationDate.

func (*Observer) TestingForceTableSize added in v1.100.2

func (obs *Observer) TestingForceTableSize(size int)

TestingForceTableSize sets a fixed size for tables. Used for testing.

func (*Observer) TestingRetainInfos added in v1.80.3

func (obs *Observer) TestingRetainInfos() nodeidmap.Map[*RetainInfo]

TestingRetainInfos returns retain infos collected by observer.

type Overlay added in v1.114.5

type Overlay interface {
	ActiveNodesPieceCounts(ctx context.Context) (pieceCounts map[storj.NodeID]int64, err error)
}

Overlay is the minimal set of overlay functions needed by the observer.

type RetainInfo

type RetainInfo struct {
	Filter *bloomfilter.Filter
	Count  int
}

RetainInfo contains info needed for a storage node to retain important data and delete garbage data.
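The retain semantics can be illustrated with a toy bloom filter: a piece that the filter reports as absent is definitely not referenced and can be deleted, while a false positive only means that some garbage survives until a later run. The `toyFilter` type below is a deliberately simple stand-in written for this example and is not the Filter type referenced by RetainInfo.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyFilter is a deliberately simple bloom filter for illustration only.
type toyFilter struct {
	bits []bool
	k    uint32 // number of bit positions set per element
}

func newToyFilter(size int, k uint32) *toyFilter {
	return &toyFilter{bits: make([]bool, size), k: k}
}

// index derives the i-th bit position for a piece ID by double hashing
// the two halves of a 64-bit FNV-1a hash.
func (f *toyFilter) index(pieceID string, i uint32) int {
	h := fnv.New64a()
	h.Write([]byte(pieceID))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	return int((h1 + i*h2) % uint32(len(f.bits)))
}

// add marks a piece as one that must be retained.
func (f *toyFilter) add(pieceID string) {
	for i := uint32(0); i < f.k; i++ {
		f.bits[f.index(pieceID, i)] = true
	}
}

// contains reports whether the piece is possibly in the filter. It never
// returns false for a piece that was added.
func (f *toyFilter) contains(pieceID string) bool {
	for i := uint32(0); i < f.k; i++ {
		if !f.bits[f.index(pieceID, i)] {
			return false
		}
	}
	return true
}

// garbage returns the stored pieces that the filter definitely does not
// contain and that are therefore safe to delete.
func garbage(f *toyFilter, stored []string) []string {
	var out []string
	for _, id := range stored {
		if !f.contains(id) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	f := newToyFilter(1024, 3)
	for _, id := range []string{"piece-1", "piece-2"} {
		f.add(id)
	}
	stored := []string{"piece-1", "piece-2", "piece-3"}
	fmt.Println(garbage(f, stored))
}
```

The asymmetry is what makes the scheme safe: the filter can cause garbage to be kept, but it cannot cause a referenced piece to be deleted.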

type SyncObserver added in v1.79.1

type SyncObserver struct {
	// contains filtered or unexported fields
}

SyncObserver implements a rangedloop observer to collect bloom filters for the garbage collection.

func NewSyncObserver added in v1.79.1

func NewSyncObserver(log *zap.Logger, config Config, overlay overlay.DB) *SyncObserver

NewSyncObserver creates a new instance of the gc rangedloop observer.

func (*SyncObserver) Finish added in v1.79.1

func (obs *SyncObserver) Finish(ctx context.Context) (err error)

Finish uploads the bloom filters.

func (*SyncObserver) Fork added in v1.79.1

func (obs *SyncObserver) Fork(ctx context.Context) (_ rangedloop.Partial, err error)

Fork creates a Partial to build bloom filters over a chunk of all the segments.

func (*SyncObserver) Join added in v1.79.1

func (obs *SyncObserver) Join(ctx context.Context, partial rangedloop.Partial) (err error)

Join merges the bloom filters gathered by each Partial.

func (*SyncObserver) Process added in v1.79.1

func (obs *SyncObserver) Process(ctx context.Context, segments []rangedloop.Segment) error

Process adds pieces to the bloom filter from remote segments.

func (*SyncObserver) Start added in v1.79.1

func (obs *SyncObserver) Start(ctx context.Context, startTime time.Time) (err error)

Start is called at the beginning of each segment loop.

func (*SyncObserver) TestingForceTableSize added in v1.104.1

func (obs *SyncObserver) TestingForceTableSize(size int)

TestingForceTableSize sets a fixed size for tables. Used for testing.

func (*SyncObserver) TestingRetainInfos added in v1.104.1

func (obs *SyncObserver) TestingRetainInfos() nodeidmap.Map[*RetainInfo]

TestingRetainInfos returns retain infos collected by observer.

type TestingObserver added in v1.104.1

type TestingObserver interface {
	TestingRetainInfos() nodeidmap.Map[*RetainInfo]
	TestingForceTableSize(size int)
}

TestingObserver provides testing methods for bloom filter generation ranged loop observers.

type Upload added in v1.79.1

type Upload struct {
	// contains filtered or unexported fields
}

Upload is used to upload bloom filters to the specified bucket.

func NewUpload added in v1.79.1

func NewUpload(log *zap.Logger, config Config) *Upload

NewUpload creates a new upload for bloom filters.

func (*Upload) CheckConfig added in v1.79.1

func (bfu *Upload) CheckConfig() error

CheckConfig checks configuration values.
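The documentation does not say which values are checked. As an illustration of the kind of validation such a method might perform before any upload is attempted, the sketch below rejects an empty access grant, an empty bucket, and a non-positive batch size. The `uploadConfig` type and `validateUploadConfig` function are hypothetical, and the specific checks are assumptions rather than the behavior of CheckConfig.

```go
package main

import (
	"errors"
	"fmt"
)

// uploadConfig is a hypothetical subset of the Config fields used for upload.
type uploadConfig struct {
	AccessGrant  string
	Bucket       string
	ZipBatchSize int
}

// validateUploadConfig returns the first problem found, or nil.
// The checks are illustrative assumptions, not the package's rules.
func validateUploadConfig(c uploadConfig) error {
	if c.AccessGrant == "" {
		return errors.New("access grant is not set")
	}
	if c.Bucket == "" {
		return errors.New("bucket is not set")
	}
	if c.ZipBatchSize <= 0 {
		return fmt.Errorf("zip batch size must be positive, got %d", c.ZipBatchSize)
	}
	return nil
}

func main() {
	// The struct defaults shown above leave AccessGrant and Bucket empty,
	// so a default-like configuration fails this check.
	err := validateUploadConfig(uploadConfig{ZipBatchSize: 40})
	fmt.Println(err) // prints "access grant is not set"
}
```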

func (*Upload) UploadBloomFilters added in v1.79.1

func (bfu *Upload) UploadBloomFilters(ctx context.Context, creationDate time.Time, retainInfos nodeidmap.Map[*RetainInfo]) (err error)

UploadBloomFilters stores a zipfile with multiple bloom filters in a bucket.
