bloomfilter

package
v1.72.4
This package is not in the latest version of its module.
Published: Feb 8, 2023 License: AGPL-3.0 Imports: 17 Imported by: 0

Overview

Package bloomfilter contains the functions needed to run part of the garbage collection process.

The bloomfilter.PieceTracker implements the segments loop Observer interface, which lets it subscribe to the loop and receive information about every segment in the metabase db.

The bloomfilter.PieceTracker handling functions are used by the bloomfilter.Service to periodically account for all existing pieces on storage nodes and create "retain requests" which contain a bloom filter of all pieces that possibly exist on a storage node.

The bloomfilter.Service uploads those requests to the configured Storj bucket after a full segments loop iteration. A separate service from the storj/satellite/gc package then downloads the bloom filters and sends them to the storage nodes.

This bloom filter service should be run only against an immutable database snapshot.

See storj/docs/design/garbage-collection.md for more info.
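The phrase "possibly exist" above reflects the basic property of a bloom filter: it never reports an added item as absent, but it may report an item that was never added as present. The following is a minimal sketch of that behaviour. The Filter type, its hashing scheme, and the piece names are illustrative assumptions and are not the bloomfilter.Filter type used by this package.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Filter is a toy bloom filter used only for illustration.
// It is NOT the storj bloomfilter.Filter type.
type Filter struct {
	bits   []bool
	hashes int
}

// NewFilter returns a filter with the given number of bits and hash functions.
func NewFilter(size, hashes int) *Filter {
	return &Filter{bits: make([]bool, size), hashes: hashes}
}

// positions derives one bit position per hash function using
// double hashing over a single FNV-1a digest.
func (f *Filter) positions(id string) []int {
	h := fnv.New64a()
	h.Write([]byte(id))
	sum := h.Sum64()
	h1 := uint64(uint32(sum))
	h2 := uint64(uint32(sum>>32) | 1) // force an odd step so positions vary
	out := make([]int, f.hashes)
	for i := range out {
		out[i] = int((h1 + uint64(i)*h2) % uint64(len(f.bits)))
	}
	return out
}

// Add marks every bit position derived from id.
func (f *Filter) Add(id string) {
	for _, p := range f.positions(id) {
		f.bits[p] = true
	}
}

// Contains reports whether id was possibly added.
// A false result is definitive; a true result may be a false positive.
func (f *Filter) Contains(id string) bool {
	for _, p := range f.positions(id) {
		if !f.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := NewFilter(1024, 3)
	f.Add("piece-a")
	f.Add("piece-b")
	fmt.Println(f.Contains("piece-a")) // true: added pieces are never reported absent
	fmt.Println(f.Contains("piece-z")) // usually false, occasionally a false positive
}
```

Because a false positive only causes a storage node to keep a piece it could have deleted, this error direction is the safe one for garbage collection.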

Constants

const LATEST = "LATEST"

LATEST is the name of the file that contains the most recently completed bloomfilter generation prefix.
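One way to read this description is as a pointer file: a generation is uploaded under its own prefix, and the LATEST file is updated once that generation is complete. The sketch below illustrates the idea with an in-memory map standing in for a bucket. The Bucket type, Publish, LatestPrefix, and the prefix format are all hypothetical; the actual object layout is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// LATEST mirrors the constant documented above.
const LATEST = "LATEST"

// Bucket is a hypothetical in-memory stand-in for an object store bucket.
type Bucket map[string][]byte

// Publish stores every object under prefix and only then points LATEST at it,
// so a reader following LATEST never sees a half-written generation.
// (Writing the pointer last is an assumption of this sketch.)
func Publish(b Bucket, prefix string, objects map[string][]byte) {
	for name, data := range objects {
		b[prefix+"/"+name] = data
	}
	b[LATEST] = []byte(prefix)
}

// LatestPrefix returns the prefix of the most recently completed generation.
func LatestPrefix(b Bucket) (string, error) {
	data, ok := b[LATEST]
	if !ok {
		return "", errors.New("no completed generation yet")
	}
	return string(data), nil
}

func main() {
	b := Bucket{}
	Publish(b, "generation-1", map[string][]byte{"batch-0.zip": {0x01}})
	Publish(b, "generation-2", map[string][]byte{"batch-0.zip": {0x02}})

	prefix, err := LatestPrefix(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(prefix) // prints generation-2
}
```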

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	Interval time.Duration `help:"the time between each garbage collection executions" releaseDefault:"120h" devDefault:"10m" testDefault:"$TESTINTERVAL"`
	// TODO service is not enabled by default for testing until will be finished
	Enabled bool `help:"set if garbage collection bloom filters is enabled or not" default:"true" testDefault:"false"`

	RunOnce bool `help:"set if garbage collection bloom filter process should only run once then exit" default:"false"`

	UseRangedLoop bool `help:"whether to use ranged loop instead of segment loop" default:"false"`

	// value for InitialPieces currently based on average pieces per node
	InitialPieces     int64   `` /* 139-byte string literal not displayed */
	FalsePositiveRate float64 `help:"the false positive rate used for creating a garbage collection bloom filter" releaseDefault:"0.1" devDefault:"0.1"`

	AccessGrant  string        `help:"Access Grant which will be used to upload bloom filters to the bucket" default:""`
	Bucket       string        `help:"Bucket which will be used to upload bloom filters" default:"" testDefault:"gc-queue"` // TODO do we need full location?
	ZipBatchSize int           `help:"how many bloom filters will be packed in a single zip" default:"500" testDefault:"2"`
	ExpireIn     time.Duration `help:"how quickly uploaded bloom filters will be automatically deleted" default:"336h"`
}

Config contains configurable values for garbage collection.
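InitialPieces and FalsePositiveRate together determine how large a bloom filter needs to be. The sketch below applies the textbook sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions, to show how the two values interact. OptimalSize is a hypothetical helper, the piece count in main is an arbitrary example value, and the package may size its filters differently.

```go
package main

import (
	"fmt"
	"math"
)

// OptimalSize returns the textbook bloom filter parameters for n expected
// elements and a target false positive rate p: the number of bits and the
// number of hash functions. It is a hypothetical helper, not part of the
// storj API.
func OptimalSize(n int64, p float64) (bits int64, hashes int) {
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k := int(math.Round(m / float64(n) * math.Ln2))
	if k < 1 {
		k = 1
	}
	return int64(m), k
}

func main() {
	// 400000 is an arbitrary example piece count, not a storj default.
	bits, hashes := OptimalSize(400000, 0.1)
	fmt.Printf("bits=%d bytes=%d hashes=%d\n", bits, (bits+7)/8, hashes)
}
```

At a false positive rate of 0.1 the formula needs a little under 5 bits per piece, which is why a relatively high rate keeps the filters small enough to ship to every node.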

type Observer added in v1.70.1

type Observer struct {
	// contains filtered or unexported fields
}

Observer implements a rangedloop observer that collects bloom filters for garbage collection.

architecture: Observer

func NewObserver added in v1.70.1

func NewObserver(log *zap.Logger, config Config, overlay overlay.DB) *Observer

NewObserver creates a new instance of the gc rangedloop observer.

func (*Observer) Finish added in v1.70.1

func (obs *Observer) Finish(ctx context.Context) (err error)

Finish uploads the bloom filters.

func (*Observer) Fork added in v1.70.1

func (obs *Observer) Fork(ctx context.Context) (_ rangedloop.Partial, err error)

Fork creates a Partial to build bloom filters over a chunk of all the segments.

func (*Observer) Join added in v1.70.1

func (obs *Observer) Join(ctx context.Context, partial rangedloop.Partial) (err error)

Join merges the bloom filters gathered by each Partial.

func (*Observer) Start added in v1.70.1

func (obs *Observer) Start(ctx context.Context, startTime time.Time) (err error)

Start is called at the beginning of each segment loop.

type PieceTracker

type PieceTracker struct {
	RetainInfos map[storj.NodeID]*RetainInfo
	// LatestCreationTime will be used to set bloom filter CreationDate.
	// Because bloom filter service needs to be run against immutable database snapshot
	// we can set CreationDate for bloom filters as a latest segment CreatedAt value.
	LatestCreationTime time.Time
	// contains filtered or unexported fields
}

PieceTracker implements the segments loop observer interface for garbage collection.

architecture: Observer

func NewPieceTracker

func NewPieceTracker(log *zap.Logger, config Config, pieceCounts map[storj.NodeID]int64) *PieceTracker

NewPieceTracker instantiates a new gc piece tracker to be subscribed to the segments loop.

func NewPieceTrackerWithSeed added in v1.70.1

func NewPieceTrackerWithSeed(log *zap.Logger, config Config, pieceCounts map[storj.NodeID]int64, seed byte) *PieceTracker

NewPieceTrackerWithSeed instantiates a new gc piece tracker to be subscribed to the rangedloop. The seed is passed so that it can be shared among all parallel PieceTrackers handling each segment range.
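The shared seed matters because bloom filters can only be merged bit by bit when they map each item to the same positions. The sketch below shows the idea with a toy filter whose hash is salted by the seed: filters that share a seed and size merge with a bitwise OR, and a mismatch is rejected. SeededFilter is a hypothetical type with a single hash function and is not the filter used by this package.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// SeededFilter is a toy single-hash bloom filter whose bit positions
// depend on a seed. It is illustrative only.
type SeededFilter struct {
	seed byte
	bits []byte
}

// NewSeededFilter returns an empty filter of sizeBytes bytes.
func NewSeededFilter(seed byte, sizeBytes int) *SeededFilter {
	return &SeededFilter{seed: seed, bits: make([]byte, sizeBytes)}
}

// position hashes the seed together with id, so different seeds map the
// same id to different bits.
func (f *SeededFilter) position(id string) int {
	h := fnv.New64a()
	h.Write([]byte{f.seed})
	h.Write([]byte(id))
	return int(h.Sum64() % uint64(len(f.bits)*8))
}

// Add sets the bit for id.
func (f *SeededFilter) Add(id string) {
	p := f.position(id)
	f.bits[p/8] |= byte(1) << uint(p%8)
}

// Contains reports whether id was possibly added.
func (f *SeededFilter) Contains(id string) bool {
	p := f.position(id)
	return f.bits[p/8]&(byte(1)<<uint(p%8)) != 0
}

// Merge ORs other into f. It is only meaningful when both filters share
// the same seed and size.
func (f *SeededFilter) Merge(other *SeededFilter) error {
	if f.seed != other.seed || len(f.bits) != len(other.bits) {
		return errors.New("filters must share seed and size to be merged")
	}
	for i := range f.bits {
		f.bits[i] |= other.bits[i]
	}
	return nil
}

func main() {
	// Two trackers covering different segment ranges, sharing seed 7.
	first := NewSeededFilter(7, 128)
	second := NewSeededFilter(7, 128)
	first.Add("piece-a")
	second.Add("piece-b")

	if err := first.Merge(second); err != nil {
		panic(err)
	}
	fmt.Println(first.Contains("piece-a"), first.Contains("piece-b")) // prints true true

	// A filter built with a different seed cannot be merged.
	other := NewSeededFilter(9, 128)
	fmt.Println(first.Merge(other) != nil) // prints true
}
```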

func (*PieceTracker) InlineSegment

func (pieceTracker *PieceTracker) InlineSegment(ctx context.Context, segment *segmentloop.Segment) (err error)

InlineSegment returns nil because we're only doing gc for storage nodes for now.

func (*PieceTracker) LoopStarted

func (pieceTracker *PieceTracker) LoopStarted(ctx context.Context, info segmentloop.LoopInfo) (err error)

LoopStarted is called at the start of each loop.

func (*PieceTracker) Process added in v1.70.1

func (pieceTracker *PieceTracker) Process(ctx context.Context, segments []segmentloop.Segment) error

Process adds pieces to the bloom filter from remote segments.

func (*PieceTracker) RemoteSegment

func (pieceTracker *PieceTracker) RemoteSegment(ctx context.Context, segment *segmentloop.Segment) error

RemoteSegment takes a remote segment found in metabase and adds pieces to bloom filters.

type RetainInfo

type RetainInfo struct {
	Filter *bloomfilter.Filter
	Count  int
}

RetainInfo contains info needed for a storage node to retain important data and delete garbage data.

type Service

type Service struct {
	Loop *sync2.Cycle
	// contains filtered or unexported fields
}

Service implements a service that collects bloom filters for garbage collection.

architecture: Chore

func NewService

func NewService(log *zap.Logger, config Config, overlay overlay.DB, loop *segmentloop.Service) *Service

NewService creates a new instance of the gc service.

func (*Service) Run

func (service *Service) Run(ctx context.Context) (err error)

Run starts the gc loop service.

func (*Service) RunOnce

func (service *Service) RunOnce(ctx context.Context) (err error)

RunOnce runs the service only once.
