bloomfilter

Version: v1.67.0-rc
Published: Nov 10, 2022 License: AGPL-3.0 Imports: 16 Imported by: 0

Documentation

Overview

Package bloomfilter contains the functions needed to run part of the garbage collection process.

The bloomfilter.PieceTracker implements the segments loop Observer interface, which lets it subscribe to the loop and receive information about every segment in the metabase db.

The bloomfilter.PieceTracker handling functions are used by the bloomfilter.Service to periodically account for all existing pieces on storage nodes and create "retain requests", each of which contains a bloom filter of all pieces that possibly exist on a storage node.

The bloomfilter.Service uploads those requests to a Storj bucket after a full segments loop iteration. The bloom filters are later downloaded and sent to the storage nodes by a separate service from the storj/satellite/gc package.

This bloom filter service should be run only against an immutable database snapshot.

See storj/docs/design/garbage-collection.md for more info.
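The phrase "possibly exist" in the overview reflects the usual bloom filter guarantee: a piece that was added is always reported as present, while a piece that was never added is usually, but not always, reported as absent. The following is a minimal sketch of that behavior. The toyFilter type, its sizing, and its hashing scheme are illustrative assumptions and are not the storj bloomfilter.Filter implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyFilter is a deliberately simple bloom filter for illustration only.
// It is NOT the storj bloomfilter.Filter type.
type toyFilter struct {
	bits   []bool
	hashes int
}

func newToyFilter(size, hashes int) *toyFilter {
	return &toyFilter{bits: make([]bool, size), hashes: hashes}
}

// positions derives one bit position per hash function from two FNV hashes
// (double hashing).
func (f *toyFilter) positions(id string) []int {
	h1 := fnv.New64a()
	h1.Write([]byte(id))
	a := h1.Sum64()

	h2 := fnv.New64()
	h2.Write([]byte(id))
	b := h2.Sum64() | 1 // force an odd step so positions differ

	out := make([]int, f.hashes)
	for i := range out {
		out[i] = int((a + uint64(i)*b) % uint64(len(f.bits)))
	}
	return out
}

// Add marks every position for id.
func (f *toyFilter) Add(id string) {
	for _, p := range f.positions(id) {
		f.bits[p] = true
	}
}

// Contains reports whether id may have been added. A false result is
// definitive; a true result can be a false positive.
func (f *toyFilter) Contains(id string) bool {
	for _, p := range f.positions(id) {
		if !f.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := newToyFilter(1024, 3)
	for _, id := range []string{"piece-a", "piece-b"} {
		f.Add(id)
	}
	fmt.Println(f.Contains("piece-a"))       // an added piece is always reported
	fmt.Println(f.Contains("never-uploaded")) // usually false, occasionally true
}
```

For garbage collection this asymmetry is the safe direction: a storage node that deletes only pieces the filter rejects can keep some garbage because of false positives, but it does not delete a piece that was added to the filter.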

Constants

const LATEST = "LATEST"

LATEST is the name of the file that contains the most recently completed bloomfilter generation prefix.
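One way to read this description is as a pointer-file pattern: each generation of bloom filters is written under its own prefix, and a small LATEST object names the prefix of the last generation that finished. The sketch below illustrates that pattern with an in-memory map. The fakeBucket type, the publish and latestBatches helpers, the "gen-1"/"gen-2" prefixes, and the choice to write LATEST last are all assumptions made for illustration and are not taken from the package source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// latest mirrors the value of the package's LATEST constant.
const latest = "LATEST"

// fakeBucket is a hypothetical stand-in for an object store bucket.
type fakeBucket map[string][]byte

// publish stores every batch under prefix and only then updates LATEST,
// so a reader following the pointer sees a completed generation.
func publish(b fakeBucket, prefix string, batches map[string][]byte) {
	for name, data := range batches {
		b[prefix+"/"+name] = data
	}
	b[latest] = []byte(prefix)
}

// latestBatches reads LATEST and returns the sorted keys under that prefix.
func latestBatches(b fakeBucket) []string {
	prefix, ok := b[latest]
	if !ok {
		return nil // nothing has been published yet
	}
	var keys []string
	for k := range b {
		if strings.HasPrefix(k, string(prefix)+"/") {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	b := fakeBucket{}
	publish(b, "gen-1", map[string][]byte{"0.zip": []byte("old")})
	publish(b, "gen-2", map[string][]byte{"0.zip": []byte("new"), "1.zip": []byte("new")})
	fmt.Println(latestBatches(b)) // keys under the prefix named by LATEST
}
```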

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	Interval time.Duration `help:"the time between each garbage collection executions" releaseDefault:"120h" devDefault:"10m" testDefault:"$TESTINTERVAL"`
	// TODO service is not enabled by default for testing until will be finished
	Enabled bool `help:"set if garbage collection bloom filters is enabled or not" default:"true" testDefault:"false"`

	RunOnce bool `help:"set if garbage collection bloom filter process should only run once then exit" default:"false"`

	// value for InitialPieces currently based on average pieces per node
	InitialPieces     int64   `` /* 139-byte string literal not displayed */
	FalsePositiveRate float64 `help:"the false positive rate used for creating a garbage collection bloom filter" releaseDefault:"0.1" devDefault:"0.1"`

	AccessGrant  string        `help:"Access Grant which will be used to upload bloom filters to the bucket" default:""`
	Bucket       string        `help:"Bucket which will be used to upload bloom filters" default:"" testDefault:"gc-queue"` // TODO do we need full location?
	ZipBatchSize int           `help:"how many bloom filters will be packed in a single zip" default:"500" testDefault:"2"`
	ExpireIn     time.Duration `help:"how quickly uploaded bloom filters will be automatically deleted" default:"336h"`
}

Config contains configurable values for garbage collection.
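To give a feel for how InitialPieces and FalsePositiveRate interact, the sketch below applies the textbook bloom filter sizing formulas to an expected element count and a target false positive rate. This is the standard derivation, not necessarily the sizing logic the package uses, and the piece count of 400000 is a made-up value, since the actual InitialPieces default is not displayed above.

```go
package main

import (
	"fmt"
	"math"
)

// optimalSize returns textbook bloom filter parameters for n expected
// elements and a target false positive rate p:
//
//	bits   m = ceil(-n * ln(p) / (ln 2)^2)
//	hashes k = round((m / n) * ln 2), at least 1
func optimalSize(n int64, p float64) (bits int64, hashes int) {
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k := int(math.Round(m / float64(n) * math.Ln2))
	if k < 1 {
		k = 1
	}
	return int64(m), k
}

func main() {
	// 400000 is a hypothetical piece count; 0.1 matches the
	// FalsePositiveRate default shown in Config.
	bits, hashes := optimalSize(400000, 0.1)
	fmt.Printf("bits=%d (about %d KiB), hashes=%d\n", bits, bits/8/1024, hashes)
}
```

Under these formulas a 10% false positive rate costs roughly 4.8 bits per expected piece with 3 hash functions, so filter size grows linearly with the number of pieces and only logarithmically as the target rate tightens.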

type PieceTracker

type PieceTracker struct {
	RetainInfos map[storj.NodeID]*RetainInfo
	// LatestCreationTime will be used to set bloom filter CreationDate.
	// Because bloom filter service needs to be run against immutable database snapshot
	// we can set CreationDate for bloom filters as a latest segment CreatedAt value.
	LatestCreationTime time.Time
	// contains filtered or unexported fields
}

PieceTracker implements the segments loop observer interface for garbage collection.

architecture: Observer

func NewPieceTracker

func NewPieceTracker(log *zap.Logger, config Config, pieceCounts map[storj.NodeID]int64) *PieceTracker

NewPieceTracker instantiates a new gc piece tracker to be subscribed to the segments loop.

func (*PieceTracker) InlineSegment

func (pieceTracker *PieceTracker) InlineSegment(ctx context.Context, segment *segmentloop.Segment) (err error)

InlineSegment returns nil because we're only doing gc for storage nodes for now.

func (*PieceTracker) LoopStarted

func (pieceTracker *PieceTracker) LoopStarted(ctx context.Context, info segmentloop.LoopInfo) (err error)

LoopStarted is called at each start of a loop.

func (*PieceTracker) RemoteSegment

func (pieceTracker *PieceTracker) RemoteSegment(ctx context.Context, segment *segmentloop.Segment) error

RemoteSegment takes a remote segment found in metabase and adds pieces to bloom filters.

type RetainInfo

type RetainInfo struct {
	Filter *bloomfilter.Filter
	Count  int
}

RetainInfo contains info needed for a storage node to retain important data and delete garbage data.

type Service

type Service struct {
	Loop *sync2.Cycle
	// contains filtered or unexported fields
}

Service implements a service that collects bloom filters for garbage collection.

architecture: Chore

func NewService

func NewService(log *zap.Logger, config Config, overlay overlay.DB, loop *segmentloop.Service) *Service

NewService creates a new instance of the gc service.

func (*Service) Run

func (service *Service) Run(ctx context.Context) (err error)

Run starts the gc loop service.

func (*Service) RunOnce

func (service *Service) RunOnce(ctx context.Context) (err error)

RunOnce runs the service only once.
