Documentation ¶
Overview ¶
Package bloomfilter contains the functions needed to run part of the garbage collection process.
The bloomfilter.PieceTracker implements the segments loop Observer interface, which allows us to subscribe to the loop and receive information for every segment in the metabase db.
The bloomfilter.PieceTracker handling functions are used by the bloomfilter.Service to periodically account for all existing pieces on storage nodes and create "retain requests" which contain a bloom filter of all pieces that possibly exist on a storage node.
The bloomfilter.Service sends those requests to the Storj bucket after a full segments loop iteration. After that, the bloom filters are downloaded and sent to the storage nodes by a separate service from the storj/satellite/gc package.
This bloom filter service should be run only against an immutable database snapshot.
See storj/docs/design/garbage-collection.md for more info.
Index ¶
- Constants
- type Config
- type PieceTracker
- func (pieceTracker *PieceTracker) InlineSegment(ctx context.Context, segment *segmentloop.Segment) (err error)
- func (pieceTracker *PieceTracker) LoopStarted(ctx context.Context, info segmentloop.LoopInfo) (err error)
- func (pieceTracker *PieceTracker) RemoteSegment(ctx context.Context, segment *segmentloop.Segment) error
- type RetainInfo
- type Service
Constants ¶
const LATEST = "LATEST"
LATEST is the name of the file that contains the most recently completed bloomfilter generation prefix.
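The pointer-file idea behind LATEST can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the in-memory `bucket` map stands in for the real Storj bucket, and the `publish` and `latestKeys` helpers, the `prefix/name` key layout, and the file names are hypothetical rather than taken from the package. The sketch only shows why writing the pointer last lets a reader find the most recently completed generation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// latest mirrors the documented constant; the rest of this file is hypothetical.
const latest = "LATEST"

// bucket is an in-memory stand-in for an object store bucket.
type bucket map[string][]byte

// publish stores all files of one generation under prefix and only then
// overwrites the LATEST pointer, so readers never see a half-written generation.
func publish(b bucket, prefix string, files map[string][]byte) {
	for name, data := range files {
		b[prefix+"/"+name] = data
	}
	b[latest] = []byte(prefix)
}

// latestKeys reads the LATEST pointer and returns the prefix it names
// together with the sorted keys stored under that prefix.
func latestKeys(b bucket) (prefix string, keys []string, ok bool) {
	raw, found := b[latest]
	if !found {
		return "", nil, false
	}
	prefix = string(raw)
	for key := range b {
		if strings.HasPrefix(key, prefix+"/") {
			keys = append(keys, key)
		}
	}
	sort.Strings(keys)
	return prefix, keys, true
}

func main() {
	b := bucket{}
	publish(b, "generation-1", map[string][]byte{"0.zip": []byte("a")})
	publish(b, "generation-2", map[string][]byte{"0.zip": []byte("b"), "1.zip": []byte("c")})

	prefix, keys, ok := latestKeys(b)
	fmt.Println(prefix, keys, ok)
}
```

Older generations stay in the bucket in this sketch; in the real service their lifetime would be governed by settings such as ExpireIn.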
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	Interval time.Duration `help:"the time between each garbage collection executions" releaseDefault:"120h" devDefault:"10m" testDefault:"$TESTINTERVAL"`
	// TODO service is not enabled by default for testing until will be finished
	Enabled bool `help:"set if garbage collection bloom filters is enabled or not" default:"true" testDefault:"false"`
	RunOnce bool `help:"set if garbage collection bloom filter process should only run once then exit" default:"false"`
	// value for InitialPieces currently based on average pieces per node
	InitialPieces     int64   `` /* 139-byte string literal not displayed */
	FalsePositiveRate float64 `help:"the false positive rate used for creating a garbage collection bloom filter" releaseDefault:"0.1" devDefault:"0.1"`
	AccessGrant  string        `help:"Access Grant which will be used to upload bloom filters to the bucket" default:""`
	Bucket       string        `help:"Bucket which will be used to upload bloom filters" default:"" testDefault:"gc-queue"` // TODO do we need full location?
	ZipBatchSize int           `help:"how many bloom filters will be packed in a single zip" default:"500" testDefault:"2"`
	ExpireIn     time.Duration `help:"how quickly uploaded bloom filters will be automatically deleted" default:"336h"`
}
Config contains configurable values for garbage collection.
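To give a feel for how InitialPieces and FalsePositiveRate interact, the following sketch applies the textbook bloom filter sizing formulas, m = -n·ln(p)/(ln 2)² bits and k = (m/n)·ln 2 hash functions. This is an assumption for illustration only: the `optimalParams` helper is hypothetical, the piece count is made up, and the filter implementation used by the service may size its filters differently.

```go
package main

import (
	"fmt"
	"math"
)

// optimalParams returns the textbook bloom filter size in bits and the number
// of hash functions for n expected elements and a target false positive rate p.
// It is a hypothetical helper, not part of the documented package.
func optimalParams(n int64, p float64) (bits int64, hashes int) {
	if n <= 0 || p <= 0 || p >= 1 {
		return 0, 0
	}
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k := int(math.Round(m / float64(n) * math.Ln2))
	if k < 1 {
		k = 1
	}
	return int64(m), k
}

func main() {
	// 1,000,000 pieces is a made-up count; 0.1 matches the documented default rate.
	bits, hashes := optimalParams(1_000_000, 0.1)
	fmt.Printf("bits=%d (~%d KiB), hashes=%d\n", bits, bits/8/1024, hashes)
}
```

Under these formulas, a lower false positive rate produces larger filters to upload and transfer, while a higher rate leaves more garbage pieces on storage nodes after each collection round.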
type PieceTracker ¶
type PieceTracker struct {
	RetainInfos map[storj.NodeID]*RetainInfo
	// LatestCreationTime will be used to set bloom filter CreationDate.
	// Because bloom filter service needs to be run against immutable database snapshot
	// we can set CreationDate for bloom filters as a latest segment CreatedAt value.
	LatestCreationTime time.Time
	// contains filtered or unexported fields
}
PieceTracker implements the segments loop observer interface for garbage collection.
architecture: Observer
func NewPieceTracker ¶
func NewPieceTracker(log *zap.Logger, config Config, pieceCounts map[storj.NodeID]int64) *PieceTracker
NewPieceTracker instantiates a new gc piece tracker to be subscribed to the segments loop.
func (*PieceTracker) InlineSegment ¶
func (pieceTracker *PieceTracker) InlineSegment(ctx context.Context, segment *segmentloop.Segment) (err error)
InlineSegment returns nil because we're only doing gc for storage nodes for now.
func (*PieceTracker) LoopStarted ¶
func (pieceTracker *PieceTracker) LoopStarted(ctx context.Context, info segmentloop.LoopInfo) (err error)
LoopStarted is called at the start of each loop.
func (*PieceTracker) RemoteSegment ¶
func (pieceTracker *PieceTracker) RemoteSegment(ctx context.Context, segment *segmentloop.Segment) error
RemoteSegment takes a remote segment found in metabase and adds pieces to bloom filters.
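The per-node bookkeeping described for RemoteSegment can be sketched roughly as follows. This is not the package's implementation: `nodeID`, `pieceID`, `piece`, `filter`, and `tracker` are hypothetical stand-ins for the real storj types, and the toy FNV-based filter with a fixed size of 1024 bits and 3 hashes is an assumption chosen to keep the example self-contained.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hypothetical stand-ins for storj.NodeID and storj.PieceID.
type nodeID string
type pieceID string

// piece pairs a piece ID with the node that stores it.
type piece struct {
	Node nodeID
	ID   pieceID
}

// filter is a toy bloom filter using double hashing over a 64-bit FNV-1a sum.
type filter struct {
	bits   []uint64
	hashes int
}

func newFilter(nbits, hashes int) *filter {
	return &filter{bits: make([]uint64, (nbits+63)/64), hashes: hashes}
}

// positions derives the bit positions for id from two 32-bit halves of the hash.
func (f *filter) positions(id pieceID) []uint64 {
	h := fnv.New64a()
	h.Write([]byte(id)) // Write on a hash.Hash never returns an error
	sum := h.Sum64()
	h1, h2 := sum&0xffffffff, (sum>>32)|1
	n := uint64(len(f.bits)) * 64
	out := make([]uint64, f.hashes)
	for i := range out {
		out[i] = (h1 + uint64(i)*h2) % n
	}
	return out
}

func (f *filter) add(id pieceID) {
	for _, pos := range f.positions(id) {
		f.bits[pos/64] |= 1 << (pos % 64)
	}
}

// contains may report false positives but never false negatives.
func (f *filter) contains(id pieceID) bool {
	for _, pos := range f.positions(id) {
		if f.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// retainInfo mirrors the shape of the documented RetainInfo type.
type retainInfo struct {
	Filter *filter
	Count  int
}

type tracker struct {
	retainInfos map[nodeID]*retainInfo
}

// remoteSegment adds every piece of one segment to the filter of its node,
// creating the filter on first use.
func (t *tracker) remoteSegment(pieces []piece) {
	for _, p := range pieces {
		info, ok := t.retainInfos[p.Node]
		if !ok {
			info = &retainInfo{Filter: newFilter(1024, 3)}
			t.retainInfos[p.Node] = info
		}
		info.Filter.add(p.ID)
		info.Count++
	}
}

func main() {
	t := &tracker{retainInfos: map[nodeID]*retainInfo{}}
	t.remoteSegment([]piece{{"node-a", "piece-1"}, {"node-b", "piece-2"}})
	t.remoteSegment([]piece{{"node-a", "piece-3"}})
	fmt.Println(t.retainInfos["node-a"].Count, t.retainInfos["node-a"].Filter.contains("piece-3"))
}
```

Because a bloom filter has no false negatives, a storage node that keeps every piece matching its filter never deletes a piece the satellite still references; false positives only mean that some garbage survives until a later run.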
type RetainInfo ¶
type RetainInfo struct {
	Filter *bloomfilter.Filter
	Count  int
}
RetainInfo contains info needed for a storage node to retain important data and delete garbage data.
type Service ¶
Service implements a service that collects bloom filters for garbage collection.
architecture: Chore
func NewService ¶
func NewService(log *zap.Logger, config Config, overlay overlay.DB, loop *segmentloop.Service) *Service
NewService creates a new instance of the gc service.