Documentation ¶
Overview ¶
Package bloomfilter contains the functions needed to run part of the garbage collection process.
The bloomfilter.Observer implements the ranged loop Observer interface, which lets it subscribe to the loop and receive information about every segment in the metabase db.
The bloomfilter.Observer is subscribed to a ranged loop instance to account for all existing segment pieces on storage nodes and to create "retain requests", each of which contains a bloom filter of all pieces that possibly exist on a storage node. With the ranged loop, segments can be processed in parallel to speed up the process.
The bloomfilter.Observer sends those requests to the Storj bucket after a full ranged loop iteration. The bloom filters are then downloaded and sent to the storage nodes by a separate service from the storj/satellite/gc/sender package.
This bloom filter service should be run only against an immutable database snapshot.
See storj/docs/design/garbage-collection.md for more info.
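The fork, process, and join flow described above can be sketched in isolation. This is a minimal illustration, not the package's implementation: the types segment, partial, and observer and the function runLoop are hypothetical stand-ins, and plain sets are used in place of bloom filters to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a hypothetical stand-in for a ranged loop segment:
// it maps a node ID to the piece ID stored on that node.
type segment struct {
	pieces map[string]string
}

// partial collects piece IDs per node for one range of segments.
type partial struct {
	seen map[string]map[string]struct{}
}

// process records every piece of every segment in this partial.
func (p *partial) process(segments []segment) {
	for _, seg := range segments {
		for node, piece := range seg.pieces {
			if p.seen[node] == nil {
				p.seen[node] = make(map[string]struct{})
			}
			p.seen[node][piece] = struct{}{}
		}
	}
}

// observer holds the merged per-node result of all partials.
type observer struct {
	retain map[string]map[string]struct{}
}

func (o *observer) start() { o.retain = make(map[string]map[string]struct{}) }

func (o *observer) fork() *partial {
	return &partial{seen: make(map[string]map[string]struct{})}
}

// join merges one partial into the observer. It is called sequentially.
func (o *observer) join(p *partial) {
	for node, pieces := range p.seen {
		if o.retain[node] == nil {
			o.retain[node] = make(map[string]struct{})
		}
		for piece := range pieces {
			o.retain[node][piece] = struct{}{}
		}
	}
}

// runLoop processes each range in its own goroutine, then joins the results.
func runLoop(o *observer, ranges [][]segment) {
	o.start()
	partials := make([]*partial, len(ranges))
	var wg sync.WaitGroup
	for i, r := range ranges {
		partials[i] = o.fork()
		wg.Add(1)
		go func(p *partial, r []segment) {
			defer wg.Done()
			p.process(r) // each goroutine touches only its own partial
		}(partials[i], r)
	}
	wg.Wait()
	for _, p := range partials {
		o.join(p)
	}
}

func main() {
	o := &observer{}
	runLoop(o, [][]segment{
		{{pieces: map[string]string{"node-a": "p1", "node-b": "p2"}}},
		{{pieces: map[string]string{"node-a": "p3"}}},
	})
	fmt.Println(len(o.retain["node-a"]), len(o.retain["node-b"]))
}
```

Because each goroutine writes only to the partial it was given and the join step runs after all workers finish, the sketch needs no locking around the shared result.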
Index ¶
- Constants
- type Config
- type Observer
- func (obs *Observer) Finish(ctx context.Context) (err error)
- func (obs *Observer) Fork(ctx context.Context) (_ rangedloop.Partial, err error)
- func (obs *Observer) Join(ctx context.Context, partial rangedloop.Partial) (err error)
- func (obs *Observer) Start(ctx context.Context, startTime time.Time) (err error)
- func (obs *Observer) TestingRetainInfos() map[storj.NodeID]*RetainInfo
- type RetainInfo
- type SyncObserver
- func (obs *SyncObserver) Finish(ctx context.Context) (err error)
- func (obs *SyncObserver) Fork(ctx context.Context) (_ rangedloop.Partial, err error)
- func (obs *SyncObserver) Join(ctx context.Context, partial rangedloop.Partial) (err error)
- func (obs *SyncObserver) Process(ctx context.Context, segments []rangedloop.Segment) error
- func (obs *SyncObserver) Start(ctx context.Context, startTime time.Time) (err error)
- type Upload
Constants ¶
const LATEST = "LATEST"
LATEST is the name of the file that contains the most recently completed bloomfilter generation prefix.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	RunOnce         bool `help:"set if garbage collection bloom filter process should only run once then exit" default:"false"`
	UseSyncObserver bool `help:"whether to use test GC SyncObserver with ranged loop" default:"false"`

	// value for InitialPieces currently based on average pieces per node
	InitialPieces     int64   `` /* 139-byte string literal not displayed */
	FalsePositiveRate float64 `help:"the false positive rate used for creating a garbage collection bloom filter" releaseDefault:"0.1" devDefault:"0.1"`

	AccessGrant  string        `help:"Access Grant which will be used to upload bloom filters to the bucket" default:""`
	Bucket       string        `help:"Bucket which will be used to upload bloom filters" default:"" testDefault:"gc-queue"` // TODO do we need full location?
	ZipBatchSize int           `help:"how many bloom filters will be packed in a single zip" default:"500" testDefault:"2"`
	ExpireIn     time.Duration `` /* 130-byte string literal not displayed */
}
Config contains configurable values for garbage collection.
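To give a feel for how InitialPieces and FalsePositiveRate interact, the sketch below applies the textbook bloom filter sizing formulas. The function optimalSize is hypothetical, and whether this package sizes its filters exactly this way (for example, whether it caps the filter size) is not stated here, so treat the numbers as illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

// optimalSize returns the textbook bit count and hash function count for a
// bloom filter expected to hold n items at false positive rate p:
//
//	m = -n * ln(p) / (ln 2)^2
//	k = (m / n) * ln 2
//
// This is an illustrative helper, not the package's own sizing code.
func optimalSize(n int64, p float64) (bits int64, hashes int) {
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k := math.Round(m / float64(n) * math.Ln2)
	if k < 1 {
		k = 1
	}
	return int64(m), int(k)
}

func main() {
	// Hypothetical input: one million expected pieces at a 0.1 false positive rate.
	bits, hashes := optimalSize(1_000_000, 0.1)
	fmt.Printf("bits=%d (about %d KiB), hashes=%d\n", bits, bits/8/1024, hashes)
}
```

Lowering the false positive rate makes the filter larger, so the rate is a trade-off between the size of each retain request and the amount of garbage a node keeps by mistake.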
type Observer ¶ added in v1.70.1
type Observer struct {
// contains filtered or unexported fields
}
Observer implements a rangedloop observer to collect bloom filters for garbage collection.
architecture: Observer
func NewObserver ¶ added in v1.70.1
NewObserver creates a new instance of the gc rangedloop observer.
func (*Observer) Fork ¶ added in v1.70.1
Fork creates a Partial to build bloom filters over a chunk of all the segments.
func (*Observer) TestingRetainInfos ¶ added in v1.80.3
func (obs *Observer) TestingRetainInfos() map[storj.NodeID]*RetainInfo
TestingRetainInfos returns retain infos collected by observer.
type RetainInfo ¶
type RetainInfo struct {
	Filter *bloomfilter.Filter
	Count  int
}
RetainInfo contains info needed for a storage node to retain important data and delete garbage data.
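The retain decision on the receiving side can be illustrated with a toy filter. This is a sketch under assumptions: toyFilter, its hashing scheme, and the garbage helper are hypothetical and unrelated to the real bloomfilter.Filter type. The point it shows is that a bloom filter never reports an added piece as missing, so a piece absent from the filter is safe to treat as garbage, while some garbage may survive as a false positive.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyFilter is a deliberately simple bloom filter used only for illustration.
type toyFilter struct {
	bits   []bool
	hashes int
}

func newToyFilter(size, hashes int) *toyFilter {
	return &toyFilter{bits: make([]bool, size), hashes: hashes}
}

// positions derives the bit positions for id using double hashing
// over the two halves of a 64-bit FNV-1a sum.
func (f *toyFilter) positions(id string) []int {
	h := fnv.New64a()
	_, _ = h.Write([]byte(id))
	sum := h.Sum64()
	h1, h2 := uint64(uint32(sum)), uint64(uint32(sum>>32))
	out := make([]int, f.hashes)
	for i := range out {
		out[i] = int((h1 + uint64(i)*h2) % uint64(len(f.bits)))
	}
	return out
}

// Add marks id as present.
func (f *toyFilter) Add(id string) {
	for _, pos := range f.positions(id) {
		f.bits[pos] = true
	}
}

// Contains reports whether id is possibly present. False means definitely absent.
func (f *toyFilter) Contains(id string) bool {
	for _, pos := range f.positions(id) {
		if !f.bits[pos] {
			return false
		}
	}
	return true
}

// garbage returns the stored pieces that the filter says are definitely not needed.
func garbage(f *toyFilter, stored []string) []string {
	var out []string
	for _, id := range stored {
		if !f.Contains(id) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	f := newToyFilter(1024, 3)
	f.Add("piece-1")
	f.Add("piece-2")
	// piece-3 was never added, so it is reported as garbage unless it
	// happens to collide with the bits of the retained pieces.
	fmt.Println(garbage(f, []string{"piece-1", "piece-2", "piece-3"}))
}
```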
type SyncObserver ¶ added in v1.79.1
type SyncObserver struct {
// contains filtered or unexported fields
}
SyncObserver implements a rangedloop observer to collect bloom filters for garbage collection.
func NewSyncObserver ¶ added in v1.79.1
NewSyncObserver creates a new instance of the gc rangedloop observer.
func (*SyncObserver) Finish ¶ added in v1.79.1
func (obs *SyncObserver) Finish(ctx context.Context) (err error)
Finish uploads the bloom filters.
func (*SyncObserver) Fork ¶ added in v1.79.1
func (obs *SyncObserver) Fork(ctx context.Context) (_ rangedloop.Partial, err error)
Fork creates a Partial to build bloom filters over a chunk of all the segments.
func (*SyncObserver) Join ¶ added in v1.79.1
func (obs *SyncObserver) Join(ctx context.Context, partial rangedloop.Partial) (err error)
Join merges the bloom filters gathered by each Partial.
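One common way to merge two bloom filters is a bitwise OR of their bit arrays, which is valid only when both filters share the same size and hash parameters. The sketch below shows that technique on raw bit words. The mergeBits helper is hypothetical, and whether Join merges filters in exactly this way is an assumption, not something this page states.

```go
package main

import (
	"errors"
	"fmt"
)

// mergeBits ORs src into dst word by word. Both slices must have the same
// length, standing in for "same filter size and hash parameters".
// Illustrative helper only; not part of the package API.
func mergeBits(dst, src []uint64) error {
	if len(dst) != len(src) {
		return errors.New("cannot merge bloom filters of different sizes")
	}
	for i := range dst {
		dst[i] |= src[i]
	}
	return nil
}

func main() {
	a := []uint64{0b0101}
	b := []uint64{0b0011}
	if err := mergeBits(a, b); err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	fmt.Printf("%04b\n", a[0])
}
```

After the merge, every piece that was present in either input filter is still reported as present, which is the property a join step needs.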
func (*SyncObserver) Process ¶ added in v1.79.1
func (obs *SyncObserver) Process(ctx context.Context, segments []rangedloop.Segment) error
Process adds pieces to the bloom filter from remote segments.
type Upload ¶ added in v1.79.1
type Upload struct {
// contains filtered or unexported fields
}
Upload is used to upload bloom filters to the specified bucket.
func (*Upload) CheckConfig ¶ added in v1.79.1
CheckConfig checks configuration values.