bootstrapper

package
v1.0.0-rc.2
Published: Nov 12, 2020 License: Apache-2.0 Imports: 22 Imported by: 14

README

bootstrapper

The collection of bootstrappers comprises the tasks executed when bootstrapping a node.

Bootstrappers

  • fs: The filesystem bootstrapper, used to bootstrap as much data as possible from the local filesystem.
  • peers: The peers bootstrapper, used to bootstrap any remaining data from peers. This is used for a full node join too.
  • commitlog: The commit log bootstrapper, currently only used in the case that peers bootstrapping fails. Once the current block is being snapshotted frequently to disk it might be faster and make more sense to not actively use the peers bootstrapper and just use a combination of the filesystem bootstrapper and the minimal time range required from the commit log bootstrapper.
    • NOTE: the commitlog bootstrapper is special cased in that it runs for the entire bootstrappable range per shard whereas other bootstrappers fill in the unfulfilled gaps as bootstrapping progresses.
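
The way unfulfilled ranges are handed from one bootstrapper to the next can be sketched as follows. This is a simplified illustration rather than the package's implementation: the `source` type, the `runChain` function, and the use of plain block numbers in place of shard time ranges are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// source is a hypothetical stand-in for a bootstrapper: it records which
// of the requested blocks it is able to provide.
type source struct {
	name      string
	available map[int]bool
}

// runChain asks each source, in order, for only the blocks that are still
// unfulfilled. It returns which source fulfilled each block and the blocks
// that no source could fulfill.
func runChain(chain []source, requested []int) (map[int]string, []int) {
	fulfilledBy := make(map[int]string)
	remaining := append([]int(nil), requested...)
	for _, s := range chain {
		var next []int
		for _, b := range remaining {
			if s.available[b] {
				fulfilledBy[b] = s.name
			} else {
				next = append(next, b)
			}
		}
		remaining = next
	}
	sort.Ints(remaining)
	return fulfilledBy, remaining
}

func main() {
	chain := []source{
		{name: "fs", available: map[int]bool{1: true, 2: true}},
		{name: "peers", available: map[int]bool{3: true}},
	}
	fulfilledBy, remaining := runChain(chain, []int{1, 2, 3, 4})
	fmt.Println(fulfilledBy, remaining)
}
```

The sketch covers only the gap-filling behaviour. The commitlog bootstrapper, as noted above, would instead be given the entire bootstrappable range.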

Cache policies

The tasks carried out by each bootstrapper vary considerably depending on the series cache policy being used.

CacheAll series cache policy

For the cache all policy the filesystem bootstrapper will load all series and all the data for each block and return the entire set of data. This will keep every series and series block on heap.

The peers bootstrapper similarly bootstraps all the data from peers that the filesystem does not have and returns the entire set of data fetched.

RecentlyRead series cache policy

For the recently read policy the filesystem bootstrapper will simply fulfill the requested time ranges that match the files it discovers, without actually loading the series and blocks from those files. This relies on data being fetched lazily from the filesystem when data is required for a series that does not live on heap.

The peers bootstrapper will bootstrap all time ranges requested, and if performing a bootstrap with persistence enabled for a time range, will write the data to disk and then remove the results from memory. A bootstrap with persistence enabled is used for any data that is immutable at the time that bootstrapping commences. For time ranges that are mutable the peers bootstrapper will still write the data out to disk in a durable manner, but in the form of a snapshot, and the series and blocks will still be returned directly as a result from the bootstrapper. This enables the commit log bootstrapper to recover the data in case the node shuts down before the in-memory data can be flushed.
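
The two cases described for the peers bootstrapper under the recently read policy can be summarised in a small decision function. This is only a sketch of the rule as stated above: the `outcome` type, the `peersOutcome` function, and the boolean input are hypothetical, and how the package decides whether a range is immutable is not modelled.

```go
package main

import "fmt"

// outcome is a hypothetical summary of what happens to the data
// bootstrapped from peers for a single time range.
type outcome struct {
	writtenAs    string // "persisted" or "snapshot"
	keptInMemory bool   // whether series and blocks are returned in the result
}

// peersOutcome encodes the rule from the text: ranges that are immutable
// when bootstrapping starts are persisted and dropped from memory, while
// mutable ranges are written as a snapshot and still returned in memory.
func peersOutcome(immutableAtStart bool) outcome {
	if immutableAtStart {
		return outcome{writtenAs: "persisted", keptInMemory: false}
	}
	return outcome{writtenAs: "snapshot", keptInMemory: true}
}

func main() {
	fmt.Printf("%+v\n", peersOutcome(true))
	fmt.Printf("%+v\n", peersOutcome(false))
}
```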

Topology Changes

When nodes are added to a replication group, shards are given away to the joining node. Those shards are closed and we re-bootstrap with the shards that we own. When nodes are removed from a replication group, shards from the removed node are given to the remaining nodes in the replication group. The remaining nodes will bootstrap the "new" shards that were assigned to them. Note that we also take writes for shards that we own while bootstrapping. However, we do not allow warm/cold flushes to happen while bootstrapping.

For example, see the following sequences:

(Node add)

  • Node 1:
    • Initial bootstrap (256 shards)
    • Node add
    • Bootstrap (128 shards) // These are the remaining shards it owns.
  • Node 2:
    • Node add
    • Initial bootstrap (128 shards) // These are received from Node 1

(Node remove)

  • Node 1:
    • Node remove
    • Bootstrap (128 shards) // These are received from Node 2; it owns 256 now.
  • Node 2:
    • Node remove
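
The shard counts in the two sequences above can be reproduced with a toy model. The functions `giveAwayHalf` and `takeOver` are hypothetical, and the even split is taken directly from the example; actual shard placement is decided elsewhere and is not modelled here.

```go
package main

import "fmt"

// giveAwayHalf models a node add: the existing node keeps the first half
// of its shards and hands the rest to the joining node. The even split
// mirrors the example only; it is not how placement is really computed.
func giveAwayHalf(owned []uint32) (kept, given []uint32) {
	mid := len(owned) / 2
	return owned[:mid], owned[mid:]
}

// takeOver models a node remove: the remaining node bootstraps the shards
// it receives and ends up owning both sets.
func takeOver(owned, received []uint32) []uint32 {
	all := make([]uint32, 0, len(owned)+len(received))
	all = append(all, owned...)
	return append(all, received...)
}

func main() {
	shards := make([]uint32, 256)
	for i := range shards {
		shards[i] = uint32(i)
	}

	// Node add: Node 1 re-bootstraps what it keeps, Node 2 bootstraps the rest.
	node1, node2 := giveAwayHalf(shards)
	fmt.Println(len(node1), len(node2))

	// Node remove: Node 1 bootstraps the shards received from Node 2.
	node1 = takeOver(node1, node2)
	fmt.Println(len(node1))
}
```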

Documentation

Constants

const (
	// NoOpNoneBootstrapperName is the name of the noOpNoneBootstrapper
	NoOpNoneBootstrapperName = "noop-none"

	// NoOpAllBootstrapperName is the name of the noOpAllBootstrapperProvider
	NoOpAllBootstrapperName = "noop-all"
)

Variables

This section is empty.

Functions

func BuildBootstrapIndexSegment added in v0.15.0

func BuildBootstrapIndexSegment(
	ns namespace.Metadata,
	requestedRanges result.ShardTimeRanges,
	builder segment.DocumentsBuilder,
	compactor *SharedCompactor,
	resultOpts result.Options,
	mmapReporter mmap.Reporter,
	blockStart time.Time,
	blockEnd time.Time,
) (result.IndexBlock, error)

BuildBootstrapIndexSegment is a helper function that builds (in memory) bootstrapped index segments for a ns -> block of time.

func EnqueueReaders added in v0.15.0

func EnqueueReaders(opts EnqueueReadersOptions)

EnqueueReaders into a readers channel grouped by data block.

func GetDefaultIndexBlockForBlockStart added in v0.15.0

func GetDefaultIndexBlockForBlockStart(
	results result.IndexResults,
	blockStart time.Time,
) (result.IndexBlock, bool)

GetDefaultIndexBlockForBlockStart gets the index block for the default volume type from the index results.

func NewBaseBootstrapper

func NewBaseBootstrapper(
	name string,
	src bootstrap.Source,
	opts result.Options,
	next bootstrap.Bootstrapper,
) (bootstrap.Bootstrapper, error)

NewBaseBootstrapper creates a new base bootstrapper.

func NewNoOpAllBootstrapperProvider

func NewNoOpAllBootstrapperProvider() bootstrap.BootstrapperProvider

NewNoOpAllBootstrapperProvider creates a new noOpAllBootstrapperProvider.

func NewNoOpNoneBootstrapperProvider

func NewNoOpNoneBootstrapperProvider() bootstrap.BootstrapperProvider

NewNoOpNoneBootstrapperProvider creates a new noOpNoneBootstrapper.

func PersistBootstrapIndexSegment added in v0.15.0

func PersistBootstrapIndexSegment(
	ns namespace.Metadata,
	requestedRanges result.ShardTimeRanges,
	builder segment.DocumentsBuilder,
	persistManager *SharedPersistManager,
	indexClaimsManager fs.IndexClaimsManager,
	resultOpts result.Options,
	fulfilled result.ShardTimeRanges,
	blockStart time.Time,
	blockEnd time.Time,
) (result.IndexBlock, error)

PersistBootstrapIndexSegment is a helper function that persists bootstrapped index segments for a ns -> block of time.

Types

type EnqueueReadersOptions added in v0.15.9

type EnqueueReadersOptions struct {
	NsMD                      namespace.Metadata
	RunOpts                   bootstrap.RunOptions
	RuntimeOpts               runtime.Options
	FsOpts                    fs.Options
	ShardTimeRanges           result.ShardTimeRanges
	ReaderPool                *ReaderPool
	ReadersCh                 chan<- TimeWindowReaders
	BlockSize                 time.Duration
	OptimizedReadMetadataOnly bool
	Logger                    *zap.Logger
	Span                      opentracing.Span
	NowFn                     clock.NowFn
	Cache                     bootstrap.Cache
}

EnqueueReadersOptions supplies options to enqueue readers.

type NewReaderPoolOptions added in v0.15.0

type NewReaderPoolOptions struct {
	Alloc        ReaderPoolAllocFn
	DisableReuse bool
}

NewReaderPoolOptions contains reader pool options.

type ReaderPool added in v0.15.0

type ReaderPool struct {
	sync.Mutex
	// contains filtered or unexported fields
}

ReaderPool is a lean pool that does not allocate instances up front and is used per bootstrap call.

func NewReaderPool added in v0.15.0

func NewReaderPool(
	opts NewReaderPoolOptions,
) *ReaderPool

NewReaderPool creates a new shareable fileset reader pool.

func (*ReaderPool) Get added in v0.15.0

func (p *ReaderPool) Get() (fs.DataFileSetReader, error)

Get gets a fileset reader from the pool in synchronized fashion.

func (*ReaderPool) Put added in v0.15.0

func (p *ReaderPool) Put(r fs.DataFileSetReader)

Put returns a fileset reader back to the pool in synchronized fashion.

type ReaderPoolAllocFn added in v0.15.0

type ReaderPoolAllocFn func() (fs.DataFileSetReader, error)

ReaderPoolAllocFn allocates a new fileset reader.
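
The idea of a lean, mutex-guarded pool that allocates only on demand can be sketched as below. This is not the source of ReaderPool: the `reader`, `allocFn`, and `leanPool` names are hypothetical stand-ins, and the handling of the reuse flag (simply dropping the reader on Put) is an assumption about what DisableReuse might mean.

```go
package main

import (
	"fmt"
	"sync"
)

// reader is a hypothetical stand-in for fs.DataFileSetReader.
type reader struct{ id int }

// allocFn mirrors the shape of ReaderPoolAllocFn for the stand-in type.
type allocFn func() (*reader, error)

// leanPool allocates nothing up front; it only holds readers that were
// previously handed back with Put.
type leanPool struct {
	mu           sync.Mutex
	alloc        allocFn
	disableReuse bool
	free         []*reader
}

func newLeanPool(alloc allocFn, disableReuse bool) *leanPool {
	return &leanPool{alloc: alloc, disableReuse: disableReuse}
}

// Get returns a free reader if one exists, otherwise allocates a new one.
func (p *leanPool) Get() (*reader, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.free); n > 0 {
		r := p.free[n-1]
		p.free = p.free[:n-1]
		return r, nil
	}
	return p.alloc()
}

// Put hands a reader back. With reuse disabled the reader is dropped
// (an assumption made for this sketch).
func (p *leanPool) Put(r *reader) {
	if r == nil {
		return
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.disableReuse {
		return
	}
	p.free = append(p.free, r)
}

func main() {
	allocated := 0
	pool := newLeanPool(func() (*reader, error) {
		allocated++
		return &reader{id: allocated}, nil
	}, false)

	a, _ := pool.Get()
	pool.Put(a)
	b, _ := pool.Get()
	fmt.Println(a == b, allocated)
}
```

Because nothing is allocated until the first Get, a pool like this is cheap to create for each bootstrap call, which matches the description of ReaderPool above.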

type ShardID added in v0.15.0

type ShardID uint32

ShardID is the shard number.

type ShardReaders added in v0.15.0

type ShardReaders struct {
	Readers []fs.DataFileSetReader
}

ShardReaders are the fileset readers for a shard.

type ShardTimeRangesTimeWindowGroup added in v0.15.0

type ShardTimeRangesTimeWindowGroup struct {
	Ranges result.ShardTimeRanges
	// contains filtered or unexported fields
}

ShardTimeRangesTimeWindowGroup represents all time ranges for every shard within a block of time.

func NewShardTimeRangesTimeWindowGroups added in v0.15.0

func NewShardTimeRangesTimeWindowGroups(
	shardTimeRanges result.ShardTimeRanges,
	windowSize time.Duration,
) []ShardTimeRangesTimeWindowGroup

NewShardTimeRangesTimeWindowGroups divides shard time ranges into grouped blocks.

type SharedCompactor added in v0.15.0

type SharedCompactor struct {
	sync.Mutex
	Compactor *compaction.Compactor
}

SharedCompactor is a lockable compactor that's safe to be shared across threads.

type SharedPersistManager added in v0.15.0

type SharedPersistManager struct {
	sync.Mutex
	Mgr persist.Manager
}

SharedPersistManager is a lockable persist manager that's safe to be shared across threads.
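
SharedCompactor and SharedPersistManager both embed a sync.Mutex next to the resource they guard, so callers lock the wrapper before touching the resource. A minimal sketch of that pattern follows; the `counter` resource, the `sharedCounter` wrapper, and `addConcurrently` are hypothetical and stand in for the real compactor and persist manager.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical resource that is not safe for concurrent use.
type counter struct{ n int }

func (c *counter) Add() { c.n++ }

// sharedCounter has the same shape as the Shared* types: an embedded
// mutex plus the wrapped resource. Callers must Lock before using C.
type sharedCounter struct {
	sync.Mutex
	C *counter
}

// addConcurrently increments the shared counter from several goroutines,
// taking the lock around every use of the wrapped resource.
func addConcurrently(s *sharedCounter, workers, perWorker int) int {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				s.Lock()
				s.C.Add()
				s.Unlock()
			}
		}()
	}
	wg.Wait()
	return s.C.n
}

func main() {
	s := &sharedCounter{C: &counter{}}
	fmt.Println(addConcurrently(s, 8, 1000))
}
```

The embedded mutex does not protect the resource by itself; safety depends on every caller following the lock-use-unlock convention.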

type TimeWindowReaders added in v0.15.0

type TimeWindowReaders struct {
	Ranges  result.ShardTimeRanges
	Readers map[ShardID]ShardReaders
}

TimeWindowReaders are grouped by data block.

Directories

Path Synopsis
fs
