mergeplan

Version: v0.9.0
Published: Apr 3, 2020 | License: Apache-2.0 | Imports: 5 | Imported by: 0

Documentation

Overview

Package mergeplan provides a segment merge planning approach that's inspired by Lucene's TieredMergePolicy.java and descriptions like http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Constants

const MaxSegmentSizeLimit = 1<<31 - 1

MaxSegmentSizeLimit is the maximum size of a segment. The limit stems from the hit-1 optimisation and the maximum encoding limit of uint31 (1<<31 - 1).
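
As a quick check of the arithmetic: in Go, `<<` binds tighter than `-`, so the expression reads as `(1<<31) - 1`, the largest value that fits in 31 bits. The snippet below is a standalone sketch that redeclares the constant locally instead of importing the package.

```go
package main

import (
	"fmt"
	"math"
)

// Local copy of the documented constant, redeclared so the sketch is standalone.
const MaxSegmentSizeLimit = 1<<31 - 1

func main() {
	// Shift has higher precedence than subtraction in Go.
	fmt.Println(MaxSegmentSizeLimit == (1<<31)-1)
	// The value coincides with the largest int32.
	fmt.Println(MaxSegmentSizeLimit == math.MaxInt32)
	// Prints 2147483647.
	fmt.Println(MaxSegmentSizeLimit)
}
```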

Variables

var DefaultMergePlanOptions = MergePlanOptions{
	MaxSegmentsPerTier:   10,
	MaxSegmentSize:       5000000,
	TierGrowth:           10.0,
	SegmentsPerMergeTask: 10,
	FloorSegmentSize:     2000,
	ReclaimDeletesWeight: 2.0,
}

DefaultMergePlanOptions holds the suggested default options.

var ErrMaxSegmentSizeTooLarge = errors.New("MaxSegmentSize exceeds the size limit")

ErrMaxSegmentSizeTooLarge is returned when the configured MaxSegmentSize exceeds MaxSegmentSizeLimit.

Functions

func CalcBudget

func CalcBudget(totalSize int64, firstTierSize int64, o *MergePlanOptions) (
	budgetNumSegments int)

CalcBudget computes the number of segments needed to cover totalSize by climbing a logarithmically growing staircase of segment tiers.
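
The staircase idea can be illustrated with a standalone sketch. It is not the package's implementation: the name `staircaseBudget` and its flat parameter list are assumptions made for illustration. Each tier holds up to `maxPerTier` segments of the current tier size, and the tier size is multiplied by `growth` at every step.

```go
package main

import (
	"fmt"
	"math"
)

// staircaseBudget is a hypothetical helper, not part of mergeplan.
// It counts how many idealized segments are needed to cover totalSize
// when each tier holds up to maxPerTier segments and tier sizes grow
// geometrically by growth.
func staircaseBudget(totalSize, firstTierSize int64, maxPerTier int, growth float64) int {
	tierSize := firstTierSize
	if tierSize < 1 {
		tierSize = 1
	}
	if maxPerTier < 1 {
		maxPerTier = 1
	}
	if growth < 1.0 {
		growth = 1.0
	}

	budget := 0
	for totalSize > 0 {
		inTier := float64(totalSize) / float64(tierSize)
		if inTier < float64(maxPerTier) {
			// The remaining data fits in a partially filled tier.
			budget += int(math.Ceil(inTier))
			break
		}
		// Fill this tier completely and step up to the next one.
		budget += maxPerTier
		totalSize -= int64(maxPerTier) * tierSize
		tierSize = int64(float64(tierSize) * growth)
	}
	return budget
}

func main() {
	// 10 segments of 2000, 10 of 20000, then 1 more for the remaining 5000.
	fmt.Println(staircaseBudget(225000, 2000, 10, 10.0)) // 21
}
```

With the default-like values used here (floor 2000, 10 segments per tier, growth 10.0), a total size of 225000 fills two complete tiers and spills one segment into a third.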

func ScoreSegments

func ScoreSegments(segments []Segment, o *MergePlanOptions) float64

ScoreSegments returns a score for the given segments; a smaller score is better.

func ToBarChart

func ToBarChart(prefix string, barMax int, segments []Segment, plan *MergePlan) string

ToBarChart returns an ASCII rendering of the segments and the plan. The barMax is the max width of the bars in the bar chart.

func ValidateMergePlannerOptions added in v0.8.0

func ValidateMergePlannerOptions(options *MergePlanOptions) error

ValidateMergePlannerOptions validates the merge planner options.

Types

type MergePlan

type MergePlan struct {
	Tasks []*MergeTask
}

A MergePlan is the result of the Plan() API.

The planner doesn’t know how or whether these tasks are executed -- that’s up to a separate merge execution system, which might execute these tasks concurrently or not, and which might execute all the tasks or not.

func Plan

func Plan(segments []Segment, o *MergePlanOptions) (*MergePlan, error)

Plan() will functionally compute a merge plan. A segment will be assigned to at most a single MergeTask in the output MergePlan. A segment not assigned to any MergeTask means the segment should remain unmerged.
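
To illustrate the contract described above, the following sketch walks a plan and picks out the segments that no task claims. The types are local stand-ins mirroring the documented ones, and `seg` and `unmergedIDs` are hypothetical names; the plan is built by hand here rather than produced by Plan().

```go
package main

import "fmt"

// Local stand-ins mirroring the documented types.
type Segment interface {
	Id() uint64
	FullSize() int64
	LiveSize() int64
}

type MergeTask struct{ Segments []Segment }
type MergePlan struct{ Tasks []*MergeTask }

// seg is a hypothetical Segment implementation.
type seg struct {
	id         uint64
	full, live int64
}

func (s seg) Id() uint64      { return s.id }
func (s seg) FullSize() int64 { return s.full }
func (s seg) LiveSize() int64 { return s.live }

// unmergedIDs returns the ids of segments that no task in the plan claims.
func unmergedIDs(segments []Segment, plan *MergePlan) []uint64 {
	claimed := map[uint64]bool{}
	if plan != nil {
		for _, task := range plan.Tasks {
			for _, s := range task.Segments {
				claimed[s.Id()] = true
			}
		}
	}
	var out []uint64
	for _, s := range segments {
		if !claimed[s.Id()] {
			out = append(out, s.Id())
		}
	}
	return out
}

func main() {
	segs := []Segment{seg{1, 100, 90}, seg{2, 100, 80}, seg{3, 5000, 5000}}
	// A hand-built plan with one task that merges segments 1 and 2.
	plan := &MergePlan{Tasks: []*MergeTask{{Segments: segs[:2]}}}
	fmt.Println(unmergedIDs(segs, plan)) // [3]
}
```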

type MergePlanOptions

type MergePlanOptions struct {
	// Max # segments per logarithmic tier, or max width of any
	// logarithmic “step”.  Smaller values mean more merging but fewer
	// segments.  Should be >= SegmentsPerMergeTask, else you'll have
	// too much merging.
	MaxSegmentsPerTier int

	// Max size of any segment produced after merging.  Actual
	// merging, however, may produce segment sizes different than the
	// planner’s predicted sizes.
	MaxSegmentSize int64

	// The growth factor for each tier in a staircase of idealized
	// segments computed by CalcBudget().
	TierGrowth float64

	// The number of segments in any resulting MergeTask.  e.g.,
	// len(result.Tasks[ * ].Segments) == SegmentsPerMergeTask.
	SegmentsPerMergeTask int

	// Small segments are rounded up to this size, i.e., treated as
	// equal (floor) size for consideration.  This is to prevent lots
	// of tiny segments from resulting in a long tail in the index.
	FloorSegmentSize int64

	// Controls how aggressively merges that reclaim more deletions
	// are favored.  Higher values will more aggressively target
	// merges that reclaim deletions, but be careful not to go so high
	// that way too much merging takes place; a value of 3.0 is
	// probably nearly too high.  A value of 0.0 means deletions don't
	// impact merge selection.
	ReclaimDeletesWeight float64

	// Optional, defaults to mergeplan.CalcBudget().
	CalcBudget func(totalSize int64, firstTierSize int64,
		o *MergePlanOptions) (budgetNumSegments int)

	// Optional, defaults to mergeplan.ScoreSegments().
	ScoreSegments func(segments []Segment, o *MergePlanOptions) float64

	// Optional.
	Logger func(string)
}

The MergePlanOptions is designed to be reusable between planning calls.
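
The optional function fields suggest a fallback pattern: use the caller's hook when it is set, otherwise a default. The sketch below shows that pattern with a trimmed stand-in type. `opts`, `defaultBudget`, and `budgetFor` are hypothetical names, and `defaultBudget` is a placeholder rather than the package's CalcBudget.

```go
package main

import "fmt"

// opts is a trimmed stand-in for MergePlanOptions with one optional hook.
type opts struct {
	MaxSegmentsPerTier int
	CalcBudget         func(totalSize, firstTierSize int64, o *opts) int
}

// defaultBudget is a placeholder default, not the package's CalcBudget.
func defaultBudget(totalSize, firstTierSize int64, o *opts) int {
	return o.MaxSegmentsPerTier
}

// budgetFor uses the caller's hook if one is set, else the default.
func budgetFor(totalSize, firstTierSize int64, o *opts) int {
	calc := o.CalcBudget
	if calc == nil {
		calc = defaultBudget
	}
	return calc(totalSize, firstTierSize, o)
}

func main() {
	shared := &opts{MaxSegmentsPerTier: 10}
	fmt.Println(budgetFor(1000, 100, shared)) // 10

	// Copy before customizing, so the shared options stay reusable.
	custom := *shared
	custom.CalcBudget = func(totalSize, firstTierSize int64, o *opts) int { return 1 }
	fmt.Println(budgetFor(1000, 100, &custom)) // 1
	fmt.Println(shared.CalcBudget == nil)      // true
}
```

Copying the struct by value before overriding a hook is one way to keep a shared options value safe to reuse across planning calls.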

func (*MergePlanOptions) RaiseToFloorSegmentSize

func (o *MergePlanOptions) RaiseToFloorSegmentSize(s int64) int64

RaiseToFloorSegmentSize returns the larger of s and FloorSegmentSize.
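
A small sketch of the documented behavior, using a trimmed stand-in type (`floorOpts` and `raiseToFloor` are hypothetical names) and the default floor of 2000:

```go
package main

import "fmt"

// floorOpts is a trimmed stand-in for MergePlanOptions.
type floorOpts struct{ FloorSegmentSize int64 }

// raiseToFloor mirrors the documented behavior: tiny sizes are
// rounded up to the floor, larger sizes pass through unchanged.
func (o *floorOpts) raiseToFloor(s int64) int64 {
	if s > o.FloorSegmentSize {
		return s
	}
	return o.FloorSegmentSize
}

func main() {
	o := &floorOpts{FloorSegmentSize: 2000}
	fmt.Println(o.raiseToFloor(150))   // 2000
	fmt.Println(o.raiseToFloor(75000)) // 75000
}
```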

type MergeTask

type MergeTask struct {
	Segments []Segment
}

A MergeTask represents several segments that should be merged together into a single segment.

type Segment

type Segment interface {
	// Unique id of the segment -- used for sorting.
	Id() uint64

	// Full segment size (the size before any logical deletions).
	FullSize() int64

	// Size of the live data of the segment; i.e., FullSize() minus
	// any logical deletions.
	LiveSize() int64
}

A Segment represents the information that the planner needs to calculate segment merging.
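
As an illustration of how FullSize() and LiveSize() relate, here is a sketch of one possible implementation. The interface is redeclared locally so the snippet is standalone, and `fileSegment` and `reclaimable` are hypothetical names, not part of the package.

```go
package main

import "fmt"

// Segment mirrors the documented interface.
type Segment interface {
	Id() uint64
	FullSize() int64
	LiveSize() int64
}

// fileSegment is a hypothetical implementation that tracks the total
// size and the size of logically deleted data separately.
type fileSegment struct {
	id      uint64
	total   int64
	deleted int64
}

func (f *fileSegment) Id() uint64      { return f.id }
func (f *fileSegment) FullSize() int64 { return f.total }
func (f *fileSegment) LiveSize() int64 { return f.total - f.deleted }

// Compile-time check that *fileSegment satisfies Segment.
var _ Segment = (*fileSegment)(nil)

// reclaimable reports how much of s is taken up by logical deletions.
func reclaimable(s Segment) int64 {
	return s.FullSize() - s.LiveSize()
}

func main() {
	s := &fileSegment{id: 7, total: 1000, deleted: 250}
	fmt.Println(s.LiveSize())   // 750
	fmt.Println(reclaimable(s)) // 250
}
```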
