segmentindex

package
v0.0.0-...-cba7ee7 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 19, 2024 License: GPL-3.0 Imports: 13 Imported by: 0

Documentation

Index

Constants

View Source
const HeaderSize = 16

HeaderSize describes the general offset in a segment until the data starts, it is composed of 2 bytes for level, 2 bytes for version, 2 bytes for secondary index count, 2 bytes for strategy, 8 bytes for the pointer to the index part

Variables

This section is empty.

Functions

func CheckExpectedStrategy

func CheckExpectedStrategy(strategy Strategy, expectedStrategies ...Strategy) error

func IsExpectedStrategy

func IsExpectedStrategy(strategy Strategy, expectedStrategies ...Strategy) bool

func MustBeExpectedStrategy

func MustBeExpectedStrategy(strategy Strategy, expectedStrategies ...Strategy)

Types

type DiskTree

type DiskTree struct {
	// contains filtered or unexported fields
}

DiskTree is a read-only wrapper around a marshalled index search tree, which can be used for reading, but cannot change the underlying structure. It is thus perfectly suited as an index for an (immutable) LSM disk segment, but pretty much useless for anything else

func NewDiskTree

func NewDiskTree(data []byte) *DiskTree

func (*DiskTree) AllKeys

func (t *DiskTree) AllKeys() ([][]byte, error)

AllKeys is a relatively expensive operation as it basically does a full disk read of the index. It is meant for one of operations, such as initializing a segment where we need access to all keys, e.g. to build a bloom filter. This should not run at query time.

The binary tree is traversed in Level-Order so keys have no meaningful order. Do not use this method if an In-Order traversal is required, but only for use cases who don't require a specific order, such as building a bloom filter.

func (*DiskTree) Get

func (t *DiskTree) Get(key []byte) (Node, error)

func (*DiskTree) Next

func (t *DiskTree) Next(key []byte) (Node, error)

func (*DiskTree) QuantileKeys

func (t *DiskTree) QuantileKeys(q int) [][]byte

QuantileKeys returns a list of keys that roughly represent the quantiles of the tree. This can be very useful to bootstrap parallel cursors over the segment that are more or less evenly distributed.

This method uses the natural shape of the tree to determine the distribution of the keys. This is a performance-accuracy trade-off. It does not guarantee perfect distribution, but it is fairly cheap to obtain as most runs will only need to go a few levels deep – even on massive trees.

The number of keys returned is not guaranteed to be exactly q, in most cases returns more keys. This is because in a real-life application you would likely aggregate across multiple segments. Similarly keys are not returned in any specific order, as the assumption is that post-processing will be done when keys are aggregated across multiple segments.

The two guarantees you get are:

  1. If there are at least q keys in the tree, you will get at least q keys, most likely more
  2. If there are less than q keys in the tree, you will get all keys.

func (*DiskTree) Seek

func (t *DiskTree) Seek(key []byte) (Node, error)

func (*DiskTree) Size

func (t *DiskTree) Size() int
type Header struct {
	Level            uint16
	Version          uint16
	SecondaryIndices uint16
	Strategy         Strategy
	IndexStart       uint64
}

func ParseHeader

func ParseHeader(r io.Reader) (*Header, error)

func (*Header) PrimaryIndex

func (h *Header) PrimaryIndex(source []byte) ([]byte, error)

func (*Header) SecondaryIndex

func (h *Header) SecondaryIndex(source []byte, indexID uint16) ([]byte, error)

func (*Header) WriteTo

func (h *Header) WriteTo(w io.Writer) (int64, error)

type Indexes

type Indexes struct {
	Keys                []Key
	SecondaryIndexCount uint16
	ScratchSpacePath    string
}

func (Indexes) WriteTo

func (s Indexes) WriteTo(w io.Writer) (int64, error)

type Key

type Key struct {
	Key           []byte
	SecondaryKeys [][]byte
	ValueStart    int
	ValueEnd      int
}

Key is a helper struct that can be used to build the key nodes for the segment index. It contains the primary key and an arbitrary number of secondary keys, as well as valueStart and valueEnd indicator. Those are used to find the correct payload for each key.

type Node

type Node struct {
	Key   []byte
	Start uint64
	End   uint64
}

type Strategy

type Strategy uint16
const (
	StrategyReplace Strategy = iota
	StrategySetCollection
	StrategyMapCollection
	StrategyRoaringSet
	StrategyRoaringSetRange
)

type Tree

type Tree struct {
	// contains filtered or unexported fields
}

func NewBalanced

func NewBalanced(nodes []Node) Tree

func NewTree

func NewTree(capacity int) Tree

func (*Tree) Get

func (t *Tree) Get(key []byte) ([]byte, uint64, uint64)

func (*Tree) Height

func (t *Tree) Height() int

func (*Tree) Insert

func (t *Tree) Insert(key []byte, start, end uint64)

func (*Tree) MarshalBinary

func (t *Tree) MarshalBinary() ([]byte, error)

func (*Tree) MarshalBinaryInto

func (t *Tree) MarshalBinaryInto(w io.Writer) (int64, error)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL