segmentindex

package

v1.25.23 Latest Latest Go to latest Published: Oct 29, 2024 License: BSD-3-Clause Imports: 13 Imported by: 2

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/weaviate/weaviate

Links

Open Source Insights

Documentation ¶

Index ¶

Constants
type DiskTree
- func NewDiskTree(data []byte) *DiskTree
type Header
- func ParseHeader(r io.Reader) (*Header, error)
type Indexes
- func (s Indexes) WriteTo(w io.Writer) (int64, error)
type Key
type Node
type Strategy
type Tree
- func NewBalanced(nodes []Node) Tree
- func NewTree(capacity int) Tree

Constants ¶

View Source

const HeaderSize = 16

HeaderSize describes the general offset in a segment until the data starts, it is composed of 2 bytes for level, 2 bytes for version, 2 bytes for secondary index count, 2 bytes for strategy, 8 bytes for the pointer to the index part

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type DiskTree ¶

type DiskTree struct {
	// contains filtered or unexported fields
}

DiskTree is a read-only wrapper around a marshalled index search tree, which can be used for reading, but cannot change the underlying structure. It is thus perfectly suited as an index for an (immutable) LSM disk segment, but pretty much useless for anything else

func NewDiskTree ¶

func NewDiskTree(data []byte) *DiskTree

func (*DiskTree) AllKeys ¶

func (t *DiskTree) AllKeys() ([][]byte, error)

AllKeys is a relatively expensive operation as it basically does a full disk read of the index. It is meant for one of operations, such as initializing a segment where we need access to all keys, e.g. to build a bloom filter. This should not run at query time.

The binary tree is traversed in Level-Order so keys have no meaningful order. Do not use this method if an In-Order traversal is required, but only for use cases who don't require a specific order, such as building a bloom filter.

func (*DiskTree) Get ¶

func (t *DiskTree) Get(key []byte) (Node, error)

func (*DiskTree) QuantileKeys ¶ added in v1.25.11

func (t *DiskTree) QuantileKeys(q int) [][]byte

QuantileKeys returns a list of keys that roughly represent the quantiles of the tree. This can be very useful to bootstrap parallel cursors over the segment that are more or less evenly distributed.

This method uses the natural shape of the tree to determine the distribution of the keys. This is a performance-accuracy trade-off. It does not guarantee perfect distribution, but it is fairly cheap to obtain as most runs will only need to go a few levels deep – even on massive trees.

The number of keys returned is not guaranteed to be exactly q, in most cases returns more keys. This is because in a real-life application you would likely aggregate across multiple segments. Similarly keys are not returned in any specific order, as the assumption is that post-processing will be done when keys are aggregated across multiple segments.

The two guarantees you get are:

If there are at least q keys in the tree, you will get at least q keys, most likely more
If there are less than q keys in the tree, you will get all keys.

func (*DiskTree) Seek ¶

func (t *DiskTree) Seek(key []byte) (Node, error)

func (*DiskTree) Size ¶

func (t *DiskTree) Size() int

type Header ¶ added in v1.18.0

type Header struct {
	Level            uint16
	Version          uint16
	SecondaryIndices uint16
	Strategy         Strategy
	IndexStart       uint64
}

func ParseHeader ¶ added in v1.18.0

func ParseHeader(r io.Reader) (*Header, error)

func (h *Header) PrimaryIndex(source []byte) ([]byte, error)

func (h *Header) SecondaryIndex(source []byte, indexID uint16) ([]byte, error)

func (h *Header) WriteTo(w io.Writer) (int64, error)

type Indexes ¶ added in v1.18.0

type Indexes struct {
	Keys                []Key
	SecondaryIndexCount uint16
	ScratchSpacePath    string
}

func (Indexes) WriteTo ¶ added in v1.18.0

func (s Indexes) WriteTo(w io.Writer) (int64, error)

type Key ¶ added in v1.18.0

type Key struct {
	Key           []byte
	SecondaryKeys [][]byte
	ValueStart    int
	ValueEnd      int
}

Key is a helper struct that can be used to build the key nodes for the segment index. It contains the primary key and an arbitrary number of secondary keys, as well as valueStart and valueEnd indicator. Those are used to find the correct payload for each key.

type Node ¶

type Node struct {
	Key   []byte
	Start uint64
	End   uint64
}

type Strategy ¶ added in v1.18.0

type Strategy uint16

const (
	StrategyReplace Strategy = iota
	StrategySetCollection
	StrategyMapCollection
	StrategyRoaringSet
)

type Tree ¶

type Tree struct {
	// contains filtered or unexported fields
}

func NewBalanced ¶

func NewBalanced(nodes []Node) Tree

func NewTree ¶

func NewTree(capacity int) Tree

func (*Tree) Get ¶

func (t *Tree) Get(key []byte) ([]byte, uint64, uint64)

func (*Tree) Height ¶

func (t *Tree) Height() int

func (*Tree) Insert ¶

func (t *Tree) Insert(key []byte, start, end uint64)

func (*Tree) MarshalBinary ¶

func (t *Tree) MarshalBinary() ([]byte, error)

func (*Tree) MarshalBinaryInto ¶

func (t *Tree) MarshalBinaryInto(w io.Writer) (int64, error)

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL