roaringsetrange

package

v0.0.0-...-f09cf9b Latest Latest Go to latest Published: Jan 13, 2025 License: BSD-3-Clause Imports: 15 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/liutizhong/weaviate

Documentation ¶

Index ¶

type CombinedReader
- func NewCombinedReader(readers []InnerReader, releaseReaders func(), concurrency int, ...) *CombinedReader
- func (r *CombinedReader) Close()
- func (r *CombinedReader) Read(ctx context.Context, value uint64, operator filters.Operator) (*sroar.Bitmap, error)
type Compactor
- func NewCompactor(w io.WriteSeeker, left, right SegmentCursor, level uint16, ...) *Compactor
- func (c *Compactor) Do() error
type GaplessSegmentCursor
- func NewGaplessSegmentCursor(cursor SegmentCursor) *GaplessSegmentCursor
- func (c *GaplessSegmentCursor) First() (uint8, roaringset.BitmapLayer, bool)
- func (c *GaplessSegmentCursor) Next() (uint8, roaringset.BitmapLayer, bool)
type InnerReader
type Memtable
- func NewMemtable(logger logrus.FieldLogger) *Memtable
- func (m *Memtable) Delete(key uint64, values []uint64)
- func (m *Memtable) Insert(key uint64, values []uint64)
- func (m *Memtable) Nodes() []*MemtableNode
type MemtableNode
type MemtableReader
- func NewMemtableReader(memtable *Memtable) *MemtableReader
- func (r *MemtableReader) Read(ctx context.Context, value uint64, operator filters.Operator) (roaringset.BitmapLayer, error)
type SegmentCursor
type SegmentCursorMmap
- func NewSegmentCursorMmap(data []byte) *SegmentCursorMmap
- func (c *SegmentCursorMmap) First() (uint8, roaringset.BitmapLayer, bool)
- func (c *SegmentCursorMmap) Next() (uint8, roaringset.BitmapLayer, bool)
type SegmentCursorPread
- func NewSegmentCursorPread(readSeeker io.ReadSeeker, bufferCount int) *SegmentCursorPread
- func (c *SegmentCursorPread) First() (uint8, roaringset.BitmapLayer, bool)
- func (c *SegmentCursorPread) Next() (uint8, roaringset.BitmapLayer, bool)
type SegmentNode
- func NewSegmentNode(key uint8, additions, deletions *sroar.Bitmap) (*SegmentNode, error)
- func NewSegmentNodeFromBuffer(buf []byte) *SegmentNode
- func (sn *SegmentNode) Additions() *sroar.Bitmap
- func (sn *SegmentNode) Deletions() *sroar.Bitmap
- func (sn *SegmentNode) Key() uint8
- func (sn *SegmentNode) Len() uint64
- func (sn *SegmentNode) ToBuffer() []byte
type SegmentReader
- func NewSegmentReader(cursor *GaplessSegmentCursor) *SegmentReader
- func NewSegmentReaderConcurrent(cursor *GaplessSegmentCursor, concurrency int) *SegmentReader
- func (r *SegmentReader) Read(ctx context.Context, value uint64, operator filters.Operator) (roaringset.BitmapLayer, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type CombinedReader ¶

type CombinedReader struct {
	// contains filtered or unexported fields
}

func NewCombinedReader ¶

func NewCombinedReader(readers []InnerReader, releaseReaders func(), concurrency int,
	logger logrus.FieldLogger,
) *CombinedReader

func (*CombinedReader) Close ¶

func (r *CombinedReader) Close()

func (*CombinedReader) Read ¶

func (r *CombinedReader) Read(ctx context.Context, value uint64, operator filters.Operator,
) (*sroar.Bitmap, error)

type Compactor ¶

type Compactor struct {
	// contains filtered or unexported fields
}

Compactor takes in a left and a right segment and merges them into a single segment. The input segments are represented by cursors without their respective segmentindexes. A new segmentindex is built from the merged nodes without taking the old indexes into account at all.

The left segment must precede the right one in its creation time, as the compactor applies latest-takes-presence rules when there is a conflict.

Merging independent key/value pairs ¶

The new segment's nodes will be in sorted fashion (this is a requirement for the segment index and segment cursors to function). To achieve a sorted end result, the Compactor goes over both input cursors simultaneously and always works on the smaller of the two keys. After a key/value pair has been added to the output only the input cursor that provided the pair is advanced.

Merging key/value pairs with identical keys ¶

When both segment have a key/value pair with an overlapping key, the value has to be merged. The merge logic is not part of the compactor itself. Instead it makes use of [BitmapLayers.Merge].

Exit Criterium ¶

When both cursors no longer return values, all key/value pairs are considered compacted. The compactor then deals with metadata.

Index and Header metadata ¶

Once the key/value pairs have been compacted, the input writer is rewinded to be able to write the header metadata at the beginning of the file Because of this, the input writer must be an io.WriteSeeker, such as *os.File.

The level of the resulting segment is the input level increased by one. Levels help the "eligible for compaction" cycle to find suitable compaction pairs.

func NewCompactor ¶

func NewCompactor(w io.WriteSeeker, left, right SegmentCursor,
	level uint16, cleanupDeletions bool,
) *Compactor

NewCompactor from left (older) and right (newer) seeker. See Compactor for an explanation of what goes on under the hood, and why the input requirements are the way they are.

func (*Compactor) Do ¶

func (c *Compactor) Do() error

Do starts a compaction. See Compactor for an explanation of this process.

type GaplessSegmentCursor ¶

type GaplessSegmentCursor struct {
	// contains filtered or unexported fields
}

func NewGaplessSegmentCursor ¶

func NewGaplessSegmentCursor(cursor SegmentCursor) *GaplessSegmentCursor

func (*GaplessSegmentCursor) First ¶

func (c *GaplessSegmentCursor) First() (uint8, roaringset.BitmapLayer, bool)

func (*GaplessSegmentCursor) Next ¶

func (c *GaplessSegmentCursor) Next() (uint8, roaringset.BitmapLayer, bool)

type InnerReader ¶

type InnerReader interface {
	Read(ctx context.Context, value uint64, operator filters.Operator) (roaringset.BitmapLayer, error)
}

type Memtable ¶

type Memtable struct {
	// contains filtered or unexported fields
}

As docID (acting as value) can have only single value (acting as key) assigned every new value replaces the previous one (therefore array data types are not supported)

func NewMemtable ¶

func NewMemtable(logger logrus.FieldLogger) *Memtable

func (*Memtable) Delete ¶

func (m *Memtable) Delete(key uint64, values []uint64)

func (*Memtable) Insert ¶

func (m *Memtable) Insert(key uint64, values []uint64)

func (*Memtable) Nodes ¶

func (m *Memtable) Nodes() []*MemtableNode

type MemtableNode ¶

type MemtableNode struct {
	Key       uint8
	Additions *sroar.Bitmap
	Deletions *sroar.Bitmap
}

type MemtableReader ¶

type MemtableReader struct {
	// contains filtered or unexported fields
}

func NewMemtableReader ¶

func NewMemtableReader(memtable *Memtable) *MemtableReader

func (*MemtableReader) Read ¶

func (r *MemtableReader) Read(ctx context.Context, value uint64, operator filters.Operator,
) (roaringset.BitmapLayer, error)

type SegmentCursor ¶

type SegmentCursor interface {
	First() (uint8, roaringset.BitmapLayer, bool)
	Next() (uint8, roaringset.BitmapLayer, bool)
}

type SegmentCursorMmap ¶

type SegmentCursorMmap struct {
	// contains filtered or unexported fields
}

A SegmentCursorMmap iterates over all key-value pairs in a single disk segment. You can start at the beginning using *SegmentCursorMmap.First and move forward using *SegmentCursorMmap.Next

func NewSegmentCursorMmap ¶

func NewSegmentCursorMmap(data []byte) *SegmentCursorMmap

NewSegmentCursorMmap creates a cursor for a single disk segment. Make sure that the data buf is already sliced correctly to start at the payload, as calling *SegmentCursorMmap.First will start reading at offset 0 relative to the passed in buffer. Similarly, the buffer may only contain payloads, as the buffer end is used to determine if more keys can be found.

Therefore if the payload is part of a longer continuous buffer, the cursor should be initialized with data[payloadStartPos:payloadEndPos]

func (*SegmentCursorMmap) First ¶

func (c *SegmentCursorMmap) First() (uint8, roaringset.BitmapLayer, bool)

func (*SegmentCursorMmap) Next ¶

func (c *SegmentCursorMmap) Next() (uint8, roaringset.BitmapLayer, bool)

type SegmentCursorPread ¶

type SegmentCursorPread struct {
	// contains filtered or unexported fields
}

A SegmentCursor iterates over all key-value pairs in a single disk segment. You can start at the beginning using *SegmentCursorPread.First and move forward using *SegmentCursorPread.Next

func NewSegmentCursorPread ¶

func NewSegmentCursorPread(readSeeker io.ReadSeeker, bufferCount int) *SegmentCursorPread

NewSegmentCursorPread creates a cursor for a single disk segment. Make sure that the reader has offset = 0 set correctly to start at the payload, as calling *SegmentCursorPread.First will start reading at offset 0. Similarly, the reader may only read payload, as the EOF is used to determine if more keys can be found.

bufferCount tells how many exlusive payload buffers should be used to return expected data. Set multiple buffers if data returned by First/Next will not be used before following call will be made, not to overwrite previously fetched values. (e.g. count 3 means, 3 buffers will be used internally and following calls to First/Next will return data in buffers: 1, 2, 3, 1, 2, 3, ...)

func (*SegmentCursorPread) First ¶

func (c *SegmentCursorPread) First() (uint8, roaringset.BitmapLayer, bool)

func (*SegmentCursorPread) Next ¶

func (c *SegmentCursorPread) Next() (uint8, roaringset.BitmapLayer, bool)

type SegmentNode ¶

type SegmentNode struct {
	// contains filtered or unexported fields
}

SegmentNode stores one Key-Value pair in the LSM Segment. It uses a single []byte internally. As a result there is no decode step required at runtime. Instead you can use

*SegmentNode.Key
*SegmentNode.Additions
*SegmentNode.Deletions

to access the contents. Those helpers in turn do not require a decoding step. The accessor methods that return Roaring Bitmaps only point to existing memory.

This makes the SegmentNode very fast to access at query time, even when it contains a large amount of data.

The internal structure of the data is:

byte begin-start    | description
--------------------|-----------------------------------------------------
0:8                 | uint64 indicating the total length of the node,
                    | this is used in cursors to identify the next node.
8:9					| key
9:17                | uint64 length indicator for additions sraor bitmap (x)
17:(17+x)           | additions bitmap
(17+x):(25+x)       | uint64 length indicator for deletions sroar bitmap (y)
(25+x):(25+x+y)     | deletions bitmap
					| deletion indicator and bitmaps are used only for key == 0

func NewSegmentNode ¶

func NewSegmentNode(key uint8, additions, deletions *sroar.Bitmap) (*SegmentNode, error)

func NewSegmentNodeFromBuffer ¶

func NewSegmentNodeFromBuffer(buf []byte) *SegmentNode

NewSegmentNodeFromBuffer creates a new segment node by using the underlying buffer without copying data. Only use this when you can be sure that it's safe to share the data or create your own copy.

func (*SegmentNode) Additions ¶

func (sn *SegmentNode) Additions() *sroar.Bitmap

Additions returns the additions roaring bitmap with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur.

func (*SegmentNode) Deletions ¶

func (sn *SegmentNode) Deletions() *sroar.Bitmap

Deletions returns the deletions roaring bitmap with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur.

func (*SegmentNode) Key ¶

func (sn *SegmentNode) Key() uint8

func (*SegmentNode) Len ¶

func (sn *SegmentNode) Len() uint64

Len indicates the total length of the SegmentNode. When reading multiple segments back-2-back, such as in a cursor situation, the offset of element (n+1) is the offset of element n + Len()

func (*SegmentNode) ToBuffer ¶

func (sn *SegmentNode) ToBuffer() []byte

ToBuffer returns the internal buffer without copying data. Only use this, when you can be sure that it's safe to share the data, or create your own copy.

It truncates the buffer at is own length, in case it was initialized with a long buffer that only had a beginning offset, but no end. Such a situation may occur with cursors. If we then returned the whole buffer and don't know what the caller plans on doing with the data, we risk passing around too much memory. Truncating at the length prevents this and has no other negative effects.

type SegmentReader ¶

type SegmentReader struct {
	// contains filtered or unexported fields
}

func NewSegmentReader ¶

func NewSegmentReader(cursor *GaplessSegmentCursor) *SegmentReader

func NewSegmentReaderConcurrent ¶

func NewSegmentReaderConcurrent(cursor *GaplessSegmentCursor, concurrency int) *SegmentReader

func (*SegmentReader) Read ¶

func (r *SegmentReader) Read(ctx context.Context, value uint64, operator filters.Operator,
) (roaringset.BitmapLayer, error)

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL