Documentation ¶
Overview ¶
Package roaringset contains all the LSM business logic that is unique to the "RoaringSet" strategy
This package alone does not contain an entire LSM store. It's intended to be used as part of the github.com/donomii/tagdb/tagbrowser/lsmkv package.
Motivation ¶
What makes the RoaringSet strategy unique is that it's essentially a fully persistent Roaring Bitmap that can be built up and updated incrementally (without write amplification) while being extremely fast to query.
Without this specific strategy, it would not be efficient to use roaring bitmaps in an LSM store. For example:
Lucene uses posting lists in the inverted index on disk and supports converting them to a Roaring Bitmap at query time. This resulting bitmap can then be cached. However, the cost to initially convert a posting list to a roaring bitmap is quite huge. In our own tests, inserting 90M out of 100M possible ids into a github.com/weaviate/sroar.Bitmap takes about 3.5s.
You could store a regular roaring bitmap, such as github.com/weaviate/sroar.Bitmap in a regular LSM store, such as RocksDB. This would fix the retrieval issue and you should be able to retrieve and initialize a bitmap containing 90M objects in a few milliseconds. However, the cost to incrementally update this bitmap would be extreme. You would have to use a read-modify-write pattern which would lead to huge write-amplification on large setups. A 90M roaring bitmap is about 10.5MB, so to add a single entry (which would take up anywhere from 1 bit to 2 bytes), you would have to read 10.5MB and write 10.5MB again. That's not feasible except for bulk-loading. In Weaviate we cannot always assume bulk loading, as user behavior and insert orders are generally unpredictable.
We solve this issue by making the LSM store roaring-bitmap-native. This way, we can keep the benefits of an LSM store (very fast writes) with the benefits of a serialized roaring bitmap (very fast reads/initializations).
Essentially this means the RoaringSet strategy behaves like a fully persistent (and durable) Roaring Bitmap. See the next section to learn how it works under the hood.
Internals ¶
The public-facing methods make use of github.com/weaviate/sroar.Bitmap. This serialized bitmap already fulfills many of the criteria needed in Weaviate. It can be initialized at almost no cost (sharing memory) or very little cost (copying some memory). Furthermore, its set, remove, and intersection methods work well for the inverted index use cases in Weaviate.
So, the novel part in the lsmkv.RoaringSet strategy does not sit in the roaring bitmap itself, but rather in the way it's persisted. It uses the standard principles of an LSM store where each new write is first cached in a memtable (and of course written into a Write-Ahead-Log to make it durable). The memtable is flushed into a disk segment when specific criteria are met (memtable size, WAL size, idle time, time since last flush, etc.).
This means that each layer (represented by BitmapLayer) only contains the deltas that were written in a specific time interval. When reading, all layers must be combined into a single bitmap (see BitmapLayers.Flatten).
Over time segments can be combined into fewer, larger segments using an LSM Compaction process. The logic for that can be found in BitmapLayers.Merge.
To make sure access is efficient the entire RoaringSet strategy is built to avoid encoding/decoding steps. Instead we internally store data as simple byte slices. For example, see SegmentNode. You can access bitmaps without any meaningful allocations using SegmentNode.Additions and SegmentNode.Deletions. If you plan to hold on to the bitmap for a time window that is longer than holding a lock that prevents a compaction, you need to copy data (e.g. using SegmentNode.AdditionsWithCopy). Even with such a copy, reading a 90M-ids bitmap takes only single-digit milliseconds.
Index ¶
- Constants
- func Condense(bm *sroar.Bitmap) *sroar.Bitmap
- func NewBitmap(values ...uint64) *sroar.Bitmap
- func NewBitmapPrefill(maxVal uint64, logger logrus.FieldLogger) *sroar.Bitmap
- func NewInvertedBitmap(source *sroar.Bitmap, maxVal uint64, logger logrus.FieldLogger) *sroar.Bitmap
- type BinarySearchNode
- func (n *BinarySearchNode) IsNil() bool
- func (n *BinarySearchNode) IsRed() bool
- func (n *BinarySearchNode) Left() rbtree.Node
- func (n *BinarySearchNode) Parent() rbtree.Node
- func (n *BinarySearchNode) Right() rbtree.Node
- func (n *BinarySearchNode) SetLeft(left rbtree.Node)
- func (n *BinarySearchNode) SetParent(parent rbtree.Node)
- func (n *BinarySearchNode) SetRed(isRed bool)
- func (n *BinarySearchNode) SetRight(right rbtree.Node)
- type BinarySearchTree
- type BinarySearchTreeCursor
- type BitmapFactory
- type BitmapLayer
- type BitmapLayers
- type CombinedCursor
- type Compactor
- type InnerCursor
- type Insert
- type MaxValGetterFunc
- type Seeker
- type SegmentCursor
- type SegmentNode
- func (sn *SegmentNode) Additions() *sroar.Bitmap
- func (sn *SegmentNode) AdditionsWithCopy() *sroar.Bitmap
- func (sn *SegmentNode) Deletions() *sroar.Bitmap
- func (sn *SegmentNode) DeletionsWithCopy() *sroar.Bitmap
- func (sn *SegmentNode) KeyIndexAndWriteTo(w io.Writer, offset int) (segmentindex.Key, error)
- func (sn *SegmentNode) Len() uint64
- func (sn *SegmentNode) PrimaryKey() []byte
- func (sn *SegmentNode) ToBuffer() []byte
- type SegmentNodeList
- func (sn *SegmentNodeList) Additions() []uint64
- func (sn *SegmentNodeList) AdditionsWithCopy() []uint64
- func (sn *SegmentNodeList) Deletions() []uint64
- func (sn *SegmentNodeList) DeletionsWithCopy() []uint64
- func (sn *SegmentNodeList) KeyIndexAndWriteTo(w io.Writer, offset int) (segmentindex.Key, error)
- func (sn *SegmentNodeList) Len() uint64
- func (sn *SegmentNodeList) PrimaryKey() []byte
- func (sn *SegmentNodeList) ToBuffer() []byte
Constants ¶
const ( // DefaultBufferIncrement is the amount of bits greater than <maxVal> // to reduce the amount of times BitmapFactory has to reallocate. DefaultBufferIncrement = uint64(100) )
Variables ¶
This section is empty.
Functions ¶
func Condense ¶
Operations on bitmaps may result in oversized instances in relation to number of elements currently contained in bitmap Examples of such operations: - And-ing bitmaps may results in size being sum of both sizes (especially and-ing bitmap with itself) - Removing elements from bitmap results in size not being reduced (even if there is only few or no elements left)
Method should be used before saving bitmap to file, to ensure minimal required size
For most cases Or between empty bitmap and used bitmap works pretty well for reducing its final size, except for use case, where used bitmap uses internally bitmap - it will not be converted to underlying array, even if there are single elements left
func NewBitmapPrefill ¶
func NewBitmapPrefill(maxVal uint64, logger logrus.FieldLogger) *sroar.Bitmap
Creates prefilled bitmap with values from 0 to maxVal (included).
It is designed to be more performant both time-wise (compared to Set/SetMany) and memory-wise (compared to FromSortedList accepting entire slice of elements) Method creates multiple small bitmaps using FromSortedList (slice is reusable) and ORs them together to get final bitmap. For maxVal > prefillBufferSize (65_536) and multiple CPUs available task is performed by up to prefillMaxRoutines (4) goroutines.
func NewInvertedBitmap ¶
func NewInvertedBitmap(source *sroar.Bitmap, maxVal uint64, logger logrus.FieldLogger) *sroar.Bitmap
NewInvertedBitmap creates a bitmap that as all IDs filled from 0 to maxVal. Then the source bitmap is subtracted (AndNot) from the all-ids bitmap, resulting in a bitmap containing all ids from 0 to maxVal except the ones that were set on the source.
Types ¶
type BinarySearchNode ¶
type BinarySearchNode struct { Key []byte Value BitmapLayer // contains filtered or unexported fields }
func BinarySearchNodeFromRB ¶
func BinarySearchNodeFromRB(rbNode rbtree.Node) (bsNode *BinarySearchNode)
func (*BinarySearchNode) IsNil ¶
func (n *BinarySearchNode) IsNil() bool
func (*BinarySearchNode) IsRed ¶
func (n *BinarySearchNode) IsRed() bool
func (*BinarySearchNode) Left ¶
func (n *BinarySearchNode) Left() rbtree.Node
func (*BinarySearchNode) Parent ¶
func (n *BinarySearchNode) Parent() rbtree.Node
func (*BinarySearchNode) Right ¶
func (n *BinarySearchNode) Right() rbtree.Node
func (*BinarySearchNode) SetLeft ¶
func (n *BinarySearchNode) SetLeft(left rbtree.Node)
func (*BinarySearchNode) SetParent ¶
func (n *BinarySearchNode) SetParent(parent rbtree.Node)
func (*BinarySearchNode) SetRed ¶
func (n *BinarySearchNode) SetRed(isRed bool)
func (*BinarySearchNode) SetRight ¶
func (n *BinarySearchNode) SetRight(right rbtree.Node)
type BinarySearchTree ¶
type BinarySearchTree struct {
// contains filtered or unexported fields
}
func (*BinarySearchTree) FlattenInOrder ¶
func (t *BinarySearchTree) FlattenInOrder() []*BinarySearchNode
FlattenInOrder creates list of ordered copies of bst nodes Only Key and Value fields are populated
func (*BinarySearchTree) Get ¶
func (t *BinarySearchTree) Get(key []byte) (BitmapLayer, error)
Get creates copies of underlying bitmaps to prevent future (concurrent) read and writes after layer being returned
func (*BinarySearchTree) Insert ¶
func (t *BinarySearchTree) Insert(key []byte, values Insert)
type BinarySearchTreeCursor ¶
type BinarySearchTreeCursor struct {
// contains filtered or unexported fields
}
func NewBinarySearchTreeCursor ¶
func NewBinarySearchTreeCursor(bst *BinarySearchTree) *BinarySearchTreeCursor
func (*BinarySearchTreeCursor) First ¶
func (c *BinarySearchTreeCursor) First() ([]byte, BitmapLayer, error)
func (*BinarySearchTreeCursor) Next ¶
func (c *BinarySearchTreeCursor) Next() ([]byte, BitmapLayer, error)
func (*BinarySearchTreeCursor) Seek ¶
func (c *BinarySearchTreeCursor) Seek(key []byte) ([]byte, BitmapLayer, error)
type BitmapFactory ¶
type BitmapFactory struct {
// contains filtered or unexported fields
}
BitmapFactory exists to prevent an expensive call to NewBitmapPrefill each time NewInvertedBitmap is invoked
func NewBitmapFactory ¶
func NewBitmapFactory(maxValGetter MaxValGetterFunc, logger logrus.FieldLogger) *BitmapFactory
func (*BitmapFactory) ActualMaxVal ¶
func (bmf *BitmapFactory) ActualMaxVal() uint64
ActualMaxVal returns the highest value in the bitmap not including the buffer
func (*BitmapFactory) GetBitmap ¶
func (bmf *BitmapFactory) GetBitmap() *sroar.Bitmap
GetBitmap returns a prefilled bitmap, which is cloned from a shared internal. This method is safe to call concurrently. The purpose behind sharing an internal bitmap, is that a Clone() operation is much cheaper than prefilling a map up to <maxDocID> elements is an expensive operation, and this way we only have to do it once.
type BitmapLayer ¶
A BitmapLayer contains all the bitmap related delta-information stored for a specific key in one layer. A layer typically corresponds to one disk segment or a memtable layer
A layer is essentially a snapshot in time and to get an accurate few of the set in its entirety multiple layers need to be combined using BitmapLayers.
The contents of Additions and Deletions must be mutually exclusive. A layer cannot both add and delete an element. The only way to create new layers is through inserting into a Memtable. The memtable must make sure that:
- When an element is added, any previous deletion of this element is removed
- When an element is deleted, any previous addition of this element is removed.
As a result, an element is either a net addition or a net deletion in a layer, but it can never be both.
func (*BitmapLayer) Clone ¶
func (l *BitmapLayer) Clone() BitmapLayer
type BitmapLayers ¶
type BitmapLayers []BitmapLayer
BitmapLayers are a helper type to perform operations on multiple layers, such as BitmapLayers.Flatten or BitmapLayers.Merge.
func (BitmapLayers) Flatten ¶
func (bml BitmapLayers) Flatten() *sroar.Bitmap
Flatten reduces all snapshots into a single Bitmap. This bitmap no longer contains separate additions and deletions, but a single set where all additions and deletions have been applied in the correct order.
If you do not wish to flatten all of history, but rather combine two layers, such as would happen in a Compaction, use BitmapLayers.Merge instead.
Flatten is typically used when serving a specific key to the user: It flattens all disk segments, a currently flushing memtable if it exists, and the active memtable into a single bitmap. The final bitmap is returned to the user.
Flattening Logic ¶
- The first layer is seen as chronologically first. Deletions in the first layers are ignored, as there is nothing to be deleted. As a result, the additions of the first segment become the root state in the first iteration.
- Any subsequent layer is merged into the root layer in the following way: Deletions remove any existing additions, Additions are added.
- This process happens one layer at a time. This way delete-and-readd cycles are reflected correctly. For example, if layer 2 deletes an element X and layer 3 adds element X, then it is a net addition overall, and X should be represented in the final bitmap. If the order is reversed and layer 2 adds X, whereas layer 3 removes X, it is should not be contained in the final map.
func (BitmapLayers) FlattenMutate ¶
func (bml BitmapLayers) FlattenMutate() *sroar.Bitmap
FlattenMutate works as Flatten but mutates passed bitmaps: bml[0].Additions and bml[!=0].Deletions. It is important to provide cloned bitmaps if original ones are expected not to change.
func (BitmapLayers) Merge ¶
func (bml BitmapLayers) Merge() (BitmapLayer, error)
Merge turns two successive layers into one. It does not flatten the segment, but keeps additions and deletions separate. This is because there are no guarantees that the first segment was the root segment. A merge could run on segments 3+4 and they could contain deletions of elements that were added in segments 1 or 2.
Merge is intended to be used as part of compactions.
type CombinedCursor ¶
type CombinedCursor struct {
// contains filtered or unexported fields
}
func NewCombinedCursor ¶
func NewCombinedCursor(innerCursors []InnerCursor, keyOnly bool) *CombinedCursor
When keyOnly flag is set, only keys are returned by First/Next/Seek access methods, 2nd value returned is expected to be nil When keyOnly is not set, 2nd value is always bitmap. Returned bitmap can be empty (e.g. for Next call after last element was already returned)
type Compactor ¶
type Compactor struct {
// contains filtered or unexported fields
}
Compactor takes in a left and a right segment and merges them into a single segment. The input segments are represented by cursors without their respective segmentindexes. A new segmentindex is built from the merged nodes without taking the old indexes into account at all.
The left segment must precede the right one in its creation time, as the compactor applies latest-takes-presence rules when there is a conflict.
Merging independent key/value pairs ¶
The new segment's nodes will be in sorted fashion (this is a requirement for the segment index and segment cursors to function). To achieve a sorted end result, the Compactor goes over both input cursors simultaneously and always works on the smaller of the two keys. After a key/value pair has been added to the output only the input cursor that provided the pair is advanced.
Merging key/value pairs with identical keys ¶
When both segment have a key/value pair with an overlapping key, the value has to be merged. The merge logic is not part of the compactor itself. Instead it makes use of BitmapLayers.Merge.
Exit Criterium ¶
When both cursors no longer return values, all key/value pairs are considered compacted. The compactor then deals with metadata.
Index and Header metadata ¶
Only once the key/value pairs have been compacted, will the compactor write the primary index based on the new key/value payload. Finally, the input writer is rewinded to be able to write the header metadata at the beginning of the file. Because of this, the input writer must be an io.WriteSeeker, such as *os.File.
The level of the resulting segment is the input level increased by one. Levels help the "eligible for compaction" cycle to find suitable compaction pairs.
func NewCompactor ¶
func NewCompactor(w io.WriteSeeker, left, right *SegmentCursor, level uint16, scratchSpacePath string, cleanupDeletions bool, ) *Compactor
NewCompactor from left (older) and right (newer) seeker. See Compactor for an explanation of what goes on under the hood, and why the input requirements are the way they are.
type InnerCursor ¶
type InnerCursor interface { First() ([]byte, BitmapLayer, error) Next() ([]byte, BitmapLayer, error) Seek(key []byte) ([]byte, BitmapLayer, error) }
type MaxValGetterFunc ¶
type MaxValGetterFunc func() uint64
type SegmentCursor ¶
type SegmentCursor struct {
// contains filtered or unexported fields
}
A SegmentCursor iterates over all key-value pairs in a single disk segment. You can either start at the beginning using *SegmentCursor.First or start at an arbitrary key that you may find using *SegmentCursor.Seek
func NewSegmentCursor ¶
func NewSegmentCursor(data []byte, index Seeker) *SegmentCursor
NewSegmentCursor creates a cursor for a single disk segment. Make sure that the data buf is already sliced correctly to start at the payload, as calling *SegmentCursor.First will start reading at offset 0 relative to the passed in buffer. Similarly, the buffer may only contain payloads, as the buffer end is used to determine if more keys can be found.
Therefore if the payload is part of a longer continuous buffer, the cursor should be initialized with data[payloadStartPos:payloadEndPos]
func (*SegmentCursor) First ¶
func (c *SegmentCursor) First() ([]byte, BitmapLayer, error)
func (*SegmentCursor) Next ¶
func (c *SegmentCursor) Next() ([]byte, BitmapLayer, error)
func (*SegmentCursor) Seek ¶
func (c *SegmentCursor) Seek(key []byte) ([]byte, BitmapLayer, error)
type SegmentNode ¶
type SegmentNode struct {
// contains filtered or unexported fields
}
SegmentNode was replaced by SegmentNodeList for WAL, but it is still used in the bitmap layers.
SegmentNode stores one Key-Value pair (without its index) in the LSM Segment. It uses a single []byte internally. As a result there is no decode step required at runtime. Instead you can use
- *SegmentNode.Additions
- *SegmentNode.AdditionsWithCopy
- *SegmentNode.Deletions
- *SegmentNode.DeletionsWithCopy
- *SegmentNode.PrimaryKey
to access the contents. Those helpers in turn do not require a decoding step. The accessor methods that return Roaring Bitmaps only point to existing memory (methods without WithCopy suffix), or in the worst case copy one byte slice (methods with WithCopy suffix).
This makes the SegmentNode very fast to access at query time, even when it contains a large amount of data.
The internal structure of the data is:
byte begin-start | description --------------------|----------------------------------------------------- 0-8 | uint64 indicating the total length of the node, | this is used in cursors to identify the next node. 8-16 | uint64 length indicator for additions sraor bm -> x 16-(x+16) | additions bitmap (x+16)-(x+24) | uint64 length indicator for deletions sroar bm -> y (x+24)-(x+y+24) | deletions bitmap (x+y+24)-(x+y+28) | uint32 length indicator for primary key length -> z (x+y+28)-(x+y+z+28) | primary key
func NewSegmentNode ¶
func NewSegmentNode( key []byte, additions, deletions *sroar.Bitmap, ) (*SegmentNode, error)
func NewSegmentNodeFromBuffer ¶
func NewSegmentNodeFromBuffer(buf []byte) *SegmentNode
NewSegmentNodeFromBuffer creates a new segment node by using the underlying buffer without copying data. Only use this when you can be sure that it's safe to share the data or create your own copy.
func (*SegmentNode) Additions ¶
func (sn *SegmentNode) Additions() *sroar.Bitmap
Additions returns the additions roaring bitmap with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur. If you can't guarantee that, instead use *SegmentNode.AdditionsWithCopy.
func (*SegmentNode) AdditionsWithCopy ¶
func (sn *SegmentNode) AdditionsWithCopy() *sroar.Bitmap
AdditionsWithCopy returns the additions roaring bitmap without sharing state. It creates a copy of the underlying buffer. This is safe to use indefinitely, but much slower than *SegmentNode.Additions as it requires copying all the memory. If you know that you will only need the contents of the node for a duration of time where a lock is held that prevents compactions, it is more efficient to use *SegmentNode.Additions.
func (*SegmentNode) Deletions ¶
func (sn *SegmentNode) Deletions() *sroar.Bitmap
Deletions returns the deletions roaring bitmap with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur. If you can't guarantee that, instead use *SegmentNode.DeletionsWithCopy.
func (*SegmentNode) DeletionsWithCopy ¶
func (sn *SegmentNode) DeletionsWithCopy() *sroar.Bitmap
DeletionsWithCopy returns the deletions roaring bitmap without sharing state. It creates a copy of the underlying buffer. This is safe to use indefinitely, but much slower than *SegmentNode.Deletions as it requires copying all the memory. If you know that you will only need the contents of the node for a duration of time where a lock is held that prevents compactions, it is more efficient to use *SegmentNode.Deletions.
func (*SegmentNode) KeyIndexAndWriteTo ¶
func (sn *SegmentNode) KeyIndexAndWriteTo(w io.Writer, offset int) (segmentindex.Key, error)
KeyIndexAndWriteTo is a helper to flush a memtables full of SegmentNodes. It writes itself into the given writer and returns a segmentindex.Key with start and end indicators (respecting SegmentNode.Offset). Those keys can then be used to build an index for the nodes. The combination of index and node make up an LSM segment.
RoaringSets do not support secondary keys, thus the segmentindex.Key will only ever contain a primary key.
func (*SegmentNode) Len ¶
func (sn *SegmentNode) Len() uint64
Len indicates the total length of the SegmentNode. When reading multiple segments back-2-back, such as in a cursor situation, the offset of element (n+1) is the offset of element n + Len()
func (*SegmentNode) PrimaryKey ¶
func (sn *SegmentNode) PrimaryKey() []byte
func (*SegmentNode) ToBuffer ¶
func (sn *SegmentNode) ToBuffer() []byte
ToBuffer returns the internal buffer without copying data. Only use this, when you can be sure that it's safe to share the data, or create your own copy.
It truncates the buffer at is own length, in case it was initialized with a long buffer that only had a beginning offset, but no end. Such a situation may occur with cursors. If we then returned the whole buffer and don't know what the caller plans on doing with the data, we risk passing around too much memory. Truncating at the length prevents this and has no other negative effects.
type SegmentNodeList ¶
type SegmentNodeList struct {
// contains filtered or unexported fields
}
SegmentNodeList is inspired by the roaringset.SegmentNode, keeping the same fuctionality, but stores lists of []uint64 instead of bitmaps. It stores one Key-Value pair (without its index) in the LSM Segment. As with SegmentNode, it uses a single []byte internally. As a result there is no decode step required at runtime. Instead you can use
- *SegmentNodeList.Additions
- *SegmentNodeList.AdditionsWithCopy
- *SegmentNodeList.Deletions
- *SegmentNodeList.DeletionsWithCopy
- *SegmentNodeList.PrimaryKey
to access the contents. Those helpers in turn do not require a decoding step. Instead of returning Roaring Bitmaps, it returns []uint64 slices. This makes the indexing time much faster, as we don't have to create the roaring bitmaps for each WAL insert. It also makes it smaller on disk, as we don't have to store the roaring bitmap, but only the list of uint64s.
The internal structure of the data is close to the original SegmentNode, storing []uint64 instead of roaring bitmaps. The structure is as follows:
byte begin-start | description --------------------|----------------------------------------------------- 0-8 | uint64 indicating the total length of the node, | this is used in cursors to identify the next node. 8-16 | uint64 length indicator for additions | len(additions)*size(uint64) -> x 16-(x+16) | additions []uint64 (x+16)-(x+24) | uint64 length indicator for deletions | len(deletions)*size(uint64) -> y (x+24)-(x+y+24) | deletions []uint64 (x+y+24)-(x+y+28) | uint32 length indicator for primary key length -> z (x+y+28)-(x+y+z+28) | primary key
func NewSegmentNodeList ¶
func NewSegmentNodeList( key []byte, additions, deletions []uint64, ) (*SegmentNodeList, error)
func NewSegmentNodeListFromBuffer ¶
func NewSegmentNodeListFromBuffer(buf []byte) *SegmentNodeList
NewSegmentNodeFromBuffer creates a new segment node by using the underlying buffer without copying data. Only use this when you can be sure that it's safe to share the data or create your own copy.
func (*SegmentNodeList) Additions ¶
func (sn *SegmentNodeList) Additions() []uint64
Additions returns the additions []uint64 with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur. If you can't guarantee that, instead use *SegmentNodeList.AdditionsWithCopy.
func (*SegmentNodeList) AdditionsWithCopy ¶
func (sn *SegmentNodeList) AdditionsWithCopy() []uint64
AdditionsWithCopy returns the additions []uint64 without sharing state. It creates a copy of the underlying buffer. This is safe to use indefinitely, but much slower than *SegmentNodeList.Additions as it requires copying all the memory. If you know that you will only need the contents of the node for a duration of time where a lock is held that prevents compactions, it is more efficient to use *SegmentNodeList.Additions.
func (*SegmentNodeList) Deletions ¶
func (sn *SegmentNodeList) Deletions() []uint64
Deletions returns the deletions []uint64 with shared state. Only use this method if you can guarantee that you will only use it while holding a maintenance lock or can otherwise be sure that no compaction can occur. If you can't guarantee that, instead use *SegmentNodeList.DeletionsWithCopy.
func (*SegmentNodeList) DeletionsWithCopy ¶
func (sn *SegmentNodeList) DeletionsWithCopy() []uint64
DeletionsWithCopy returns the deletions []uint64 without sharing state. It creates a copy of the underlying buffer. This is safe to use indefinitely, but much slower than *SegmentNodeList.Deletions as it requires copying all the memory. If you know that you will only need the contents of the node for a duration of time where a lock is held that prevents compactions, it is more efficient to use *SegmentNodeList.Deletions.
func (*SegmentNodeList) KeyIndexAndWriteTo ¶
func (sn *SegmentNodeList) KeyIndexAndWriteTo(w io.Writer, offset int) (segmentindex.Key, error)
KeyIndexAndWriteTo is a helper to flush a memtables full of SegmentNodes. It writes itself into the given writer and returns a segmentindex.Key with start and end indicators (respecting SegmentNodeList.Offset). Those keys can then be used to build an index for the nodes. The combination of index and node make up an LSM segment.
RoaringSets do not support secondary keys, thus the segmentindex.Key will only ever contain a primary key.
func (*SegmentNodeList) Len ¶
func (sn *SegmentNodeList) Len() uint64
Len indicates the total length of the SegmentNodeList. When reading multiple segments back-2-back, such as in a cursor situation, the offset of element (n+1) is the offset of element n + Len()
func (*SegmentNodeList) PrimaryKey ¶
func (sn *SegmentNodeList) PrimaryKey() []byte
func (*SegmentNodeList) ToBuffer ¶
func (sn *SegmentNodeList) ToBuffer() []byte
ToBuffer returns the internal buffer without copying data. Only use this, when you can be sure that it's safe to share the data, or create your own copy.
It truncates the buffer at is own length, in case it was initialized with a long buffer that only had a beginning offset, but no end. Such a situation may occur with cursors. If we then returned the whole buffer and don't know what the caller plans on doing with the data, we risk passing around too much memory. Truncating at the length prevents this and has no other negative effects.