roaring64

package
v1.9.4 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 3, 2024 License: Apache-2.0, BSD-3-Clause, Apache-2.0 Imports: 12 Imported by: 109

Documentation

Index

Constants

View Source
const (
	// Min64BitSigned - Minimum 64 bit value
	Min64BitSigned = -9223372036854775808
	// Max64BitSigned - Maximum 64 bit value
	Max64BitSigned = 9223372036854775807
)

Variables

This section is empty.

Functions

func ClearBits added in v0.5.1

func ClearBits(foundSet, target *Bitmap)

ClearBits cleared the bits that exist in the target if they are also in the found set.

Types

type BSI added in v0.5.1

type BSI struct {
	MaxValue int64
	MinValue int64
	// contains filtered or unexported fields
}

BSI is at its simplest is an array of bitmaps that represent an encoded binary value. The advantage of a BSI is that comparisons can be made across ranges of values whereas a bitmap can only represent the existence of a single value for a given column ID. Another usage scenario involves storage of high cardinality values.

It depends upon the bitmap libraries. It is not thread safe, so upstream concurrency guards must be provided.

func NewBSI added in v0.5.1

func NewBSI(maxValue int64, minValue int64) *BSI

NewBSI constructs a new BSI. Note that it is your responsibility to ensure that the min/max values are set correctly. Queries CompareValue, MinMax, etc. will not work correctly if the min/max values are not set correctly.

func NewDefaultBSI added in v0.5.1

func NewDefaultBSI() *BSI

NewDefaultBSI constructs an auto-sized BSI

func (*BSI) Add added in v0.5.2

func (b *BSI) Add(other *BSI)

Add - In-place sum the contents of another BSI with this BSI, column wise.

func (*BSI) BatchEqual added in v0.5.1

func (b *BSI) BatchEqual(parallelism int, values []int64) *Bitmap

BatchEqual returns a bitmap containing the column IDs where the values are contained within the list of values provided.

func (*BSI) BitCount added in v0.5.1

func (b *BSI) BitCount() int

BitCount returns the number of bits needed to represent values.

func (*BSI) ClearValues added in v0.5.1

func (b *BSI) ClearValues(foundSet *Bitmap)

ClearValues removes the values found in foundSet

func (*BSI) Clone added in v0.5.2

func (b *BSI) Clone() *BSI

Clone performs a deep copy of BSI contents.

func (*BSI) CompareValue added in v0.5.1

func (b *BSI) CompareValue(parallelism int, op Operation, valueOrStart, end int64,
	foundSet *Bitmap) *Bitmap

CompareValue compares value. Values should be in the range of the BSI (max, min). If the value is outside the range, the result might erroneous. The operation parameter indicates the type of comparison to be made. For all operations with the exception of RANGE, the value to be compared is specified by valueOrStart. For the RANGE parameter the comparison criteria is >= valueOrStart and <= end. The parallelism parameter indicates the number of CPU threads to be applied for processing. A value of zero indicates that all available CPU resources will be potentially utilized.

func (*BSI) Equals added in v1.3.0

func (b *BSI) Equals(other *BSI) bool

Equals - Check for semantic equality of two BSIs.

func (*BSI) GetCardinality added in v0.5.1

func (b *BSI) GetCardinality() uint64

GetCardinality returns a count of unique column IDs for which a value has been set.

func (*BSI) GetExistenceBitmap added in v0.5.2

func (b *BSI) GetExistenceBitmap() *Bitmap

GetExistenceBitmap returns a pointer to the underlying existence bitmap of the BSI

func (*BSI) GetSizeInBytes added in v1.4.0

func (b *BSI) GetSizeInBytes() int

GetSizeInBytes - the size in bytes of the data structure

func (*BSI) GetValue added in v0.5.1

func (b *BSI) GetValue(columnID uint64) (value int64, exists bool)

GetValue gets the value at the column ID. Second param will be false for non-existent values.

func (*BSI) HasRunCompression added in v0.5.2

func (b *BSI) HasRunCompression() bool

HasRunCompression returns true if the bitmap benefits from run compression

func (*BSI) Increment added in v0.5.2

func (b *BSI) Increment(foundSet *Bitmap)

Increment - In-place increment of values in a BSI. Found set select columns for incrementing.

func (*BSI) IncrementAll added in v0.5.2

func (b *BSI) IncrementAll()

IncrementAll - In-place increment of all values in a BSI.

func (*BSI) IntersectAndTranspose added in v0.5.1

func (b *BSI) IntersectAndTranspose(parallelism int, foundSet *Bitmap) *Bitmap

IntersectAndTranspose is a matrix transpose function. Return a bitmap such that the values are represented as column IDs in the returned bitmap. This is accomplished by iterating over the foundSet and only including the column IDs in the source (foundSet) as compared with this BSI. This can be useful for vectoring one set of integers to another.

TODO: This implementation is functional but not performant, needs to be re-written perhaps using SIMD SSE2 instructions.

func (*BSI) MarshalBinary added in v0.5.1

func (b *BSI) MarshalBinary() ([][]byte, error)

MarshalBinary serializes a BSI

func (*BSI) MinMax added in v0.8.0

func (b *BSI) MinMax(parallelism int, op Operation, foundSet *Bitmap) int64

MinMax - Find minimum or maximum value.

func (*BSI) NewBSIRetainSet added in v0.5.1

func (b *BSI) NewBSIRetainSet(foundSet *Bitmap) *BSI

NewBSIRetainSet - Construct a new BSI from a clone of existing BSI, retain only values contained in foundSet

func (*BSI) ParOr added in v0.5.1

func (b *BSI) ParOr(parallelism int, bsis ...*BSI)

ParOr is intended primarily to be a concatenation function to be used during bulk load operations. Care should be taken to make sure that columnIDs do not overlap (unless overlapping values are identical).

func (*BSI) ReadFrom added in v1.3.0

func (b *BSI) ReadFrom(stream io.Reader) (p int64, err error)

ReadFrom reads a serialized version of this BSI from stream.

func (*BSI) RunOptimize added in v0.5.2

func (b *BSI) RunOptimize()

RunOptimize attempts to further compress the runs of consecutive values found in the bitmap

func (*BSI) SetValue added in v0.5.1

func (b *BSI) SetValue(columnID uint64, value int64)

SetValue sets a value for a given columnID.

func (*BSI) Sum added in v0.5.1

func (b *BSI) Sum(foundSet *Bitmap) (sum int64, count uint64)

Sum all values contained within the foundSet. As a convenience, the cardinality of the foundSet is also returned (for calculating the average).

func (*BSI) Transpose added in v0.5.1

func (b *BSI) Transpose() *Bitmap

Transpose calls b.IntersectAndTranspose(0, b.eBM)

func (*BSI) TransposeWithCounts added in v0.5.2

func (b *BSI) TransposeWithCounts(parallelism int, foundSet, filterSet *Bitmap) *BSI

TransposeWithCounts is a matrix transpose function that returns a BSI that has a columnID system defined by the values contained within the input BSI. Given that for BSIs, different columnIDs can have the same value. TransposeWithCounts is useful for situations where there is a one-to-many relationship between the vectored integer sets. The resulting BSI contains the number of times a particular value appeared in the input BSI.

func (*BSI) UnmarshalBinary added in v0.5.1

func (b *BSI) UnmarshalBinary(bitData [][]byte) error

UnmarshalBinary de-serialize a BSI. The value at bitData[0] is the EBM. Other indices are in least to most significance order starting at bitData[1] (bit position 0).

func (*BSI) ValueExists added in v0.5.1

func (b *BSI) ValueExists(columnID uint64) bool

ValueExists tests whether the value exists.

func (*BSI) WriteTo added in v1.3.0

func (b *BSI) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes a serialized version of this BSI to stream.

type Bitmap

type Bitmap struct {
	// contains filtered or unexported fields
}

Bitmap represents a compressed bitmap where you can add integers.

func And

func And(x1, x2 *Bitmap) *Bitmap

And computes the intersection between two bitmaps and returns the result

func AndNot

func AndNot(x1, x2 *Bitmap) *Bitmap

AndNot computes the difference between two bitmaps and returns the result

func BitmapOf

func BitmapOf(dat ...uint64) *Bitmap

BitmapOf generates a new bitmap filled with the specified integers

func FastAnd

func FastAnd(bitmaps ...*Bitmap) *Bitmap

FastAnd computes the intersection between many bitmaps quickly Compared to the And function, it can take many bitmaps as input, thus saving the trouble of manually calling "And" many times.

func FastOr

func FastOr(bitmaps ...*Bitmap) *Bitmap

FastOr computes the union between many bitmaps quickly, as opposed to having to call Or repeatedly.

func Flip

func Flip(rb *Bitmap, rangeStart, rangeEnd uint64) *Bitmap

Flip negates the bits in the given range (i.e., [rangeStart,rangeEnd)), any integer present in this range and in the bitmap is removed, and any integer present in the range and not in the bitmap is added, a new bitmap is returned leaving the current bitmap unchanged.

func FlipInt

func FlipInt(bm *Bitmap, rangeStart, rangeEnd int) *Bitmap

FlipInt calls Flip after casting the parameters (convenience method)

func New

func New() *Bitmap

New creates a new empty Bitmap (same as NewBitmap)

func NewBitmap

func NewBitmap() *Bitmap

NewBitmap creates a new empty Bitmap (see also New)

func Or

func Or(x1, x2 *Bitmap) *Bitmap

Or computes the union between two bitmaps and returns the result

func ParOr

func ParOr(parallelism int, bitmaps ...*Bitmap) *Bitmap

ParOr computes the union (OR) of all provided bitmaps in parallel, where the parameter "parallelism" determines how many workers are to be used (if it is set to 0, a default number of workers is chosen)

func Roaring32AsRoaring64 added in v1.8.0

func Roaring32AsRoaring64(bm32 *roaring.Bitmap) *Bitmap

Roaring32AsRoaring64 inserts a 32-bit roaring bitmap into a 64-bit roaring bitmap. No copy is made.

func Xor

func Xor(x1, x2 *Bitmap) *Bitmap

Xor computes the symmetric difference between two bitmaps and returns the result

func (*Bitmap) Add

func (rb *Bitmap) Add(x uint64)

Add the integer x to the bitmap

func (*Bitmap) AddInt

func (rb *Bitmap) AddInt(x int)

AddInt adds the integer x to the bitmap (convenience method: the parameter is casted to uint32 and we call Add)

func (*Bitmap) AddMany

func (rb *Bitmap) AddMany(dat []uint64)

AddMany add all of the values in dat

func (*Bitmap) AddRange

func (rb *Bitmap) AddRange(rangeStart, rangeEnd uint64)

AddRange adds the integers in [rangeStart, rangeEnd) to the bitmap.

func (*Bitmap) And

func (rb *Bitmap) And(x2 *Bitmap)

And computes the intersection between two bitmaps and stores the result in the current bitmap

func (*Bitmap) AndCardinality

func (rb *Bitmap) AndCardinality(x2 *Bitmap) uint64

AndCardinality returns the cardinality of the intersection between two bitmaps, bitmaps are not modified

func (*Bitmap) AndNot

func (rb *Bitmap) AndNot(x2 *Bitmap)

AndNot computes the difference between two bitmaps and stores the result in the current bitmap

func (*Bitmap) CheckedAdd

func (rb *Bitmap) CheckedAdd(x uint64) bool

CheckedAdd adds the integer x to the bitmap and return true if it was added (false if the integer was already present)

func (*Bitmap) CheckedRemove

func (rb *Bitmap) CheckedRemove(x uint64) bool

CheckedRemove removes the integer x from the bitmap and return true if the integer was effectively remove (and false if the integer was not present)

func (*Bitmap) Clear

func (rb *Bitmap) Clear()

Clear resets the Bitmap to be logically empty, but may retain some memory allocations that may speed up future operations

func (*Bitmap) Clone

func (rb *Bitmap) Clone() *Bitmap

Clone creates a copy of the Bitmap

func (*Bitmap) CloneCopyOnWriteContainers

func (rb *Bitmap) CloneCopyOnWriteContainers()

CloneCopyOnWriteContainers clones all containers which have needCopyOnWrite set to true. This can be used to make sure it is safe to munmap a []byte that the roaring array may still have a reference to, after calling FromBuffer. More generally this function is useful if you call FromBuffer to construct a bitmap with a backing array buf and then later discard the buf array. Note that you should call CloneCopyOnWriteContainers on all bitmaps that were derived from the 'FromBuffer' bitmap since they map have dependencies on the buf array as well.

func (*Bitmap) Contains

func (rb *Bitmap) Contains(x uint64) bool

Contains returns true if the integer is contained in the bitmap

func (*Bitmap) ContainsInt

func (rb *Bitmap) ContainsInt(x int) bool

ContainsInt returns true if the integer is contained in the bitmap (this is a convenience method, the parameter is casted to uint64 and Contains is called)

func (*Bitmap) Equals

func (rb *Bitmap) Equals(srb *Bitmap) bool

Equals returns true if the two bitmaps contain the same integers

func (*Bitmap) Flip

func (rb *Bitmap) Flip(rangeStart, rangeEnd uint64)

Flip negates the bits in the given range (i.e., [rangeStart,rangeEnd)), any integer present in this range and in the bitmap is removed, and any integer present in the range and not in the bitmap is added.

func (*Bitmap) FlipInt

func (rb *Bitmap) FlipInt(rangeStart, rangeEnd int)

FlipInt calls Flip after casting the parameters (convenience method)

func (*Bitmap) FromBase64

func (rb *Bitmap) FromBase64(str string) (int64, error)

FromBase64 deserializes a bitmap from Base64

func (*Bitmap) FromUnsafeBytes added in v1.4.0

func (rb *Bitmap) FromUnsafeBytes(data []byte) (p int64, err error)

FromUnsafeBytes reads a serialized version of this bitmap from the byte buffer without copy. It is the caller's responsibility to ensure that the input data is not modified and remains valid for the entire lifetime of this bitmap. This method avoids small allocations but holds references to the input data buffer. It is GC-friendly, but it may consume more memory eventually.

func (*Bitmap) GetCardinality

func (rb *Bitmap) GetCardinality() uint64

GetCardinality returns the number of integers contained in the bitmap

func (*Bitmap) GetCopyOnWrite

func (rb *Bitmap) GetCopyOnWrite() (val bool)

GetCopyOnWrite gets this bitmap's copy-on-write property

func (*Bitmap) GetSerializedSizeInBytes added in v0.6.0

func (rb *Bitmap) GetSerializedSizeInBytes() uint64

GetSerializedSizeInBytes computes the serialized size in bytes of the Bitmap. It should correspond to the number of bytes written when invoking WriteTo. You can expect that this function is much cheaper computationally than WriteTo.

func (*Bitmap) GetSizeInBytes

func (rb *Bitmap) GetSizeInBytes() uint64

GetSizeInBytes estimates the memory usage of the Bitmap. Note that this might differ slightly from the amount of bytes required for persistent storage

func (*Bitmap) HasRunCompression

func (rb *Bitmap) HasRunCompression() bool

HasRunCompression returns true if the bitmap benefits from run compression

func (*Bitmap) Intersects

func (rb *Bitmap) Intersects(x2 *Bitmap) bool

Intersects checks whether two bitmap intersects, bitmaps are not modified

func (*Bitmap) IsEmpty

func (rb *Bitmap) IsEmpty() bool

IsEmpty returns true if the Bitmap is empty (it is faster than doing (GetCardinality() == 0))

func (*Bitmap) Iterator

func (rb *Bitmap) Iterator() IntPeekable64

Iterator creates a new IntPeekable to iterate over the integers contained in the bitmap, in sorted order; the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).

func (*Bitmap) ManyIterator

func (rb *Bitmap) ManyIterator() ManyIntIterable64

ManyIterator creates a new ManyIntIterable to iterate over the integers contained in the bitmap, in sorted order; the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).

func (*Bitmap) MarshalBinary

func (rb *Bitmap) MarshalBinary() ([]byte, error)

MarshalBinary implements the encoding.BinaryMarshaler interface for the bitmap (same as ToBytes)

func (*Bitmap) Maximum

func (rb *Bitmap) Maximum() uint64

Maximum get the largest value stored in this roaring bitmap, assumes that it is not empty

func (*Bitmap) Minimum

func (rb *Bitmap) Minimum() uint64

Minimum get the smallest value stored in this roaring bitmap, assumes that it is not empty

func (*Bitmap) Or

func (rb *Bitmap) Or(x2 *Bitmap)

Or computes the union between two bitmaps and stores the result in the current bitmap

func (*Bitmap) OrCardinality

func (rb *Bitmap) OrCardinality(x2 *Bitmap) uint64

OrCardinality returns the cardinality of the union between two bitmaps, bitmaps are not modified

func (*Bitmap) Rank

func (rb *Bitmap) Rank(x uint64) uint64

Rank returns the number of integers that are smaller or equal to x (Rank(infinity) would be GetCardinality())

func (*Bitmap) ReadFrom

func (rb *Bitmap) ReadFrom(stream io.Reader) (p int64, err error)

ReadFrom reads a serialized version of this bitmap from stream. The format is compatible with other 64-bit RoaringBitmap implementations (Java, Go, C++) and it has a specification : https://github.com/RoaringBitmap/RoaringFormatSpec#extention-for-64-bit-implementations

func (*Bitmap) Remove

func (rb *Bitmap) Remove(x uint64)

Remove the integer x from the bitmap

func (*Bitmap) RemoveRange

func (rb *Bitmap) RemoveRange(rangeStart, rangeEnd uint64)

RemoveRange removes the integers in [rangeStart, rangeEnd) from the bitmap.

func (*Bitmap) ReverseIterator

func (rb *Bitmap) ReverseIterator() IntIterable64

ReverseIterator creates a new IntIterable to iterate over the integers contained in the bitmap, in sorted order; the iterator becomes invalid if the bitmap is modified (e.g., with Add or Remove).

func (*Bitmap) RunOptimize

func (rb *Bitmap) RunOptimize()

RunOptimize attempts to further compress the runs of consecutive values found in the bitmap

func (*Bitmap) Select

func (rb *Bitmap) Select(x uint64) (uint64, error)

Select returns the xth integer in the bitmap

func (*Bitmap) SetCopyOnWrite

func (rb *Bitmap) SetCopyOnWrite(val bool)

SetCopyOnWrite sets this bitmap to use copy-on-write so that copies are fast and memory conscious if the parameter is true, otherwise we leave the default where hard copies are made (copy-on-write requires extra care in a threaded context). Calling SetCopyOnWrite(true) on a bitmap created with FromBuffer is unsafe.

func (*Bitmap) Stats

func (rb *Bitmap) Stats() roaring.Statistics

Stats returns details on container type usage in a Statistics struct.

func (*Bitmap) String

func (rb *Bitmap) String() string

String creates a string representation of the Bitmap

func (*Bitmap) ToArray

func (rb *Bitmap) ToArray() []uint64

ToArray creates a new slice containing all of the integers stored in the Bitmap in sorted order

func (*Bitmap) ToBase64

func (rb *Bitmap) ToBase64() (string, error)

ToBase64 serializes a bitmap as Base64

func (*Bitmap) ToBytes

func (rb *Bitmap) ToBytes() ([]byte, error)

ToBytes returns an array of bytes corresponding to what is written when calling WriteTo

func (*Bitmap) UnmarshalBinary

func (rb *Bitmap) UnmarshalBinary(data []byte) error

UnmarshalBinary implements the encoding.BinaryUnmarshaler interface for the bitmap

func (*Bitmap) WriteTo

func (rb *Bitmap) WriteTo(stream io.Writer) (int64, error)

WriteTo writes a serialized version of this bitmap to stream. The format is compatible with other 64-bit RoaringBitmap implementations (Java, Go, C++) and it has a specification : https://github.com/RoaringBitmap/RoaringFormatSpec#extention-for-64-bit-implementations

func (*Bitmap) Xor

func (rb *Bitmap) Xor(x2 *Bitmap)

Xor computes the symmetric difference between two bitmaps and stores the result in the current bitmap

type IntIterable64

type IntIterable64 interface {
	HasNext() bool
	Next() uint64
}

IntIterable64 allows you to iterate over the values in a Bitmap

type IntPeekable64

type IntPeekable64 interface {
	IntIterable64
	// PeekNext peeks the next value without advancing the iterator
	PeekNext() uint64
	// AdvanceIfNeeded advances as long as the next value is smaller than minval
	AdvanceIfNeeded(minval uint64)
}

IntPeekable64 allows you to look at the next value without advancing and advance as long as the next value is smaller than minval

type ManyIntIterable64

type ManyIntIterable64 interface {
	// pass in a buffer to fill up with values, returns how many values were returned
	NextMany([]uint64) int
}

ManyIntIterable64 allows you to iterate over the values in a Bitmap

type Operation added in v0.5.1

type Operation int

Operation identifier

const (
	// LT less than
	LT Operation = 1 + iota
	// LE less than or equal
	LE
	// EQ equal
	EQ
	// GE greater than or equal
	GE
	// GT greater than
	GT
	// RANGE range
	RANGE
	// MIN find minimum
	MIN
	// MAX find maximum
	MAX
)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL