bitset

package module
v1.16.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 21, 2024 License: BSD-3-Clause Imports: 9 Imported by: 321

README

bitset

Go language library to map between non-negative integers and boolean values

Test Go Report Card PkgGoDev

This library is part of the awesome go collection. It is used in production by several important systems:

Description

Package bitset implements bitsets, a mapping between non-negative integers and boolean values. It should be more efficient than map[uint] bool.

It provides methods for setting, clearing, flipping, and testing individual integers.

But it also provides set intersection, union, difference, complement, and symmetric operations, as well as tests to check whether any, all, or no bits are set, and querying a bitset's current length and number of positive bits.

BitSets are expanded to the size of the largest set bit; the memory allocation is approximately Max bits, where Max is the largest set bit. BitSets are never shrunk. On creation, a hint can be given for the number of bits that will be used.

Many of the methods, including Set, Clear, and Flip, return a BitSet pointer, which allows for chaining.

Example use:
package main

import (
	"fmt"
	"math/rand"

	"github.com/bits-and-blooms/bitset"
)

func main() {
	fmt.Printf("Hello from BitSet!\n")
	var b bitset.BitSet
	// play some Go Fish
	for i := 0; i < 100; i++ {
		card1 := uint(rand.Intn(52))
		card2 := uint(rand.Intn(52))
		b.Set(card1)
		if b.Test(card2) {
			fmt.Println("Go Fish!")
		}
		b.Clear(card1)
	}

	// Chaining
	b.Set(10).Set(11)

	for i, e := b.NextSet(0); e; i, e = b.NextSet(i + 1) {
		fmt.Println("The following bit is set:", i)
	}
	if b.Intersection(bitset.New(100).Set(10)).Count() == 1 {
		fmt.Println("Intersection works.")
	} else {
		fmt.Println("Intersection doesn't work???")
	}
}

Package documentation is at: https://pkg.go.dev/github.com/bits-and-blooms/bitset?tab=doc

Serialization

You may serialize a bitset safely and portably to a stream of bytes as follows:

    const length = 9585
	const oneEvery = 97
	bs := bitset.New(length)
	// Add some bits
	for i := uint(0); i < length; i += oneEvery {
		bs = bs.Set(i)
	}

	var buf bytes.Buffer
	n, err := bs.WriteTo(&buf)
	if err != nil {
		// failure
	}
	// Here n == buf.Len()

You can later deserialize the result as follows:

	// Read back from buf
	bs = bitset.New()
	n, err = bs.ReadFrom(&buf)
	if err != nil {
		// error
	}
	// n is the number of bytes read

The ReadFrom function attempts to read the data into the existing BitSet instance, to minimize memory allocations.

Performance tip: When reading and writing to a file or a network connection, you may get better performance by wrapping your streams with bufio instances.

E.g.,

	f, err := os.Create("myfile")
	w := bufio.NewWriter(f)
	f, err := os.Open("myfile")
	r := bufio.NewReader(f)

Memory Usage

The memory usage of a bitset using N bits is at least N/8 bytes. The number of bits in a bitset is at least as large as one plus the greatest bit index you have accessed. Thus it is possible to run out of memory while using a bitset. If you have lots of bits, you might prefer compressed bitsets, like the Roaring bitmaps and its Go implementation.

The roaring library allows you to go back and forth between compressed Roaring bitmaps and the conventional bitset instances:

			mybitset := roaringbitmap.ToBitSet()
			newroaringbitmap := roaring.FromBitSet(mybitset)
Goroutine safety

In general, it not safe to access the same BitSet using different goroutines--they are unsynchronized for performance. Should you want to access a BitSet from more than one goroutine, you should provide synchronization. Typically this is done by using channels to pass the *BitSet around (in Go style; so there is only ever one owner), or by using sync.Mutex to serialize operations on BitSets.

Implementation Note

Go 1.9 introduced a native math/bits library. We provide backward compatibility to Go 1.7, which might be removed.

It is possible that a later version will match the math/bits return signature for counts (which is int, rather than our library's uint64). If so, the version will be bumped.

Installation

go get github.com/bits-and-blooms/bitset

Contributing

If you wish to contribute to this project, please branch and issue a pull request against master ("GitHub Flow")

Running all tests

Before committing the code, please check if it passes tests, has adequate coverage, etc.

go test
go test -cover

Documentation

Overview

Package bitset implements bitsets, a mapping between non-negative integers and boolean values. It should be more efficient than map[uint] bool.

It provides methods for setting, clearing, flipping, and testing individual integers.

But it also provides set intersection, union, difference, complement, and symmetric operations, as well as tests to check whether any, all, or no bits are set, and querying a bitset's current length and number of positive bits.

BitSets are expanded to the size of the largest set bit; the memory allocation is approximately Max bits, where Max is the largest set bit. BitSets are never shrunk. On creation, a hint can be given for the number of bits that will be used.

Many of the methods, including Set,Clear, and Flip, return a BitSet pointer, which allows for chaining.

Example use:

import "bitset"
var b BitSet
b.Set(10).Set(11)
if b.Test(1000) {
	b.Clear(1000)
}
if B.Intersection(bitset.New(100).Set(10)).Count() > 1 {
	fmt.Println("Intersection works.")
}

As an alternative to BitSets, one should check out the 'big' package, which provides a (less set-theoretical) view of bitsets.

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func Base64StdEncoding added in v1.1.4

func Base64StdEncoding()

Base64StdEncoding Marshal/Unmarshal BitSet with base64.StdEncoding(Default: base64.URLEncoding)

func BigEndian added in v1.14.3

func BigEndian()

BigEndian sets Marshal/Unmarshal Binary as Big Endian (Default: binary.BigEndian)

func BinaryOrder added in v1.14.3

func BinaryOrder() binary.ByteOrder

BinaryOrder returns the current binary order, see also LittleEndian() and BigEndian() to change the order.

func Cap

func Cap() uint

Cap returns the total possible capacity, or number of bits that can be stored in the BitSet theoretically. Under 32-bit system, it is 4294967295 and under 64-bit system, it is 18446744073709551615. Note that this is further limited by the maximum allocation size in Go, and your available memory, as any Go data structure.

func LittleEndian added in v1.1.4

func LittleEndian()

LittleEndian sets Marshal/Unmarshal Binary as Little Endian (Default: binary.BigEndian)

Types

type BitSet

type BitSet struct {
	// contains filtered or unexported fields
}

A BitSet is a set of bits. The zero value of a BitSet is an empty set of length 0.

func From added in v1.1.2

func From(buf []uint64) *BitSet

From is a constructor used to create a BitSet from an array of words

func FromWithLength added in v1.2.2

func FromWithLength(length uint, set []uint64) *BitSet

FromWithLength constructs from an array of words and length in bits. This function is for advanced users, most users should prefer the From function. As a user of FromWithLength, you are responsible for ensuring that the length is correct: your slice should have length at least (length+63)/64 in 64-bit words.

func MustNew added in v1.16.0

func MustNew(length uint) (bset *BitSet)

MustNew creates a new BitSet with the given length bits. It panics if length exceeds the possible capacity or by a lack of memory.

func New

func New(length uint) (bset *BitSet)

New creates a new BitSet with a hint that length bits will be required

func (*BitSet) All

func (b *BitSet) All() bool

All returns true if all bits are set, false otherwise. Returns true for empty sets.

func (*BitSet) Any

func (b *BitSet) Any() bool

Any returns true if any bit is set, false otherwise

func (*BitSet) BinaryStorageSize added in v1.1.2

func (b *BitSet) BinaryStorageSize() int

BinaryStorageSize returns the binary storage requirements (see WriteTo) in bytes.

func (*BitSet) Bytes added in v1.1.2

func (b *BitSet) Bytes() []uint64

Bytes returns the bitset as array of 64-bit words, giving direct access to the internal representation. It is not a copy, so changes to the returned slice will affect the bitset. It is meant for advanced users.

func (*BitSet) Clear

func (b *BitSet) Clear(i uint) *BitSet

Clear bit i to 0. This never cause a memory allocation. It is always safe.

func (*BitSet) ClearAll

func (b *BitSet) ClearAll() *BitSet

ClearAll clears the entire BitSet. It does not free the memory.

func (*BitSet) Clone

func (b *BitSet) Clone() *BitSet

Clone this BitSet

func (*BitSet) Compact added in v1.2.0

func (b *BitSet) Compact() *BitSet

Compact shrinks BitSet to so that we preserve all set bits, while minimizing memory usage. Compact calls Shrink. A new slice is allocated to store the new bits, so you may see an increase in memory usage until the GC runs. Normally this should not be a problem, but if you have an extremely large BitSet its important to understand that the old BitSet will remain in memory until the GC frees it. If you are memory constrained, this function may cause a panic.

func (*BitSet) Complement

func (b *BitSet) Complement() (result *BitSet)

Complement computes the (local) complement of a bitset (up to length bits)

func (*BitSet) Copy

func (b *BitSet) Copy(c *BitSet) (count uint)

Copy into a destination BitSet using the Go array copy semantics: the number of bits copied is the minimum of the number of bits in the current BitSet (Len()) and the destination Bitset. We return the number of bits copied in the destination BitSet.

func (*BitSet) CopyFull added in v1.2.2

func (b *BitSet) CopyFull(c *BitSet)

CopyFull copies into a destination BitSet such that the destination is identical to the source after the operation, allocating memory if necessary.

func (*BitSet) Count

func (b *BitSet) Count() uint

Count (number of set bits). Also known as "popcount" or "population count".

func (*BitSet) DeleteAt added in v1.1.10

func (b *BitSet) DeleteAt(i uint) *BitSet

DeleteAt deletes the bit at the given index position from within the bitset All the bits residing on the left of the deleted bit get shifted right by 1 The running time of this operation may potentially be relatively slow, O(length)

func (*BitSet) Difference

func (b *BitSet) Difference(compare *BitSet) (result *BitSet)

Difference of base set and other set This is the BitSet equivalent of &^ (and not)

func (*BitSet) DifferenceCardinality

func (b *BitSet) DifferenceCardinality(compare *BitSet) uint

DifferenceCardinality computes the cardinality of the differnce

func (*BitSet) DumpAsBits

func (b *BitSet) DumpAsBits() string

DumpAsBits dumps a bit set as a string of bits. Following the usual convention in Go, the least significant bits are printed last (index 0 is at the end of the string). This is useful for debugging and testing. It is not suitable for serialization.

func (*BitSet) Equal

func (b *BitSet) Equal(c *BitSet) bool

Equal tests the equivalence of two BitSets. False if they are of different sizes, otherwise true only if all the same bits are set

func (*BitSet) Flip

func (b *BitSet) Flip(i uint) *BitSet

Flip bit at i. Warning: using a very large value for 'i' may lead to a memory shortage and a panic: the caller is responsible for providing sensible parameters in line with their memory capacity.

func (*BitSet) FlipRange added in v1.2.0

func (b *BitSet) FlipRange(start, end uint) *BitSet

FlipRange bit in [start, end). Warning: using a very large value for 'end' may lead to a memory shortage and a panic: the caller is responsible for providing sensible parameters in line with their memory capacity.

func (*BitSet) GetWord64AtBit added in v1.15.0

func (b *BitSet) GetWord64AtBit(i uint) uint64

GetWord64AtBit retrieves bits i through i+63 as a single uint64 value

func (*BitSet) InPlaceDifference

func (b *BitSet) InPlaceDifference(compare *BitSet)

InPlaceDifference computes the difference of base set and other set This is the BitSet equivalent of &^ (and not)

func (*BitSet) InPlaceIntersection

func (b *BitSet) InPlaceIntersection(compare *BitSet)

InPlaceIntersection destructively computes the intersection of base set and the compare set. This is the BitSet equivalent of & (and)

func (*BitSet) InPlaceSymmetricDifference

func (b *BitSet) InPlaceSymmetricDifference(compare *BitSet)

InPlaceSymmetricDifference creates the destructive SymmetricDifference of base set and other set This is the BitSet equivalent of ^ (xor)

func (*BitSet) InPlaceUnion

func (b *BitSet) InPlaceUnion(compare *BitSet)

InPlaceUnion creates the destructive union of base set and compare set. This is the BitSet equivalent of | (or).

func (*BitSet) InsertAt added in v1.1.10

func (b *BitSet) InsertAt(idx uint) *BitSet

InsertAt takes an index which indicates where a bit should be inserted. Then it shifts all the bits in the set to the left by 1, starting from the given index position, and sets the index position to 0.

Depending on the size of your BitSet, and where you are inserting the new entry, this method could be extremely slow and in some cases might cause the entire BitSet to be recopied.

func (*BitSet) Intersection

func (b *BitSet) Intersection(compare *BitSet) (result *BitSet)

Intersection of base set and other set This is the BitSet equivalent of & (and)

func (*BitSet) IntersectionCardinality

func (b *BitSet) IntersectionCardinality(compare *BitSet) uint

IntersectionCardinality computes the cardinality of the intersection

func (*BitSet) IsStrictSuperSet added in v1.1.2

func (b *BitSet) IsStrictSuperSet(other *BitSet) bool

IsStrictSuperSet returns true if this is a strict superset of the other set

func (*BitSet) IsSuperSet added in v1.1.2

func (b *BitSet) IsSuperSet(other *BitSet) bool

IsSuperSet returns true if this is a superset of the other set

func (*BitSet) Len

func (b *BitSet) Len() uint

Len returns the number of bits in the BitSet. Note that it differ from Count function.

Example
var b BitSet
b.Set(8)
fmt.Println("len", b.Len())
fmt.Println("count", b.Count())
Output:

len 9
count 1

func (*BitSet) MarshalBinary added in v1.1.2

func (b *BitSet) MarshalBinary() ([]byte, error)

MarshalBinary encodes a BitSet into a binary form and returns the result. Please see WriteTo for details.

func (BitSet) MarshalJSON

func (b BitSet) MarshalJSON() ([]byte, error)

MarshalJSON marshals a BitSet as a JSON structure

func (*BitSet) NextClear added in v1.1.2

func (b *BitSet) NextClear(i uint) (uint, bool)

NextClear returns the next clear bit from the specified index, including possibly the current index along with an error code (true = valid, false = no bit found i.e. all bits are set)

func (*BitSet) NextSet

func (b *BitSet) NextSet(i uint) (uint, bool)

NextSet returns the next bit set from the specified index, including possibly the current index along with an error code (true = valid, false = no set bit found) for i,e := v.NextSet(0); e; i,e = v.NextSet(i + 1) {...}

Users concerned with performance may want to use NextSetMany to retrieve several values at once.

func (*BitSet) NextSetMany added in v1.1.4

func (b *BitSet) NextSetMany(i uint, buffer []uint) (uint, []uint)

NextSetMany returns many next bit sets from the specified index, including possibly the current index and up to cap(buffer). If the returned slice has len zero, then no more set bits were found

buffer := make([]uint, 256) // this should be reused
j := uint(0)
j, buffer = bitmap.NextSetMany(j, buffer)
for ; len(buffer) > 0; j, buffer = bitmap.NextSetMany(j,buffer) {
 for k := range buffer {
  do something with buffer[k]
 }
 j += 1
}

It is possible to retrieve all set bits as follow:

indices := make([]uint, bitmap.Count())
bitmap.NextSetMany(0, indices)

However if bitmap.Count() is large, it might be preferable to use several calls to NextSetMany, for performance reasons.

func (*BitSet) None

func (b *BitSet) None() bool

None returns true if no bit is set, false otherwise. Returns true for empty sets.

func (*BitSet) Rank added in v1.13.0

func (b *BitSet) Rank(index uint) uint

Rank returns the nunber of set bits up to and including the index that are set in the bitset. See https://en.wikipedia.org/wiki/Ranking#Ranking_in_statistics

func (*BitSet) ReadFrom added in v1.1.2

func (b *BitSet) ReadFrom(stream io.Reader) (int64, error)

ReadFrom reads a BitSet from a stream written using WriteTo The format is: 1. uint64 length 2. []uint64 set See WriteTo for details. Upon success, the number of bytes read is returned. If the current BitSet is not large enough to hold the data, it is extended. In case of error, the BitSet is either left unchanged or made empty if the error occurs too late to preserve the content.

Performance: if this function is used to read from a disk or network connection, it might be beneficial to wrap the stream in a bufio.Reader. E.g.,

f, err := os.Open("myfile")
r := bufio.NewReader(f)

func (*BitSet) Select added in v1.13.0

func (b *BitSet) Select(index uint) uint

Select returns the index of the jth set bit, where j is the argument. The caller is responsible to ensure that 0 <= j < Count(): when j is out of range, the function returns the length of the bitset (b.length).

Note that this function differs in convention from the Rank function which returns 1 when ranking the smallest value. We follow the conventional textbook definition of Select and Rank.

func (*BitSet) Set

func (b *BitSet) Set(i uint) *BitSet

Set bit i to 1, the capacity of the bitset is automatically increased accordingly. Warning: using a very large value for 'i' may lead to a memory shortage and a panic: the caller is responsible for providing sensible parameters in line with their memory capacity. The memory usage is at least slightly over i/8 bytes.

func (*BitSet) SetAll added in v1.13.0

func (b *BitSet) SetAll() *BitSet

SetAll sets the entire BitSet

func (*BitSet) SetBitsetFrom added in v1.3.0

func (b *BitSet) SetBitsetFrom(buf []uint64)

SetBitsetFrom fills the bitset with an array of integers without creating a new BitSet instance

func (*BitSet) SetTo

func (b *BitSet) SetTo(i uint, value bool) *BitSet

SetTo sets bit i to value. Warning: using a very large value for 'i' may lead to a memory shortage and a panic: the caller is responsible for providing sensible parameters in line with their memory capacity.

func (*BitSet) ShiftLeft added in v1.14.0

func (b *BitSet) ShiftLeft(bits uint)

ShiftLeft shifts the bitset like << operation would do.

Left shift may require bitset size extension. We try to avoid the unnecessary memory operations by detecting the leftmost set bit. The function will panic if shift causes excess of capacity.

func (*BitSet) ShiftRight added in v1.14.0

func (b *BitSet) ShiftRight(bits uint)

ShiftRight shifts the bitset like >> operation would do.

func (*BitSet) Shrink added in v1.1.10

func (b *BitSet) Shrink(lastbitindex uint) *BitSet

Shrink shrinks BitSet so that the provided value is the last possible set value. It clears all bits > the provided index and reduces the size and length of the set.

Note that the parameter value is not the new length in bits: it is the maximal value that can be stored in the bitset after the function call. The new length in bits is the parameter value + 1. Thus it is not possible to use this function to set the length to 0, the minimal value of the length after this function call is 1.

A new slice is allocated to store the new bits, so you may see an increase in memory usage until the GC runs. Normally this should not be a problem, but if you have an extremely large BitSet its important to understand that the old BitSet will remain in memory until the GC frees it. If you are memory constrained, this function may cause a panic.

func (*BitSet) String added in v1.1.2

func (b *BitSet) String() string

String creates a string representation of the BitSet. It is only intended for human-readable output and not for serialization.

func (*BitSet) SymmetricDifference

func (b *BitSet) SymmetricDifference(compare *BitSet) (result *BitSet)

SymmetricDifference of base set and other set This is the BitSet equivalent of ^ (xor)

func (*BitSet) SymmetricDifferenceCardinality

func (b *BitSet) SymmetricDifferenceCardinality(compare *BitSet) uint

SymmetricDifferenceCardinality computes the cardinality of the symmetric difference

func (*BitSet) Test

func (b *BitSet) Test(i uint) bool

Test whether bit i is set.

func (*BitSet) Union

func (b *BitSet) Union(compare *BitSet) (result *BitSet)

Union of base set and other set This is the BitSet equivalent of | (or)

func (*BitSet) UnionCardinality

func (b *BitSet) UnionCardinality(compare *BitSet) uint

UnionCardinality computes the cardinality of the uniton of the base set and the compare set.

func (*BitSet) UnmarshalBinary added in v1.1.2

func (b *BitSet) UnmarshalBinary(data []byte) error

UnmarshalBinary decodes the binary form generated by MarshalBinary. Please see WriteTo for details.

func (*BitSet) UnmarshalJSON

func (b *BitSet) UnmarshalJSON(data []byte) error

UnmarshalJSON unmarshals a BitSet from JSON created using MarshalJSON

func (*BitSet) WriteTo added in v1.1.2

func (b *BitSet) WriteTo(stream io.Writer) (int64, error)

WriteTo writes a BitSet to a stream. The format is: 1. uint64 length 2. []uint64 set The length is the number of bits in the BitSet.

The set is a slice of uint64s containing between length and length + 63 bits. It is interpreted as a big-endian array of uint64s by default (see BinaryOrder()) meaning that the first 8 bits are stored at byte index 7, the next 8 bits are stored at byte index 6... the bits 64 to 71 are stored at byte index 8, etc. If you change the binary order, you need to do so for both reading and writing. We recommend using the default binary order.

Upon success, the number of bytes written is returned.

Performance: if this function is used to write to a disk or network connection, it might be beneficial to wrap the stream in a bufio.Writer. E.g.,

      f, err := os.Create("myfile")
	       w := bufio.NewWriter(f)

type Error added in v1.1.2

type Error string

Error is used to distinguish errors (panics) generated in this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL