bptree

package
v0.0.0-...-932458b Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 19, 2023 License: GPL-3.0 Imports: 10 Imported by: 0

README

B+ Tree

Package bptree implements an on-disk B+ tree. This implementation of B+ tree can store keys of variable size (but limited by configured maxKeySize) and uint64 values. Since this implementation is meant to act as an indexing scheme for the Kiwi store, the uint64 value here is meant to store the offset/record id of the actual data stored in a data-file.

Implementation

Each node in the tree is mapped to exactly one page or block (block size must be multiple of 4096) in the underlying file. Degree of the tree is computed based on the page size and key size. Degree might vary slightly between internal nodes and leaf nodes since internal nodes store only keys and pointers to child nodes whereas leaf nodes store key, value and pointers to right and left siblings for range scans. But the B+ tree invariants will be ensured.

First page is reserved for metadata about the tree which includes the page size used to initialize, maximum key size, free list etc.

Page Layouts
  • Meta page:

    --- header section ---
    magic   (2 bytes) - a constant magic marker
    version (1 byte) - version of the implementation
    flags   (1 byte) - control flags if any (unused)
    keySz   (2 byte) - max key size allowed (i.e., upto 2^16-1)
    pageSz  (4 byte) - page size used to init index
    size    (4 byte) - number of entries in the tree
    rootID  (4 byte) - pointer to the root node
    freeSz  (4 byte) - size of the free list (allocated but unused page ids)
    ---- header ends -----
    freeId1 (4 byte) - pointer to a free page
    ...
    
  • Leaf Node:

    --- header section ---
    flags  (1 byte)   - flags to indicate leaf/internal etc.
    count  (2 bytes)  - number of entries in this node
    next   (4 bytes)  - pointer to right sibling
    prev   (4 bytes)  - pointer to left sibling
    ---- header ends ----
    value1 (8 bytes)  - value associated with the first key
    key1Sz (2 bytes)  - size of the first key
    key1   (variable) - first key itself
    value2 (8 bytes)  - value associated with the second key
    key2Sz (2 bytes)  - size of the second key
    key2   (variable) - second key itself
    ...
    
  • Internal Node:

    --- header section ---
    flags  (1 byte)   - flags to indicate leaf/internal etc.
    count  (2 bytes)  - number of entries in this node
    ---- header ends ----
    P0     (4 bytes)  - pointer to the 0th child
    key1Sz (2 bytes)  - size of key 1
    key1   (variable) - key 1 itself
    P1     (4 bytes)  - pointer to the 1st child
    key2Sz (2 bytes)  - size of key 2
    key2   (variable) - key 2 itself
    ...
    

Limitations

  1. Key size is bounded by the configured max key size which itself is bounded by page size since these 2 together calculate the branching factor (or degree) of the tree and increasing the key size reduces the degree.
  2. There is no compaction implemented for the index file after too many deletions cause lot of free pages. Free pages are simply discarded with a warning log if the IDs don't fit in the meta page. Simple solution for this is to do a range-scan and re-create a new index file and delete the old one.

Benchmarks

Output of very simple benchmarks (found in tests/ directory):

    TestBPlusTree: bptree_test.go:16: using file 'kiwi_bptree.idx'...
    TestBPlusTree: bptree_test.go:38: took 45.007031ms to Put 10000 entries
    TestBPlusTree: bptree_test.go:44: took 49.01µs to Scan 10000 entries
    TestBPlusTree: bptree_test.go:50: took 1.874451ms to Get 10000 entries

Documentation

Overview

Package bptree implements an on-disk B+ tree indexing scheme that can store key-value pairs and provide fast lookups and range scans. keys can be blobs binary data and value is uint64.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Print

func Print(tree *BPlusTree) error

Print traverses the entire tree and pretty prints it. This should be used for debugging only.

Types

type BPlusTree

type BPlusTree struct {
	// contains filtered or unexported fields
}

BPlusTree represents an on-disk B+ tree. Each node in the tree is mapped to a single page in the file. Degree of the tree is decided based on the page size and max key size while initializing.

func Open

func Open(fileName string, opts *Options) (*BPlusTree, error)

Open opens the named file as a B+ tree index file and returns an instance B+ tree for use. Use ":memory:" for an in-memory B+ tree instance for quick testing setup. Degree of the tree is computed based on maxKeySize and pageSize used by the pager. If nil options are provided, defaultOptions will be used.

func (*BPlusTree) Close

func (tree *BPlusTree) Close() error

Close flushes any writes and closes the underlying pager.

func (*BPlusTree) Del

func (tree *BPlusTree) Del(key []byte) (uint64, error)

Del removes the key-value entry from the B+ tree. If the key does not exist, returns error.

func (*BPlusTree) Get

func (tree *BPlusTree) Get(key []byte) (uint64, error)

Get fetches the value associated with the given key. Returns error if key not found.

func (*BPlusTree) Put

func (tree *BPlusTree) Put(key []byte, val uint64) error

Put puts the key-value pair into the B+ tree. If the key already exists, its value will be updated.

func (*BPlusTree) Scan

func (tree *BPlusTree) Scan(key []byte, reverse bool, scanFn func(key []byte, v uint64) bool) error

Scan performs an index scan starting at the given key. Each entry will be passed to the scanFn. If the key is zero valued (nil or len=0), then the left/right leaf key will be used as the starting key. Scan continues until the right most leaf node is reached or the scanFn returns 'true' indicating to stop the scan. If reverse=true, scan starts at the right most node and executes in descending order of keys.

func (*BPlusTree) Size

func (tree *BPlusTree) Size() int64

Size returns the number of entries in the entire tree

func (*BPlusTree) String

func (tree *BPlusTree) String() string

type Options

type Options struct {
	// ReadOnly mode for index. All mutating operations will return
	// error in this mode.
	ReadOnly bool

	// FileMode for creating the file. Applicable only if when a new
	// index file is being initialized.
	FileMode os.FileMode

	// PageSize to be for file I/O. All reads and writes will always
	// be done with pages of this size. Must be multiple of 4096.
	PageSize int

	// MaxKeySize represents the maximum size allowed for the key.
	// Put call with keys larger than this will result in error.
	// Branching factor reduces as this size increases. So smaller
	// the better.
	MaxKeySize int

	// PreAlloc can be set to enable pre-allocating pages when the
	// index is initialized. This helps avoid mmap/unmap and truncate
	// overheads during insertions.
	PreAlloc int
}

Options represents the configuration options for the B+ tree index.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL