Trees

package
v0.6.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 10, 2024 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package Trees implements Size Balanced Tree using arrays in a minimally recursive approach. Elements in the tree is unique. The left most element is the smallest element.

# Size Balanced Tree Size Balanced Tree(SBTree) is a binary tree that balances itself naturally using subtree sizes. It's flatter than other prevalent implementations like RBTree. It's also faster than those implementations as indicated by the benchmark and th original paper. Here its superiority over RBTree is also demonstrated.

# Using Arrays SBTree is quite tricky to implement using pointers Therefore, like the original paper, we use arrays to implement it. However, 1 very common problem of using arrays is indexing. A tree that's good for general use must utilize all indexes in the array, it mustn't contain holes. The most intuitive "left=2^n, right=2^n+1" approach will leave lots of wasted indexes should the tree not be perfect. Maintaining a top index count doesn't have this issue, but, like the above, are prone to creating and forgetting hole indexes when deletion is involved. In this implementation, I employed a way to record the hole indexes without additional memory and can be stored and retried in O(1) time. This enables this implementation to utilize the advantages of using arrays while not worry about wasting indexes. In fact, this implementation is guaranteed to exhaust existing array before growing.

In Go, array are indexed using `int`, so we restrict the indexing type to `uint` at most. This means that you won't be able to use `uint64` as indexes on 32bit machine. This is pretty reasonable.

# Modifications This implementation is slightly different from the original implementation. First, maintaining is split into 2 orientations to make it more intuitive without degrading performance. Second, unlike the original, deletion will sometimes balance the tree afterward. See [base.maintainLeft], Tree.Del for details.

# Minimal Recursions This implementation is minimally recursive. Mutating operations are mostly iterative while readonly operations are usually constant space iterative. Balancing operations, namely [base.maintainLeft] and [base.maintainRight], are recursive because they are hard to express iteratively, but they're amortized O(1). Tree.Add and Tree.Del requires backtracking, thus we've to use a something to store the stack. Fortunately, they both involve very simple backtracking and can be implemented efficiently and easily by using a slice. Also, using the height guarantee of SBTree, we can actually allocate this slice beforehand and reuse it as long as we know how big will the tree be.

For traversing, I provided both a normal stack based traversal and a constant space Morris traversal. This is because Morris traversal is a mutating operation, so only 1 traversal may happen at any time. Also, Morris traversal doesn't support early termination by its nature, so in some cases, normal traversals may still be helpful.

# Concurrency This isn't safe for concurrent usages. Writes shouldn't happen at the same time with other reads or writes. Section below is for those who are interested in details. Generally, you shouldn't worry about it.

Due to the existence of Morris traversal, we can't classify operations into simple reads and writes. We define the followings. Read operations:

  • R0: Reads indexing data, [info.l] and [info.r], in [base.ifsHead].
  • R1: Reads value data as stored in [base.vsHead].
  • R2: Reads balancing data, [info.sz], in [base.ifsHead].

Write Operations:

  • W0: Corresponds to R0.
  • W1: Corresponds to R1.
  • W2: Corresponds to R2.

Multiple reads of any types may happen at the same time. Only 1 write of a single type can happen at the same time. This means that reads and write of different type can happen at the same type. We will classify all public functions using these definitions.

This implementation returns pointers instead of values. Generally, you shouldn't keep track of the pointer and should immediately dereference it for safety. However, it's fine to use the pointers later under some occasions. A W1 operation can invalidate the pointer, so pointers prior to the beginning of a W1 operation should be discarded. Other than this, just make sure you're not writing to any pointers during R1 operations. Writing to the pointer can also break the structure of the tree, leading to undefined behaviors. Unfortunately, Go doesn't have const pointers like C++, so it's up to the users to either not write to the pointer or ensure that the write operation doesn't break its standing in the tree.

# Performance Like any other balanced binary trees, most operations are all O(log(N)) in time while the memory usage is O(N). Section below is for those interested in the constant factors.

Let A0 be the backing [info] array [base.ifsHead], A1 be the backing value array [base.vsHead]. Note that len(A1)=len(A0)-1. len(Ai)<=cap(Ai) and cap(Ai) is linear to len(Ai). Let B be the total number of unique insertions, calls to Tree.Add that returns true. Let C be the total number of elements in the array. Note that C<=len(A1) Let D be the actual average height of the tree, D<=e(C) or e(B) where e is a function that gives the theoretically max height of the tree given C. e(x)~=1.4404*log2(x) disregarding some small constant offset. In practice, this can be estimated easily by `bits.Len(x)*7/5`. We will use these terms when giving a detailed description of performance for each function.

The non-constant term of memory usage for value type T and index type S is sizeof(T)*cap(A1)+sizeof(S)*3*cap(A1). As you may notice, it's extremely compact in terms of memory. [info] has no padding itself, so putting T and [info] each into their own respective array is the most compact format. Also, on 64-bit machine, using uint32 as indexing will mean that each node only takes an additional 12 bytes space, or 1.5 words, this isn't achievable on any pointer based implementations assuming that pointers take 1 full word.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CTree added in v0.5.0

type CTree[T any, S Indexable] struct {

	// Returns negative number if first < second, 0 if first==second, positive number if first>second. see cmp.Compare for an example.
	Cmp func(T, T) int
	// contains filtered or unexported fields
}

CTree is a variant that uses custom comparator. It may also be useful in cases where comparisons is very burdensome (very long strings) because it only compares once.

func FromC added in v0.5.0

func FromC[T any, S Indexable](vs []T, cmp func(T, T) int) *CTree[T, S]

func NewC added in v0.5.0

func NewC[T any, S Indexable](hint S, cmp func(T, T) int) *CTree[T, S]

func (*CTree[T, S]) Add added in v0.5.0

func (u *CTree[T, S]) Add(v T, st []uintptr) (bool, []uintptr)

func (*CTree) Clear added in v0.5.0

func (u *CTree) Clear(zero bool)

Clear the tree, also zeroes the value array's values if zero is true. Doesn't allocate new arrays. Time: O(1) when !zero, O(len(A1)) when zero. Space: O(1). Type: W0, W1, W2.

func (*CTree[T, S]) Clone added in v0.5.0

func (u *CTree[T, S]) Clone() *CTree[T, S]

func (*CTree) Compact added in v0.5.0

func (u *CTree) Compact()

Compact the tree by copying the content to a smaller array and filling the holes if necessary. Time: O(C). Space: sizeof(T)*C+sizeof(S)*3*(C+1). Type: W0, W1, W2.

func (*CTree[T, S]) Del added in v0.5.0

func (u *CTree[T, S]) Del(v T, st []uintptr) (bool, []uintptr)

func (*CTree[T, S]) Get added in v0.5.0

func (u *CTree[T, S]) Get(v T) *T

func (*CTree) InOrder added in v0.5.0

func (u *CTree) InOrder(f func(*T) bool, st []S) []S

InOrder traversal of teh tree. When st==nil, uses morris traversal; otherwise, use normal stack based iterative traversal. Time: O(n). Space: O(1) when using Morris Traversal, O(sizeof(S)*D) when using normal traversal. Type: W0 when Morris Traversal, R0 when normal traversal.

func (*CTree) InOrderR added in v0.5.0

func (u *CTree) InOrderR(f func(*T) bool, st []S) []S

InOrderR is the reverse in order traversal.

func (*CTree[T, S]) Predecessor added in v0.5.0

func (u *CTree[T, S]) Predecessor(v T, strict bool) (p *T)

func (*CTree) RankK added in v0.5.0

func (u *CTree) RankK(k S) *T

RankK element in tree, starting from 0. Time: O(D). Space: O(1). Type: R0, R2.

func (*CTree[T, S]) RankOf added in v0.5.0

func (u *CTree[T, S]) RankOf(v T) (S, bool)

func (*CTree) Size added in v0.5.0

func (u *CTree) Size() S

Size of tree. Type: R2.

func (*CTree[T, S]) Successor added in v0.5.0

func (u *CTree[T, S]) Successor(v T, strict bool) (p *T)

func (*CTree[T, S]) Zero added in v0.5.0

func (u *CTree[T, S]) Zero() (count S)

Zero all the removed elements(holes) in the value array. Time: <=O(len(A1)). Space: O(1). Type: W0, W1.

type Indexable added in v0.5.0

type Indexable interface {
	~byte | ~uint16 | ~uint32 | ~uint

} // Exclude uint64 from being used as indexes.

Indexable are types that can be used as indexes in the tree.

type Tree

type Tree[T cmp.Ordered, S Indexable] struct {
	// contains filtered or unexported fields
}

Tree is a variant that supports only cmp.Ordered as keys.

func From added in v0.5.0

func From[T cmp.Ordered, S Indexable](vs []T) *Tree[T, S]

From a given value array, directly build a tree. The array is handled to the tree, and it mustn't be modified by the caller later. vs must be sorted in ascending order for the tree to not be corrupt. Time: O(C).

func New added in v0.4.0

func New[T cmp.Ordered, S Indexable](hint S) *Tree[T, S]

New Tree that can hold hint number of elements without growing.

func (*Tree[T, S]) Add added in v0.5.0

func (u *Tree[T, S]) Add(v T, st []uintptr) (bool, []uintptr)

Add an element to the tree. Add guarantees that holes are filled first before appending to the underlying arrays. st is the slice used as the recursion stack. Returns whether the element is added and the grown recursion stack. To reuse the slice:

var st []uintptr
for ... {
	_, st = tree.Add(..., st)
}

Reassigning the value is unnecessary if st is allocated to be enough. Time: O(D). Space: O(H). Type: W0, W1, W2.

func (*Tree) Clear added in v0.5.0

func (u *Tree) Clear(zero bool)

Clear the tree, also zeroes the value array's values if zero is true. Doesn't allocate new arrays. Time: O(1) when !zero, O(len(A1)) when zero. Space: O(1). Type: W0, W1, W2.

func (*Tree[T, S]) Clone added in v0.5.0

func (u *Tree[T, S]) Clone() *Tree[T, S]

Clone the tree, making an almost exact copy (up to len(A0) and len(A1)). Time: O(C). Space: O(1) disregarding the new tree. Type: R0, R1, R2.

func (*Tree) Compact added in v0.5.0

func (u *Tree) Compact()

Compact the tree by copying the content to a smaller array and filling the holes if necessary. Time: O(C). Space: sizeof(T)*C+sizeof(S)*3*(C+1). Type: W0, W1, W2.

func (*Tree[T, S]) Del added in v0.5.0

func (u *Tree[T, S]) Del(v T, st []uintptr) (bool, []uintptr)

Del an element from the tree. Del sometimes balances the tree; the chance is inversely proportional to tree's size. Returns the grown stack and whether the element is deleted. Time: O(D). Space: O(H). Type: W0, W1, W2.

func (*Tree[T, S]) Get added in v0.5.0

func (u *Tree[T, S]) Get(v T) *T

Get the pointer to the element that's equal to v in the tree. Time: O(D). Space: O(1). Type: R0, R1..

func (*Tree) InOrder

func (u *Tree) InOrder(f func(*T) bool, st []S) []S

InOrder traversal of teh tree. When st==nil, uses morris traversal; otherwise, use normal stack based iterative traversal. Time: O(n). Space: O(1) when using Morris Traversal, O(sizeof(S)*D) when using normal traversal. Type: W0 when Morris Traversal, R0 when normal traversal.

func (*Tree) InOrderR added in v0.5.0

func (u *Tree) InOrderR(f func(*T) bool, st []S) []S

InOrderR is the reverse in order traversal.

func (*Tree[T, S]) Predecessor

func (u *Tree[T, S]) Predecessor(v T, strict bool) (p *T)

Predecessor of v. If strict is true, result<v if found; otherwise, result<=v. Time: O(D). Space: O(1). Type: R0, R1.

func (*Tree) RankK added in v0.5.0

func (u *Tree) RankK(k S) *T

RankK element in tree, starting from 0. Time: O(D). Space: O(1). Type: R0, R2.

func (*Tree[T, S]) RankOf

func (u *Tree[T, S]) RankOf(v T) (S, bool)

RankOf v, starting from 0. If v isn't found, returns the rank as if v is added to the tree. Time: O(D). Space: O(1). Type: R0, R1, R2.

func (*Tree) Size

func (u *Tree) Size() S

Size of tree. Type: R2.

func (*Tree[T, S]) Successor

func (u *Tree[T, S]) Successor(v T, strict bool) (p *T)

Successor of v. If strict is true, result>v if found; otherwise, result>=v. Time: O(D). Space: O(1). Type: R0, R1.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL