trie

package module
v0.0.0-...-7ffb9d4 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 26, 2023 License: MIT Imports: 6 Imported by: 2

README

trie

A Trie (Prefix Index) implementation in golang. It works fine with all unicode characters.

Documentation can be found at godoc.org.

Entries are reference counted: If you Add("foo") twice and Del("foo") it once it will still be found.

Build Status

Example

t := trie.NewTrie()
t.Add("foo")
t.Add("bar")
t.PrintDump()

// output:
//  I:f (-)
// - V:oo (1)
// --- $
//  I:b (-)
// - V:ar (1)
// --- $

t.Add("foo")
t.PrintDump()

// output:
//  I:f (-)
// - V:oo (2)
// --- $
//  I:b (-)
// - V:ar (1)
// --- $

fmt.Println(t.Has("foo"))
// output: true

fmt.Println(t.HasCount("foo"))
// output: true 2

fmt.Println(t.Has("foobar"))
// output: false

fmt.Println(t.Members())
// output: [foo(2) bar(1)]

t.Add("food")
t.Add("foobar")
t.Add("foot")
fmt.Println(t.HasPrefix("foo"))
// output: true

fmt.Println(t.PrefixMembers("foo"))
// output: [foo(2) food(1) foobar(1) foot(1)]

A Trie can be dumped into a file with

t.DumpToFile("/tmp/trie_foo")

And loaded with

t2, _ := trie.LoadFromFile("/tmp/trie_foo")
fmt.Println(t2.Members())
// output: [foo(2) food(1) foobar(1) foot(1) bar(1)]

An existing Trie can be merged with a stored one with

t3 := trie.NewTrie()
t3.Add("フー")
t3.Add("バー")
t3.Add("日本語")
fmt.Println(t3.Members())
// output: [フー(1) バー(1) 日本語(1)]

t3.MergeFromFile("/tmp/trie_foo")
fmt.Println(t3.Members())
// output: [フー(1) バー(1) 日本語(1) foo(2) food(1) foobar(1) foot(1) bar(1)]

Documentation

Overview

Trie is a prefix index package for golang.

Terminology used

Trie - Contains the Root of the Index - which is a Branch

Branch - A Branch might have a LeafValue or not and might have other Branches splitting off. It has a flag `End` that marks the end of a term that has been inserted.

Entry - an entry refers to a _complete_ term that is inserted, removed from, or matched in the index. It requires `End` on the Branch to be set to `true`, which makes it different from a

Prefix - which does not require the Branch to have End set to `true` to match.

Index

Constants

View Source
const PADDING_CHAR = "-"

Variables

This section is empty.

Functions

This section is empty.

Types

type Branch

type Branch struct {
	sync.RWMutex
	Branches  map[byte]*Branch
	LeafValue []byte
	End       bool
	Count     int64
}

func (*Branch) Dump

func (b *Branch) Dump(depth int) (out string)

func (*Branch) NewBranch

func (b *Branch) NewBranch() *Branch

NewBranch returns a new initialezed *Branch

func (*Branch) PrintDump

func (b *Branch) PrintDump()

func (*Branch) String

func (b *Branch) String() string

type MemberInfo

type MemberInfo struct {
	Value string
	Count int64
}

func (*MemberInfo) String

func (m *MemberInfo) String() string

type Trie

type Trie struct {
	Root *Branch
}

func LoadFromFile

func LoadFromFile(fname string) (*Trie, error)

LoadFromFile loads a gib encoded wordlist from a file and creates a new Trie by Add()ing all of them.

func NewTrie

func NewTrie() *Trie

NewTrie returns the pointer to a new Trie with an initialized root Branch

func (*Trie) Add

func (t *Trie) Add(entry string) *Branch

Add adds an entry to the trie and returns the branch node that the insertion was made at - or rather where the end of the entry was marked.

func (*Trie) Delete

func (t *Trie) Delete(entry string) bool

Delete decrements the count of an existing entry by one. If the count equals zero it removes an the entry from the trie. Returns true if the entry existed, false otherwise. Note that the return value says something about the previous existence of the entry - not whether it has been completely removed or just its count decremented.

func (*Trie) Dump

func (t *Trie) Dump() string

Dump returns a string representation of the `Trie`

func (*Trie) DumpToFile

func (t *Trie) DumpToFile(fname string) error

DumpToFile dumps all values into a slice of strings and writes that to a file using encoding/gob.

The Trie itself can currently not be encoded directly because gob does not directly support structs with a sync.Mutex on them.

func (*Trie) GetBranch

func (t *Trie) GetBranch(entry string) *Branch

GetBranch returns the branch end if the `entry` exists in the `Trie`

func (*Trie) Has

func (t *Trie) Has(entry string) bool

Has returns true if the `entry` exists in the `Trie`

func (*Trie) HasCount

func (t *Trie) HasCount(entry string) (exists bool, count int64)

HasCount returns true if the `entry` exists in the `Trie`. The second returned value is the count how often the entry has been set.

func (*Trie) HasPrefix

func (t *Trie) HasPrefix(prefix string) bool

HasPrefix returns true if the the `Trie` contains entries with the given prefix

func (*Trie) HasPrefixCount

func (t *Trie) HasPrefixCount(prefix string) (exists bool, count int64)

HasPrefixCount returns true if the the `Trie` contains entries with the given prefix. The second returned value is the count how often the entry has been set.

func (*Trie) Members

func (t *Trie) Members() []*MemberInfo

Members returns all entries of the Trie with their counts as MemberInfo

func (*Trie) MembersList

func (t *Trie) MembersList() (members []string)

Members returns a Slice of all entries of the Trie

func (*Trie) MergeFromFile

func (t *Trie) MergeFromFile(fname string) error

MergeFromFile loads a gib encoded wordlist from a file and Add() them to the `Trie`.

TODO: write tests for merge

func (*Trie) PrefixMembers

func (t *Trie) PrefixMembers(prefix string) []*MemberInfo

PrefixMembers returns all entries of the Trie that have the given prefix with their counts as MemberInfo

func (*Trie) PrefixMembersList

func (t *Trie) PrefixMembersList(prefix string) (members []string)

PrefixMembers returns a List of all entries of the Trie that have the given prefix

func (*Trie) PrintDump

func (t *Trie) PrintDump()

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL