Documentation ¶
Overview ¶
Package keywordmap provides a trie-based implemetation of a map from strings to indices. Its intended use is mapping strings to indices into a list of keywords. Strings are considered as sequences of bytes.
For typical keyword sets, membership tests using keywordmap are about 1.5 – 2 times faster than tests using map[string]int (as benchmarked on an M1 MacBook Air).
Internally, keywordmap constructs a compact trie backed by an array of unsigned integers. The default (and recommended) Trie type uses 16-bit integers. This is sufficient for realistically sized sets of keywords. You may use GenericTrie[uint32] for larger tries. However, this package is not optimized for dealing with large sets of keywords.
Index ¶
- func AddToTrie[T ByteIndexable, I constraints.Unsigned](trie *GenericTrie[I], word T, wordIndex int) bool
- func GetBackingSlice[I constraints.Unsigned](trie GenericTrie[I]) []I
- func KeywordIndex[T ByteIndexable, I constraints.Unsigned](trie GenericTrie[I], word T) int
- type ByteIndexable
- type GenericTrie
- type Trie
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AddToTrie ¶
func AddToTrie[T ByteIndexable, I constraints.Unsigned](trie *GenericTrie[I], word T, wordIndex int) bool
AddToTrie adds word to the trie and associates it with the index wordIndex. It returns true if the word was successfully added to the trie, or false otherwise. A word can fail to be added to the trie if the trie becomes too big. In the case where AddToTrie returns false, part of the word may have been added to the trie.
It is usually better to construct tries using MakeTrie. AddToTrie is useful if there are gaps in the sequence of indices associated with each keyword.
func GetBackingSlice ¶
func GetBackingSlice[I constraints.Unsigned](trie GenericTrie[I]) []I
GetBackingSlice returns a copy of the trie's backing slice. This value (or another slice with the same contents) can be passed to MakeTrieFromBackingSlice. The return value serves no other purpose and has no defined interpretation.
func KeywordIndex ¶
func KeywordIndex[T ByteIndexable, I constraints.Unsigned](trie GenericTrie[I], word T) int
KeywordIndex returns the index of word in the list of keywords passed to MakeTrie/MakeGenericTrie, or -1 if it is not present.
Types ¶
type ByteIndexable ¶
ByteIndexable is a string or byte slice
type GenericTrie ¶
type GenericTrie[I constraints.Unsigned] struct { // contains filtered or unexported fields }
GenericTrie represents a set of strings (such as the set of a programming language's keywords). It should be constructed only via MakeGenericTrie (or MakeTrie when I=uint16). The I type parameter specifies the types of the elements of the backing array.
func MakeEmptyTrie ¶
func MakeEmptyTrie[I constraints.Unsigned]() GenericTrie[I]
MakeEmptyTrie returns an empty trie.
func MakeGenericTrie ¶
func MakeGenericTrie[I constraints.Unsigned, T ByteIndexable](keywords []T) (GenericTrie[I], bool)
MakeGenericTrie constructs a trie from a set of keywords. Each keyword is considered as a sequence of bytes. If your keywords have multiple possible encodings, you will need to add each encoding to the trie. The second return value is true if a suitable trie could be constructed, or false otherwise. In the latter case, the returned trie is empty. A trie can fail to be constructed if the set of keywords is too large.
func MakeTrieFromBackingSlice ¶
func MakeTrieFromBackingSlice[I constraints.Unsigned](slice []I) GenericTrie[I]
MakeTrieFromBackingSlice can be used for minimal-cost construction of a trie. Follow these steps:
- Construct your desired trie in some scratch code using MakeTrie/MakeGenericTrie.
- Use GetBackingSlice to get the contents of the backing slice.
- Copy the contents of the backing slice into your code as a constant.
- Use this constant as the argument to MakeTrieFromBackingSlice.
It should rarely (if ever) be necessary to initialize a trie using this function, as MakeTrie/MakeGenericTrie are not at all expensive.
type Trie ¶
type Trie = GenericTrie[uint16]
Trie is the recommended instantiation of GenericTrie. A backing array of uint16 suffices for sets of keywords with no more than a few hundred members.
func MakeTrie ¶
func MakeTrie[T ByteIndexable](keywords []T) (Trie, bool)
MakeTrie calls MakeGenericTrie with the I type parameter set to uint16 (the recommended default).