Documentation
¶
Overview ¶
Package gaddag implements the GADDAG, a pretty cool data structure invented by Steven Gordon.
Utility functions for doing cool things with gaddags.
Index ¶
- Constants
- func FindHooks(g *SimpleGaddag, word string, hookType int) []rune
- func FindInnerHook(g *SimpleGaddag, word string, hookType int) bool
- func FindPrefix(g *SimpleGaddag, prefix string) bool
- func FindWord(g *SimpleGaddag, word string) bool
- func FindWordDawg(g *SimpleGaddag, word string) bool
- type SimpleGaddag
- func (g *SimpleGaddag) ArcToIdxLetter(arcIdx uint32) (uint32, alphabet.MachineLetter)
- func (g *SimpleGaddag) GetAlphabet() *alphabet.Alphabet
- func (g *SimpleGaddag) GetLetterSet(nodeIdx uint32) alphabet.LetterSet
- func (g *SimpleGaddag) GetRootNodeIndex() uint32
- func (g *SimpleGaddag) InLetterSet(letter alphabet.MachineLetter, nodeIdx uint32) bool
- func (g *SimpleGaddag) LetterSetAsRunes(nodeIdx uint32) []rune
- func (g *SimpleGaddag) LexiconName() string
- func (g *SimpleGaddag) NextNodeIdx(nodeIdx uint32, letter alphabet.MachineLetter) uint32
- func (g *SimpleGaddag) NumArcs(nodeIdx uint32) byte
Constants ¶
const ( BackHooks = 0 FrontHooks = 1 BackInnerHook = 2 FrontInnerHook = 3 )
Variables ¶
This section is empty.
Functions ¶
func FindInnerHook ¶ added in v0.0.4
func FindInnerHook(g *SimpleGaddag, word string, hookType int) bool
FindInnerHook finds whether the word has a back or front "inner hook", that is, whether if you remove the back or front letter, you can still make a word.
func FindPrefix ¶ added in v0.0.4
func FindPrefix(g *SimpleGaddag, prefix string) bool
FindPrefix returns a boolean indicating if the prefix is in the GADDAG. This function is meant to be exported.
func FindWord ¶ added in v0.0.4
func FindWord(g *SimpleGaddag, word string) bool
func FindWordDawg ¶ added in v0.0.5
func FindWordDawg(g *SimpleGaddag, word string) bool
Finds a word in a SimpleGaddag that is a DAWG. The word can just be found verbatim.
Types ¶
type SimpleGaddag ¶ added in v0.0.3
type SimpleGaddag struct { // Nodes is just a slice of 32-bit elements, the node array. Nodes []uint32 // The bit-mask letter sets LetterSets []alphabet.LetterSet // contains filtered or unexported fields }
SimpleGaddag is the result of loading the gaddag back into memory. Rather than contain an entire tree of linked nodes, arcs, etc it will be easier and faster to do bitwise operations on a 32-bit array. A SimpleGaddag.Nodes is just a slice of 32-bit elements. It is created by serializeElements in make_gaddag.go. File Schema: [4-byte magic number] [1-byte length (LX_LEN)] [utf-8 encoded bytes with lexicon name, length of bytes being LX_LEN] [alphabetlength] [letters...] (up to 60+?) [lettersetlength] [lettersets] (64-bit binary bit masks) a set of [node] [arcs...] Where node is a 32-bit number: LetterSetIdx + (NumArcs << NumArcsBitLoc) Each arc is a 32-bit number: (letter << LetterBitLoc) + index of next node, where letter is an index from 0 to MaxAlphabetSize into alphabet (except for MaxAlphabetSize, which is the SeparationToken), and the index of the node is the index of the element in the SimpleGaddag.Nodes array.
If the node has no arcs, the arc array is empty.
func LoadGaddag ¶ added in v0.0.2
func LoadGaddag(filename string) *SimpleGaddag
LoadGaddag loads a gaddag from a file and returns the slice of nodes.
func (*SimpleGaddag) ArcToIdxLetter ¶ added in v0.0.3
func (g *SimpleGaddag) ArcToIdxLetter(arcIdx uint32) (uint32, alphabet.MachineLetter)
ArcToIdxLetter finds the index of the node pointed to by this arc and returns it and the letter.
func (*SimpleGaddag) GetAlphabet ¶ added in v0.0.5
func (g *SimpleGaddag) GetAlphabet() *alphabet.Alphabet
GetAlphabet returns the alphabet for this gaddag.
func (*SimpleGaddag) GetLetterSet ¶ added in v0.0.4
func (g *SimpleGaddag) GetLetterSet(nodeIdx uint32) alphabet.LetterSet
GetLetterSet gets the letter set of the node at nodeIdx.
func (*SimpleGaddag) GetRootNodeIndex ¶ added in v0.0.4
func (g *SimpleGaddag) GetRootNodeIndex() uint32
GetRootNodeIndex gets the index of the root node.
func (*SimpleGaddag) InLetterSet ¶ added in v0.0.4
func (g *SimpleGaddag) InLetterSet(letter alphabet.MachineLetter, nodeIdx uint32) bool
InLetterSet returns whether the `letter` is in the node at `nodeIdx`'s letter set.
func (*SimpleGaddag) LetterSetAsRunes ¶ added in v0.0.4
func (g *SimpleGaddag) LetterSetAsRunes(nodeIdx uint32) []rune
LetterSetAsRunes returns the letter set of the node at `nodeIdx` as a slice of runes.
func (*SimpleGaddag) LexiconName ¶ added in v0.2.0
func (g *SimpleGaddag) LexiconName() string
LexiconName returns the name of the lexicon. The name is not encoded in the gaddag itself, but in the name of the file. Hopefully this isn't a big problem.
func (*SimpleGaddag) NextNodeIdx ¶ added in v0.2.0
func (g *SimpleGaddag) NextNodeIdx(nodeIdx uint32, letter alphabet.MachineLetter) uint32
NextNodeIdx is analogous to NextArc in the Gordon paper. The main difference is that in Gordon, the initial state is an arc pointing to the first node. In our implementation of the GADDAG, the initial state is that first node. So we have to think in terms of the node that was pointed to, rather than the pointing arc. There is something slightly wrong with the paper as it does not seem possible to implement in exactly Gordon's way without running into issues. (See my notes in my `ujamaa` repo in gaddag.h)
func (*SimpleGaddag) NumArcs ¶ added in v0.0.5
func (g *SimpleGaddag) NumArcs(nodeIdx uint32) byte
NumArcs is simply the number of arcs for the given node.