package match

v0.24.8
Published: Aug 18, 2024 License: BSD-3-Clause Imports: 13 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DebugStringify

func DebugStringify(ts []TokenValue) (ret string)

turn all of the passed tokens into a helpful string representation

func Equals

func Equals(ts []TokenValue, ws Span) (okay bool)

func FindExactMatch

func FindExactMatch(ts []TokenValue, spans []Span) (ret int)

search for a span in a list of spans; return the index of the span that matched.

func HasPrefix

func HasPrefix(ts []TokenValue, prefix []Word) (okay bool)

func Hash

func Hash(s string) uint64

func JoinWords

func JoinWords(ws []Word) string

func NewTokenizer

func NewTokenizer(n Notifier) charm.State

func NormalizeAll

func NormalizeAll(ts []TokenValue) (ret string, err error)

same as Normalize, but returns an error if not all of the tokens were consumed.

func NormalizeTokens added in v0.24.8

func NormalizeTokens(ts []TokenValue) (ret string, width int)

turn a series of string tokens into a normalized string; returns the number of string tokens consumed. somewhat dubious because it mimics inflect.Normalize without calling it.

func Stringify

func Stringify(ts []TokenValue) (ret string, width int)

turn a series of string tokens into a space padded string; returns the number of string tokens consumed.
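
The behavior described can be sketched in isolation. This is not the package's code: the `token` struct below is a stand-in for TokenValue, reduced to the two fields the sketch needs.

```go
package main

import (
	"fmt"
	"strings"
)

// token is an illustrative stand-in for match.TokenValue.
type token struct {
	isString bool
	value    string
}

// stringify sketches the documented behavior: join the leading run of
// string tokens with spaces, stopping at the first non-string token,
// and report how many tokens were consumed.
func stringify(ts []token) (string, int) {
	var parts []string
	for _, t := range ts {
		if !t.isString {
			break
		}
		parts = append(parts, t.value)
	}
	return strings.Join(parts, " "), len(parts)
}

func main() {
	ts := []token{{true, "small"}, {true, "lamp"}, {false, ","}, {true, "lit"}}
	str, width := stringify(ts)
	fmt.Println(str, width) // small lamp 2
}
```

Note that the width lets a caller resume scanning at the token that ended the run (the comma above).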

func StripArticle

func StripArticle(str string) (ret string)

return the name after removing any leading articles; eats any errors it encounters and returns the original name.
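
A minimal sketch of that behavior, written independently of the package. The particular article set ("a", "an", "the") is an assumption here; the real set lives inside the package (see FindCommonArticles).

```go
package main

import (
	"fmt"
	"strings"
)

// stripArticle returns the name after removing a single leading article,
// or the original string when no article is present.
// the articles "a", "an", "the" are assumptions for illustration.
func stripArticle(str string) string {
	if first, rest, ok := strings.Cut(str, " "); ok {
		switch strings.ToLower(first) {
		case "a", "an", "the":
			return rest
		}
	}
	return str
}

func main() {
	fmt.Println(stripArticle("the lantern")) // lantern
	fmt.Println(stripArticle("lantern"))     // lantern (unchanged)
}
```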

Types

type AfterDocument

type AfterDocument func(AsyncDoc) charm.State

handle the parsed document. the document data also includes the unprocessed content which ended the document. ( ex. deindentation )

type AsyncDoc added in v0.24.8

type AsyncDoc struct {
	// the final document ( or error if file.ReadTellRunes failed )
	Content any
	// contains filtered or unexported fields
}

reads a document via channels ( which allows reading a (sub) document to become a state in a larger document )

func (AsyncDoc) ParseUnhandledContent added in v0.24.8

func (doc AsyncDoc) ParseUnhandledContent(n charm.State) charm.State

Sub-documents are defined by their indentation level. And, on each new line they have to collect enough whitespace to determine whether the line is part of their content. If the line has a lesser indent, the doc ends, but it still has the whitespace it collected (which the parent doc needs.) ParseUnhandledContent() sends that whitespace to the passed state.
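
The indentation rule described above can be sketched without the package. This is a simplified illustration, not the real parser: it only counts leading spaces and compares against a fixed sub-document indent.

```go
package main

import "fmt"

// lineIndent counts the leading spaces a sub-document would collect
// at the start of each new line before deciding ownership.
func lineIndent(line string) int {
	n := 0
	for n < len(line) && line[n] == ' ' {
		n++
	}
	return n
}

// belongsTo sketches the rule: a line is part of a sub-document only
// if its indentation is at least the document's own. a lesser indent
// ends the document, and the whitespace already collected is what
// ParseUnhandledContent would hand back to the parent state.
func belongsTo(line string, docIndent int) bool {
	return lineIndent(line) >= docIndent
}

func main() {
	fmt.Println(belongsTo("    value", 4)) // true: still inside the sub-doc
	fmt.Println(belongsTo("  next:", 4))   // false: de-indent ends the sub-doc
}
```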

type Collector

type Collector struct {
	Tokens []TokenValue
	// Lines is filled from Tokens on every new line.
	// it's empty if BreakLines is false.
	// Tokens can have values with trailing assignments;
	// ie. ':' isn't considered an end of line here.
	// tbd: it might be nice to change this so that only Lines *or* Tokens is valid.
	Lines        [][]TokenValue
	KeepComments bool
	BreakLines   bool
	LineOffset   int
}

implements Notifier to accumulate tokens from the parser

func (*Collector) Decoded

func (at *Collector) Decoded(tv TokenValue) error

func (*Collector) TokenizeString added in v0.24.8

func (c *Collector) TokenizeString(str string) (err error)

LineOffset adjusts the positions reported in the parsed tokens.

type Notifier

type Notifier interface {
	Decoded(TokenValue) error
}

callback for when a new token exists. tbd: maybe a channel instead?

type Pos

type Pos struct{ X, Y int }

position of a token

func (Pos) String

func (p Pos) String() string

type Span

type Span []Word

Span - implements Match for a chain of individual words.

func FindCommonArticles

func FindCommonArticles(ts []TokenValue) (ret Span, width int)

for now, the common articles are a fixed set. when the author specifies some particular indefinite article for a noun, that article only gets used for printing the noun; it doesn't enhance the parsing of the story. ( it would take some work to lightly hold the relation between a name and an article, then parse a sentence matching names to nouns. fwiw: the articles in inform also seem to be predetermined in this way. )

func PanicSpan

func PanicSpan(s string) (ret Span)

transform a space separated string into a slice of hashes
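
The transform can be sketched as a split-and-hash over the words. The actual hash used by the package is not documented here, so FNV-1a is an assumption standing in for match.Hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hashWord stands in for the package's Hash function;
// FNV-1a is an assumption for illustration only.
func hashWord(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// panicSpan sketches the documented transform: split a space separated
// string into words and hash each one.
func panicSpan(s string) []uint64 {
	fields := strings.Fields(s)
	out := make([]uint64, len(fields))
	for i, w := range fields {
		out[i] = hashWord(w)
	}
	return out
}

func main() {
	span := panicSpan("oh hello")
	fmt.Println(len(span)) // 2
}
```

Hashing each word up front is what lets Span comparisons (Equals, HasPrefix) run as integer comparisons rather than string comparisons.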

func (Span) String

func (s Span) String() string

type SpanList

type SpanList []Span

func PanicSpans

func PanicSpans(strs ...string) (out SpanList)

func (SpanList) FindExactMatch

func (ws SpanList) FindExactMatch(ts []TokenValue) (ret Span, width int)

func (SpanList) FindPrefix

func (ws SpanList) FindPrefix(words []TokenValue) (ret Span, width int)

the same as FindPrefixIndex, only it returns the matching Span instead of an index.

func (SpanList) FindPrefixIndex

func (ws SpanList) FindPrefixIndex(words []TokenValue) (retWhich int, retWidth int)

see if anything in our span list starts the passed words. for instance, if the span list contains the span "oh hello", then the words "oh hello world" will match. returns the index and length of the longest matching prefix.
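
The longest-prefix search can be sketched with plain strings in place of hashed words. This is an independent illustration of the described semantics, not the package's code; in particular, returning -1 when nothing matches is an assumption.

```go
package main

import "fmt"

// findPrefixIndex returns the index of the span that forms the longest
// prefix of the passed words, along with that span's length.
// returns -1 when no span matches (an assumption for this sketch).
func findPrefixIndex(spans [][]string, words []string) (which, width int) {
	which = -1
	for i, span := range spans {
		// only consider spans that fit and beat the best match so far.
		if len(span) > len(words) || len(span) <= width {
			continue
		}
		match := true
		for j, w := range span {
			if words[j] != w {
				match = false
				break
			}
		}
		if match {
			which, width = i, len(span)
		}
	}
	return
}

func main() {
	spans := [][]string{{"oh"}, {"oh", "hello"}}
	which, width := findPrefixIndex(spans, []string{"oh", "hello", "world"})
	fmt.Println(which, width) // 1 2
}
```

With the list above, the shorter span "oh" also matches, but "oh hello" wins because the search keeps the longest prefix.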

type Token

type Token int
const (
	Invalid       Token = iota // placeholder, not generated by the tokenizer
	Comma                      // a comma
	Comment                    // ex. `# something`, minus the hash
	Parenthetical              // ex. `( something )`, minus parens
	Quoted                     // ex. `"something"`, minus the quotes
	Stop                       // full stop or other terminal
	String                     // delimited by spaces and other special runes
	Tell                       // tell subdoc
)

types of tokens

func (Token) String

func (i Token) String() string

type TokenValue

type TokenValue struct {
	Token Token
	Pos   Pos
	Value any  // a string, except for Tell subdocuments
	First bool // helper to know if this is the first token of a sentence
}

func TokenizeString added in v0.24.8

func TokenizeString(str string) (ret []TokenValue, err error)

uses Collector to turn the passed string into a slice of tokens. by default, throws out all comments and merges newlines.

func (TokenValue) Equals

func (w TokenValue) Equals(other uint64) bool

func (TokenValue) Hash

func (tv TokenValue) Hash() (ret uint64)

func (TokenValue) String

func (tv TokenValue) String() (ret string)

a string *representation* of the value

type Tokenizer

type Tokenizer struct {
	Notifier
	// contains filtered or unexported fields
}

read pieces of plain text documents

func TokenizerAtLine added in v0.24.8

func TokenizerAtLine(n Notifier, lineOfs int) Tokenizer

func (*Tokenizer) Decode

func (n *Tokenizer) Decode() charm.State

return a state to parse a stream of runes and notify as they are detected.

type Word

type Word struct {
	// contains filtered or unexported fields
}

func MakeWord

func MakeWord(slice string) Word

func (*Word) Hash

func (w *Word) Hash() uint64

func (*Word) String

func (w *Word) String() string
