parse

package
v0.5.1
Published: Dec 9, 2018 License: BSD-3-Clause Imports: 12 Imported by: 3

Documentation

Overview

Package parse does the parsing stage after lexing, using a top-down recursive-descent (TDRD) strategy, with a special reverse mode to deal with left-associative binary expressions, which otherwise end up being right-associative under TDRD parsing. Higher-level rules provide scope to lower-level ones, with a special EOS (end-of-statement) scope recognized for rules that operate within a single statement.
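
As a rough end-to-end sketch of that lifecycle, the following example wires together the State, Rule, and Ast types using only the signatures documented below. The import paths, the ki InitName call, and the assumption that StartParse consumes one top-level match per call are mine, not guarantees of this package:

package main

import (
	"github.com/goki/pi/lex"   // assumed import path
	"github.com/goki/pi/parse" // assumed import path
)

// parseFile compiles and validates the grammar rooted at root, then
// drives parsing over the lexed source src until end of file.
// src and eosPos are assumed to come from the lexing passes (PassTwo).
func parseFile(root *parse.Rule, src *lex.File, eosPos []lex.Pos) *parse.Ast {
	ast := &parse.Ast{}
	ast.InitName(ast, "ast") // ki tree roots need initialization (ki method, assumed)
	ps := &parse.State{}
	ps.Init(src, ast, &eosPos)
	if !root.CompileAll(ps) || !root.Validate(ps) {
		return nil // compile / validation errors accumulate in ps.Errs
	}
	// Assumption: StartParse consumes one top-level match per call.
	for !ps.AtEof() {
		if matched := root.StartParse(ps); matched == nil {
			break // no rule matched at this position
		}
	}
	return ast
}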

Index

Constants

This section is empty.

Variables

var AstProps = ki.Props{}

var DepthLimit = 1000

DepthLimit is the infinite recursion prevention cutoff

var GuiActive = false

Set GuiActive to true if the GUI (piview) is active -- this ensures that the Ast tree is updated when nodes are swapped in reverse mode, and possibly other updates.

var KiT_Actions = kit.Enums.AddEnum(ActionsN, false, nil)

var KiT_Ast = kit.Types.AddType(&Ast{}, AstProps)

var KiT_AstActs = kit.Enums.AddEnum(AstActsN, false, nil)

var KiT_Rule = kit.Types.AddType(&Rule{}, RuleProps)

var KiT_Steps = kit.Enums.AddEnum(StepsN, false, nil)

var RuleMap map[string]*Rule

RuleMap is a map of all the rule names, for quick lookup
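
For example, once a grammar is compiled, a rule can be looked up by name; "Expr" here is a hypothetical rule name, and Name() is assumed from the embedded ki.Node:

// "Expr" is a hypothetical rule name; RuleMap is populated by
// SetRuleMap / CompileAll.
if r, ok := parse.RuleMap["Expr"]; ok {
	fmt.Println("found rule:", r.Name()) // Name() from embedded ki.Node
}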

var RuleProps = ki.Props{}

Functions

This section is empty.

Types

type Actions

type Actions int

Actions are parsing actions to perform

const (
	// AddType means add name as type name
	AddType Actions = iota

	// AddConst means add name as constant
	AddConst

	// AddVar means add name as a variable
	AddVar

	ActionsN
)

The parsing actions

func (*Actions) FromString

func (i *Actions) FromString(s string) error

func (Actions) MarshalJSON

func (ev Actions) MarshalJSON() ([]byte, error)

func (Actions) String

func (i Actions) String() string

func (*Actions) UnmarshalJSON

func (ev *Actions) UnmarshalJSON(b []byte) error
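
These generated methods round-trip the symbolic names; a quick hedged sketch (assuming the string form matches the constant name, per the kit enum convention):

// Round-trip an Actions value through its string form.
var act parse.Actions
if err := act.FromString("AddVar"); err == nil {
	fmt.Println(act) // prints: AddVar
}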

type Ast

type Ast struct {
	ki.Node
	TokReg lex.Reg `desc:"region in source lexical tokens corresponding to this Ast node -- Ch = index in lex lines"`
	SrcReg lex.Reg `desc:"region in source file corresponding to this Ast node"`
	Src    string  `desc:"source code corresponding to this Ast node"`
}

Ast is a node in the abstract syntax tree generated by the parsing step. The name of the node (from ki.Node) is the type of the element (e.g., expr, stmt, etc.). These nodes are generated by the parse.Rule elements by matching tokens.
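
For example, a parsed node's source bookkeeping can be inspected directly (Name() is assumed from the embedded ki.Node):

// ast is a *parse.Ast produced by parsing.
fmt.Printf("%v node: tokens %v, src %q\n", ast.Name(), ast.TokReg, ast.Src)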

func (*Ast) SetTokReg

func (ast *Ast) SetTokReg(reg lex.Reg, src *lex.File)

SetTokReg sets the token region for this Ast node to the given region

func (*Ast) SetTokRegEnd

func (ast *Ast) SetTokRegEnd(pos lex.Pos, src *lex.File)

SetTokRegEnd updates the ending token region to given position -- token regions are typically over-extended and get narrowed as tokens actually match

type AstActs

type AstActs int

AstActs are actions to perform on the Ast nodes

const (
	// NoAst means don't create an Ast node for this rule
	NoAst AstActs = iota

	// AddAst means create an Ast node for this rule, adding it to the current anchor Ast.
	// Any sub-rules within this rule are *not* added as children of this node -- see
	// SubAst and AnchorAst.  This is good for token-only terminal nodes and list elements
	// that should be added to a list.
	AddAst

	// SubAst means create an Ast node and add all the elements of *this rule* as
	// children of this new node (including sub-rules), *except* for the very last rule
	// which is assumed to be a recursive rule -- that one goes back up to the parent node.
	// This is good for adding more complex elements with sub-rules to a recursive list,
	// without creating a new hierarchical depth level for every such element.
	SubAst

	// AnchorAst means create an Ast node and set it as the anchor that subsequent
	// sub-nodes are added into.  This is for a new hierarchical depth level
	// where everything under this rule gets organized.
	AnchorAst

	// AnchorFirstAst means create an Ast node and set it as the anchor that subsequent
	// sub-nodes are added into, *only* if this is the first time that this rule has
	// matched within the current sequence (i.e., if the parent of this rule is the same
	// rule then don't add a new Ast node).  This is good for starting a new list
	// of recursively-defined elements, without creating increasing depth levels.
	AnchorFirstAst

	AstActsN
)

The Ast actions

func (*AstActs) FromString

func (i *AstActs) FromString(s string) error

func (AstActs) MarshalJSON

func (ev AstActs) MarshalJSON() ([]byte, error)

func (AstActs) String

func (i AstActs) String() string

func (*AstActs) UnmarshalJSON

func (ev *AstActs) UnmarshalJSON(b []byte) error

type MatchStack

type MatchStack []MatchState

MatchStack is the stack of rules that matched or ran for each token point

func (*MatchStack) Add

func (rs *MatchStack) Add(pr *Rule, scope lex.Reg, regs Matches, depth int)

Add adds the given rule to the stack at the given scope

func (*MatchStack) Find

func (rs *MatchStack) Find(pr *Rule, scope lex.Reg) (*MatchState, bool)

Find looks for given rule and scope on the stack

type MatchState

type MatchState struct {
	Rule  *Rule   `desc:"rule that either matched or ran here"`
	Scope lex.Reg `desc:"scope for match"`
	Regs  Matches `desc:"regions of match for each sub-region"`
	Depth int     `desc:"parsing depth at which it matched -- matching depth is given by actual depth of stack"`
	Ran   bool    `desc:"if false, then it is just a match at this point"`
}

MatchState holds state info for rules that matched, recorded at starting position of match

func (MatchState) String

func (rs MatchState) String() string

String is fmt.Stringer

type Matches

type Matches []lex.Reg

Matches encodes the regions of each match, Err for no match

func (Matches) StartEnd

func (mm Matches) StartEnd() lex.Reg

StartEnd returns the first and last non-zero positions in the Matches list as a region

type Parser

type Parser interface {
	ki.Ki

	// Compile compiles string rules into their runnable elements
	Compile(ps *State) bool

	// Validate checks for any errors in the rules and issues warnings,
	// returns true if valid (no err) and false if invalid (errs)
	Validate(ps *State) bool

	// Parse tries to apply rule to given input state, returns rule that matched or nil
	// par is the parent rule that we're being called from
	// ast is the current ast node that we add to
	Parse(ps *State, par *Rule, ast *Ast, scope lex.Reg, optMap lex.TokenMap, depth int) *Rule

	// AsParseRule returns object as a parse.Rule
	AsParseRule() *Rule
}

Parser is the interface type for parsers. It is likely not necessary in general, but is essential for defining the BaseIface for the GUI when making new nodes

type Rule

type Rule struct {
	ki.Node
	Off        bool     `desc:"disable this rule -- useful for testing"`
	Desc       string   `desc:"description / comments about this rule"`
	Rule       string   `` /* 499-byte string literal not displayed */
	Ast        AstActs  `desc:"what action should be take for this node when it matches"`
	OptTokMap  bool     `` /* 253-byte string literal not displayed */
	Rules      RuleList `json:"-" xml:"-" desc:"rule elements compiled from Rule string"`
	ExclKeyIdx int      `` /* 157-byte string literal not displayed */
	ExclFwd    RuleList `json:"-" xml:"-" desc:"exclusionary forward-search rule elements compiled from Rule string"`
	ExclRev    RuleList `json:"-" xml:"-" desc:"exclusionary reverse-search rule elements compiled from Rule string"`
	Reverse    bool     `` /* 357-byte string literal not displayed */
	NoToks     bool     `inactive:"+" json:"-" xml:"-" desc:"no tokens in this rule -- operates by diff rules"`
}

The first step is matching, which searches in order for matches within the children of parent nodes; for explicit rule nodes, it looks first through all the explicit tokens in the rule. If there are no explicit tokens, then matching defers ONLY to the first node listed by default -- you can add a @ prefix to indicate a rule that is also essential to match.

After a rule matches, it then proceeds through its rules, narrowing the scope and calling the sub-nodes.
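
To make this concrete, here is a hypothetical left-associative binary-expression rule built from the documented fields; the Rule string syntax is illustrative only (the exact syntax is in the elided Rule field docs above):

addExpr := &parse.Rule{
	Desc:    "subtraction, left-associative",
	Rule:    "Expr '-' Expr", // hypothetical rule-string syntax
	Ast:     parse.AnchorFirstAst,
	Reverse: true, // reverse mode parses a - b - c as (a - b) - c
}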

func (*Rule) AsParseRule

func (pr *Rule) AsParseRule() *Rule

func (*Rule) BaseIface

func (pr *Rule) BaseIface() reflect.Type

func (*Rule) Compile

func (pr *Rule) Compile(ps *State) bool

Compile compiles string rules into their runnable elements. Returns true if everything is ok, false if there were compile errors.

func (*Rule) CompileAll

func (pr *Rule) CompileAll(ps *State) bool

CompileAll is called on the top-level Rule to compile all nodes; it calls SetRuleMap first. Returns true if everything is OK, false if there were compile errors

func (*Rule) CompileExcl

func (pr *Rule) CompileExcl(ps *State, rs []string, rist int) bool

CompileExcl compiles exclusionary rules starting at the given point. Currently only works for single-token matching rules

func (*Rule) DoRules

func (pr *Rule) DoRules(ps *State, par *Rule, parAst *Ast, scope lex.Reg, mpos Matches, optMap lex.TokenMap, depth int) bool

DoRules, after we have matched, goes through the rest of the rules -- returns false if there were any issues encountered

func (*Rule) DoRulesRevBinExp

func (pr *Rule) DoRulesRevBinExp(ps *State, par *Rule, parAst *Ast, scope lex.Reg, mpos Matches, ourAst *Ast, optMap lex.TokenMap, depth int) bool

DoRulesRevBinExp is the reverse version of DoRules, for binary expression rules with one key token in the middle -- e.g., a - b - c must parse as (a - b) - c rather than a - (b - c). We just pay attention to scoping the rest of the sub-rules relative to that key token, and don't otherwise adjust scope or position. In particular, all the position updating taking place in sub-rules is then just ignored, and we set the position to the end position matched by the "last" rule (which was the first processed)

func (*Rule) Find

func (pr *Rule) Find(find string) []*Rule

Find looks for rules in the tree that contain given string in Rule or Name fields

func (*Rule) IsGroup

func (pr *Rule) IsGroup() bool

IsGroup returns true if this node is a group, else it should have rules

func (*Rule) Match

func (pr *Rule) Match(ps *State, parAst *Ast, scope lex.Reg, depth int, optMap lex.TokenMap) (bool, lex.Reg, Matches)

Match attempts to match the rule; returns true if it matches, along with the match positions and any update to the scope

func (*Rule) MatchExclude

func (pr *Rule) MatchExclude(ps *State, scope lex.Reg, mpos Matches, depth int, optMap lex.TokenMap) bool

MatchExclude looks for matches of exclusion tokens -- if found, they exclude this rule. Returns true if the exclusion matches and the rule should be excluded

func (*Rule) Parse

func (pr *Rule) Parse(ps *State, par *Rule, parAst *Ast, scope lex.Reg, optMap lex.TokenMap, depth int) *Rule

Parse tries to apply the rule to the given input state; returns the rule that matched, or nil. par is the parent rule that we're being called from. parAst is the current Ast node that we add to. scope is the region to search within, defined by the parent, or by EOS if we have a terminal one

func (*Rule) ParseRules

func (pr *Rule) ParseRules(ps *State, par *Rule, parAst *Ast, scope lex.Reg, optMap lex.TokenMap, depth int) *Rule

ParseRules parses rules and returns this rule if it matches, nil if not

func (*Rule) Scope

func (pr *Rule) Scope(ps *State, parAst *Ast, scope lex.Reg) (lex.Reg, bool)

Scope finds the potential scope region for looking for tokens -- either from EOS position or State ScopeStack pushed from parents. Returns new scope and false if no valid scope found.

func (*Rule) SetRuleMap

func (pr *Rule) SetRuleMap(ps *State)

SetRuleMap is called on the top-level Rule and initializes the RuleMap

func (*Rule) StartParse

func (pr *Rule) StartParse(ps *State) *Rule

StartParse is called on the root of the parse rule tree to start the parsing process

func (*Rule) Validate

func (pr *Rule) Validate(ps *State) bool

Validate checks for any errors in the rules and issues warnings, returns true if valid (no err) and false if invalid (errs)

func (*Rule) WriteGrammar

func (pr *Rule) WriteGrammar(writer io.Writer, depth int)

WriteGrammar outputs the parser rules as a formatted grammar in a BNF-like format; it is called recursively
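
For example, to dump a compiled grammar for inspection (assuming root is the top-level *Rule, and importing os):

// Write the grammar of the whole rule tree to standard output.
root.WriteGrammar(os.Stdout, 0)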

type RuleEl

type RuleEl struct {
	Rule  *Rule          `desc:"sub-rule for this position -- nil if token"`
	Tok   token.KeyToken `desc:"token, None if rule"`
	Match bool           `` /* 166-byte string literal not displayed */
	Opt   bool           `desc:"this rule is optional -- will absorb tokens if they exist -- indicated with ? prefix"`
	StInc int            `` /* 173-byte string literal not displayed */
}

RuleEl is an element of a parsing rule -- either a pointer to another rule or a token

func (RuleEl) IsRule

func (re RuleEl) IsRule() bool

func (RuleEl) IsToken

func (re RuleEl) IsToken() bool

type RuleList

type RuleList []RuleEl

RuleList is a list (slice) of rule elements

func (RuleList) Last

func (rl RuleList) Last() *RuleEl

Last returns the last rule element -- only used in cases where there are rules

type RuleSet

type RuleSet map[*Rule]struct{}

RuleSet is a map for representing binary presence of a rule

type ScopeRuleSet

type ScopeRuleSet map[lex.Reg]RuleSet

ScopeRuleSet is a map by scope of RuleSets, for non-matching rules

func (ScopeRuleSet) Add

func (rs ScopeRuleSet) Add(scope lex.Reg, pr *Rule)

Add adds a rule to the scope set, with auto-allocation

func (ScopeRuleSet) Has

func (rs ScopeRuleSet) Has(scope lex.Reg, pr *Rule) bool

Has checks if scope rule set has given scope, rule
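
A minimal sketch of the intended bookkeeping, assuming scope (a lex.Reg) and pr (a *parse.Rule) come from the surrounding parse state:

// Record and query a non-matching rule for a scope.
nonMatch := parse.ScopeRuleSet{}
nonMatch.Add(scope, pr)
if nonMatch.Has(scope, pr) {
	// pr already failed in this scope -- no need to re-match it
}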

type State

type State struct {
	Src        *lex.File      `desc:"source and lexed version of source we're parsing"`
	Ast        *Ast           `desc:"root of the Ast abstract syntax tree we're updating"`
	EosPos     *[]lex.Pos     `desc:"positions *in token coordinates* of the EOS markers from PassTwo"`
	Pos        lex.Pos        `desc:"the current lex token position"`
	Errs       lex.ErrorList  `desc:"any error messages accumulated during parsing specifically"`
	Matches    [][]MatchStack `` /* 129-byte string literal not displayed */
	NonMatches ScopeRuleSet   `desc:"rules that did NOT match -- represented as a map by scope of a RuleSet"`
}

parse.State is the state maintained for parsing

func (*State) AddAst

func (ps *State) AddAst(parAst *Ast, rule string, reg lex.Reg) *Ast

AddAst adds a child Ast node to given parent Ast node

func (*State) AddMatch

func (ps *State) AddMatch(pr *Rule, scope lex.Reg, regs Matches, depth int)

AddMatch adds given rule to rule stack at given scope

func (*State) AddNonMatch

func (ps *State) AddNonMatch(scope lex.Reg, pr *Rule)

AddNonMatch adds given rule to non-matching rule set for this scope

func (*State) AllocRules

func (ps *State) AllocRules()

AllocRules allocates the match and non-match rule state in correspondence with the src state

func (*State) AtEof

func (ps *State) AtEof() bool

AtEof returns true if the current position is at the end of the file

func (*State) Error

func (ps *State) Error(pos lex.Pos, msg string)

Error adds a parsing error at given lex token position

func (*State) FindEos

func (ps *State) FindEos(stpos lex.Pos, depth int) (lex.Pos, int)

FindEos finds the next EOS position at given depth

func (*State) FindToken

func (ps *State) FindToken(tkey token.KeyToken, reg lex.Reg) (lex.Pos, bool)

FindToken looks for the token in the given region; returns the position where found, false if not. Only matches when the depth is the same as at reg.St, the start of the search. All positions are in token indexes.

func (*State) FindTokenReverse

func (ps *State) FindTokenReverse(tkey token.KeyToken, reg lex.Reg) (lex.Pos, bool)

FindTokenReverse looks *backwards* for the token in the given region, with the same depth as at reg.Ed-1, the end position where the search starts. Returns the position where found, false if not. Automatically deals with possible confusion with unary operators -- if there are two ambiguous operators in a row, it automatically gets the first one. This is mainly / only used for binary operator expressions (mathematical binary operators). All positions are in token indexes.

func (*State) Init

func (ps *State) Init(src *lex.File, ast *Ast, eospos *[]lex.Pos)

Init initializes the state at start of parsing

func (*State) IsMatch

func (ps *State) IsMatch(pr *Rule, scope lex.Reg) (*MatchState, bool)

IsMatch looks for the rule at the given scope in the list of matches; if found, returns the match state info

func (*State) IsNonMatch

func (ps *State) IsNonMatch(scope lex.Reg, pr *Rule) bool

IsNonMatch looks for rule in nonmatch list at given scope

func (*State) MatchLex

func (ps *State) MatchLex(lx *lex.Lex, tkey token.KeyToken, isCat, isSubCat bool, cp lex.Pos) bool

MatchLex is our optimized matcher method, matching tkey depth as well

func (*State) MatchToken

func (ps *State) MatchToken(tkey token.KeyToken, pos lex.Pos) bool

MatchToken returns true if token matches at given position -- must be a valid position!

func (*State) NextSrcLine

func (ps *State) NextSrcLine() string

NextSrcLine returns the next line of text

func (*State) RuleString

func (ps *State) RuleString(full bool) string

RuleString returns the rule info for the entire source -- if full, then it includes the full stack at each point; otherwise just the top of the stack

type Steps

type Steps int

Steps are the different steps of the parsing process

const (
	// Match happens when a rule matches
	Match Steps = iota

	// SubMatch is when a sub-rule within a rule matches
	SubMatch

	// NoMatch is when the rule fails to match (recorded at first non-match,
	// which terminates the matching process)
	NoMatch

	// Run is when the rule is running and iterating through its sub-rules
	Run

	StepsN
)

The parsing steps

func (*Steps) FromString

func (i *Steps) FromString(s string) error

func (Steps) MarshalJSON

func (ev Steps) MarshalJSON() ([]byte, error)

func (Steps) String

func (i Steps) String() string

func (*Steps) UnmarshalJSON

func (ev *Steps) UnmarshalJSON(b []byte) error

type TraceOpts

type TraceOpts struct {
	On           bool     `desc:"perform tracing"`
	Rules        string   `width:"50" desc:"trace specific named rules here (space separated) -- if blank, then all rules are traced"`
	Match        bool     `desc:"trace full rule matches -- when a rule fully matches"`
	SubMatch     bool     `desc:"trace sub-rule matches -- when the parts of each rule match"`
	NoMatch      bool     `` /* 144-byte string literal not displayed */
	Run          bool     `desc:"trace progress running through each of the sub-rules when a rule has matched and is 'running'"`
	ScopeSrc     bool     `desc:"if true, shows the full scope source for every trace statement"`
	FullStackOut bool     `desc:"for the ParseOut display, whether to display the full stack of rules at each position, or just the deepest one"`
	RulesList    []string `view:"-" json:"-" xml:"-" desc:"list of rules"`
}

TraceOpts provides options for debugging / monitoring the rule matching and execution process

var Trace TraceOpts

parse.Trace controls the tracing options for debugging / monitoring the rule matching and execution process
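
For example, to trace two hypothetical rules during a parse (fields as documented above):

// "Expr" and "Stmt" are hypothetical rule names.
parse.Trace.On = true
parse.Trace.Rules = "Expr Stmt" // space-separated; blank traces all rules
parse.Trace.Match = true
parse.Trace.Run = true
parse.Trace.Init() // re-initialize after changing options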

func (*TraceOpts) CheckRule

func (pt *TraceOpts) CheckRule(rule string) bool

CheckRule checks if given rule should be traced

func (*TraceOpts) Init

func (pt *TraceOpts) Init()

Init initializes the tracer after any changes to the options

func (*TraceOpts) Out

func (pt *TraceOpts) Out(ps *State, pr *Rule, step Steps, pos lex.Pos, scope lex.Reg, ast *Ast, msg string) bool

Out outputs a trace message -- returns true if actually output
