Documentation
Overview
Package lexer contains a lexer for the Hydra parser. The lexer takes the output produced by the scanner and organizes the characters into labeled tokens. A token consists of a symbol, the physical file location of that symbol (expressed as a range from one line and column to another line and column, with the end exclusive), and the semantic value of that token. For example, a string token, TokString, has the decoded and de-escaped value of the string literal as its semantic value.
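As an illustration of the three parts of a token, here is a minimal sketch. The type and field names (Pos, Location, Token, Sym, Loc, Val) are assumptions made for this example and are not taken from the package; only TokString is named in the text above.

```go
package main

import "fmt"

// Pos is a hypothetical line/column pair.
type Pos struct {
	Line, Col int
}

// Location is a hypothetical range running from Begin up to,
// but not including, End.
type Location struct {
	Begin, End Pos
}

// Token is an illustrative stand-in for the lexer's token type.
type Token struct {
	Sym string      // symbol, e.g. "TokString"
	Loc Location    // where the token's text sits in the file
	Val interface{} // semantic value
}

// String renders the token for debugging.
func (t Token) String() string {
	return fmt.Sprintf("%s@%d:%d-%d:%d %q", t.Sym,
		t.Loc.Begin.Line, t.Loc.Begin.Col,
		t.Loc.End.Line, t.Loc.End.Col, t.Val)
}

// exampleToken models the six-character source text "a\tb"
// (quotes included) starting at line 3, column 5.
func exampleToken() Token {
	return Token{
		Sym: "TokString",
		Loc: Location{Begin: Pos{3, 5}, End: Pos{3, 11}}, // column 11 is exclusive
		Val: "a\tb", // de-escaped: 'a', a tab, 'b'
	}
}

func main() {
	fmt.Println(exampleToken())
}
```

Note that the location covers the raw source text, quotes and escape included, while the semantic value holds only the three decoded characters.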
To do its work, the lexer relies on recognizers, which implement the Recognizer interface defined in recognizers.go. This greatly simplifies unit testing: the code that recognizes an individual token type can be mocked out when testing the lexer, and each recognizer can be tested in isolation. The breakdown is structured the way it is because recognizers are not fully isolated from one another: a string with flags will be passed through to the recognizer for identifiers, so that recognizer needs to be able to interface with the recognizer for strings.
The lexer is highly configurable, owing to its use of a Profile (see hydra/parser/common.Profile). The profile allows string flags, string escapes, string quote characters, keywords, and operators to be specified dynamically, and even changed on the fly. As a result, one lexer can process different versions of the Hydra language, with no need to write a custom lexer for each version or to introduce ad-hoc complications into the lexer to accommodate them.
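The idea of profile-driven lexing can be sketched as follows. The Profile type below is a hypothetical, much-reduced stand-in: its fields (Keywords, Quotes) and the helpers classify and isQuote are assumptions for illustration, not the actual definition of hydra/parser/common.Profile.

```go
package main

import (
	"fmt"
	"strings"
)

// Profile is a hypothetical stand-in for common.Profile.
// The real type's fields may differ.
type Profile struct {
	Keywords map[string]bool // words treated as keywords
	Quotes   string          // characters that may open a string
}

// classify labels a word using only what the profile says.
func classify(p Profile, word string) string {
	if p.Keywords[word] {
		return "keyword"
	}
	return "identifier"
}

// isQuote reports whether ch opens a string under this profile.
func isQuote(p Profile, ch rune) bool {
	return strings.ContainsRune(p.Quotes, ch)
}

func main() {
	// Two imagined language versions sharing one lexing routine.
	v1 := Profile{Keywords: map[string]bool{"if": true}, Quotes: `"`}
	v2 := Profile{Keywords: map[string]bool{"if": true, "match": true}, Quotes: `"'`}

	for _, p := range []Profile{v1, v2} {
		fmt.Println(classify(p, "match"), isQuote(p, '\''))
	}
}
```

The same code paths produce different results for the two profiles, which is the property that lets a single lexer serve several language versions.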
Constants
    const (
        NumInt   uint8 = 1 << iota // Number may be an integer
        NumFloat                   // Number may be a float
        NumWhole                   // Collecting the whole part of a float/int
        NumFract                   // Collecting the fraction
        NumExp                     // Collecting the exponent
        NumSign                    // Sign allowed next

        NumType  = NumInt | NumFloat            // Number type
        NumState = NumWhole | NumFract | NumExp // Number state
    )
Flags that define tracking data needed by the number recognizer.
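The following sketch shows how the NumType and NumState masks can be used to read and update the two groups of bits independently. The constants are copied from the declaration above; the setState helper and the sequence of transitions in main are assumptions made for illustration, not the number recognizer's actual logic.

```go
package main

import "fmt"

const (
	NumInt   uint8 = 1 << iota // Number may be an integer
	NumFloat                   // Number may be a float
	NumWhole                   // Collecting the whole part of a float/int
	NumFract                   // Collecting the fraction
	NumExp                     // Collecting the exponent
	NumSign                    // Sign allowed next

	NumType  = NumInt | NumFloat            // Number type
	NumState = NumWhole | NumFract | NumExp // Number state
)

// setState is a hypothetical helper: it replaces the state bits
// and leaves the type and sign bits untouched.
func setState(flags, state uint8) uint8 {
	return flags&^NumState | state
}

func main() {
	// Imagined start: could still be an int or a float, collecting
	// the whole part, and a sign is allowed.
	flags := NumInt | NumFloat | NumWhole | NumSign

	// Imagined transition on '.': no longer an integer, no sign
	// allowed, now collecting the fraction.
	flags &^= NumInt | NumSign
	flags = setState(flags, NumFract)

	fmt.Printf("type=%06b state=%06b\n", flags&NumType, flags&NumState)
	// prints "type=000010 state=001000"
}
```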
    const (
        SkipLeadFF uint8 = 1 << iota // Skip leading form feeds
        SkipNL                       // Skip newlines as well
    )
Flags that may be given to skipSpaces.
Variables
    var NumFlags = utils.FlagSet8{
        NumInt:   "integer",
        NumFloat: "float",
        NumWhole: "whole state",
        NumFract: "fraction state",
        NumExp:   "exponent state",
        NumSign:  "sign allowed",
    }
NumFlags provides a mapping between number flags and the strings describing them.
    var SkipFlags = utils.FlagSet8{
        SkipLeadFF: "skip leading form feeds",
        SkipNL:     "skip newlines",
    }
SkipFlags is a mapping of skip flags to names.
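A flag-to-name mapping of this kind is typically used to render a flag value in readable form. The sketch below assumes FlagSet8 behaves like a map from a single flag bit to its name; that definition and the describe helper are assumptions for illustration, since the actual utils.FlagSet8 type and its methods are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// FlagSet8 is a hypothetical stand-in for utils.FlagSet8,
// assumed here to map a single flag bit to its name.
type FlagSet8 map[uint8]string

const (
	SkipLeadFF uint8 = 1 << iota // Skip leading form feeds
	SkipNL                       // Skip newlines as well
)

var SkipFlags = FlagSet8{
	SkipLeadFF: "skip leading form feeds",
	SkipNL:     "skip newlines",
}

// describe is a hypothetical helper that lists the names of the
// bits set in flags, from lowest bit to highest.
func describe(set FlagSet8, flags uint8) string {
	var names []string
	// bit wraps around to 0 after the eighth shift, ending the loop.
	for bit := uint8(1); bit != 0; bit <<= 1 {
		if flags&bit == 0 {
			continue
		}
		if name, ok := set[bit]; ok {
			names = append(names, name)
		}
	}
	return strings.Join(names, ", ")
}

func main() {
	fmt.Println(describe(SkipFlags, SkipLeadFF|SkipNL))
	// prints "skip leading form feeds, skip newlines"
}
```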
Functions
Types
type RecogInit
type RecogInit func(l *lexer) Recognizer
RecogInit is a function that initializes a recognizer. It will be passed the lexer object, and must return a Recognizer.
type Recognizer
    type Recognizer interface {
        // Recognize is called to recognize a lexical construct. Will
        // be called with the first character, and should push zero or
        // more tokens onto the lexer's tokens queue.
        Recognize(ch common.AugChar)
    }
Recognizer is the interface implemented by all recognizers. A recognizer is initialized with the lexer object and implements the logic necessary to recognize a sequence of characters from the scanner.
Note: some recognizers carry additional state; for instance, the string recognizer has state designed to interact with the recognizer for identifiers, to allow string flags to be recognized and processed.
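To show how RecogInit and Recognizer fit together, and how a recognizer might be mocked in a test, here is a self-contained sketch. The AugChar and lexer types below are simplified stand-ins (the real common.AugChar and the unexported lexer type are not shown in this documentation), and mockRecognizer, newMock, and the tokens field are hypothetical names.

```go
package main

import "fmt"

// AugChar stands in for common.AugChar; only the character
// itself is modeled here.
type AugChar struct {
	C rune
}

// lexer stands in for the package's unexported lexer type.
type lexer struct {
	tokens []string // hypothetical token queue
}

// Recognizer mirrors the interface documented above.
type Recognizer interface {
	Recognize(ch AugChar)
}

// RecogInit mirrors the initializer type documented above.
type RecogInit func(l *lexer) Recognizer

// mockRecognizer records what it was asked to recognize and
// pushes a placeholder token onto the lexer's queue.
type mockRecognizer struct {
	l    *lexer
	seen []rune
}

func (m *mockRecognizer) Recognize(ch AugChar) {
	m.seen = append(m.seen, ch.C)
	m.l.tokens = append(m.l.tokens, fmt.Sprintf("mock(%c)", ch.C))
}

// newMock has the RecogInit signature, so it can be used
// wherever a real recognizer initializer is expected.
func newMock(l *lexer) Recognizer {
	return &mockRecognizer{l: l}
}

func main() {
	l := &lexer{}
	var mk RecogInit = newMock
	r := mk(l)

	// The recognizer is handed the first character of the construct.
	r.Recognize(AugChar{C: '"'})
	fmt.Println(l.tokens)
}
```

Because the lexer only depends on the interface, a test can substitute newMock for a real initializer and assert on what reached the token queue.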