lexer

package
v0.1.139
Warning

This package is not in the latest version of its module.

Published: Jul 19, 2024 License: Apache-2.0 Imports: 3 Imported by: 0

Documentation

Overview

Package lexer tokenizes the input.

Tokenization is the first step in the Evy compilation process: the lexer breaks Evy source code into individual tokens such as keywords, operators, and identifiers. The package provides a Lexer type, which is created with the New function. Each call to the Lexer.Next method returns the next Token in the input string. A Token represents a single lexical unit of the input, and the special EOF token marks the end of the input.

The parser then takes these tokens and builds an abstract syntax tree (AST), which represents the structure of the Evy code. Finally, the evaluator walks the AST and executes the program.
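The New, Next and EOF flow described above can be illustrated with a small self-contained sketch. It does not import this package; miniLexer, token and the tok* constants are hypothetical stand-ins that recognize only identifiers, integer literals and "+", and that skip spaces instead of emitting whitespace tokens. It is meant to show the shape of the "call Next until EOF" loop, not how the Evy lexer is implemented.

```go
package main

import (
	"fmt"
	"unicode"
)

// tokenType and token are simplified stand-ins for TokenType and Token.
type tokenType int

const (
	tokIllegal tokenType = iota
	tokEOF
	tokIdent
	tokNumLit
	tokPlus
)

type token struct {
	typ     tokenType
	literal string
}

// miniLexer is a hypothetical, heavily reduced lexer.
type miniLexer struct {
	input []rune
	pos   int
}

func newMiniLexer(input string) *miniLexer {
	return &miniLexer{input: []rune(input)}
}

// next returns the next token. Once the input is exhausted it keeps
// returning a token of type tokEOF.
func (l *miniLexer) next() *token {
	for l.pos < len(l.input) && l.input[l.pos] == ' ' {
		l.pos++ // skip spaces (an assumption of this sketch)
	}
	if l.pos >= len(l.input) {
		return &token{typ: tokEOF}
	}
	start := l.pos
	r := l.input[l.pos]
	l.pos++
	switch {
	case r == '+':
		return &token{typ: tokPlus}
	case unicode.IsDigit(r):
		l.advanceWhile(unicode.IsDigit)
		return &token{typ: tokNumLit, literal: string(l.input[start:l.pos])}
	case isIdentRune(r):
		l.advanceWhile(isIdentRune)
		return &token{typ: tokIdent, literal: string(l.input[start:l.pos])}
	}
	return &token{typ: tokIllegal, literal: string(r)}
}

// advanceWhile consumes runes for as long as pred holds.
func (l *miniLexer) advanceWhile(pred func(rune) bool) {
	for l.pos < len(l.input) && pred(l.input[l.pos]) {
		l.pos++
	}
}

func isIdentRune(r rune) bool {
	return r == '_' || unicode.IsLetter(r) || unicode.IsDigit(r)
}

// tokenize drains the lexer: it calls next until the EOF token appears.
func tokenize(input string) []*token {
	l := newMiniLexer(input)
	var toks []*token
	for tok := l.next(); tok.typ != tokEOF; tok = l.next() {
		toks = append(toks, tok)
	}
	return toks
}

func main() {
	for _, tok := range tokenize("x + 12") {
		fmt.Printf("type=%d literal=%q\n", tok.typ, tok.literal)
	}
}
```

A parser would typically consume tokens one at a time in the same way, rather than collecting them into a slice first.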

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsIdent added in v0.1.120

func IsIdent(s string) bool

IsIdent returns true if the given string is a valid identifier.
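The documentation does not spell out what counts as a valid identifier. The sketch below assumes a common rule (a letter or underscore, followed by letters, digits or underscores) purely for illustration; isIdentSketch is a hypothetical name, and the real IsIdent may differ, for example in how it treats keywords or non-ASCII letters.

```go
package main

import (
	"fmt"
	"unicode"
)

// isIdentSketch reports whether s is an identifier under an ASSUMED rule:
// a letter or underscore, followed by letters, digits or underscores.
// This is not the package's IsIdent implementation.
func isIdentSketch(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		switch {
		case unicode.IsLetter(r) || r == '_':
			// allowed in any position
		case unicode.IsDigit(r) && i > 0:
			// digits are allowed after the first character
		default:
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"some_identifier", "x1", "9lives", "a b", ""} {
		fmt.Printf("%q: %t\n", s, isIdentSketch(s))
	}
}
```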

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer is a lexical analyzer for Evy source code.

func New

func New(input string) *Lexer

New creates a new Lexer for the given input string.

func (*Lexer) Next

func (l *Lexer) Next() *Token

Next returns the next Token in the input string. When the end of the input string is reached, Next returns a Token with type EOF.

type Token

type Token struct {
	Literal string

	Offset int
	Line   int
	Col    int
	Type   TokenType
}

Token contains:

  • type of the token, such as IDENT, PLUS or NUM_LIT
  • start location of the token in the input string
  • literal value of the token, used only for number literals, string literals and comments.

func (*Token) AsIdent added in v0.1.55

func (t *Token) AsIdent() *Token

AsIdent returns t as an IDENT token if t is a keyword that is also valid as an identifier; otherwise it returns t unchanged. This allows specific keyword tokens to be used as identifiers in certain contexts.
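The conversion described above can be sketched with stand-in types. Which keywords qualify is not stated in this documentation, so the softKeywords set below (containing only "end") is a purely hypothetical choice, as are the names tok, tokKind and asIdent.

```go
package main

import "fmt"

type tokKind int

const (
	kindIdent tokKind = iota
	kindIf
	kindEnd
)

// tok is a simplified stand-in for Token.
type tok struct {
	kind    tokKind
	literal string
}

// softKeywords is a HYPOTHETICAL set of keywords that may also be used as
// identifiers. The real set is defined by the package, not by this sketch.
var softKeywords = map[tokKind]string{kindEnd: "end"}

// asIdent returns a new identifier token if t is in softKeywords;
// otherwise it returns t itself, unchanged.
func asIdent(t *tok) *tok {
	name, ok := softKeywords[t.kind]
	if !ok {
		return t
	}
	return &tok{kind: kindIdent, literal: name}
}

func main() {
	end := &tok{kind: kindEnd}
	ifTok := &tok{kind: kindIf}
	fmt.Println(asIdent(end).kind == kindIdent) // converted to an identifier
	fmt.Println(asIdent(ifTok) == ifTok)        // returned as-is
}
```

Returning a new token instead of mutating t is a design choice of this sketch; it keeps the original keyword token intact for callers that still need it.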

func (*Token) Format

func (t *Token) Format() string

Format returns a string representation of the token that is useful in error messages. If the token has a relevant literal value, the literal is returned. Otherwise, the format of the token type is returned.

func (*Token) Location

func (t *Token) Location() string

Location returns a string representation of a token's start location in the form "line <line number> column <column number>".
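As a rough illustration of how the Offset, Line and Col fields relate to this format, the sketch below derives a position from a byte offset and renders it in the documented form. It assumes 1-based line and column numbers and counts columns in runes; both are assumptions, and pos, positionAt and location are hypothetical names rather than part of this package.

```go
package main

import "fmt"

// pos mirrors the location fields of Token in simplified form.
type pos struct {
	Offset, Line, Col int
}

// positionAt computes the line and column of the byte offset in input,
// ASSUMING 1-based numbering and one column per rune.
func positionAt(input string, offset int) pos {
	p := pos{Offset: offset, Line: 1, Col: 1}
	for _, r := range input[:offset] {
		if r == '\n' {
			p.Line++
			p.Col = 1
		} else {
			p.Col++
		}
	}
	return p
}

// location renders the position in the documented form.
func (p pos) location() string {
	return fmt.Sprintf("line %d column %d", p.Line, p.Col)
}

func main() {
	src := "x := 1\nprint x"
	fmt.Println(positionAt(src, 7).location()) // prints "line 2 column 1"
}
```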

func (*Token) String

func (t *Token) String() string

String implements the fmt.Stringer interface for Token.

func (*Token) TokenType

func (t *Token) TokenType() TokenType

TokenType returns the type of the token as represented by the TokenType type.

type TokenType

type TokenType int

TokenType represents the type of token, such as identifier IDENT, operator PLUS or literal NUM_LIT.

const (
	ILLEGAL TokenType = iota
	EOF
	COMMENT // `// a comment`

	// Identifiers and Literals.
	IDENT      // some_identifier
	NUM_LIT    // 123 or 456.78
	STRING_LIT // "a string 🧵"

	// Operators.
	DECLARE  // :=
	ASSIGN   // =
	PLUS     // +
	MINUS    // -
	BANG     // !
	ASTERISK // *
	SLASH    // /
	PERCENT  // %

	EQ     // ==
	NOT_EQ // !=
	LT     // <
	GT     // >
	LTEQ   // <=
	GTEQ   // >=

	// Delimiters.
	LPAREN   // (
	RPAREN   // )
	LBRACKET // [
	RBRACKET // ]
	LCURLY   // {
	RCURLY   // }

	COLON // :
	WS    // ' '
	NL    // '\n'
	DOT   // .
	DOT3  // ...

	// Keywords.
	NUM    // num
	STRING // string
	BOOL   // bool
	ANY    // any

	TRUE  // true
	FALSE // false
	AND   // and
	OR    // or

	IF     // if
	ELSE   // else
	FUNC   // func
	RETURN // return
	ON     // on
	FOR    // for
	RANGE  // range
	WHILE  // while
	BREAK  // break
	END    // end

	PKG    // package keyword, reserved for later use
	IMPORT // import keyword, reserved for later use
)

Token types are represented as constants and are the core field of the Token struct type.

func (TokenType) Format

func (t TokenType) Format() string

Format returns a string representation of the token type that is useful in error messages. The string representation is more descriptive than the string returned by the String() method.

func (TokenType) GoString

func (t TokenType) GoString() string

GoString implements the fmt.GoStringer interface for TokenType. Its return value is more useful than the iota value when debugging failed tests.

func (TokenType) String

func (t TokenType) String() string

String implements the fmt.Stringer interface for TokenType.
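The interplay between an iota-based type and the fmt.Stringer and fmt.GoStringer interfaces can be seen in a small stand-alone sketch. The kind type, its three constants and the returned strings are hypothetical; the actual strings produced by TokenType's methods are not documented here. What the sketch relies on is standard fmt behavior: %v uses String, %#v uses GoString, and %d prints the underlying integer.

```go
package main

import "fmt"

// kind is a stand-in for TokenType with only three values.
type kind int

const (
	illegal kind = iota
	eof
	ident
)

var names = [...]string{illegal: "ILLEGAL", eof: "EOF", ident: "IDENT"}

// String implements fmt.Stringer.
func (k kind) String() string {
	if k < 0 || int(k) >= len(names) {
		return "UNKNOWN"
	}
	return names[k]
}

// GoString implements fmt.GoStringer; fmt uses it for the %#v verb.
func (k kind) GoString() string {
	return "kind(" + k.String() + ")"
}

// describe formats k with three verbs to show which method each one uses.
func describe(k kind) string {
	return fmt.Sprintf("%v %#v %d", k, k, k)
}

func main() {
	fmt.Println(describe(ident)) // prints "IDENT kind(IDENT) 2"
}
```

Test frameworks that print values with %#v on failure would show the GoString form rather than the bare integer, which is the debugging benefit the GoString documentation refers to.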
