glean

command
v0.0.1
Published: Jun 28, 2022 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Glean generates parsers for context-free grammars. The generated parsers are written in Go. Glean scans Go source files for functions whose names begin with "rule" or "Rule". The signatures of these functions define the grammar rules, and the function bodies provide the actions that implement the rules.

With no files listed, glean scans the .go files of the package in the current directory, excluding _test.go files. Given a list of one or more files, glean scans only those files. The files must all belong to the same package.

Usage:

glean [flags] [file...]

The flags are:

-o file
 Write the generated parser to this file. Default: parse.go
-t symbol
 Set the target symbol that the parser will construct. Default: Target
-p prefix
 Apply the indicated prefix to all file scope names in the generated parser.
 Default: _glean_
-h
 Print some help information and exit.

Grammar rules are generated from functions meeting these conditions:

The name of the function begins with "rule" or "Rule".
The name of the function is at least 5 characters long (that is, at least one character follows the prefix).
The function returns exactly one result.
Every argument type and result type consists of a simple identifier.

The result type of such a function is the symbol produced by the grammar rule; the argument types are the symbols consumed. For example, the function

func RuleAdd(Expr, Plus, Expr) Expr

corresponds to the BNF rule

<Expr> ::= <Expr> <Plus> <Expr>

By default, the parse function generated by glean has the signature

func _glean_Parse(tokens []interface{}) (Target, error)

This may be changed by passing flags to glean. The prefix specified with -p will replace _glean_ in the function name; the target symbol specified with -t will replace Target as the first result type.

Each of the tokens passed to _glean_Parse must have a type corresponding to a terminal symbol. That is, the token's type must appear as an argument type to one or more rule functions, but must not be used as a result type for any of those functions.

_glean_Parse uses a type switch to determine the type of each token. For this reason, declaring the token types as type aliases is not recommended. For example, consider

type Expr = int
type Plus = int

With these definitions, Expr and Plus both denote the same type, int, so _glean_Parse would not be able to distinguish between Expr and Plus tokens.
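
By contrast, defined types (declared without the = sign) are distinct even when they share an underlying type, so a type switch can tell them apart. The following is a minimal hand-written sketch of such a switch; the kind function is hypothetical and is not the code that glean generates.

```go
package main

import "fmt"

// Defined types: Expr and Plus are distinct from each other and from int.
type Expr int
type Plus int

// kind reports which token type a value has, using a type switch.
// If Expr and Plus were aliases of int, both cases below would name the
// same type, and the Go compiler would reject this switch for containing
// a duplicate case.
func kind(tok interface{}) string {
	switch tok.(type) {
	case Expr:
		return "Expr"
	case Plus:
		return "Plus"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(kind(Expr(1)), kind(Plus(0)), kind(1)) // Expr Plus unknown
}
```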

The target object returned by _glean_Parse is constructed by calling the rule functions in a pattern that mirrors the parse: each grammar rule applied in deriving the target symbol from the tokens corresponds to a call to that rule's function.

The errors returned by _glean_Parse are defined in the package github.com/pat42smith/glean/gleanerrors. Compiling _glean_Parse requires access to this package and the Go standard library; no other packages are needed. Error reporting is rudimentary. In particular, only one error will be reported.

Glean generates an Earley parser. It can process any context-free grammar, including an ambiguous one. However, if _glean_Parse is given ambiguous input, it will report an error. Also, _glean_Parse may be quite slow for certain grammars; Earley parsing in general takes time cubic in the input length in the worst case.
