re1

package
v0.0.0-...-dbc7887 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 17, 2019 License: MIT Imports: 5 Imported by: 3

Documentation

Overview

Package re1 implements a very simple regular expression language. The language is inspired by Plan 9 regular expressions (https://9fans.github.io/plan9port/man/man7/regexp.html), rsc's regexp blog posts (https://swtch.com/~rsc/regexp), and nominally by the much more sophisticated RE2 library (https://github.com/google/re2).

The grammar is:

regexp = choice.
choice = concat [ "|" choice ].
concat = repeat [ concat ].
repeat = term { "*" | "+" | "?" }.
term = "." | "^" | "$" | "(" regexp ")" | charclass | literal.
charclass = "[" [ "^" ] ( classlit [ "-" classlit ] ) { classlit [ "-" classlit ] } "]".
A literal is any non-meta rune or a rune preceded by \.
A classlit is any non-"]", non-"-" rune or a rune preceded by \.

The meta characters are:

| choice
* zero or more, greedy
+ one or more, greedy
? zero or one
. any non-newline rune
^ beginning of file or line
$ end of file or line
() capturing group
[] character class (^ negates, - is a range)
\n newline
\t tab
\ otherwise is the literal of the following rune
  or is \ itself if there is no following rune.

Functions

func Escape

func Escape(t string) string

Escape returns the argument with any meta-characters escaped.
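As an illustration of what escaping means for this language, here is a minimal sketch that prefixes each meta character from the overview with a backslash. The name escapeSketch and the metaChars constant are hypothetical; this is not the package's implementation, and the real Escape may treat newline, tab, or a delimiter differently.

```go
package main

import (
	"fmt"
	"strings"
)

// metaChars lists the meta characters from the overview table.
// Assumption: newline and tab are left untouched in this sketch.
const metaChars = `|*+?.^$()[]\`

// escapeSketch is a hypothetical stand-in for Escape, not the package's code.
// It writes a backslash before every meta character in t.
func escapeSketch(t string) string {
	var b strings.Builder
	for _, r := range t {
		if strings.ContainsRune(metaChars, r) {
			b.WriteRune('\\')
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(escapeSketch("a.b*(c)")) // prints a\.b\*\(c\)
}
```

A string escaped this way contains only literals under the grammar above, so it should match the original text rune for rune.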

Types

type Opts

type Opts struct {
	// Reverse compiles the expression for reverse matching.
	// This swaps the order of concatenations, and it swaps ^ and $.
	Reverse bool
	// Delimiter specifies a rune that delimits parsing if unescaped.
	Delimiter rune
	// ID is a user-specified ID to identify the regexp.
	// This is used to distinguish which regexp matched
	// when concatenating multiple regexps into one.
	ID int
}

Opts are compile-time options. The zero value is the default.

type Regexp

type Regexp struct {
	// contains filtered or unexported fields
}

Regexp is a compiled regular expression.

func New

func New(t string, opts Opts) (*Regexp, string, error)

New compiles a regular expression. The expression is terminated by the end of the string, an un-escaped newline, or an un-escaped delimiter (if set in opts).
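The termination rule can be sketched as a scan over the input. The helper splitExpr below is hypothetical and not part of re1. It assumes that a zero Delimiter means no delimiter is set, that an escaped rune never terminates the expression, and that the terminator is left at the start of the remainder; whether the real parser consumes the terminator, or treats a delimiter inside a character class specially, is not stated here.

```go
package main

import "fmt"

// splitExpr is a hypothetical helper, not part of re1.
// It returns the prefix of t that forms the expression and the remainder,
// which begins at the terminating newline or delimiter (if any).
// Assumption: delim == 0 means no delimiter is set.
func splitExpr(t string, delim rune) (expr, rest string) {
	escaped := false
	for i, r := range t {
		switch {
		case escaped:
			// The rune after a backslash is taken literally and never terminates.
			escaped = false
		case r == '\\':
			escaped = true
		case r == '\n' || (delim != 0 && r == delim):
			return t[:i], t[i:]
		}
	}
	// No terminator found: the expression runs to the end of the string.
	return t, ""
}

func main() {
	expr, rest := splitExpr(`a\/b/c`, '/')
	fmt.Printf("%q %q\n", expr, rest) // prints "a\\/b" "/c"
}
```

With '/' as the delimiter, the escaped slash stays inside the expression and the first unescaped slash ends it.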

func Union

func Union(res ...*Regexp) *Regexp

Union returns a single regular expression that matches the union of a set of regular expressions. The Union of no regexps is nil.

The last element of the slice returned by a call to Find will be the ID (Opts.ID) of the component expression that matched.

Capture groups are numbered as they are in the matched component regexp. For example, the union of the regexps compiled from "(a)bc" and "(d)ef" reports "a" as capture group 1 if the component "(a)bc" matches, and "d" as capture group 1 if the component "(d)ef" matches.

func (*Regexp) Find

func (re *Regexp) Find(rr io.RuneReader) []int64

Find returns nil if there is no match. Otherwise it returns a slice containing a pair of int64s for each sub-expression match (pair 0 is the full match), followed by a final element holding the ID of the matching regexp.
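The result layout described above can be unpacked with a small helper. The following is a sketch only: the span type and decodeMatch function are hypothetical names, the input slice is hand-written rather than produced by Find, and the sketch makes no assumption about how an unmatched sub-expression is represented.

```go
package main

import "fmt"

// span is a hypothetical type holding the two offsets of one sub-expression match.
type span struct{ start, end int64 }

// decodeMatch is a hypothetical helper, not part of re1.
// It splits a Find-style result into sub-expression spans and the regexp ID.
// It reports ok == false for nil (no match) or for a slice that does not
// consist of at least one pair followed by a single ID element.
func decodeMatch(m []int64) (spans []span, id int64, ok bool) {
	if len(m) < 3 || len(m)%2 == 0 {
		return nil, 0, false
	}
	for i := 0; i+1 < len(m); i += 2 {
		spans = append(spans, span{m[i], m[i+1]})
	}
	return spans, m[len(m)-1], true
}

func main() {
	// Hand-written example: full match (0, 3), group 1 (0, 1), regexp ID 7.
	m := []int64{0, 3, 0, 1, 7}
	spans, id, ok := decodeMatch(m)
	fmt.Println(spans, id, ok) // prints [{0 3} {0 1}] 7 true
}
```

When the regexp comes from Union, the decoded ID tells the caller which component's capture-group numbering applies to the spans.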

func (*Regexp) FindInRope

func (re *Regexp) FindInRope(ro rope.Rope, s, e int64) []int64

FindInRope returns the left-most, longest match of a regular expression between byte offsets s (inclusive) and e (exclusive) in a rope.

func (*Regexp) FindReverseInRope

func (re *Regexp) FindReverseInRope(ro rope.Rope, s, e int64) []int64

FindReverseInRope returns the right-most, longest match of a reverse-compiled regular expression between byte offsets s (inclusive) and e (exclusive) in a rope.

The receiver is assumed to be compiled for a reverse match.
