Documentation ¶
Overview ¶
Package pcre provides access to the Perl Compatible Regular Expressions library, PCRE.
It implements two main types, Regexp and Matcher. A Regexp object stores a compiled regular expression and consists of two immutable parts, pcre and pcre_extra. Compile() and MustCompile() initialize pcre; calling Study() on a compiled Regexp initializes pcre_extra. Compiling a regular expression with Compile or MustCompile is somewhat expensive, so these objects should be kept and reused rather than compiled from scratch for each matching attempt. CompileJIT and MustCompileJIT are considerably more expensive because they run Study() after compiling the Regexp, but they tend to give much better performance: http://sljit.sourceforge.net/regex_perf.html
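The advice about keeping compiled objects around can be sketched as follows. This section gives neither the package's import path nor the full signatures of Compile and MustCompile, so the sketch uses Go's standard regexp package as a stand-in. Only the compile-once pattern is meant to carry over; datePattern and isDate are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// datePattern is compiled once, at package initialization, and then reused.
// Stand-in: this is regexp.MustCompile from the standard library, not the
// MustCompile documented on this page.
var datePattern = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}$`)

// isDate reuses the precompiled pattern instead of compiling it on each call.
func isDate(s string) bool {
	return datePattern.MatchString(s)
}

func main() {
	for _, s := range []string{"2024-01-31", "not a date"} {
		fmt.Println(s, isDate(s))
	}
}
```

The same shape should apply to any compiled-pattern type: pay the compilation cost once and share the result across matching attempts.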
Matcher objects keep the results of a match against a []byte or string subject. The Group and GroupString functions provide access to capture groups; both work regardless of whether the subject was a []byte or a string, but the version whose type matches the subject is slightly more efficient.
Matcher objects contain some temporary space and refer to the original subject. They are mutable and can be reused (using Match, MatchString, Reset or ResetString).
For details on the regular expression language implemented by this package and the flags defined below, see the PCRE documentation. http://www.pcre.org/pcre.txt
Index ¶
- Constants
- type CompileError
- type Matcher
- func (m *Matcher) Exec(subject []byte, flags int) int
- func (m *Matcher) ExecString(subject string, flags int) int
- func (m *Matcher) Extract() [][]byte
- func (m *Matcher) ExtractString() []string
- func (m *Matcher) Group(group int) []byte
- func (m *Matcher) GroupIndices(group int) []int
- func (m *Matcher) GroupString(group int) string
- func (m *Matcher) Groups() int
- func (m *Matcher) Index() (loc []int)
- func (m *Matcher) Init(re *Regexp)
- func (m *Matcher) Match(subject []byte, flags int) bool
- func (m *Matcher) MatchString(subject string, flags int) bool
- func (m *Matcher) Matches() bool
- func (m *Matcher) Named(group string) ([]byte, error)
- func (m *Matcher) NamedPresent(group string) (bool, error)
- func (m *Matcher) NamedString(group string) (string, error)
- func (m *Matcher) Partial() bool
- func (m *Matcher) Present(group int) bool
- func (m *Matcher) Reset(re Regexp, subject []byte, flags int) bool
- func (m *Matcher) ResetString(re Regexp, subject string, flags int) bool
- type Regexp
- func (re *Regexp) FindIndex(bytes []byte, flags int) (loc []int)
- func (re Regexp) Groups() int
- func (re Regexp) Matcher(subject []byte, flags int) (m *Matcher)
- func (re Regexp) MatcherString(subject string, flags int) (m *Matcher)
- func (re Regexp) NewMatcher() (m *Matcher)
- func (re Regexp) ReplaceAll(bytes, repl []byte, flags int) []byte
- func (re Regexp) ReplaceAllString(in, repl string, flags int) string
Constants ¶
const (
	MAJOR = 10
	MINOR = 38

	ANCHORED = 0x80000000
	NO_UTF_CHECK = 0x40000000
	ENDANCHORED = 0x20000000

	ALLOW_EMPTY_CLASS = 0x00000001 /* C */
	ALT_BSUX = 0x00000002 /* C */
	AUTO_CALLOUT = 0x00000004 /* C */
	CASELESS = 0x00000008 /* C */
	DOLLAR_ENDONLY = 0x00000010 /* J M D */
	DOTALL = 0x00000020 /* C */
	DUPNAMES = 0x00000040 /* C */
	EXTENDED = 0x00000080 /* C */
	FIRSTLINE = 0x00000100 /* J M D */
	MATCH_UNSET_BACKREF = 0x00000200 /* C J M */
	MULTILINE = 0x00000400 /* C */
	NEVER_UCP = 0x00000800 /* C */
	NEVER_UTF = 0x00001000 /* C */
	NO_AUTO_CAPTURE = 0x00002000 /* C */
	NO_AUTO_POSSESS = 0x00004000 /* C */
	NO_DOTSTAR_ANCHOR = 0x00008000 /* C */
	NO_START_OPTIMIZE = 0x00010000 /* J M D */
	UCP = 0x00020000 /* C J M D */
	UNGREEDY = 0x00040000 /* C */
	UTF = 0x00080000 /* C J M D */
	NEVER_BACKSLASH_C = 0x00100000 /* C */
	ALT_CIRCUMFLEX = 0x00200000 /* J M D */
	ALT_VERBNAMES = 0x00400000 /* C */
	USE_OFFSET_LIMIT = 0x00800000 /* J M D */
	EXTENDED_MORE = 0x01000000 /* C */
	LITERAL = 0x02000000 /* C */
	MATCH_INVALID_UTF = 0x04000000 /* J M D */

	EXTRA_ALLOW_SURROGATE_ESCAPES = 0x00000001 /* C */
	EXTRA_BAD_ESCAPE_IS_LITERAL = 0x00000002 /* C */
	EXTRA_MATCH_WORD = 0x00000004 /* C */
	EXTRA_MATCH_LINE = 0x00000008 /* C */
	EXTRA_ESCAPED_CR_IS_LF = 0x00000010 /* C */
	EXTRA_ALT_BSUX = 0x00000020 /* C */
	EXTRA_ALLOW_LOOKAROUND_BSK = 0x00000040 /* C */

	JIT_COMPLETE = 0x00000001 /* For full matching */
	JIT_PARTIAL_SOFT = 0x00000002
	JIT_PARTIAL_HARD = 0x00000004
	JIT_INVALID_UTF = 0x00000100

	NOTBOL = 0x00000001
	NOTEOL = 0x00000002
	NOTEMPTY = 0x00000004 /* ) These two must be kept */
	NOTEMPTY_ATSTART = 0x00000008 /* ) adjacent to each other. */
	PARTIAL_SOFT = 0x00000010
	PARTIAL_HARD = 0x00000020
	DFA_RESTART = 0x00000040 /* pcre2_dfa_match() only */
	DFA_SHORTEST = 0x00000080 /* pcre2_dfa_match() only */
	SUBSTITUTE_GLOBAL = 0x00000100 /* pcre2_substitute() only */
	SUBSTITUTE_EXTENDED = 0x00000200 /* pcre2_substitute() only */
	SUBSTITUTE_UNSET_EMPTY = 0x00000400 /* pcre2_substitute() only */
	SUBSTITUTE_UNKNOWN_UNSET = 0x00000800 /* pcre2_substitute() only */
	SUBSTITUTE_OVERFLOW_LENGTH = 0x00001000 /* pcre2_substitute() only */
	NO_JIT = 0x00002000 /* Not for pcre2_dfa_match() */
	COPY_MATCHED_SUBJECT = 0x00004000
	SUBSTITUTE_LITERAL = 0x00008000 /* pcre2_substitute() only */
	SUBSTITUTE_MATCHED = 0x00010000 /* pcre2_substitute() only */
	SUBSTITUTE_REPLACEMENT_ONLY = 0x00020000 /* pcre2_substitute() only */

	CONVERT_UTF = 0x00000001
	CONVERT_NO_UTF_CHECK = 0x00000002
	CONVERT_POSIX_BASIC = 0x00000004
	CONVERT_POSIX_EXTENDED = 0x00000008
	CONVERT_GLOB = 0x00000010
	CONVERT_GLOB_NO_WILD_SEPARATOR = 0x00000030
	CONVERT_GLOB_NO_STARSTAR = 0x00000050

	NEWLINE_CR = 1
	NEWLINE_LF = 2
	NEWLINE_CRLF = 3
	NEWLINE_ANY = 4
	NEWLINE_ANYCRLF = 5
	NEWLINE_NUL = 6

	BSR_UNICODE = 1
	BSR_ANYCRLF = 2

	ERROR_END_BACKSLASH = 101
	ERROR_END_BACKSLASH_C = 102
	ERROR_UNKNOWN_ESCAPE = 103
	ERROR_QUANTIFIER_OUT_OF_ORDER = 104
	ERROR_QUANTIFIER_TOO_BIG = 105
	ERROR_MISSING_SQUARE_BRACKET = 106
	ERROR_ESCAPE_INVALID_IN_CLASS = 107
	ERROR_CLASS_RANGE_ORDER = 108
	ERROR_QUANTIFIER_INVALID = 109
	ERROR_INTERNAL_UNEXPECTED_REPEAT = 110
	ERROR_INVALID_AFTER_PARENS_QUERY = 111
	ERROR_POSIX_CLASS_NOT_IN_CLASS = 112
	ERROR_POSIX_NO_SUPPORT_COLLATING = 113
	ERROR_MISSING_CLOSING_PARENTHESIS = 114
	ERROR_BAD_SUBPATTERN_REFERENCE = 115
	ERROR_NULL_PATTERN = 116
	ERROR_BAD_OPTIONS = 117
	ERROR_MISSING_COMMENT_CLOSING = 118
	ERROR_PARENTHESES_NEST_TOO_DEEP = 119
	ERROR_PATTERN_TOO_LARGE = 120
	ERROR_HEAP_FAILED = 121
	ERROR_UNMATCHED_CLOSING_PARENTHESIS = 122
	ERROR_INTERNAL_CODE_OVERFLOW = 123
	ERROR_MISSING_CONDITION_CLOSING = 124
	ERROR_LOOKBEHIND_NOT_FIXED_LENGTH = 125
	ERROR_ZERO_RELATIVE_REFERENCE = 126
	ERROR_TOO_MANY_CONDITION_BRANCHES = 127
	ERROR_CONDITION_ASSERTION_EXPECTED = 128
	ERROR_BAD_RELATIVE_REFERENCE = 129
	ERROR_UNKNOWN_POSIX_CLASS = 130
	ERROR_INTERNAL_STUDY_ERROR = 131
	ERROR_UNICODE_NOT_SUPPORTED = 132
	ERROR_PARENTHESES_STACK_CHECK = 133
	ERROR_CODE_POINT_TOO_BIG = 134
	ERROR_LOOKBEHIND_TOO_COMPLICATED = 135
	ERROR_LOOKBEHIND_INVALID_BACKSLASH_C = 136
	ERROR_UNSUPPORTED_ESCAPE_SEQUENCE = 137
	ERROR_CALLOUT_NUMBER_TOO_BIG = 138
	ERROR_MISSING_CALLOUT_CLOSING = 139
	ERROR_ESCAPE_INVALID_IN_VERB = 140
	ERROR_UNRECOGNIZED_AFTER_QUERY_P = 141
	ERROR_MISSING_NAME_TERMINATOR = 142
	ERROR_DUPLICATE_SUBPATTERN_NAME = 143
	ERROR_INVALID_SUBPATTERN_NAME = 144
	ERROR_UNICODE_PROPERTIES_UNAVAILABLE = 145
	ERROR_MALFORMED_UNICODE_PROPERTY = 146
	ERROR_UNKNOWN_UNICODE_PROPERTY = 147
	ERROR_SUBPATTERN_NAME_TOO_LONG = 148
	ERROR_TOO_MANY_NAMED_SUBPATTERNS = 149
	ERROR_CLASS_INVALID_RANGE = 150
	ERROR_OCTAL_BYTE_TOO_BIG = 151
	ERROR_INTERNAL_OVERRAN_WORKSPACE = 152
	ERROR_INTERNAL_MISSING_SUBPATTERN = 153
	ERROR_DEFINE_TOO_MANY_BRANCHES = 154
	ERROR_BACKSLASH_O_MISSING_BRACE = 155
	ERROR_INTERNAL_UNKNOWN_NEWLINE = 156
	ERROR_BACKSLASH_G_SYNTAX = 157
	ERROR_PARENS_QUERY_R_MISSING_CLOSING = 158
	ERROR_VERB_ARGUMENT_NOT_ALLOWED = 159
	ERROR_VERB_UNKNOWN = 160
	ERROR_SUBPATTERN_NUMBER_TOO_BIG = 161
	ERROR_SUBPATTERN_NAME_EXPECTED = 162
	ERROR_INTERNAL_PARSED_OVERFLOW = 163
	ERROR_INVALID_OCTAL = 164
	ERROR_SUBPATTERN_NAMES_MISMATCH = 165
	ERROR_MARK_MISSING_ARGUMENT = 166
	ERROR_INVALID_HEXADECIMAL = 167
	ERROR_BACKSLASH_C_SYNTAX = 168
	ERROR_BACKSLASH_K_SYNTAX = 169
	ERROR_INTERNAL_BAD_CODE_LOOKBEHINDS = 170
	ERROR_BACKSLASH_N_IN_CLASS = 171
	ERROR_CALLOUT_STRING_TOO_LONG = 172
	ERROR_UNICODE_DISALLOWED_CODE_POINT = 173
	ERROR_UTF_IS_DISABLED = 174
	ERROR_UCP_IS_DISABLED = 175
	ERROR_VERB_NAME_TOO_LONG = 176
	ERROR_BACKSLASH_U_CODE_POINT_TOO_BIG = 177
	ERROR_MISSING_OCTAL_OR_HEX_DIGITS = 178
	ERROR_VERSION_CONDITION_SYNTAX = 179
	ERROR_INTERNAL_BAD_CODE_AUTO_POSSESS = 180
	ERROR_CALLOUT_NO_STRING_DELIMITER = 181
	ERROR_CALLOUT_BAD_STRING_DELIMITER = 182
	ERROR_BACKSLASH_C_CALLER_DISABLED = 183
	ERROR_QUERY_BARJX_NEST_TOO_DEEP = 184
	ERROR_BACKSLASH_C_LIBRARY_DISABLED = 185
	ERROR_PATTERN_TOO_COMPLICATED = 186
	ERROR_LOOKBEHIND_TOO_LONG = 187
	ERROR_PATTERN_STRING_TOO_LONG = 188
	ERROR_INTERNAL_BAD_CODE = 189
	ERROR_INTERNAL_BAD_CODE_IN_SKIP = 190
	ERROR_NO_SURROGATES_IN_UTF16 = 191
	ERROR_BAD_LITERAL_OPTIONS = 192
	ERROR_SUPPORTED_ONLY_IN_UNICODE = 193
	ERROR_INVALID_HYPHEN_IN_OPTIONS = 194
	ERROR_ALPHA_ASSERTION_UNKNOWN = 195
	ERROR_SCRIPT_RUN_NOT_AVAILABLE = 196
	ERROR_TOO_MANY_CAPTURES = 197
	ERROR_CONDITION_ATOMIC_ASSERTION_EXPECTED = 198
	ERROR_BACKSLASH_K_IN_LOOKAROUND = 199

	ERROR_NOMATCH = (-1)
	ERROR_PARTIAL = (-2)
	ERROR_UTF8_ERR1 = (-3)
	ERROR_UTF8_ERR2 = (-4)
	ERROR_UTF8_ERR3 = (-5)
	ERROR_UTF8_ERR4 = (-6)
	ERROR_UTF8_ERR5 = (-7)
	ERROR_UTF8_ERR6 = (-8)
	ERROR_UTF8_ERR7 = (-9)
	ERROR_UTF8_ERR8 = (-10)
	ERROR_UTF8_ERR9 = (-11)
	ERROR_UTF8_ERR10 = (-12)
	ERROR_UTF8_ERR11 = (-13)
	ERROR_UTF8_ERR12 = (-14)
	ERROR_UTF8_ERR13 = (-15)
	ERROR_UTF8_ERR14 = (-16)
	ERROR_UTF8_ERR15 = (-17)
	ERROR_UTF8_ERR16 = (-18)
	ERROR_UTF8_ERR17 = (-19)
	ERROR_UTF8_ERR18 = (-20)
	ERROR_UTF8_ERR19 = (-21)
	ERROR_UTF8_ERR20 = (-22)
	ERROR_UTF8_ERR21 = (-23)
	ERROR_UTF16_ERR1 = (-24)
	ERROR_UTF16_ERR2 = (-25)
	ERROR_UTF16_ERR3 = (-26)
	ERROR_UTF32_ERR1 = (-27)
	ERROR_UTF32_ERR2 = (-28)
	ERROR_BADDATA = (-29)
	ERROR_MIXEDTABLES = (-30) /* Name was changed */
	ERROR_BADMAGIC = (-31)
	ERROR_BADMODE = (-32)
	ERROR_BADOFFSET = (-33)
	ERROR_BADOPTION = (-34)
	ERROR_BADREPLACEMENT = (-35)
	ERROR_BADUTFOFFSET = (-36)
	ERROR_CALLOUT = (-37) /* Never used by PCRE2 itself */
	ERROR_DFA_BADRESTART = (-38)
	ERROR_DFA_RECURSE = (-39)
	ERROR_DFA_UCOND = (-40)
	ERROR_DFA_UFUNC = (-41)
	ERROR_DFA_UITEM = (-42)
	ERROR_DFA_WSSIZE = (-43)
	ERROR_INTERNAL = (-44)
	ERROR_JIT_BADOPTION = (-45)
	ERROR_JIT_STACKLIMIT = (-46)
	ERROR_MATCHLIMIT = (-47)
	ERROR_NOMEMORY = (-48)
	ERROR_NOSUBSTRING = (-49)
	ERROR_NOUNIQUESUBSTRING = (-50)
	ERROR_NULL = (-51)
	ERROR_RECURSELOOP = (-52)
	ERROR_DEPTHLIMIT = (-53)
	ERROR_RECURSIONLIMIT = (-53) /* Obsolete synonym */
	ERROR_UNAVAILABLE = (-54)
	ERROR_UNSET = (-55)
	ERROR_BADOFFSETLIMIT = (-56)
	ERROR_BADREPESCAPE = (-57)
	ERROR_REPMISSINGBRACE = (-58)
	ERROR_BADSUBSTITUTION = (-59)
	ERROR_BADSUBSPATTERN = (-60)
	ERROR_TOOMANYREPLACE = (-61)
	ERROR_BADSERIALIZEDDATA = (-62)
	ERROR_HEAPLIMIT = (-63)
	ERROR_CONVERT_SYNTAX = (-64)
	ERROR_INTERNAL_DUPMATCH = (-65)
	ERROR_DFA_UINVALID_UTF = (-66)

	INFO_ALLOPTIONS = 0
	INFO_ARGOPTIONS = 1
	INFO_BACKREFMAX = 2
	INFO_BSR = 3
	INFO_CAPTURECOUNT = 4
	INFO_FIRSTCODEUNIT = 5
	INFO_FIRSTCODETYPE = 6
	INFO_FIRSTBITMAP = 7
	INFO_HASCRORLF = 8
	INFO_JCHANGED = 9
	INFO_JITSIZE = 10
	INFO_LASTCODEUNIT = 11
	INFO_LASTCODETYPE = 12
	INFO_MATCHEMPTY = 13
	INFO_MATCHLIMIT = 14
	INFO_MAXLOOKBEHIND = 15
	INFO_MINLENGTH = 16
	INFO_NAMECOUNT = 17
	INFO_NAMEENTRYSIZE = 18
	INFO_NAMETABLE = 19
	INFO_NEWLINE = 20
	INFO_DEPTHLIMIT = 21
	INFO_RECURSIONLIMIT = 21 /* Obsolete synonym */
	INFO_SIZE = 22
	INFO_HASBACKSLASHC = 23
	INFO_FRAMESIZE = 24
	INFO_HEAPLIMIT = 25
	INFO_EXTRAOPTIONS = 26

	CONFIG_BSR = 0
	CONFIG_JIT = 1
	CONFIG_JITTARGET = 2
	CONFIG_LINKSIZE = 3
	CONFIG_MATCHLIMIT = 4
	CONFIG_NEWLINE = 5
	CONFIG_PARENSLIMIT = 6
	CONFIG_DEPTHLIMIT = 7
	CONFIG_RECURSIONLIMIT = 7 /* Obsolete synonym */
	CONFIG_STACKRECURSE = 8 /* Obsolete */
	CONFIG_UNICODE = 9
	CONFIG_UNICODE_VERSION = 10
	CONFIG_VERSION = 11
	CONFIG_HEAPLIMIT = 12
	CONFIG_NEVER_BACKSLASH_C = 13
	CONFIG_COMPILED_WIDTHS = 14
	CONFIG_TABLES_LENGTH = 15

	CALLOUT_STARTMATCH = 0x00000001 /* Set for each bumpalong */
	CALLOUT_BACKTRACK = 0x00000002 /* Set after a backtrack */
)
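The option constants are bit flags, so several of them can be combined into the single flags int that the matching functions accept. The sketch below redeclares three values from the list above so that it compiles without importing the package. The helper hasFlag is hypothetical and not part of the package.

```go
package main

import "fmt"

// Values copied from the constant list above and redeclared locally so the
// sketch does not need to import the package.
const (
	CASELESS  = 0x00000008
	DOTALL    = 0x00000020
	MULTILINE = 0x00000400
)

// hasFlag (hypothetical helper) reports whether every bit of flag is set in
// flags.
func hasFlag(flags, flag int) bool {
	return flags&flag == flag
}

func main() {
	// Combine options with bitwise OR.
	flags := CASELESS | MULTILINE

	fmt.Printf("flags = %#x\n", flags)
	fmt.Println("caseless:", hasFlag(flags, CASELESS))
	fmt.Println("dotall:", hasFlag(flags, DOTALL))
}
```

Which flags are valid for which call is indicated by the comments in the constant list and by the PCRE documentation; the sketch only shows the bit arithmetic.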
Types ¶
type CompileError ¶
type CompileError struct {
	Pattern string // The failed pattern
	Message string // The error message
	Offset  int    // Byte position of error
}
CompileError holds details about a compilation error, as returned by the Compile function. The offset is the byte position in the pattern string at which the error was detected.
func (*CompileError) Error ¶
func (e *CompileError) Error() string
Error converts a compile error to a string.
type Matcher ¶
type Matcher struct {
// contains filtered or unexported fields
}
Matcher objects provide a place for storing match results. They can be created by the Matcher and MatcherString functions, or they can be initialized with Reset or ResetString.
func (*Matcher) Exec ¶
Exec tries to match the specified byte slice to the current pattern. Returns the raw pcre_exec error code.
func (*Matcher) ExecString ¶
ExecString tries to match the specified subject string to the current pattern. It returns the raw pcre_exec error code.
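Because Exec and ExecString return the raw code, callers have to interpret it themselves. The sketch below redeclares three of the negative codes from the constant list and classifies a return value. It assumes that a non-negative value signals a successful match, as in the underlying C library. describe is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Values copied from the constant list on this page.
const (
	ERROR_NOMATCH    = -1
	ERROR_PARTIAL    = -2
	ERROR_MATCHLIMIT = -47
)

// describe (hypothetical helper) turns a raw return code into a short label.
// Assumption: non-negative codes mean the match succeeded.
func describe(rc int) string {
	switch {
	case rc >= 0:
		return "match"
	case rc == ERROR_NOMATCH:
		return "no match"
	case rc == ERROR_PARTIAL:
		return "partial match"
	default:
		return fmt.Sprintf("error %d", rc)
	}
}

func main() {
	for _, rc := range []int{1, ERROR_NOMATCH, ERROR_PARTIAL, ERROR_MATCHLIMIT} {
		fmt.Printf("%d: %s\n", rc, describe(rc))
	}
}
```

Match and MatchString wrap this interpretation and return a bool, so the raw code is mainly useful when distinguishing a failed match from an actual error.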
func (*Matcher) Extract ¶
Extract returns a slice of byte slices for a single match. The first byte slice contains the complete match. Subsequent byte slices contain the captured groups. If there was no match then nil is returned.
func (*Matcher) ExtractString ¶
ExtractString returns a slice of strings for a single match. The first string contains the complete match. Subsequent strings in the slice contain the captured groups. If there was no match then nil is returned.
func (*Matcher) Group ¶
Group returns the numbered capture group of the last match (performed by Matcher, MatcherString, Reset, ResetString, Match, or MatchString). Group 0 is the part of the subject which matches the whole pattern; the first actual capture group is numbered 1. Capture groups which are not present return a nil slice.
func (*Matcher) GroupIndices ¶
GroupIndices returns the numbered capture group positions of the last match (performed by Matcher, MatcherString, Reset, ResetString, Match, or MatchString). Group 0 is the part of the subject which matches the whole pattern; the first actual capture group is numbered 1. Capture groups which are not present return a nil slice.
func (*Matcher) GroupString ¶
GroupString returns the numbered capture group as a string. Group 0 is the part of the subject which matches the whole pattern; the first actual capture group is numbered 1. Capture groups which are not present return an empty string.
func (*Matcher) Index ¶
Index returns the start and end of the first match, if a previous call to Matcher, MatcherString, Reset, ResetString, Match or MatchString succeeded. loc[0] is the start and loc[1] is the end.
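A loc pair is typically used to slice the subject. The sketch below uses a hard-coded loc in place of a value returned by Index and assumes that loc[1] is exclusive, as in PCRE's offset vector. It also treats a nil loc as "no match", which this page documents for FindIndex and which is only assumed here for Index. span is a hypothetical helper.

```go
package main

import "fmt"

// span (hypothetical helper) returns subject[loc[0]:loc[1]]. It reports false
// when loc is nil or does not describe a valid range within subject.
func span(subject string, loc []int) (string, bool) {
	if len(loc) != 2 || loc[0] < 0 || loc[1] < loc[0] || loc[1] > len(subject) {
		return "", false
	}
	return subject[loc[0]:loc[1]], true
}

func main() {
	subject := "hello world"
	loc := []int{6, 11} // hard-coded stand-in for a value returned by Index

	if s, ok := span(subject, loc); ok {
		fmt.Printf("matched %q\n", s)
	}
	if _, ok := span(subject, nil); !ok {
		fmt.Println("nil loc: no match")
	}
}
```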
func (*Matcher) Match ¶
Match tries to match the specified byte slice to the current pattern by calling Exec and collects the result. Returns true if the match succeeds.
func (*Matcher) MatchString ¶
MatchString tries to match the specified subject string to the current pattern by calling ExecString and collects the result. Returns true if the match succeeds.
func (*Matcher) Matches ¶
Matches returns true if a previous call to Matcher, MatcherString, Reset, ResetString, Match or MatchString succeeded.
func (*Matcher) Named ¶
Named returns the value of the named capture group. This is a nil slice if the capture group is not present. If the name does not refer to a group then error is non-nil.
func (*Matcher) NamedPresent ¶
NamedPresent returns true if the named capture group is present. If the name does not refer to a group then error is non-nil.
func (*Matcher) NamedString ¶
NamedString returns the value of the named capture group, or an empty string if the capture group is not present. If the name does not refer to a group then error is non-nil.
func (*Matcher) Partial ¶
Partial returns true if a previous call to Matcher, MatcherString, Reset, ResetString, Match or MatchString found a partial match.
func (*Matcher) Present ¶
Present returns true if the numbered capture group is present in the last match (performed by Matcher, MatcherString, Reset, ResetString, Match, or MatchString). Group numbers start at 1. A capture group can be present and match the empty string.
type Regexp ¶
type Regexp struct {
// contains filtered or unexported fields
}
Regexp holds a reference to a compiled regular expression. Use Compile or MustCompile to create such objects.
func Compile ¶
Compile compiles the pattern and returns a compiled regexp. If compilation fails, the second return value holds a *CompileError.
func MustCompile ¶
MustCompile compiles the pattern and panics if compilation fails.
func (*Regexp) FindIndex ¶
FindIndex returns the start and end of the first match, or nil if there is no match. loc[0] is the start and loc[1] is the end.
func (Regexp) Matcher ¶
Matcher creates a new matcher object, with the byte slice as subject. It also starts a first match on subject. Test for success with Matches().
func (Regexp) MatcherString ¶
MatcherString creates a new matcher, with the specified subject string. It also starts a first match on subject. Test for success with Matches().
func (Regexp) NewMatcher ¶
NewMatcher creates a new matcher object for the given Regexp.
func (Regexp) ReplaceAll ¶
ReplaceAll returns a copy of a byte slice where all pattern matches are replaced by repl.