syntax

package

Version: v1.21.10 (latest) | Published: Feb 7, 2024 | License: MIT | Imports: 0 | Imported by: 0

Repository

github.com/shogo82148/std

Documentation ¶

Overview ¶

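Package syntax parses regular expressions into parse trees and compiles parse trees into programs. Most clients of regular expressions will use the facilities of package regexp (such as Compile and Match) instead of this package.

Syntax

The regular expression syntax understood by this package when parsing with the Perl flag is as follows. Parts of the syntax can be disabled by passing alternate flags to Parse.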

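Single characters:

.              any character, possibly including newline (flag s=true)
[xyz]          character class
[^xyz]         negated character class
\d             Perl character class
\D             negated Perl character class
[[:alpha:]]    ASCII character class
[[:^alpha:]]   negated ASCII character class
\pN            Unicode character class (one-letter name)
\p{Greek}      Unicode character class
\PN            negated Unicode character class (one-letter name)
\P{Greek}      negated Unicode character class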

Composites:

xy             x followed by y
x|y            x or y (prefer x)

Repetitions:

x*             zero or more x, prefer more
x+             one or more x, prefer more
x?             zero or one x, prefer one
x{n,m}         n or n+1 or ... or m x, prefer more
x{n,}          n or more x, prefer more
x{n}           exactly n x
x*?            zero or more x, prefer fewer
x+?            one or more x, prefer fewer
x??            zero or one x, prefer zero
x{n,m}?        n or n+1 or ... or m x, prefer fewer
x{n,}?         n or more x, prefer fewer
x{n}?          exactly n x

Implementation restriction: the counting forms x{n,m}, x{n,}, and x{n} reject forms that create a minimum or maximum repetition count above 1000. Unlimited repetitions are not subject to this restriction.

Grouping:

(re)           numbered capturing group (submatch)
(?P<name>re)   named & numbered capturing group (submatch)
(?:re)         non-capturing group
(?flags)       set flags within current group; non-capturing
(?flags:re)    set flags during re; non-capturing

Flag syntax is xyz (set) or -xyz (clear) or xy-z (set xy, clear z). The flags are:

i              case-insensitive (default false)
m              multi-line mode: ^ and $ match begin/end line in addition to begin/end text (default false)
s              let . match \n (default false)
U              ungreedy: swap meaning of x* and x*?, x+ and x+?, etc (default false)
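The effect of each flag is easiest to see by matching the same text with and without it. The sketch below uses the regexp package (which the overview recommends for ordinary clients); matches and first are hypothetical helpers, not part of either package.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// first returns the leftmost match of pattern in text.
func first(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// i: case-insensitive matching.
	fmt.Println(matches(`hello`, "HELLO"), matches(`(?i)hello`, "HELLO")) // false true

	// m: ^ and $ also match at line boundaries.
	fmt.Println(matches(`^b$`, "a\nb\nc"), matches(`(?m)^b$`, "a\nb\nc")) // false true

	// s: . also matches \n.
	fmt.Println(matches(`a.b`, "a\nb"), matches(`(?s)a.b`, "a\nb")) // false true

	// U: greedy and non-greedy operators swap meaning.
	fmt.Println(first(`a+`, "aaa"), first(`(?U)a+`, "aaa")) // aaa a
}
```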

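Empty strings:

^              at beginning of text or line (flag m=true)
$              at end of text (like \z not \Z) or line (flag m=true)
\A             at beginning of text
\b             at ASCII word boundary (\w on one side and \W, \A, or \z on the other)
\B             not at ASCII word boundary
\z             at end of text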

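Escape sequences:

\a             bell (== \007)
\f             form feed (== \014)
\t             horizontal tab (== \011)
\n             newline (== \012)
\r             carriage return (== \015)
\v             vertical tab character (== \013)
\*             literal *, for any punctuation character *
\123           octal character code (up to three digits)
\x7F           hex character code (exactly two digits)
\x{10FFFF}     hex character code
\Q...\E        literal text ... even if ... has punctuation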

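Character class elements:

x              single character
A-Z            character range (inclusive)
\d             Perl character class
[:foo:]        ASCII character class foo
\p{Foo}        Unicode character class Foo
\pF            Unicode character class F (one-letter name)

Named character classes as character class elements:

[\d]           digits (== \d)
[^\d]          not digits (== \D)
[\D]           not digits (== \D)
[^\D]          not not digits (== \d)
[[:name:]]     named ASCII class inside character class (== [:name:])
[^[:name:]]    named ASCII class inside negated character class (== [:^name:])
[\p{Name}]     named Unicode property inside character class (== \p{Name})
[^\p{Name}]    named Unicode property inside negated character class (== \P{Name})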

Perl character classes (all ASCII-only):

\d             digits (== [0-9])
\D             not digits (== [^0-9])
\s             whitespace (== [\t\n\f\r ])
\S             not whitespace (== [^\t\n\f\r ])
\w             word characters (== [0-9A-Za-z_])
\W             not word characters (== [^0-9A-Za-z_])
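Because the Perl classes are ASCII-only, non-ASCII digits and letters need the Unicode classes instead. A minimal sketch using the regexp package; matches is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE, Unicode category Nd
	eAcute := "\u00e9"      // LATIN SMALL LETTER E WITH ACUTE, category Ll

	fmt.Println(matches(`^\d$`, arabicThree))      // false: \d is [0-9] only
	fmt.Println(matches(`^\p{Nd}$`, arabicThree))  // true
	fmt.Println(matches(`^\w$`, eAcute))           // false: \w is [0-9A-Za-z_] only
	fmt.Println(matches(`^\p{L}$`, eAcute))        // true
}
```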

ASCII character classes:

[[:alnum:]]    alphanumeric (== [0-9A-Za-z])
[[:alpha:]]    alphabetic (== [A-Za-z])
[[:ascii:]]    ASCII (== [\x00-\x7F])
[[:blank:]]    blank (== [\t ])
[[:cntrl:]]    control (== [\x00-\x1F\x7F])
[[:digit:]]    digits (== [0-9])
[[:graph:]]    graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]    lower case (== [a-z])
[[:print:]]    printable (== [ -~] == [ [:graph:]])
[[:punct:]]    punctuation (== [!-/:-@[-`{-~])
[[:space:]]    whitespace (== [\t\n\v\f\r ])
[[:upper:]]    upper case (== [A-Z])
[[:word:]]     word characters (== [0-9A-Za-z_])
[[:xdigit:]]   hex digit (== [0-9A-Fa-f])

Unicode character classes are those in unicode.Categories and unicode.Scripts.

Index ¶

func IsWordChar(r rune) bool
type EmptyOp
- func EmptyOpContext(r1, r2 rune) EmptyOp
type Error
- func (e *Error) Error() string
type ErrorCode
- func (e ErrorCode) String() string
type Flags
type Inst
- func (i *Inst) MatchEmptyWidth(before rune, after rune) bool
- func (i *Inst) MatchRune(r rune) bool
- func (i *Inst) MatchRunePos(r rune) int
- func (i *Inst) String() string
type InstOp
- func (i InstOp) String() string
type Op
- func (i Op) String() string
type Prog
- func Compile(re *Regexp) (*Prog, error)
- func (p *Prog) Prefix() (prefix string, complete bool)
- func (p *Prog) StartCond() EmptyOp
- func (p *Prog) String() string
type Regexp
- func Parse(s string, flags Flags) (*Regexp, error)
- func (re *Regexp) CapNames() []string
- func (x *Regexp) Equal(y *Regexp) bool
- func (re *Regexp) MaxCap() int
- func (re *Regexp) Simplify() *Regexp
- func (re *Regexp) String() string

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func IsWordChar ¶

func IsWordChar(r rune) bool

IsWordChar reports whether r is considered a "word character" during the evaluation of the \b and \B zero-width assertions. These assertions are ASCII-only: the word characters are [A-Za-z0-9_].

Types ¶

type EmptyOp ¶

type EmptyOp uint8

An EmptyOp specifies a kind or mixture of zero-width assertions.

const (
	EmptyBeginLine EmptyOp = 1 << iota
	EmptyEndLine
	EmptyBeginText
	EmptyEndText
	EmptyWordBoundary
	EmptyNoWordBoundary
)

func EmptyOpContext ¶

func EmptyOpContext(r1, r2 rune) EmptyOp

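EmptyOpContext returns the zero-width assertions satisfied at the position between the runes r1 and r2. Passing r1 == -1 indicates that the position is at the beginning of the text. Passing r2 == -1 indicates that the position is at the end of the text.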

func (*Error) Error ¶

func (e *Error) Error() string

type ErrorCode ¶

type ErrorCode string

An ErrorCode describes a failure to parse a regular expression.

const (
	// Unexpected error
	ErrInternalError ErrorCode = "regexp/syntax: internal error"

	// Parse errors
	ErrInvalidCharClass      ErrorCode = "invalid character class"
	ErrInvalidCharRange      ErrorCode = "invalid character class range"
	ErrInvalidEscape         ErrorCode = "invalid escape sequence"
	ErrInvalidNamedCapture   ErrorCode = "invalid named capture"
	ErrInvalidPerlOp         ErrorCode = "invalid or unsupported Perl syntax"
	ErrInvalidRepeatOp       ErrorCode = "invalid nested repetition operator"
	ErrInvalidRepeatSize     ErrorCode = "invalid repeat count"
	ErrInvalidUTF8           ErrorCode = "invalid UTF-8"
	ErrMissingBracket        ErrorCode = "missing closing ]"
	ErrMissingParen          ErrorCode = "missing closing )"
	ErrMissingRepeatArgument ErrorCode = "missing argument to repetition operator"
	ErrTrailingBackslash     ErrorCode = "trailing backslash at end of expression"
	ErrUnexpectedParen       ErrorCode = "unexpected )"
	ErrNestingDepth          ErrorCode = "expression nests too deeply"
	ErrLarge                 ErrorCode = "expression too large"
)

func (ErrorCode) String ¶

func (e ErrorCode) String() string

type Flags ¶

type Flags uint16

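Flags control the behavior of the parser and record information about regexp context.

const (
	FoldCase      Flags = 1 << iota // case-insensitive match
	Literal                         // treat pattern as literal string
	ClassNL                         // allow char classes like [^a-z] and [[:space:]] to match newline
	DotNL                           // allow . to match newline
	OneLine                         // treat ^ and $ as only matching at beginning and end of text
	NonGreedy                       // make repetition operators default to non-greedy
	PerlX                           // allow Perl extensions
	UnicodeGroups                   // allow \p{Han}, \P{Han} for Unicode group and negation
	WasDollar                       // regexp OpEndText was $, not \z
	Simple                          // regexp contains no counted repetition

	MatchNL = ClassNL | DotNL

	Perl        = ClassNL | OneLine | PerlX | UnicodeGroups // as close to Perl as possible
	POSIX Flags = 0                                         // POSIX syntax
)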

func (*Inst) MatchEmptyWidth ¶

func (i *Inst) MatchEmptyWidth(before rune, after rune) bool

MatchEmptyWidth reports whether the instruction matches an empty string between the runes before and after. It should only be called when i.Op == InstEmptyWidth.

func (*Inst) MatchRune ¶

func (i *Inst) MatchRune(r rune) bool

MatchRune reports whether the instruction matches (and consumes) r. It should only be called when i.Op == InstRune.

func (*Inst) MatchRunePos ¶ added in v1.3.0

func (i *Inst) MatchRunePos(r rune) int

MatchRunePos checks whether the instruction matches (and consumes) r. If so, MatchRunePos returns the index of the matching rune pair (or, when len(i.Rune) == 1, rune singleton). If not, MatchRunePos returns -1. MatchRunePos should only be called when i.Op == InstRune.
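To try MatchRune and MatchRunePos, one needs an InstRune instruction, which a compiled character class provides. The sketch below assumes the standard library import path regexp/syntax; runeInst is a hypothetical helper that returns the first InstRune instruction of a compiled pattern.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// runeInst compiles pattern and returns its first InstRune instruction,
// or nil if the program contains none.
func runeInst(pattern string) *syntax.Inst {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		panic(err)
	}
	prog, err := syntax.Compile(re.Simplify())
	if err != nil {
		panic(err)
	}
	for i := range prog.Inst {
		if prog.Inst[i].Op == syntax.InstRune {
			return &prog.Inst[i]
		}
	}
	return nil
}

func main() {
	// The class is stored as the rune pairs (a,c) and (x,z).
	inst := runeInst(`[a-cx-z]`)
	fmt.Println(inst.MatchRune('b'), inst.MatchRune('m')) // true false
	fmt.Println(inst.MatchRunePos('b'))                   // 0: first pair
	fmt.Println(inst.MatchRunePos('y'))                   // 1: second pair
	fmt.Println(inst.MatchRunePos('m'))                   // -1: no pair
}
```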

func (*Inst) String ¶

func (i *Inst) String() string

type InstOp ¶

type InstOp uint8

An InstOp is an instruction opcode.

const (
	InstAlt InstOp = iota
	InstAltMatch
	InstCapture
	InstEmptyWidth
	InstMatch
	InstFail
	InstNop
	InstRune
	InstRune1
	InstRuneAny
	InstRuneAnyNotNL
)

func (InstOp) String ¶ added in v1.3.0

func (i InstOp) String() string

type Op ¶

type Op uint8

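An Op is a single regular expression operator.

const (
	OpNoMatch        Op = 1 + iota // matches no strings
	OpEmptyMatch                   // matches empty string
	OpLiteral                      // matches Runes sequence
	OpCharClass                    // matches Runes interpreted as range pair list
	OpAnyCharNotNL                 // matches any character except newline
	OpAnyChar                      // matches any character
	OpBeginLine                    // matches empty string at beginning of line
	OpEndLine                      // matches empty string at end of line
	OpBeginText                    // matches empty string at beginning of text
	OpEndText                      // matches empty string at end of text
	OpWordBoundary                 // matches word boundary `\b`
	OpNoWordBoundary               // matches word non-boundary `\B`
	OpCapture                      // capturing subexpression with index Cap, optional name Name
	OpStar                         // matches Sub[0] zero or more times
	OpPlus                         // matches Sub[0] one or more times
	OpQuest                        // matches Sub[0] zero or one times
	OpRepeat                       // matches Sub[0] at least Min times, at most Max (Max == -1 is no limit)
	OpConcat                       // matches concatenation of Subs
	OpAlternate                    // matches alternation of Subs
)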

func (Op) String ¶ added in v1.11.0

func (i Op) String() string

type Prog ¶

type Prog struct {
	Inst   []Inst
	Start  int // index of start instruction
	NumCap int // number of InstCapture insts in re
}

A Prog is a compiled regular expression program.

func Compile ¶

func Compile(re *Regexp) (*Prog, error)

Compile compiles the regexp into a program to be executed. The regexp should have been simplified already (returned from re.Simplify).

func (*Prog) Prefix ¶

func (p *Prog) Prefix() (prefix string, complete bool)

Prefix returns a literal string that all matches for the regexp must start with. Complete is true if the prefix is the entire match.
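For example, a pattern that is entirely literal yields a complete prefix, while one that continues with a character class does not. The sketch assumes the standard library import path regexp/syntax; prefixOf is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// prefixOf compiles pattern and returns the result of Prog.Prefix.
func prefixOf(pattern string) (string, bool) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		panic(err)
	}
	prog, err := syntax.Compile(re.Simplify())
	if err != nil {
		panic(err)
	}
	return prog.Prefix()
}

func main() {
	for _, p := range []string{`abc`, `abc\d+`, `\d+`} {
		prefix, complete := prefixOf(p)
		fmt.Printf("%-8s prefix=%q complete=%v\n", p, prefix, complete)
	}
}
```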

func (*Prog) StartCond ¶

func (p *Prog) StartCond() EmptyOp

StartCond returns the leading empty-width conditions that must be true in any match. It returns ^EmptyOp(0) if no matches are possible.

func (*Prog) String ¶

func (p *Prog) String() string

type Regexp ¶

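type Regexp struct {
	Op       Op // operator
	Flags    Flags
	Sub      []*Regexp  // subexpressions, if any
	Sub0     [1]*Regexp // storage for short Sub
	Rune     []rune     // matched runes, for OpLiteral, OpCharClass
	Rune0    [2]rune    // storage for short Rune
	Min, Max int        // min, max for OpRepeat
	Cap      int        // capturing index, for OpCapture
	Name     string     // capturing name, for OpCapture
}

A Regexp is a node in a regular expression syntax tree.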

func Parse ¶

func Parse(s string, flags Flags) (*Regexp, error)

Parse parses a regular expression string s, controlled by the specified Flags, and returns a regular expression parse tree. The syntax is described in the top-level comment.

func (*Regexp) CapNames ¶

func (re *Regexp) CapNames() []string

CapNames walks the regexp to find the names of capturing groups.

func (*Regexp) Equal ¶

func (x *Regexp) Equal(y *Regexp) bool

Equal reports whether x and y have identical structure.
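Equal compares parse trees, not pattern text, so two differently spelled patterns can compare equal once parsed. The sketch assumes the standard library import path regexp/syntax; mustParse is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// mustParse parses pattern with the Perl flags and panics on error.
func mustParse(pattern string) *syntax.Regexp {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		panic(err)
	}
	return re
}

func main() {
	// Same structure: both parse to a character class covering a through c.
	fmt.Println(mustParse(`[abc]`).Equal(mustParse(`[a-c]`)))
	// Different operators: plus versus star.
	fmt.Println(mustParse(`a+`).Equal(mustParse(`a*`)))
}
```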

func (*Regexp) MaxCap ¶

func (re *Regexp) MaxCap() int

MaxCap walks the regexp to find the maximum capture index.
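CapNames and MaxCap are often used together: the slice returned by CapNames is indexed by capture number, with an empty string for unnamed groups and for index 0. A minimal sketch, assuming the standard library import path regexp/syntax; captures is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// captures parses pattern and returns its highest capture index
// together with the capture names indexed by capture number.
func captures(pattern string) (int, []string) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		panic(err)
	}
	return re.MaxCap(), re.CapNames()
}

func main() {
	max, names := captures(`(?P<year>\d{4})-(\d{2})`)
	fmt.Println(max) // 2
	for i, name := range names {
		fmt.Printf("group %d: %q\n", i, name)
	}
}
```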

func (*Regexp) Simplify ¶

func (re *Regexp) Simplify() *Regexp

Simplify returns a regexp equivalent to re but without counted repetitions and with various other simplifications, such as rewriting /(?:a+)+/ to /a+/. The resulting regexp will execute correctly but its string representation will not produce the same parse tree, because capturing parentheses may have been duplicated or removed. For example, the simplified form for /(x){1,2}/ is /(x)(x)?/ but both parentheses capture as $1. The returned regexp may share structure with or be the original.
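The two rewrites mentioned above can be reproduced by printing the simplified tree. This sketch assumes the standard library import path regexp/syntax; simplified is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// simplified parses pattern with the Perl flags and returns the string
// form of the simplified parse tree.
func simplified(pattern string) string {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		panic(err)
	}
	return re.Simplify().String()
}

func main() {
	fmt.Println(simplified(`(?:a+)+`))  // a+
	fmt.Println(simplified(`(x){1,2}`)) // (x)(x)?
}
```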

func (*Regexp) String ¶

func (re *Regexp) String() string
