package token

v0.3.0
Published: Jul 24, 2024 License: BSD-3-Clause Imports: 3 Imported by: 1

Documentation

Overview

Package token defines a complete set of lexical tokens for any kind of language. It is based on the alecthomas/chroma / pygments lexical tokens, plus the more detailed tokens needed for actually parsing languages.

Index

Constants

This section is empty.

Variables

var CatMap map[Tokens]Tokens

CatMap is the map from each token to its category-level token.

var KeyTokenZero = KeyToken{}
var Names = map[Tokens]string{} /* 125 elements not displayed */

Names are the short tag names for each token, used e.g., for syntax highlighting. These are based on alecthomas/chroma / pygments.

var OpPunctMap = map[string]Tokens{
	"+": OpMathAdd,
	"-": OpMathSub,
	"*": OpMathMul,
	"/": OpMathDiv,
	"%": OpMathRem,

	"&":  OpBitAnd,
	"|":  OpBitOr,
	"~":  OpBitNot,
	"^":  OpBitXor,
	"<<": OpBitShiftLeft,
	">>": OpBitShiftRight,
	"&^": OpBitAndNot,

	"=":  OpAsgnAssign,
	"++": OpAsgnInc,
	"--": OpAsgnDec,
	"<-": OpAsgnArrow,
	":=": OpAsgnDefine,

	"+=": OpMathAsgnAdd,
	"-=": OpMathAsgnSub,
	"*=": OpMathAsgnMul,
	"/=": OpMathAsgnDiv,
	"%=": OpMathAsgnRem,

	"&=":  OpBitAsgnAnd,
	"|=":  OpBitAsgnOr,
	"^=":  OpBitAsgnXor,
	"<<=": OpBitAsgnShiftLeft,
	">>=": OpBitAsgnShiftRight,
	"&^=": OpBitAsgnAndNot,

	"&&": OpLogAnd,
	"||": OpLogOr,
	"!":  OpLogNot,

	"==": OpRelEqual,
	"!=": OpRelNotEqual,
	"<":  OpRelLess,
	">":  OpRelGreater,
	"<=": OpRelLtEq,
	">=": OpRelGtEq,

	"...": OpListEllipsis,

	"(": PunctGpLParen,
	")": PunctGpRParen,
	"[": PunctGpLBrack,
	"]": PunctGpRBrack,
	"{": PunctGpLBrace,
	"}": PunctGpRBrace,

	",": PunctSepComma,
	".": PunctSepPeriod,
	";": PunctSepSemicolon,
	":": PunctSepColon,

	"\"": PunctStrDblQuote,
	"'":  PunctStrQuote,
	"`":  PunctStrBacktick,
	"\\": PunctStrEsc,
}

OpPunctMap provides a lookup of operator and punctuation tokens by their usual string representation.
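
For illustration, a minimal sketch of looking up tokens by their operator strings (the import path is a placeholder; use this package's actual module path):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path; replace with the real one
)

func main() {
	// Look up the token for an operator string; ok reports whether the
	// string is a known operator or punctuation.
	if tk, ok := token.OpPunctMap["&^="]; ok {
		fmt.Println(tk) // the token for &^= (OpBitAsgnAndNot)
	}
	_, ok := token.OpPunctMap["=>"] // not a known operator here
	fmt.Println(ok)                 // false
}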

var SubCatMap map[Tokens]Tokens

SubCatMap is the map from each token to its sub-category-level token.

Functions

func InitCatMap

func InitCatMap()

InitCatMap initializes the CatMap

func InitSubCatMap

func InitSubCatMap()

InitSubCatMap initializes the SubCatMap

Types

type KeyToken

type KeyToken struct {
	Token Tokens
	Key   string
	Depth int
}

KeyToken combines a token and an optional keyword name for Keyword token types. If the Token is in the Keyword category, then the Key string can be used to check for the same keyword. It also has a Depth for matching against a particular nesting depth.

func (KeyToken) Equal

func (kt KeyToken) Equal(okt KeyToken) bool

Equal compares the equality of two tokens, including keywords if the token is in the Keyword category. See also Match for a version that uses category / subcategory matching.

func (KeyToken) Match

func (kt KeyToken) Match(okt KeyToken) bool

Match compares two tokens, including keywords if the token is in the Keyword category. It returns true if the two tokens match in a category / subcategory sensitive manner: if the receiver token is a category, then it matches the other token if it is in the same category, and likewise for subcategory.
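
As a hedged sketch of the difference between Equal and Match (placeholder import path; InitCatMap / InitSubCatMap are called explicitly in case the category maps are not already built):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	token.InitCatMap()
	token.InitSubCatMap()

	a := token.KeyToken{Token: token.Keyword, Key: "range"}
	b := token.KeyToken{Token: token.Keyword, Key: "range"}
	c := token.KeyToken{Token: token.Keyword, Key: "if"}
	fmt.Println(a.Equal(b)) // true: same token, same keyword
	fmt.Println(a.Equal(c)) // false: keywords differ

	// Match is category / subcategory sensitive: a category-level token
	// such as Name matches any token within that category.
	cat := token.KeyToken{Token: token.Name}
	fn := token.KeyToken{Token: token.NameFunction}
	fmt.Println(cat.Match(fn)) // true
}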

func (KeyToken) MatchDepth

func (kt KeyToken) MatchDepth(okt KeyToken) bool

MatchDepth compares the equality of two tokens, including depth -- see Match for the other matching criteria.

func (KeyToken) String

func (kt KeyToken) String() string

func (KeyToken) StringKey

func (kt KeyToken) StringKey() string

StringKey encodes the token into a string for optimized string-based map key lookup.

type KeyTokenList

type KeyTokenList []KeyToken

KeyTokenList is a list (slice) of KeyTokens

func (KeyTokenList) Match

func (kl KeyTokenList) Match(okt KeyToken) bool

Match returns true if the given KeyToken matches any of the items on the list.
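
A brief sketch of list matching (placeholder import path):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	// A list of acceptable opening brackets; Match reports whether the
	// given KeyToken matches any item on the list.
	openers := token.KeyTokenList{
		{Token: token.PunctGpLParen},
		{Token: token.PunctGpLBrack},
	}
	fmt.Println(openers.Match(token.KeyToken{Token: token.PunctGpLBrack})) // true
	fmt.Println(openers.Match(token.KeyToken{Token: token.PunctGpLBrace})) // false
}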

type Tokens

type Tokens int32 //enums:enum

Tokens is a complete set of lexical tokens that encompasses all programming and text markup languages. It includes everything in alecthomas/chroma (pygments) and everything needed for Go, C, C++, Python, etc.

There are categories and sub-categories, and methods to get those from a given element. The first category is 'None'.

See http://pygments.org/docs/tokens/ for more docs on the different categories

Anything missing should be added via a pull request etc

const (
	// None is the nil token value -- for non-terminal cases or TBD
	None Tokens = iota

	// Error is an input that could not be tokenized due to syntax error etc
	Error

	// EOF is end of file
	EOF

	// EOL is end of line (typically implicit -- used for rule matching)
	EOL

	// EOS is end of statement -- a key meta-token -- in C it is ;, in Go it is either ; or EOL
	EOS

	// Background is for syntax highlight styles based on these tokens
	Background

	// Cat: Keywords (actual keyword is just the string)
	Keyword
	KeywordConstant
	KeywordDeclaration
	KeywordNamespace // incl package, import
	KeywordPseudo
	KeywordReserved
	KeywordType

	// Cat: Names.
	Name
	NameBuiltin       // e.g., true, false -- builtin values..
	NameBuiltinPseudo // e.g., this, self
	NameOther
	NamePseudo

	// SubCat: Type names
	NameType
	NameClass
	NameStruct
	NameField
	NameInterface
	NameConstant
	NameEnum
	NameEnumMember
	NameArray // includes slice etc
	NameMap
	NameObject
	NameTypeParam // for generics, templates

	// SubCat: Function names
	NameFunction
	NameDecorator     // function-like wrappers in python
	NameFunctionMagic // e.g., __init__ in python
	NameMethod
	NameOperator
	NameConstructor // includes destructor..
	NameException
	NameLabel // e.g., goto label
	NameEvent // for LSP -- not really sure what it is..

	// SubCat: Scoping names
	NameScope
	NameNamespace
	NameModule
	NamePackage
	NameLibrary

	// SubCat: NameVar -- variable names
	NameVar
	NameVarAnonymous
	NameVarClass
	NameVarGlobal
	NameVarInstance
	NameVarMagic
	NameVarParam

	// SubCat: Value -- data-like elements
	NameValue
	NameTag // e.g., HTML tag
	NameProperty
	NameAttribute // e.g., HTML attr
	NameEntity    // special entities. (e.g. &nbsp; in HTML).  seems like other..

	// Cat: Literals.
	Literal
	LiteralDate
	LiteralOther
	LiteralBool

	// SubCat: Literal Strings.
	LitStr
	LitStrAffix // unicode specifiers etc
	LitStrAtom
	LitStrBacktick
	LitStrBoolean
	LitStrChar
	LitStrDelimiter
	LitStrDoc // doc-specific strings where syntactically noted
	LitStrDouble
	LitStrEscape   // esc sequences within strings
	LitStrHeredoc  // in ruby, perl
	LitStrInterpol // interpolated parts of strings in #{foo} in Ruby
	LitStrName
	LitStrOther
	LitStrRegex
	LitStrSingle
	LitStrSymbol
	LitStrFile // filename

	// SubCat: Literal Numbers.
	LitNum
	LitNumBin
	LitNumFloat
	LitNumHex
	LitNumInteger
	LitNumIntegerLong
	LitNumOct
	LitNumImag

	// Cat: Operators.
	Operator
	OperatorWord

	// SubCat: Math operators
	OpMath
	OpMathAdd // +
	OpMathSub // -
	OpMathMul // *
	OpMathDiv // /
	OpMathRem // %

	// SubCat: Bitwise operators
	OpBit
	OpBitAnd        // &
	OpBitOr         // |
	OpBitNot        // ~
	OpBitXor        // ^
	OpBitShiftLeft  // <<
	OpBitShiftRight // >>
	OpBitAndNot     // &^

	// SubCat: Assign operators
	OpAsgn
	OpAsgnAssign // =
	OpAsgnInc    // ++
	OpAsgnDec    // --
	OpAsgnArrow  // <-
	OpAsgnDefine // :=

	// SubCat: Math Assign operators
	OpMathAsgn
	OpMathAsgnAdd // +=
	OpMathAsgnSub // -=
	OpMathAsgnMul // *=
	OpMathAsgnDiv // /=
	OpMathAsgnRem // %=

	// SubCat: Bitwise Assign operators
	OpBitAsgn
	OpBitAsgnAnd        // &=
	OpBitAsgnOr         // |=
	OpBitAsgnXor        // ^=
	OpBitAsgnShiftLeft  // <<=
	OpBitAsgnShiftRight // >>=
	OpBitAsgnAndNot     // &^=

	// SubCat: Logical operators
	OpLog
	OpLogAnd // &&
	OpLogOr  // ||
	OpLogNot // !

	// SubCat: Relational operators
	OpRel
	OpRelEqual    // ==
	OpRelNotEqual // !=
	OpRelLess     // <
	OpRelGreater  // >
	OpRelLtEq     // <=
	OpRelGtEq     // >=

	// SubCat: List operators
	OpList
	OpListEllipsis // ...

	// Cat: Punctuation.
	Punctuation

	// SubCat: Grouping punctuation
	PunctGp
	PunctGpLParen // (
	PunctGpRParen // )
	PunctGpLBrack // [
	PunctGpRBrack // ]
	PunctGpLBrace // {
	PunctGpRBrace // }

	// SubCat: Separator punctuation
	PunctSep
	PunctSepComma     // ,
	PunctSepPeriod    // .
	PunctSepSemicolon // ;
	PunctSepColon     // :

	// SubCat: String punctuation
	PunctStr
	PunctStrDblQuote // "
	PunctStrQuote    // '
	PunctStrBacktick // `
	PunctStrEsc      // \

	// Cat: Comments.
	Comment
	CommentHashbang
	CommentMultiline
	CommentSingle
	CommentSpecial

	// SubCat: Preprocessor "comments".
	CommentPreproc
	CommentPreprocFile

	// Cat: Text.
	Text
	TextWhitespace
	TextSymbol
	TextPunctuation
	TextSpellErr

	// SubCat: TextStyle (corresponds to Generic in chroma / pygments) todo: look in font deco for more
	TextStyle
	TextStyleDeleted // strike-through
	TextStyleEmph    // italics
	TextStyleError
	TextStyleHeading
	TextStyleInserted
	TextStyleOutput
	TextStylePrompt
	TextStyleStrong // bold
	TextStyleSubheading
	TextStyleTraceback
	TextStyleUnderline
	TextStyleLink
)

The list of tokens
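
To make the category structure concrete, here is a minimal sketch using the Cat, SubCat, InCat, InSubCat, IsCat, and IsSubCat methods documented below (placeholder import path; the init calls are defensive in case the maps are not already built):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	token.InitCatMap()
	token.InitSubCatMap()

	tk := token.LitNumFloat
	fmt.Println(tk.Cat())    // Literal: the category this token lives in
	fmt.Println(tk.SubCat()) // LitNum: its sub-category
	fmt.Println(tk.InCat(token.Literal), tk.InSubCat(token.LitNum)) // true true
	fmt.Println(token.Literal.IsCat(), token.LitNum.IsSubCat())     // true true
}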

const TokensN Tokens = 177

TokensN is the highest valid value for type Tokens, plus one.

func TokensValues

func TokensValues() []Tokens

TokensValues returns all possible values for the type Tokens.

func (Tokens) Cat

func (tk Tokens) Cat() Tokens

Cat returns the category that a given token lives in, using CatMap

func (Tokens) ClassName

func (tk Tokens) ClassName() string

ClassName returns the '.'-prefixed CSS class name of the tag style for styling; a CSS property should exist with this name.
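
As a hedged sketch of using ClassName when emitting highlighting CSS (placeholder import path; the emitted selectors depend on the package's tag names):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	// Emit a skeleton CSS rule per token; ClassName yields the '.'-prefixed
	// selector for each token's tag.
	for _, tk := range []token.Tokens{token.Keyword, token.Comment, token.LitStr} {
		fmt.Printf("%s { /* style for %v */ }\n", tk.ClassName(), tk)
	}
}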

func (Tokens) CombineRepeats

func (tk Tokens) CombineRepeats() bool

CombineRepeats returns true for token types where repeated tokens of the same type should be combined together: literals, comments, and text.
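
A hypothetical helper showing how a lexer post-pass might use CombineRepeats (placeholder import path; mergeRepeats is not part of the package):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

// mergeRepeats collapses adjacent tokens of the same type when that type
// reports CombineRepeats (literals, comments, text).
func mergeRepeats(toks []token.Tokens) []token.Tokens {
	var out []token.Tokens
	for _, tk := range toks {
		if n := len(out); n > 0 && out[n-1] == tk && tk.CombineRepeats() {
			continue // fold into the previous run of the same type
		}
		out = append(out, tk)
	}
	return out
}

func main() {
	token.InitCatMap() // defensive: CombineRepeats may rely on the category map
	in := []token.Tokens{token.Text, token.Text, token.OpMathAdd, token.OpMathAdd}
	fmt.Println(mergeRepeats(in)) // Text runs are combined; operators are not
}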

func (Tokens) Desc

func (i Tokens) Desc() string

Desc returns the description of the Tokens value.

func (Tokens) Icon added in v0.1.4

func (tk Tokens) Icon() icons.Icon

Icon returns the appropriate icon for the type of lexical item this is.

func (Tokens) InCat

func (tk Tokens) InCat(other Tokens) bool

InCat returns true if this token is in the same category as the given token.

func (Tokens) InSubCat

func (tk Tokens) InSubCat(other Tokens) bool

InSubCat returns true if this token is in the same sub-category as the given token.

func (Tokens) Int64

func (i Tokens) Int64() int64

Int64 returns the Tokens value as an int64.

func (Tokens) IsAmbigUnaryOp

func (tk Tokens) IsAmbigUnaryOp() bool

IsAmbigUnaryOp returns true if this token is an operator that could be either a unary or a binary operator and thus needs special matching. This includes * and &, which are used for address operations in C-like languages.
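
A small sketch of checking for ambiguous operators (placeholder import path); a parser would use this to decide whether unary/binary disambiguation from context is needed:

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	// * and & can be binary (a * b, a & b) or unary (*p, &x) in C-like
	// languages, so they need context to disambiguate; && cannot be unary.
	for _, op := range []token.Tokens{token.OpMathMul, token.OpBitAnd, token.OpLogAnd} {
		fmt.Println(op, op.IsAmbigUnaryOp())
	}
}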

func (Tokens) IsCat

func (tk Tokens) IsCat() bool

IsCat returns true if this is a category-level token

func (Tokens) IsCode

func (tk Tokens) IsCode() bool

IsCode returns true if this token is in the Keyword, Name, Operator, or Punctuation categories. These are recognized code (program) elements that can usefully be distinguished from other forms of raw text (e.g., for spell checking).
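
For instance, a spell-checking pass might skip code elements and only consider comments and text, as in this hypothetical sketch (placeholder import path; spellCheckable is not part of the package):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

// spellCheckable reports whether a token's text should be fed to a spell
// checker: skip recognized code elements, keep comments, text, etc.
func spellCheckable(tk token.Tokens) bool {
	return !tk.IsCode()
}

func main() {
	token.InitCatMap() // defensive: IsCode may rely on the category map
	fmt.Println(spellCheckable(token.CommentSingle)) // true: comments are not code elements
	fmt.Println(spellCheckable(token.Keyword))       // false: keywords are code
}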

func (Tokens) IsKeyword

func (tk Tokens) IsKeyword() bool

IsKeyword returns true if this token is in the Keyword category.

func (Tokens) IsPunctGpLeft

func (tk Tokens) IsPunctGpLeft() bool

IsPunctGpLeft returns true if the token is a left grouping punctuation token (left paren, brace, or bracket).

func (Tokens) IsPunctGpRight

func (tk Tokens) IsPunctGpRight() bool

IsPunctGpRight returns true if the token is a right grouping punctuation token (right paren, brace, or bracket).

func (Tokens) IsSubCat

func (tk Tokens) IsSubCat() bool

IsSubCat returns true if this is a sub-category-level token

func (Tokens) IsUnaryOp

func (tk Tokens) IsUnaryOp() bool

IsUnaryOp returns true if this token is an operator that is typically used as a unary operator: - + & * ! ^ <-

func (Tokens) MarshalText

func (i Tokens) MarshalText() ([]byte, error)

MarshalText implements the encoding.TextMarshaler interface.

func (Tokens) Match

func (tk Tokens) Match(otk Tokens) bool

Match returns true if the two tokens match, in a category / subcategory sensitive manner: if the receiver token is a category, then it matches the other token if it is in the same category, and likewise for subcategory.
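
A brief sketch of the category-sensitive matching (placeholder import path):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	token.InitCatMap()
	token.InitSubCatMap()

	fmt.Println(token.Operator.Match(token.OpMathAdd))  // true: category-level receiver, same category
	fmt.Println(token.OpMath.Match(token.OpMathAdd))    // true: sub-category-level receiver, same sub-category
	fmt.Println(token.OpMathAdd.Match(token.OpMathSub)) // false: two distinct concrete tokens
}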

func (Tokens) Parent

func (tk Tokens) Parent() Tokens

Parent returns the closest parent-level of this token (subcat or cat)

func (Tokens) PunctGpMatch

func (tk Tokens) PunctGpMatch() Tokens

PunctGpMatch returns the matching token for given PunctGp token
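
Together with IsPunctGpLeft and IsPunctGpRight above, PunctGpMatch supports simple bracket balancing, as in this hypothetical sketch (placeholder import path; checkBalanced is not part of the package):

package main

import (
	"fmt"

	token "example.org/lex/token" // hypothetical import path
)

// checkBalanced reports whether the grouping punctuation in a token stream
// is properly nested, using a stack of open brackets.
func checkBalanced(toks []token.Tokens) bool {
	var stack []token.Tokens
	for _, tk := range toks {
		switch {
		case tk.IsPunctGpLeft():
			stack = append(stack, tk)
		case tk.IsPunctGpRight():
			n := len(stack)
			if n == 0 || stack[n-1].PunctGpMatch() != tk {
				return false // unopened or mismatched closer
			}
			stack = stack[:n-1]
		}
	}
	return len(stack) == 0 // anything left open is unbalanced
}

func main() {
	ok := checkBalanced([]token.Tokens{
		token.PunctGpLParen, token.PunctGpLBrack,
		token.PunctGpRBrack, token.PunctGpRParen,
	})
	fmt.Println(ok) // true
}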

func (*Tokens) SetInt64

func (i *Tokens) SetInt64(in int64)

SetInt64 sets the Tokens value from an int64.

func (*Tokens) SetString

func (i *Tokens) SetString(s string) error

SetString sets the Tokens value from its string representation, and returns an error if the string is invalid.
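
A small round-trip sketch using String and SetString (placeholder import path):

package main

import (
	"fmt"
	"log"

	token "example.org/lex/token" // hypothetical import path
)

func main() {
	s := token.KeywordType.String() // canonical string form of the value

	var tk token.Tokens
	if err := tk.SetString(s); err != nil { // parse it back
		log.Fatal(err)
	}
	fmt.Println(tk == token.KeywordType) // true

	// An invalid name yields an error rather than silently succeeding.
	var bad token.Tokens
	fmt.Println(bad.SetString("NotAToken") != nil) // true
}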

func (Tokens) String

func (i Tokens) String() string

String returns the string representation of this Tokens value.

func (Tokens) StyleName

func (tk Tokens) StyleName() string

StyleName returns the abbreviated 2-3 letter style name of the tag

func (Tokens) SubCat

func (tk Tokens) SubCat() Tokens

SubCat returns the sub-category that a given token lives in, using SubCatMap

func (*Tokens) UnmarshalText

func (i *Tokens) UnmarshalText(text []byte) error

UnmarshalText implements the encoding.TextUnmarshaler interface.

func (Tokens) Values

func (i Tokens) Values() []enums.Enum

Values returns all possible values for the type Tokens.
