parser

package
v1.7.4
Warning

This package is not in the latest version of its module.

Published: Jun 25, 2024 License: MIT Imports: 10 Imported by: 835

Documentation

Overview

Package parser provides the types and functions used to parse Markdown text.


Constants

This section is empty.

Variables

var LinkReferenceParagraphTransformer = &linkReferenceParagraphTransformer{}

LinkReferenceParagraphTransformer is a ParagraphTransformer implementation that parses and extracts link references from paragraphs.

Functions

func DefaultBlockParsers

func DefaultBlockParsers() []util.PrioritizedValue

DefaultBlockParsers returns a new list of default BlockParsers. Priorities of default BlockParsers are:

SetextHeadingParser, 100
ThematicBreakParser, 200
ListParser, 300
ListItemParser, 400
CodeBlockParser, 500
ATXHeadingParser, 600
FencedCodeBlockParser, 700
BlockquoteParser, 800
HTMLBlockParser, 900
ParagraphParser, 1000

func DefaultInlineParsers

func DefaultInlineParsers() []util.PrioritizedValue

DefaultInlineParsers returns a new list of default InlineParsers. Priorities of default InlineParsers are:

CodeSpanParser, 100
LinkParser, 200
AutoLinkParser, 300
RawHTMLParser, 400
EmphasisParser, 500

func DefaultParagraphTransformers

func DefaultParagraphTransformers() []util.PrioritizedValue

DefaultParagraphTransformers returns a new list of default ParagraphTransformers. Priorities of default ParagraphTransformers are:

LinkReferenceParagraphTransformer, 100

func ProcessDelimiters

func ProcessDelimiters(bottom ast.Node, pc Context)

ProcessDelimiters processes the delimiter list in the context. Processing stops when it reaches the bottom.

If you implement an inline parser that can have other inline nodes as children, you should call this function when the nesting span has closed.

Types

type ASTTransformer

type ASTTransformer interface {
	// Transform transforms the given AST tree.
	Transform(node *ast.Document, reader text.Reader, pc Context)
}

An ASTTransformer transforms the entire AST of a Markdown document.

type Attribute added in v1.0.10

type Attribute struct {
	Name  []byte
	Value interface{}
}

An Attribute is an attribute of a Markdown element.

type Attributes added in v1.0.10

type Attributes []Attribute

Attributes is a collection of attributes.

func ParseAttributes added in v1.0.8

func ParseAttributes(reader text.Reader) (Attributes, bool)

ParseAttributes parses attributes from the reader. It returns the parsed attributes and true if the attributes could be parsed, otherwise nil and false.

func (Attributes) Find added in v1.0.10

func (as Attributes) Find(name []byte) (interface{}, bool)

Find returns (value, true) if an attribute with the given name is found, otherwise (nil, false).

type Block

type Block struct {
	// Node is a BlockNode.
	Node ast.Node
	// Parser is a BlockParser.
	Parser BlockParser
}

A Block struct holds a node and its corresponding parser.

type BlockParser

type BlockParser interface {
	// Trigger returns a list of characters that trigger the Open method of
	// this parser.
	// If Trigger returns nil, Open will be called for every line.
	Trigger() []byte

	// Open parses the current line and returns a result of parsing.
	//
	// Open must not parse beyond the current line.
	// If Open has been able to parse the current line, Open must advance a reader
	// position by consumed byte length.
	//
	// If Open has not been able to parse the current line, Open should return
	// (nil, NoChildren). If Open has been able to parse the current line, Open
	// should return a new Block node together with HasChildren or NoChildren.
	Open(parent ast.Node, reader text.Reader, pc Context) (ast.Node, State)

	// Continue parses the current line and returns a result of parsing.
	//
	// Continue must not parse beyond the current line.
	// If Continue has been able to parse the current line, Continue must advance
	// a reader position by consumed byte length.
	//
	// If Continue has not been able to parse the current line, Continue should
	// return Close. If Continue has been able to parse the current line,
	// Continue should return (Continue | NoChildren) or
	// (Continue | HasChildren).
	Continue(node ast.Node, reader text.Reader, pc Context) State

	// Close will be called when the parser returns Close.
	Close(node ast.Node, reader text.Reader, pc Context)

	// CanInterruptParagraph returns true if the parser can interrupt paragraphs,
	// otherwise false.
	CanInterruptParagraph() bool

	// CanAcceptIndentedLine returns true if the parser can open a new node when
	// the given line is indented by more than 3 spaces.
	CanAcceptIndentedLine() bool
}

A BlockParser parses a block-level element such as a Paragraph, List, or Blockquote.

func NewATXHeadingParser

func NewATXHeadingParser(opts ...HeadingOption) BlockParser

NewATXHeadingParser returns a new BlockParser that parses ATX headings.

func NewBlockquoteParser

func NewBlockquoteParser() BlockParser

NewBlockquoteParser returns a new BlockParser that parses blockquotes.

func NewCodeBlockParser

func NewCodeBlockParser() BlockParser

NewCodeBlockParser returns a new BlockParser that parses code blocks.

func NewFencedCodeBlockParser

func NewFencedCodeBlockParser() BlockParser

NewFencedCodeBlockParser returns a new BlockParser that parses fenced code blocks.

func NewHTMLBlockParser

func NewHTMLBlockParser() BlockParser

NewHTMLBlockParser returns a new BlockParser that parses HTML blocks.

func NewListItemParser

func NewListItemParser() BlockParser

NewListItemParser returns a new BlockParser that parses list items.

func NewListParser

func NewListParser() BlockParser

NewListParser returns a new BlockParser that parses lists. This parser must take precedence over the ListItemParser.

func NewParagraphParser

func NewParagraphParser() BlockParser

NewParagraphParser returns a new BlockParser that parses paragraphs.

func NewSetextHeadingParser

func NewSetextHeadingParser(opts ...HeadingOption) BlockParser

NewSetextHeadingParser returns a new BlockParser that parses Setext headings.

func NewThematicBreakParser added in v1.0.9

func NewThematicBreakParser() BlockParser

NewThematicBreakParser returns a new BlockParser that parses thematic breaks.

type CloseBlocker

type CloseBlocker interface {
	// CloseBlock will be called when a block is closed.
	CloseBlock(parent ast.Node, block text.Reader, pc Context)
}

A CloseBlocker interface provides a callback that is called when a block is closed during inline parsing.

type Config

type Config struct {
	Options               map[OptionName]interface{}
	BlockParsers          util.PrioritizedSlice /*<BlockParser>*/
	InlineParsers         util.PrioritizedSlice /*<InlineParser>*/
	ParagraphTransformers util.PrioritizedSlice /*<ParagraphTransformer>*/
	ASTTransformers       util.PrioritizedSlice /*<ASTTransformer>*/
	EscapedSpace          bool
}

A Config struct is a data structure that holds configuration of the Parser.

func NewConfig

func NewConfig() *Config

NewConfig returns a new Config.

type Context

type Context interface {
	// String implements Stringer.
	String() string

	// Get returns a value associated with the given key.
	Get(ContextKey) interface{}

	// ComputeIfAbsent computes a value if a value associated with the given key is absent and returns the value.
	ComputeIfAbsent(ContextKey, func() interface{}) interface{}

	// Set sets the given value to the context.
	Set(ContextKey, interface{})

	// AddReference adds the given reference to this context.
	AddReference(Reference)

	// Reference returns (a reference, true) if a reference associated with
	// the given label exists, otherwise (nil, false).
	Reference(label string) (Reference, bool)

	// References returns a list of references.
	References() []Reference

	// IDs returns a collection of the element ids.
	IDs() IDs

	// BlockOffset returns a first non-space character position on current line.
	// This value is valid only for BlockParser.Open.
	// BlockOffset returns -1 if current line is blank.
	BlockOffset() int

	// SetBlockOffset sets the first non-space character position on the current line.
	// This value is valid only for BlockParser.Open.
	SetBlockOffset(int)

	// BlockIndent returns an indent width on current line.
	// This value is valid only for BlockParser.Open.
	// BlockIndent returns -1 if current line is blank.
	BlockIndent() int

	// SetBlockIndent sets the indent width on the current line.
	// This value is valid only for BlockParser.Open.
	SetBlockIndent(int)

	// FirstDelimiter returns the first delimiter of the current delimiter list.
	FirstDelimiter() *Delimiter

	// LastDelimiter returns the last delimiter of the current delimiter list.
	LastDelimiter() *Delimiter

	// PushDelimiter appends the given delimiter to the tail of the current
	// delimiter list.
	PushDelimiter(delimiter *Delimiter)

	// RemoveDelimiter removes the given delimiter from the current delimiter list.
	RemoveDelimiter(d *Delimiter)

	// ClearDelimiters clears the current delimiter list.
	ClearDelimiters(bottom ast.Node)

	// OpenedBlocks returns the list of blocks that are currently being parsed.
	OpenedBlocks() []Block

	// SetOpenedBlocks sets the list of blocks that are currently being parsed.
	SetOpenedBlocks([]Block)

	// LastOpenedBlock returns the last block that is currently being parsed.
	LastOpenedBlock() Block

	// IsInLinkLabel returns true if current position seems to be in link label.
	IsInLinkLabel() bool
}

A Context interface holds the information that is necessary to parse Markdown text.

func NewContext

func NewContext(options ...ContextOption) Context

NewContext returns a new Context.

type ContextConfig added in v1.1.11

type ContextConfig struct {
	IDs IDs
}

A ContextConfig struct is a data structure that holds configuration of the Context.

type ContextKey

type ContextKey int

ContextKey is a key that is used to set arbitrary values to the context.

var ContextKeyMax ContextKey

ContextKeyMax is a maximum value of the ContextKey.

func NewContextKey

func NewContextKey() ContextKey

NewContextKey returns a new ContextKey value.

type ContextOption added in v1.1.11

type ContextOption func(*ContextConfig)

A ContextOption is a functional option type for the Context.

func WithIDs added in v1.1.11

func WithIDs(ids IDs) ContextOption

WithIDs is a functional option for the Context.

type Delimiter

type Delimiter struct {
	ast.BaseInline

	Segment text.Segment

	// CanOpen is set true if this delimiter can open a span for a new node.
	// See https://spec.commonmark.org/0.30/#can-open-emphasis for details.
	CanOpen bool

	// CanClose is set true if this delimiter can close a span for a new node.
	// See https://spec.commonmark.org/0.30/#can-close-emphasis for details.
	CanClose bool

	// Length is a remaining length of this delimiter.
	Length int

	// OriginalLength is the original length of this delimiter.
	OriginalLength int

	// Char is a character of this delimiter.
	Char byte

	// PreviousDelimiter is a previous sibling delimiter node of this delimiter.
	PreviousDelimiter *Delimiter

	// NextDelimiter is a next sibling delimiter node of this delimiter.
	NextDelimiter *Delimiter

	// Processor is a DelimiterProcessor associated with this delimiter.
	Processor DelimiterProcessor
}

A Delimiter struct represents a delimiter like '*' of the Markdown text.

func NewDelimiter

func NewDelimiter(canOpen, canClose bool, length int, char byte, processor DelimiterProcessor) *Delimiter

NewDelimiter returns a new Delimiter node.

func ScanDelimiter

func ScanDelimiter(line []byte, before rune, min int, processor DelimiterProcessor) *Delimiter

ScanDelimiter scans a delimiter by given DelimiterProcessor.

func (*Delimiter) CalcComsumption

func (d *Delimiter) CalcComsumption(closer *Delimiter) int

CalcComsumption calculates how many characters should be used to open a new span corresponding to the given closer.

func (*Delimiter) ConsumeCharacters

func (d *Delimiter) ConsumeCharacters(n int)

ConsumeCharacters consumes delimiters.

func (*Delimiter) Dump

func (d *Delimiter) Dump(source []byte, level int)

Dump implements Node.Dump.

func (*Delimiter) Inline

func (d *Delimiter) Inline()

Inline implements Inline.Inline.

func (*Delimiter) Kind

func (d *Delimiter) Kind() ast.NodeKind

Kind implements Node.Kind.

func (*Delimiter) Text

func (d *Delimiter) Text(source []byte) []byte

Text implements Node.Text.

type DelimiterProcessor

type DelimiterProcessor interface {
	// IsDelimiter returns true if given character is a delimiter, otherwise false.
	IsDelimiter(byte) bool

	// CanOpenCloser returns true if the given opener can be paired with the given closer, otherwise false.
	CanOpenCloser(opener, closer *Delimiter) bool

	// OnMatch will be called when a new matched delimiter is found.
	// OnMatch should return a new Node corresponding to the matched delimiter.
	OnMatch(consumes int) ast.Node
}

A DelimiterProcessor interface provides a set of functions about Delimiter nodes.

type HeadingConfig

type HeadingConfig struct {
	AutoHeadingID bool
	Attribute     bool
}

A HeadingConfig struct is a data structure that holds the configuration of the heading parsers.

func (*HeadingConfig) SetOption

func (b *HeadingConfig) SetOption(name OptionName, _ interface{})

SetOption implements SetOptioner.

type HeadingOption

type HeadingOption interface {
	Option
	SetHeadingOption(*HeadingConfig)
}

A HeadingOption interface sets options for heading parsers.

func WithAutoHeadingID

func WithAutoHeadingID() HeadingOption

WithAutoHeadingID is a functional option that enables custom heading ids and auto-generated heading ids.

func WithHeadingAttribute

func WithHeadingAttribute() HeadingOption

WithHeadingAttribute is a functional option that enables custom heading attributes.

type IDs

type IDs interface {
	// Generate generates a new element id.
	Generate(value []byte, kind ast.NodeKind) []byte

	// Put puts a given element id to the used ids table.
	Put(value []byte)
}

An IDs interface is a collection of the element ids.

type InlineParser

type InlineParser interface {
	// Trigger returns a list of characters that trigger the Parse method of
	// this parser.
	// Trigger characters must be punctuation or a halfspace.
	// A halfspace triggers this parser when the character is any space
	// character or the head of a line.
	Trigger() []byte

	// Parse parses the given block into an inline node.
	//
	// Parse can parse beyond the current line.
	// If Parse has been able to parse the current line, it must advance a reader
	// position by consumed byte length.
	Parse(parent ast.Node, block text.Reader, pc Context) ast.Node
}

An InlineParser parses an inline-level element such as a CodeSpan or Link.

func NewAutoLinkParser

func NewAutoLinkParser() InlineParser

NewAutoLinkParser returns a new InlineParser that parses autolinks surrounded by '<' and '>'.

func NewCodeSpanParser

func NewCodeSpanParser() InlineParser

NewCodeSpanParser returns a new InlineParser that parses inline code surrounded by '`'.

func NewEmphasisParser

func NewEmphasisParser() InlineParser

NewEmphasisParser returns a new InlineParser that parses emphasis.

func NewLinkParser

func NewLinkParser() InlineParser

NewLinkParser returns a new InlineParser that parses links.

func NewRawHTMLParser

func NewRawHTMLParser() InlineParser

NewRawHTMLParser returns a new InlineParser that parses inline HTML.

type Option

type Option interface {
	SetParserOption(*Config)
}

An Option interface is a functional option type for the Parser.

func WithASTTransformers

func WithASTTransformers(ps ...util.PrioritizedValue) Option

WithASTTransformers is a functional option that allows you to add ASTTransformers to the parser.

func WithAttribute

func WithAttribute() Option

WithAttribute is a functional option that enables custom attributes.

func WithBlockParsers

func WithBlockParsers(bs ...util.PrioritizedValue) Option

WithBlockParsers is a functional option that allows you to add BlockParsers to the parser.

func WithEscapedSpace added in v1.5.1

func WithEscapedSpace() Option

WithEscapedSpace is a functional option indicating that a '\'-escaped half-space (0x20) should not trigger parsers.

func WithInlineParsers

func WithInlineParsers(bs ...util.PrioritizedValue) Option

WithInlineParsers is a functional option that allows you to add InlineParsers to the parser.

func WithOption

func WithOption(name OptionName, value interface{}) Option

WithOption is a functional option that allows you to set an arbitrary option on the parser.

func WithParagraphTransformers

func WithParagraphTransformers(ps ...util.PrioritizedValue) Option

WithParagraphTransformers is a functional option that allows you to add ParagraphTransformers to the parser.

type OptionName

type OptionName string

OptionName is a name of parser options.

type ParagraphTransformer

type ParagraphTransformer interface {
	// Transform transforms the given paragraph.
	Transform(node *ast.Paragraph, reader text.Reader, pc Context)
}

A ParagraphTransformer transforms parsed Paragraph nodes. For example, link references are searched in parsed Paragraphs.

type ParseConfig

type ParseConfig struct {
	Context Context
}

A ParseConfig struct is a data structure that holds configuration of the Parser.Parse.

type ParseOption

type ParseOption func(c *ParseConfig)

A ParseOption is a functional option type for the Parser.Parse.

func WithContext

func WithContext(context Context) ParseOption

WithContext is a functional option that allows you to override the default context.

type Parser

type Parser interface {
	// Parse parses the given Markdown text into AST nodes.
	Parse(reader text.Reader, opts ...ParseOption) ast.Node

	// AddOptions adds the given options to this parser.
	AddOptions(...Option)
}

A Parser interface parses Markdown text into AST nodes.

func NewParser

func NewParser(options ...Option) Parser

NewParser returns a new Parser with given options.

type Reference

type Reference interface {
	// String implements Stringer.
	String() string

	// Label returns a label of the reference.
	Label() []byte

	// Destination returns a destination (URL) of the reference.
	Destination() []byte

	// Title returns a title of the reference.
	Title() []byte
}

A Reference interface represents a link reference in Markdown text.

func NewReference

func NewReference(label, destination, title []byte) Reference

NewReference returns a new Reference.

type SetOptioner

type SetOptioner interface {
	// SetOption sets the given option to the object.
	// Unacceptable options may be passed.
	// Thus implementations must ignore unacceptable options.
	SetOption(name OptionName, value interface{})
}

A SetOptioner interface sets the given option to the object.

type State

type State int

State represents the parser's state. It is designed to be used as a set of bit flags.

const (
	// None is a default value of the [State].
	None State = 1 << iota

	// Continue indicates parser can continue parsing.
	Continue

	// Close indicates parser cannot parse anymore.
	Close

	// HasChildren indicates parser may have child blocks.
	HasChildren

	// NoChildren indicates parser does not have child blocks.
	NoChildren

	// RequireParagraph indicates parser requires that the last node
	// must be a paragraph and is not converted to other nodes by
	// ParagraphTransformers.
	RequireParagraph
)
