Documentation ¶
Overview ¶
Package commonmark provides a CommonMark parser.
Example ¶
package main

import (
	"os"

	"zombiezen.com/go/commonmark"
)

func main() {
	// Convert CommonMark to a parse tree and any link references.
	blocks, refMap := commonmark.Parse([]byte("Hello, **World**!\n"))
	// Render parse tree to HTML.
	commonmark.RenderHTML(os.Stdout, blocks, refMap)
}
Output: <p>Hello, <strong>World</strong>! </p>
Index ¶
- func IsEmailAddress(s string) bool
- func NormalizeURI(s string) string
- func Parse(source []byte) ([]*RootBlock, ReferenceMap)
- func RenderHTML(w io.Writer, blocks []*RootBlock, refMap ReferenceMap) error
- type Block
- func (b *Block) AsNode() Node
- func (b *Block) Child(i int) Node
- func (b *Block) ChildCount() int
- func (b *Block) HeadingLevel() int
- func (b *Block) InfoString() *Inline
- func (b *Block) IsOrderedList() bool
- func (b *Block) IsTightList() bool
- func (b *Block) Kind() BlockKind
- func (b *Block) ListItemNumber(source []byte) int
- func (b *Block) Span() Span
- type BlockKind
- type BlockParser
- type Inline
- func (inline *Inline) AsNode() Node
- func (inline *Inline) Child(i int) *Inline
- func (inline *Inline) ChildCount() int
- func (inline *Inline) IndentWidth() int
- func (inline *Inline) Kind() InlineKind
- func (inline *Inline) LinkDestination() *Inline
- func (inline *Inline) LinkReference() string
- func (inline *Inline) LinkTitle() *Inline
- func (inline *Inline) Span() Span
- func (inline *Inline) Text(source []byte) string
- type InlineKind
- type InlineParser
- type LinkDefinition
- type Node
- type ReferenceMap
- type ReferenceMatcher
- type RootBlock
- type Span
Functions ¶
func IsEmailAddress ¶
IsEmailAddress reports whether the string is a CommonMark email address.
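As a rough illustration of what "a CommonMark email address" means, the sketch below matches strings against the regular expression that the CommonMark 0.30 specification gives for email autolinks. The name isEmailAddressSketch is hypothetical, and whether the package uses this exact pattern internally is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailAutolink is the pattern the CommonMark 0.30 spec gives for email
// autolinks. Whether the package uses this exact regexp is an assumption.
var emailAutolink = regexp.MustCompile(
	"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@" +
		"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?" +
		"(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$",
)

// isEmailAddressSketch is a hypothetical stand-in for IsEmailAddress.
func isEmailAddressSketch(s string) bool {
	return emailAutolink.MatchString(s)
}

func main() {
	for _, s := range []string{
		"foo@bar.example.com",
		"foo bar@example.com", // spaces are not allowed
		"no-at-sign",
	} {
		fmt.Printf("%q: %t\n", s, isEmailAddressSketch(s))
	}
}
```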
func NormalizeURI ¶
NormalizeURI percent-encodes any characters in a string that are not reserved or unreserved URI characters. This is commonly used for transforming CommonMark link destinations into strings suitable for href or src attributes.
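The following is a minimal sketch of the percent-encoding idea described above, not the package's implementation. It keeps ASCII letters, digits, and the RFC 3986 reserved and unreserved punctuation, and encodes every other byte. The function name normalizeURISketch is hypothetical, and passing '%' through unchanged (so existing escapes survive) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// keep lists the RFC 3986 reserved and unreserved punctuation.
// '%' is included so existing percent-escapes pass through unchanged,
// which is an assumption about the desired behavior.
const keep = "-._~:/?#[]@!$&'()*+,;=%"

// normalizeURISketch percent-encodes every byte of s that is not an ASCII
// letter, an ASCII digit, or one of the characters in keep.
func normalizeURISketch(s string) string {
	const hexDigits = "0123456789ABCDEF"
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		isAlnum := 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9'
		if isAlnum || strings.IndexByte(keep, c) >= 0 {
			sb.WriteByte(c)
			continue
		}
		sb.WriteByte('%')
		sb.WriteByte(hexDigits[c>>4])
		sb.WriteByte(hexDigits[c&0x0f])
	}
	return sb.String()
}

func main() {
	fmt.Println(normalizeURISketch("https://example.com/a b?q=é"))
	// Prints: https://example.com/a%20b?q=%C3%A9
}
```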
func Parse ¶
func Parse(source []byte) ([]*RootBlock, ReferenceMap)
Parse parses an in-memory UTF-8 CommonMark document and returns its blocks along with a map of its link reference definitions. As long as source does not contain NUL bytes, the blocks will use the original byte slice as their source.
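The NUL-byte caveat can be pictured with the sketch below. It is an illustration under assumptions, not the package's code: the helper name replaceNUL is hypothetical, and it simply returns the input slice untouched when no NUL byte is present and otherwise returns a copy with each NUL replaced by U+FFFD.

```go
package main

import (
	"bytes"
	"fmt"
)

// replaceNUL returns source itself (sharing its backing array) when it holds
// no NUL bytes. Otherwise it returns a copy in which every NUL byte is
// replaced by the UTF-8 encoding of U+FFFD, which is three bytes long.
func replaceNUL(source []byte) []byte {
	if bytes.IndexByte(source, 0) < 0 {
		return source
	}
	return bytes.ReplaceAll(source, []byte{0}, []byte("\uFFFD"))
}

func main() {
	clean := []byte("Hello\n")
	out := replaceNUL(clean)
	fmt.Println(&out[0] == &clean[0]) // true: same backing array

	dirty := []byte("a\x00b\n")
	fmt.Println(len(dirty), len(replaceNUL(dirty))) // 4 6
}
```

Because the replacement character is longer than the byte it replaces, byte offsets into the rewritten copy no longer line up with offsets into the original input.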
func RenderHTML ¶
func RenderHTML(w io.Writer, blocks []*RootBlock, refMap ReferenceMap) error
RenderHTML writes the given sequence of parsed blocks to the given writer as HTML. It will return the first error encountered, if any.
Types ¶
type Block ¶
type Block struct {
// contains filtered or unexported fields
}
A Block is a structural element in a CommonMark document.
func (*Block) ChildCount ¶
ChildCount returns the number of children the node has. Calling ChildCount on nil returns 0.
func (*Block) HeadingLevel ¶
HeadingLevel returns the 1-based level for an ATXHeadingKind or SetextHeadingKind, or zero otherwise.
func (*Block) InfoString ¶
InfoString returns the info string node for a FencedCodeBlockKind block or nil otherwise.
func (*Block) IsOrderedList ¶
IsOrderedList reports whether the block is an ordered list or an ordered list item.
func (*Block) IsTightList ¶
IsTightList reports whether the block is a tight list or a tight list item.
func (*Block) ListItemNumber ¶
ListItemNumber returns the number of a ListItemKind block or -1 if the block does not represent an ordered list item.
type BlockKind ¶
type BlockKind uint16
BlockKind is an enumeration of values returned by *Block.Kind.
const (
	// ParagraphKind is used for a block of text.
	ParagraphKind BlockKind = 1 + iota
	// ThematicBreakKind is used for a thematic break, also known as a horizontal rule.
	// It will not contain children.
	ThematicBreakKind
	// ATXHeadingKind is used for headings that start with hash marks.
	ATXHeadingKind
	// SetextHeadingKind is used for headings that end with a divider.
	SetextHeadingKind
	// IndentedCodeBlockKind is used for code blocks started by indentation.
	IndentedCodeBlockKind
	// FencedCodeBlockKind is used for code blocks started by backticks or tildes.
	FencedCodeBlockKind
	// HTMLBlockKind is used for blocks of raw HTML.
	// It should not be wrapped by any tags in rendered HTML output.
	HTMLBlockKind
	// LinkReferenceDefinitionKind is used for a [link reference definition].
	// The first child is always a [LinkLabelKind],
	// the second child is always a [LinkDestinationKind],
	// and it may end with an optional [LinkTitleKind].
	//
	// [link reference definition]: https://spec.commonmark.org/0.30/#link-reference-definition
	LinkReferenceDefinitionKind
	// BlockQuoteKind is used for block quotes.
	BlockQuoteKind
	// ListItemKind is used for items in an ordered or unordered list.
	// The first child will always be of [ListMarkerKind].
	// If the item contains a paragraph and the item is "tight",
	// then the paragraph tag should be stripped.
	ListItemKind
	// ListKind is used for ordered or unordered lists.
	ListKind
	// ListMarkerKind is used to contain the marker in a [ListItemKind] node.
	// It is typically not rendered directly.
	ListMarkerKind
)
func (BlockKind) IsCode ¶
IsCode reports whether the kind is IndentedCodeBlockKind or FencedCodeBlockKind.
func (BlockKind) IsHeading ¶
IsHeading reports whether the kind is ATXHeadingKind or SetextHeadingKind.
type BlockParser ¶
type BlockParser struct {
// contains filtered or unexported fields
}
A BlockParser splits a CommonMark document into blocks.
Example ¶
package main

import (
	"io"
	"os"
	"strings"

	"zombiezen.com/go/commonmark"
)

func main() {
	input := strings.NewReader(
		"Hello, [World][]!\n" +
			"\n" +
			"[World]: https://www.example.com/\n",
	)

	// Parse document into blocks (e.g. paragraphs, lists, etc.)
	// and collect link reference definitions.
	parser := commonmark.NewBlockParser(input)
	var blocks []*commonmark.RootBlock
	refMap := make(commonmark.ReferenceMap)
	for {
		block, err := parser.NextBlock()
		if err == io.EOF {
			break
		}
		if err != nil {
			// Not expecting an error from a string.
			panic(err)
		}

		// Add block to list.
		blocks = append(blocks, block)
		// Add any link reference definitions to map.
		refMap.Extract(block.Source, block.AsNode())
	}

	// Finish parsing inside blocks.
	inlineParser := &commonmark.InlineParser{
		ReferenceMatcher: refMap,
	}
	for _, block := range blocks {
		inlineParser.Rewrite(block)
	}

	// Render blocks as HTML.
	commonmark.RenderHTML(os.Stdout, blocks, refMap)
}
Output: <p>Hello, <a href="https://www.example.com/">World</a>! </p>
func NewBlockParser ¶
func NewBlockParser(r io.Reader) *BlockParser
NewBlockParser returns a block parser that reads from r.
Block parsers maintain their own buffering and may read data from r beyond the blocks requested.
func (*BlockParser) NextBlock ¶
func (p *BlockParser) NextBlock() (*RootBlock, error)
NextBlock reads the next top-level block in the document, returning the first error encountered. Blocks returned by NextBlock will typically contain UnparsedKind nodes for any text: use *InlineParser.Rewrite to complete parsing.
type Inline ¶
type Inline struct {
// contains filtered or unexported fields
}
Inline represents CommonMark content elements like text, links, or emphasis.
func (*Inline) ChildCount ¶
ChildCount returns the number of children the node has. Calling ChildCount on nil returns 0.
func (*Inline) IndentWidth ¶
IndentWidth returns the number of spaces the IndentKind span represents, or zero if the node is nil or of a different type.
func (*Inline) Kind ¶
func (inline *Inline) Kind() InlineKind
Kind returns the type of inline node or zero if the node is nil.
func (*Inline) LinkDestination ¶
LinkDestination returns the destination child of a LinkKind node or nil if none is present or the node is not a link.
func (*Inline) LinkReference ¶
LinkReference returns the normalized form of a link label.
func (*Inline) LinkTitle ¶
LinkTitle returns the title child of a LinkKind node or nil if none is present or the node is not a link.
type InlineKind ¶
type InlineKind uint16
InlineKind is an enumeration of values returned by *Inline.Kind.
const (
	// TextKind is used for literal text.
	TextKind InlineKind = 1 + iota
	// SoftLineBreakKind is rendered as either a space or as a hard line break,
	// depending on the renderer.
	SoftLineBreakKind
	// HardLineBreakKind is rendered as a line break.
	HardLineBreakKind
	// IndentKind represents one or more space characters
	// (the exact number can be retrieved by [*Inline.IndentWidth]).
	// It's placed in the parse tree
	// in situations where the number of logical spaces does not match the source.
	IndentKind
	// CharacterReferenceKind is used for ampersand escape characters
	// (e.g. "&amp;").
	CharacterReferenceKind
	// InfoStringKind is used for the [info string] of a fenced code block.
	// It's typically not rendered directly and its contents are implementation-defined.
	//
	// [info string]: https://spec.commonmark.org/0.30/#info-string
	InfoStringKind
	// EmphasisKind is used for text that has stress emphasis.
	EmphasisKind
	// StrongKind is used for text that has strong emphasis.
	StrongKind
	// LinkKind is used for hyperlinks.
	// The [*Inline.LinkDestination], [*Inline.LinkTitle], and [*Inline.LinkReference] methods
	// can be used to retrieve specific parts of the link.
	LinkKind
	// ImageKind is used for images.
	// The contents of the node are used as the image's text description.
	// Otherwise, ImageKind is similar to [LinkKind].
	ImageKind
	// LinkDestinationKind is used as part of links and images
	// to indicate the destination or image source, respectively.
	LinkDestinationKind
	// LinkTitleKind is used as part of links and images
	// to hold advisory text typically rendered as a tooltip.
	LinkTitleKind
	// LinkLabelKind is used as either a link reference definition label
	// or in a link or image to reference a link reference definition.
	LinkLabelKind
	// CodeSpanKind is used for inline code in a non-code-block context.
	CodeSpanKind
	// AutolinkKind is used for [autolinks].
	// The node's content is also the link's destination.
	//
	// [autolinks]: https://spec.commonmark.org/0.30/#autolinks
	AutolinkKind
	// HTMLTagKind is a container for one or more [RawHTMLKind] nodes
	// that represents an open tag, a closing tag, an HTML comment,
	// a processing instruction, a declaration, or a CDATA section.
	HTMLTagKind
	// RawHTMLKind is a text node that should be reproduced in HTML verbatim.
	RawHTMLKind
	// UnparsedKind is used for inline text that has not been tokenized.
	UnparsedKind
)
func (InlineKind) String ¶ added in v0.2.0
func (i InlineKind) String() string
type InlineParser ¶
type InlineParser struct {
ReferenceMatcher ReferenceMatcher
}
An InlineParser converts UnparsedKind Inline nodes into inline trees.
func (*InlineParser) Rewrite ¶
func (p *InlineParser) Rewrite(root *RootBlock)
Rewrite replaces any UnparsedKind nodes in the given root block with parsed versions of the node.
type LinkDefinition ¶
LinkDefinition is the data of a link reference definition.
type Node ¶
type Node struct {
// contains filtered or unexported fields
}
Node is a pointer to a Block or an Inline.
func (Node) Block ¶
Block returns the referenced block or nil if the pointer does not reference a block.
type ReferenceMap ¶
type ReferenceMap map[string]LinkDefinition
ReferenceMap is a mapping of normalized labels to link definitions.
func (ReferenceMap) Extract ¶
func (m ReferenceMap) Extract(source []byte, node Node)
Extract adds any link reference definitions contained in node to the map. In case of conflicts, Extract will not replace any existing definitions in the map and will use the first definition in source order.
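The "first definition wins" rule can be sketched with a plain Go map. The types and method names below (linkDef, refMap, addFirst, has) are hypothetical stand-ins: the fields of LinkDefinition are not shown in this documentation, and this is not the package's code.

```go
package main

import "fmt"

// linkDef is a hypothetical stand-in for LinkDefinition.
// The real type's fields are not shown in this documentation.
type linkDef struct {
	Destination string
}

// refMap mirrors the documented shape of ReferenceMap:
// a map from normalized label to definition.
type refMap map[string]linkDef

// addFirst records def under label only if label is not already present,
// so the earliest definition in source order is the one that is kept.
func (m refMap) addFirst(label string, def linkDef) {
	if _, exists := m[label]; !exists {
		m[label] = def
	}
}

// has reports whether label is defined, in the spirit of MatchReference.
func (m refMap) has(label string) bool {
	_, ok := m[label]
	return ok
}

func main() {
	m := make(refMap)
	m.addFirst("world", linkDef{Destination: "https://www.example.com/"})
	m.addFirst("world", linkDef{Destination: "https://other.example.com/"})

	fmt.Println(m["world"].Destination)         // https://www.example.com/
	fmt.Println(m.has("world"), m.has("other")) // true false
}
```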
func (ReferenceMap) MatchReference ¶
func (m ReferenceMap) MatchReference(normalizedLabel string) bool
MatchReference reports whether the normalized label appears in the map.
type ReferenceMatcher ¶
A type that implements ReferenceMatcher can be checked for the presence of link reference definitions.
type RootBlock ¶
type RootBlock struct {
	// Source holds the bytes of the block read from the original source.
	// Any NUL bytes will have been replaced with the Unicode Replacement Character.
	Source []byte
	// StartLine is the 1-based line number of the first line of the block.
	StartLine int
	// StartOffset is the byte offset from the beginning of the original source
	// that this block starts at.
	StartOffset int64
	// EndOffset is the byte offset from the beginning of the original source
	// that this block ends at.
	// Unless the original source contained NUL bytes,
	// EndOffset = StartOffset + len(Source).
	EndOffset int64

	Block
}
RootBlock represents a "top-level" block, that is, a block whose parent is the document. Root blocks store their CommonMark source and document position information. All other position information in the tree is relative to the beginning of the root block.
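Since positions inside a root block are relative to that block, mapping them back to the whole document means adding the block's start offset. The sketch below shows that arithmetic with hypothetical local types (span, rootBlock) that only mirror the documented fields. The offsets are hand-picked for the example rather than produced by the parser, and the mapping assumes the original source contained no NUL bytes.

```go
package main

import "fmt"

// span and rootBlock are hypothetical stand-ins that mirror only the
// documented fields of Span and RootBlock needed for this sketch.
type span struct{ Start, End int }

type rootBlock struct {
	Source      []byte
	StartOffset int64
}

// absolute converts a span relative to root into byte offsets in the
// original document. It assumes the original source had no NUL bytes.
func absolute(root rootBlock, s span) (start, end int64) {
	return root.StartOffset + int64(s.Start), root.StartOffset + int64(s.End)
}

func main() {
	doc := []byte("# Title\n\nHello, World!\n")

	// Pretend the paragraph is a root block starting at byte 9.
	root := rootBlock{Source: doc[9:], StartOffset: 9}
	word := span{Start: 7, End: 12} // "World" within the paragraph

	start, end := absolute(root, word)
	fmt.Printf("%d-%d %q\n", start, end, doc[start:end])
	// Prints: 16-21 "World"
}
```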
type Span ¶
type Span struct {
	// Start is the index of the first byte of the span,
	// relative to the beginning of the [RootBlock].
	Start int
	// End is the end index of the span (exclusive),
	// relative to the beginning of the [RootBlock].
	End int
}
Span is a contiguous region of a document, expressed relative to a RootBlock.
func (Span) Intersect ¶
Intersect returns the intersection of two spans or an invalid span if none exists.
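Intersecting two half-open byte ranges can be sketched as follows. The span type and intersect function are hypothetical stand-ins, not the package's code. Two details are assumptions made for this sketch: the invalid span is represented as {-1, -1}, and ranges that merely touch (an empty overlap) are treated as having no intersection.

```go
package main

import "fmt"

// span is a hypothetical stand-in for Span: Start is inclusive, End is exclusive.
type span struct{ Start, End int }

// invalid is how this sketch marks "no span".
// The package's own representation of an invalid span is not shown here.
var invalid = span{Start: -1, End: -1}

// intersect returns the overlap of a and b, or invalid if they do not overlap.
func intersect(a, b span) span {
	start := a.Start
	if b.Start > start {
		start = b.Start
	}
	end := a.End
	if b.End < end {
		end = b.End
	}
	if start >= end {
		return invalid
	}
	return span{Start: start, End: end}
}

func main() {
	fmt.Println(intersect(span{0, 10}, span{5, 15}))  // {5 10}
	fmt.Println(intersect(span{0, 5}, span{5, 10}))   // {-1 -1}
	fmt.Println(intersect(span{2, 4}, span{0, 100}))  // {2 4}
}
```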