blackfriday

package
v0.0.0-...-68dbcc1 Latest Latest
Published: Jun 16, 2014 License: MIT Imports: 5 Imported by: 0

README

Blackfriday

Blackfriday is a Markdown processor implemented in Go. It is paranoid about its input (so you can safely feed it user-supplied data), it is fast, it supports common extensions (tables, smart punctuation substitutions, etc.), and it is safe for all utf-8 (unicode) input.

HTML output is currently supported, along with Smartypants extensions. An experimental LaTeX output engine is also included.

It started as a translation from C of upskirt.

Installation

Blackfriday is compatible with Go 1. If you are using an older release of Go, consider using v1.1 of blackfriday, which was based on the last stable release of Go prior to Go 1. You can find it as a tagged commit on github.

With Go 1 and git installed:

go get github.com/russross/blackfriday

will download, compile, and install the package into your $GOPATH directory hierarchy. Alternatively, you can import it into a project:

import "github.com/russross/blackfriday"

and when you build that project with go build, blackfriday will be downloaded and installed automatically.

For basic usage, it is as simple as getting your input into a byte slice and calling:

output := blackfriday.MarkdownBasic(input)

This renders it with no extensions enabled. To get a more useful feature set, use this instead:

output := blackfriday.MarkdownCommon(input)

If you want to customize the set of options, first get a renderer (currently either the HTML or LaTeX output engines), then use it to call the more general Markdown function. For examples, see the implementations of MarkdownBasic and MarkdownCommon in markdown.go.

You can also check out blackfriday-tool for a more complete example of how to use it. Download and install it using:

go get github.com/russross/blackfriday-tool

This is a simple command-line tool that allows you to process a markdown file using a standalone program. You can also browse the source directly on github if you are just looking for some example code.

Note that if you have not already done so, installing blackfriday-tool will be sufficient to download and install blackfriday in addition to the tool itself. The tool binary will be installed in $GOPATH/bin. This is a statically-linked binary that can be copied to wherever you need it without worrying about dependencies and library versions.

Features

All features of upskirt are supported, including:

  • Compatibility. The Markdown v1.0.3 test suite passes with the --tidy option. Without --tidy, the differences are mostly in whitespace and entity escaping, where blackfriday is more consistent and cleaner.

  • Common extensions, including table support, fenced code blocks, autolinks, strikethroughs, non-strict emphasis, etc.

  • Safety. Blackfriday is paranoid when parsing, making it safe to feed untrusted user input without fear of bad things happening. The test suite stress tests this and there are no known inputs that make it crash. If you find one, please let me know and send me the input that does it.

  • Fast processing. It is fast enough to render on-demand in most web applications without having to cache the output.

  • Thread safety. You can run multiple parsers in different goroutines without ill effect. There is no dependence on global shared state.

  • Minimal dependencies. Blackfriday only depends on standard library packages in Go. The source code is pretty self-contained, so it is easy to add to any project, including Google App Engine projects.

  • Standards compliant. Output successfully validates using the W3C validation tool for HTML 4.01 and XHTML 1.0 Transitional.

Extensions

In addition to the standard markdown syntax, this package implements the following extensions:

  • Intra-word emphasis suppression. The _ character is commonly used inside words when discussing code, so having markdown interpret it as an emphasis command is usually the wrong thing. Blackfriday lets you treat all emphasis markers as normal characters when they occur inside a word.

  • Tables. Tables can be created by drawing them in the input using a simple syntax:

    Name    | Age
    --------|------
    Bob     | 27
    Alice   | 23
    
  • Fenced code blocks. In addition to the normal 4-space indentation to mark code blocks, you can explicitly mark them and supply a language (to make syntax highlighting simple). Just mark it like this:

    ``` go
    func getTrue() bool {
        return true
    }
    ```
    

    You can use 3 or more backticks to mark the beginning of the block, and the same number to mark the end of the block.

  • Autolinking. Blackfriday can find URLs that have not been explicitly marked as links and turn them into links.

  • Strikethrough. Use two tildes (~~) to mark text that should be crossed out.

  • Hard line breaks. With this extension enabled (it is off by default in the MarkdownBasic and MarkdownCommon convenience functions), newlines in the input translate into line breaks in the output.

  • Smart quotes. Smartypants-style punctuation substitution is supported, turning normal double- and single-quote marks into curly quotes, etc.

  • LaTeX-style dash parsing is an additional option, where -- is translated into –, and --- is translated into —. This differs from most smartypants processors, which turn a single hyphen into an ndash and a double hyphen into an mdash.

  • Smart fractions, where anything that looks like a fraction is translated into suitable HTML (instead of just a few special cases like most smartypants processors). For example, 4/5 becomes <sup>4</sup>&frasl;<sub>5</sub>, which renders as ⁴⁄₅.
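The smart fractions rewrite in the last item above can be sketched in a few lines. This is not blackfriday's implementation: smartFraction and allDigits are hypothetical helpers, and the sketch assumes the fraction has already been isolated as a single token of ASCII digits around one slash.

```go
package main

import (
	"fmt"
	"strings"
)

// smartFraction is a hypothetical helper, not blackfriday's implementation.
// It rewrites a token of the form "num/den" into superscript/subscript
// markup and returns any other token unchanged.
func smartFraction(token string) string {
	num, den, ok := strings.Cut(token, "/")
	if !ok || !allDigits(num) || !allDigits(den) {
		return token
	}
	return "<sup>" + num + "</sup>&frasl;<sub>" + den + "</sub>"
}

// allDigits reports whether s is non-empty and contains only ASCII digits.
func allDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(smartFraction("4/5"))    // <sup>4</sup>&frasl;<sub>5</sub>
	fmt.Println(smartFraction("and/or")) // and/or
}
```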

LaTeX Output

A rudimentary LaTeX rendering backend is also included. To see an example of its usage, see main.go in blackfriday-tool.

It renders some basic documents, but is only experimental at this point. In particular, it does not do any inline escaping, so input that happens to look like LaTeX code will be passed through without modification.

Todo

  • More unit testing
  • Markdown pretty-printer output engine
  • Improve unicode support. It does not understand all unicode rules (about what constitutes a letter, a punctuation symbol, etc.), so it may fail to detect word boundaries correctly in some instances. It is safe on all utf-8 input.

License

Blackfriday is distributed under the Simplified BSD License:

Copyright © 2011 Russ Ross. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Documentation

Overview

Blackfriday markdown processor.

Translates plain text with simple formatting rules into HTML or LaTeX.

Constants

const (
	HTML_SKIP_HTML                = 1 << iota // skip preformatted HTML blocks
	HTML_SKIP_STYLE                           // skip embedded <style> elements
	HTML_SKIP_IMAGES                          // skip embedded images
	HTML_SKIP_LINKS                           // skip all links
	HTML_SKIP_SCRIPT                          // skip embedded <script> elements
	HTML_SAFELINK                             // only link to trusted protocols
	HTML_NOFOLLOW_LINKS                       // only link with rel="nofollow"
	HTML_TOC                                  // generate a table of contents
	HTML_OMIT_CONTENTS                        // skip the main contents (for a standalone table of contents)
	HTML_COMPLETE_PAGE                        // generate a complete HTML page
	HTML_GITHUB_BLOCKCODE                     // use github fenced code rendering rules
	HTML_USE_XHTML                            // generate XHTML output instead of HTML
	HTML_USE_SMARTYPANTS                      // enable smart punctuation substitutions
	HTML_SMARTYPANTS_FRACTIONS                // enable smart fractions (with HTML_USE_SMARTYPANTS)
	HTML_SMARTYPANTS_LATEX_DASHES             // enable LaTeX-style dashes (with HTML_USE_SMARTYPANTS)
)

Html renderer configuration options.

const (
	EXTENSION_NO_INTRA_EMPHASIS          = 1 << iota // ignore emphasis markers inside words
	EXTENSION_TABLES                                 // render tables
	EXTENSION_FENCED_CODE                            // render fenced code blocks
	EXTENSION_AUTOLINK                               // detect embedded URLs that are not explicitly marked
	EXTENSION_STRIKETHROUGH                          // strikethrough text using ~~test~~
	EXTENSION_LAX_HTML_BLOCKS                        // loosen up HTML block parsing rules
	EXTENSION_SPACE_HEADERS                          // be strict about prefix header rules
	EXTENSION_HARD_LINE_BREAK                        // translate newlines into line breaks
	EXTENSION_TAB_SIZE_EIGHT                         // expand tabs to eight spaces instead of four
	EXTENSION_FOOTNOTES                              // Pandoc-style footnotes
	EXTENSION_NO_EMPTY_LINE_BEFORE_BLOCK             // no empty line needed before a block (code, quote, ordered list, unordered list)
)

These are the supported markdown parsing extensions. OR these values together to select multiple extensions.
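As a sketch of how the ORed values behave, the example below redeclares the first five extension constants locally so that it compiles without importing the package. The constant names and their order come from the listing above; the helper has is a hypothetical name introduced only for illustration.

```go
package main

import "fmt"

// Local copies of the first five extension constants from the listing above,
// declared here only so the sketch compiles on its own.
const (
	EXTENSION_NO_INTRA_EMPHASIS = 1 << iota
	EXTENSION_TABLES
	EXTENSION_FENCED_CODE
	EXTENSION_AUTOLINK
	EXTENSION_STRIKETHROUGH
)

// has reports whether every bit in flag is set in extensions.
func has(extensions, flag int) bool {
	return extensions&flag == flag
}

func main() {
	// Select two extensions by ORing their bits together.
	extensions := EXTENSION_TABLES | EXTENSION_FENCED_CODE

	fmt.Println(extensions)                          // 6
	fmt.Println(has(extensions, EXTENSION_TABLES))   // true
	fmt.Println(has(extensions, EXTENSION_AUTOLINK)) // false
}
```

Because each constant occupies its own bit, any combination can be stored in the single int that Markdown accepts as its extensions argument.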

const (
	LINK_TYPE_NOT_AUTOLINK = iota
	LINK_TYPE_NORMAL
	LINK_TYPE_EMAIL
)

These are the possible flag values for the link renderer. Only a single one of these values will be used; they are not ORed together. These are mostly of interest if you are writing a new output format.

const (
	LIST_TYPE_ORDERED = 1 << iota
	LIST_ITEM_CONTAINS_BLOCK
	LIST_ITEM_BEGINNING_OF_LIST
	LIST_ITEM_END_OF_LIST
)

These are the possible flag values for the ListItem renderer. Multiple flag values may be ORed together. These are mostly of interest if you are writing a new output format.

const (
	TABLE_ALIGNMENT_LEFT = 1 << iota
	TABLE_ALIGNMENT_RIGHT
	TABLE_ALIGNMENT_CENTER = (TABLE_ALIGNMENT_LEFT | TABLE_ALIGNMENT_RIGHT)
)

These are the possible flag values for the table cell renderer. Only a single one of these values will be used; they are not ORed together. These are mostly of interest if you are writing a new output format.
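A new output format might decode the alignment value along these lines. The constants are local copies of the block above; alignName is a hypothetical helper, and treating any other value as "no explicit alignment" is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// Local copies of the table alignment constants listed above.
const (
	TABLE_ALIGNMENT_LEFT = 1 << iota
	TABLE_ALIGNMENT_RIGHT
	TABLE_ALIGNMENT_CENTER = (TABLE_ALIGNMENT_LEFT | TABLE_ALIGNMENT_RIGHT)
)

// alignName is a hypothetical helper that maps an alignment value to a name.
// It compares for equality rather than testing bits, because CENTER shares
// its bits with both LEFT and RIGHT.
func alignName(align int) string {
	switch align {
	case TABLE_ALIGNMENT_LEFT:
		return "left"
	case TABLE_ALIGNMENT_RIGHT:
		return "right"
	case TABLE_ALIGNMENT_CENTER:
		return "center"
	default:
		return "" // assumption: any other value means no explicit alignment
	}
}

func main() {
	fmt.Println(alignName(TABLE_ALIGNMENT_LEFT))   // left
	fmt.Println(alignName(TABLE_ALIGNMENT_CENTER)) // center
}
```

A bit test such as align&TABLE_ALIGNMENT_LEFT != 0 would also match centered cells, which is why the sketch uses a switch on the whole value.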

const (
	TAB_SIZE_DEFAULT = 4
	TAB_SIZE_EIGHT   = 8
)

The size of a tab stop.

const VERSION = "1.1"

Variables

This section is empty.

Functions

func Markdown

func Markdown(input []byte, renderer Renderer, extensions int) []byte

Markdown is the main rendering function. It parses and renders a block of markdown-encoded text. The supplied Renderer is used to format the output, and extensions dictates which non-standard extensions are enabled.

To use the supplied Html or LaTeX renderers, see HtmlRenderer and LatexRenderer, respectively.

func MarkdownBasic

func MarkdownBasic(input []byte) []byte

MarkdownBasic is a convenience function for simple rendering. It processes markdown input with no extensions enabled.

func MarkdownCommon

func MarkdownCommon(input []byte) []byte

MarkdownCommon is a convenience function for simple rendering. It calls Markdown with the most useful extensions enabled, including:

* Smartypants processing with smart fractions and LaTeX dashes

* Intra-word emphasis suppression

* Tables

* Fenced code blocks

* Autolinking

* Strikethrough support

* Strict header parsing

Types

type Html

type Html struct {
	// contains filtered or unexported fields
}

Html is a type that implements the Renderer interface for HTML output.

Do not create this directly, instead use the HtmlRenderer function.

func (*Html) AutoLink

func (options *Html) AutoLink(out *bytes.Buffer, link []byte, kind int)

func (*Html) BlockCode

func (options *Html) BlockCode(out *bytes.Buffer, text []byte, lang string)

func (*Html) BlockCodeGithub

func (options *Html) BlockCodeGithub(out *bytes.Buffer, text []byte, lang string)

GitHub style code block:

<pre lang="LANG"><code>
...
</code></pre>

Unlike other parsers, we store the language identifier in the <pre>, and don't let the user generate custom classes.

The language identifier in the <pre> block gets postprocessed and all the code inside gets syntax highlighted with Pygments. This is much safer than letting the user specify a CSS class for highlighting.

Note that we only generate HTML for the first specifier. E.g.

~~~~ {.python .numbered}        =>      <pre lang="python"><code>

func (*Html) BlockCodeNormal

func (options *Html) BlockCodeNormal(out *bytes.Buffer, text []byte, lang string)

func (*Html) BlockHtml

func (options *Html) BlockHtml(out *bytes.Buffer, text []byte)

func (*Html) BlockQuote

func (options *Html) BlockQuote(out *bytes.Buffer, text []byte)

func (*Html) CodeSpan

func (options *Html) CodeSpan(out *bytes.Buffer, text []byte)

func (*Html) DocumentFooter

func (options *Html) DocumentFooter(out *bytes.Buffer)

func (*Html) DocumentHeader

func (options *Html) DocumentHeader(out *bytes.Buffer)

func (*Html) DoubleEmphasis

func (options *Html) DoubleEmphasis(out *bytes.Buffer, text []byte)

func (*Html) Emphasis

func (options *Html) Emphasis(out *bytes.Buffer, text []byte)

func (*Html) Entity

func (options *Html) Entity(out *bytes.Buffer, entity []byte)

func (*Html) FootnoteItem

func (options *Html) FootnoteItem(out *bytes.Buffer, name, text []byte, flags int)

func (*Html) FootnoteRef

func (options *Html) FootnoteRef(out *bytes.Buffer, ref []byte, id int)

func (*Html) Footnotes

func (options *Html) Footnotes(out *bytes.Buffer, text func() bool)

func (*Html) HRule

func (options *Html) HRule(out *bytes.Buffer)

func (*Html) Header

func (options *Html) Header(out *bytes.Buffer, text func() bool, level int)

func (*Html) Image

func (options *Html) Image(out *bytes.Buffer, link []byte, title []byte, alt []byte)

func (*Html) LineBreak

func (options *Html) LineBreak(out *bytes.Buffer)

func (*Html) Link

func (options *Html) Link(out *bytes.Buffer, link []byte, title []byte, content []byte)

func (*Html) List

func (options *Html) List(out *bytes.Buffer, text func() bool, flags int)

func (*Html) ListItem

func (options *Html) ListItem(out *bytes.Buffer, text []byte, flags int)

func (*Html) NormalText

func (options *Html) NormalText(out *bytes.Buffer, text []byte)

func (*Html) Paragraph

func (options *Html) Paragraph(out *bytes.Buffer, text func() bool)

func (*Html) RawHtmlTag

func (options *Html) RawHtmlTag(out *bytes.Buffer, text []byte)

func (*Html) Smartypants

func (options *Html) Smartypants(out *bytes.Buffer, text []byte)

func (*Html) StrikeThrough

func (options *Html) StrikeThrough(out *bytes.Buffer, text []byte)

func (*Html) Table

func (options *Html) Table(out *bytes.Buffer, header []byte, body []byte, columnData []int)

func (*Html) TableCell

func (options *Html) TableCell(out *bytes.Buffer, text []byte, align int)

func (*Html) TableHeaderCell

func (options *Html) TableHeaderCell(out *bytes.Buffer, text []byte, align int)

func (*Html) TableRow

func (options *Html) TableRow(out *bytes.Buffer, text []byte)

func (*Html) TocFinalize

func (options *Html) TocFinalize()

func (*Html) TocHeader

func (options *Html) TocHeader(text []byte, level int)

func (*Html) TripleEmphasis

func (options *Html) TripleEmphasis(out *bytes.Buffer, text []byte)

type Latex

type Latex struct {
}

Latex is a type that implements the Renderer interface for LaTeX output.

Do not create this directly, instead use the LatexRenderer function.

func (*Latex) AutoLink

func (options *Latex) AutoLink(out *bytes.Buffer, link []byte, kind int)

func (*Latex) BlockCode

func (options *Latex) BlockCode(out *bytes.Buffer, text []byte, lang string)

BlockCode renders code chunks using verbatim, or listings if a language is given.

func (*Latex) BlockHtml

func (options *Latex) BlockHtml(out *bytes.Buffer, text []byte)

func (*Latex) BlockQuote

func (options *Latex) BlockQuote(out *bytes.Buffer, text []byte)

func (*Latex) CodeSpan

func (options *Latex) CodeSpan(out *bytes.Buffer, text []byte)

func (*Latex) DocumentFooter

func (options *Latex) DocumentFooter(out *bytes.Buffer)

func (*Latex) DocumentHeader

func (options *Latex) DocumentHeader(out *bytes.Buffer)

header and footer

func (*Latex) DoubleEmphasis

func (options *Latex) DoubleEmphasis(out *bytes.Buffer, text []byte)

func (*Latex) Emphasis

func (options *Latex) Emphasis(out *bytes.Buffer, text []byte)

func (*Latex) Entity

func (options *Latex) Entity(out *bytes.Buffer, entity []byte)

func (*Latex) FootnoteItem

func (options *Latex) FootnoteItem(out *bytes.Buffer, name, text []byte, flags int)

func (*Latex) FootnoteRef

func (options *Latex) FootnoteRef(out *bytes.Buffer, ref []byte, id int)

TODO: this

func (*Latex) Footnotes

func (options *Latex) Footnotes(out *bytes.Buffer, text func() bool)

TODO: this

func (*Latex) HRule

func (options *Latex) HRule(out *bytes.Buffer)

func (*Latex) Header

func (options *Latex) Header(out *bytes.Buffer, text func() bool, level int)

func (*Latex) Image

func (options *Latex) Image(out *bytes.Buffer, link []byte, title []byte, alt []byte)

func (*Latex) LineBreak

func (options *Latex) LineBreak(out *bytes.Buffer)

func (*Latex) Link

func (options *Latex) Link(out *bytes.Buffer, link []byte, title []byte, content []byte)

func (*Latex) List

func (options *Latex) List(out *bytes.Buffer, text func() bool, flags int)

func (*Latex) ListItem

func (options *Latex) ListItem(out *bytes.Buffer, text []byte, flags int)

func (*Latex) NormalText

func (options *Latex) NormalText(out *bytes.Buffer, text []byte)

func (*Latex) Paragraph

func (options *Latex) Paragraph(out *bytes.Buffer, text func() bool)

func (*Latex) RawHtmlTag

func (options *Latex) RawHtmlTag(out *bytes.Buffer, tag []byte)

func (*Latex) StrikeThrough

func (options *Latex) StrikeThrough(out *bytes.Buffer, text []byte)

func (*Latex) Table

func (options *Latex) Table(out *bytes.Buffer, header []byte, body []byte, columnData []int)

func (*Latex) TableCell

func (options *Latex) TableCell(out *bytes.Buffer, text []byte, align int)

func (*Latex) TableHeaderCell

func (options *Latex) TableHeaderCell(out *bytes.Buffer, text []byte, align int)

func (*Latex) TableRow

func (options *Latex) TableRow(out *bytes.Buffer, text []byte)

func (*Latex) TripleEmphasis

func (options *Latex) TripleEmphasis(out *bytes.Buffer, text []byte)

type Renderer

type Renderer interface {
	// block-level callbacks
	BlockCode(out *bytes.Buffer, text []byte, lang string)
	BlockQuote(out *bytes.Buffer, text []byte)
	BlockHtml(out *bytes.Buffer, text []byte)
	Header(out *bytes.Buffer, text func() bool, level int)
	HRule(out *bytes.Buffer)
	List(out *bytes.Buffer, text func() bool, flags int)
	ListItem(out *bytes.Buffer, text []byte, flags int)
	Paragraph(out *bytes.Buffer, text func() bool)
	Table(out *bytes.Buffer, header []byte, body []byte, columnData []int)
	TableRow(out *bytes.Buffer, text []byte)
	TableHeaderCell(out *bytes.Buffer, text []byte, flags int)
	TableCell(out *bytes.Buffer, text []byte, flags int)
	Footnotes(out *bytes.Buffer, text func() bool)
	FootnoteItem(out *bytes.Buffer, name, text []byte, flags int)

	// Span-level callbacks
	AutoLink(out *bytes.Buffer, link []byte, kind int)
	CodeSpan(out *bytes.Buffer, text []byte)
	DoubleEmphasis(out *bytes.Buffer, text []byte)
	Emphasis(out *bytes.Buffer, text []byte)
	Image(out *bytes.Buffer, link []byte, title []byte, alt []byte)
	LineBreak(out *bytes.Buffer)
	Link(out *bytes.Buffer, link []byte, title []byte, content []byte)
	RawHtmlTag(out *bytes.Buffer, tag []byte)
	TripleEmphasis(out *bytes.Buffer, text []byte)
	StrikeThrough(out *bytes.Buffer, text []byte)
	FootnoteRef(out *bytes.Buffer, ref []byte, id int)

	// Low-level callbacks
	Entity(out *bytes.Buffer, entity []byte)
	NormalText(out *bytes.Buffer, text []byte)

	// Header and footer
	DocumentHeader(out *bytes.Buffer)
	DocumentFooter(out *bytes.Buffer)
}

Renderer is the rendering interface. This is mostly of interest if you are implementing a new rendering format.

When a byte slice is provided, it contains the (rendered) contents of the element.

When a callback is provided instead, it will write the contents of the respective element directly to the output buffer and return true on success. If the callback returns false, the rendering function should reset the output buffer as though it had never been called.

Currently, Html and Latex implementations are provided.

func HtmlRenderer

func HtmlRenderer(flags int, title string, css string) Renderer

HtmlRenderer creates and configures an Html object, which satisfies the Renderer interface.

flags is a set of HTML_* options ORed together. title is the title of the document, and css is a URL for the document's stylesheet. title and css are only used when HTML_COMPLETE_PAGE is selected.

func LatexRenderer

func LatexRenderer(flags int) Renderer

LatexRenderer creates and configures a Latex object, which satisfies the Renderer interface.

flags is a set of LATEX_* options ORed together (currently no such options are defined).
