Documentation ¶
Overview ¶
Package md converts html to markdown.
converter := md.NewConverter("", true, nil)

html := `<strong>Important</strong>`

markdown, err := converter.ConvertString(html)
if err != nil {
	log.Fatal(err)
}
fmt.Println("md ->", markdown)
Or if you are already using goquery:
markdown := converter.Convert(selec)
Index ¶
- Variables
- func AddSpaceIfNessesary(selec *goquery.Selection, markdown string) string
- func CalculateCodeFence(fenceChar rune, content string) string
- func CollectText(n *html.Node) string
- func DefaultGetAbsoluteURL(selec *goquery.Selection, rawURL string, domain string) string
- func DomainFromURL(rawURL string) string
- func EscapeMultiLine(content string) string
- func IndentMultiLineListItem(opt *Options, text string, spaces int) string
- func IndexWithText(s *goquery.Selection) int
- func IsInlineElement(e string) bool
- func String(text string) *string
- func TrimTrailingSpaces(text string) string
- func TrimpLeadingSpaces(text string) string
- type AdvancedResult
- type Afterhook
- type BeforeHook
- type Converter
- func (conv *Converter) AddRules(rules ...Rule) *Converter
- func (conv *Converter) After(hooks ...Afterhook) *Converter
- func (conv *Converter) Before(hooks ...BeforeHook) *Converter
- func (conv *Converter) ClearAfter() *Converter
- func (conv *Converter) ClearBefore() *Converter
- func (conv *Converter) Convert(selec *goquery.Selection) string
- func (conv *Converter) ConvertBytes(bytes []byte) ([]byte, error)
- func (conv *Converter) ConvertReader(reader io.Reader) (bytes.Buffer, error)
- func (conv *Converter) ConvertResponse(res *http.Response) (string, error)
- func (conv *Converter) ConvertString(html string) (string, error)
- func (conv *Converter) ConvertURL(url string) (string, error)
- func (conv *Converter) Keep(tags ...string) *Converter
- func (conv *Converter) Remove(tags ...string) *Converter
- func (conv *Converter) Use(plugins ...Plugin) *Converter
- type Options
- type Plugin
- type Rule
Variables ¶
var Timeout = time.Second * 10
Timeout for the http client
Functions ¶
func AddSpaceIfNessesary ¶
AddSpaceIfNessesary adds spaces to the markdown based on the neighboring content. This ensures that there is always a space next to a delimiter, so that the delimiter is recognized.
func CalculateCodeFence ¶
CalculateCodeFence takes the content of a code block and returns the fence (a run of ` or ~ characters) that should be used around it.
This is useful if the html content itself includes the same fence characters, for example ```. See https://stackoverflow.com/a/49268657
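The idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's source: `fenceFor` is a hypothetical name, and the minimum length of three is assumed from the CommonMark definition of a fenced code block.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceFor is a hypothetical helper (not part of the package). It returns a
// fence that is one character longer than the longest run of fenceChar found
// in content, and never shorter than three characters.
func fenceFor(fenceChar rune, content string) string {
	longest, current := 0, 0
	for _, r := range content {
		if r == fenceChar {
			current++
			if current > longest {
				longest = current
			}
		} else {
			current = 0
		}
	}

	n := longest + 1
	if n < 3 {
		n = 3 // assumed minimum fence length
	}
	return strings.Repeat(string(fenceChar), n)
}

func main() {
	fmt.Println(fenceFor('~', "plain code"))       // three tildes
	fmt.Println(fenceFor('~', "nested ~~~ fence")) // four tildes
}
```

Choosing a fence longer than any run inside the content means the inner characters can never be mistaken for the closing fence.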
func CollectText ¶
CollectText returns the text of the node and all its children
func DefaultGetAbsoluteURL ¶
DefaultGetAbsoluteURL is the default function and can be overridden through `GetAbsoluteURL` in the options.
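A custom replacement could look roughly like the sketch below. It uses only `net/url`, and several things are assumptions: `absoluteURL` is a hypothetical name, the goquery selection parameter is left out for brevity, and `http` as the fallback scheme is a guess rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/url"
)

// absoluteURL is a hypothetical helper (not the package's implementation).
// It fills in a missing scheme and host so that a relative URL such as
// "/image.png" becomes absolute for the given domain.
func absoluteURL(rawURL, domain string) string {
	if domain == "" {
		return rawURL // nothing to resolve against
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return rawURL // leave unparsable input untouched
	}
	if u.Scheme == "" {
		u.Scheme = "http" // assumption: default scheme
	}
	if u.Host == "" {
		u.Host = domain
	}
	return u.String()
}

func main() {
	fmt.Println(absoluteURL("/image.png", "example.com"))
	fmt.Println(absoluteURL("https://cdn.example.org/a.png", "example.com"))
}
```

A function of this shape is also where image URLs could be rewritten to point at a proxy.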
func DomainFromURL ¶
DomainFromURL returns `u.Host` from the parsed url.
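In terms of the standard library, this corresponds to something like the following sketch. `hostOf` is a hypothetical name, and returning an empty string on a parse error is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostOf is a hypothetical helper: it parses rawURL and returns the Host
// field, or an empty string if the URL cannot be parsed (assumed behavior).
func hostOf(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Host
}

func main() {
	fmt.Println(hostOf("https://example.com/page.html")) // example.com
	fmt.Println(hostOf("/page.html") == "")              // true: no host in a relative URL
}
```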
func EscapeMultiLine ¶
EscapeMultiLine deals with multiline content inside a link
func IndentMultiLineListItem ¶
IndentMultiLineListItem makes sure that multiline list items are properly indented.
func IndexWithText ¶
IndexWithText is similar to goquery's Index function but returns the index of the current element while NOT counting the empty elements beforehand.
func IsInlineElement ¶
IsInlineElement can be used to check whether a node name (goquery.NodeName) is an html inline element and not a block element. Used in the rule for the p tag to check whether the text is inside a block element.
func TrimTrailingSpaces ¶
TrimTrailingSpaces removes unnecessary spaces from the end of lines.
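A line-by-line version of this cleanup can be sketched as follows. `trimLineEnds` is a hypothetical name, and treating only spaces and tabs as removable is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// trimLineEnds is a hypothetical helper: it strips trailing spaces and tabs
// from every line while leaving leading indentation untouched.
func trimLineEnds(text string) string {
	lines := strings.Split(text, "\n")
	for i, line := range lines {
		lines[i] = strings.TrimRight(line, " \t")
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Printf("%q\n", trimLineEnds("first  \n  second\t\nthird"))
}
```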
func TrimpLeadingSpaces ¶
TrimpLeadingSpaces removes spaces from the beginning of a line but makes sure that list items and code blocks are not affected.
Types ¶
type AdvancedResult ¶
AdvancedResult is used, for example, for links. If you set LinkStyle to "referenced", the link href is placed at the bottom of the generated markdown (Footer).
type Afterhook ¶
Afterhook runs after the converter and can be used to transform the resulting markdown
type BeforeHook ¶
BeforeHook runs before the converter and can be used to transform the original html
type Converter ¶
type Converter struct {
// contains filtered or unexported fields
}
Converter is initialized by NewConverter.
func NewConverter ¶
NewConverter initializes a new converter and holds all the rules.
- `domain` is used for links and images to convert relative urls ("/image.png") to absolute urls.
- CommonMark is the default set of rules. Set enableCommonmark to false if you want to customize everything using AddRules and do not want to fall back to the default rules.
func (*Converter) AddRules ¶
AddRules adds the rules that are passed in to the converter.
By default it overrides the rule for that html tag. You can fall back to the default rule by returning nil.
func (*Converter) After ¶
After registers a hook that is run after the conversion. It can be used to transform the markdown document that is about to be returned.
For example, the default after hook trims the returned markdown.
func (*Converter) Before ¶
func (conv *Converter) Before(hooks ...BeforeHook) *Converter
Before registers a hook that is run before the conversion. It can be used to transform the original goquery html document.
For example, the default before hook adds an index to every link, so that the `a` tag rule (for the "referenced" link style with the "full" reference style) can use an incremental number.
func (*Converter) ClearAfter ¶
ClearAfter clears the current after hooks (including the default after hooks).
func (*Converter) ClearBefore ¶
ClearBefore clears the current before hooks (including the default before hooks).
func (*Converter) Convert ¶
Convert returns the markdown for a goquery selection. If you have a goquery document, pass in doc.Selection.
func (*Converter) ConvertBytes ¶
ConvertBytes converts an html byte slice and returns the markdown as bytes.
func (*Converter) ConvertReader ¶
ConvertReader converts the html read from a reader and returns the markdown in a buffer.
func (*Converter) ConvertResponse ¶
ConvertResponse returns the markdown for an html response.
func (*Converter) ConvertString ¶
ConvertString returns the markdown for an html string. If you already have a goquery selection, use `Convert`.
func (*Converter) ConvertURL ¶
ConvertURL returns the content from the page with that url.
type Options ¶
type Options struct {
	// "setext" or "atx"
	// default: "atx"
	HeadingStyle string

	// Any Thematic break
	// default: "* * *"
	HorizontalRule string

	// "-", "+", or "*"
	// default: "-"
	BulletListMarker string

	// "indented" or "fenced"
	// default: "indented"
	CodeBlockStyle string

	// ``` or ~~~
	// default: ```
	Fence string

	// _ or *
	// default: _
	EmDelimiter string

	// ** or __
	// default: **
	StrongDelimiter string

	// inlined or referenced
	// default: inlined
	LinkStyle string

	// full, collapsed, or shortcut
	// default: full
	LinkReferenceStyle string

	// GetAbsoluteURL parses the `rawURL` and adds the `domain` to convert relative (/page.html)
	// urls to absolute urls (http://domain.com/page.html).
	//
	// The default is `DefaultGetAbsoluteURL`, unless you override it. That can also
	// be useful if you want to proxy the images.
	GetAbsoluteURL func(selec *goquery.Selection, rawURL string, domain string) string

	// The start symbols of escaped block. Note that program will match these symbols one by one
	// until matched, then use the same index for end symbol.
	// Set to [] to disable escaped block feature.
	// default: []
	EscapeStart []string

	// The end symbols of escaped block. Set to [] to disable escaped block feature.
	// default: []
	EscapeEnd []string
	// contains filtered or unexported fields
}
Options customizes the output. For example, you can change the delimiter that is used for strong text.
type Rule ¶
type Rule struct {
	Filter              []string
	Replacement         func(content string, selec *goquery.Selection, options *Options) *string
	AdvancedReplacement func(content string, selec *goquery.Selection, options *Options) (res AdvancedResult, skip bool)
}
Rule to convert certain html tags to markdown.
md.Rule{
	Filter: []string{"del", "s", "strike"},
	Replacement: func(content string, selec *goquery.Selection, opt *md.Options) *string {
		// You need to return a pointer to a string (md.String is just a helper function).
		// If you return nil the next function for that html element
		// will be picked. For example you could only convert an element
		// if it has a certain class name and fallback if not.
		return md.String("~" + content + "~")
	},
}
Directories ¶
Path | Synopsis |
---|---|
escape | Package escape escapes characters that are commonly used in markdown like the * for strong/italic. |
examples | |
plugin | Package plugin contains all the rules that are not part of Commonmark like GitHub Flavored Markdown. |