Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var DefaultWhitelist = &Whitelist{
	Tags: []*Tag{
		{"address", []string{}, []string{}},
		{"article", []string{}, []string{}},
		{"aside", []string{}, []string{}},
		{"footer", []string{}, []string{}},
		{"header", []string{}, []string{}},
		{"h1", []string{}, []string{}},
		{"h2", []string{}, []string{}},
		{"h3", []string{}, []string{}},
		{"h4", []string{}, []string{}},
		{"h5", []string{}, []string{}},
		{"h6", []string{}, []string{}},
		{"hgroup", []string{}, []string{}},
		{"main", []string{}, []string{}},
		{"nav", []string{}, []string{}},
		{"section", []string{}, []string{}},
		{"blockquote", []string{}, []string{"cite"}},
		{"dd", []string{}, []string{}},
		{"div", []string{}, []string{}},
		{"dl", []string{}, []string{}},
		{"dt", []string{}, []string{}},
		{"figcaption", []string{}, []string{}},
		{"figure", []string{}, []string{}},
		{"hr", []string{}, []string{}},
		{"li", []string{}, []string{}},
		{"main", []string{}, []string{}},
		{"ol", []string{}, []string{}},
		{"p", []string{}, []string{}},
		{"pre", []string{}, []string{}},
		{"ul", []string{}, []string{}},
		{"a", []string{"rel", "target", "referrerpolicy"}, []string{"href"}},
		{"abbr", []string{"title"}, []string{}},
		{"b", []string{}, []string{}},
		{"bdi", []string{}, []string{}},
		{"bdo", []string{}, []string{}},
		{"br", []string{}, []string{}},
		{"cite", []string{}, []string{}},
		{"code", []string{}, []string{}},
		{"data", []string{"value"}, []string{}},
		{"em", []string{}, []string{}},
		{"i", []string{}, []string{}},
		{"kbd", []string{}, []string{}},
		{"mark", []string{}, []string{}},
		{"q", []string{}, []string{"cite"}},
		{"s", []string{}, []string{}},
		{"small", []string{}, []string{}},
		{"span", []string{}, []string{}},
		{"strong", []string{}, []string{}},
		{"sub", []string{}, []string{}},
		{"sup", []string{}, []string{}},
		{"time", []string{"datetime"}, []string{}},
		{"u", []string{}, []string{}},
		{"area", []string{"alt", "coords", "shape", "target", "rel", "referrerpolicy"}, []string{"href"}},
		{"audio", []string{"autoplay", "controls", "crossorigin", "duration", "loop", "muted", "preload"}, []string{"src"}},
		{"img", []string{"alt", "crossorigin", "height", "width", "loading", "referrerpolicy"}, []string{"src"}},
		{"map", []string{"name"}, []string{}},
		{"track", []string{"default", "kind", "label", "srclang"}, []string{"src"}},
		{"video", []string{"autoplay", "buffered", "controls", "crossorigin", "duration", "loop", "muted", "preload", "height", "width"}, []string{"src", "poster"}},
		{"picture", []string{}, []string{}},
		{"source", []string{"type"}, []string{"src"}},
		{"del", []string{}, []string{}},
		{"ins", []string{}, []string{}},
		{"caption", []string{}, []string{}},
		{"col", []string{"span"}, []string{}},
		{"colgroup", []string{}, []string{}},
		{"table", []string{}, []string{}},
		{"tbody", []string{}, []string{}},
		{"td", []string{"colspan", "rowspan"}, []string{}},
		{"tfoot", []string{}, []string{}},
		{"th", []string{"colspan", "rowspan", "scope"}, []string{}},
		{"thead", []string{}, []string{}},
		{"tr", []string{}, []string{}},
		{"details", []string{"open"}, []string{}},
		{"summary", []string{}, []string{}},
	},
	GlobalAttr: []string{
		"class",
		"id",
	},
}
DefaultWhitelist is the default whitelist for the HTML filter.
The whitelist contains most tags listed in https://developer.mozilla.org/en-US/docs/Web/HTML/Element. Modifying the default list directly is not recommended; call .Clone() and modify the copy instead.
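The reason for cloning first is that DefaultWhitelist is a shared package-level pointer, so any in-place change is visible to every caller that relies on the default. The sketch below illustrates the idea with local copies of the Tag and Whitelist types declared in this document. The helper deepCopy is hypothetical and written only for this illustration; it is not the package's Clone method, whose implementation is not shown here.

```go
package main

import "fmt"

// Tag and Whitelist mirror the type declarations shown in this document.
type Tag struct {
	Name    string
	Attr    []string
	URLAttr []string
}

type Whitelist struct {
	Tags       []*Tag
	GlobalAttr []string
}

// deepCopy is a hypothetical helper for this sketch, not the package's
// Clone method. It copies every slice so the result shares no memory
// with the original list.
func deepCopy(w *Whitelist) *Whitelist {
	out := &Whitelist{
		Tags:       make([]*Tag, 0, len(w.Tags)),
		GlobalAttr: append([]string(nil), w.GlobalAttr...),
	}
	for _, t := range w.Tags {
		out.Tags = append(out.Tags, &Tag{
			Name:    t.Name,
			Attr:    append([]string(nil), t.Attr...),
			URLAttr: append([]string(nil), t.URLAttr...),
		})
	}
	return out
}

func main() {
	// shared stands in for a package-level default list.
	shared := &Whitelist{
		Tags:       []*Tag{{"a", []string{"rel"}, []string{"href"}}},
		GlobalAttr: []string{"class", "id"},
	}

	// Modify the copy only; the shared list keeps its original content.
	mine := deepCopy(shared)
	mine.Tags[0].Attr = append(mine.Tags[0].Attr, "title")
	mine.GlobalAttr = append(mine.GlobalAttr, "lang")

	fmt.Println(len(shared.Tags[0].Attr), len(shared.GlobalAttr)) // 1 2
	fmt.Println(len(mine.Tags[0].Attr), len(mine.GlobalAttr))     // 2 3
}
```

A shallow copy of the Whitelist struct alone would not be enough, because the Tags slice holds pointers and the copies would still share the same Tag values.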
Functions ¶
func DefaultURLSanitizer ¶
DefaultURLSanitizer is the default, strict URL sanitizer. It only accepts:
- URLs with scheme http or https
- relative URLs, such as abc, abc?xxx=1, abc#123
- absolute-path URLs, such as /abc, /abc?xxx=1, /abc#123
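The acceptance rule listed above can be approximated with the standard library. The following is a minimal sketch, assuming net/url parsing is a reasonable stand-in; acceptURL is a hypothetical name, it is not the package's DefaultURLSanitizer, and the two may disagree on edge cases. In particular, rejecting scheme-relative references such as //example.com/x is an assumption of this sketch, since the list above does not mention them.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// acceptURL is a hypothetical stand-in for illustration only; it is not
// the package's DefaultURLSanitizer.
func acceptURL(raw string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return false
	}
	switch strings.ToLower(u.Scheme) {
	case "http", "https":
		return true
	case "":
		// No scheme: a relative or absolute-path reference.
		// Assumption of this sketch: a reference that still names a
		// host (for example //example.com/x) is rejected.
		return u.Host == ""
	default:
		// Any other scheme (javascript, data, ftp, ...) is rejected.
		return false
	}
}

func main() {
	for _, raw := range []string{
		"https://example.com/a",
		"abc?xxx=1",
		"/abc#123",
		"javascript:alert(1)",
	} {
		fmt.Printf("%-24s %v\n", raw, acceptURL(raw))
	}
}
```

The first three inputs are accepted and the javascript: URL is rejected, matching the href that is dropped from link1 in the NewWriter example.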
func NewWriter ¶
NewWriter returns a new Writer, with DefaultWhitelist, writing sanitized HTML content to w.
Example ¶
// demo data
data := strings.Repeat(`abc--> <a href="javascript:alert(1)">link1</a> <a href=http://example.com>link2</a> <!--`, 1024)
expected := "abc-->" + strings.Repeat(` <a>link1</a> <a href="http://example.com">link2</a> `, 1024)

// underlying writer for demo
o := new(bytes.Buffer)
// source reader for demo
r := bytes.NewBufferString(data)

sanitizedWriter := NewWriter(o)
io.Copy(sanitizedWriter, r)

// check the result, for demo only
fmt.Print(o.String() == expected)
Output: true
func SanitizeString ¶
SanitizeString uses the DefaultWhitelist to sanitize the HTML string.
Types ¶
type HTMLSanitizer ¶
type HTMLSanitizer struct {
	*Whitelist

	// URLSanitizer is a func used to sanitize all the URLAttr.
	// URLSanitizer returns a sanitized URL and a bool indicating
	// whether the current attribute is acceptable. If not acceptable,
	// the current attribute will be ignored.
	// If the func is nil, then DefaultURLSanitizer will be used.
	URLSanitizer func(rawURL string) (sanitzed string, ok bool)
}
HTMLSanitizer is a fast, whitelist-based HTML sanitizer for arbitrary HTML content. Its time complexity is O(n).
Example (CustomURLSanitizer) ¶
// only links with domain name example.com are allowed.
sanitizer := NewHTMLSanitizer()
sanitizer.URLSanitizer = func(rawURL string) (newURL string, ok bool) {
	newURL, ok = DefaultURLSanitizer(rawURL)
	if !ok {
		return
	}

	u, err := url.Parse(newURL)
	if err != nil {
		ok = false
		return
	}

	if u.Host == "example.com" {
		ok = true
		return
	}

	ok = false
	return
}

data := ` <a href="http://others.com">Link</a> <a href="https://example.com/xxx">Link with example.com</a> `
output, _ := sanitizer.SanitizeString(data)
fmt.Print(output)
Output: <a>Link</a> <a href="https://example.com/xxx">Link with example.com</a>
Example (NoTagsAllowd) ¶
sanitizer := NewHTMLSanitizer()
// just set Whitelist to nil to disable all tags
sanitizer.Whitelist = nil
// of course nothing will happen here
sanitizer.RemoveTag("a")

data := ` <a href="http://others.com">Link</a> <a href="https://example.com/xxx">Link with example.com</a> `
output, _ := sanitizer.SanitizeString(data)
fmt.Print(output)
Output: Link Link with example.com
func NewHTMLSanitizer ¶
func NewHTMLSanitizer() *HTMLSanitizer
NewHTMLSanitizer creates a new HTMLSanitizer with a clone of DefaultWhitelist.
func (*HTMLSanitizer) NewWriter ¶
func (f *HTMLSanitizer) NewWriter(w io.Writer) io.Writer
NewWriter returns a new Writer writing sanitized HTML content to w.
func (*HTMLSanitizer) Sanitize ¶
func (f *HTMLSanitizer) Sanitize(data []byte) ([]byte, error)
Sanitize sanitizes the HTML data and returns the sanitized HTML.
func (*HTMLSanitizer) SanitizeString ¶
func (f *HTMLSanitizer) SanitizeString(data string) (string, error)
SanitizeString sanitizes the HTML string and returns the sanitized HTML.
type Tag ¶
type Tag struct {
	// Name for current tag, must be lowercase.
	Name string

	// Attr specifies the allowed attributes for current tag,
	// must be lowercase.
	//
	// e.g. colspan, rowspan
	Attr []string

	// URLAttr specifies the allowed, URL-related attributes for current tag,
	// must be lowercase.
	//
	// e.g. src, href
	URLAttr []string
}
Tag with its attributes.
type Whitelist ¶
type Whitelist struct {
	// Tags specifies all the allowed tags.
	Tags []*Tag

	// GlobalAttr specifies the allowed attributes for all the tags.
	// It's very useful for some common attributes, such as `class`, `id`.
	// For security reasons, it's not recommended to set a global attr for
	// any URL-related attribute.
	GlobalAttr []string
}
Whitelist specifies all the allowed HTML tags and their attributes for the filter.
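To show how Tag.Attr, Tag.URLAttr, and GlobalAttr could combine when deciding the fate of a single attribute, here is a minimal sketch built on local copies of the types declared above. The names classify, contains, and attrKind are hypothetical, and the lookup order is an assumption of this sketch rather than the package's actual algorithm. Names are lowercased before the lookup, in line with the RemoveTag example where `ClaSs` is emitted as `class`, and a nil Whitelist rejects everything, in line with the NoTagsAllowd example.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag and Whitelist mirror the type declarations shown in this document.
type Tag struct {
	Name    string
	Attr    []string
	URLAttr []string
}

type Whitelist struct {
	Tags       []*Tag
	GlobalAttr []string
}

// attrKind is a hypothetical result type for this sketch.
type attrKind int

const (
	attrRejected attrKind = iota // drop the attribute
	attrPlain                    // keep the attribute as-is
	attrURL                      // keep only if the URL sanitizer accepts the value
)

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// classify is a hypothetical helper, not part of the package. It reports
// how an attribute on a given tag would be treated under whitelist w.
func classify(w *Whitelist, tagName, attrName string) attrKind {
	if w == nil {
		return attrRejected
	}
	tagName = strings.ToLower(tagName)
	attrName = strings.ToLower(attrName)
	for _, t := range w.Tags {
		if t.Name != tagName {
			continue
		}
		if contains(t.URLAttr, attrName) {
			return attrURL
		}
		if contains(t.Attr, attrName) || contains(w.GlobalAttr, attrName) {
			return attrPlain
		}
	}
	return attrRejected
}

func main() {
	w := &Whitelist{
		Tags:       []*Tag{{"a", []string{"rel"}, []string{"href"}}},
		GlobalAttr: []string{"class", "id"},
	}
	fmt.Println(classify(w, "a", "href") == attrURL)         // true
	fmt.Println(classify(w, "a", "ClaSs") == attrPlain)      // true
	fmt.Println(classify(w, "a", "onclick") == attrRejected) // true
	fmt.Println(classify(nil, "a", "href") == attrRejected)  // true
}
```

Keeping URL-valued attributes in their own list means every such value can be routed through the URL sanitizer, which is why the GlobalAttr comment above advises against listing URL-related attributes globally.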
func (*Whitelist) RemoveTag ¶
RemoveTag removes all tags named `name`; name must be lowercase. Modifying the default list directly is not recommended; call .Clone() and modify the copy instead.
Example ¶
// sometimes we don't want user to pass HTML with <a> tag
sanitizer := NewHTMLSanitizer()
sanitizer.RemoveTag("a")

data := ` <h1 ClaSs="h1">hello</h1> <p> Hello, world<br> Welcome to use <a href="https://github.com/sym01/htmlsanitizer">htmlsanitizer</a> </p>`
output, _ := sanitizer.SanitizeString(data)
fmt.Print(output)
Output: <h1 class="h1">hello</h1> <p> Hello, world<br> Welcome to use htmlsanitizer </p>