Documentation ¶
Overview ¶
Package html is an HTML5 lexer following the specifications at http://www.w3.org/TR/html5/syntax.html.
Constants ¶
This section is empty.
Variables ¶
var EntitiesMap = map[string][]byte{}/* 1092 elements not displayed */
EntitiesMap contains all named character entities.
var TextRevEntitiesMap = map[byte][]byte{
	'<': []byte("&lt;"),
}
Functions ¶
Types ¶
type Hash ¶
type Hash uint32
Hash defines perfect hashes for a predefined list of strings
const (
	A Hash = 0x1 // a
	Abbr Hash = 0x37a04 // abbr
	About Hash = 0x5 // about
	Accept Hash = 0x1106 // accept
	Accept_Charset Hash = 0x110e // accept-charset
	Action Hash = 0x23f06 // action
	Address Hash = 0x5a07 // address
	Align Hash = 0x32705 // align
	Alink Hash = 0x7005 // alink
	Allowfullscreen Hash = 0x2ad0f // allowfullscreen
	Amp_Boilerplate Hash = 0x610f // amp-boilerplate
	Area Hash = 0x1e304 // area
	Article Hash = 0x2707 // article
	Aside Hash = 0xb405 // aside
	Async Hash = 0xac05 // async
	Audio Hash = 0xd105 // audio
	Autofocus Hash = 0xe409 // autofocus
	Autoplay Hash = 0x10808 // autoplay
	Axis Hash = 0x11004 // axis
	B Hash = 0x101 // b
	Background Hash = 0x300a // background
	Base Hash = 0x19604 // base
	Bb Hash = 0x37b02 // bb
	Bdi Hash = 0x7503 // bdi
	Bdo Hash = 0x31f03 // bdo
	Bgcolor Hash = 0x12607 // bgcolor
	Blockquote Hash = 0x13e0a // blockquote
	Body Hash = 0xd04 // body
	Br Hash = 0x37c02 // br
	Button Hash = 0x14806 // button
	Canvas Hash = 0xb006 // canvas
	Caption Hash = 0x21f07 // caption
	Charset Hash = 0x1807 // charset
	Checked Hash = 0x1b307 // checked
	Cite Hash = 0xfb04 // cite
	Class Hash = 0x15905 // class
	Classid Hash = 0x15907 // classid
	Clear Hash = 0x2b05 // clear
	Code Hash = 0x19204 // code
	Codebase Hash = 0x19208 // codebase
	Codetype Hash = 0x1a408 // codetype
	Col Hash = 0x12803 // col
	Colgroup Hash = 0x1bb08 // colgroup
	Color Hash = 0x12805 // color
	Cols Hash = 0x1cf04 // cols
	Colspan Hash = 0x1cf07 // colspan
	Compact Hash = 0x1ec07 // compact
	Content Hash = 0x28407 // content
	Controls Hash = 0x20108 // controls
	Data Hash = 0x1f04 // data
	Datalist Hash = 0x1f08 // datalist
	Datatype Hash = 0x4d08 // datatype
	Dd Hash = 0x5b02 // dd
	Declare Hash = 0xb707 // declare
	Default Hash = 0x7f07 // default
	DefaultChecked Hash = 0x1730e // defaultChecked
	DefaultMuted Hash = 0x7f0c // defaultMuted
	DefaultSelected Hash = 0x8a0f // defaultSelected
	Defer Hash = 0x9805 // defer
	Del Hash = 0x10503 // del
	Details Hash = 0x15f07 // details
	Dfn Hash = 0x16c03 // dfn
	Dialog Hash = 0xa606 // dialog
	Dir Hash = 0x7603 // dir
	Disabled Hash = 0x18008 // disabled
	Div Hash = 0x18703 // div
	Dl Hash = 0x1b902 // dl
	Dt Hash = 0x23102 // dt
	Em Hash = 0x4302 // em
	Embed Hash = 0x4905 // embed
	Enabled Hash = 0x26c07 // enabled
	Enctype Hash = 0x1fa07 // enctype
	Face Hash = 0x5604 // face
	Fieldset Hash = 0x21408 // fieldset
	Figure Hash = 0x22606 // figure
	For Hash = 0x23b03 // for
	Form Hash = 0x23b04 // form
	Formaction Hash = 0x23b0a // formaction
	Formnovalidate Hash = 0x2450e // formnovalidate
	Frame Hash = 0x28c05 // frame
	Frameborder Hash = 0x28c0b // frameborder
	H1 Hash = 0x2e002 // h1
	H2 Hash = 0x25302 // h2
	H3 Hash = 0x25502 // h3
	H4 Hash = 0x25702 // h4
	H5 Hash = 0x25902 // h5
	H6 Hash = 0x25b02 // h6
	Head Hash = 0x2d204 // head
	Header Hash = 0x2d206 // header
	Hgroup Hash = 0x25d06 // hgroup
	Hidden Hash = 0x26806 // hidden
	Hr Hash = 0x32d02 // hr
	Href Hash = 0x32d04 // href
	Hreflang Hash = 0x32d08 // hreflang
	Html Hash = 0x27304 // html
	Http_Equiv Hash = 0x2770a // http-equiv
	I Hash = 0x2401 // i
	Icon Hash = 0x28304 // icon
	Id Hash = 0xb602 // id
	Iframe Hash = 0x28b06 // iframe
	Img Hash = 0x29703 // img
	Inert Hash = 0xf605 // inert
	Inlist Hash = 0x29a06 // inlist
	Input Hash = 0x2a405 // input
	Ins Hash = 0x2a903 // ins
	Ismap Hash = 0x11205 // ismap
	Itemscope Hash = 0xfc09 // itemscope
	Kbd Hash = 0x7403 // kbd
	Keygen Hash = 0x1f606 // keygen
	Label Hash = 0xbe05 // label
	Lang Hash = 0x33104 // lang
	Language Hash = 0x33108 // language
	Legend Hash = 0x2c506 // legend
	Li Hash = 0x2302 // li
	Link Hash = 0x7104 // link
	Longdesc Hash = 0xc208 // longdesc
	Main Hash = 0xf404 // main
	Manifest Hash = 0x2bc08 // manifest
	Map Hash = 0xee03 // map
	Mark Hash = 0x2cb04 // mark
	Math Hash = 0x2cf04 // math
	Max Hash = 0x2d803 // max
	Maxlength Hash = 0x2d809 // maxlength
	Media Hash = 0xa405 // media
	Menu Hash = 0x12204 // menu
	Meta Hash = 0x2e204 // meta
	Meter Hash = 0x2f705 // meter
	Method Hash = 0x2fc06 // method
	Multiple Hash = 0x30208 // multiple
	Muted Hash = 0x30a05 // muted
	Name Hash = 0xa204 // name
	Nohref Hash = 0x32b06 // nohref
	Noresize Hash = 0x13608 // noresize
	Noscript Hash = 0x14d08 // noscript
	Noshade Hash = 0x16e07 // noshade
	Novalidate Hash = 0x2490a // novalidate
	Nowrap Hash = 0x1d506 // nowrap
	Object Hash = 0xd506 // object
	Ol Hash = 0xcb02 // ol
	Open Hash = 0x32104 // open
	Optgroup Hash = 0x35608 // optgroup
	Option Hash = 0x30f06 // option
	Output Hash = 0x206 // output
	P Hash = 0x501 // p
	Param Hash = 0xf005 // param
	Pauseonexit Hash = 0x1160b // pauseonexit
	Picture Hash = 0x1c207 // picture
	Plaintext Hash = 0x1da09 // plaintext
	Poster Hash = 0x26206 // poster
	Pre Hash = 0x35d03 // pre
	Prefix Hash = 0x35d06 // prefix
	Profile Hash = 0x36407 // profile
	Progress Hash = 0x34208 // progress
	Property Hash = 0x31508 // property
	Q Hash = 0x14301 // q
	Rb Hash = 0x2f02 // rb
	Readonly Hash = 0x1e408 // readonly
	Rel Hash = 0xbc03 // rel
	Required Hash = 0x22a08 // required
	Resource Hash = 0x1c708 // resource
	Rev Hash = 0x7803 // rev
	Reversed Hash = 0x7808 // reversed
	Rows Hash = 0x9c04 // rows
	Rowspan Hash = 0x9c07 // rowspan
	Rp Hash = 0x6a02 // rp
	Rt Hash = 0x2802 // rt
	Rtc Hash = 0xf903 // rtc
	Ruby Hash = 0xe004 // ruby
	Rules Hash = 0x12c05 // rules
	S Hash = 0x1c01 // s
	Samp Hash = 0x6004 // samp
	Scope Hash = 0x10005 // scope
	Scoped Hash = 0x10006 // scoped
	Script Hash = 0x14f06 // script
	Scrolling Hash = 0xc809 // scrolling
	Seamless Hash = 0x19808 // seamless
	Section Hash = 0x13007 // section
	Select Hash = 0x16506 // select
	Selected Hash = 0x16508 // selected
	Shape Hash = 0x19f05 // shape
	Size Hash = 0x13a04 // size
	Slot Hash = 0x20804 // slot
	Small Hash = 0x2ab05 // small
	Sortable Hash = 0x2ef08 // sortable
	Source Hash = 0x1c906 // source
	Span Hash = 0x9f04 // span
	Src Hash = 0x34903 // src
	Srcset Hash = 0x34906 // srcset
	Start Hash = 0x2505 // start
	Strong Hash = 0x29e06 // strong
	Style Hash = 0x2c205 // style
	Sub Hash = 0x31d03 // sub
	Summary Hash = 0x33907 // summary
	Sup Hash = 0x34003 // sup
	Svg Hash = 0x34f03 // svg
	Tabindex Hash = 0x2e408 // tabindex
	Table Hash = 0x2f205 // table
	Target Hash = 0x706 // target
	Tbody Hash = 0xc05 // tbody
	Td Hash = 0x1e02 // td
	Template Hash = 0x4208 // template
	Text Hash = 0x1df04 // text
	Textarea Hash = 0x1df08 // textarea
	Tfoot Hash = 0xda05 // tfoot
	Th Hash = 0x2d102 // th
	Thead Hash = 0x2d105 // thead
	Time Hash = 0x12004 // time
	Title Hash = 0x15405 // title
	Tr Hash = 0x1f202 // tr
	Track Hash = 0x1f205 // track
	Translate Hash = 0x20b09 // translate
	Truespeed Hash = 0x23209 // truespeed
	Type Hash = 0x5104 // type
	Typemustmatch Hash = 0x1a80d // typemustmatch
	Typeof Hash = 0x5106 // typeof
	U Hash = 0x301 // u
	Ul Hash = 0x8302 // ul
	Undeterminate Hash = 0x370d // undeterminate
	Usemap Hash = 0xeb06 // usemap
	Valign Hash = 0x32606 // valign
	Value Hash = 0x18905 // value
	Valuetype Hash = 0x18909 // valuetype
	Var Hash = 0x28003 // var
	Video Hash = 0x35205 // video
	Visible Hash = 0x36b07 // visible
	Vlink Hash = 0x37205 // vlink
	Vocab Hash = 0x37705 // vocab
	Wbr Hash = 0x37e03 // wbr
	Xmlns Hash = 0x2eb05 // xmlns
	Xmp Hash = 0x36203 // xmp
)
Unique hash definitions to be used instead of strings
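The listed values suggest a pattern, though this page does not confirm it: the low byte of each Hash looks like the string's length, and the remaining bits look like an offset into one shared string in which the names overlap (for example, About = 0x5 and Output = 0x206 would overlap as "aboutput"). The sketch below decodes a few of the constants under that assumption. The `text` string and the `decode` helper are hypothetical, reconstructed only from the values above, and are not part of the package.

```go
package main

import "fmt"

// Hash mirrors the documented type. Assumption: low byte = length,
// higher bits = offset into a shared string of overlapping names.
type Hash uint32

// A handful of values copied from the constant list above.
const (
	A      Hash = 0x1   // a
	About  Hash = 0x5   // about
	Output Hash = 0x206 // output
	P      Hash = 0x501 // p
	Target Hash = 0x706 // target
	Tbody  Hash = 0xc05 // tbody
	Body   Hash = 0xd04 // body
)

// text is a hypothetical reconstruction of the first bytes of the shared
// string, inferred from the offsets and lengths of the constants above.
const text = "aboutputargetbody"

// decode extracts the name for h under the offset/length assumption.
func decode(h Hash) string {
	offset := int(h >> 8)
	length := int(h & 0xff)
	return text[offset : offset+length]
}

func main() {
	for _, h := range []Hash{A, About, Output, P, Target, Tbody, Body} {
		fmt.Printf("%#x -> %q\n", uint32(h), decode(h))
	}
}
```

If the assumption holds, comparing two Hash values is a single integer comparison, which is the practical reason to use them instead of strings.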
type Lexer ¶
type Lexer struct {
// contains filtered or unexported fields
}
Lexer is the state for the lexer.
func NewLexer ¶
NewLexer returns a new Lexer for a given io.Reader.
Example ¶
l := NewLexer(bytes.NewBufferString("<span class='user'>John Doe</span>"))
out := ""
for {
	tt, data := l.Next()
	if tt == ErrorToken {
		break
	}
	out += string(data)
}
fmt.Println(out)
Output: <span class='user'>John Doe</span>
func (*Lexer) AttrVal ¶
AttrVal returns the attribute value when an AttributeToken was returned from Next.
func (*Lexer) Err ¶
Err returns the error encountered during lexing. This is often io.EOF, but other errors can be returned as well.
func (*Lexer) Next ¶
Next returns the next Token. It returns ErrorToken when an error was encountered. Using Err() one can retrieve the error message.
type TokenType ¶
type TokenType uint32
TokenType determines the type of token, e.g. AttributeToken or ErrorToken.