Documentation ¶
Overview ¶
Package exhtml provides functions to extract `*html.Node` values, raw `[]byte` contents, and links from a website.
Index ¶
- func DivWithAttr(doc *html.Node, attrName, attrValue string) []*html.Node
- func DivWithAttr2(raw []byte, attrName, attrValue string) []byte
- func ElementsByTag(doc *html.Node, name ...string) []*html.Node
- func ElementsByTag2(raw []byte, tags ...string) []byte
- func ElementsByTagAndClass(doc *html.Node, tag, class string) []*html.Node
- func ElementsByTagAndClass2(raw []byte, tag, class string) []byte
- func ElementsByTagAndId(doc *html.Node, tag, id string) []*html.Node
- func ElementsByTagAndId2(raw []byte, tag, id string) []byte
- func ElementsByTagAndType(doc *html.Node, tag, attrType string) []*html.Node
- func ElementsByTagAttr(doc *html.Node, tagName, attrName, attrValue string) []*html.Node
- func ElementsNext(doc *html.Node) []*html.Node
- func ElementsNextByTag(doc *html.Node, tag string) []*html.Node
- func ElementsRmByTag(doc *html.Node, name ...string)
- func ElementsRmByTagAttr(doc *html.Node, tag, attrName, attrValue string)
- func ElementsRmByTagClass(doc *html.Node, tag, class string)
- func ExtractLinks(weburl string) ([]string, error)
- func ExtractRss(weburl string) ([]string, error)
- func ExtractRssGuids(weburl string) ([]string, error)
- func ForEachNode(n *html.Node, pre, post func(n *html.Node))
- func GetRawAndDoc(url *url.URL, retryTimeout time.Duration) ([]byte, *html.Node, error)
- func MetasByItemprop(doc *html.Node, values ...string) []*html.Node
- func MetasByName(doc *html.Node, values ...string) []*html.Node
- func MetasByProperty(doc *html.Node, values ...string) []*html.Node
- func TagWithAttr(doc *html.Node, tag, attr string) []*html.Node
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DivWithAttr2 ¶
func ElementsByTag2 ¶
func ElementsByTagAndClass ¶
func ElementsByTagAndClass2 ¶
func ElementsByTagAndId2 ¶
func ElementsByTagAndType ¶
func ElementsByTagAttr ¶
func ElementsRmByTag ¶
func ElementsRmByTagAttr ¶
ElementsRmByTagAttr removes nodes from doc. If attrName != "", it removes nodes that match both the tag and the attribute; otherwise it removes nodes by tag alone.
func ElementsRmByTagClass ¶
ElementsRmByTagClass removes nodes from doc. If class != "", it removes nodes that match both the tag and the class; otherwise it removes nodes by tag alone.
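The selection rule shared by ElementsRmByTagAttr and ElementsRmByTagClass can be sketched as a small predicate. This is only an illustration of the rule as described, not the package's implementation: the `element` type, `shouldRemove`, and `keepUnremoved` are hypothetical names, the tree is flattened to a slice, and exact string equality on the attribute value is an assumption.

```go
package main

import "fmt"

// element is a hypothetical, flattened stand-in for an HTML element node.
type element struct {
	Tag   string
	Attrs map[string]string
}

// shouldRemove reports whether e is selected for removal.
// With an empty attrName the tag alone decides; otherwise the
// attribute must also be present with exactly attrValue (assumed).
func shouldRemove(e element, tag, attrName, attrValue string) bool {
	if e.Tag != tag {
		return false
	}
	if attrName == "" {
		return true // no attribute filter: remove by tag only
	}
	v, ok := e.Attrs[attrName]
	return ok && v == attrValue
}

// keepUnremoved returns the elements that survive the removal rule.
func keepUnremoved(elems []element, tag, attrName, attrValue string) []element {
	var kept []element
	for _, e := range elems {
		if !shouldRemove(e, tag, attrName, attrValue) {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	elems := []element{
		{Tag: "div", Attrs: map[string]string{"class": "ad"}},
		{Tag: "div", Attrs: map[string]string{"class": "content"}},
		{Tag: "p"},
	}
	// Remove only <div class="ad">.
	fmt.Println(len(keepUnremoved(elems, "div", "class", "ad")))
	// Empty attribute name: remove every <div>.
	fmt.Println(len(keepUnremoved(elems, "div", "", "")))
}
```

The real functions mutate the `*html.Node` tree in place and return nothing; the sketch returns a filtered slice only to keep the rule easy to inspect.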
func ExtractLinks ¶
ExtractLinks makes an HTTP GET request to the specified URL, parses the response as HTML, and returns the links in the HTML document.
func ExtractRss ¶
func ExtractRssGuids ¶
ExtractRssGuids returns the values of the `<guid>` elements in the RSS feed at weburl.
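Reading `<guid>` values out of a feed can be sketched with `encoding/xml`. The sketch covers only the parsing step and takes the feed as bytes rather than fetching it from a URL. `rssFeed` and `guidsFromRSS` are hypothetical names, and the RSS 2.0 layout (`rss > channel > item > guid`) is an assumption about the feeds being read.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// rssFeed maps only the parts of an RSS 2.0 document needed here.
type rssFeed struct {
	Items []struct {
		GUID string `xml:"guid"`
	} `xml:"channel>item"`
}

// guidsFromRSS returns the non-empty <guid> values in document order.
func guidsFromRSS(data []byte) ([]string, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	guids := make([]string, 0, len(feed.Items))
	for _, item := range feed.Items {
		if g := strings.TrimSpace(item.GUID); g != "" {
			guids = append(guids, g)
		}
	}
	return guids, nil
}

func main() {
	feed := []byte(`<rss version="2.0"><channel>
<item><title>First</title><guid>https://example.com/posts/1</guid></item>
<item><title>Second</title><guid isPermaLink="false">post-2</guid></item>
<item><title>No guid</title></item>
</channel></rss>`)

	guids, err := guidsFromRSS(feed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(guids)
}
```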
func GetRawAndDoc ¶
GetRawAndDoc returns the raw HTML bytes and the parsed `*html.Node` for the given URL.
func MetasByItemprop ¶
MetasByItemprop targets meta elements identified by their itemprop attribute, such as `<meta itemprop="dateModified" content="2020/09/29 11:27" />`.
func MetasByName ¶
MetasByName targets meta elements identified by their name attribute, such as `<meta name="dateModified" content="2020/09/29 11:27" />`.
func MetasByProperty ¶
MetasByProperty targets meta elements identified by their property attribute, such as `<meta property="dateModified" content="2020/09/29 11:27" />`.
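The three Metas* functions differ only in which attribute identifies the meta element. A minimal sketch of that matching follows, using lenient `encoding/xml` scanning as a stand-in for the `*html.Node` traversal the package works with. `metaContents` is a hypothetical helper. Unlike the real functions, which return nodes, it returns the matching `content` values so the result is easy to inspect.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// metaContents returns the content attribute of every <meta> element
// whose attrName attribute equals one of values.
func metaContents(markup, attrName string, values ...string) ([]string, error) {
	wanted := make(map[string]bool, len(values))
	for _, v := range values {
		wanted[v] = true
	}

	dec := xml.NewDecoder(strings.NewReader(markup))
	dec.Strict = false                // tolerate HTML-style markup
	dec.AutoClose = xml.HTMLAutoClose // <meta> needs no closing tag
	dec.Entity = xml.HTMLEntity

	var contents []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return contents, nil
		}
		if err != nil {
			return contents, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(start.Name.Local, "meta") {
			continue
		}
		var matched bool
		var content string
		for _, a := range start.Attr {
			switch {
			case strings.EqualFold(a.Name.Local, attrName):
				matched = wanted[a.Value]
			case strings.EqualFold(a.Name.Local, "content"):
				content = a.Value
			}
		}
		if matched {
			contents = append(contents, content)
		}
	}
}

func main() {
	markup := `<head>
<meta itemprop="dateModified" content="2020/09/29 11:27" />
<meta name="author" content="someone" />
</head>`

	got, err := metaContents(markup, "itemprop", "dateModified")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Passing "name" or "property" as the second argument selects the other two variants.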
Types ¶
This section is empty.