exhtml

Version: v0.0.0-...-1440466 | Published: Nov 2, 2021 | License: MIT | Imports: 12 | Imported by: 0

Repository

github.com/hi20160616/exhtml

README ¶

exhtml

Extract links, raw content, or an *html.Node from a URL, raw bytes, or an existing *html.Node.

Documentation ¶

Overview ¶

Package exhtml provides functions to extract `*html.Node` trees, raw `[]byte` content, and links from a website.

Index ¶

func DivWithAttr(doc *html.Node, attrName, attrValue string) []*html.Node
func DivWithAttr2(raw []byte, attrName, attrValue string) []byte
func ElementsByTag(doc *html.Node, name ...string) []*html.Node
func ElementsByTag2(raw []byte, tags ...string) []byte
func ElementsByTagAndClass(doc *html.Node, tag, class string) []*html.Node
func ElementsByTagAndClass2(raw []byte, tag, class string) []byte
func ElementsByTagAndId(doc *html.Node, tag, id string) []*html.Node
func ElementsByTagAndId2(raw []byte, tag, id string) []byte
func ElementsByTagAndType(doc *html.Node, tag, attrType string) []*html.Node
func ElementsByTagAttr(doc *html.Node, tagName, attrName, attrValue string) []*html.Node
func ElementsNext(doc *html.Node) []*html.Node
func ElementsNextByTag(doc *html.Node, tag string) []*html.Node
func ElementsRmByTag(doc *html.Node, name ...string)
func ElementsRmByTagAttr(doc *html.Node, tag, attrName, attrValue string)
func ElementsRmByTagClass(doc *html.Node, tag, class string)
func ExtractLinks(weburl string) ([]string, error)
func ExtractRss(weburl string) ([]string, error)
func ExtractRssGuids(weburl string) ([]string, error)
func ForEachNode(n *html.Node, pre, post func(n *html.Node))
func GetRawAndDoc(url *url.URL, retryTimeout time.Duration) ([]byte, *html.Node, error)
func MetasByItemprop(doc *html.Node, values ...string) []*html.Node
func MetasByName(doc *html.Node, values ...string) []*html.Node
func MetasByProperty(doc *html.Node, values ...string) []*html.Node
func TagWithAttr(doc *html.Node, tag, attr string) []*html.Node

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func DivWithAttr ¶

func DivWithAttr(doc *html.Node, attrName, attrValue string) []*html.Node

func DivWithAttr2 ¶

func DivWithAttr2(raw []byte, attrName, attrValue string) []byte

func ElementsByTag ¶

func ElementsByTag(doc *html.Node, name ...string) []*html.Node

func ElementsByTag2 ¶

func ElementsByTag2(raw []byte, tags ...string) []byte

func ElementsByTagAndClass ¶

func ElementsByTagAndClass(doc *html.Node, tag, class string) []*html.Node

func ElementsByTagAndClass2 ¶

func ElementsByTagAndClass2(raw []byte, tag, class string) []byte

func ElementsByTagAndId ¶

func ElementsByTagAndId(doc *html.Node, tag, id string) []*html.Node

func ElementsByTagAndId2 ¶

func ElementsByTagAndId2(raw []byte, tag, id string) []byte

func ElementsByTagAndType ¶

func ElementsByTagAndType(doc *html.Node, tag, attrType string) []*html.Node

func ElementsByTagAttr ¶

func ElementsByTagAttr(doc *html.Node, tagName, attrName, attrValue string) []*html.Node

func ElementsNext ¶

func ElementsNext(doc *html.Node) []*html.Node

func ElementsNextByTag ¶

func ElementsNextByTag(doc *html.Node, tag string) []*html.Node

func ElementsRmByTag ¶

func ElementsRmByTag(doc *html.Node, name ...string)

func ElementsRmByTagAttr ¶

func ElementsRmByTagAttr(doc *html.Node, tag, attrName, attrValue string)

ElementsRmByTagAttr removes nodes from doc. If attrName is not empty, nodes are removed by tag and attribute; otherwise they are removed by tag alone.

func ElementsRmByTagClass ¶

func ElementsRmByTagClass(doc *html.Node, tag, class string)

ElementsRmByTagClass removes nodes from doc. If class is not empty, nodes are removed by tag and class; otherwise they are removed by tag alone.
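The tag-and-class rule described above can be illustrated with a small sketch. It is not the package's implementation: `node`, `hasClass`, `removeByTagClass`, `count`, and `sample` are hypothetical names, the tree type is a simplified stand-in for `*html.Node` so the example needs only the standard library, and matching a class as one whitespace-separated token is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for *html.Node, kept small so the
// example runs with the standard library only.
type node struct {
	Tag      string
	Class    string // raw value of the class attribute
	Children []*node
}

// hasClass reports whether class is one of the space-separated tokens
// in n.Class (an assumption about how a class should match).
func hasClass(n *node, class string) bool {
	for _, c := range strings.Fields(n.Class) {
		if c == class {
			return true
		}
	}
	return false
}

// removeByTagClass drops every descendant of n whose tag matches. When
// class is non-empty the node must also carry that class; when it is
// empty the tag alone decides.
func removeByTagClass(n *node, tag, class string) {
	kept := n.Children[:0]
	for _, c := range n.Children {
		if c.Tag == tag && (class == "" || hasClass(c, class)) {
			continue // drop this child together with its subtree
		}
		removeByTagClass(c, tag, class)
		kept = append(kept, c)
	}
	n.Children = kept
}

// count returns the number of nodes with the given tag below n.
func count(n *node, tag string) int {
	total := 0
	for _, c := range n.Children {
		if c.Tag == tag {
			total++
		}
		total += count(c, tag)
	}
	return total
}

// sample builds a tiny document: two ad divs and one content div.
func sample() *node {
	return &node{Tag: "body", Children: []*node{
		{Tag: "div", Class: "ad banner"},
		{Tag: "div", Class: "content", Children: []*node{
			{Tag: "p"},
			{Tag: "div", Class: "ad"},
		}},
	}}
}

func main() {
	doc := sample()
	removeByTagClass(doc, "div", "ad")
	fmt.Println(count(doc, "div")) // 1: only the "content" div is left

	removeByTagClass(doc, "div", "")
	fmt.Println(count(doc, "div")) // 0: an empty class removes every div
}
```

Filtering the child list in place avoids mutating sibling links while iterating over them, which is the usual pitfall when removing nodes during a traversal.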

func ExtractLinks ¶

func ExtractLinks(weburl string) ([]string, error)

ExtractLinks makes an HTTP GET request to the specified URL, parses the response as HTML, and returns the links in the HTML document.
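As a rough illustration of the parsing half of this description, the sketch below collects `href` values from `<a>` elements and resolves them against a base URL. It deliberately skips the HTTP GET step and swaps in the standard library's `encoding/xml` decoder in non-strict HTML mode instead of a real HTML parser, so it is only suitable for reasonably well-formed markup. `collectLinks` and `linksFromString` are hypothetical helpers, not part of exhtml.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"net/url"
	"strings"
)

// collectLinks is a hypothetical helper. It scans HTML-like markup and
// returns every <a href> value resolved against base.
func collectLinks(r io.Reader, base *url.URL) ([]string, error) {
	dec := xml.NewDecoder(r)
	dec.Strict = false                // tolerate markup that is not strict XML
	dec.AutoClose = xml.HTMLAutoClose // treat <br>, <img>, ... as self-closing
	dec.Entity = xml.HTMLEntity       // understand &nbsp; and friends

	var links []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return links, nil
		}
		if err != nil {
			return links, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(start.Name.Local, "a") {
			continue
		}
		for _, attr := range start.Attr {
			if !strings.EqualFold(attr.Name.Local, "href") {
				continue
			}
			ref, err := base.Parse(attr.Value)
			if err != nil {
				continue // skip hrefs that are not valid URLs
			}
			links = append(links, ref.String())
		}
	}
}

// linksFromString is a convenience wrapper for in-memory documents.
func linksFromString(doc, base string) ([]string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	return collectLinks(strings.NewReader(doc), u)
}

const samplePage = `<html><body>
<a href="/about">About</a>
<p>Read the <a href="story.html">story</a> or visit
<a href="https://example.org/x">another site</a>.</p>
</body></html>`

func main() {
	links, err := linksFromString(samplePage, "https://example.com/news/index.html")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Resolving each `href` against the page URL turns relative links into absolute ones, which is usually what a caller of a link extractor wants.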

func ExtractRss ¶

func ExtractRss(weburl string) ([]string, error)

func ExtractRssGuids ¶

func ExtractRssGuids(weburl string) ([]string, error)

ExtractRssGuids returns the values of the <guid> elements.
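For readers unfamiliar with the `<guid>` element, here is a minimal sketch of pulling those values out of an RSS 2.0 document with `encoding/xml`. It works on an in-memory feed rather than fetching a URL, and `rssFeed`, `guidsFromFeed`, and `sampleFeed` are hypothetical names; how the package itself parses feeds is not stated in the source.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// rssFeed maps only the part of an RSS 2.0 document that is needed here:
// rss > channel > item > guid.
type rssFeed struct {
	Items []struct {
		GUID string `xml:"guid"`
	} `xml:"channel>item"`
}

// guidsFromFeed is a hypothetical helper that returns the non-empty
// <guid> values in document order.
func guidsFromFeed(feed string) ([]string, error) {
	var parsed rssFeed
	if err := xml.Unmarshal([]byte(feed), &parsed); err != nil {
		return nil, err
	}
	guids := make([]string, 0, len(parsed.Items))
	for _, item := range parsed.Items {
		if g := strings.TrimSpace(item.GUID); g != "" {
			guids = append(guids, g)
		}
	}
	return guids, nil
}

const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><title>Example</title>
<item><title>One</title><guid>https://example.com/1</guid></item>
<item><title>Two</title><guid isPermaLink="false"> id-2 </guid></item>
<item><title>No guid</title></item>
</channel></rss>`

func main() {
	guids, err := guidsFromFeed(sampleFeed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(guids) // [https://example.com/1 id-2]
}
```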

func ForEachNode ¶

func ForEachNode(n *html.Node, pre, post func(n *html.Node))
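The source gives no description for ForEachNode, but the signature resembles the common pre/post tree traversal pattern. The sketch below shows that pattern under that assumption. `node`, `forEachNode`, `outline`, and `sampleTree` are hypothetical, and `node` is a simplified stand-in for `*html.Node` so the example runs with the standard library only.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for *html.Node; it keeps only the
// child and sibling links that the traversal needs.
type node struct {
	Data                    string
	FirstChild, NextSibling *node
}

// forEachNode calls pre(n) before visiting the children of n and
// post(n) after them. Either callback may be nil.
func forEachNode(n *node, pre, post func(n *node)) {
	if pre != nil {
		pre(n)
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		forEachNode(c, pre, post)
	}
	if post != nil {
		post(n)
	}
}

// outline renders the tree as nested tags: pre writes the opening tag,
// post writes the closing tag.
func outline(root *node) string {
	var b strings.Builder
	forEachNode(root,
		func(n *node) { b.WriteString("<" + n.Data + ">") },
		func(n *node) { b.WriteString("</" + n.Data + ">") },
	)
	return b.String()
}

// sampleTree builds html > body > (h1, p).
func sampleTree() *node {
	p := &node{Data: "p"}
	h1 := &node{Data: "h1", NextSibling: p}
	body := &node{Data: "body", FirstChild: h1}
	return &node{Data: "html", FirstChild: body}
}

func main() {
	fmt.Println(outline(sampleTree()))
	// prints <html><body><h1></h1><p></p></body></html>
}
```

Passing nil for one callback gives a plain pre-order or post-order walk, which is why the two hooks are kept separate.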

func GetRawAndDoc ¶

func GetRawAndDoc(url *url.URL, retryTimeout time.Duration) ([]byte, *html.Node, error)

GetRawAndDoc fetches the given URL and returns the raw HTML bytes together with the parsed *html.Node.
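The source does not say how `retryTimeout` is used. One plausible reading, and purely an assumption here, is a budget within which a failed fetch is retried. The sketch below shows that idea with the fetch step injected as a function, so it needs no network. `retryUntil` and `flakyFetch` are hypothetical names, and the fixed retry interval is an illustrative choice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryUntil calls fetch until it succeeds or the timeout budget would
// be exceeded by waiting for another attempt. Assumed semantics only.
func retryUntil(timeout, interval time.Duration, fetch func() ([]byte, error)) ([]byte, error) {
	deadline := time.Now().Add(timeout)
	for {
		raw, err := fetch()
		if err == nil {
			return raw, nil
		}
		if time.Now().Add(interval).After(deadline) {
			return nil, fmt.Errorf("giving up after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// flakyFetch returns a fetch function that fails its first `failures`
// calls and then returns a small HTML document.
func flakyFetch(failures int) func() ([]byte, error) {
	calls := 0
	return func() ([]byte, error) {
		calls++
		if calls <= failures {
			return nil, errors.New("temporary failure")
		}
		return []byte("<html></html>"), nil
	}
}

func main() {
	raw, err := retryUntil(time.Second, 10*time.Millisecond, flakyFetch(2))
	fmt.Println(string(raw), err) // <html></html> <nil>
}
```

In real use the injected function would perform the HTTP request and read the body; keeping it as a parameter makes the retry logic easy to test in isolation.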

func MetasByItemprop ¶

func MetasByItemprop(doc *html.Node, values ...string) []*html.Node

MetasByItemprop targets meta elements such as `<meta itemprop="dateModified" content="2020/09/29 11:27" />`.

func MetasByName ¶

func MetasByName(doc *html.Node, values ...string) []*html.Node

MetasByName targets meta elements such as `<meta name="dateModified" content="2020/09/29 11:27" />`.

func MetasByProperty ¶

func MetasByProperty(doc *html.Node, values ...string) []*html.Node

MetasByProperty targets meta elements such as `<meta property="dateModified" content="2020/09/29 11:27" />`.
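The three Metas functions differ only in which attribute (`itemprop`, `name`, or `property`) identifies the meta element. A minimal sketch of that lookup follows. It is not the package's code: `metaContent` and `sampleHead` are hypothetical, the sketch returns `content` strings rather than `*html.Node` values, and it uses the standard library's `encoding/xml` decoder in non-strict HTML mode in place of a real HTML parser.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// metaContent is a hypothetical helper. For every <meta> element whose
// attrName attribute equals one of values, it records the element's
// content attribute under that value.
func metaContent(doc, attrName string, values ...string) (map[string]string, error) {
	wanted := make(map[string]bool, len(values))
	for _, v := range values {
		wanted[v] = true
	}

	dec := xml.NewDecoder(strings.NewReader(doc))
	dec.Strict = false                // tolerate markup that is not strict XML
	dec.AutoClose = xml.HTMLAutoClose // <meta> without "/>" is still closed
	dec.Entity = xml.HTMLEntity

	found := make(map[string]string)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return found, nil
		}
		if err != nil {
			return found, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(start.Name.Local, "meta") {
			continue
		}
		var key, content string
		for _, a := range start.Attr {
			switch {
			case strings.EqualFold(a.Name.Local, attrName):
				key = a.Value
			case strings.EqualFold(a.Name.Local, "content"):
				content = a.Value
			}
		}
		if wanted[key] {
			found[key] = content
		}
	}
}

const sampleHead = `<head>
<meta name="author" content="Jane" />
<meta itemprop="dateModified" content="2020/09/29 11:27" />
<meta property="og:title" content="Hello" />
</head>`

func main() {
	byItemprop, err := metaContent(sampleHead, "itemprop", "dateModified")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(byItemprop["dateModified"]) // 2020/09/29 11:27
}
```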

func TagWithAttr ¶

func TagWithAttr(doc *html.Node, tag, attr string) []*html.Node

Types ¶

This section is empty.

Source Files ¶

htmldoc.go
