Documentation
Overview
Package htmldoc provides an interface for handling HTML documents. It is built on top of the golang.org/x/net/html package.
Constants
This section is empty.
Variables
var (
	ErrSkip = errors.New("htmldoc: skip")
	ErrStop = errors.New("htmldoc: stop")
)
ErrSkip and ErrStop are used by Traverse.
Functions
func FindAttr
FindAttr returns an Attribute in the given Node with the given key, or nil if there is no such attribute. FindAttr inspects only attributes with empty Namespace and ignores "foreign attributes."
func FindNode
FindNode locates a descendant of the given Node (including the Node itself) which has the given tag. It returns the first descendant when there is more than one, and nil when there is none.
func GetAttr
GetAttr returns the value of the attribute in the given Node with the given key, or an empty string if there is no such attribute. GetAttr considers only attributes with an empty Namespace and ignores "foreign attributes."
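The lookup rule shared by FindAttr and GetAttr can be sketched as follows. This is only an illustration, not the package's implementation: the signatures of FindAttr and GetAttr are not shown here, so findAttr and getAttr are hypothetical stand-ins, and Attribute is a local copy of the fields of html.Attribute so that the sketch has no dependencies.

```go
package main

import "fmt"

// Attribute mirrors the fields of html.Attribute in golang.org/x/net/html.
// It is redeclared here (an assumption for illustration) to keep the sketch
// free of external dependencies.
type Attribute struct {
	Namespace, Key, Val string
}

// findAttr is a hypothetical stand-in for FindAttr. It returns a pointer to
// the first attribute whose Namespace is empty and whose Key matches, or nil
// when there is no such attribute.
func findAttr(attrs []Attribute, key string) *Attribute {
	for i := range attrs {
		if attrs[i].Namespace == "" && attrs[i].Key == key {
			return &attrs[i]
		}
	}
	return nil
}

// getAttr is a hypothetical stand-in for GetAttr. It returns the value of the
// matching attribute, or "" when there is no such attribute.
func getAttr(attrs []Attribute, key string) string {
	if a := findAttr(attrs, key); a != nil {
		return a.Val
	}
	return ""
}

func main() {
	attrs := []Attribute{
		{Namespace: "xlink", Key: "href", Val: "#foreign"}, // foreign attribute, ignored
		{Key: "href", Val: "/index.html"},
	}
	fmt.Println(getAttr(attrs, "href"))          // prints "/index.html"
	fmt.Println(findAttr(attrs, "title") == nil) // prints "true"
}
```

Note that an empty string from the value lookup cannot distinguish a missing attribute from one that is present with an empty value; the pointer-returning form is the one to use when that difference matters.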
func Traverse
Traverse performs a pre-order traversal of the parse tree rooted at n, calling f on each node. f can return ErrSkip to skip the subtree of the current node, or ErrStop to terminate the traversal entirely without error. When f returns any other non-nil error, Traverse abandons the traversal immediately and returns that error.
Types
type Document
type Document struct {
	// Root is the root of the HTML parse tree. Note it is a DocumentNode, not
	// <html> element.
	Root *html.Node
	// Head is the node corresponding to <head> element in the parse tree.
	Head *html.Node
	// Body is the node corresponding to <body> element in the parse tree.
	Body *html.Node
	// URL represents where the document is located.
	URL *url.URL
	// BaseURL represents the base URL of the document. It is usually the same
	// as URL above, but can be altered by <base> element.
	BaseURL *url.URL
}
Document represents an HTML document, holding the parse tree and the related information.
func NewDocument
NewDocument creates and initializes a new Document from payload and url.
func (*Document) ResolveReference
ResolveReference resolves a URI reference in Document to an absolute URI. The URI reference may be relative or absolute. ResolveReference always returns a new URL instance, even if the returned URL is identical to the reference.
type HTMLResponse
HTMLResponse is an extension of exchange.Response for HTML responses.
func NewHTMLResponse
func NewHTMLResponse(resp *exchange.Response) (*HTMLResponse, error)
NewHTMLResponse creates and initializes a new HTMLResponse.