Documentation ¶
Index ¶
- func CloneAndProcessList(outputNodes []*html.Node, pageURL *nurl.URL) *html.Node
- func CloneAndProcessTree(root *html.Node, pageURL *nurl.URL) *html.Node
- func Contains(node, child *html.Node) bool
- func GetAllSrcSetURLs(root *html.Node) []string
- func GetAncestors(nodes ...*html.Node) (map[*html.Node]int, *html.Node)
- func GetArea(node *html.Node) int
- func GetDisplayStyle(node *html.Node) string
- func GetFirstElementByTagName(root *html.Node, tagName string) *html.Node
- func GetFirstElementByTagNameInc(root *html.Node, tagName string) *html.Node
- func GetNearestCommonAncestor(nodes ...*html.Node) *html.Node
- func GetNodeDepth(node *html.Node) int
- func GetOutputNodes(root *html.Node) []*html.Node
- func GetParentElement(node *html.Node) *html.Node
- func GetParentNodes(node *html.Node) []*html.Node
- func GetSrcSetURLs(node *html.Node) []string
- func HasAncestor(node *html.Node, ancestorTagNames ...string) bool
- func HasRootDomain(url string, root string) bool
- func InnerText(node *html.Node) string
- func IsProbablyVisible(node *html.Node) bool
- func MakeAllLinksAbsolute(root *html.Node, pageURL *nurl.URL)
- func MakeAllSrcAttributesAbsolute(root *html.Node, pageURL *nurl.URL)
- func MakeAllSrcSetAbsolute(root *html.Node, pageURL *nurl.URL)
- func NodeName(node *html.Node) string
- func SomeNode(nodeList []*html.Node, fn func(*html.Node) bool) bool
- func StripAttributes(node *html.Node)
- func TreeClone(nodes []*html.Node) *html.Node
- func WalkNodes(root *html.Node, fnVisit func(*html.Node) bool, fnExit func(*html.Node))
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CloneAndProcessList ¶
CloneAndProcessList clones and processes a list of relevant nodes for output.
func CloneAndProcessTree ¶
CloneAndProcessTree clones and processes a given node tree or subtree. The original dom-distiller ignores hidden elements at this step. That is not possible here, so hidden elements are included as well. NEED-COMPUTE-CSS.
func GetAllSrcSetURLs ¶
func GetAncestors ¶
GetAncestors returns all ancestors of `nodes`, along with their nearest common ancestor.
func GetArea ¶
GetArea in the original code returns the area of a node by multiplying offsetWidth and offsetHeight. Since that is not possible in Go, this function simply returns 0. NEED-COMPUTE-CSS
func GetDisplayStyle ¶
GetDisplayStyle returns the default "display" in style property for the specified node.
func GetFirstElementByTagName ¶
GetFirstElementByTagName returns the first element with `tagName` in the tree rooted at `root`. Nil if none is found.
func GetFirstElementByTagNameInc ¶
GetFirstElementByTagNameInc returns the first element with `tagName` in the tree rooted at `root`, including root. Nil if none is found.
func GetNearestCommonAncestor ¶
GetNearestCommonAncestor returns the nearest common ancestor of nodes.
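One common way to find a nearest common ancestor is to bring both nodes to the same depth and then climb in lockstep. The sketch below illustrates that idea for two nodes only. It is not the package's implementation: the `node` type is a hypothetical stand-in for `*html.Node`, and the sketch assumes a node counts as its own ancestor, which the real function may or may not do.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *html.Node; only the Parent link matters here.
type node struct {
	Name   string
	Parent *node
}

// depth counts how many parents sit above n.
func depth(n *node) int {
	d := 0
	for p := n.Parent; p != nil; p = p.Parent {
		d++
	}
	return d
}

// nearestCommonAncestor is an illustrative sketch, not the library code.
// It returns nil when the two nodes do not share a tree.
func nearestCommonAncestor(a, b *node) *node {
	da, db := depth(a), depth(b)
	// Lift the deeper node until both sit at the same depth.
	for da > db {
		a, da = a.Parent, da-1
	}
	for db > da {
		b, db = b.Parent, db-1
	}
	// Climb together until the paths meet (or both run out).
	for a != b {
		a, b = a.Parent, b.Parent
	}
	return a
}

func main() {
	root := &node{Name: "html"}
	body := &node{Name: "body", Parent: root}
	div := &node{Name: "div", Parent: body}
	p := &node{Name: "p", Parent: div}
	span := &node{Name: "span", Parent: body}

	fmt.Println(nearestCommonAncestor(p, span).Name) // prints: body
}
```

Extending this to more than two nodes can be done by folding the list pairwise, at the cost of recomputing depths on each step.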
func GetNodeDepth ¶
GetNodeDepth returns the depth of the given node in the DOM tree.
func GetOutputNodes ¶
GetOutputNodes returns a list of relevant nodes for output from a subtree.
func GetParentElement ¶
GetParentElement returns the nearest parent that is an element.
func GetParentNodes ¶
GetParentNodes returns a list of all the parents of the node, starting with the node itself.
func GetSrcSetURLs ¶
func HasAncestor ¶
HasAncestor checks if the node has an ancestor with the specified tag names.
func HasRootDomain ¶
HasRootDomain checks if the provided URL has the specified root domain (e.g. http://a.b.c/foo/bar has a root domain of b.c).
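A root-domain check of this kind can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions: `hasRootDomainSketch` is a hypothetical name, and the rule it applies (the host equals the root, or ends with a dot followed by the root) is one plausible reading of the description, not necessarily what the package does.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hasRootDomainSketch is a hypothetical illustration, not the library code.
// It reports whether the host of rawURL is root itself or a subdomain of it.
func hasRootDomainSketch(rawURL, root string) bool {
	if rawURL == "" || root == "" {
		return false
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := u.Hostname()
	// Requiring the leading dot keeps "ab.c" from matching the root "b.c".
	return host == root || strings.HasSuffix(host, "."+root)
}

func main() {
	fmt.Println(hasRootDomainSketch("http://a.b.c/foo/bar", "b.c")) // prints: true
	fmt.Println(hasRootDomainSketch("http://ab.c/foo/bar", "b.c"))  // prints: false
}
```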
func InnerText ¶
InnerText in JS and GWT is used to capture text from an element while excluding text from hidden children. A child is hidden if its computed width is 0, whether because of its CSS (e.g. `display: none`, `visibility: hidden`) or because the child has the `hidden` attribute. Since we can't compute stylesheets, we only look at the `hidden` attribute and inline style.
Besides excluding text from hidden children, this function differs from `dom.TextContent` in that the latter skips <br> tags, while this function preserves <br> as whitespace. NEED-COMPUTE-CSS
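The behaviour described above can be sketched roughly as follows. This is an illustration rather than the package's code: the `node` type is a hypothetical stand-in for `*html.Node`, the inline-style check is a naive substring match, and emitting a newline for <br> is an assumption about which whitespace is used.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for *html.Node.
type node struct {
	Tag      string            // empty for text nodes
	Text     string            // used by text nodes only
	Attrs    map[string]string // attribute name to value
	Children []*node
}

// isHidden looks only at the `hidden` attribute and the inline style,
// since no stylesheet can be computed.
func isHidden(n *node) bool {
	if _, ok := n.Attrs["hidden"]; ok {
		return true
	}
	style := strings.ToLower(strings.ReplaceAll(n.Attrs["style"], " ", ""))
	return strings.Contains(style, "display:none") ||
		strings.Contains(style, "visibility:hidden")
}

// collect appends the visible text under n to sb.
func collect(n *node, sb *strings.Builder) {
	switch {
	case n.Tag == "":
		sb.WriteString(n.Text)
	case n.Tag == "br":
		sb.WriteString("\n") // assumption: <br> is kept as a newline
	case isHidden(n):
		return // skip the hidden element and everything below it
	default:
		for _, c := range n.Children {
			collect(c, sb)
		}
	}
}

// innerTextSketch is a hypothetical name for the illustration.
func innerTextSketch(n *node) string {
	var sb strings.Builder
	collect(n, &sb)
	return sb.String()
}

func main() {
	p := &node{Tag: "p", Children: []*node{
		{Text: "Hello"},
		{Tag: "br"},
		{Tag: "span", Attrs: map[string]string{"hidden": ""}, Children: []*node{{Text: "secret"}}},
		{Text: "world"},
	}}
	fmt.Printf("%q\n", innerTextSketch(p)) // prints: "Hello\nworld"
}
```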
func IsProbablyVisible ¶
IsProbablyVisible determines if a node is visible.
func MakeAllLinksAbsolute ¶
MakeAllLinksAbsolute makes all anchors and video posters absolute.
func MakeAllSrcAttributesAbsolute ¶
MakeAllSrcAttributesAbsolute makes all "img", "source", "track", and "video" tags have an absolute "src" attribute.
func MakeAllSrcSetAbsolute ¶
MakeAllSrcSetAbsolute makes all `srcset` within root absolute.
func NodeName ¶
NodeName returns the name of the current node as a string. See https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeName
func SomeNode ¶
SomeNode iterates over a NodeList and returns true if any call to the provided function returns true, false otherwise.
func StripAttributes ¶
func TreeClone ¶
TreeClone takes a list of nodes and returns a clone of the minimum tree in the DOM that contains all of them. This is done by going through each node, cloning its parent and adding children to that parent until the next node is not contained in that parent (originally). The list cannot contain a parent of any of the other nodes. Children of the nodes in the provided list are excluded.
This implementation doesn't come from the original dom-distiller code. Instead, I wrote it from scratch to make it simpler and more idiomatic Go.
func WalkNodes ¶
WalkNodes walks the subtree of the DOM rooted at a particular root. It takes two function parameters, fnVisit and fnExit:
- fnVisit is called when we reach a node during the walk. If it returns false, children of the node will be skipped and fnExit won't be called for this node.
- fnExit is called when exiting a node, after visiting all of its children.
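The visit and exit contract described above can be sketched with a small recursive walker. The `node` type and the `walk` and `trace` functions are hypothetical stand-ins used only for illustration. The real function operates on `*html.Node` and may be implemented differently (for example, iteratively).

```go
package main

import "fmt"

// node is a hypothetical stand-in for *html.Node.
type node struct {
	Name     string
	Children []*node
}

// walk mirrors the documented contract: when fnVisit returns false, the
// node's children are skipped and fnExit is not called for that node.
func walk(n *node, fnVisit func(*node) bool, fnExit func(*node)) {
	if !fnVisit(n) {
		return
	}
	for _, c := range n.Children {
		walk(c, fnVisit, fnExit)
	}
	if fnExit != nil {
		fnExit(n)
	}
}

// trace records the order of callbacks while skipping <script> subtrees.
func trace(root *node) []string {
	var events []string
	walk(root,
		func(n *node) bool {
			events = append(events, "visit:"+n.Name)
			return n.Name != "script"
		},
		func(n *node) {
			events = append(events, "exit:"+n.Name)
		})
	return events
}

func main() {
	root := &node{Name: "div", Children: []*node{
		{Name: "script", Children: []*node{{Name: "code"}}},
		{Name: "p"},
	}}
	fmt.Println(trace(root))
	// prints: [visit:div visit:script visit:p exit:p exit:div]
}
```

Note that the skipped `script` node still triggers fnVisit once, but neither its child nor its exit callback is reached.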
Types ¶
This section is empty.