dom

Version: v0.0.0-...-73569d6 (Latest)

This package is not in the latest version of its module.

Published: May 15, 2023 | License: MIT | Imports: 13 | Imported by: 69

README

DOM

DOM is a Go package for manipulating HTML nodes. Its functions have names and purposes similar to the DOM manipulation methods in JavaScript, which makes it useful when porting code from JS to Go. It is currently used in warc and go-readability.

Installation

To install this package, run go get:

go get -u -v github.com/go-shiori/dom

Licenses

DOM is distributed under the MIT license, which means you can use and modify it however you want. However, if you make an enhancement, please consider sending a pull request. If you like this project, please consider donating via PayPal or Ko-fi.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func AppendChild

func AppendChild(node *html.Node, child *html.Node)

AppendChild adds a node to the end of the list of children of a specified parent node. If the given child is a reference to an existing node in the document, AppendChild() moves it from its current position to the new position.

func ChildNodes

func ChildNodes(node *html.Node) []*html.Node

ChildNodes returns a list of a node's direct children.

func Children

func Children(node *html.Node) []*html.Node

Children returns the direct child elements of node as a slice (the counterpart of an HTMLCollection in JS).

func ClassName

func ClassName(node *html.Node) string

ClassName returns the value of the class attribute of the specified element.

func Clone

func Clone(src *html.Node, deep bool) *html.Node

Clone returns a clone of the node and, if deep is true, its descendants. The clone is detached from the original's parent and siblings.

func CreateElement

func CreateElement(tagName string) *html.Node

CreateElement creates a new ElementNode with the specified tag.

func CreateTextNode

func CreateTextNode(data string) *html.Node

CreateTextNode creates a new Text node.

func DetachChild

func DetachChild(child *html.Node)

func DocumentElement

func DocumentElement(doc *html.Node) *html.Node

DocumentElement returns the root element of the document. Since this package works with HTML documents, the root is the <html> element.

func FastParse

func FastParse(r io.Reader) (*html.Node, error)

FastParse parses an html.Node from the specified reader without handling text encoding. It always assumes that the input is UTF-8 encoded.

func FirstElementChild

func FirstElementChild(node *html.Node) *html.Node

FirstElementChild returns the object's first child Element, or nil if there are no child elements.

func ForEachNode

func ForEachNode(nodeList []*html.Node, fn func(*html.Node, int))

ForEachNode iterates over a NodeList and runs fn on each node.

func GetAllNodesWithTag

func GetAllNodesWithTag(node *html.Node, tagNames ...string) []*html.Node

GetAllNodesWithTag is a wrapper for GetElementsByTagName() that allows getting several tags at once.

func GetAttribute

func GetAttribute(node *html.Node, attrName string) string

GetAttribute returns the value of a specified attribute on the element. If the given attribute does not exist, the value returned will be an empty string.

func GetElementByID

func GetElementByID(doc *html.Node, id string) *html.Node

GetElementByID returns a Node object representing the element whose id property matches the specified string.

func GetElementsByClassName

func GetElementsByClassName(doc *html.Node, classNames string) []*html.Node

GetElementsByClassName returns an array of all child elements which have all of the given class name(s).

func GetElementsByTagName

func GetElementsByTagName(doc *html.Node, tagName string) []*html.Node

GetElementsByTagName returns a collection of all elements in the document with the specified tag name, as a slice of nodes. The special tag "*" matches all elements.

func HasAttribute

func HasAttribute(node *html.Node, attrName string) bool

HasAttribute returns a Boolean value indicating whether the specified node has the specified attribute or not.

func ID

func ID(node *html.Node) string

ID returns the value of the id attribute of the specified element.

func IncludeNode

func IncludeNode(nodeList []*html.Node, node *html.Node) bool

IncludeNode determines if node is included inside nodeList.

func InnerHTML

func InnerHTML(node *html.Node) string

InnerHTML returns the HTML content (inner HTML) of an element. The returned HTML value is escaped.

func InnerText

func InnerText(node *html.Node) string

In JS, innerText is used to capture text from an element while excluding text from hidden children. A child is hidden if its computed width is 0, whether because of its CSS (e.g. `display: none`, `visibility: hidden`, etc.) or because the child has the `hidden` attribute. Since we can't compute stylesheets, we only look at the `hidden` attribute and inline styles.

Besides excluding text from hidden children, the difference between this function and `TextContent` is that the latter skips <br> tags, while this function preserves <br> as a newline.

func IsVoidElement

func IsVoidElement(n *html.Node) bool

IsVoidElement checks whether a node can have any contents. It returns true if the element is void (can't have any children).

func NextElementSibling

func NextElementSibling(node *html.Node) *html.Node

NextElementSibling returns the Element immediately following the specified one in its parent's children list, or nil if the specified Element is the last one in the list.

func OuterHTML

func OuterHTML(node *html.Node) string

OuterHTML returns an HTML serialization of the element and its descendants. The returned HTML value is escaped.

func Parse

func Parse(r io.Reader) (*html.Node, error)

Parse parses an html.Node from the specified reader while converting the character encoding into UTF-8. This function is useful for correctly parsing web pages that use a non-UTF-8 text encoding, e.g. web pages from Asian websites. However, since it has to detect the charset before parsing, this function is quite slow and expensive, so if you are sure the reader uses valid UTF-8, just use FastParse.

func PrependChild

func PrependChild(node *html.Node, child *html.Node)

PrependChild works like AppendChild() except it adds a node to the beginning of the list of children of a specified parent node.

func PreviousElementSibling

func PreviousElementSibling(node *html.Node) *html.Node

PreviousElementSibling returns the Element immediately prior to the specified one in its parent's children list, or nil if the specified element is the first one in the list.

func QuerySelector

func QuerySelector(doc *html.Node, selectors string) *html.Node

QuerySelector returns the first element in the document that matches the specified group of selectors.

func QuerySelectorAll

func QuerySelectorAll(doc *html.Node, selectors string) []*html.Node

QuerySelectorAll returns a slice of the document's elements that match the specified group of selectors.

func RemoveAttribute

func RemoveAttribute(node *html.Node, attrName string)

RemoveAttribute removes the attribute with the given name.

func RemoveNodes

func RemoveNodes(nodeList []*html.Node, filterFn func(*html.Node) bool)

RemoveNodes iterates over a NodeList, calls `filterFn` for each node, and removes the node if the function returns `true`. If `filterFn` is nil, it removes all the nodes in the node list.

func ReplaceChild

func ReplaceChild(parent *html.Node, newChild *html.Node, oldChild *html.Node) (*html.Node, *html.Node)

ReplaceChild replaces a child node within the given (parent) node. If the new child already exists in the document, ReplaceChild() moves it from its current position to replace the old child. Returns both the new and old child.

TODO: I'm not sure but I *think* there are some issues here. Check later I guess.

func SetAttribute

func SetAttribute(node *html.Node, attrName string, attrValue string)

SetAttribute sets an attribute on the node. If the attribute already exists, its value is replaced.

func SetInnerHTML

func SetInnerHTML(node *html.Node, rawHTML string)

SetInnerHTML sets inner HTML of the specified node.

func SetTextContent

func SetTextContent(node *html.Node, text string)

SetTextContent sets the text content of the specified node.

func TagName

func TagName(node *html.Node) string

TagName returns the tag name of a Node. If the node is not an ElementNode, it returns an empty string.

func TextContent

func TextContent(node *html.Node) string

TextContent returns the text content of the specified node, and all its descendants.

Types

This section is empty.
