Documentation ¶
Overview ¶
Package xmlquery extracts data from XML documents using XPath expressions.
Index ¶
- Variables
- func AddAttr(n *Node, key, val string)
- func AddChild(parent, n *Node)
- func AddSibling(sibling, n *Node)
- func FindEach(top *Node, expr string, cb func(int, *Node))
- func FindEachWithBreak(top *Node, expr string, cb func(int, *Node) bool)
- func RemoveFromTree(n *Node)
- type Attr
- type DecoderOptions
- type Node
- func Find(top *Node, expr string) []*Node
- func FindOne(top *Node, expr string) *Node
- func LoadURL(url string) (*Node, error)
- func Parse(r io.Reader) (*Node, error)
- func ParseWithOptions(r io.Reader, options ParserOptions) (*Node, error)
- func Query(top *Node, expr string) (*Node, error)
- func QueryAll(top *Node, expr string) ([]*Node, error)
- func QuerySelector(top *Node, selector *xpath.Expr) *Node
- func QuerySelectorAll(top *Node, selector *xpath.Expr) []*Node
- func (n *Node) InnerText() string
- func (n *Node) OutputXML(self bool) string
- func (n *Node) OutputXMLWithOptions(opts ...OutputOption) string
- func (n *Node) RemoveAttr(key string)
- func (n *Node) SelectAttr(name string) string
- func (n *Node) SelectElement(name string) *Node
- func (n *Node) SelectElements(name string) []*Node
- func (n *Node) SetAttr(key, value string)
- type NodeNavigator
- func (x *NodeNavigator) Copy() xpath.NodeNavigator
- func (x *NodeNavigator) Current() *Node
- func (x *NodeNavigator) LocalName() string
- func (x *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool
- func (x *NodeNavigator) MoveToChild() bool
- func (x *NodeNavigator) MoveToFirst() bool
- func (x *NodeNavigator) MoveToNext() bool
- func (x *NodeNavigator) MoveToNextAttribute() bool
- func (x *NodeNavigator) MoveToParent() bool
- func (x *NodeNavigator) MoveToPrevious() bool
- func (x *NodeNavigator) MoveToRoot()
- func (x *NodeNavigator) NamespaceURL() string
- func (x *NodeNavigator) NodeType() xpath.NodeType
- func (x *NodeNavigator) Prefix() string
- func (x *NodeNavigator) String() string
- func (x *NodeNavigator) Value() string
- type NodeType
- type OutputOption
- type ParserOptions
- type StreamParser
Constants ¶
This section is empty.
Variables ¶
var DisableSelectorCache = false
DisableSelectorCache disables caching of query selectors when set to true.
var SelectorCacheMaxEntries = 50
SelectorCacheMaxEntries sets the maximum number of selector objects that can be cached. The default is 50. Caching is disabled if SelectorCacheMaxEntries <= 0.
Functions ¶
func AddChild ¶
func AddChild(parent, n *Node)
AddChild adds a new node 'n' to a node 'parent' as its last child.
func AddSibling ¶
func AddSibling(sibling, n *Node)
AddSibling adds a new node 'n' as a sibling of a given node 'sibling'. Note it is not necessarily true that the new node 'n' would be added immediately after 'sibling'. If 'sibling' isn't the last child of its parent, then the new node 'n' will be added at the end of the sibling chain of their parent.
func FindEach ¶
FindEach searches the XML Node and calls the function cb for each match. Important: this function is deprecated; use for .. = range Find(){} instead.
func FindEachWithBreak ¶
FindEachWithBreak works like FindEach but allows breaking out of the loop by returning false from the callback function `cb`. Important: this function is deprecated; use for .. = range Find(){} instead.
func RemoveFromTree ¶
func RemoveFromTree(n *Node)
RemoveFromTree removes a node and its subtree from the document tree it is in. If the node is the root of the tree, this is a no-op.
Types ¶
type DecoderOptions ¶
DecoderOptions implements the same options as the Decoder in the standard encoding/xml package. Please refer to its documentation: https://golang.org/pkg/encoding/xml/#Decoder
type Node ¶
type Node struct {
Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node
Type NodeType
Data string
Prefix string
NamespaceURI string
Attr []Attr
// contains filtered or unexported fields
}
A Node consists of a NodeType and some Data (the tag name for element nodes, the content for text nodes) and is part of a tree of Nodes.
func Find ¶
Find is like QueryAll but panics if `expr` is not a valid XPath expression. See `QueryAll()` function.
func FindOne ¶
FindOne is like Query but panics if `expr` is not a valid XPath expression. See `Query()` function.
func ParseWithOptions ¶
func ParseWithOptions(r io.Reader, options ParserOptions) (*Node, error)
ParseWithOptions is like Parse, but with custom options.
func Query ¶
Query searches the XML Node for nodes that match the specified XPath expr and returns the first matching element.
func QueryAll ¶
QueryAll searches the XML Node for all nodes that match the specified XPath expr. It returns an error if the expression `expr` cannot be parsed.
func QuerySelector ¶
QuerySelector returns the first XML Node matched by the specified XPath selector.
func QuerySelectorAll ¶
QuerySelectorAll returns all XML Nodes that match the specified XPath selector.
func (*Node) OutputXMLWithOptions ¶
func (n *Node) OutputXMLWithOptions(opts ...OutputOption) string
OutputXMLWithOptions returns the node's content as XML text, including tag names, formatted according to the given options.
func (*Node) RemoveAttr ¶
RemoveAttr removes the attribute with the specified name.
func (*Node) SelectAttr ¶
SelectAttr returns the attribute value with the specified name.
func (*Node) SelectElement ¶
SelectElement finds the first child element with the specified name.
func (*Node) SelectElements ¶
SelectElements finds child elements with the specified name.
type NodeNavigator ¶
type NodeNavigator struct {
// contains filtered or unexported fields
}
func CreateXPathNavigator ¶
func CreateXPathNavigator(top *Node) *NodeNavigator
CreateXPathNavigator creates a new xpath.NodeNavigator for the specified XML Node.
func (*NodeNavigator) Copy ¶
func (x *NodeNavigator) Copy() xpath.NodeNavigator
func (*NodeNavigator) Current ¶
func (x *NodeNavigator) Current() *Node
func (*NodeNavigator) LocalName ¶
func (x *NodeNavigator) LocalName() string
func (*NodeNavigator) MoveTo ¶
func (x *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool
func (*NodeNavigator) MoveToChild ¶
func (x *NodeNavigator) MoveToChild() bool
func (*NodeNavigator) MoveToFirst ¶
func (x *NodeNavigator) MoveToFirst() bool
func (*NodeNavigator) MoveToNext ¶
func (x *NodeNavigator) MoveToNext() bool
func (*NodeNavigator) MoveToNextAttribute ¶
func (x *NodeNavigator) MoveToNextAttribute() bool
func (*NodeNavigator) MoveToParent ¶
func (x *NodeNavigator) MoveToParent() bool
func (*NodeNavigator) MoveToPrevious ¶
func (x *NodeNavigator) MoveToPrevious() bool
func (*NodeNavigator) MoveToRoot ¶
func (x *NodeNavigator) MoveToRoot()
func (*NodeNavigator) NamespaceURL ¶
func (x *NodeNavigator) NamespaceURL() string
func (*NodeNavigator) NodeType ¶
func (x *NodeNavigator) NodeType() xpath.NodeType
func (*NodeNavigator) Prefix ¶
func (x *NodeNavigator) Prefix() string
func (*NodeNavigator) String ¶
func (x *NodeNavigator) String() string
func (*NodeNavigator) Value ¶
func (x *NodeNavigator) Value() string
type NodeType ¶
type NodeType uint
A NodeType is the type of a Node.
const (
	// DocumentNode is a document object that, as the root of the document tree,
	// provides access to the entire XML document.
	DocumentNode NodeType = iota
	// DeclarationNode is the document type declaration, indicated by the
	// following tag (for example, <!DOCTYPE...> ).
	DeclarationNode
	// ElementNode is an element (for example, <item> ).
	ElementNode
	// TextNode is the text content of a node.
	TextNode
	// CharDataNode node <![CDATA[content]]>
	CharDataNode
	// CommentNode a comment (for example, <!-- my comment --> ).
	CommentNode
	// AttributeNode is an attribute of element.
	AttributeNode
)
type OutputOption ¶
type OutputOption func(*outputConfiguration)
func WithEmptyTagSupport ¶
func WithEmptyTagSupport() OutputOption
WithEmptyTagSupport causes empty tags to be written as <empty/> rather than <empty></empty>.
func WithOutputSelf ¶
func WithOutputSelf() OutputOption
WithOutputSelf configures the Node to print the root node itself
func WithoutComments ¶
func WithoutComments() OutputOption
WithoutComments will skip comments in output
type ParserOptions ¶
type ParserOptions struct {
	Decoder  *DecoderOptions
	Prefixes map[string]string
}
type StreamParser ¶
type StreamParser struct {
// contains filtered or unexported fields
}
StreamParser enables loading and parsing an XML document in a streaming fashion.
func CreateStreamParser ¶
func CreateStreamParser(r io.Reader, streamElementXPath string, streamElementFilter ...string) (*StreamParser, error)
CreateStreamParser creates a StreamParser. Argument streamElementXPath is required. Argument streamElementFilter is optional and should only be used in advanced scenarios.
Scenario 1: simple case:
xml := `<AAA><BBB>b1</BBB><BBB>b2</BBB></AAA>`
sp, err := CreateStreamParser(strings.NewReader(xml), "/AAA/BBB")
if err != nil {
	panic(err)
}
for {
	n, err := sp.Read()
	if err != nil {
		break
	}
	fmt.Println(n.OutputXML(true))
}
Output will be:
<BBB>b1</BBB>
<BBB>b2</BBB>
Scenario 2: advanced case:
xml := `<AAA><BBB>b1</BBB><BBB>b2</BBB></AAA>`
sp, err := CreateStreamParser(strings.NewReader(xml), "/AAA/BBB", "/AAA/BBB[. != 'b1']")
if err != nil {
	panic(err)
}
for {
	n, err := sp.Read()
	if err != nil {
		break
	}
	fmt.Println(n.OutputXML(true))
}
Output will be:
<BBB>b2</BBB>
As the argument names indicate, streamElementXPath should be an XPath query that points only to the target element node, with no extra filtering on the element itself or its children. streamElementFilter, if needed, provides additional filtering on the target element and its children.
CreateStreamParser returns an error if either streamElementXPath or streamElementFilter, if provided, cannot be successfully parsed and compiled into a valid xpath query.
func CreateStreamParserWithOptions ¶
func CreateStreamParserWithOptions(r io.Reader, options ParserOptions, streamElementXPath string, streamElementFilter ...string) (*StreamParser, error)
CreateStreamParserWithOptions is like CreateStreamParser, but with custom options.
func (*StreamParser) Read ¶
func (sp *StreamParser) Read() (*Node, error)
Read returns a target node that satisfies the XPath specified by the caller at StreamParser creation time. If there are no more matching target nodes after reading the rest of the XML document, io.EOF is returned. Any XML parsing error encountered is returned immediately, and stream parsing stops. Calling Read() after an error has been returned (including io.EOF) results in undefined behavior. Also note that, due to the streaming nature, calling Read() automatically removes any previous target node(s) from the document tree.