xmltree

package
v1.0.0-beta.1
Published: Jan 27, 2024 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package xmltree converts XML documents into a tree of Go values.

The xmltree package provides types and routines for accessing and manipulating XML documents as trees, along with functionality to resolve XML namespace prefixes at any point in the tree.

Constants

This section is empty.

Variables

This section is empty.

Functions

func Encode

func Encode(w io.Writer, el *Element) error

Encode writes the XML encoding of the Element to w. Encode returns any errors encountered writing to w.

func Equal

func Equal(a, b *Element) bool

Equal returns true if two xmltree.Elements are equal, ignoring differences in white space, sub-element order, and namespace prefixes.
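
One way to picture this kind of loose comparison is to reduce each element to a canonical form before comparing. The sketch below is an illustration only, not the package's implementation: the node type, canonical, and looseEqual are hypothetical names, attributes are ignored, and namespace prefixes are assumed to have been resolved to URIs already.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// node is a hypothetical stand-in for an XML element; it is not xmltree.Element.
type node struct {
	Space, Local string // resolved namespace URI and local name
	Text         string // character data directly inside the element
	Children     []*node
}

// canonical renders n in a form that does not depend on surrounding
// white space or on the order of its children. It is a sketch: text
// containing brackets or commas could produce ambiguous strings.
func canonical(n *node) string {
	kids := make([]string, 0, len(n.Children))
	for _, c := range n.Children {
		kids = append(kids, canonical(c))
	}
	sort.Strings(kids)
	return fmt.Sprintf("{%s}%s[%s](%s)", n.Space, n.Local,
		strings.TrimSpace(n.Text), strings.Join(kids, ","))
}

// looseEqual reports whether a and b have the same canonical form.
func looseEqual(a, b *node) bool {
	return canonical(a) == canonical(b)
}

func main() {
	a := &node{Local: "chapter", Children: []*node{
		{Local: "title", Text: "One"},
		{Local: "number", Text: " 1 "},
	}}
	b := &node{Local: "chapter", Children: []*node{
		{Local: "number", Text: "1"},
		{Local: "title", Text: "One\n"},
	}}
	fmt.Println(looseEqual(a, b))
}
```

Comparing namespace URIs rather than prefixes is what makes the result independent of how a document happens to name its prefixes.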

func Marshal

func Marshal(el *Element) []byte

Marshal produces the XML encoding of an Element as a self-contained document. The xmltree package may adjust the declarations of XML namespaces if the Element has been modified, or is part of a larger scope, such that the document produced by Marshal is a valid XML document.

The return value of Marshal will use the utf-8 encoding regardless of the original encoding of the source document.

Example
package main

import (
	"fmt"
	"log"

	"github.com/weasel-software/go-xml/xmltree"
)

func main() {
	input := []byte(`<?xml version="1.0" encoding="UTF-8"?>
	<toc>
	  <chapter-list>
	    <chapter>
	      <title>Civilizing Huck.Miss Watson.Tom Sawyer Waits.</title>
	      <number>1</number>
	    </chapter>
	    <chapter>
	      <title>The Boys Escape Jim.Torn Sawyer's Gang.Deep-laid Plans.</title>
	      <number>2</number>
	    </chapter>
	    <chapter>
	      <title>A Good Going-over.Grace Triumphant."One of Tom Sawyers's Lies".</title>
	      <number>3</number>
	    </chapter>
	    <chapter>
	      <title>Huck and the Judge.Superstition.</title>
	      <number>4</number>
	    </chapter>
	  </chapter-list>
	</toc>`)

	var chapters []*xmltree.Element
	root, err := xmltree.Parse(input)
	if err != nil {
		log.Fatal(err)
	}

	for _, el := range root.Search("", "chapter") {
		for _, child := range el.Search("", "title") {
			el.Content = child.Content
		}
		el.Children = nil
		chapters = append(chapters, el)
	}
	root.Children = chapters
	fmt.Printf("%s\n", xmltree.MarshalIndent(root, "", "  "))

}
Output:

<toc>
  <chapter>Civilizing Huck.Miss Watson.Tom Sawyer Waits.</chapter>
  <chapter>The Boys Escape Jim.Torn Sawyer's Gang.Deep-laid Plans.</chapter>
  <chapter>A Good Going-over.Grace Triumphant."One of Tom Sawyers's Lies".</chapter>
  <chapter>Huck and the Judge.Superstition.</chapter>
</toc>

func MarshalIndent

func MarshalIndent(el *Element, prefix, indent string) []byte

MarshalIndent is like Marshal, but adds line breaks for each successive element. Each line begins with prefix and is followed by zero or more copies of indent according to the nesting depth.
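
The prefix and indent arguments are described in the same terms as those of xml.MarshalIndent in the standard library. As a rough illustration of how the two strings combine, the sketch below uses encoding/xml rather than xmltree; the chapter type and render helper are hypothetical, and xmltree's own output may differ in detail.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// chapter is a hypothetical type used only for this illustration.
type chapter struct {
	XMLName xml.Name `xml:"chapter"`
	Title   string   `xml:"title"`
}

// render marshals a fixed chapter value with the given prefix and indent,
// using the standard library's xml.MarshalIndent (not xmltree).
func render(prefix, indent string) (string, error) {
	out, err := xml.MarshalIndent(chapter{Title: "One"}, prefix, indent)
	return string(out), err
}

func main() {
	s, err := render("", "  ")
	if err != nil {
		fmt.Println(err)
		return
	}
	// The nested <title> line starts with one copy of indent.
	fmt.Println(s)
}
```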

func Unmarshal

func Unmarshal(el *Element, v interface{}) error

Unmarshal parses the XML encoding of the Element and stores the result in the value pointed to by v. Unmarshal follows the same rules as xml.Unmarshal, but only parses the portion of the XML document contained by the Element.

Example
package main

import (
	"fmt"
	"log"

	"github.com/weasel-software/go-xml/xmltree"
)

func main() {
	input := []byte(`<mediawiki xml:lang="en">
	  <page>
	    <title>Page title</title>
	    <restrictions>edit=sysop:move=sysop</restrictions>
	    <revision>
	      <timestamp>2001-01-15T13:15:00Z</timestamp>
	      <contributor><username>Foobar</username></contributor>
	      <comment>I have just one thing to say!</comment>
	      <text>A bunch of [[text]] here.</text>
	      <minor />
	    </revision>
	    <revision>
	      <timestamp>2001-01-15T13:10:27Z</timestamp>
	      <contributor><ip>10.0.0.2</ip></contributor>
	      <comment>new!</comment>
	      <text>An earlier [[revision]].</text>
	    </revision>
	  </page>
	  
	  <page>
	    <title>Talk:Page title</title>
	    <revision>
	      <timestamp>2001-01-15T14:03:00Z</timestamp>
	      <contributor><ip>10.0.0.2</ip></contributor>
	      <comment>hey</comment>
	      <text>WHYD YOU LOCK PAGE??!!! i was editing that jerk</text>
	    </revision>
	  </page>
	</mediawiki>`)

	type revision struct {
		Timestamp   string   `xml:"timestamp"`
		Contributor string   `xml:"contributor>ip"`
		Comment     string   `xml:"comment"`
		Text        []string `xml:"text"`
	}

	root, err := xmltree.Parse(input)
	if err != nil {
		log.Fatal(err)
	}

	// Pull all <revision> items from the input
	for _, el := range root.Search("", "revision") {
		var rev revision
		if err := xmltree.Unmarshal(el, &rev); err != nil {
			log.Print(err)
			continue
		}
		fmt.Println(rev.Timestamp, rev.Comment)
	}

}
Output:

2001-01-15T13:15:00Z I have just one thing to say!
2001-01-15T13:10:27Z new!
2001-01-15T14:03:00Z hey

Types

type Element

type Element struct {
	xml.StartElement
	// The XML namespace scope at this element's location in the
	// document.
	Scope
	// The raw content contained within this element's start and
	// end tags. Uses the underlying byte array passed to Parse.
	Content []byte
	// Sub-elements contained within this element.
	Children []*Element
}

An Element represents a single element in an XML document. Elements may have zero or more children. The byte array backing the Content field is shared among all elements in the document and should not be modified. An Element also captures the XML namespace prefixes in scope, so that arbitrary QNames in attribute values can be resolved.

func Parse

func Parse(doc []byte) (*Element, error)

Parse builds a tree of Elements by reading an XML document. The byte slice passed to Parse is expected to be a valid XML document with a single root element.
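
To make the idea of building a tree from a document concrete, here is a minimal sketch that uses only encoding/xml. It is not how xmltree.Parse is implemented: node and parseTree are hypothetical names, and the sketch records element names and children only, without the Content or Scope fields.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

// node is a hypothetical, simplified element type; it is not xmltree.Element.
type node struct {
	xml.StartElement
	Children []*node
}

// parseTree builds a tree of nodes from doc by keeping a stack of the
// elements that are currently open.
func parseTree(doc []byte) (*node, error) {
	dec := xml.NewDecoder(bytes.NewReader(doc))
	var root *node
	var stack []*node
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			n := &node{StartElement: t.Copy()}
			if len(stack) == 0 {
				if root != nil {
					return nil, fmt.Errorf("more than one root element")
				}
				root = n
			} else {
				parent := stack[len(stack)-1]
				parent.Children = append(parent.Children, n)
			}
			stack = append(stack, n)
		case xml.EndElement:
			// The decoder guarantees matched tags, so the stack is non-empty here.
			stack = stack[:len(stack)-1]
		}
	}
	if root == nil {
		return nil, fmt.Errorf("no root element")
	}
	return root, nil
}

func main() {
	root, err := parseTree([]byte(`<toc><chapter/><chapter/></toc>`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(root.Name.Local, len(root.Children))
}
```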

func (*Element) Attr

func (el *Element) Attr(space, local string) string

Attr gets the value of the first attribute whose name matches the space and local arguments. If space is the empty string, only attributes' local names are considered when looking for a match. If an attribute could not be found, the empty string is returned.

func (*Element) Flatten

func (el *Element) Flatten() []*Element

Flatten produces a slice of Element pointers referring to the children of el, and their children, in depth-first order.

func (*Element) Search

func (root *Element) Search(space, local string) []*Element

Search returns the Elements in the tree whose tag matches the given local name and XML namespace. If space is the empty string, any namespace is matched.

Example
package main

import (
	"fmt"
	"log"

	"github.com/weasel-software/go-xml/xmltree"
)

func main() {
	data := `
	  <Staff>
        <Person>
            <FullName>Ira Glass</FullName>
        </Person>
        <Person>
            <FullName>Tom Magliozzi</FullName>
        </Person>
        <Person>
            <FullName>Terry Gross</FullName>
        </Person>
    </Staff>
	`
	root, err := xmltree.Parse([]byte(data))
	if err != nil {
		log.Fatal(err)
	}
	for _, el := range root.Search("", "FullName") {
		fmt.Printf("%s\n", el.Content)
	}

}
Output:

Ira Glass
Tom Magliozzi
Terry Gross

func (*Element) SearchFunc

func (root *Element) SearchFunc(fn func(*Element) bool) []*Element

SearchFunc traverses the Element tree in depth-first order and returns a slice of Elements for which the function fn returns true.

Example
package main

import (
	"fmt"
	"log"

	"github.com/weasel-software/go-xml/xmltree"
)

func main() {
	data := `
	  <People>
        <Person>
            <FullName>Grace R. Emlin</FullName>
            <Email where="home">
                <Addr>gre@example.com</Addr>
            </Email>
            <Email where='work'>
                <Addr>gre@work.com</Addr>
            </Email>
        </Person>
        <Person>
            <FullName>Michael P. Thompson</FullName>
            <Email where="home">
                <Addr>michaelp@example.com</Addr>
            </Email>
            <Email where='work'>
                <Addr>michaelp@work.com</Addr>
                <Addr>michael.thompson@work.com</Addr>
            </Email>
        </Person>
    </People>
	`

	root, err := xmltree.Parse([]byte(data))
	if err != nil {
		log.Fatal(err)
	}

	workEmails := root.SearchFunc(func(el *xmltree.Element) bool {
		return el.Name.Local == "Email" && el.Attr("", "where") == "work"
	})

	for _, el := range workEmails {
		for _, addr := range el.Children {
			fmt.Printf("%s\n", addr.Content)
		}
	}

}
Output:

gre@work.com
michaelp@work.com
michael.thompson@work.com

func (*Element) SetAttr

func (el *Element) SetAttr(space, local, value string)

SetAttr adds an XML attribute to an Element's existing Attributes. If the attribute already exists, it is replaced.

func (*Element) String

func (el *Element) String() string

String returns the XML encoding of an Element and its children as a string.

type Scope

type Scope struct {
	// contains filtered or unexported fields
}

A Scope represents the XML namespace scope at a given position in the document.

func (*Scope) JoinScope

func (outer *Scope) JoinScope(inner *Scope) *Scope

The JoinScope method joins two Scopes together. When resolving prefixes using the returned scope, the prefix list in the argument Scope is searched before that of the receiver Scope.
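
The lookup order can be pictured with a small model in which a scope is an ordered list of prefix bindings. Everything in the sketch below (binding, scope, join, lookup) is hypothetical, since Scope's fields are unexported; it only demonstrates that bindings from the inner scope shadow those of the outer one.

```go
package main

import "fmt"

// binding associates a namespace prefix with a namespace URI.
type binding struct{ prefix, uri string }

// scope is a hypothetical model of a namespace scope: an ordered list of
// bindings in which later entries take precedence.
type scope struct{ ns []binding }

// join returns a scope containing the bindings of outer followed by
// those of inner, so that inner bindings are found first by lookup.
func (outer scope) join(inner scope) scope {
	joined := make([]binding, 0, len(outer.ns)+len(inner.ns))
	joined = append(joined, outer.ns...)
	joined = append(joined, inner.ns...)
	return scope{ns: joined}
}

// lookup searches from the most recent binding backwards.
func (s scope) lookup(prefix string) (string, bool) {
	for i := len(s.ns) - 1; i >= 0; i-- {
		if s.ns[i].prefix == prefix {
			return s.ns[i].uri, true
		}
	}
	return "", false
}

func main() {
	outer := scope{ns: []binding{{"a", "urn:outer-a"}, {"b", "urn:outer-b"}}}
	inner := scope{ns: []binding{{"a", "urn:inner-a"}}}
	joined := outer.join(inner)

	uri, _ := joined.lookup("a")
	fmt.Println(uri) // urn:inner-a
	uri, _ = joined.lookup("b")
	fmt.Println(uri) // urn:outer-b
}
```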

func (*Scope) Prefix

func (scope *Scope) Prefix(name xml.Name) (qname string)

Prefix is the inverse of Resolve. It uses the closest prefix defined for a namespace to create a string of the form prefix:local. If the namespace cannot be found, or is the default namespace, an unqualified name is returned.

func (*Scope) Resolve

func (scope *Scope) Resolve(qname string) xml.Name

Resolve translates an XML QName (namespace-prefixed string) to an xml.Name with a canonicalized namespace in its Space field. This can be used when working with XSD documents, which put QNames in attribute values. If qname does not have a prefix, the default namespace is used. If a namespace prefix cannot be resolved, the returned value's Space field will be the unresolved prefix. Use the ResolveNS method to detect when a namespace prefix cannot be resolved.

func (*Scope) ResolveDefault

func (scope *Scope) ResolveDefault(qname, defaultns string) xml.Name

ResolveDefault is like Resolve, but allows for the default namespace to be overridden. The namespace of strings without a namespace prefix (known as an NCName in XML terminology) will be defaultns.

func (*Scope) ResolveNS

func (scope *Scope) ResolveNS(qname string) (xml.Name, bool)

The ResolveNS method is like Resolve, but returns false for its second return value if a namespace prefix cannot be resolved.
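
The relationship between Resolve, ResolveDefault, and ResolveNS can be sketched with a plain map from prefix to namespace URI. The scope type and its methods below are hypothetical stand-ins that follow the rules described above; the treatment of an unprefixed name when no default namespace is declared (empty Space, reported as resolved) is an assumption of this sketch.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// scope is a hypothetical model: prefix -> namespace URI, with the empty
// prefix holding the default namespace.
type scope map[string]string

// resolveNS splits qname at the first colon and looks up the prefix.
// The boolean is false when a prefix is present but not declared; in
// that case Space holds the unresolved prefix.
func (s scope) resolveNS(qname string) (xml.Name, bool) {
	prefix, local, found := strings.Cut(qname, ":")
	if !found {
		// No prefix: use the default namespace, if any.
		return xml.Name{Space: s[""], Local: qname}, true
	}
	uri, ok := s[prefix]
	if !ok {
		return xml.Name{Space: prefix, Local: local}, false
	}
	return xml.Name{Space: uri, Local: local}, true
}

// resolve is resolveNS without the success flag.
func (s scope) resolve(qname string) xml.Name {
	name, _ := s.resolveNS(qname)
	return name
}

// resolveDefault uses defaultns for names that carry no prefix.
func (s scope) resolveDefault(qname, defaultns string) xml.Name {
	if !strings.Contains(qname, ":") {
		return xml.Name{Space: defaultns, Local: qname}
	}
	return s.resolve(qname)
}

func main() {
	s := scope{"": "urn:default", "xs": "http://www.w3.org/2001/XMLSchema"}

	n := s.resolve("xs:string")
	fmt.Println(n.Space, n.Local)

	n, ok := s.resolveNS("foo:bar")
	fmt.Println(n.Space, n.Local, ok)

	n = s.resolveDefault("item", "urn:override")
	fmt.Println(n.Space, n.Local)
}
```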
