libxml2

Version: v1.0.0 (latest). Published: Jun 6, 2019. License: MIT. Imports: 7. Imported by: 0.

Repository

github.com/KemalovMaulen/libxml2

README ¶

libxml2

Interface to libxml2, with DOM interface.

Why?

I needed to write go-xmlsec. That means building trees with libxml2 and then manipulating them in xmlsec. Because the two live in separate Go packages, we cannot (safely) pass C.xmlFooPtr objects between them (you also pay a penalty for pointer types). This package carefully avoids references to C.xmlFooPtr types and uses uintptr to pass data around, so other libraries that need to interact with libxml2 can do so safely.
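The uintptr hand-off described above can be sketched without cgo. This is only an illustration under stated assumptions: `PtrSource`, `fakeNode`, and `describeForC` are hypothetical names, not this package's API, and the integer stands in for an address that real code would obtain from a C pointer.

```go
package main

import (
	"errors"
	"fmt"
)

// PtrSource is a hypothetical interface: anything that can expose the
// address of its underlying C structure as a plain integer.
type PtrSource interface {
	Pointer() uintptr
}

// fakeNode stands in for a wrapper around a C node pointer. In a real cgo
// package the value would come from uintptr(unsafe.Pointer(cptr)).
type fakeNode struct {
	ptr uintptr
}

func (n fakeNode) Pointer() uintptr { return n.ptr }

// describeForC stands in for a function in a second package (for example
// an xmlsec binding). It only needs the integer address, so it never has
// to name a C type that belongs to another package.
func describeForC(src PtrSource) (string, error) {
	p := src.Pointer()
	if p == 0 {
		return "", errors.New("nil pointer")
	}
	return fmt.Sprintf("would hand 0x%x to C", p), nil
}

func main() {
	msg, err := describeForC(fakeNode{ptr: 0x10})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```

The consumer depends only on an interface and an integer, which is what lets two independently compiled cgo packages cooperate.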

Status

This library should be considered alpha grade. The API may still change.
Most of the commonly used libxml2 functionality that I rely on is already implemented and known to work.

Package Layout:

Name	Description
libxml2	Globally available utility functions, such as `ParseString`
types	Common data types, such as `types.Node`
parser	Parser routines
dom	DOM-like manipulation of XML document/nodes
xpath	XPath related tools
xsd	XML Schema related tools
clib	Wrapper around C libxml2 library - DO NOT TOUCH IF UNSURE

Features

Create XML documents using DOM-like interface:

  d := dom.CreateDocument()
  e, err := d.CreateElement("foo")
  if err != nil {
    println(err)
    return
  }
  d.SetDocumentElement(e)
  ...

Parse documents:

  d, err := libxml2.ParseString(xmlstring)
  if err != nil {
    println(err)
    return
  }

Use XPath to extract node values:

  text := xpath.String(node.Find("//xpath/expression"))

Examples

Basic XML Example

import (
  "log"
  "net/http"

  "github.com/KemalovMaulen/libxml2"
  "github.com/KemalovMaulen/libxml2/parser"
  "github.com/KemalovMaulen/libxml2/types"
  "github.com/KemalovMaulen/libxml2/xpath"
)

func ExampleXML() {
  res, err := http.Get("http://blog.golang.org/feed.atom")
  if err != nil {
    panic("failed to get blog.golang.org: " + err.Error())
  }

  defer res.Body.Close()

  p := parser.New()
  doc, err := p.ParseReader(res.Body)
  if err != nil {
    panic("failed to parse XML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Printf("%s", n.NodeName())
    return nil
  })

  root, err := doc.DocumentElement()
  if err != nil {
    log.Printf("Failed to fetch document element: %s", err)
    return
  }

  ctx, err := xpath.NewContext(root)
  if err != nil {
    log.Printf("Failed to create xpath context: %s", err)
    return
  }
  defer ctx.Free()

  ctx.RegisterNS("atom", "http://www.w3.org/2005/Atom")
  title := xpath.String(ctx.Find("/atom:feed/atom:title/text()"))
  log.Printf("feed title = %s", title)
}

Basic HTML Example

func ExampleHTML() {
  res, err := http.Get("http://golang.org")
  if err != nil {
    panic("failed to get golang.org: " + err.Error())
  }

  defer res.Body.Close()

  doc, err := libxml2.ParseHTMLReader(res.Body)
  if err != nil {
    panic("failed to parse HTML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Printf("%s", n.NodeName())
    return nil
  })

  nodes := xpath.NodeList(doc.Find(`//div[@id="menu"]/a`))
  for i := 0; i < len(nodes); i++ {
    log.Printf("Found node: %s", nodes[i].NodeName())
  }
}

XSD Validation

import (
  "io/ioutil"
  "log"
  "os"
  "path/filepath"

  "github.com/KemalovMaulen/libxml2"
  "github.com/KemalovMaulen/libxml2/xsd"
)

func ExampleXSD() {
  xsdfile := filepath.Join("test", "xmldsig-core-schema.xsd")
  f, err := os.Open(xsdfile)
  if err != nil {
    log.Printf("failed to open file: %s", err)
    return
  }
  defer f.Close()

  buf, err := ioutil.ReadAll(f)
  if err != nil {
    log.Printf("failed to read file: %s", err)
    return
  }

  s, err := xsd.Parse(buf)
  if err != nil {
    log.Printf("failed to parse XSD: %s", err)
    return
  }
  defer s.Free()

  d, err := libxml2.ParseString(`<foo></foo>`)
  if err != nil {
    log.Printf("failed to parse XML: %s", err)
    return
  }
  defer d.Free()

  if err := s.Validate(d); err != nil {
    for _, e := range err.(xsd.SchemaValidationError).Errors() {
      log.Printf("error: %s", e.Error())
    }
    return
  }

  log.Printf("validation successful!")
}

Caveats

Other libraries

There exist many similar libraries. I want speed, I want DOM, and I want XPath. When another library meets all of these, I'd be happy to switch to it.

For now my closest contender was xmlpath, but as of this writing it is somewhat slower for XPath evaluation:

shoebill% go test -v -run=none -benchmem -benchtime=5s -bench .
PASS
BenchmarkXmlpathXmlpath-4     500000         11737 ns/op         721 B/op          6 allocs/op
BenchmarkLibxml2Xmlpath-4    1000000          7627 ns/op         368 B/op         15 allocs/op
BenchmarkEncodingXMLDOM-4    2000000          4079 ns/op        4560 B/op          9 allocs/op
BenchmarkLibxml2DOM-4        1000000         11454 ns/op         264 B/op          7 allocs/op
ok      github.com/KemalovMaulen/libxml2  37.597s

Credits

Work on this library was generously sponsored by HDE Inc (https://www.hde.co.jp)

Documentation ¶

Overview ¶

Package libxml2 is an interface to the libxml2 library, providing XML and HTML parsers with a DOM interface. The inspiration is Perl5's XML::LibXML module.

This library is still in very early stages of development. API may still change without notice.

For the time being, the API is being written to stay as close as we can get to DOM Level 3, but some methods will be punted for now, and aliases for simpler methods that don't necessarily check for DOM correctness will be used instead.

Also, the return values are still shaky -- I'm still debating how to handle error cases gracefully.

Index ¶

func Parse(buf []byte, o ...parser.Option) (types.Document, error)
func ParseHTML(content []byte, options ...parser.HTMLOption) (types.Document, error)
func ParseHTMLReader(in io.Reader, options ...parser.HTMLOption) (types.Document, error)
func ParseHTMLString(content string, options ...parser.HTMLOption) (types.Document, error)
func ParseReader(rdr io.Reader, o ...parser.Option) (types.Document, error)
func ParseString(s string, o ...parser.Option) (types.Document, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Parse ¶

func Parse(buf []byte, o ...parser.Option) (types.Document, error)

Parse parses the given buffer and returns a Document.

func ParseHTML ¶

func ParseHTML(content []byte, options ...parser.HTMLOption) (types.Document, error)

ParseHTML parses an HTML document from the given byte slice. You can omit the options argument, or you can provide one bitwise-or'ed option.

func ParseHTMLReader ¶

func ParseHTMLReader(in io.Reader, options ...parser.HTMLOption) (types.Document, error)

ParseHTMLReader parses an HTML document from the given io.Reader. You can omit the options argument, or you can provide one bitwise-or'ed option.

func ParseHTMLString ¶

func ParseHTMLString(content string, options ...parser.HTMLOption) (types.Document, error)

ParseHTMLString parses an HTML document from the given string. You can omit the options argument, or you can provide one bitwise-or'ed option.

func ParseReader ¶

func ParseReader(rdr io.Reader, o ...parser.Option) (types.Document, error)

ParseReader parses XML from the given io.Reader and returns a Document.

func ParseString ¶

func ParseString(s string, o ...parser.Option) (types.Document, error)

ParseString parses the given string and returns a Document.

Types ¶

This section is empty.

Directories ¶

Path	Synopsis
clib	Package clib holds all of the dirty C interaction for go-libxml2.
dom
internal
cmd/genwrapnode
debug
parser
types	Package types exists to provide common types that are used throughout go-libxml2.
xpath	Package xpath contains tools to handle XPath evaluation.
xsd	Package xsd contains some of the tools available from libxml2 that allow you to validate your XML against an XSD. This is basically all you need to do: `schema, err := xsd.Parse(xsdsrc); if err != nil { panic(err) }; defer schema.Free(); if err := schema.Validate(doc); err != nil { for _, e := range err.(xsd.SchemaValidationError).Errors() { println(e.Error()) } }`
