libxml2

package module
v1.0.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 6, 2019 License: MIT Imports: 7 Imported by: 0

README

libxml2

Interface to libxml2, with DOM interface.

Why?

I needed to write go-xmlsec. This means we need to build trees using libxml2 and then manipulate them in xmlsec. Because these are two separate Go packages, we cannot (safely) pass C.xmlFooPtr objects between them (you also pay a penalty for pointer types). This package carefully avoids references to C.xmlFooPtr types and uses uintptr to pass data around, so that other libraries that need to interact with libxml2 can do so safely.
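The uintptr approach can be illustrated without cgo at all. The following is a minimal sketch, not the library's actual code: the names PtrSource, node, and describe are assumptions made up for this illustration, and a made-up address stands in for memory that libxml2 would own. The point is only that a plain uintptr crosses package boundaries freely, whereas cgo types are local to the package that declares them.

```go
package main

import "fmt"

// PtrSource is a hypothetical interface for this sketch: anything that
// can hand out its underlying address as a plain uintptr.
type PtrSource interface {
	Pointer() uintptr
}

// node stands in for a wrapper around a C-allocated XML node.
// In real code the value would come from libxml2; here it is made up.
type node struct {
	ptr uintptr
}

// Pointer returns the raw address without exposing any cgo type.
func (n node) Pointer() uintptr { return n.ptr }

// describe plays the role of a second package (for example an xmlsec
// binding). It only needs the uintptr, never a C.xmlFooPtr.
func describe(src PtrSource) string {
	return fmt.Sprintf("handle 0x%x", src.Pointer())
}

func main() {
	n := node{ptr: 0xdeadbeef}
	fmt.Println(describe(n))
}
```

A consumer that does use cgo would convert the uintptr back to its own package-local C pointer type before calling into C.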

Status

  • This library should be considered alpha grade. The API may still change.
  • Most of the commonly used libxml2 functionality that I use is already there and is known to work.

Package Layout:

Name     Description
libxml2  Globally available utility functions, such as ParseString
types    Common data types, such as types.Node
parser   Parser routines
dom      DOM-like manipulation of XML document/nodes
xpath    XPath related tools
xsd      XML Schema related tools
clib     Wrapper around C libxml2 library - DO NOT TOUCH IF UNSURE

Features

Create XML documents using DOM-like interface:

  d := dom.CreateDocument()
  e, err := d.CreateElement("foo")
  if err != nil {
    println(err)
    return
  }
  d.SetDocumentElement(e)
  ...

Parse documents:

  d, err := libxml2.ParseString(xmlstring)
  if err != nil {
    println(err)
    return
  }

Use XPath to extract node values:

  text := xpath.String(node.Find("//xpath/expression"))
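The nested call above relies on a Go rule: a function that returns several values can be passed directly to a function whose parameters match those values. The sketch below shows the idiom in isolation. The names result, find, and asString are hypothetical stand-ins invented for this example, and the choice to return an empty string on error is an assumption of the sketch, not a statement about the library.

```go
package main

import (
	"errors"
	"fmt"
)

// result is a hypothetical stand-in for an XPath result.
type result struct{ text string }

// find is a hypothetical lookup that returns a result and an error.
func find(expr string) (*result, error) {
	if expr == "" {
		return nil, errors.New("empty expression")
	}
	return &result{text: "value of " + expr}, nil
}

// asString accepts both return values of find, so the two calls can be
// nested. It collapses any failure into the empty string.
func asString(r *result, err error) string {
	if err != nil || r == nil {
		return ""
	}
	return r.text
}

func main() {
	fmt.Println(asString(find("//title")))
	fmt.Printf("%q\n", asString(find("")))
}
```

The convenience comes at a cost: the error is swallowed. When the reason for a failure matters, call the lookup on its own and inspect the error before converting the result.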

Examples

Basic XML Example

import (
  "log"
  "net/http"

  "github.com/KemalovMaulen/libxml2"
  "github.com/KemalovMaulen/libxml2/parser"
  "github.com/KemalovMaulen/libxml2/types"
  "github.com/KemalovMaulen/libxml2/xpath"
)

func ExampleXML() {
  res, err := http.Get("http://blog.golang.org/feed.atom")
  if err != nil {
    panic("failed to get blog.golang.org: " + err.Error())
  }

  defer res.Body.Close()

  p := parser.New()
  doc, err := p.ParseReader(res.Body)

  if err != nil {
    panic("failed to parse XML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Print(n.NodeName())
    return nil
  })

  root, err := doc.DocumentElement()
  if err != nil {
    log.Printf("Failed to fetch document element: %s", err)
    return
  }

  ctx, err := xpath.NewContext(root)
  if err != nil {
    log.Printf("Failed to create xpath context: %s", err)
    return
  }
  defer ctx.Free()

  ctx.RegisterNS("atom", "http://www.w3.org/2005/Atom")
  title := xpath.String(ctx.Find("/atom:feed/atom:title/text()"))
  log.Printf("feed title = %s", title)
}

Basic HTML Example

func ExampleHTML() {
  res, err := http.Get("http://golang.org")
  if err != nil {
    panic("failed to get golang.org: " + err.Error())
  }
  defer res.Body.Close()

  doc, err := libxml2.ParseHTMLReader(res.Body)
  if err != nil {
    panic("failed to parse HTML: " + err.Error())
  }
  defer doc.Free()

  doc.Walk(func(n types.Node) error {
    log.Print(n.NodeName())
    return nil
  })

  nodes := xpath.NodeList(doc.Find(`//div[@id="menu"]/a`))
  for i := 0; i < len(nodes); i++ {
    log.Printf("Found node: %s", nodes[i].NodeName())
  }
}

XSD Validation

import (
  "io/ioutil"
  "log"
  "os"
  "path/filepath"

  "github.com/KemalovMaulen/libxml2"
  "github.com/KemalovMaulen/libxml2/xsd"
)

func ExampleXSD() {
  xsdfile := filepath.Join("test", "xmldsig-core-schema.xsd")
  f, err := os.Open(xsdfile)
  if err != nil {
    log.Printf("failed to open file: %s", err)
    return
  }
  defer f.Close()

  buf, err := ioutil.ReadAll(f)
  if err != nil {
    log.Printf("failed to read file: %s", err)
    return
  }

  s, err := xsd.Parse(buf)
  if err != nil {
    log.Printf("failed to parse XSD: %s", err)
    return
  }
  defer s.Free()

  d, err := libxml2.ParseString(`<foo></foo>`)
  if err != nil {
    log.Printf("failed to parse XML: %s", err)
    return
  }
  defer d.Free()

  if err := s.Validate(d); err != nil {
    for _, e := range err.(xsd.SchemaValidationError).Errors() {
      log.Printf("error: %s", e.Error())
    }
    return
  }

  log.Printf("validation successful!")
}

Caveats

Other libraries

There are many similar libraries. I want speed, I want DOM, and I want XPath. When another library meets all of these, I'd be happy to switch to it.

For now the closest contender is xmlpath, but as of this writing it lags a bit in XPath speed:

shoebill% go test -v -run=none -benchmem -benchtime=5s -bench .
PASS
BenchmarkXmlpathXmlpath-4     500000         11737 ns/op         721 B/op          6 allocs/op
BenchmarkLibxml2Xmlpath-4    1000000          7627 ns/op         368 B/op         15 allocs/op
BenchmarkEncodingXMLDOM-4    2000000          4079 ns/op        4560 B/op          9 allocs/op
BenchmarkLibxml2DOM-4        1000000         11454 ns/op         264 B/op          7 allocs/op
ok      github.com/KemalovMaulen/libxml2  37.597s

See Also

Credits

Documentation

Overview

Package libxml2 is an interface to the libxml2 library, providing XML and HTML parsers with a DOM interface. The inspiration is Perl5's XML::LibXML module.

This library is still in very early stages of development. API may still change without notice.

For the time being, the API is being written to be as close as we can get to DOM Level 3, but some methods will, for now, be punted, and aliases for simpler methods that don't necessarily check for the DOM's correctness will be used.

Also, the return values are still shaky -- I'm still debating how to handle error cases gracefully.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Parse

func Parse(buf []byte, o ...parser.Option) (types.Document, error)

Parse parses the given buffer and returns a Document.

func ParseHTML

func ParseHTML(content []byte, options ...parser.HTMLOption) (types.Document, error)

ParseHTML parses an HTML document. You can omit the options argument, or you can provide one bitwise-or'ed option.
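The combination of a variadic parameter and bit flags is what makes both call styles possible. Here is a minimal sketch of that pattern. The Option type, its constants, and the combine helper are hypothetical names chosen for this example; they are not the identifiers defined in the parser package.

```go
package main

import "fmt"

// Option is a hypothetical bit-flag type for this sketch.
type Option int

// Each constant occupies its own bit so that values can be OR'ed together.
const (
	OptRecover Option = 1 << iota
	OptNoError
	OptNoWarning
)

// combine folds a variadic option list into one bitmask.
// Passing no options yields 0, which stands for "defaults".
func combine(opts ...Option) Option {
	var mask Option
	for _, o := range opts {
		mask |= o
	}
	return mask
}

func main() {
	fmt.Println(combine())                          // no options at all
	fmt.Println(combine(OptRecover | OptNoWarning)) // one OR'ed value
}
```

With this shape, a caller who wants defaults passes nothing, and a caller who wants several flags passes a single value built with the | operator.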

func ParseHTMLReader

func ParseHTMLReader(in io.Reader, options ...parser.HTMLOption) (types.Document, error)

ParseHTMLReader parses an HTML document. You can omit the options argument, or you can provide one bitwise-or'ed option.

func ParseHTMLString

func ParseHTMLString(content string, options ...parser.HTMLOption) (types.Document, error)

ParseHTMLString parses an HTML document. You can omit the options argument, or you can provide one bitwise-or'ed option.

func ParseReader

func ParseReader(rdr io.Reader, o ...parser.Option) (types.Document, error)

ParseReader parses XML from the given io.Reader and returns a Document.

func ParseString

func ParseString(s string, o ...parser.Option) (types.Document, error)

ParseString parses the given string and returns a Document.

Types

This section is empty.

Directories

Path      Synopsis
clib      Package clib holds all of the dirty C interaction for go-libxml2.
internal
types     Package types exists to provide common types that are used throughout go-libxml2.
xpath     Package xpath contains tools to handle XPath evaluation.
xsd       Package xsd contains some of the tools available from libxml2 that allow you to validate your XML against an XSD.

For xsd, this is basically all you need to do:

  schema, err := xsd.Parse(xsdsrc)
  if err != nil {
    panic(err)
  }
  defer schema.Free()
  if err := schema.Validate(doc); err != nil {
    for _, e := range err.(xsd.SchemaValidationError).Errors() {
      println(e.Error())
    }
  }
