decoder

package
v0.0.0-...-a989888 Latest
Warning: This package is not in the latest version of its module.
Published: Nov 28, 2023 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package decoder implements a high performance decoder for wiki pages.

It uses xml.Decoder.RawToken to speed up parsing and Page.UnmarshalXML to reduce allocation.
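To illustrate the RawToken approach mentioned above, here is a minimal sketch that walks a small document with xml.Decoder.RawToken from the standard library. It is not taken from this package's source; the helper name elementNames and the sample input are assumptions made for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// elementNames is a hypothetical helper (not part of this package).
// It returns the local names of all start elements in src, in document
// order, reading tokens with RawToken. RawToken does not verify that
// start and end elements match and does not translate namespace prefixes.
func elementNames(src string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var names []string
	for {
		tok, err := dec.RawToken()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			names = append(names, start.Name.Local)
		}
	}
}

func main() {
	// Made-up input shaped loosely like a dump with mediawiki and siteinfo elements.
	const sample = `<mediawiki><siteinfo></siteinfo><page><title>Go</title></page></mediawiki>`
	names, err := elementNames(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

Because RawToken skips the matching and namespace work that Token performs, a loop like this does less per token, which is presumably where the speed-up described in the overview comes from.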

Types

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

Decoder is an XML decoder tailored to the Wikipedia dataset.

func New

func New(r io.Reader) (*Decoder, error)

New instantiates a new Decoder.

It fails if it cannot find the mediawiki and siteinfo elements in the dataset.

func (*Decoder) Err

func (d *Decoder) Err() error

Err returns any error encountered by Next.

func (*Decoder) Next

func (d *Decoder) Next() bool

Next moves to the next element.

func (*Decoder) Scan

func (d *Decoder) Scan(v xml.Unmarshaler) error

Scan unmarshals the current element into v.

Beware: the UnmarshalXML method of v must read its tokens by calling RawToken.

type Page

type Page struct {
	ID        int64     `xml:"-"`
	UpdatedAt time.Time `xml:"revision>timestamp"`
	Title     string    `xml:"title"`
	Text      string    `xml:"revision>text"`
}

Page represents a page from Wikipedia.
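To show what the struct tags on Page select, here is a small sketch that decodes a made-up page element with the standard xml.Unmarshal into a local struct carrying the same tags. The lowercase page type, the parsePage helper, and the sample XML are assumptions for illustration; the package's own Page.UnmarshalXML may handle fields differently, in particular ID, which the tags exclude.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// page mirrors the documented Page fields and tags. It is a local
// stand-in for illustration, not the package's type.
type page struct {
	ID        int64     `xml:"-"`                  // skipped by encoding/xml
	UpdatedAt time.Time `xml:"revision>timestamp"` // <revision><timestamp>
	Title     string    `xml:"title"`              // <title>
	Text      string    `xml:"revision>text"`      // <revision><text>
}

// parsePage is a hypothetical helper that decodes one page element.
func parsePage(src string) (page, error) {
	var p page
	err := xml.Unmarshal([]byte(src), &p)
	return p, err
}

func main() {
	// Made-up sample input.
	const sample = `<page>
  <title>Go</title>
  <id>1</id>
  <revision>
    <timestamp>2023-11-28T00:00:00Z</timestamp>
    <text>Go is a programming language.</text>
  </revision>
</page>`

	p, err := parsePage(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Title, p.UpdatedAt.Format(time.RFC3339), len(p.Text), p.ID)
}
```

With the standard decoder, the a>b tag form descends into nested elements, the timestamp is parsed through time.Time's text unmarshaling, and ID stays zero because of the "-" tag.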

func (*Page) UnmarshalXML

func (p *Page) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML unmarshals an XML element into the page.
