wordpressxml

v0.3.1 Latest
Warning

This package is not in the latest version of its module.

Published: Jul 7, 2024 License: MIT Imports: 10 Imported by: 0

README

WordPress XML Parser

Overview

The go-wordpressxml package provides a WordPress WXR (WordPress eXtended RSS) XML parser.

Documentation

Documentation is provided using godoc and available on GoDoc.org.

Installation

Installing any of the packages will install the entire library. For example:

$ go get github.com/grokify/go-wordpressxml

Usage

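package main

import (
	"github.com/grokify/go-wordpressxml"
)

func main() {
	wp := wordpressxml.NewWordPressXML()
	// Parse the WordPress WXR export file.
	if err := wp.ReadFile("myblog.wordpress.2016-08-13.xml"); err != nil {
		panic(err)
	}
	// Write the article metadata as CSV; WriteMetaCSV also returns an error.
	if err := wp.WriteMetaCSV("articles.csv"); err != nil {
		panic(err)
	}
}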

Notes

WordPress stores post bodies and excerpts in the content:encoded and excerpt:encoded elements. Go's built-in XML parser (encoding/xml) matches a field tagged encoded by its local name regardless of namespace, so both elements map to the same field. This parser therefore retrieves them as a slice of encoded values (the Encoded field) and then moves the data into the Content property.
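The behavior described in this note can be reproduced with the standard library alone. The following is a minimal sketch, not the package's own code: the item type, the sample fragment, and the parseItem helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// item is a hypothetical, cut-down stand-in for the package's Item type.
type item struct {
	Title   string   `xml:"title"`
	Encoded []string `xml:"encoded"` // no namespace in the tag: matches <encoded> in any namespace
}

// sample is a hand-written WXR-style fragment, not real export data.
const sample = `<item xmlns:content="http://purl.org/rss/1.0/modules/content/"
      xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/">
  <title>Hello</title>
  <content:encoded><![CDATA[<p>Body</p>]]></content:encoded>
  <excerpt:encoded><![CDATA[Summary]]></excerpt:encoded>
</item>`

// parseItem decodes a single item fragment (hypothetical helper).
func parseItem(data string) (item, error) {
	var it item
	err := xml.Unmarshal([]byte(data), &it)
	return it, err
}

func main() {
	it, err := parseItem(sample)
	if err != nil {
		panic(err)
	}
	// Both namespaced elements are appended to the same slice, in document order.
	for i, e := range it.Encoded {
		fmt.Printf("%d: %s\n", i, e)
	}
}
```

Because the two values are distinguished only by their position in the slice, a parser that needs them separately has to assign them to named fields after decoding, which is what the note describes for Content.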

Contributing

Features, Issues, and Pull Requests are always welcome.

To contribute:

  1. Fork it ( http://github.com/grokify/go-wordpressxml/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Please report issues and feature requests on GitHub.

Documentation

Overview

wordpressxml provides a WordPress XML parser with metadata.

Functions

func ReadFileWXR

func ReadFileWXR(filename string) (wxr.Wxr, error)

Types

type Author

type Author struct {
	AuthorID          int        `xml:"author_id"`
	AuthorLogin       string     `xml:"author_login"`
	AuthorEmail       string     `xml:"author_email"`
	AuthorDisplayName string     `xml:"author_display_name"`
	AuthorFirstName   string     `xml:"author_first_name"`
	AuthorLastName    string     `xml:"author_last_name"`
	AuthorArticles    []ItemThin `xml:"-"`
}

Author is the WordPress XML author object.

type Category

type Category struct {
	Domain      string `xml:"domain,attr"`
	DisplayName string `xml:",chardata"`
	URLSlug     string `xml:"nicename,attr"`
}
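As an illustration of how these struct tags map onto a category element, here is a small sketch using only encoding/xml. The lowercase category type copies the tags above; the sample element and the parseCategory helper are assumptions made for the example, not part of the package.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// category copies the struct tags of the Category type shown above.
type category struct {
	Domain      string `xml:"domain,attr"`   // domain="..." attribute
	DisplayName string `xml:",chardata"`     // element text, including CDATA
	URLSlug     string `xml:"nicename,attr"` // nicename="..." attribute
}

// parseCategory decodes one <category> element (hypothetical helper).
func parseCategory(data string) (category, error) {
	var c category
	err := xml.Unmarshal([]byte(data), &c)
	return c, err
}

func main() {
	// Hand-written sample element, not real export data.
	c, err := parseCategory(`<category domain="post_tag" nicename="go-lang"><![CDATA[Go Lang]]></category>`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("domain=%s slug=%s name=%s\n", c.Domain, c.URLSlug, c.DisplayName)
}
```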

type Channel

type Channel struct {
	Title   string   `xml:"title"`
	Link    string   `xml:"link"`
	Authors []Author `xml:"author"`
	Items   []Item   `xml:"item"`
}

type Comment

type Comment struct {
	ID          int    `xml:"comment_id"`
	Parent      int    `xml:"comment_parent"`
	Author      string `xml:"comment_author"`
	AuthorEmail string `xml:"comment_author_email"`
	AuthorURL   string `xml:"comment_author_url"`
	DateGmt     string `xml:"comment_date_gmt"`
	Content     string `xml:"comment_content"`
	IndentLevel int    `xml:"-"`
}

type Item

type Item struct {
	ID              int        `xml:"post_id"`
	Title           string     `xml:"title"`
	Creator         string     `xml:"creator"`
	Encoded         []string   `xml:"encoded"`
	IsSticky        int        `xml:"is_sticky"`
	Link            string     `xml:"link"`
	PubDate         string     `xml:"pubDate"`
	Description     string     `xml:"description"`
	PostDate        string     `xml:"post_date"`
	PostDateGMT     string     `xml:"post_date_gmt"`
	PostName        string     `xml:"post_name"`
	PostType        string     `xml:"post_type"`
	Status          string     `xml:"status"`
	Categories      []Category `xml:"category"`
	Comments        []Comment  `xml:"comment"`
	Content         string
	PostDatetime    time.Time
	PostDatetimeGMT time.Time
	PubDatetime     time.Time
}

Item is a WordPress XML item which can be a post, page or other object.

type ItemThin

type ItemThin struct {
	Title string
	Index int
}

ItemThin is a WordPress XML item that is used as additional metadata in the Author object.

type RSS added in v0.2.0

type RSS struct {
	Channel Channel `xml:"channel"`
}

type WordPressXML added in v0.2.0

type WordPressXML struct {
	Channel        Channel `xml:"channel"`
	CreatorCounts  map[string]int
	CreatorToIndex map[string]int
}

func NewWordPressXML added in v0.2.0

func NewWordPressXML() WordPressXML

func (*WordPressXML) ArticlesMetaTable added in v0.2.0

func (wpxml *WordPressXML) ArticlesMetaTable() table.Table

ArticlesMetaTable generates the data to be written out as a CSV.

func (*WordPressXML) AuthorForLogin added in v0.2.0

func (wpxml *WordPressXML) AuthorForLogin(authorLogin string) (Author, error)

AuthorForLogin returns the Author object for a given AuthorLogin or username.

func (*WordPressXML) AuthorsToIndex added in v0.2.0

func (wpxml *WordPressXML) AuthorsToIndex() map[string]int

func (*WordPressXML) ItemsToHTML added in v0.3.0

func (wpxml *WordPressXML) ItemsToHTML(filepath, title string) error

ItemsToHTML generates a simple HTML file from the items in a WordPress blog.

func (*WordPressXML) ReadFile added in v0.2.0

func (wpxml *WordPressXML) ReadFile(filepath string) error

ReadFile reads a WordPress XML file from the provided path.

func (*WordPressXML) WriteMetaCSV added in v0.2.0

func (wpxml *WordPressXML) WriteMetaCSV(filepath string) error

WriteMetaCSV writes articles metadata as a CSV file.

Directories

Path Synopsis
hugo converts WordPress XML to Hugo posts.
