pandoc_client

package module
v0.0.0-...-46e0f52
Published: Nov 5, 2022 License: BSD-3-Clause Imports: 11 Imported by: 0

README

Pandoc Client

This repository holds an experimental Go package that functions as a client to pandoc-server. Caltech Library builds many of its static websites using Pandoc to render Markdown (and other formats) to HTML (or other formats). This is typically done via scripting (e.g. Bash, Python) or a dedicated application (e.g. something written in Go). For a small number of pages on a slowly changing website, running Pandoc via an "exec" for each page makes sense. This approach breaks down when you have a website (e.g. https://feeds.library.caltech.edu) that has 118545 documents and is growing.

Concept

The pandoc server is launched via systemd when the machine starts up. It listens on a localhost port ONLY. When the site building processes start up, they read the documents that need to be converted from either a database (e.g. SQLite3, MySQL, Postgres) or the file system. That content is then turned into a structure that the Pandoc server understands and is sent to it as a POST, per the Pandoc server documentation. The response is then written to disk (or an S3 bucket) as appropriate.
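The request/response cycle described above can be sketched in plain Go. This is a minimal illustration, not this package's implementation: the `request` type and the `buildPayload` and `postToPandoc` helpers are hypothetical names, only three of the server's options are sent, and the server is assumed to be listening on localhost port 3030 (the default mentioned in the Config documentation).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// request holds the three fields this sketch sends (hypothetical type);
// pandoc-server accepts many more options.
type request struct {
	From string `json:"from"`
	To   string `json:"to"`
	Text string `json:"text"`
}

// buildPayload encodes a conversion request as JSON.
func buildPayload(from, to, text string) ([]byte, error) {
	return json.Marshal(request{From: from, To: to, Text: text})
}

// postToPandoc POSTs the payload to the given URL and returns the response body.
func postToPandoc(client *http.Client, url string, payload []byte) ([]byte, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "text/plain")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("pandoc-server returned %s: %s", resp.Status, body)
	}
	return body, nil
}

func main() {
	payload, err := buildPayload("markdown", "html5", "# Hello\n\nSome *text*.")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	client := &http.Client{Timeout: 10 * time.Second}
	// Assumption: pandoc-server is listening on localhost port 3030.
	html, err := postToPandoc(client, "http://localhost:3030/", payload)
	if err != nil {
		fmt.Println("request failed (is pandoc-server running?):", err)
		return
	}
	fmt.Printf("%s\n", html)
}
```

Reusing one `http.Client` across requests lets Go keep the connection to the server open, which fits the goal of avoiding per-document startup cost.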

This should result in a relatively simple Go package that can work with io.Reader and io.Writer types for maximum flexibility. Our hypothesis is that, combined with other services, we would see improved performance in rendering the website. The expectation is that pandoc-server launches once and runs as a single process, so there is no per-document startup overhead. The existing overhead of reading from the data source does not change. The documents are small for the most part, so the network overhead between the client and pandoc-server should be minimal (they are running on the same machine, after all). Writing the rendered document should cost the same as in our previous approach. The wind-down of a process after each exec is avoided. We should be able to run conversions in parallel without worrying about running out of process handles. More parallel writes should lower the overall time of an update.
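One way to run conversions in parallel is a small worker pool that bounds the number of in-flight requests. The sketch below is an illustration only: `convertAll`, `convertFunc`, and `fakeConvert` are hypothetical names, and `fakeConvert` stands in for a real call to pandoc-server so the example runs on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// convertFunc stands in for a call to pandoc-server (hypothetical type).
type convertFunc func(src string) (string, error)

// convertAll runs convert over docs using at most workers goroutines.
// Results and errors are returned in the same order as docs.
func convertAll(docs []string, workers int, convert convertFunc) ([]string, []error) {
	if workers < 1 {
		workers = 1
	}
	out := make([]string, len(docs))
	errs := make([]error, len(docs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so the
			// writes to out and errs never touch the same element.
			for i := range jobs {
				out[i], errs[i] = convert(docs[i])
			}
		}()
	}
	for i := range docs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out, errs
}

// fakeConvert is a placeholder converter so the sketch runs without a server.
func fakeConvert(src string) (string, error) {
	return "<p>" + src + "</p>", nil
}

func main() {
	out, errs := convertAll([]string{"one", "two", "three"}, 2, fakeConvert)
	for i := range out {
		fmt.Println(out[i], errs[i])
	}
}
```

The worker count is the tuning knob here: it caps concurrent requests to the single pandoc-server process regardless of how many documents are queued.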

Requirements

  • Go 1.19.2 or better
  • Pandoc 2.19 or better
  • A data source (e.g. file system with markdown documents)
  • A place to write the output (e.g. a file system with rendered documents)

Documentation

Overview

Copyright (c) 2022, Caltech
All rights not granted herein are expressly reserved by Caltech.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Index

Constants

const (
	// Version of package
	Version = "0.0.1"

	License = `` /* 1524-byte string literal not displayed */

)

Variables

var (
	// DefaultExtTypes maps file extensions to document types. This allows the "to", "from"
	// Pandoc options to be set based on file extension. This can be overwritten by setting
	// `.ext_types` in the JSON configuration file.
	DefaultExtTypes = map[string]string{
		".md":   "markdown",
		".html": "html5",
	}
)
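As a rough illustration of how such a map can drive the "from" and "to" options, the sketch below looks up a document type from a file name's extension. The `extTypes` variable is a local copy of the map above, and `docType` is a hypothetical helper, not part of this package.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// extTypes is a local copy of the DefaultExtTypes map shown above.
var extTypes = map[string]string{
	".md":   "markdown",
	".html": "html5",
}

// docType returns the document type for a file name based on its extension.
// The boolean is false when the extension has no mapping.
func docType(types map[string]string, name string) (string, bool) {
	t, ok := types[filepath.Ext(name)]
	return t, ok
}

func main() {
	for _, name := range []string{"index.md", "about.html", "notes.txt"} {
		if t, ok := docType(extTypes, name); ok {
			fmt.Printf("%s -> %s\n", name, t)
		} else {
			fmt.Printf("%s -> (no mapping)\n", name)
		}
	}
}
```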

Functions

This section is empty.

Types

type Config

type Config struct {
	// Port defaults to 3030, it is the port number that pandoc-server listens on
	Port string `json:"port,omitempty"`
	// From is the doc type you are converting from, e.g. markdown
	From string `json:"from,omitempty"`
	// To is the doc type you are converting to, e.g. html5
	To string `json:"to,omitempty"`
	//
	// For the following fields see https://pandoc.org/pandoc-server.html#root-endpoint
	//
	ShiftHeadingLevel     int                    `json:"shift-heading-level-by,omitempty"`
	IdentedCodeClasses    []string               `json:"indented-code-classes,omitempty"`
	DefaultImageExtension string                 `json:"default-image-extension,omitempty"`
	Metadata              string                 `json:"metadata,omitempty"`
	TabStop               int                    `json:"tab-stop,omitempty"`
	TrackChanges          string                 `json:"track-changes,omitempty"`
	Abbreviations         []string               `json:"abbreviations,omitempty"`
	Standalone            bool                   `json:"standalone,omitempty"`
	Text                  string                 `json:"text,omitempty"`
	Template              string                 `json:"template,omitempty"`
	Variables             map[string]interface{} `json:"variables,omitempty"`
	DPI                   int                    `json:"dpi,omitempty"`
	Wrap                  string                 `json:"wrap,omitempty"`
	Columns               int                    `json:"columns,omitempty"`
	TableOfContents       bool                   `json:"table-of-contents,omitempty"`
	TOCDepth              int                    `json:"toc-depth,omitempty"`
	StripComments         bool                   `json:"strip-comments,omitempty"`
	HighlightStyle        string                 `json:"highlight-style,omitempty"`
	EmbedResources        string                 `json:"embed-resources,omitempty"`
	HTMLQTags             bool                   `json:"html-q-tags,omitempty"`
	Ascii                 bool                   `json:"ascii,omitempty"`
	ReferenceLinks        bool                   `json:"reference-links,omitempty"`
	ReferenceLocation     string                 `json:"reference-location,omitempty"`
	SetExtHeaders         string                 `json:"setext-headers,omitempty"`
	TopLevelDivision      string                 `json:"top-level-division,omitempty"`
	NumberSections        string                 `json:"number-sections,omitempty"`
	NumberOffset          []int                  `json:"number-offset,omitempty"`
	HTMLMathMethod        string                 `json:"html-math-method,omitempty"`
	Listings              bool                   `json:"listings,omitempty"`
	Incremental           bool                   `json:"incremental,omitempty"`
	SideLevel             int                    `json:"slide-level,omitempty"`
	SectionDivs           bool                   `json:"section-divs,omitempty"`
	EmailObfuscation      string                 `json:"email-obfuscation,omitempty"`
	IdentifierPrefix      string                 `json:"identifier-prefix,omitempty"`
	TitlePrefix           string                 `json:"title-prefix,omitempty"`
	ReferenceDoc          string                 `json:"reference-doc,omitempty"`
	EPubCoverImage        string                 `json:"epub-cover-image,omitempty"`
	EPubMetadata          string                 `json:"epub-metadata,omitempty"`
	EPubChapterLevel      int                    `json:"epub-chapter-level,omitempty"`
	EPubSubdirectory      string                 `json:"epub-subdirectory,omitempty"`
	EPubFonts             string                 `json:"epub-fonts,omitempty"`
	IpynbOutput           string                 `json:"ipynb-output,omitempty"`
	Citeproc              bool                   `json:"citeproc,omitempty"`
	Bibliography          []string               `json:"bibliography,omitempty"`
	Csl                   string                 `json:"csl,omitempty"`
	CiteMethod            string                 `json:"cite-method,omitempty"`
	Files                 []string               `json:"files,omitempty"`

	// Verbose, if set true, enables logging on success as well as on error
	Verbose bool
}

func Load

func Load(fName string) (*Config, error)

Load reads a JSON file containing configuration attributes and returns a Config struct and an error.
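To show the kind of JSON such a file might hold, the sketch below decodes a small configuration using the same JSON keys as the Config struct tags. It is an assumption-laden illustration: `miniConfig` is a hypothetical, trimmed-down stand-in for Config, `parseConfig` is not this package's Load, and applying the port default at parse time is a guess about where that default is set.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniConfig is a hypothetical, trimmed-down stand-in for Config.
type miniConfig struct {
	Port       string `json:"port,omitempty"`
	From       string `json:"from,omitempty"`
	To         string `json:"to,omitempty"`
	Standalone bool   `json:"standalone,omitempty"`
}

// parseConfig decodes JSON configuration text into a miniConfig.
// Assumption: an empty port falls back to the documented default, 3030.
func parseConfig(src []byte) (*miniConfig, error) {
	cfg := new(miniConfig)
	if err := json.Unmarshal(src, cfg); err != nil {
		return nil, err
	}
	if cfg.Port == "" {
		cfg.Port = "3030"
	}
	return cfg, nil
}

func main() {
	src := []byte(`{"from": "markdown", "to": "html5", "standalone": true}`)
	cfg, err := parseConfig(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```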

func (*Config) Convert

func (cfg *Config) Convert(input io.Reader) ([]byte, error)

Convert takes the configuration settings, sends a request to the Pandoc server with the contents read from the io.Reader, and returns a slice of bytes and an error.

```
	// Set up our client configuration
	cfg := pandoc_client.Config{
		Standalone: true,
		From:       "markdown",
		To:         "html5",
	}
	src, err := os.ReadFile("htdocs/index.md")
	// ... handle error
	txt, err := cfg.Convert(bytes.NewReader(src))
	// ... handle error
	// Write the converted output (txt), not the Markdown source
	if err := os.WriteFile("htdocs/index.html", txt, 0664); err != nil {
		// ... handle error
	}
```

func (*Config) RootEndpoint

func (cfg *Config) RootEndpoint() ([]byte, error)

RootEndpoint sends a request to the Pandoc server root endpoint based on the state of the configuration struct and returns a slice of bytes and an error.

func (*Config) Walk

func (cfg *Config) Walk(startPath string, fromExt string, toExt string) error

Walk takes a path and walks the directories converting the files that map to the From values in the configuration.
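The general shape of such a directory walk can be sketched with the standard library. This is not the package's Walk: `targetName` and `walkConvert` are hypothetical helpers, the converter is passed in as a function so the example runs without pandoc-server, and writing each output next to its source file is an assumption.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// targetName swaps fromExt for toExt; ok is false when path lacks fromExt.
func targetName(path, fromExt, toExt string) (string, bool) {
	if filepath.Ext(path) != fromExt {
		return "", false
	}
	return strings.TrimSuffix(path, fromExt) + toExt, true
}

// walkConvert converts every file under root that ends in fromExt and
// writes the result beside the source with toExt as the new extension.
func walkConvert(root, fromExt, toExt string, convert func([]byte) ([]byte, error)) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		target, ok := targetName(path, fromExt, toExt)
		if !ok {
			return nil
		}
		src, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		out, err := convert(src)
		if err != nil {
			return fmt.Errorf("%s: %w", path, err)
		}
		return os.WriteFile(target, out, 0664)
	})
}

func main() {
	dir, err := os.MkdirTemp("", "walk-sketch")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "index.md"), []byte("# Hi\n"), 0664); err != nil {
		fmt.Println(err)
		return
	}
	// wrap is a placeholder converter; a real one would call pandoc-server.
	wrap := func(src []byte) ([]byte, error) {
		return append([]byte("<!-- converted -->\n"), src...), nil
	}
	if err := walkConvert(dir, ".md", ".html", wrap); err != nil {
		fmt.Println(err)
		return
	}
	out, err := os.ReadFile(filepath.Join(dir, "index.html"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s", out)
}
```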

Directories

Path Synopsis
cmd
