url2epub

v0.4.0 (latest), published Aug 14, 2022. License: BSD-3-Clause. Imports: 24. Imported by: 3.

Repository

github.com/fishy/url2epub

README ¶

url2epub

Create ePub files from URLs

Overview

The root directory provides a Go library that creates ePub files out of URLs, with limitations.

The rmapi/ directory provides a Go library that implements the reMarkable API, so that the generated ePub files can be sent directly to a reMarkable paper tablet.

The tgbot/ directory provides a Go library that implements part of the Telegram bot API, so that all of this can be done from a Telegram message.

The cloudrun/ directory provides the Google Cloud Run implementation of the Telegram bot that does all of this, and also serves REST APIs.

License

BSD 3-Clause.

Documentation ¶

Overview ¶

Package url2epub fetches http(s) URLs and creates ePub files from them.

Index ¶

Constants
func DrainAndClose(r io.ReadCloser) error
func Epub(args EpubArgs) (id string, err error)
type EpubArgs
type GetHTMLArgs
type Node
- func FromNode(n *html.Node) *Node
- func GetHTML(ctx context.Context, args GetHTMLArgs) (*Node, *url.URL, error)
type ReadableArgs

Constants ¶

const EpubMimeType = `application/epub+zip`

EpubMimeType is the mime type for epub.

Variables ¶

This section is empty.

Functions ¶

func DrainAndClose ¶

func DrainAndClose(r io.ReadCloser) error

DrainAndClose drains and closes r.

func Epub ¶

func Epub(args EpubArgs) (id string, err error)

Epub creates an Epub 3.0 file from given content.

Types ¶

type EpubArgs ¶

type EpubArgs struct {
	// The destination to write the epub content to.
	Dest io.Writer

	// The title of the epub.
	Title string

	// The node pointing to the html tag.
	Node *html.Node

	// Images map:
	// key: image local filename
	// value: image content
	Images map[string]io.Reader
}

EpubArgs defines the args used by Epub function.

type GetHTMLArgs ¶

type GetHTMLArgs struct {
	// The HTTP GET URL, required.
	URL string

	// The User-Agent header to use, optional.
	UserAgent string

	// The bearer token for the twitter client.
	// If non-empty and the URL is a twitter URL,
	// it uses Twitter API to get the thread instead of the raw HTML.
	TwitterBearer string
}

GetHTMLArgs defines the arguments used by GetHTML function.

type Node ¶

type Node html.Node

Node is a defined type based on html.Node, with helper functions attached.

func FromNode ¶

func FromNode(n *html.Node) *Node

FromNode casts *html.Node into *Node.
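
The cast works because Go allows conversion between pointer types whose base types share the same underlying type. The sketch below shows the pattern with hypothetical local types (element stands in for html.Node, wrapped for Node); none of these names come from the package.

```go
package main

import "fmt"

// element is a hypothetical stand-in for html.Node.
type element struct {
	Tag string
}

// wrapped has the same underlying type as element, so *element and
// *wrapped convert to each other without copying.
type wrapped element

// Name is a helper method that exists only on the wrapped type.
func (w *wrapped) Name() string {
	if w == nil {
		return ""
	}
	return "<" + w.Tag + ">"
}

// wrap and unwrap mirror the FromNode / AsNode pair described in the text.
func wrap(e *element) *wrapped   { return (*wrapped)(e) }
func unwrap(w *wrapped) *element { return (*element)(w) }

func main() {
	e := &element{Tag: "html"}
	w := wrap(e)
	fmt.Println(w.Name())       // helper method via the wrapped type
	fmt.Println(unwrap(w) == e) // same pointer, no copy was made
}
```

Because both directions are plain pointer conversions, wrapping and unwrapping refer to the same node in memory.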

func GetHTML ¶

func GetHTML(ctx context.Context, args GetHTMLArgs) (*Node, *url.URL, error)

GetHTML does HTTP get requests on HTML content.

It's different from standard http.Get in the following ways:

- If redirects happen during the request, the returned URL is the URL of the last (final) request.

- Instead of returning *http.Response, it returns parsed *html.Node, with Type being ElementNode and DataAtom being Html (instead of the root node, which is usually a DocumentNode).

- The client used by GetHTML does not have a timeout set. It's expected that a deadline is set in the ctx passed in.

func (*Node) AsNode ¶

func (n *Node) AsNode() *html.Node

AsNode casts n back to *html.Node.

func (*Node) FindFirstAtomNode ¶

func (n *Node) FindFirstAtomNode(a atom.Atom) *Node

FindFirstAtomNode returns n itself or the first node in its descendants, with Type == html.ElementNode and DataAtom == a, using depth first search.

If none of n's descendants matches, nil will be returned.
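
The search order described above can be sketched as a pre-order depth-first traversal. The node type here is a hypothetical, minimal stand-in for html.Node that matches on a string tag rather than an atom.Atom; findFirst and sampleTree are likewise names invented for this example.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node, keeping only the fields
// that matter for traversal.
type node struct {
	Tag         string
	FirstChild  *node
	NextSibling *node
}

// findFirst returns n itself or the first descendant whose Tag equals tag,
// visiting nodes depth first in document order. It returns nil if nothing
// matches.
func findFirst(n *node, tag string) *node {
	if n == nil {
		return nil
	}
	if n.Tag == tag {
		return n
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if found := findFirst(c, tag); found != nil {
			return found
		}
	}
	return nil
}

// sampleTree builds html > (head > title, body > p).
func sampleTree() *node {
	title := &node{Tag: "title"}
	head := &node{Tag: "head", FirstChild: title}
	p := &node{Tag: "p"}
	body := &node{Tag: "body", FirstChild: p}
	head.NextSibling = body
	return &node{Tag: "html", FirstChild: head}
}

func main() {
	root := sampleTree()
	fmt.Println(findFirst(root, "p") != nil)
	fmt.Println(findFirst(root, "table") == nil)
}
```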

func (*Node) ForEachChild ¶

func (n *Node) ForEachChild(f func(child *Node) bool)

ForEachChild calls f on each of n's children.

If f returns false, ForEachChild stops the iteration.
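
The callback contract (return false to stop) can be sketched as follows. The node type and the forEachChild and tagsUntil functions are hypothetical stand-ins for this example; they mirror the described behavior on a simplified tree rather than on html.Node.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node.
type node struct {
	Tag         string
	FirstChild  *node
	NextSibling *node
}

// forEachChild calls f on each direct child of n, in order, and stops as
// soon as f returns false.
func forEachChild(n *node, f func(child *node) bool) {
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if !f(c) {
			return
		}
	}
}

// tagsUntil collects child tags in order and stops after it has seen stop.
func tagsUntil(n *node, stop string) []string {
	var tags []string
	forEachChild(n, func(child *node) bool {
		tags = append(tags, child.Tag)
		return child.Tag != stop
	})
	return tags
}

func main() {
	footer := &node{Tag: "footer"}
	content := &node{Tag: "main", NextSibling: footer}
	header := &node{Tag: "header", NextSibling: content}
	body := &node{Tag: "body", FirstChild: header}

	// Iteration stops at "main", so "footer" is never visited.
	fmt.Println(tagsUntil(body, "main"))
}
```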

func (*Node) GetAMPurl ¶

func (n *Node) GetAMPurl() string

GetAMPurl returns the amp URL of the document, if any.

func (*Node) GetLang ¶

func (n *Node) GetLang() string

GetLang returns the lang attribute of html node, if any.

func (*Node) GetTitle ¶

func (n *Node) GetTitle() (title string)

GetTitle returns the title of the document, if any.

Note that if og:title exists in the meta header, it's preferred over title.

func (*Node) IsAMP ¶

func (n *Node) IsAMP() bool

IsAMP returns true if n is the root of an AMP HTML document.

func (*Node) Readable ¶

func (n *Node) Readable(ctx context.Context, args ReadableArgs) (*html.Node, map[string]io.Reader, error)

Readable strips node n into a readable one, with all images downloaded and replaced.

type ReadableArgs ¶

type ReadableArgs struct {
	// Base URL of the document, used in case the image URLs are relative.
	BaseURL *url.URL

	// User-Agent to be used to download images.
	UserAgent string

	// Directory prefix for downloaded images.
	ImagesDir string

	// If Grayscale is set to true,
	// all images will be grayscaled and encoded as jpegs.
	//
	// If any error happened while trying to grayscale the image,
	// it will be logged via Logger.
	Grayscale bool
	Logger    logger.Logger
}

ReadableArgs defines the args used by Readable function.

Directories ¶

Path	Synopsis
appengine module
birds	Package birds generates HTML out of twitter threads.
cloudrun module
cmd
epubwriter module
debug module
grayscale	Package grayscale provides function to grayscale an image.
logger	Package logger provides a simple log interface that you can wrap whatever logging library you use into.
rmapi	Package rmapi implements reMarkable api, as described in https://github.com/splitbrain/ReMarkableAPI/wiki.
debug module
tgbot	Package tgbot provides some simple wrapping around telegram bot api.
ziputil	Package ziputil provides some utility functions for zip archive handling.
