skimmer

package module
v0.0.17
Published: Oct 11, 2024 License: AGPL-3.0 Imports: 20 Imported by: 0

README

skimmer

skimmer is a lightweight feed reader inspired by newsboat and yarnc from yarn.social. skimmer is very minimal and deliberately lacks features. That is to say skimmer's best feature is what it doesn't do. skimmer tries to do two things well.

  1. Read a list of URLs, fetch the feeds and write the items to an SQLite 3 database
  2. Display the items in the SQLite 3 database in reverse chronological order

That's it. That is skimmer's secret power: it does only two things. There is no elaborate user interface beyond the standard input, standard output and standard error found on POSIX type operating systems. Even if you invoke it in "interactive" mode your choices are limited: press enter to go to the next item, press "n" to mark the item read, press "s" to save the item, press "q" to quit interactive mode.

By storing the item information in an SQLite 3 database (like newsboat's cache.db file) I can re-purpose the feed content as needed. An example would be generating a personal news aggregation page. Another might be converting the entries to BibTeX and managing them as references. Lots of options are possible.

skimmer's url list

As mentioned skimmer was very much inspired by newsboat. In fact it uses newsboat's urls list format. That's because skimmer isn't trying to replace newsboat as a reader of all feeds but instead gives me more options for how I read the feeds I've collected.

The newsboat urls file boils down to a list of urls, one per line, with an optional "label" added after the url using the notation of space, double quote, tilde, label content, double quote, end of line. That's really easy to parse. You can add comments using the hash mark; the hash mark and anything to its right is ignored when the urls are read into skimmer.

UPDATE: 2023-10-31, In using the experimental skimmer app in practice I have found some feed sources still whitelist access based on user agent strings (not something that is considered a "best practice" today). Unfortunately it is highly inconsistent which strings are accepted. As a result maintaining a list of feeds is really challenging unless you can specify a user agent string per feed source for those that need it. So I've added an additional column of content to the newsboat url file format: a user agent can be included after a feed's label by adding a space and the user agent string value. An example is shown below.
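Here is a hypothetical urls file showing all three columns (the URLs, labels and user agent value are illustrative only; the exact quoting rules for user agents with spaces aren't spelled out here):

# Feeds I check daily
https://example.com/feed.xml "~Example Blog"
https://example.org/rss "~Example News" Mozilla/5.0

The first entry has just a URL and label; the second adds a user agent string for a source that filters on it.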

skimmer's SQLite 3 database

skimmer uses an SQLite 3 database with two tables for managing feeds and their content. It doesn't use newsboat's cache.db. The name of the skimmer database ends in ".skim" and pairs with the name of the urls file. For example, if I have a urls list named "my_news.txt", skimmer will use a database file (and create it if it doesn't exist) called "my_news.skim". Each time skimmer reads the urls file it replaces the content in the skimmer database file, except for any notations about a given item having been read or saved.

skimmer feed types

Presently skimmer is focused on reading RSS 2, Atom and JSON feeds, as that is what is provided by the Go package skimmer uses (i.e. gofeed). Someday, maybe, I hope to include support for Gopher or Gemini feeds.

SYNOPSIS

skimmer [OPTIONS] URL_LIST_FILENAME
skimmer [OPTIONS] SKIMMER_DB_FILENAME [TIME_RANGE]

There are two ways to invoke skimmer. You can fetch the contents of the feeds listed in a newsboat style urls file, or you can read the items from the related skimmer database.

OPTIONS

-help : display a help page

-license : display license

-version : display version number and build hash

-limit N : Limit the display to the N most recent items

-prune : Delete items from the items table of the skimmer file provided. If a time range is provided then the items in the time range will be deleted. If a single time is provided, everything older than that time is deleted. A time can be specified in several ways. An alias of "today" removes all items older than today. If "now" is specified then all items older than the current time are removed. Otherwise a time can be specified as a date in YYYY-MM-DD format or a timestamp in YYYY-MM-DD HH:MM:SS format.

-i, -interactive : display an item and prompt for the next action, e.g. (n)ext, (s)ave, (t)ag, (q)uit. If you press enter the next item will be displayed without changing the item's state (e.g. marking it read). If you press "n" the item will be marked as read before displaying the next item. If you press "s" the item will be tagged as saved and the next item will be displayed. If you press "t" you can tag the item. Tagged items are treated as saved but the next item is not fetched. Pressing "q" will quit interactive mode without changing the last item's state.

Examples

Fetch and read my newsboat feeds from .newsboat/urls. This will create a .newsboat/urls.skim if it doesn't exist. Remember, invoking skimmer with a urls file retrieves the feeds and their contents, while invoking it with the skimmer database file lets you read them.

skimmer .newsboat/urls
skimmer .newsboat/urls.skim

This will fetch and read the feeds from my-news.urls. This will create a my-news.skim file. When the skimmer database is read a simplistic interactive mode is presented.

skimmer my-news.urls
skimmer -i my-news.skim

The same method is used to update your my-news.skim file and read it.

Export the current state of the skimmer database channels to a urls file. Feeds that failed to be retrieved will not be in the database's channels table. This is an easy way to get rid of the cruft and dead feeds.

skimmer -urls my-news.skim >my-news.urls

Prune the items in the database older than today.

skimmer -prune my-news.skim today

Prune the items older than September 30, 2023.

skimmer -prune my-news.skim \
    "2023-09-30 23:59:59"

Installation instructions

Installation From Source

Requirements

skimmer is an experiment. The compiled binaries are not necessarily tested. To compile from source you need to have git, make, Pandoc, SQLite3 and Go.

  • Git >= 2
  • Make >= 3.8 (GNU Make)
  • Pandoc > 3
  • SQLite3 > 3.4
  • Go >= 1.21.4

Steps to compile and install

These are the steps I use to set up skimmer on a new machine.

git clone https://github.com/rsdoiel/skimmer
cd skimmer
make
make install

Acknowledgments

This experiment would not be possible without the authors of newsboat, SQLite3, Pandoc and the gofeed package for Go.

Documentation

Index

Constants

View Source
const (
	EnvHttpBrowser   = "SKIM_HTTP_BROWSER"
	EnvGopherBrowser = "SKIM_GOPHER_BROWSER"
	EnvGemeniBrowser = "SKIM_GEMINI_BROWSER"
	EnvFtpBrowser    = "SKIM_FTP_BROWSER"
)
View Source
const (
	// Version number of release
	Version = "0.0.17"

	// ReleaseDate, the date version.go was generated
	ReleaseDate = "2024-10-11"

	// ReleaseHash, the Git hash when version.go was generated
	ReleaseHash = "d91cbc7"

	LicenseText = `` /* 33747-byte string literal not displayed */

)

Variables

View Source
var (
	// SQLCreateTables provides the statements that are used to create our tables.
	// It has two %s verbs: the first is the feed list name, the second is the
	// datetime the schema was generated.
	SQLCreateTables = `` /* 651-byte string literal not displayed */

	// SQLResetChannels clears the channels table
	SQLResetChannels = `DELETE FROM channels;`

	// Update the channels in the skimmer file
	SQLUpdateChannel = `` /* 221-byte string literal not displayed */

	// Update a feed item in the items table
	SQLUpdateItem = `` /* 302-byte string literal not displayed */

	// Return link and title for Urls formatted output
	SQLChannelsAsUrls = `SELECT link, title FROM channels ORDER BY link;`

	// SQLItemCount returns the number of items in the items table
	SQLItemCount = `SELECT COUNT(*) FROM items;`

	// SQLItemStats returns a list of rows with totals per status
	SQLItemStats = `SELECT IIF(status = '', 'unread', status) AS status, COUNT(*) FROM items GROUP BY status ORDER BY status`

	// SQLDisplayItems returns a list of items in descending chronological order.
	SQLDisplayItems = `` /* 183-byte string literal not displayed */

	SQLMarkItem = `UPDATE items SET status = ? WHERE link = ?;`

	SQLTagItem = `UPDATE items SET tags = ? WHERE link = ?;`

	// SQLPruneItems will prune our items table for all items that have either
	// an updated or publication date earlier than the timestamp provided.
	SQLPruneItems = `` /* 167-byte string literal not displayed */

)

Functions

func CheckWaitInterval

func CheckWaitInterval(iTime time.Time, wait time.Duration) (time.Time, bool)

CheckWaitInterval checks to see if an interval of time has been met or exceeded. It returns the remaining time interval (possibly reset) and a boolean. The boolean is true when the time interval has been met or exceeded, false otherwise.

```
tot := len(something) // calculate the total number of items to process
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i, key := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress {
        log.Printf("%s", ProgressETA(t0, i, tot))
    }
}
```

func ClearScreen added in v0.0.3

func ClearScreen()

func FmtHelp

func FmtHelp(src string, appName string, version string, releaseDate string, releaseHash string) string

FmtHelp lets you process a text block with simple curly brace markup.

func JSONMarshal added in v0.0.3

func JSONMarshal(data interface{}) ([]byte, error)

JSONMarshal provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().

func JSONMarshalIndent added in v0.0.3

func JSONMarshalIndent(data interface{}, prefix string, indent string) ([]byte, error)

JSONMarshalIndent provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().

func JSONUnmarshal added in v0.0.3

func JSONUnmarshal(src []byte, data interface{}) error

JSONUnmarshal is a custom JSON decoder so we can handle numbers more easily.

func OpenInBrowser added in v0.0.5

func OpenInBrowser(in io.Reader, out io.Writer, eout io.Writer, link string) error

func ParseURLList

func ParseURLList(fName string, src []byte) (map[string]*FeedSource, error)

ParseURLList takes a filename and byte slice source, parses the contents returning a map of urls to labels and an error value.
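A minimal sketch of calling ParseURLList; the file name is hypothetical and the import path is taken from the repository URL above, but the overall wiring here is an illustration, not skimmer's own code.

```
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/rsdoiel/skimmer" // import path assumed from the repository URL
)

func main() {
	fName := "my-news.urls" // hypothetical urls file
	src, err := os.ReadFile(fName)
	if err != nil {
		log.Fatal(err)
	}
	feeds, err := skimmer.ParseURLList(fName, src)
	if err != nil {
		log.Fatal(err)
	}
	// Print each feed source parsed from the urls file.
	for url, fs := range feeds {
		fmt.Printf("%s -> %q (user agent: %q)\n", url, fs.Label, fs.UserAgent)
	}
}
```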

func ProgressETA

func ProgressETA(t0 time.Time, i int, tot int) string

ProgressETA returns a string with the percentage processed and estimated time remaining. It requires a counter of records processed, the total count of records and a time zero value.

```
tot := len(something) // calculate the total number of items to process
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i, key := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress {
        log.Printf("%s", ProgressETA(t0, i, tot))
    }
}
```

func ProgressIPS

func ProgressIPS(t0 time.Time, i int, timeUnit time.Duration) string

ProgressIPS returns a string with the elapsed time and increments per second. It takes a time zero, a counter and a time unit, and returns a string with the count, running time and increments per time unit.

```
t0 := time.Now()
iTime := time.Now()
reportProgress := false

for i, key := range records {
    // ... process stuff ...
    if iTime, reportProgress = CheckWaitInterval(iTime, 30*time.Second); reportProgress || i == 0 {
        log.Printf("%s", ProgressIPS(t0, i, time.Second))
    }
}
```

func SaveChannel added in v0.0.8

func SaveChannel(db *sql.DB, link string, feedLabel string, channel *gofeed.Feed) error

SaveChannel will write the Channel information to a skimmer channel table.

func SaveItem added in v0.0.8

func SaveItem(db *sql.DB, feedLabel string, item *gofeed.Item) error

SaveItem saves a gofeed item to the item table in the skimmer database
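A hedged sketch of fetching one feed with gofeed and storing it via SaveChannel and SaveItem. The gofeed import path and the SQLite driver are assumptions; these docs don't show skimmer's actual imports, so treat them as placeholders.

```
package main

import (
	"database/sql"
	"log"

	_ "github.com/glebarez/go-sqlite" // assumed pure-Go SQLite driver; skimmer's actual choice isn't shown here
	"github.com/mmcdole/gofeed"       // assumed gofeed import path

	"github.com/rsdoiel/skimmer"
)

func main() {
	// Driver name matches the assumed driver above.
	db, err := sql.Open("sqlite", "my-news.skim")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	link := "https://example.com/feed.xml" // hypothetical feed URL
	feed, err := gofeed.NewParser().ParseURL(link)
	if err != nil {
		log.Fatal(err)
	}
	// Record the channel, then each of its items.
	if err := skimmer.SaveChannel(db, link, "Example Blog", feed); err != nil {
		log.Fatal(err)
	}
	for _, item := range feed.Items {
		if err := skimmer.SaveItem(db, "Example Blog", item); err != nil {
			log.Fatal(err)
		}
	}
}
```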

func SetupScreen added in v0.0.3

func SetupScreen(out io.Writer)

Types

type FeedSource added in v0.0.8

type FeedSource struct {
	Url       string `json:"url,omitempty"`
	Label     string `json:"label,omitempty"`
	UserAgent string `json:"user_agent,omitempty"`
}

FeedSource describes the source of a feed. It includes the URL, an optional label and an optional user agent string.
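For illustration, a FeedSource literal corresponding to one line of the urls file (all values hypothetical):

```
fs := &skimmer.FeedSource{
	Url:       "https://example.com/feed.xml", // hypothetical URL
	Label:     "Example Blog",
	UserAgent: "Mozilla/5.0",
}
```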

type Html2Skim added in v0.0.8

type Html2Skim struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty"`

	// URL holds the URL to visit to collect items from
	URL string `json:"url,omitempty"`

	// Selector holds the CSS selector used to retrieve links.
	// An empty selector results in looking for all href values in the page document.
	Selector string `json:"selector,omitempty"`

	// Title holds the channel title for the pseudo feed created by scraping
	Title string `json:"title,omitempty"`

	// Description holds the channel description for the pseudo feed created by scraping
	Description string `json:"description,omitempty"`

	// Link set the feed link for channel, this is useful if you render a pseudo feed to RSS
	Link string `json:"link,omitempty"`

	// Generator lets you set the generator value for the channel
	Generator string `json:"generator,omitempty"`

	// LastBuildDate sets the date for the channel being built
	LastBuildDate string `json:"last_build_date,omitempty"`
	// contains filtered or unexported fields
}

Html2Skim uses the Colly Go package to scrape a website and turn it into an RSS feed.

The Html2Skim struct holds the configuration for scraping a webpage and updating a skimmer database, populating both the channel table and items table based on how the struct is set.

func NewHtml2Skim added in v0.0.8

func NewHtml2Skim(appName string) (*Html2Skim, error)

NewHtml2Skim initializes a new Html2Skim struct

func (*Html2Skim) Run added in v0.0.8

func (app *Html2Skim) Run(out io.Writer, eout io.Writer, args []string, title string, description string, link string) error

func (*Html2Skim) Scrape added in v0.0.8

func (app *Html2Skim) Scrape(db *sql.DB, uri string, selector string) (*gofeed.Feed, error)

Scrape takes a Skimmer database, a URI (url) and CSS selector pointing at anchor elements you want to create a feed with. It then collects those links and renders a feed struct and error value.
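A hedged usage sketch, assuming db is an open *sql.DB handle to a skimmer database; the app name, URL and CSS selector are made up for illustration.

```
app, err := NewHtml2Skim("html2skim") // app name assumed
if err != nil {
	log.Fatal(err)
}
// Collect anchors under a hypothetical selector into a pseudo feed.
feed, err := app.Scrape(db, "https://example.com/news", "main a")
if err != nil {
	log.Fatal(err)
}
log.Printf("scraped %d links into a pseudo feed", len(feed.Items))
```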

type Skim2Md added in v0.0.5

type Skim2Md struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty"`

	// Title if this is set the title will be included
	// when generating the markdown of saved items
	Title string `json:"title,omitempty"`

	// FrontMatter, if true insert Frontmatter block in Markdown output
	FrontMatter bool `json:"frontmatter,omitempty"`

	// PocketButton, if true insert a "save to pocket" button for each RSS item output
	PocketButton bool
	// contains filtered or unexported fields
}

Skim2Md supports the skim2md cli.

func NewSkim2Md added in v0.0.5

func NewSkim2Md(appName string) (*Skim2Md, error)

NewSkim2Md initializes a new Skim2Md struct

func (*Skim2Md) DisplayItem added in v0.0.5

func (app *Skim2Md) DisplayItem(link string, title string, description string, enclosures string, updated string, published string, label string, tags string) error

func (*Skim2Md) Run added in v0.0.5

func (app *Skim2Md) Run(out io.Writer, eout io.Writer, args []string, frontMatter bool, pocketButton bool) error

func (*Skim2Md) Write added in v0.0.5

func (app *Skim2Md) Write(db *sql.DB) error

Write displays the contents from the database

type Skimmer

type Skimmer struct {
	// AppName holds the name of the application
	AppName string `json:"app_name,omitempty"`

	// UserAgent holds the user agent string used by skimmer.
	// Right now I plan to default it to
	//       app.AppName + "/" + app.Version + " (" + ReleaseDate + "." + ReleaseHash + ")"
	UserAgent string `json:"user_agent,omitempty"`

	// DbName holds the path to the SQLite3 database
	DBName string `json:"db_name,omitempty"`

	// Urls are the map of urls to labels to be fetched or read
	Urls map[string]*FeedSource `json:"urls,omitempty"`

	// Limit constrains the number of items shown
	Limit int `json:"limit,omitempty"`

	// Prune, when true, indicates items should be pruned from the database.
	Prune bool `json:"prune,omitempty"`

	// Interactive if true causes Run to display one item at a time with a minimal of input
	Interactive bool `json:"interactive,omitempty"`

	// AsURLs, output the skimmer feeds as a newsboat style url file
	AsURLs bool `json:"urls,omitempty"`
	// contains filtered or unexported fields
}

Skimmer is the application structure that holds configuration and ties the app to the runner for the cli.

func NewSkimmer

func NewSkimmer(appName string) (*Skimmer, error)

func (*Skimmer) ChannelsToUrls added in v0.0.3

func (app *Skimmer) ChannelsToUrls(db *sql.DB) ([]byte, error)

ChannelsToUrls converts the current channels table to urls formatted output and refreshes the app.Urls data structure.
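A short sketch mirroring the `skimmer -urls` example above, assuming app is a *Skimmer and db is an open *sql.DB handle to a .skim file:

```
src, err := app.ChannelsToUrls(db)
if err != nil {
	log.Fatal(err)
}
// Write the exported urls list; the output path is hypothetical.
if err := os.WriteFile("my-news.urls", src, 0664); err != nil {
	log.Fatal(err)
}
```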

func (*Skimmer) DisplayItem added in v0.0.5

func (app *Skimmer) DisplayItem(link string, title string, description string, enclosures string, updated string, published string, label string, tags string) error

func (*Skimmer) Download

func (app *Skimmer) Download(db *sql.DB) error

Download the contents from app.Urls

func (*Skimmer) ItemCount

func (app *Skimmer) ItemCount(db *sql.DB) (int, error)

ItemCount returns the total number of items in the database.

func (*Skimmer) MarkItem added in v0.0.3

func (app *Skimmer) MarkItem(db *sql.DB, link string, val string) error

func (*Skimmer) PruneItems

func (app *Skimmer) PruneItems(db *sql.DB, pruneDT time.Time) error

PruneItems takes a timestamp and performs a row delete on the table for items that are older than the timestamp.
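A sketch of pruning with an explicit cutoff, assuming app is a *Skimmer and db is an open *sql.DB handle; the layout string matches the YYYY-MM-DD HH:MM:SS form accepted by the -prune option.

```
cutoff, err := time.Parse("2006-01-02 15:04:05", "2023-09-30 23:59:59")
if err != nil {
	log.Fatal(err)
}
// Remove every item older than the cutoff timestamp.
if err := app.PruneItems(db, cutoff); err != nil {
	log.Fatal(err)
}
```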

func (*Skimmer) ReadUrls

func (app *Skimmer) ReadUrls(fName string) error

ReadUrls reads the urls or OPML file provided and updates the feeds in the skimmer database file.

Newsboat's url file format is `<URL><SPACE>"~<LABEL>"`, one entry per line. A hash mark ("#") at the start of a line indicates a comment line.

OPML is documented at http://opml.org

func (*Skimmer) ResetChannels added in v0.0.3

func (app *Skimmer) ResetChannels(db *sql.DB) error

func (*Skimmer) Run

func (app *Skimmer) Run(in io.Reader, out io.Writer, eout io.Writer, args []string) error

Run provides the runner for skimmer. It allows for testing of much of the cli functionality.
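A minimal main function wiring Run to the standard streams, per the signature above; the app name argument is an assumption.

```
package main

import (
	"log"
	"os"

	"github.com/rsdoiel/skimmer"
)

func main() {
	app, err := skimmer.NewSkimmer("skimmer") // app name assumed
	if err != nil {
		log.Fatal(err)
	}
	// Drive the cli runner with the standard streams and arguments.
	if err := app.Run(os.Stdin, os.Stdout, os.Stderr, os.Args[1:]); err != nil {
		log.Fatal(err)
	}
}
```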

func (*Skimmer) RunInteractive added in v0.0.3

func (app *Skimmer) RunInteractive(db *sql.DB) error

RunInteractive provides a sliver of interactive UI, basically displaying an item then prompting for an action.

func (*Skimmer) Setup

func (app *Skimmer) Setup(fPath string) error

Setup checks to see if anything needs to be setup (or fixed) for skimmer to run.

func (*Skimmer) TagItem added in v0.0.3

func (app *Skimmer) TagItem(db *sql.DB, link string, tag string) error

func (*Skimmer) Write

func (app *Skimmer) Write(db *sql.DB) error

Write displays the contents from the database

Directories

Path Synopsis
cmd
