Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	// The name of the source, e.g., "New York Magazine"
	Name string

	// url of the site containing links
	URL url.URL

	// CSS selector for a link within a list of links.
	ItemSelector css.Selector

	// CSS selector for a caption within a link item.
	// Relative to ItemSelector
	CaptionSelector css.Selector

	// CSS selector for the actual link within a link item. Should be an
	// "a" element. Relative to ItemSelector.
	LinkSelector css.Selector

	// Maximum number of Items in a Set. If a scraper returns more than this
	// within a link site, Items will be chosen arbitrarily.
	MaxItems uint

	// The minimum number of words that a block-level HTML element must
	// contain for it to be included in a link item's caption. Used to
	// exclude short pieces of text like blog tags, bylines, or anything
	// else that can get in the way of a caption's substance.
	//
	// Must be greater than zero. The default is three.
	ShortElementFilter int
}
Config stores options for the link source container.
There is no support for grouped (i.e., comma-separated) selectors. This is because, while grouped selectors are useful for applying styles to generalized sets of elements, the HTML parser needs to locate elements individually.
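The ShortElementFilter field comment describes dropping block-level elements that contain too few words. The following is a minimal sketch of that idea, not the package's implementation. The helper name filterShortElements is hypothetical, and it assumes that "words" means whitespace-separated tokens.

```go
package main

import "strings"

// filterShortElements is a hypothetical helper, not part of this package.
// It keeps only the text blocks that contain at least minWords
// whitespace-separated words, mirroring the rule described for
// Config.ShortElementFilter.
func filterShortElements(blocks []string, minWords int) []string {
	kept := make([]string, 0, len(blocks))
	for _, b := range blocks {
		// strings.Fields splits on runs of whitespace and ignores
		// leading and trailing whitespace.
		if len(strings.Fields(b)) >= minWords {
			kept = append(kept, b)
		}
	}
	return kept
}
```

With a minimum of three, a two-word byline and a one-word tag would be dropped, while a full sentence would be kept as part of the caption.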
func (*Config) CheckAndSetDefaults ¶
CheckAndSetDefaults validates c. It returns either a copy of c with default settings applied or an error describing the invalid configuration.
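The validate-then-copy pattern described here can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: sketchConfig is a hypothetical stand-in with only two fields, the return signature is assumed, and treating a zero ShortElementFilter as "unset" (so that the documented default of three applies) is also an assumption.

```go
package main

import "errors"

// sketchConfig is a hypothetical stand-in for Config with two of its fields.
type sketchConfig struct {
	Name               string
	ShortElementFilter int
}

// defaultShortElementFilter matches the default stated in the Config
// field comment.
const defaultShortElementFilter = 3

// checkAndSetDefaults validates c and returns a copy with defaults applied.
// The receiver is never modified.
//
// Assumption: zero means "not set" and receives the default, while a
// negative value is rejected as invalid.
func (c *sketchConfig) checkAndSetDefaults() (sketchConfig, error) {
	if c.ShortElementFilter < 0 {
		return sketchConfig{}, errors.New("ShortElementFilter must be greater than zero")
	}
	out := *c // copy, so the caller's value stays untouched
	if out.ShortElementFilter == 0 {
		out.ShortElementFilter = defaultShortElementFilter
	}
	return out, nil
}
```

Returning a copy keeps the caller's original value available for comparison or logging, at the cost of the caller having to remember to use the returned value.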
func (*Config) UnmarshalYAML ¶
UnmarshalYAML implements the yaml.Unmarshaler interface. Validation is performed here.
type LinkItem ¶
type LinkItem struct {
	// using a string here because we'll let the downstream context deal
	// with parsing URLs etc. This comes from a website so we can't really
	// trust it.
	LinkURL string
	Caption string
}
LinkItem represents data for a single link item found within a list of links.
func (LinkItem) Key ¶
Key returns the key used to determine whether a LinkItem has already been stored in the database.
func (LinkItem) NewKVEntry ¶
NewKVEntry prepares the LinkItem to be saved in the KV database. Keys are SHA256 hashes of the entire LinkItem. Values are timestamps in seconds since the Unix epoch. Usually we'll just be checking whether newly fetched LinkItems are already saved. Eventually we might want to use the timestamp.
type Set ¶
type Set struct {
	// The publication that the links came from
	Name string
	// contains filtered or unexported fields
}
Set represents a set of link items. It's not meant to be modified by concurrent goroutines.
func NewSet ¶
NewSet initializes a new collection of listed link items from an HTML document Reader, a link source configuration, and an HTTP status code. If the status code is not set, it is treated as 200 OK.
func (*Set) AddMessage ¶
AddMessage adds a message to the Set for displaying later in an email. These messages are used only for ad hoc notes that don't belong in a LinkItem, such as error messages. Messages should be complete sentences.
func (*Set) CountLinkItems ¶
CountLinkItems returns the number of LinkItems managed by the Set.
func (*Set) RemoveLinkItem ¶
RemoveLinkItem removes the LinkItem from the Set. It is not safe for concurrent use.