Documentation ¶
Overview ¶
Package rss is a small library for simplifying the parsing of RSS and Atom feeds.
The package could do with more testing, but it conforms to the RSS 1.0, 2.0, and Atom 1.0 specifications, to the best of my ability. I've tested it with about 15 different feeds, and it seems to work fine with them.
If anyone has any problems with feeds being parsed incorrectly, please let me know so that I can debug and improve the package.
Example usage:
package main

import "github.com/SlyMarbo/rss"

func main() {
	feed, err := rss.Fetch("http://example.com/rss")
	if err != nil {
		// handle error.
	}

	// ... Some time later ...

	err = feed.Update()
	if err != nil {
		// handle error.
	}
}
The output structure is pretty much as you'd expect:
type Feed struct {
	Nickname    string              // This is not set by the package, but could be helpful.
	Title       string
	Description string
	Link        string              // Link to the creator's website.
	UpdateURL   string              // URL of the feed itself.
	Image       *Image              // Feed icon.
	Items       []*Item
	ItemMap     map[string]struct{} // Used in checking whether an item has been seen before.
	Refresh     time.Time           // Earliest time this feed should next be checked.
	Unread      uint32              // Number of unread items. Used by aggregators.
}

type Item struct {
	Title     string
	Summary   string
	Content   string
	Link      string
	Date      time.Time
	DateValid bool
	ID        string
	Read      bool
}

type Image struct {
	Title  string
	URL    string
	Height uint32
	Width  uint32
}
The library does its best to follow the appropriate specifications and not to set the Refresh time too soon. It currently follows all update time management methods in the RSS 1.0, 2.0, and Atom 1.0 specifications. If the feed does not provide one, it defaults to 12-hour intervals (see DefaultRefreshInterval). If you are having issues with feed providers dropping connections, please let me know and I can increase this default, or you can increase the Refresh time manually. The Feed.Update method uses this Refresh time, so if Update seems to be returning very quickly with no new items, it's likely not making a request because the Refresh time has not yet been reached.
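The Refresh check described above can be pictured with a small standalone sketch. It does not use the package itself: feedState, due, and dueAfter are hypothetical names introduced here, and the assumption is simply that a request is skipped while the current time is earlier than Refresh.

```go
package main

import (
	"fmt"
	"time"
)

// feedState is a hypothetical stand-in for the Refresh field of Feed.
type feedState struct {
	Refresh time.Time // earliest time the feed should next be checked
}

// due reports whether a request may be made at the given moment.
// Assumption: a feed is due once now is at or past Refresh.
func (f feedState) due(now time.Time) bool {
	return !f.Refresh.After(now)
}

// dueAfter builds a feed whose Refresh lies waitMinutes after a fixed start
// time and reports whether it is due elapsedMinutes after that start.
func dueAfter(waitMinutes, elapsedMinutes int) bool {
	start := time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
	f := feedState{Refresh: start.Add(time.Duration(waitMinutes) * time.Minute)}
	return f.due(start.Add(time.Duration(elapsedMinutes) * time.Minute))
}

func main() {
	fmt.Println(dueAfter(60, 30)) // too early: an update would be skipped
	fmt.Println(dueAfter(60, 90)) // past Refresh: an update would make a request
}
```

Under the same assumption, an aggregator that wants to poll less often could move Feed.Refresh further into the future before calling Update.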
Constants ¶
const DATE = "15:04:05 MST 02/01/2006"
DATE is a constant date string.
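Because the layout places the day before the month, a quick check of what it produces can help. The sketch below copies the layout string into a local constant (dateLayout and formatStamp are names made up for this example) and formats a fixed UTC timestamp with it.

```go
package main

import (
	"fmt"
	"time"
)

// dateLayout copies the value of the DATE constant shown above.
const dateLayout = "15:04:05 MST 02/01/2006"

// formatStamp renders a UTC timestamp using dateLayout.
func formatStamp(year, month, day, hour, minute, second int) string {
	ts := time.Date(year, time.Month(month), day, hour, minute, second, 0, time.UTC)
	return ts.Format(dateLayout)
}

func main() {
	// 4 March 2021, 09:05:07 UTC
	fmt.Println(formatStamp(2021, 3, 4, 9, 5, 7)) // prints "09:05:07 UTC 04/03/2021"
}
```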
Variables ¶
var DefaultFetchFunc = func(url string) (resp *http.Response, err error) {
	client := http.DefaultClient
	return client.Get(url)
}
DefaultFetchFunc uses http.DefaultClient to fetch a feed.
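Since DefaultFetchFunc is an ordinary package variable holding a func(url string) (*http.Response, error), a replacement with the same signature could presumably be assigned to it or passed to FetchByFunc. The following is a minimal sketch of such a replacement using only the standard library; fetchWithTimeout, observedUserAgent, and the User-Agent value are hypothetical names chosen for illustration, and the local test server stands in for a real feed host.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// userAgent is a hypothetical identifier for this example.
const userAgent = "example-aggregator/1.0"

// fetchWithTimeout has the same signature as DefaultFetchFunc, but uses a
// client with a timeout and sends a custom User-Agent header.
func fetchWithTimeout(url string) (*http.Response, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return client.Do(req)
}

// observedUserAgent runs fetchWithTimeout against a local test server and
// returns the User-Agent header that the server received.
func observedUserAgent() (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("User-Agent")
		fmt.Fprint(w, "<rss></rss>")
	}))
	defer srv.Close()

	resp, err := fetchWithTimeout(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return <-seen, nil
}

func main() {
	ua, err := observedUserAgent()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ua)
}
```

A timeout is worth considering because http.DefaultClient has none, so a stalled feed host can block a fetch indefinitely.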
var DefaultRefreshInterval = 12 * time.Hour
DefaultRefreshInterval is the minimum wait until the next refresh, used when the feed does not specify its own interval.
Setting this too high will delay the feed receiving new items; setting it too low will put excessive load on the feed hosts.
The default value is 12 hours.
var TimeLayouts = []string{
	"Mon, 2 Jan 2006 15:04:05 Z",
	"Mon, 2 Jan 2006 15:04:05",
	"Mon, 2 Jan 2006 15:04:05 -0700",
	"Mon, 2 Jan 06 15:04:05 -0700",
	"Mon, 2 Jan 06 15:04:05",
	"2 Jan 2006 15:04:05 -0700",
	"2 Jan 2006 15:04:05",
	"2 Jan 06 15:04:05 -0700",
	"2006-01-02 15:04:05 -0700",
	"2006-01-02 15:04:05",
	time.ANSIC,
	time.UnixDate,
	time.RubyDate,
	time.RFC822Z,
	time.RFC1123Z,
	time.RFC3339,
	time.RFC3339Nano,
	"2 Jan 2006 15:04:05 -0700 MST",
	"2 Jan 2006 15:04:05 MST -0700",
	"Mon, 2 Jan 2006 15:04:05 MST -0700",
	"Mon, 2 Jan 2006 15:04:05 -0700 MST",
	"2 Jan 06 15:04:05 -0700 MST",
	"2 Jan 06 15:04:05 MST -0700",
	"Jan 2, 2006 15:04 PM -0700 MST",
	"Jan 2, 2006 15:04 PM MST -0700",
	"Jan 2, 06 15:04 PM MST -0700",
	"Jan 2, 06 15:04 PM -0700 MST",
}
TimeLayouts contains a list of time.Parse() layouts that are used in attempts to convert item.Date and item.PubDate strings to time.Time values. The layouts are attempted in the order listed until either time.Parse() succeeds or all layouts have been tried.
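The "try each layout in turn" approach can be sketched in a few lines. The example uses a short subset of the layouts listed above; parseFirstMatch is a hypothetical helper written for this illustration, not a function exported by the package.

```go
package main

import (
	"fmt"
	"time"
)

// layouts is a short, illustrative subset of the TimeLayouts list above.
var layouts = []string{
	"Mon, 2 Jan 2006 15:04:05 -0700",
	"2006-01-02 15:04:05",
	time.RFC3339,
}

// parseFirstMatch tries each layout in order and returns the first success.
func parseFirstMatch(value string) (time.Time, error) {
	for _, layout := range layouts {
		if parsed, err := time.Parse(layout, value); err == nil {
			return parsed, nil
		}
	}
	return time.Time{}, fmt.Errorf("no layout matched %q", value)
}

func main() {
	for _, s := range []string{
		"Tue, 3 Mar 2020 10:30:00 +0100",
		"2020-03-03 10:30:00",
		"not a date",
	} {
		parsed, err := parseFirstMatch(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(parsed.UTC().Format(time.RFC3339))
	}
}
```

Order matters with this approach: when more than one layout could accept a value, the earlier one wins.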
var TimeLayoutsLoadLocation = []string{
	"Mon, 2 Jan 2006 15:04:05 MST",
	"Mon, 2 Jan 06 15:04:05 MST",
	"2 Jan 2006 15:04:05 MST",
	"2 Jan 06 15:04:05 MST",
	"Jan 2, 2006 15:04 PM MST",
	"Jan 2, 06 15:04 PM MST",
	time.RFC1123,
	time.RFC850,
	time.RFC822,
}
TimeLayoutsLoadLocation lists time layouts that identify the zone by abbreviation (MST) rather than by a fixed numeric offset (-0700). Go does not load the time zone for an abbreviation by default, so parseTime calls `time.LoadLocation(t.Location().String())` and then applies the offset returned by LoadLocation to the result.
Functions ¶
This section is empty.
Types ¶
type Enclosure ¶
type Enclosure struct {
	URL    string `json:"url"`
	Type   string `json:"type"`
	Length uint   `json:"length"`
}
Enclosure maps an enclosure.
type Feed ¶
type Feed struct {
	Nickname    string              `json:"nickname"` // This is not set by the package, but could be helpful.
	Title       string              `json:"title"`
	Language    string              `json:"language"`
	Author      string              `json:"author"`
	Description string              `json:"description"`
	Link        string              `json:"link"`      // Link to the creator's website.
	UpdateURL   string              `json:"updateurl"` // URL of the feed itself.
	Image       *Image              `json:"image"`     // Feed icon.
	Categories  []string            `json:"categories"`
	Items       []*Item             `json:"items"`
	ItemMap     map[string]struct{} `json:"itemmap"` // Used in checking whether an item has been seen before.
	Refresh     time.Time           `json:"refresh"` // Earliest time this feed should next be checked.
	Unread      uint32              `json:"unread"`  // Number of unread items. Used by aggregators.
	FetchFunc   FetchFunc           `json:"-"`
}
Feed is the top-level structure.
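The ItemMap field has the shape of a set, which suggests how repeated items can be filtered across fetches. The sketch below illustrates that idea in isolation; seenSet and addNew are hypothetical names, and keying the set by item ID is an assumption made for this example.

```go
package main

import "fmt"

// seenSet mirrors the shape of Feed.ItemMap: a set of string keys.
// Assumption: the keys are item IDs.
type seenSet map[string]struct{}

// addNew records ids that are not yet in the set and returns how many were new.
func (s seenSet) addNew(ids ...string) int {
	added := 0
	for _, id := range ids {
		if _, ok := s[id]; ok {
			continue // already seen in an earlier fetch
		}
		s[id] = struct{}{}
		added++
	}
	return added
}

func main() {
	seen := seenSet{}
	fmt.Println(seen.addNew("a", "b")) // first fetch: both items are new
	fmt.Println(seen.addNew("b", "c")) // second fetch: only "c" is new
}
```

Using struct{} as the value type keeps the map as small as possible, since only key membership matters.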
func FetchByClient ¶
FetchByClient uses an http.Client to fetch a URL.
func FetchByFunc ¶
FetchByFunc uses a func to fetch a URL.
func (*Feed) UpdateByFunc ¶
UpdateByFunc uses a func to update f.
type Image ¶
type Image struct {
	Title  string `json:"title"`
	Href   string `json:"href"`
	URL    string `json:"url"`
	Height uint32 `json:"height"`
	Width  uint32 `json:"width"`
}
Image maps an image.
type Item ¶
type Item struct {
	Title      string       `json:"title"`
	Summary    string       `json:"summary"`
	Content    string       `json:"content"`
	Categories []string     `json:"category"`
	Link       string       `json:"link"`
	Date       time.Time    `json:"date"`
	Image      *Image       `json:"image"`
	DateValid  bool
	ID         string       `json:"id"`
	Enclosures []*Enclosure `json:"enclosures"`
	Read       bool         `json:"read"`
}
Item represents a single story.
type RAWContent ¶
type RAWContent struct {
	RAWContent string `xml:",innerxml"`
}