Documentation ¶
Constants ¶
const DefaultRatePerSecond = 2
DefaultRatePerSecond defines the default request rate per second when creating a new Fetcher.
Types ¶
type DateFilterFunc ¶
type ExtractResult ¶
type Extractor ¶
type Extractor struct {
// contains filtered or unexported fields
}
Extractor is a utility for extracting Substack posts from URLs.
func NewExtractor ¶
NewExtractor creates a new Extractor with the provided Fetcher. If the Fetcher is nil, a default Fetcher will be used.
func (*Extractor) ExtractAllPosts ¶
func (e *Extractor) ExtractAllPosts(ctx context.Context, urls []string) <-chan ExtractResult
func (*Extractor) ExtractPost ¶
func (*Extractor) GetAllPostsURLs ¶
type FetchError ¶
FetchError represents the error returned when the server responds with too many requests (HTTP 429) and supplies a Retry-After value.
func (*FetchError) Error ¶
func (e *FetchError) Error() string
Error returns the error message for the FetchError, indicating the retry wait time.
type FetchResult ¶
type FetchResult struct {
	Url   string
	Body  io.ReadCloser
	Error error
}
FetchResult represents the result of a URL fetch operation.
type Fetcher ¶
type Fetcher struct {
	Client      *http.Client
	RateLimiter *rate.Limiter
	BackoffCfg  backoff.BackOff
	Cookie      *http.Cookie
}
Fetcher represents a URL fetcher with rate limiting and retry mechanisms.
func NewFetcher ¶
func NewFetcher(opts ...FetcherOption) *Fetcher
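NewFetcher creates a new Fetcher with the provided options. If the RatePerSecond option is 0 (for example, because WithRatePerSecond was not supplied), the default rate (DefaultRatePerSecond) is used. If the BackOffConfig option is nil (for example, because WithBackOffConfig was not supplied), the default backoff configuration is used.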
type FetcherOption ¶
type FetcherOption func(*FetcherOptions)
FetcherOption defines a function that applies a specific option to FetcherOptions.
func WithBackOffConfig ¶
func WithBackOffConfig(b backoff.BackOff) FetcherOption
WithBackOffConfig sets the backoff configuration for the Fetcher.
func WithCookie ¶
func WithCookie(cookie *http.Cookie) FetcherOption
WithCookie sets the cookie for the Fetcher.
func WithProxyURL ¶
func WithProxyURL(proxyURL *url.URL) FetcherOption
WithProxyURL sets the proxy URL for the Fetcher.
func WithRatePerSecond ¶
func WithRatePerSecond(rate int) FetcherOption
WithRatePerSecond sets the request rate per second for the Fetcher.
type FetcherOptions ¶
type FetcherOptions struct {
	RatePerSecond int
	ProxyURL      *url.URL
	BackOffConfig backoff.BackOff
	Cookie        *http.Cookie
}
FetcherOptions holds configurable options for Fetcher.
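The FetcherOption functions above follow the functional options pattern. The sketch below illustrates how such options could be applied and how the default rate could be filled in. It uses local stand-in types so that it runs with the standard library alone: the BackOffConfig field is omitted, and resolveOptions is a hypothetical helper, not the package's actual NewFetcher logic.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// DefaultRatePerSecond mirrors the documented constant.
const DefaultRatePerSecond = 2

// FetcherOptions is a local stand-in for the documented struct. The
// BackOffConfig field is left out to avoid third-party imports.
type FetcherOptions struct {
	RatePerSecond int
	ProxyURL      *url.URL
	Cookie        *http.Cookie
}

// FetcherOption matches the documented signature.
type FetcherOption func(*FetcherOptions)

// WithRatePerSecond records the requested rate on the options.
func WithRatePerSecond(rate int) FetcherOption {
	return func(o *FetcherOptions) { o.RatePerSecond = rate }
}

// WithCookie records the cookie on the options.
func WithCookie(cookie *http.Cookie) FetcherOption {
	return func(o *FetcherOptions) { o.Cookie = cookie }
}

// resolveOptions is a hypothetical helper: it applies each option in
// order, then falls back to the default rate when none was set.
func resolveOptions(opts ...FetcherOption) FetcherOptions {
	var o FetcherOptions
	for _, opt := range opts {
		opt(&o)
	}
	if o.RatePerSecond == 0 {
		o.RatePerSecond = DefaultRatePerSecond
	}
	return o
}

func main() {
	def := resolveOptions()
	custom := resolveOptions(
		WithRatePerSecond(5),
		WithCookie(&http.Cookie{Name: "session", Value: "example"}),
	)
	fmt.Println(def.RatePerSecond, custom.RatePerSecond, custom.Cookie.Name)
}
```

With the real package, the equivalent call would presumably be NewFetcher(WithRatePerSecond(5), WithCookie(c)); the cookie name and value shown here are placeholders.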
type Post ¶
type Post struct {
	Id               int    `json:"id"`
	PublicationId    int    `json:"publication_id"`
	Type             string `json:"type"`
	Slug             string `json:"slug"`
	PostDate         string `json:"post_date"`
	CanonicalUrl     string `json:"canonical_url"`
	PreviousPostSlug string `json:"previous_post_slug"`
	NextPostSlug     string `json:"next_post_slug"`
	CoverImage       string `json:"cover_image"`
	Description      string `json:"description"`
	WordCount        int    `json:"wordcount"`
	//PostTags []string `json:"postTags"`
	Title    string `json:"title"`
	BodyHTML string `json:"body_html"`
}
Post represents a Substack post decoded from JSON, including its identifiers, slug, date, URLs, description, word count, title, and HTML body.
type PostWrapper ¶
type PostWrapper struct {
Post Post `json:"post"`
}
PostWrapper wraps a Post object for JSON unmarshaling.
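As a minimal sketch of how the wrapper is used, the example below decodes a JSON document whose top-level "post" key holds the post. The Post struct here is a trimmed copy with only a few of the documented fields, the payload is made up for illustration, and decodePost is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Post is a trimmed copy of the documented struct, keeping a few fields
// and their JSON tags.
type Post struct {
	Id           int    `json:"id"`
	Slug         string `json:"slug"`
	Title        string `json:"title"`
	CanonicalUrl string `json:"canonical_url"`
	WordCount    int    `json:"wordcount"`
}

// PostWrapper matches the documented wrapper: the post sits under "post".
type PostWrapper struct {
	Post Post `json:"post"`
}

// decodePost is a hypothetical helper that unwraps the Post from the
// enclosing JSON object.
func decodePost(data []byte) (Post, error) {
	var w PostWrapper
	if err := json.Unmarshal(data, &w); err != nil {
		return Post{}, err
	}
	return w.Post, nil
}

func main() {
	// Made-up payload for illustration only.
	payload := []byte(`{"post":{"id":1,"slug":"hello-world","title":"Hello","canonical_url":"https://example.substack.com/p/hello-world","wordcount":120}}`)

	p, err := decodePost(payload)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(p.Id, p.Slug, p.WordCount)
}
```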