Documentation ¶
Index ¶
- type Option
- func WithAllowURLRevisit() Option
- func WithCookies(url string, cookies []*http.Cookie) Option
- func WithDetectCharset() Option
- func WithDisableCookies() Option
- func WithHeaders(headers map[string]string) Option
- func WithIgnoreRobotsTxt() Option
- func WithLogDebugger() Option
- func WithRandomUserAgent() Option
- func WithRequestTimeout(timeout time.Duration) Option
- func WithTransport(transport http.RoundTripper) Option
- func WithUserAgent(ua string) Option
- type Scraper
- func (s *Scraper) ClonedCollector() *colly.Collector
- func (s *Scraper) Name() string
- func (s *Scraper) NormalizeID(id string) string
- func (s *Scraper) ParseIDFromURL(string) (string, error)
- func (s *Scraper) Priority() int
- func (s *Scraper) SetRequestTimeout(timeout time.Duration)
- func (s *Scraper) URL() *url.URL
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Option ¶
func WithAllowURLRevisit ¶
func WithAllowURLRevisit() Option
func WithCookies ¶
func WithCookies(url string, cookies []*http.Cookie) Option
func WithDetectCharset ¶
func WithDetectCharset() Option
func WithDisableCookies ¶
func WithDisableCookies() Option
func WithHeaders ¶
func WithHeaders(headers map[string]string) Option
func WithIgnoreRobotsTxt ¶
func WithIgnoreRobotsTxt() Option
func WithLogDebugger ¶
func WithLogDebugger() Option
func WithRandomUserAgent ¶
func WithRandomUserAgent() Option
func WithRequestTimeout ¶
func WithRequestTimeout(timeout time.Duration) Option
func WithTransport ¶
func WithTransport(transport http.RoundTripper) Option
func WithUserAgent ¶
func WithUserAgent(ua string) Option
type Scraper ¶
type Scraper struct {
// contains filtered or unexported fields
}
Scraper implements the basic Provider interface.
func NewDefaultScraper ¶
NewDefaultScraper returns a *Scraper with default options enabled.
func NewScraper ¶
NewScraper returns a *Scraper that implements the Provider interface.
func (*Scraper) ClonedCollector ¶
func (s *Scraper) ClonedCollector() *colly.Collector
ClonedCollector returns a clone of the internal collector.
func (*Scraper) Name ¶
func (s *Scraper) Name() string
func (*Scraper) NormalizeID ¶
func (s *Scraper) NormalizeID(id string) string
func (*Scraper) ParseIDFromURL ¶
func (s *Scraper) ParseIDFromURL(string) (string, error)
func (*Scraper) Priority ¶
func (s *Scraper) Priority() int
func (*Scraper) SetRequestTimeout ¶
func (s *Scraper) SetRequestTimeout(timeout time.Duration)
SetRequestTimeout sets the timeout for HTTP requests.
func (*Scraper) URL ¶
func (s *Scraper) URL() *url.URL