Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrInvalidURL is the error reported if the URL to visit
	// has an invalid format.
	ErrInvalidURL = errors.New("Invalid URL")
	// ErrForbidden is the error reported if the URL is not allowed to be visited.
	ErrForbidden = errors.New("Forbidden")
	// ErrAlreadyVisited is the error reported for an already visited URL.
	ErrAlreadyVisited = errors.New("Already visited")
)
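Sentinel errors like these are usually compared with `errors.Is`. The sketch below redeclares the three errors locally so that it compiles on its own; `describe` is a hypothetical helper, and whether the package ever wraps these errors is an assumption, not something this page states.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins mirroring the sentinel errors documented above.
// In real code these would come from the crawler package itself.
var (
	ErrInvalidURL     = errors.New("Invalid URL")
	ErrForbidden      = errors.New("Forbidden")
	ErrAlreadyVisited = errors.New("Already visited")
)

// describe is a hypothetical helper that maps an error to a short label.
// errors.Is also matches when a sentinel has been wrapped with %w.
func describe(err error) string {
	switch {
	case errors.Is(err, ErrInvalidURL):
		return "invalid"
	case errors.Is(err, ErrForbidden):
		return "forbidden"
	case errors.Is(err, ErrAlreadyVisited):
		return "duplicate"
	default:
		return "other"
	}
}

func main() {
	wrapped := fmt.Errorf("visit https://example.com: %w", ErrForbidden)
	fmt.Println(describe(wrapped))
	fmt.Println(describe(ErrAlreadyVisited))
	fmt.Println(describe(errors.New("timeout")))
}
```

A function like `describe` could be called from an OnError callback to decide whether an error is worth logging.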
Functions ¶
This section is empty.
Types ¶
type Crawler ¶
type Crawler struct {
// contains filtered or unexported fields
}
func NewCrawlerWithLimitRule ¶
NewCrawlerWithLimitRule returns a `*Crawler` with a LimitRule.
func (*Crawler) OnError ¶
func (c *Crawler) OnError(f ErrorCallback)
OnError registers a function that is executed when an error occurs.
func (*Crawler) OnVisit ¶
func (c *Crawler) OnVisit(f VisitCallback)
OnVisit registers a function that is executed when a web site is visited.
func (*Crawler) OnVisited ¶
func (c *Crawler) OnVisited(f VisitedCallback)
OnVisited registers a function that is executed after a web site has been visited.
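The registration pattern used by OnError and OnVisit can be sketched with a small stand-in type. `miniCrawler`, `deliver`, and `errFetch` are hypothetical names invented for this illustration and are not part of the package. Keeping several callbacks per event is also an assumption. OnVisited is left out because the fields of `CrawlResult` are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Callback types with the same signatures as the documented ones.
type VisitCallback func(response []byte)
type ErrorCallback func(error)

// errFetch is a made-up error used only to drive the demo.
var errFetch = errors.New("fetch failed")

// miniCrawler is a hypothetical stand-in, not the package's Crawler.
type miniCrawler struct {
	onVisit []VisitCallback
	onError []ErrorCallback
}

// OnVisit stores f so it can be called with each response body.
func (c *miniCrawler) OnVisit(f VisitCallback) { c.onVisit = append(c.onVisit, f) }

// OnError stores f so it can be called with each error.
func (c *miniCrawler) OnError(f ErrorCallback) { c.onError = append(c.onError, f) }

// deliver simulates the outcome of one request and dispatches it
// to the error callbacks or the visit callbacks.
func (c *miniCrawler) deliver(body []byte, err error) {
	if err != nil {
		for _, f := range c.onError {
			f(err)
		}
		return
	}
	for _, f := range c.onVisit {
		f(body)
	}
}

func main() {
	c := &miniCrawler{}
	c.OnVisit(func(response []byte) {
		fmt.Printf("visited, %d bytes\n", len(response))
	})
	c.OnError(func(err error) {
		fmt.Println("error:", err)
	})
	c.deliver([]byte("<html></html>"), nil)
	c.deliver(nil, errFetch)
}
```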
func (*Crawler) SetParallelism ¶
SetParallelism sets the limit on crawling parallelism. The default parallelism is 5.
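One common way to enforce a parallelism limit in Go is a buffered channel used as a semaphore. The sketch below shows that general technique. It is not taken from the package's source, and `runLimited` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited runs every task with at most limit of them in flight at once
// and returns the highest number of tasks observed running together.
func runLimited(limit int, tasks []func()) int {
	sem := make(chan struct{}, limit) // one slot per allowed concurrent task
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(task func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the task is done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			task()

			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 20)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(10 * time.Millisecond) }
	}
	peak := runLimited(5, tasks)
	fmt.Println("peak within limit:", peak <= 5)
}
```

Because a slot is taken before each goroutine starts and released only after its task returns, the observed peak cannot exceed the limit.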
func (*Crawler) UseHeadlessChrome ¶
func (c *Crawler) UseHeadlessChrome()
UseHeadlessChrome makes the crawler use headless Chrome when sending requests. By default, requests are made with `http.Get()`.
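For orientation, a plain `http.Get()` fetch that yields the `[]byte` body a VisitCallback receives might look like the sketch below. `fetch` and `newDemoServer` are hypothetical helpers, the status check is an assumption rather than documented behaviour, and the local test server is there only so the example needs no network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch is a hypothetical helper: it downloads target with http.Get
// and returns the raw response body.
func fetch(target string) ([]byte, error) {
	resp, err := http.Get(target)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// newDemoServer starts a local server that always answers with a tiny page.
func newDemoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html>hello</html>")
	}))
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	body, err := fetch(srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

A static fetch like this does not run JavaScript, which is the usual reason to switch a crawler to headless Chrome.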
type ErrorCallback ¶
type ErrorCallback func(error)
ErrorCallback is the function type for OnError callbacks.
type LimitRule ¶
type LimitRule struct {
	// AllowedHosts defines the accessible hosts.
	// When AllowedHosts is empty, all hosts are allowed.
	AllowedHosts []string
	AllowedUrls  []*regexp.Regexp
}
func NewLimitRule ¶
func NewLimitRule() *LimitRule
NewLimitRule returns an empty LimitRule. You can add rules to it by calling AddAllowedHosts.
func (*LimitRule) AddAllowedHosts ¶
AddAllowedHosts adds a rule that defines the accessible hosts.
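The effect of an allow-list can be illustrated with a stand-in struct that has the same exported fields as LimitRule. `limitRule` and `allows` are hypothetical and are not the package's own logic. The only behaviour taken from this page is that an empty AllowedHosts permits every host. Treating an empty AllowedUrls the same way, and matching patterns against the full URL string, are assumptions.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// limitRule is a local stand-in with the same exported fields as LimitRule.
type limitRule struct {
	AllowedHosts []string
	AllowedUrls  []*regexp.Regexp
}

// allows is a hypothetical check of one raw URL against the rule.
func (r *limitRule) allows(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return false // unparsable or host-less URLs are rejected
	}
	// An empty AllowedHosts means every host is allowed.
	if len(r.AllowedHosts) > 0 {
		found := false
		for _, h := range r.AllowedHosts {
			if h == u.Hostname() {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	// Assumption: an empty AllowedUrls places no further restriction.
	if len(r.AllowedUrls) == 0 {
		return true
	}
	for _, re := range r.AllowedUrls {
		if re.MatchString(raw) {
			return true
		}
	}
	return false
}

func main() {
	rule := &limitRule{
		AllowedHosts: []string{"example.com"},
		AllowedUrls:  []*regexp.Regexp{regexp.MustCompile(`^https://example\.com/docs/`)},
	}
	for _, raw := range []string{
		"https://example.com/docs/intro",
		"https://example.com/blog/post",
		"https://other.test/docs/intro",
	} {
		fmt.Printf("%-32s %v\n", raw, rule.allows(raw))
	}
}
```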
func (*LimitRule) AddAllowedUrls ¶
type VisitCallback ¶
type VisitCallback func(response []byte)
VisitCallback is the function type for OnVisit callbacks.
type VisitedCallback ¶
type VisitedCallback func(*CrawlResult)
VisitedCallback is the function type for OnVisited callbacks.