Documentation
Index
- func Sitemap(resp *Response, file string) error
- type Response
- func Start(ctx context.Context, url string) (resp *Response, err error)
- func StartWithDepth(ctx context.Context, url string, maxDepth int) (resp *Response, err error)
- func StartWithDepthAndDomainRegex(ctx context.Context, url string, maxDepth int, domainRegex string) (resp *Response, err error)
- func StartWithDomainRegex(ctx context.Context, url, domainRegex string) (resp *Response, err error)
Constants
This section is empty.
Variables
This section is empty.
Functions
func Sitemap
func Sitemap(resp *Response, file string) error
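A minimal sketch of persisting a crawl as a sitemap file, assuming Sitemap writes the URLs collected in the Response to the named file; the gru import path is a placeholder and should be replaced with the actual module path.

```go
package main

import (
	"context"
	"log"

	gru "github.com/example/gru" // placeholder import path; use the actual module path
)

func main() {
	// Crawl the site first; Start returns the *Response that Sitemap consumes.
	resp, err := gru.Start(context.Background(), "https://example.com")
	if err != nil {
		log.Fatal(err)
	}

	// Write the crawled URLs to a file (sitemap output format assumed).
	if err := gru.Sitemap(resp, "sitemap.xml"); err != nil {
		log.Fatal(err)
	}
}
```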
Types
type Response
type Response struct {
	BaseURL      *url.URL            // starting URL, at depth 0
	UniqueURLs   map[string]int      // UniqueURLs maps each unique URL crawled to the number of times it was seen
	URLsPerDepth map[int][]*url.URL  // URLsPerDepth holds the URLs found at each depth
	SkippedURLs  map[string][]string // SkippedURLs holds URLs from other domains (if domainRegex is given) and invalid URLs
	ErrorURLs    map[string]error    // ErrorURLs holds the reason each URL was not crawled
	DomainRegex  *regexp.Regexp      // restricts crawling to URLs matching the given domain
	MaxDepth     int                 // maximum depth of the crawl; -1 means no limit
	Interrupted  bool                // reports whether gru was interrupted while scraping
}
Response holds the scraped response
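A minimal sketch of a crawl that inspects the Response fields; only the documented signatures and fields are used, and the import path is a placeholder.

```go
package main

import (
	"context"
	"fmt"
	"log"

	gru "github.com/example/gru" // placeholder import path; use the actual module path
)

func main() {
	// Start crawls from the base URL; the sibling constructors suggest it uses
	// the base URL's domain with no depth limit.
	resp, err := gru.Start(context.Background(), "https://example.com")
	if err != nil {
		log.Fatal(err)
	}

	// UniqueURLs maps each crawled URL to the number of times it was seen.
	for u, count := range resp.UniqueURLs {
		fmt.Printf("%s (seen %d times)\n", u, count)
	}

	// ErrorURLs records why individual URLs could not be crawled.
	for u, crawlErr := range resp.ErrorURLs {
		fmt.Printf("failed %s: %v\n", u, crawlErr)
	}
}
```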
func Start
func Start(ctx context.Context, url string) (resp *Response, err error)
Start will start the scraping with no depth limit (-1), restricted to the base URL's domain
func StartWithDepth
func StartWithDepth(ctx context.Context, url string, maxDepth int) (resp *Response, err error)
StartWithDepth will start the scraping with the given max depth, restricted to the base URL's domain
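A sketch of a depth-limited crawl that groups results with URLsPerDepth; the import path is a placeholder.

```go
package main

import (
	"context"
	"fmt"
	"log"

	gru "github.com/example/gru" // placeholder import path; use the actual module path
)

func main() {
	// Crawl at most two levels deep from the base URL.
	resp, err := gru.StartWithDepth(context.Background(), "https://example.com", 2)
	if err != nil {
		log.Fatal(err)
	}

	// URLsPerDepth groups discovered URLs by the depth at which they were found.
	for depth := 0; depth <= resp.MaxDepth; depth++ {
		fmt.Printf("depth %d: %d URLs\n", depth, len(resp.URLsPerDepth[depth]))
	}
}
```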
func StartWithDepthAndDomainRegex
func StartWithDepthAndDomainRegex(ctx context.Context, url string, maxDepth int, domainRegex string) (resp *Response, err error)
StartWithDepthAndDomainRegex will start the scraping with the given max depth and domain regex
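A sketch combining a depth limit with a domain regex; the regex shown, and the assumption that it is matched against each URL's domain, are illustrative, and the import path is a placeholder.

```go
package main

import (
	"context"
	"fmt"
	"log"

	gru "github.com/example/gru" // placeholder import path; use the actual module path
)

func main() {
	// Crawl three levels deep, following only URLs whose domain matches the
	// regex (matching semantics assumed from the DomainRegex field).
	resp, err := gru.StartWithDepthAndDomainRegex(
		context.Background(), "https://example.com", 3, `.*example\.com`)
	if err != nil {
		log.Fatal(err)
	}

	// SkippedURLs records off-domain and invalid links that were not followed.
	for page, skipped := range resp.SkippedURLs {
		fmt.Printf("%s: skipped %d links\n", page, len(skipped))
	}
}
```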
func StartWithDomainRegex
func StartWithDomainRegex(ctx context.Context, url, domainRegex string) (resp *Response, err error)
StartWithDomainRegex will start the scraping with no depth limit (-1) and the given domain regex
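Since this variant has no depth limit, a context timeout is one way to bound the crawl. This sketch assumes a cancelled context surfaces as a Response with Interrupted set, which the field's comment suggests but the docs do not confirm; the import path is a placeholder.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	gru "github.com/example/gru" // placeholder import path; use the actual module path
)

func main() {
	// Bound the unlimited-depth crawl with a timeout instead of a max depth.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	resp, err := gru.StartWithDomainRegex(ctx, "https://example.com", `.*example\.com`)
	if err != nil {
		log.Fatal(err)
	}

	// Interrupted reports whether the crawl stopped early (e.g. the timeout fired).
	if resp.Interrupted {
		fmt.Println("crawl was interrupted before completing")
	}
	fmt.Printf("crawled %d unique URLs\n", len(resp.UniqueURLs))
}
```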