Documentation
Overview
Package paginate, part of Dataflow kit, defines the Paginator interface, which retrieves the next page from the current one.
The next page can be obtained in several ways:
BySelector returns a Paginator that extracts the next page from a document by querying a given CSS selector and extracting the given HTML attribute from the resulting element.
ByQueryParam returns a Paginator that returns the next page from a document by incrementing a given query parameter. Note that this paginates indefinitely, so you will probably want to cap the number of pages to scrape with the MaxPages parameter of ScrapeOptions.
Types
type Paginator
type Paginator interface {
	// NextPage controls the progress of the scrape. It is called for each input
	// page, starting with the origin URL, and is expected to return the URL of
	// the next page to process. Note that order matters - calling 'NextPage' on
	// page 1 should return page 2, not page 3. The function should return an
	// empty string when there are no more pages to process.
	NextPage(url string, document *goquery.Selection) (string, error)
}
The Paginator interface should be implemented by things that can retrieve the next page from the current one.
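The contract described in the NextPage comment (pages visited in order, an empty string meaning "stop") can be illustrated with a small driver loop. This is only a sketch under stated assumptions: `page` is a stand-in for `*goquery.Selection` so the example needs no external dependency, and `paginator`, `linkPaginator`, and `crawl` are hypothetical names that are not part of the package.

```go
package main

import "fmt"

// page is a stand-in for *goquery.Selection (assumption, to keep the
// sketch dependency-free). It only records the link to the next page.
type page struct{ next string }

// paginator mirrors the shape of Paginator, using the stand-in type.
type paginator interface {
	NextPage(url string, document *page) (string, error)
}

// linkPaginator is a hypothetical implementation that returns the link
// stored on the current page, or "" when there is none.
type linkPaginator struct{}

func (linkPaginator) NextPage(url string, document *page) (string, error) {
	return document.next, nil
}

// crawl visits pages in order, starting at origin. It stops when NextPage
// returns an empty string or when maxPages pages have been visited.
func crawl(p paginator, site map[string]*page, origin string, maxPages int) ([]string, error) {
	var visited []string
	url := origin
	for url != "" && len(visited) < maxPages {
		doc, ok := site[url]
		if !ok {
			return visited, fmt.Errorf("no such page: %s", url)
		}
		visited = append(visited, url)
		next, err := p.NextPage(url, doc)
		if err != nil {
			return visited, err
		}
		url = next
	}
	return visited, nil
}

func main() {
	site := map[string]*page{
		"/p1": {next: "/p2"},
		"/p2": {next: "/p3"},
		"/p3": {next: ""},
	}
	visited, err := crawl(linkPaginator{}, site, "/p1", 10)
	fmt.Println(visited, err) // prints "[/p1 /p2 /p3] <nil>"
}
```

The page cap in `crawl` plays the role that a maximum-pages setting would play for a paginator that never returns an empty string.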
func ByQueryParam
ByQueryParam returns a Paginator that returns the next page from a document by incrementing a given query parameter. Note that this paginates indefinitely, so you will probably want to cap the number of pages to scrape with the MaxPages parameter of ScrapeOptions.
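As a rough illustration of what "incrementing a given query parameter" can mean, the sketch below rewrites a URL using only the standard library. `nextByQueryParam` is a hypothetical helper, not the package's implementation; in particular, treating a missing or non-numeric parameter as an error is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// nextByQueryParam is a hypothetical helper (not part of the package).
// It parses rawURL, adds 1 to the integer value of param, and returns
// the rewritten URL. A missing or non-integer value yields an error,
// which is an assumption of this sketch.
func nextByQueryParam(rawURL, param string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	n, err := strconv.Atoi(q.Get(param))
	if err != nil {
		return "", fmt.Errorf("query parameter %q is not an integer: %w", param, err)
	}
	q.Set(param, strconv.Itoa(n+1))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	next, err := nextByQueryParam("https://example.com/list?page=1", "page")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(next) // prints "https://example.com/list?page=2"
}
```

Because the helper never returns an empty string on success, a caller following the Paginator contract would keep requesting pages until some external limit stops it.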
func BySelector
BySelector returns a Paginator that extracts the next page from a document by querying a given CSS selector and extracting the given HTML attribute from the resulting element.
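The attribute extracted from the matched element (for example an `href`) is often a relative link. The sketch below shows one plausible follow-up step, resolving that value against the current page URL with the standard library. The selector query itself would go through goquery and is left out to keep the example dependency-free. `resolveNext` is a hypothetical helper, and whether the package resolves links this way is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveNext is a hypothetical helper (not part of the package).
// attrValue stands for the attribute text taken from the element matched
// by the CSS selector. An empty value is treated as "no more pages".
func resolveNext(currentURL, attrValue string) (string, error) {
	if attrValue == "" {
		return "", nil
	}
	base, err := url.Parse(currentURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(attrValue)
	if err != nil {
		return "", err
	}
	// ResolveReference turns a relative link into an absolute URL.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	next, err := resolveNext("https://example.com/list/page1.html", "page2.html")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(next) // prints "https://example.com/list/page2.html"
}
```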