Documentation ¶
Overview ¶
Package scraper implements the tool interface for a web scraping tool.
Index ¶
Constants ¶
const (
	DefualtMaxDept   = 1
	DefualtParallels = 2
	DefualtDelay     = 3
	DefualtAsync     = true
)
Variables ¶
var ErrScrapingFailed = errors.New("scraper could not read URL, or scraping is not allowed for provided URL")
Functions ¶
This section is empty.
Types ¶
type Options ¶
type Options func(*Scraper)
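Options is a functional option: each With* constructor below returns a function that mutates a Scraper. A minimal sketch of the pattern follows; the field names (MaxDepth, Async, etc.) are assumptions for illustration, since the real Scraper's fields are unexported:

```go
package main

import "fmt"

// Scraper is a stand-in with illustrative exported fields;
// the real package keeps these unexported.
type Scraper struct {
	MaxDepth  int
	Parallels int
	Delay     int64
	Async     bool
}

// Options mirrors the package's `type Options func(*Scraper)`.
type Options func(*Scraper)

// WithMaxDepth returns an Options function that sets the maximum depth.
func WithMaxDepth(d int) Options {
	return func(s *Scraper) { s.MaxDepth = d }
}

// WithAsync returns an Options function that sets the async flag.
func WithAsync(async bool) Options {
	return func(s *Scraper) { s.Async = async }
}

func main() {
	// Start from the documented defaults, then apply options in order.
	s := &Scraper{MaxDepth: 1, Parallels: 2, Delay: 3, Async: true}
	for _, opt := range []Options{WithMaxDepth(3), WithAsync(false)} {
		opt(s)
	}
	fmt.Println(s.MaxDepth, s.Async) // 3 false
}
```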
func WithAsync ¶
WithAsync creates an Options function that sets the async mode of a Scraper.

Default value: true

async: the boolean value indicating if the scraper should run asynchronously. Returns: an Options function.
func WithBlacklist ¶
WithBlacklist creates an Options function that appends the given url endpoints to the current list of endpoints excluded from scraping.
Default value:
[]string{
	"login",
	"signup",
	"signin",
	"register",
	"logout",
	"download",
	"redirect",
}
blacklist: slice of strings with url endpoints to be excluded from the scraping. Returns: an Options function.
func WithDelay ¶
WithDelay creates an Options function that sets the delay of a Scraper.
The delay parameter specifies the amount of time in milliseconds that the Scraper should wait between requests.
Default value: 3
delay: the delay to set. Returns: an Options function.
func WithMaxDepth ¶
WithMaxDepth sets the maximum depth for the Scraper.
Default value: 1
maxDepth: the maximum depth to set. Returns: an Options function.
func WithNewBlacklist ¶
WithNewBlacklist creates an Options function that replaces the list of url endpoints to be excluded from the scraping, with a new list.
Default value:
[]string{
	"login",
	"signup",
	"signin",
	"register",
	"logout",
	"download",
	"redirect",
}
blacklist: slice of strings with url endpoints to be excluded from the scraping. Returns: an Options function.
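The difference between the two blacklist options is that WithBlacklist appends to the default list while WithNewBlacklist replaces it. A sketch of the contrast, where the Blacklist field name is an assumption (the real field is unexported):

```go
package main

import "fmt"

// Scraper stand-in with an illustrative exported Blacklist field.
type Scraper struct {
	Blacklist []string
}

type Options func(*Scraper)

// WithBlacklist appends endpoints to the existing blacklist.
func WithBlacklist(blacklist []string) Options {
	return func(s *Scraper) { s.Blacklist = append(s.Blacklist, blacklist...) }
}

// WithNewBlacklist replaces the blacklist with a new list.
func WithNewBlacklist(blacklist []string) Options {
	return func(s *Scraper) { s.Blacklist = blacklist }
}

func main() {
	defaults := []string{"login", "signup", "signin", "register", "logout", "download", "redirect"}

	a := &Scraper{Blacklist: append([]string{}, defaults...)}
	WithBlacklist([]string{"admin"})(a)
	fmt.Println(len(a.Blacklist)) // 8: the seven defaults plus "admin"

	b := &Scraper{Blacklist: append([]string{}, defaults...)}
	WithNewBlacklist([]string{"admin"})(b)
	fmt.Println(len(b.Blacklist)) // 1: only "admin"
}
```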
func WithParallelsNum ¶
WithParallelsNum creates an Options function that sets the maximum number of concurrent requests allowed for the matched domains.

Default value: 2

parallels: the number of parallel requests to set. Returns: an Options function.
type Scraper ¶
func New ¶
New creates a new instance of Scraper with the provided options.
The options parameter is a variadic argument allowing the user to specify custom configuration options for the Scraper. These options can be functions that modify the Scraper's properties.
The function returns a pointer to a Scraper instance and an error. The error value is nil if the Scraper is created successfully.
func (Scraper) Call ¶
Call scrapes a website and returns the site data.
The function takes a context.Context object for managing the execution context and a string input representing the URL of the website to be scraped. It returns a string containing the scraped data and an error if any.
func (Scraper) Description ¶
Description returns a description of the scraper tool.

It takes no parameters and returns a string.