Documentation ¶
Index ¶
Functions ¶
Types ¶
type Connector ¶
func (*Connector) CreateExecution ¶
type ScrapeWebsiteInput ¶
type ScrapeWebsiteInput struct {
	// TargetURL: The URL of the website to scrape.
	TargetURL string `json:"target_url"`
	// AllowedDomains: The list of allowed domains to scrape.
	AllowedDomains []string `json:"allowed_domains"`
	// MaxK: The maximum number of pages to scrape.
	MaxK int `json:"max_k"`
	// IncludeLinkText: Whether to include the scraped text of the scraped web page.
	IncludeLinkText *bool `json:"include_link_text"`
	// IncludeLinkHtml: Whether to include the scraped HTML of the scraped web page.
	IncludeLinkHtml *bool `json:"include_link_html"`
}
ScrapeWebsiteInput defines the input of the scrape website task.
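Because IncludeLinkText and IncludeLinkHtml are pointers to bool, a caller can leave them unset (nil) rather than explicitly false. The sketch below illustrates this by redeclaring the struct locally, since the package import path is not shown here; the helpers boolPtr and encodeInput are hypothetical and not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScrapeWebsiteInput is a local copy of the documented struct, redeclared
// here only so that the example compiles on its own.
type ScrapeWebsiteInput struct {
	TargetURL       string   `json:"target_url"`
	AllowedDomains  []string `json:"allowed_domains"`
	MaxK            int      `json:"max_k"`
	IncludeLinkText *bool    `json:"include_link_text"`
	IncludeLinkHtml *bool    `json:"include_link_html"`
}

// boolPtr is a hypothetical helper that returns a pointer to b, which is a
// convenient way to populate the *bool fields.
func boolPtr(b bool) *bool { return &b }

// encodeInput is a hypothetical helper that marshals the input to JSON
// using the field tags declared on the struct.
func encodeInput(in ScrapeWebsiteInput) (string, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	in := ScrapeWebsiteInput{
		TargetURL:       "https://example.com",
		AllowedDomains:  []string{"example.com"},
		MaxK:            5,
		IncludeLinkText: boolPtr(true),
		// IncludeLinkHtml is left nil on purpose.
	}
	s, err := encodeInput(in)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(s)
}
```

Since the tags carry no omitempty option, the nil IncludeLinkHtml field is encoded as null rather than dropped from the JSON object. How the package itself interprets a nil value is not stated in this documentation.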
type ScrapeWebsiteOutput ¶
type ScrapeWebsiteOutput struct {
	// Pages: The list of pages that were scraped.
	Pages []PageInfo `json:"pages"`
}
ScrapeWebsiteOutput defines the output of the scrape website task.
func Scrape ¶
func Scrape(input ScrapeWebsiteInput) (ScrapeWebsiteOutput, error)
Scrape crawls a webpage and returns a ScrapeWebsiteOutput, whose Pages field holds a slice of PageInfo, along with an error.
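A caller would typically check the returned error before reading Pages. The following is a minimal sketch of that pattern under several assumptions: the types are redeclared locally, PageInfo is left as an empty placeholder because its fields are not listed here, and fakeScrape is a hypothetical stand-in that only matches the signature of Scrape and does not reproduce its crawling behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// PageInfo is a placeholder; its real fields are not shown in this documentation.
type PageInfo struct{}

// Local copies of the documented input and output types.
type ScrapeWebsiteInput struct {
	TargetURL       string   `json:"target_url"`
	AllowedDomains  []string `json:"allowed_domains"`
	MaxK            int      `json:"max_k"`
	IncludeLinkText *bool    `json:"include_link_text"`
	IncludeLinkHtml *bool    `json:"include_link_html"`
}

type ScrapeWebsiteOutput struct {
	Pages []PageInfo `json:"pages"`
}

// scrapeFunc has the same signature as the documented Scrape function.
type scrapeFunc func(ScrapeWebsiteInput) (ScrapeWebsiteOutput, error)

// fakeScrape is a hypothetical stand-in for Scrape. It performs no network
// access: it rejects an empty TargetURL and otherwise returns MaxK
// (at least one) empty pages.
func fakeScrape(input ScrapeWebsiteInput) (ScrapeWebsiteOutput, error) {
	if input.TargetURL == "" {
		return ScrapeWebsiteOutput{}, errors.New("target_url is required")
	}
	n := input.MaxK
	if n < 1 {
		n = 1
	}
	return ScrapeWebsiteOutput{Pages: make([]PageInfo, n)}, nil
}

// countPages calls scrape, wraps any error with the target URL for context,
// and returns the number of pages in the output.
func countPages(scrape scrapeFunc, input ScrapeWebsiteInput) (int, error) {
	out, err := scrape(input)
	if err != nil {
		return 0, fmt.Errorf("scrape %q: %w", input.TargetURL, err)
	}
	return len(out.Pages), nil
}

func main() {
	n, err := countPages(fakeScrape, ScrapeWebsiteInput{
		TargetURL: "https://example.com",
		MaxK:      3,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pages scraped:", n)
}
```

With the real package imported, passing Scrape in place of fakeScrape should work unchanged, because countPages depends only on the documented signature.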