Documentation
Functions
func ExtractLinks
func ExtractLinks(payload string, originalURL string, shouldFetch URLFetchChecker) (toFetch ExtractedLinks, toStore ExtractedLinks)
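ExtractLinks extracts the links from a page. payload is the page content and originalURL is the URL it was retrieved from. shouldFetch is consulted to decide whether a link should be fetched. The links are returned in two ExtractedLinks values, toFetch and toStore.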
Types
type ExtractedLinks
ExtractedLinks holds the URL of the page that was parsed and the links extracted from it.
type PageStructure
type PageStructure struct {
	Title string   `json:"title,omitempty"`
	H1    []string `json:"h1,omitempty"`
	H2    []string `json:"h2,omitempty"`
	H3    []string `json:"h3,omitempty"`
	H4    []string `json:"h4,omitempty"`
	Text  []string `json:"text,omitempty"`
}
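PageStructure holds the data parsed and extracted from a page: its title, its H1 to H4 headings, and its text. Every field is tagged omitempty, so empty fields are left out of the JSON encoding.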
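The effect of the omitempty tags can be seen with a minimal sketch. The struct is copied locally so the program compiles on its own, and encodePage is a hypothetical helper introduced only for illustration; neither is taken from the package source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PageStructure mirrors the struct documented above. It is redeclared here
// only so that this sketch is self-contained.
type PageStructure struct {
	Title string   `json:"title,omitempty"`
	H1    []string `json:"h1,omitempty"`
	H2    []string `json:"h2,omitempty"`
	H3    []string `json:"h3,omitempty"`
	H4    []string `json:"h4,omitempty"`
	Text  []string `json:"text,omitempty"`
}

// encodePage is a hypothetical helper (not part of the package) that
// returns the JSON encoding of a PageStructure as a string.
func encodePage(p PageStructure) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	page := PageStructure{
		Title: "Example",
		H1:    []string{"Welcome"},
		Text:  []string{"First paragraph."},
	}
	out, err := encodePage(page)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// H2, H3 and H4 are nil, so omitempty drops them from the output.
	fmt.Println(out)
	// prints {"title":"Example","h1":["Welcome"],"text":["First paragraph."]}
}
```

A zero-value PageStructure encodes to `{}`, which keeps stored records small for pages where nothing could be extracted.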
func ExtractText
func ExtractText(payload string) PageStructure
ExtractText extracts the title, headings, and text from a page's payload and returns them as a PageStructure.
type URLFetchChecker
URLFetchChecker is a function that reports whether a link should be fetched.
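The type definition is not shown on this page, so the following is only a sketch of what a checker might look like. It assumes the signature `func(link string) bool`, which may differ from the real one, and sameHostChecker is a hypothetical constructor that approves only http(s) links on the same host as a base URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// URLFetchChecker is redeclared here with an ASSUMED signature; the
// package's actual definition is not shown in this documentation.
type URLFetchChecker func(link string) bool

// sameHostChecker is a hypothetical constructor, not part of the package.
// It returns a checker that approves only http(s) links whose host matches
// the host of base.
func sameHostChecker(base string) (URLFetchChecker, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	return func(link string) bool {
		u, err := url.Parse(link)
		if err != nil {
			return false // unparsable links are never fetched
		}
		if u.Scheme != "http" && u.Scheme != "https" {
			return false // skip mailto:, javascript:, and similar schemes
		}
		return strings.EqualFold(u.Hostname(), baseURL.Hostname())
	}, nil
}

func main() {
	check, err := sameHostChecker("https://example.com/start")
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	fmt.Println(check("https://example.com/about"))  // true
	fmt.Println(check("https://other.example/page")) // false
	fmt.Println(check("mailto:someone@example.com")) // false
}
```

Under the same assumption, a checker like this would be passed as the shouldFetch argument of ExtractLinks to keep a crawl on a single site.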