Documentation ¶
Index ¶
Constants ¶
const (
	// AssetTypeLink is used for <link> assets
	AssetTypeLink = "link"
	// AssetTypeImage is used for <image> assets
	AssetTypeImage = "image"
	// AssetTypeScript is used for <script> assets
	AssetTypeScript = "script"
)
Variables ¶
var (
	// ErrURLInvalid is given when the URL provided to the 'Site'
	// method is empty or invalid
	ErrURLInvalid = errors.New("The given URL is invalid")
	// ErrHTTPError is given when the URL provided results in a
	// HTTP error code or could not be reached.
	ErrHTTPError = errors.New("The given URL gave a http error code")
	// ErrParseError is given when a page gave a HTML response that
	// could not be parsed.
	ErrParseError = errors.New("Failed to parse link")
)
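Because these are sentinel errors, a caller can tell them apart with errors.Is. The following is a minimal sketch, not the package's own code: the three errors are redeclared locally so the example compiles on its own, and crawl is a hypothetical stand-in for the 'Site' method, whose signature is not shown on this page. Whether the package returns the sentinels wrapped or unwrapped is also an assumption; errors.Is matches in both cases.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented sentinel errors, redeclared here only so
// that this sketch is self-contained.
var (
	ErrURLInvalid = errors.New("The given URL is invalid")
	ErrHTTPError  = errors.New("The given URL gave a http error code")
	ErrParseError = errors.New("Failed to parse link")
)

// crawl is a hypothetical stand-in for the package's entry point. It only
// rejects an empty URL, wrapping the sentinel with some context.
func crawl(url string) error {
	if url == "" {
		return fmt.Errorf("crawl %q: %w", url, ErrURLInvalid)
	}
	return nil
}

// describe maps an error to a short label. errors.Is matches a sentinel
// whether it is returned directly or wrapped with %w.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrURLInvalid):
		return "invalid URL"
	case errors.Is(err, ErrHTTPError):
		return "HTTP failure"
	case errors.Is(err, ErrParseError):
		return "unparseable HTML"
	default:
		return "unexpected error"
	}
}

func main() {
	fmt.Println(describe(crawl("")))
	fmt.Println(describe(crawl("https://example.com")))
}
```

Comparing with errors.Is rather than == keeps the caller working if a layer in between adds context to the error.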
Functions ¶
This section is empty.
Types ¶
type Asset ¶
Asset represents a reference to a piece of static content, such as a stylesheet, image or script. An asset may be external to the site, because the scraper does not follow asset references. An Asset records the reference found on the page, not the content that was served.
type Page ¶
type Page struct {
	URL    string   `json:"url"`
	Assets []*Asset `json:"assets"`
	Pages  []string `json:"pages"`
}
Page represents a location within a sitemap. Each Page should correspond to a page within the website.
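The struct tags determine how a Page is encoded as JSON. The sketch below copies the Page definition from this page and encodes one value. The Asset fields are not listed in this documentation, so Asset is an empty placeholder here, and encodePage and the example.com URLs are illustrative names only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Asset is a placeholder: its real fields are not shown in this
// documentation, so none are assumed here.
type Asset struct{}

// Page mirrors the documented struct, including its JSON tags.
type Page struct {
	URL    string   `json:"url"`
	Assets []*Asset `json:"assets"`
	Pages  []string `json:"pages"`
}

// encodePage is a hypothetical helper that returns the compact JSON form
// of a Page.
func encodePage(p *Page) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	p := &Page{
		URL:    "https://example.com/",
		Assets: []*Asset{}, // empty but non-nil, so it encodes as []
		Pages:  []string{"https://example.com/about"},
	}
	out, err := encodePage(p)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
	// {"url":"https://example.com/","assets":[],"pages":["https://example.com/about"]}
}
```

A nil Assets or Pages slice would encode as null rather than [], which matters to consumers that expect an array.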
type Sitemap ¶
type Sitemap struct {
Pages []*Page `json:"pages"`
}
Sitemap represents a hierarchy of pages within a website.
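One way to work with a Sitemap is to decode it from JSON and index its pages by URL. This is a minimal sketch under assumptions: the types are copied from this page, Asset is an empty placeholder because its fields are not shown, decodeSitemap and indexByURL are hypothetical helpers, and the entries in Page.Pages are assumed to be the URLs of linked pages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Asset is a placeholder: its real fields are not shown in this
// documentation.
type Asset struct{}

// Page and Sitemap mirror the documented structs.
type Page struct {
	URL    string   `json:"url"`
	Assets []*Asset `json:"assets"`
	Pages  []string `json:"pages"`
}

type Sitemap struct {
	Pages []*Page `json:"pages"`
}

// decodeSitemap is a hypothetical helper that parses a JSON document
// into a Sitemap.
func decodeSitemap(data []byte) (*Sitemap, error) {
	var s Sitemap
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

// indexByURL is a hypothetical helper that builds a lookup table from
// page URL to page, skipping nil entries.
func indexByURL(s *Sitemap) map[string]*Page {
	idx := make(map[string]*Page, len(s.Pages))
	for _, p := range s.Pages {
		if p != nil {
			idx[p.URL] = p
		}
	}
	return idx
}

func main() {
	data := []byte(`{"pages":[
		{"url":"https://example.com/","assets":[],"pages":["https://example.com/about"]},
		{"url":"https://example.com/about","assets":[],"pages":[]}
	]}`)

	s, err := decodeSitemap(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	idx := indexByURL(s)
	for _, p := range s.Pages {
		for _, link := range p.Pages {
			// Assumption: each entry in Page.Pages is the URL of a linked page.
			_, known := idx[link]
			fmt.Printf("%s -> %s (in sitemap: %t)\n", p.URL, link, known)
		}
	}
}
```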