Documentation
Constants
const (
	// PageExtension is the file extension that downloaded pages get.
	PageExtension = ".html"
	// PageDirIndex is the file name of the index file for every dir.
	PageDirIndex = "index" + PageExtension
)
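To illustrate how the two constants combine, here is a minimal sketch. The helper `localPagePath` is hypothetical and not part of the package; the mapping rules it applies (index file for directory-like paths, page extension for paths without one) are assumptions for illustration, not the library's documented behavior.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	// PageExtension is the file extension that downloaded pages get.
	PageExtension = ".html"
	// PageDirIndex is the file name of the index file for every dir.
	PageDirIndex = "index" + PageExtension
)

// localPagePath is a hypothetical helper (not part of the package API).
// It maps a URL path to a file path using the two constants above.
func localPagePath(urlPath string) string {
	// Directory-like paths receive the index file name.
	if urlPath == "" || strings.HasSuffix(urlPath, "/") {
		return path.Join(urlPath, PageDirIndex)
	}
	// Paths without a file extension receive the page extension.
	if path.Ext(urlPath) == "" {
		return urlPath + PageExtension
	}
	// Everything else (for example assets) is left unchanged.
	return urlPath
}

func main() {
	fmt.Println(PageDirIndex)                 // index.html
	fmt.Println(localPagePath("/docs/"))      // /docs/index.html
	fmt.Println(localPagePath("/about"))      // /about.html
	fmt.Println(localPagePath("/style.css"))  // /style.css
}
```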
Variables
This section is empty.
Types
type Config (added in v0.1.1)
type Config struct {
	URL      string
	Includes []string
	Excludes []string

	ImageQuality uint // image quality from 0 to 100%, 0 to disable reencoding
	MaxDepth     uint // download depth, 0 for unlimited
	Timeout      uint // time limit in seconds to process each http request

	OutputDirectory string
	Username        string
	Password        string

	Cookies   []Cookie
	Header    http.Header
	Proxy     string
	UserAgent string
}
Config contains the scraper configuration.
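As a rough sketch of how a few of these fields might be consumed, the example below builds an `http.Client` from a configuration value. The `Config` type here is a local copy of a subset of the documented fields, and `newHTTPClient` is a hypothetical helper; how the package itself applies `Timeout` and `Proxy` is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// Config is a local copy of a subset of the documented fields,
// declared here only to keep the example self-contained.
type Config struct {
	URL      string
	MaxDepth uint   // download depth, 0 for unlimited
	Timeout  uint   // time limit in seconds to process each http request
	Proxy    string // assumed to be a proxy URL such as "http://127.0.0.1:8080"
}

// newHTTPClient is a hypothetical helper (not part of the package API).
// It converts the second-based Timeout into a time.Duration and sets up
// a proxy when one is configured.
func newHTTPClient(cfg Config) (*http.Client, error) {
	client := &http.Client{
		Timeout: time.Duration(cfg.Timeout) * time.Second,
	}
	if cfg.Proxy != "" {
		proxyURL, err := url.Parse(cfg.Proxy)
		if err != nil {
			return nil, fmt.Errorf("parsing proxy URL: %w", err)
		}
		client.Transport = &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	}
	return client, nil
}

func main() {
	cfg := Config{
		URL:      "https://example.org",
		MaxDepth: 2,
		Timeout:  30,
	}
	client, err := newHTTPClient(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(client.Timeout) // 30s
}
```

Note that a `Timeout` of 0 yields an `http.Client` without a time limit, which is the standard library's behavior for a zero `Timeout`.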
type Cookie (added in v0.2.0)
type Cookie struct {
	Name    string     `json:"name"`
	Value   string     `json:"value,omitempty"`
	Expires *time.Time `json:"expires,omitempty"`
}
Cookie represents a cookie. It copies parts of the http.Cookie struct but changes the JSON marshaling to exclude empty fields.
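The effect of the `omitempty` tags can be seen in a short sketch. The `Cookie` type below is a local copy of the struct shown above, and `marshalCookie` is a hypothetical helper added only for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Cookie is a local copy of the documented struct, declared here
// only to keep the example self-contained.
type Cookie struct {
	Name    string     `json:"name"`
	Value   string     `json:"value,omitempty"`
	Expires *time.Time `json:"expires,omitempty"`
}

// marshalCookie is a hypothetical helper that returns the JSON
// encoding of a cookie as a string.
func marshalCookie(c Cookie) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Empty Value and nil Expires are left out of the output.
	minimal, err := marshalCookie(Cookie{Name: "session"})
	if err != nil {
		panic(err)
	}
	fmt.Println(minimal) // {"name":"session"}

	// Populated fields are included.
	expires := time.Date(2030, 1, 2, 3, 4, 5, 0, time.UTC)
	full, err := marshalCookie(Cookie{Name: "session", Value: "abc", Expires: &expires})
	if err != nil {
		panic(err)
	}
	fmt.Println(full) // {"name":"session","value":"abc","expires":"2030-01-02T03:04:05Z"}
}
```

`Name` has no `omitempty` option, so it is always emitted, even when empty.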
type Scraper
type Scraper struct {
	URL *url.URL // contains the main URL to parse, will be modified in case of a redirect
	// contains filtered or unexported fields
}
Scraper contains all scraping data.