Documentation ¶
Overview ¶
Package httpsyet provides the configuration and execution for crawling a list of sites for links that can be updated to HTTPS.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Crawler ¶
type Crawler struct { Sites []string // At least one URL. Out io.Writer // Required. Writes one detected site per line. Log *log.Logger // Required. Errors are reported here. Depth int // Optional. Limit depth. Set to >= 1. Parallel int // Optional. Set how many sites to crawl in parallel. Delay time.Duration // Optional. Set delay between crawls. Get func(string) (*http.Response, error) // Optional. Defaults to http.Get. Verbose bool // Optional. If set, status updates are written to logger. }
Crawler is used as configuration for Run. Is validated in Run().
type Rake ¶
type Rake struct {
// contains filtered or unexported fields
}
Rake represents a fanned out circular pipe network with a flexibly adjusting buffer. site item is processed once only - items seen before are filtered out.
A Rake may be used e.g. as a crawling Crawler where every link shall be visited only once.
func New ¶
New returns a (pointer to a) new operational Rake.
`rake` is the operation to be executed in parallel on any item which has not been seen before. Have it use `myrake.Feed(items...)` in order to provide feed-back.
`attr` allows to specify an attribute for the seen filter. Pass `nil` to filter on any item itself.
`somany` is the # of parallel processes - the parallelism of the network built by Rake, the # of parallel raking endpoints of the Rake.
func (*Rake) Attr ¶
Attr sets the (optional) attribute to discriminate 'seen'.
`attr` allows to specify an attribute for the 'seen' filter. If not set 'seen' will discriminate any item by itself.
Seen panics iff called after first nonempty `Feed(...)`
func (*Rake) Done ¶
func (my *Rake) Done() (done <-chan struct{})
Done returns a channel which will be signalled and closed when traffic has subsided, nothing is left to be processed and consequently all goroutines have terminated.
func (*Rake) Rake ¶
Rake sets the rake function to be applied (in parallel).
`rake` is the operation to be executed in parallel on any item which has not been seen before.
You may provide `nil` here and call `Rake(..)` later to provide it. Or have it use `myrake.Feed(items...)` in order to provide feed-back.
Rake panics iff called after first nonempty `Feed(...)`