Documentation ¶
Overview ¶
Package robots implements a higher-level robots.txt interface.
The package provides a cache that stores parsed robots.txt structures per hostname.
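A minimal usage sketch; the import path, the constructor name NewCache, and the exact Allowed parameters are assumptions, since this documentation does not show the full signatures:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"example.com/robots" // hypothetical import path for this package
)

func main() {
	// NewCache is an assumed constructor name.
	cache := robots.NewCache()

	// The first call for a host fetches and parses its robots.txt;
	// later calls for the same host reuse the cached structure.
	ok, err := cache.Allowed(context.Background(), "mybot/1.0",
		"https://example.com/private/page")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("allowed:", ok)
}
```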
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Cache ¶
type Cache struct {
// contains filtered or unexported fields
}
Cache implements an LRU robots cache.
The cache maintains an LRU mapping of domain names to their parsed robots.txt structures. When a new domain is seen, the cache fetches its robots.txt, parses it, and adds the result to the cache.
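A self-contained sketch of the fetch-on-miss flow described above. The package's fields are unexported, so every name below is illustrative, and LRU eviction is elided for brevity:

```go
package robotscache

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"sync"
)

// entry stands in for a parsed robots.txt structure.
type entry struct {
	raw string // parsed rules would live here in a real implementation
}

// Cache sketches the fetch-on-miss behavior; a real cache would bound
// the map with LRU eviction.
type Cache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func (c *Cache) robotsFor(ctx context.Context, host string) (*entry, error) {
	c.mu.Lock()
	if e, ok := c.entries[host]; ok {
		c.mu.Unlock()
		return e, nil // cache hit: reuse the parsed structure
	}
	c.mu.Unlock()

	// Cache miss: fetch https://<host>/robots.txt and parse it.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		"https://"+host+"/robots.txt", nil)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("reading robots.txt for %s: %w", host, err)
	}

	e := &entry{raw: string(body)}
	c.mu.Lock()
	if c.entries == nil {
		c.entries = make(map[string]*entry)
	}
	c.entries[host] = e
	c.mu.Unlock()
	return e, nil
}
```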
func (*Cache) Allowed ¶
Allowed returns true if the request is allowed.
The method looks up the robots.txt structure for the domain name and checks whether the request's user agent is allowed to fetch the URL. Subsequent calls may use the cached robots.txt structures.
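A godoc-style call-site sketch with a context deadline, under the same assumed names and parameter order as the overview example:

```go
func ExampleCache_Allowed() {
	cache := robots.NewCache() // assumed constructor name

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// The first call for example.com triggers a robots.txt fetch; the
	// second hits the cached structure with no network round trip.
	for _, target := range []string{"https://example.com/a", "https://example.com/b"} {
		ok, err := cache.Allowed(ctx, "mybot/1.0", target)
		if err != nil {
			log.Printf("robots check failed: %v", err) // e.g. context canceled
			continue
		}
		fmt.Println(target, "allowed:", ok)
	}
}
```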
Note that the robots.txt lookup is simplistic: it takes the hostname and appends `/robots.txt` to it. This means that the X-Robots-Tag header and the robots meta tag are not considered.
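A small sketch of that URL derivation using the standard library; it is illustrative, not the package's internal code, and keeping the target's scheme is an assumption:

```go
package main

import (
	"fmt"
	"net/url"
)

// robotsURL mirrors the note above: keep the target's scheme and host,
// replace the path with /robots.txt, and drop everything else.
func robotsURL(target string) (string, error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", err
	}
	loc := url.URL{Scheme: u.Scheme, Host: u.Host, Path: "/robots.txt"}
	return loc.String(), nil
}

func main() {
	loc, _ := robotsURL("https://example.com/articles/1?utm=x")
	fmt.Println(loc) // https://example.com/robots.txt
}
```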
The method returns an error if the context is canceled or if a parsing error occurs.