Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var ErrPrefixMissing = errors.New("prefix is required")
Functions ¶
func StringifyCookies ¶
StringifyCookies serializes a list of http.Cookies to a string
func UnstringifyCookies ¶
UnstringifyCookies deserializes a cookie string to a list of http.Cookies
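The round trip described by these two functions can be sketched with the standard library alone. This is a minimal sketch, not the package's implementation: the helper names stringifyCookies, unstringifyCookies, and demoCookies are hypothetical, and the newline-separated Set-Cookie format is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// stringifyCookies joins each cookie's Set-Cookie representation with newlines.
// Hypothetical helper; the separator and format are assumptions.
func stringifyCookies(cookies []*http.Cookie) string {
	parts := make([]string, len(cookies))
	for i, c := range cookies {
		parts[i] = c.String()
	}
	return strings.Join(parts, "\n")
}

// unstringifyCookies parses the newline-separated form back into cookies by
// feeding each line through net/http's Set-Cookie parser.
func unstringifyCookies(s string) []*http.Cookie {
	h := http.Header{}
	for _, line := range strings.Split(s, "\n") {
		if line != "" {
			h.Add("Set-Cookie", line)
		}
	}
	resp := http.Response{Header: h}
	return resp.Cookies()
}

// demoCookies returns sample data used only for this illustration.
func demoCookies() []*http.Cookie {
	return []*http.Cookie{
		{Name: "session", Value: "abc123", Path: "/"},
		{Name: "lang", Value: "en"},
	}
}

func main() {
	s := stringifyCookies(demoCookies())
	for _, c := range unstringifyCookies(s) {
		fmt.Println(c.Name, c.Value)
	}
}
```

A string form like this is convenient for backends that can only store text values, since the whole cookie list fits in a single value.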
Types ¶
type InMemoryStorage ¶
type InMemoryStorage struct {
// contains filtered or unexported fields
}
InMemoryStorage is the default storage backend of colly. It keeps cookies and visited URLs in memory and does not persist them to disk.
func (*InMemoryStorage) Close ¶
func (s *InMemoryStorage) Close() error
Close implements Storage.Close()
func (*InMemoryStorage) Init ¶
func (s *InMemoryStorage) Init() error
Init initializes InMemoryStorage
func (*InMemoryStorage) IsVisited ¶
func (s *InMemoryStorage) IsVisited(requestID string) (bool, error)
IsVisited implements Storage.IsVisited()
func (*InMemoryStorage) Visited ¶
func (s *InMemoryStorage) Visited(requestID string) error
Visited implements Storage.Visited()
type RedisStorage ¶
type RedisStorage struct {
	// Address is the redis server address
	Address string
	// Password is the password for the redis server
	Password string
	// DB is the redis database. Default is 0
	DB int
	// Prefix is an optional string in the keys. It can be used
	// to use one redis database for independent scraping tasks.
	Prefix string
	// Client is the redis connection
	Client *redis.Client
	// Expiration time for Visited keys. After expiration pages
	// are to be visited again.
	Expires time.Duration
}
func MustNewRedisStorage ¶
func MustNewRedisStorage(address, password string, db int, prefix string) *RedisStorage
func NewRedisStorage ¶
func NewRedisStorage(address, password string, db int, prefix string) (*RedisStorage, error)
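The two constructors share a signature and differ only in how they report failure. The sketch below illustrates the usual Go convention for such a pair, under stated assumptions: the page does not document either function's behavior, so the panic in the Must variant and the rejection of an empty prefix with ErrPrefixMissing are both guesses. The type store and the functions newStore and mustNewStore are hypothetical stand-ins that open no Redis connection.

```go
package main

import (
	"errors"
	"fmt"
)

// errPrefixMissing mirrors the documented ErrPrefixMissing value.
var errPrefixMissing = errors.New("prefix is required")

// store is a hypothetical stand-in for RedisStorage; it opens no connection.
type store struct {
	Address  string
	Password string
	DB       int
	Prefix   string
}

// newStore validates its arguments and reports problems as an error.
// Assumption: an empty prefix is rejected with errPrefixMissing.
func newStore(address, password string, db int, prefix string) (*store, error) {
	if prefix == "" {
		return nil, errPrefixMissing
	}
	return &store{Address: address, Password: password, DB: db, Prefix: prefix}, nil
}

// mustNewStore follows Go's Must* convention: it panics if newStore fails.
func mustNewStore(address, password string, db int, prefix string) *store {
	s, err := newStore(address, password, db, prefix)
	if err != nil {
		panic(err)
	}
	return s
}

func main() {
	if _, err := newStore("localhost:6379", "", 0, ""); errors.Is(err, errPrefixMissing) {
		fmt.Println("constructor rejected empty prefix:", err)
	}
	s := mustNewStore("localhost:6379", "", 0, "job1")
	fmt.Println("created store with prefix", s.Prefix)
}
```

A Must variant suits package-level initialization, where a bad configuration should stop the program; the error-returning variant suits code that can recover or retry.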
func (*RedisStorage) Clear ¶
func (s *RedisStorage) Clear() error
Clear removes all entries from the storage
func (*RedisStorage) Visited ¶
func (s *RedisStorage) Visited(requestID string) error
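The Prefix and Expires fields of RedisStorage suggest that visited entries are stored under namespaced keys with a time to live. The sketch below imitates that behavior in memory so it runs without a Redis server. Everything in it is hypothetical: the type ttlVisited, the fakeClock helper, and the "prefix:request:id" key layout are assumptions, not the package's actual scheme.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeClock is a test clock that only moves when Advance is called.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// ttlVisited is a hypothetical in-memory stand-in for a Redis-backed store.
type ttlVisited struct {
	mu      sync.Mutex
	prefix  string
	expires time.Duration        // zero means entries never expire
	seen    map[string]time.Time // key -> deadline (zero time = no deadline)
	now     func() time.Time     // injectable clock
}

func newTTLVisited(prefix string, expires time.Duration) *ttlVisited {
	return &ttlVisited{
		prefix:  prefix,
		expires: expires,
		seen:    make(map[string]time.Time),
		now:     time.Now,
	}
}

// key namespaces a request ID so independent tasks can share one backend.
// The layout is an assumption made for this sketch.
func (s *ttlVisited) key(requestID string) string {
	return s.prefix + ":request:" + requestID
}

// Visited records the request ID, with a deadline if expires is set.
func (s *ttlVisited) Visited(requestID string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	var deadline time.Time
	if s.expires > 0 {
		deadline = s.now().Add(s.expires)
	}
	s.seen[s.key(requestID)] = deadline
	return nil
}

// IsVisited reports whether the ID was recorded and has not yet expired.
func (s *ttlVisited) IsVisited(requestID string) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	k := s.key(requestID)
	deadline, ok := s.seen[k]
	if !ok {
		return false, nil
	}
	if !deadline.IsZero() && !s.now().Before(deadline) {
		delete(s.seen, k) // expired: the page may be visited again
		return false, nil
	}
	return true, nil
}

func main() {
	clock := &fakeClock{t: time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)}
	s := newTTLVisited("job1", time.Hour)
	s.now = clock.Now
	_ = s.Visited("42")
	fresh, _ := s.IsVisited("42")
	clock.Advance(2 * time.Hour)
	stale, _ := s.IsVisited("42")
	fmt.Println(fresh, stale) // prints "true false"
}
```

With a real Redis backend the expiry would normally be delegated to the server's own key TTL rather than checked on read as done here.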
type Storage ¶
type Storage interface {
	// Init initializes the storage
	Init() error
	// Visited receives and stores a request ID that is visited by the Collector
	Visited(requestID string) error
	// IsVisited returns true if the request was visited before IsVisited
	// is called
	IsVisited(requestID string) (bool, error)
}
Storage is an interface that handles the Collector's internal data, such as visited URLs and cookies. The default Storage of the Collector is InMemoryStorage. The Collector's storage can be changed by calling Collector.SetStorage().
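As an illustration of what a custom backend might look like, here is a minimal sketch of a type that satisfies the three methods listed in the interface, together with a loop that uses it to skip requests already seen. The interface is re-declared locally so the example is self-contained. The names setStorage and visitOnce are hypothetical, and the check-then-mark loop is only one plausible way a caller could use the interface, not the Collector's actual logic.

```go
package main

import (
	"fmt"
	"sync"
)

// Storage repeats the three methods listed in the interface above.
type Storage interface {
	Init() error
	Visited(requestID string) error
	IsVisited(requestID string) (bool, error)
}

// setStorage is a hypothetical implementation backed by a map.
type setStorage struct {
	mu   sync.RWMutex
	seen map[string]bool
}

// Compile-time check that *setStorage satisfies Storage.
var _ Storage = (*setStorage)(nil)

// Init allocates the map if it has not been allocated yet.
func (s *setStorage) Init() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen == nil {
		s.seen = make(map[string]bool)
	}
	return nil
}

// Visited records the request ID.
func (s *setStorage) Visited(requestID string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen == nil {
		s.seen = make(map[string]bool)
	}
	s.seen[requestID] = true
	return nil
}

// IsVisited reports whether the request ID was recorded earlier.
func (s *setStorage) IsVisited(requestID string) (bool, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.seen[requestID], nil
}

// visitOnce marks each unseen ID as visited and returns the IDs that were new.
func visitOnce(st Storage, ids []string) ([]string, error) {
	var fresh []string
	for _, id := range ids {
		seen, err := st.IsVisited(id)
		if err != nil {
			return nil, err
		}
		if seen {
			continue
		}
		if err := st.Visited(id); err != nil {
			return nil, err
		}
		fresh = append(fresh, id)
	}
	return fresh, nil
}

func main() {
	st := &setStorage{}
	if err := st.Init(); err != nil {
		panic(err)
	}
	fresh, _ := visitOnce(st, []string{"a", "b", "a", "c", "b"})
	fmt.Println(fresh) // prints "[a b c]"
}
```

The separate check and mark steps in visitOnce are not atomic, so concurrent callers could both treat the same ID as new; a production backend would likely need a single check-and-set operation.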