crawler

package
v0.0.0-...-28926d1
Published: Feb 8, 2024 License: MIT Imports: 6 Imported by: 0

Documentation

Index

Constants

const (
	StatusFinish    = "finish"
	StatusRunning   = "running"
	StatusNoSpiders = "no_spiders"
)
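
For illustration, one way these status values might be interpreted by a caller. How the current status is exposed by the package is not documented on this page, so the status argument below is a stand-in:

func describeStatus(status string) string {
	switch status {
	case crawler.StatusFinish:
		return "crawl finished: all spiders completed"
	case crawler.StatusRunning:
		return "spiders are still running"
	case crawler.StatusNoSpiders:
		return "no spiders matched the tracking number"
	default:
		return "unknown status: " + status
	}
}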

Variables

var (
	ErrTrackIsNil = errors.New("tracking number is nil")
)

Functions

This section is empty.

Types

type Manager

type Manager struct {
	SpiderFinder SpiderFinder
	// contains filtered or unexported fields
}

Manager creates Crawler instances to start spiders. A Crawler executes N tasks in goroutines and collects the results.

func NewCrawlerManager

func NewCrawlerManager(repo SpiderFinder, log logger.Logger, client *http.Client) *Manager
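A hedged usage sketch, assuming repo implements SpiderFinder (see below) and log satisfies this module's logger.Logger. Import paths for the module's packages are omitted because the full module path is not shown on this page, and the idea that Start returns ErrTrackIsNil for a nil tracking number is an inference from the error's name:

func runCrawl(repo crawler.SpiderFinder, log logger.Logger, track *models.TrackingNumber) (*models.Crawler, error) {
	// Share one HTTP client across all spiders started by this Manager.
	client := &http.Client{Timeout: 15 * time.Second}
	mgr := crawler.NewCrawlerManager(repo, log, client)

	res, err := mgr.Start(track)
	if errors.Is(err, crawler.ErrTrackIsNil) {
		// Presumably returned when track is nil; guard before retrying.
		return nil, fmt.Errorf("missing tracking number: %w", err)
	}
	return res, err
}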

func (*Manager) Start

func (c *Manager) Start(track *models.TrackingNumber) (*models.Crawler, error)

Start runs spiders in parallel and waits for all of them to finish. For each spider, new scraper arguments are created so the tracking number can be substituted into the URL, body, headers, and so on.

After scraping, each scraper's result is saved to the results; errors are saved there as well.

A spider can find another tracking number in its result and start a new spider for that tracking number. A generic sketch of this fan-out pattern follows.
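The sketch below illustrates the pattern described above, not this package's actual implementation: the result shape, the find/scrape signatures, and the deduplication of already-seen tracking numbers are all invented for illustration.

package main

import (
	"fmt"
	"sync"
)

// result is an invented shape holding one spider's outcome.
type result struct {
	spider string
	body   string
	err    error
}

// crawlAll fans out one goroutine per matching spider, waits for all of
// them, and re-queues any tracking numbers discovered in the results.
// find maps a tracking number to spider names; scrape runs one spider with
// the tracking number substituted into its URL/body/headers and may return
// newly discovered tracking numbers.
func crawlAll(
	find func(track string) []string,
	scrape func(spider, track string) (body string, found []string, err error),
	seed string,
) []result {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []result
		seen    = map[string]bool{}
	)
	var start func(track string)
	start = func(track string) {
		mu.Lock()
		if seen[track] { // avoid crawling the same number twice
			mu.Unlock()
			return
		}
		seen[track] = true
		mu.Unlock()
		for _, sp := range find(track) {
			wg.Add(1)
			go func(sp string) {
				defer wg.Done()
				body, found, err := scrape(sp, track)
				mu.Lock()
				// Errors are saved alongside successful results.
				results = append(results, result{spider: sp, body: body, err: err})
				mu.Unlock()
				for _, t := range found {
					start(t) // a discovered tracking number starts new spiders
				}
			}(sp)
		}
	}
	start(seed)
	wg.Wait()
	return results
}

func main() {
	find := func(track string) []string { return []string{"carrierA", "carrierB"} }
	scrape := func(spider, track string) (string, []string, error) {
		return spider + " saw " + track, nil, nil
	}
	for _, r := range crawlAll(find, scrape, "RA123456785CN") {
		fmt.Printf("%s: %q err=%v\n", r.spider, r.body, r.err)
	}
}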

type SpiderFinder

type SpiderFinder interface {
	FindSpidersByTrackingNumber(trackingNumber string) []*models.Spider
}

SpiderFinder finds spiders whose tracking number regexp matches the given tracking number.
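A minimal in-memory SpiderFinder sketch. Pairing each spider with its tracking-number regexp via a side map is an assumption; the real repository may store the pattern on models.Spider itself, whose fields are not shown on this page. Imports of "regexp" and the module's models package are implied:

type regexpFinder struct {
	spiders  []*models.Spider
	patterns map[*models.Spider]*regexp.Regexp
}

func (f *regexpFinder) FindSpidersByTrackingNumber(trackingNumber string) []*models.Spider {
	var matched []*models.Spider
	for _, s := range f.spiders {
		// A spider matches when its regexp accepts the tracking number.
		if re, ok := f.patterns[s]; ok && re.MatchString(trackingNumber) {
			matched = append(matched, s)
		}
	}
	return matched
}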

