crawler

package
v0.0.0-...-28926d1
Published: Feb 8, 2024 License: MIT Imports: 6 Imported by: 0

Documentation

Index

Constants

const (
	StatusFinish    = "finish"
	StatusRunning   = "running"
	StatusNoSpiders = "no_spiders"
)
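
For illustration, one way these status values might be interpreted by a caller. How the current status is exposed by the package is not documented on this page, so the status argument below is a stand-in:

func describeStatus(status string) string {
	switch status {
	case crawler.StatusFinish:
		return "crawl finished: all spiders completed"
	case crawler.StatusRunning:
		return "spiders are still running"
	case crawler.StatusNoSpiders:
		return "no spiders matched the tracking number"
	default:
		return "unknown status: " + status
	}
}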

Variables

var (
	ErrTrackIsNil = errors.New("tracking number is nil")
)

Functions

This section is empty.

Types

type Manager

type Manager struct {
	SpiderFinder SpiderFinder
	// contains filtered or unexported fields
}

Manager creates Crawler instances to start spiders. A Crawler executes N tasks in goroutines and collects the results.

func NewCrawlerManager

func NewCrawlerManager(repo SpiderFinder, log logger.Logger, client *http.Client) *Manager
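A hedged usage sketch, assuming repo implements SpiderFinder (see below) and log satisfies this module's logger.Logger. Import paths for the module's packages are omitted because the full module path is not shown on this page, and the idea that Start returns ErrTrackIsNil for a nil tracking number is an inference from the error's name:

func runCrawl(repo crawler.SpiderFinder, log logger.Logger, track *models.TrackingNumber) (*models.Crawler, error) {
	// Share one HTTP client across all spiders started by this Manager.
	client := &http.Client{Timeout: 15 * time.Second}
	mgr := crawler.NewCrawlerManager(repo, log, client)

	res, err := mgr.Start(track)
	if errors.Is(err, crawler.ErrTrackIsNil) {
		// Presumably returned when track is nil; guard before retrying.
		return nil, fmt.Errorf("missing tracking number: %w", err)
	}
	return res, err
}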

func (*Manager) Start

func (c *Manager) Start(track *models.TrackingNumber) (*models.Crawler, error)

Start runs spiders in parallel and waits for all of them to finish. For each spider, new scraper arguments are created so the tracking number can be substituted into the URL, body, headers, and so on.

After scraping, each scraper's result is saved to the results; errors are saved there as well.

A spider can find another tracking number in its result and start a new spider for that tracking number. A generic sketch of this fan-out pattern follows.
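The sketch below illustrates the pattern described above, not this package's actual implementation: the result shape, the find/scrape signatures, and the deduplication of already-seen tracking numbers are all invented for illustration.

package main

import (
	"fmt"
	"sync"
)

// result is an invented shape holding one spider's outcome.
type result struct {
	spider string
	body   string
	err    error
}

// crawlAll fans out one goroutine per matching spider, waits for all of
// them, and re-queues any tracking numbers discovered in the results.
// find maps a tracking number to spider names; scrape runs one spider with
// the tracking number substituted into its URL/body/headers and may return
// newly discovered tracking numbers.
func crawlAll(
	find func(track string) []string,
	scrape func(spider, track string) (body string, found []string, err error),
	seed string,
) []result {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []result
		seen    = map[string]bool{}
	)
	var start func(track string)
	start = func(track string) {
		mu.Lock()
		if seen[track] { // avoid crawling the same number twice
			mu.Unlock()
			return
		}
		seen[track] = true
		mu.Unlock()
		for _, sp := range find(track) {
			wg.Add(1)
			go func(sp string) {
				defer wg.Done()
				body, found, err := scrape(sp, track)
				mu.Lock()
				// Errors are saved alongside successful results.
				results = append(results, result{spider: sp, body: body, err: err})
				mu.Unlock()
				for _, t := range found {
					start(t) // a discovered tracking number starts new spiders
				}
			}(sp)
		}
	}
	start(seed)
	wg.Wait()
	return results
}

func main() {
	find := func(track string) []string { return []string{"carrierA", "carrierB"} }
	scrape := func(spider, track string) (string, []string, error) {
		return spider + " saw " + track, nil, nil
	}
	for _, r := range crawlAll(find, scrape, "RA123456785CN") {
		fmt.Printf("%s: %q err=%v\n", r.spider, r.body, r.err)
	}
}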

type SpiderFinder

type SpiderFinder interface {
	FindSpidersByTrackingNumber(trackingNumber string) []*models.Spider
}

SpiderFinder finds spiders whose tracking number regexp matches the given tracking number.
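A minimal in-memory SpiderFinder sketch. Pairing each spider with its tracking-number regexp via a side map is an assumption; the real repository may store the pattern on models.Spider itself, whose fields are not shown on this page. Imports of "regexp" and the module's models package are implied:

type regexpFinder struct {
	spiders  []*models.Spider
	patterns map[*models.Spider]*regexp.Regexp
}

func (f *regexpFinder) FindSpidersByTrackingNumber(trackingNumber string) []*models.Spider {
	var matched []*models.Spider
	for _, s := range f.spiders {
		// A spider matches when its regexp accepts the tracking number.
		if re, ok := f.patterns[s]; ok && re.MatchString(trackingNumber) {
			matched = append(matched, s)
		}
	}
	return matched
}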

