scraper

package
Version: v0.0.1 (Latest)
Published: Aug 30, 2018 | License: MIT | Imports: 7 | Imported by: 0

Warning: This package is not in the latest version of its module.

Documentation

Overview

Package scraper implements a web scraper as a library.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type State

type State struct {
	Timeout time.Duration    // Timeout for each individual item
	Getter  getter.Interface // Getter gets the page
	Parser  parser.Interface // Parser parses links
	Queuer  queuer.Interface // Queuer queues new items and starts queued items
	Logger  logger.Interface // Logger logs the results
}

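State implements a web scraper. Its behavior comes from four pluggable components (Getter, Parser, Queuer, and Logger), and Timeout bounds the time spent on each individual item.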
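The composition that State describes can be illustrated with a minimal, self-contained sketch. The method sets of getter.Interface, parser.Interface, and logger.Interface are not shown on this page, so the interfaces below (pageGetter, linkParser, eventLogger) and the scraperState type are hypothetical stand-ins, not the package's real API. The queuer is left out to keep the sketch short.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// Hypothetical interfaces: the real getter, parser, and logger
// interfaces may have different methods.
type pageGetter interface {
	Get(ctx context.Context, url string) (string, error)
}

type linkParser interface {
	Parse(body string) []string
}

type eventLogger interface {
	Log(msg string)
}

// scraperState mirrors the shape of State: a per-item timeout plus
// pluggable components supplied by the caller.
type scraperState struct {
	Timeout time.Duration
	Getter  pageGetter
	Parser  linkParser
	Logger  eventLogger
}

// fetchLinks gets one page under the per-item timeout, parses its
// links, and logs the outcome.
func (s *scraperState) fetchLinks(ctx context.Context, url string) ([]string, error) {
	ctx, cancel := context.WithTimeout(ctx, s.Timeout)
	defer cancel()
	body, err := s.Getter.Get(ctx, url)
	if err != nil {
		s.Logger.Log("error: " + url)
		return nil, err
	}
	links := s.Parser.Parse(body)
	s.Logger.Log(fmt.Sprintf("ok: %s (%d links)", url, len(links)))
	return links, nil
}

// Mock components, in the spirit of mockgetter, mockparser, and mocklogger.
type mapGetter map[string]string

func (m mapGetter) Get(_ context.Context, url string) (string, error) {
	body, ok := m[url]
	if !ok {
		return "", fmt.Errorf("no mock page for %q", url)
	}
	return body, nil
}

// fieldsParser treats every whitespace-separated token as a link.
type fieldsParser struct{}

func (fieldsParser) Parse(body string) []string { return strings.Fields(body) }

// sliceLogger stores each logged message so a test can inspect it.
type sliceLogger struct{ lines []string }

func (l *sliceLogger) Log(msg string) { l.lines = append(l.lines, msg) }

// demo wires the mocks together and fetches a single URL.
func demo(url string) (links []string, logs []string, err error) {
	logger := &sliceLogger{}
	s := &scraperState{
		Timeout: time.Second,
		Getter:  mapGetter{"http://a.test/": "http://a.test/x http://a.test/y"},
		Parser:  fieldsParser{},
		Logger:  logger,
	}
	links, err = s.fetchLinks(context.Background(), url)
	return links, logger.lines, err
}

func main() {
	links, logs, err := demo("http://a.test/")
	fmt.Println(links, logs, err)
}
```

Because each component is an interface, a test can swap the network-backed getter for a map-backed one without touching the scraping logic, which is presumably why the module ships mock and real implementations side by side.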

func (*State) Start

func (s *State) Start(ctx context.Context, url string)

Start begins scraping at the given base URL. Cancel ctx to stop early.
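Ending a run early through context cancellation might look like the sketch below. It uses only the standard library. The crawl function is a hypothetical stand-in for a long-running call such as Start, and it assumes that such a call returns once its context is cancelled; this page does not say whether Start blocks.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// crawl is a hypothetical stand-in for a long-running scrape: it
// "processes" one item every perItem until ctx is cancelled, then
// returns the number of items handled.
func crawl(ctx context.Context, perItem time.Duration) int {
	processed := 0
	for {
		select {
		case <-ctx.Done():
			return processed
		case <-time.After(perItem):
			processed++
		}
	}
}

// crawlForMillis runs crawl and cancels its context after limitMs
// milliseconds, which makes crawl return early.
func crawlForMillis(limitMs, perItemMs int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Cancel from another goroutine once the limit has elapsed.
	timer := time.AfterFunc(time.Duration(limitMs)*time.Millisecond, cancel)
	defer timer.Stop()

	return crawl(ctx, time.Duration(perItemMs)*time.Millisecond)
}

func main() {
	n := crawlForMillis(200, 5)
	fmt.Println("items processed before cancellation:", n)
}
```

In a command-line tool the cancel function would more typically be triggered by an interrupt signal or a deadline than by a fixed timer.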

Directories

Path                Synopsis
getter              Package getter defines an interface that is used to request results by URL
  mockgetter        Package mockgetter defines a getter.Interface that returns mock results for use in tests
  webgetter         Package webgetter defines a getter.Interface that gets real results by HTTP
logger              Package logger defines an interface that is used to log events and metrics during execution
  consolelogger     Package consolelogger defines a logger.Interface that emits logs to a writer (usually the console)
  mocklogger        Package mocklogger defines a logger.Interface that stores a string representation of each logged event for testing
parser              Package parser defines an interface used to parse HTML and extract links
  htmlparser        Package htmlparser defines a parser.Interface that parses HTML and returns the urls from anchor href attributes
  mockparser        Package mockparser defines a parser.Interface that returns dummy urls for a given input, and is used in tests
queuer              Package queuer defines an interface used to queue and execute an action on items
  concurrentqueuer  Package concurrentqueuer defines a queuer.Interface that runs several workers concurrently on a queue
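
The idea behind a concurrent queuer, several workers draining one queue, can be sketched with channels and a sync.WaitGroup. This is a generic worker pool, not the concurrentqueuer implementation: runWorkers and its parameters are hypothetical, and unlike a real scraper queue it works on a fixed list of items instead of accepting new ones while it runs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// runWorkers starts n workers (n must be at least 1) that apply action
// to every item on a shared queue. It returns the results in sorted
// order once the queue is drained.
func runWorkers(n int, items []string, action func(string) string) []string {
	queue := make(chan string)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls items until the queue is closed.
			for item := range queue {
				results <- action(item)
			}
		}()
	}

	// Feed the queue, then close it so the workers can exit.
	go func() {
		for _, item := range items {
			queue <- item
		}
		close(queue)
	}()

	// Close results after every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // completion order is nondeterministic
	return out
}

func main() {
	urls := []string{"http://a.test/", "http://b.test/", "http://c.test/"}
	fmt.Println(runWorkers(2, urls, strings.ToUpper))
}
```

Sorting the results is only there to make the output deterministic; a scraper would more likely hand each result to its logger as it arrives.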
