scrapers

package
v0.0.0-...-61abd5f
Warning

This package is not in the latest version of its module.

Published: Jun 16, 2015 License: MIT Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type AsyncSiteScraper

type AsyncSiteScraper struct {
	BaseScraper
	// contains filtered or unexported fields
}

AsyncSiteScraper scrapes an entire site asynchronously.

func NewAsyncSiteScraper

func NewAsyncSiteScraper(siteURL string) (*AsyncSiteScraper, error)

NewAsyncSiteScraper initializes a new AsyncSiteScraper.

func (*AsyncSiteScraper) Scrape

func (s *AsyncSiteScraper) Scrape() error

Scrape the site for links.
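
The asynchronous implementation is unexported, so the following is only a minimal sketch of how a concurrent crawl of this kind could be structured. It does not import this package: `fetchLinks`, `crawlAsync`, and the in-memory `site` map are hypothetical names used for illustration, and the map stands in for real network fetches.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetchLinks is a hypothetical stand-in for fetching a page and extracting
// its links; it reads from an in-memory site map instead of the network.
func fetchLinks(site map[string][]string, page string) []string {
	return site[page]
}

// crawlAsync visits every page reachable from start, handling each page in
// its own goroutine. A mutex guards the shared visited set.
func crawlAsync(site map[string][]string, start string) []string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		visited = map[string]bool{start: true}
	)

	var visit func(page string)
	visit = func(page string) {
		defer wg.Done()
		for _, link := range fetchLinks(site, page) {
			mu.Lock()
			seen := visited[link]
			visited[link] = true
			mu.Unlock()
			if !seen {
				wg.Add(1) // registered before this goroutine calls Done
				go visit(link)
			}
		}
	}

	wg.Add(1)
	go visit(start)
	wg.Wait()

	links := make([]string, 0, len(visited))
	for link := range visited {
		links = append(links, link)
	}
	sort.Strings(links) // goroutine scheduling is nondeterministic, so sort
	return links
}

func main() {
	site := map[string][]string{
		"/":      {"/about", "/blog"},
		"/about": {"/"},
		"/blog":  {"/blog/1", "/about"},
	}
	fmt.Println(crawlAsync(site, "/")) // prints [/ /about /blog /blog/1]
}
```

Marking a link as visited before spawning its goroutine ensures each page is processed at most once, even when several pages link to it at the same time.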

type BaseScraper

type BaseScraper struct {
	// contains filtered or unexported fields
}

BaseScraper is the base scraper for HTML sites/pages.

func (*BaseScraper) UniqueLinks

func (s *BaseScraper) UniqueLinks() []string

UniqueLinks returns all the unique internal links that were found.
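
How BaseScraper tracks uniqueness is not documented. As an illustration only, one common way to deduplicate a list of links while keeping first-seen order is a set built on a map; `uniqueLinks` below is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// uniqueLinks returns links with duplicates removed, keeping first-seen order.
func uniqueLinks(links []string) []string {
	seen := make(map[string]struct{}, len(links))
	unique := make([]string, 0, len(links))
	for _, link := range links {
		if _, ok := seen[link]; ok {
			continue // already recorded
		}
		seen[link] = struct{}{}
		unique = append(unique, link)
	}
	return unique
}

func main() {
	fmt.Println(uniqueLinks([]string{"/a", "/b", "/a", "/c", "/b"})) // prints [/a /b /c]
}
```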

type PageScraper

type PageScraper struct {
	BaseScraper
	// contains filtered or unexported fields
}

PageScraper scrapes a single page of a site for links.

func NewPageScraper

func NewPageScraper(siteURL string, pageURL string) *PageScraper

NewPageScraper initializes a PageScraper.

func (*PageScraper) Scrape

func (s *PageScraper) Scrape() error

Scrape the page for links.
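
NewPageScraper takes both a site URL and a page URL. The sketch below shows one plausible use for that pair, under the assumption that hrefs are resolved against the page URL and kept only when they stay on the site's host. `internalLinks` is a hypothetical helper built on the standard `net/url` package; the package's actual rules for what counts as internal are not documented here.

```go
package main

import (
	"fmt"
	"net/url"
)

// internalLinks resolves each href against pageURL and keeps those whose
// host matches siteURL. Hrefs that fail to parse are skipped.
func internalLinks(siteURL, pageURL string, hrefs []string) ([]string, error) {
	site, err := url.Parse(siteURL)
	if err != nil {
		return nil, err
	}
	page, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}

	var internal []string
	for _, href := range hrefs {
		ref, err := url.Parse(href)
		if err != nil {
			continue
		}
		abs := page.ResolveReference(ref) // turn relative hrefs into absolute URLs
		if abs.Host == site.Host {
			internal = append(internal, abs.String())
		}
	}
	return internal, nil
}

func main() {
	links, err := internalLinks(
		"https://example.com",
		"https://example.com/blog/",
		[]string{"post-1", "/about", "https://other.example/x"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(links) // prints [https://example.com/blog/post-1 https://example.com/about]
}
```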

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser parses an HTML page and stores the found links.

func NewParser

func NewParser(reader io.Reader) *Parser

NewParser initializes a new Parser.

func (*Parser) Links

func (p *Parser) Links() []string

Links returns the found links.

func (*Parser) Parse

func (p *Parser) Parse() error

Parse an HTML page and store the found links.

type Scraper

type Scraper interface {
	Scrape() error
	UniqueLinks() []string
}

Scraper defines a site/page scraper.
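
Because the site and page scrapers all provide Scrape and UniqueLinks, calling code can be written against the interface alone. The sketch below redeclares the interface locally so that it runs without importing this package; `stubScraper` and `run` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Scraper mirrors the interface documented above.
type Scraper interface {
	Scrape() error
	UniqueLinks() []string
}

// stubScraper is a hypothetical in-memory implementation for illustration.
type stubScraper struct {
	found []string // links the stub pretends to discover
	links []string // links exposed after Scrape has run
}

func (s *stubScraper) Scrape() error {
	s.links = s.found
	return nil
}

func (s *stubScraper) UniqueLinks() []string { return s.links }

// run works with any Scraper: it scrapes first, then returns the links.
func run(s Scraper) ([]string, error) {
	if err := s.Scrape(); err != nil {
		return nil, fmt.Errorf("scrape: %w", err)
	}
	return s.UniqueLinks(), nil
}

func main() {
	links, err := run(&stubScraper{found: []string{"/", "/about"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(links) // prints [/ /about]
}
```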

type SyncSiteScraper

type SyncSiteScraper struct {
	BaseScraper
	// contains filtered or unexported fields
}

SyncSiteScraper scrapes an entire site synchronously.

func NewSyncSiteScraper

func NewSyncSiteScraper(siteURL string) (*SyncSiteScraper, error)

NewSyncSiteScraper initializes a new SyncSiteScraper.

func (*SyncSiteScraper) Scrape

func (s *SyncSiteScraper) Scrape() error

Scrape the site for links.
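
The traversal order used by SyncSiteScraper is not documented. As a minimal sketch of a synchronous crawl, the example below walks an in-memory link graph breadth-first, one page at a time. `crawlSync` and the `site` map are hypothetical and stand in for real page fetches.

```go
package main

import "fmt"

// crawlSync visits pages breadth-first, one at a time, starting from start,
// and returns them in the order they were visited.
func crawlSync(site map[string][]string, start string) []string {
	visited := map[string]bool{start: true}
	queue := []string{start}
	var order []string

	for len(queue) > 0 {
		page := queue[0]
		queue = queue[1:]
		order = append(order, page)

		for _, link := range site[page] {
			if !visited[link] {
				visited[link] = true
				queue = append(queue, link)
			}
		}
	}
	return order
}

func main() {
	site := map[string][]string{
		"/":      {"/about", "/blog"},
		"/about": {"/"},
		"/blog":  {"/blog/1", "/about"},
	}
	fmt.Println(crawlSync(site, "/")) // prints [/ /about /blog /blog/1]
}
```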
