scrape

package
v0.1.0
Published: May 28, 2021 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Constants

const DaumNews = "news.daum.net"

DaumNews is the root domain that the scraper visits for Daum news.

const NaverNews = "news.naver.com"

NaverNews is the root domain that the scraper visits for Naver news.

Variables

This section is empty.

Functions

func OnHTMLDaumHeadlineNews

func OnHTMLDaumHeadlineNews(dc chan<- gachifinder.GachiData, s *Scrape)

OnHTMLDaumHeadlineNews registers handlers that sub-visit and parse the scraped "news.daum.net" HTML.

func OnHTMLNaverHeadlineNews

func OnHTMLNaverHeadlineNews(dc chan<- gachifinder.GachiData, s *Scrape)

OnHTMLNaverHeadlineNews registers handlers that sub-visit and parse the scraped "news.naver.com" HTML.

Types

type ParsingHandler

type ParsingHandler func(chan<- gachifinder.GachiData, *Scrape)

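ParsingHandler is the function type accepted by Do. A handler receives a send-only gachifinder.GachiData channel and the *Scrape it is attached to; OnHTMLDaumHeadlineNews and OnHTMLNaverHeadlineNews both match this signature.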
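The shape of a ParsingHandler can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: GachiData and Scrape are local stand-ins (the real fields of gachifinder.GachiData are not shown on this page), and exampleHandler and runHandlers are hypothetical names. The real handlers register parsing callbacks on the scraper, whereas this sketch sends a fixed item directly to keep the example runnable without network access.

```go
package main

import "fmt"

// GachiData is a hypothetical stand-in for gachifinder.GachiData.
// The real type's fields are not documented here.
type GachiData struct {
	URL   string
	Title string
}

// Scrape is a hypothetical stand-in for scrape.Scrape.
type Scrape struct{}

// ParsingHandler mirrors the documented signature.
type ParsingHandler func(chan<- GachiData, *Scrape)

// exampleHandler is a hypothetical handler. It sends exactly one fixed
// item instead of registering HTML callbacks.
func exampleHandler(dc chan<- GachiData, s *Scrape) {
	dc <- GachiData{URL: "https://example.com/a", Title: "example headline"}
}

// runHandlers calls every handler with a shared channel and collects
// whatever the handlers sent.
func runHandlers(fs []ParsingHandler) []GachiData {
	// Buffered with one slot per handler, because each handler in this
	// sketch sends a single item before returning.
	dc := make(chan GachiData, len(fs))
	s := &Scrape{}
	for _, f := range fs {
		f(dc, s)
	}
	close(dc)

	var out []GachiData
	for d := range dc {
		out = append(out, d)
	}
	return out
}

func main() {
	for _, d := range runHandlers([]ParsingHandler{exampleHandler}) {
		fmt.Println(d.Title, d.URL)
	}
}
```

Because the handler only ever sees a send-only channel, it cannot read from or close the channel; ownership of the channel stays with the caller.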

type Scrape

type Scrape struct {
	Config *gachifinder.Config
	// contains filtered or unexported fields
}

Scrape holds the scraper's configuration together with its unexported state.

func (*Scrape) Do

func (s *Scrape) Do(fs []ParsingHandler) <-chan gachifinder.GachiData

Do creates a colly collector and a queue, runs them, and waits until scraping is done.

type Scraper

type Scraper interface {
	// Do is a producer in a part of a pipeline
	Do([]ParsingHandler) <-chan gachifinder.GachiData
}

Scraper interface is a crawling actor.
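Since Do is described as a producer stage in a pipeline, a consumer can depend on the Scraper interface alone. The sketch below shows one way that might look with a test double. The types are local stand-ins rather than the real gachifinder and scrape types, fakeScraper and titles are hypothetical names, and the sketch assumes the producer closes the returned channel when it is finished, which this page does not state.

```go
package main

import "fmt"

// Hypothetical stand-ins for gachifinder.GachiData and scrape.Scrape.
type GachiData struct{ Title string }
type Scrape struct{}

// ParsingHandler and Scraper mirror the documented declarations.
type ParsingHandler func(chan<- GachiData, *Scrape)

type Scraper interface {
	// Do is a producer in a part of a pipeline.
	Do([]ParsingHandler) <-chan GachiData
}

// fakeScraper is a hypothetical test double that emits preset items
// instead of crawling anything.
type fakeScraper struct{ items []GachiData }

func (f fakeScraper) Do(_ []ParsingHandler) <-chan GachiData {
	out := make(chan GachiData)
	go func() {
		// Assumption: the producer closes the channel once it is done,
		// so that consumers can range over it.
		defer close(out)
		for _, it := range f.items {
			out <- it
		}
	}()
	return out
}

// titles is a consumer stage: it drains the producer's channel and
// returns the titles in the order they arrived.
func titles(s Scraper, fs []ParsingHandler) []string {
	var ts []string
	for d := range s.Do(fs) {
		ts = append(ts, d.Title)
	}
	return ts
}

func main() {
	s := fakeScraper{items: []GachiData{{Title: "first"}, {Title: "second"}}}
	fmt.Println(titles(s, nil))
}
```

Accepting the interface rather than *Scrape lets downstream stages be tested without network access.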
