Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type EmbedExtractor ¶
type EmbedExtractor interface {
	// RelevantTagNames returns a set of HTML tag names that are relevant to this extractor.
	RelevantTagNames() []string

	// Extract detects if a node should be extracted as an embedded element; if not return nil.
	Extract(node *html.Node) webdoc.Element
}
EmbedExtractor is an interface for extracting embedded nodes into webdoc.Element.
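The interface contract above can be illustrated with a minimal sketch. The code below does not import this package: `Node` and `Element` are hypothetical stand-ins for `html.Node` and `webdoc.Element`, and `iframeExtractor`, `indexByTag`, and `extractEmbed` are names invented for the example. It assumes a caller that looks up extractors by the tag names they declare and treats a nil result from `Extract` as "not an embed"; the library's own dispatch logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for html.Node, reduced to the
// two fields this sketch needs.
type Node struct {
	Tag  string
	Attr map[string]string
}

// Element is a hypothetical stand-in for webdoc.Element.
type Element interface {
	Describe() string
}

// EmbedExtractor mirrors the documented interface, using the stand-in types.
type EmbedExtractor interface {
	RelevantTagNames() []string
	Extract(node *Node) Element
}

// iframeEmbed is a hypothetical element that records an iframe's source URL.
type iframeEmbed struct{ src string }

func (e iframeEmbed) Describe() string { return "iframe embed: " + e.src }

// iframeExtractor is a hypothetical extractor that accepts any iframe
// carrying a non-empty src attribute.
type iframeExtractor struct{}

func (iframeExtractor) RelevantTagNames() []string { return []string{"iframe"} }

func (iframeExtractor) Extract(node *Node) Element {
	if node == nil || !strings.EqualFold(node.Tag, "iframe") {
		return nil // not relevant to this extractor
	}
	src := node.Attr["src"]
	if src == "" {
		return nil // nothing to embed
	}
	return iframeEmbed{src: src}
}

// indexByTag groups extractors by the lower-cased tag names they declare.
func indexByTag(extractors []EmbedExtractor) map[string][]EmbedExtractor {
	index := make(map[string][]EmbedExtractor)
	for _, ex := range extractors {
		for _, tag := range ex.RelevantTagNames() {
			key := strings.ToLower(tag)
			index[key] = append(index[key], ex)
		}
	}
	return index
}

// extractEmbed returns the first non-nil element produced for node, or nil
// when no registered extractor claims it.
func extractEmbed(index map[string][]EmbedExtractor, node *Node) Element {
	if node == nil {
		return nil
	}
	for _, ex := range index[strings.ToLower(node.Tag)] {
		if el := ex.Extract(node); el != nil {
			return el
		}
	}
	return nil
}

func main() {
	index := indexByTag([]EmbedExtractor{iframeExtractor{}})
	nodes := []*Node{
		{Tag: "iframe", Attr: map[string]string{"src": "https://example.com/player"}},
		{Tag: "p"},
	}
	for _, n := range nodes {
		if el := extractEmbed(index, n); el != nil {
			fmt.Println(el.Describe())
		} else {
			fmt.Println("no embed for <" + n.Tag + ">")
		}
	}
}
```

Indexing by tag name means only the extractors that declared a given tag are asked to inspect a node, which is one plausible reason for `RelevantTagNames` to exist alongside `Extract`.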
type ImageExtractor ¶
ImageExtractor treats images as another type of embed and provides heuristics for lead image candidacy.
func NewImageExtractor ¶
func NewImageExtractor(pageURL *nurl.URL, logger logutil.Logger) *ImageExtractor
func (*ImageExtractor) RelevantTagNames ¶
func (ie *ImageExtractor) RelevantTagNames() []string
type TwitterExtractor ¶
TwitterExtractor is used to look for Twitter embeds. It looks for both rendered and unrendered tweets.
func NewTwitterExtractor ¶
func NewTwitterExtractor(pageURL *nurl.URL, logger logutil.Logger) *TwitterExtractor
func (*TwitterExtractor) Extract ¶
func (te *TwitterExtractor) Extract(node *html.Node) webdoc.Element
func (*TwitterExtractor) RelevantTagNames ¶
func (te *TwitterExtractor) RelevantTagNames() []string
type VimeoExtractor ¶
VimeoExtractor is used for extracting Vimeo videos and relevant information.
func NewVimeoExtractor ¶
func NewVimeoExtractor(pageURL *nurl.URL, logger logutil.Logger) *VimeoExtractor
func (*VimeoExtractor) RelevantTagNames ¶
func (ve *VimeoExtractor) RelevantTagNames() []string
type YouTubeExtractor ¶
YouTubeExtractor is used for extracting YouTube videos and relevant information.
func NewYouTubeExtractor ¶
func NewYouTubeExtractor(pageURL *nurl.URL, logger logutil.Logger) *YouTubeExtractor
func (*YouTubeExtractor) Extract ¶
func (ye *YouTubeExtractor) Extract(node *html.Node) webdoc.Element
func (*YouTubeExtractor) RelevantTagNames ¶
func (ye *YouTubeExtractor) RelevantTagNames() []string
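As a rough illustration of the kind of "relevant information" a YouTube embed carries, the sketch below pulls a video ID and query parameters out of an embed URL using only the standard library. It is not this package's implementation: `youTubeEmbedInfo` and `parseYouTubeEmbedURL` are hypothetical names, and the assumption that embeds use a `/embed/<id>` path on `youtube.com` or `youtube-nocookie.com` is made for the example only.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// youTubeEmbedInfo is a hypothetical result type for this sketch.
type youTubeEmbedInfo struct {
	VideoID string
	Params  map[string]string
}

// parseYouTubeEmbedURL extracts the video ID and query parameters from an
// embed URL of the assumed form https://www.youtube.com/embed/<id>?key=value.
// The boolean result is false when the URL does not match that form.
func parseYouTubeEmbedURL(rawURL string) (youTubeEmbedInfo, bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return youTubeEmbedInfo{}, false
	}

	// Accept only the hosts assumed for this example.
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	if host != "youtube.com" && host != "youtube-nocookie.com" {
		return youTubeEmbedInfo{}, false
	}

	// The video ID is the single path segment after /embed/.
	const prefix = "/embed/"
	if !strings.HasPrefix(u.Path, prefix) {
		return youTubeEmbedInfo{}, false
	}
	id := strings.Trim(strings.TrimPrefix(u.Path, prefix), "/")
	if id == "" || strings.Contains(id, "/") {
		return youTubeEmbedInfo{}, false
	}

	// Keep the first value of each query parameter.
	params := make(map[string]string)
	for key, values := range u.Query() {
		if len(values) > 0 {
			params[key] = values[0]
		}
	}
	return youTubeEmbedInfo{VideoID: id, Params: params}, true
}

func main() {
	info, ok := parseYouTubeEmbedURL("https://www.youtube.com/embed/abc123?start=30")
	if !ok {
		fmt.Println("not a YouTube embed URL")
		return
	}
	fmt.Println("video ID:", info.VideoID)
	fmt.Println("start:", info.Params["start"])
}
```

In a real extractor the URL would presumably come from an attribute of the node passed to `Extract`, and a non-matching URL would translate into a nil return.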