docfilter

package
v0.0.0-...-25b8d04
Published: Sep 26, 2024 License: MIT Imports: 6 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type LeadImageFinder

type LeadImageFinder struct {
	// contains filtered or unexported fields
}

LeadImageFinder identifies a lead image for an article (sometimes known as a mast). Each candidate image is scored using several heuristics, including:

- Whether an ancestor node of the image is a "figure" tag.
- Whether the image is close to the content.

The original dom-distiller uses two more heuristics, which cannot be implemented here because they would require computing the stylesheet (NEED-COMPUTE-CSS):

- The ratio of width to height.
- The area of the image (width * height) relative to its container.
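The two heuristics that do not depend on CSS (a "figure" ancestor and closeness to content) can be illustrated with a minimal sketch. It is not the package's implementation: the node and candidate types, the distanceToContent field, and the weights are all hypothetical and chosen only to show how per-image scores might be combined and compared.

```go
package main

import "fmt"

// node is a hypothetical, minimal stand-in for a DOM element.
type node struct {
	tag    string
	parent *node
}

// candidate is a hypothetical pairing of an image node with a
// precomputed distance to the nearest content (0 means adjacent).
type candidate struct {
	img               *node
	distanceToContent int
}

// hasFigureAncestor reports whether any ancestor of n is a "figure" element.
func hasFigureAncestor(n *node) bool {
	for p := n.parent; p != nil; p = p.parent {
		if p.tag == "figure" {
			return true
		}
	}
	return false
}

// score combines the two heuristics. The weights are illustrative
// assumptions, not values taken from the package.
func score(c candidate) float64 {
	s := 0.0
	if hasFigureAncestor(c.img) {
		s += 1.0
	}
	// Images closer to content contribute more.
	s += 1.0 / float64(1+c.distanceToContent)
	return s
}

// pickLead returns the index of the highest-scoring candidate,
// or -1 when there are no candidates.
func pickLead(cs []candidate) int {
	best := -1
	bestScore := 0.0
	for i, c := range cs {
		if s := score(c); best == -1 || s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	figure := &node{tag: "figure"}
	inFigure := &node{tag: "img", parent: figure}
	bare := &node{tag: "img"}

	cs := []candidate{
		{img: bare, distanceToContent: 1},
		{img: inFigure, distanceToContent: 0},
	}
	fmt.Println("lead image index:", pickLead(cs))
}
```

Summing independent heuristic scores keeps each rule easy to test in isolation, and further heuristics could be added without changing how candidates are compared.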

func NewLeadImageFinder

func NewLeadImageFinder(logger logutil.Logger) *LeadImageFinder

func (*LeadImageFinder) Process

func (f *LeadImageFinder) Process(doc *webdoc.Document) bool

type NestedElementRetainer

type NestedElementRetainer struct{}

func NewNestedElementRetainer

func NewNestedElementRetainer() *NestedElementRetainer

func (*NestedElementRetainer) Process

func (f *NestedElementRetainer) Process(doc *webdoc.Document) bool

type RelevantElements

type RelevantElements struct{}

func NewRelevantElements

func NewRelevantElements() *RelevantElements

func (*RelevantElements) Process

func (f *RelevantElements) Process(doc *webdoc.Document) bool
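All three types expose a Process method with the same shape, so they could plausibly be run through a common interface. The sketch below is self-contained and therefore uses a stand-in Document type instead of webdoc.Document. The Filter interface, runFilters, and dropEmpty are hypothetical names, and treating the bool result as "the filter changed the document" is an assumption that the documentation above does not confirm.

```go
package main

import "fmt"

// Document is a stand-in for webdoc.Document, reduced to a list of
// text elements so that the example compiles on its own.
type Document struct {
	Elements []string
}

// Filter is a hypothetical interface that mirrors the shape of the
// Process methods listed above.
type Filter interface {
	Process(doc *Document) bool
}

// dropEmpty is a toy filter that removes empty elements. It returns
// true when it removed at least one element (assumed meaning).
type dropEmpty struct{}

func (dropEmpty) Process(doc *Document) bool {
	kept := doc.Elements[:0]
	changed := false
	for _, e := range doc.Elements {
		if e == "" {
			changed = true
			continue
		}
		kept = append(kept, e)
	}
	doc.Elements = kept
	return changed
}

// runFilters applies each filter in order and reports whether any
// of them returned true.
func runFilters(doc *Document, filters ...Filter) bool {
	changed := false
	for _, f := range filters {
		if f.Process(doc) {
			changed = true
		}
	}
	return changed
}

func main() {
	doc := &Document{Elements: []string{"title", "", "body"}}
	changed := runFilters(doc, dropEmpty{})
	fmt.Println(changed, doc.Elements)
}
```

With the real package, the equivalent values would presumably come from NewLeadImageFinder, NewNestedElementRetainer, and NewRelevantElements, each applied to the same *webdoc.Document.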
