english

package
v0.0.0-...-25b8d04 Latest
Warning

This package is not in the latest version of its module.

Published: Sep 26, 2024 License: MIT Imports: 5 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type NumWordsRulesClassifier

type NumWordsRulesClassifier struct{}

NumWordsRulesClassifier classifies TextBlocks as content or not-content using rules derived with the C4.8 machine learning algorithm, as described in the paper "Boilerplate Detection using Shallow Text Features" (WSDM 2010). The rules rely mainly on the number of words per block and the link density per block.
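
The rule-based approach described above can be sketched as a small decision tree over word counts and link densities of the previous, current, and next block. This is a minimal illustration, not this package's implementation: the `block` type and the `isContent` function are hypothetical, and the thresholds are assumptions modeled on the decision tree in the Java boilerpipe library, so they may differ from what this package uses.

```go
package main

import "fmt"

// block is a hypothetical stand-in for the package's TextBlock type;
// only the two features the rules need are modeled here.
type block struct {
	NumWords    int
	LinkDensity float64 // fraction of words that sit inside anchor text
}

// isContent applies a small decision tree to the current block, using its
// previous and next neighbors as context.
// All thresholds are assumptions modeled on the Java boilerpipe rules.
func isContent(prev, curr, next block) bool {
	if curr.LinkDensity > 0.333333 {
		return false // link-heavy blocks are treated as navigation
	}
	if prev.LinkDensity <= 0.555556 {
		// A short block between short neighbors is likely boilerplate.
		if curr.NumWords <= 16 && next.NumWords <= 15 && prev.NumWords <= 4 {
			return false
		}
		return true
	}
	// The previous block is link-heavy: require more words before accepting.
	if curr.NumWords <= 40 && next.NumWords <= 17 {
		return false
	}
	return true
}

func main() {
	blocks := []block{
		{NumWords: 3, LinkDensity: 1.0},
		{NumWords: 55, LinkDensity: 0.02},
		{NumWords: 4, LinkDensity: 0.0},
	}
	for i, curr := range blocks {
		var prev, next block // zero-value neighbors at the edges
		if i > 0 {
			prev = blocks[i-1]
		}
		if i < len(blocks)-1 {
			next = blocks[i+1]
		}
		fmt.Printf("block %d content=%v\n", i, isContent(prev, curr, next))
	}
}
```

Using zero-value neighbors at the start and end of the document is one possible convention for edge blocks; it is chosen here only to keep the sketch short.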

func NewNumWordsRulesClassifier

func NewNumWordsRulesClassifier() *NumWordsRulesClassifier

func (*NumWordsRulesClassifier) Process

type TerminatingBlocksFinder

type TerminatingBlocksFinder struct{}

TerminatingBlocksFinder finds blocks that potentially indicate the end of an article's text and marks them with label.StrictlyNotContent.
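
As a rough illustration of the idea, a finder of this kind might scan short blocks for phrases that typically follow an article body and attach a label to each match. Everything in the sketch below is an assumption: the `block` type, the `strictlyNotContent` constant (a placeholder for label.StrictlyNotContent), the word-count cutoff, and the phrase lists are hypothetical and are not taken from this package.

```go
package main

import (
	"fmt"
	"strings"
)

// block is a hypothetical stand-in for the package's TextBlock type.
type block struct {
	Text     string
	NumWords int
	Labels   map[string]bool
}

// strictlyNotContent is a placeholder for label.StrictlyNotContent.
const strictlyNotContent = "StrictlyNotContent"

// Illustrative phrases only; the package's actual list is not shown in its docs.
var (
	endPrefixes   = []string{"comments", "post a comment", "please rate this"}
	endSubstrings = []string{"add your comment", "what you think..."}
)

// indicatesEnd reports whether a short block looks like an article terminator.
func indicatesEnd(b block) bool {
	if b.NumWords >= 15 { // assumption: only short blocks qualify
		return false
	}
	text := strings.ToLower(strings.TrimSpace(b.Text))
	for _, p := range endPrefixes {
		if strings.HasPrefix(text, p) {
			return true
		}
	}
	for _, s := range endSubstrings {
		if strings.Contains(text, s) {
			return true
		}
	}
	return false
}

// markTerminating labels every terminating block in place and reports
// whether at least one such block was found.
func markTerminating(blocks []block) bool {
	found := false
	for i := range blocks {
		if !indicatesEnd(blocks[i]) {
			continue
		}
		if blocks[i].Labels == nil {
			blocks[i].Labels = map[string]bool{}
		}
		blocks[i].Labels[strictlyNotContent] = true
		found = true
	}
	return found
}

func main() {
	blocks := []block{
		{Text: "The council approved the plan on Tuesday.", NumWords: 7},
		{Text: "Post a comment", NumWords: 3},
	}
	found := markTerminating(blocks)
	fmt.Println(found, blocks[1].Labels[strictlyNotContent])
}
```

Restricting the check to short blocks keeps a long paragraph that merely mentions comments from being mislabeled; the cutoff of 15 words is an arbitrary choice for this sketch.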

func NewTerminatingBlocksFinder

func NewTerminatingBlocksFinder() *TerminatingBlocksFinder

func (*TerminatingBlocksFinder) Process
