Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
var NonAlphaFeatureList []string = []string{
	"digit-count",
	"rune-count",
	"dict-words-count",
	"slash-count",
	"colon-count",
	"dash-count",
	"dot-count",
	"whitespace-count",
	"class",
}
NonAlphaFeatureList lists the names of all features other than the letter frequencies.
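As an illustration of how such a list might be combined with the letter-frequency features, the sketch below builds a full list of column names. The `header` function, the lowercase copy `nonAlphaFeatureList`, the choice of one column per letter a-z, and the column order are all assumptions made for this example; the package may lay out its features differently.

```go
package main

import "fmt"

// nonAlphaFeatureList mirrors the exported NonAlphaFeatureList so that
// this sketch is self-contained.
var nonAlphaFeatureList = []string{
	"digit-count",
	"rune-count",
	"dict-words-count",
	"slash-count",
	"colon-count",
	"dash-count",
	"dot-count",
	"whitespace-count",
	"class",
}

// header is a hypothetical helper: it returns one column name per letter
// a-z (an assumed encoding of the letter frequencies), followed by the
// non-alpha feature names.
func header() []string {
	cols := make([]string, 0, 26+len(nonAlphaFeatureList))
	for r := 'a'; r <= 'z'; r++ {
		cols = append(cols, string(r))
	}
	return append(cols, nonAlphaFeatureList...)
}

func main() {
	h := header()
	fmt.Println(len(h), h[0], h[len(h)-1])
}
```

Keeping "class" as the last column, as in the list above, is a common convention for labeled training data, though nothing in the documentation confirms that the package relies on it.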
Functions ¶
func ExtractFeatures ¶
ExtractFeatures extracts features based on a given configuration and a directory containing words of different languages. These features can then be used to train an ML model that automatically classifies scraped fields for new websites.
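The signature of ExtractFeatures is not shown here, so the following is only a minimal sketch of how the simple counting features named in NonAlphaFeatureList could be computed for a single scraped string. The helper `countFeatures` is hypothetical and not part of the package. It leaves out "dict-words-count" (which needs the word directory) and "class" (the training label).

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// countFeatures is a hypothetical helper that computes the purely
// character-based features for one input string.
func countFeatures(s string) map[string]int {
	f := map[string]int{
		"rune-count":  utf8.RuneCountInString(s),
		"slash-count": strings.Count(s, "/"),
		"colon-count": strings.Count(s, ":"),
		"dash-count":  strings.Count(s, "-"),
		"dot-count":   strings.Count(s, "."),
	}
	// Digits and whitespace are classified per rune so that non-ASCII
	// digits and spaces are counted as well.
	for _, r := range s {
		switch {
		case unicode.IsDigit(r):
			f["digit-count"]++
		case unicode.IsSpace(r):
			f["whitespace-count"]++
		}
	}
	return f
}

func main() {
	f := countFeatures("2021-03-04 10:30")
	fmt.Println(f["digit-count"], f["dash-count"], f["colon-count"], f["whitespace-count"])
}
```

A date-like string such as the one in `main` yields high digit, dash, and colon counts, which is the kind of signal a classifier could use to tell a date field apart from, say, a title.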
func TrainModel ¶
Types ¶