package
Version: v0.0.0-...-acd0fa8
Published: Jul 28, 2020
License: MIT
Imports: 10
Imported by: 0
Documentation
GetHTMLDoc gets an HTML document.
GetLinks gets all links in an HTML document.
GetTexts gets the text of all HTML elements matched by a selector.
func OutputJSONL(rows []string)
OutputJSONL outputs JSONL to the dataset directory.
SanitizeHTML sanitizes HTML text without a policy.
UniqStr makes a string slice unique.
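The signature of UniqStr is not shown on this page, so the sketch below only illustrates one common way to make a string slice unique: tracking seen values in a map and keeping the first occurrence of each. The lowercase name `uniqStr` and the order-preserving behavior are assumptions, not the package's confirmed implementation.

```go
package main

import "fmt"

// uniqStr is a hypothetical stand-in for UniqStr. It returns a new slice
// with duplicates removed, keeping the first occurrence of each value
// and preserving the original order.
func uniqStr(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, s := range in {
		if _, ok := seen[s]; ok {
			continue // already emitted this value
		}
		seen[s] = struct{}{}
		out = append(out, s)
	}
	return out
}

func main() {
	links := []string{"/a", "/b", "/a", "/c", "/b"}
	fmt.Println(uniqStr(links)) // prints [/a /b /c]
}
```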