Documentation ¶
Types ¶
type MediaWikiParser ¶
type MediaWikiParser struct{}
func (MediaWikiParser) ParsePage ¶
func (_ MediaWikiParser) ParsePage(res scrape.Response) (*Page, error)
ParsePage parses the raw HTML of a MediaWikiResponse and returns a parse.Page instance.
Can error when:
- The content in the response is not valid HTML
TODO: Add table parsing support
func (MediaWikiParser) ParseSection ¶
ParseSection parses the raw HTML of a MediaWikiResponse and searches for a section containing the heading given by the heading argument. If found, it returns a parse.Page containing only that section (the heading and its corresponding body text).
Can error when:
- The content in the response is not valid HTML
- The heading is not found
FIX: Add support for querying the Intro section, either by taking a flag argument to this function or by checking at the start whether the intro is what the caller wants. This could cause a name clash if the page already has a section titled "Introduction".
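The if-check option from the note above could look roughly like the sketch below. Everything in it is hypothetical: Section, the Intro field on Page, selectSection, and the choice of an empty string as the reserved IntroHeading value are assumptions made for illustration, not part of this package. The empty string is used as the sentinel so that a real section titled "Introduction" stays reachable by name.

```go
package main

import (
	"errors"
	"fmt"
)

// Section is a hypothetical stand-in for however the package stores a
// heading and its body text.
type Section struct {
	Heading string
	Body    string
}

// Page is a simplified stand-in for parse.Page; the Intro and Sections
// fields are assumptions.
type Page struct {
	Intro    string
	Sections []Section
}

// IntroHeading is a hypothetical sentinel. The empty string is assumed never
// to be a real heading, so it cannot clash with a section named "Introduction".
const IntroHeading = ""

// selectSection returns the intro when the sentinel is passed, and otherwise
// looks the heading up among the regular sections.
func selectSection(p Page, heading string) (Section, error) {
	// The simple if-check at the start of the function.
	if heading == IntroHeading {
		return Section{Heading: IntroHeading, Body: p.Intro}, nil
	}
	for _, s := range p.Sections {
		if s.Heading == heading {
			return s, nil
		}
	}
	return Section{}, errors.New("heading not found: " + heading)
}

func main() {
	p := Page{
		Intro:    "Lead text before the first heading.",
		Sections: []Section{{Heading: "Introduction", Body: "A real section with that title."}},
	}
	intro, _ := selectSection(p, IntroHeading)
	named, _ := selectSection(p, "Introduction")
	fmt.Println(intro.Body)
	fmt.Println(named.Body)
}
```

A flag argument would avoid reserving any heading value at all, at the cost of changing the ParseSection signature in the Parser interface.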
type Page ¶
Page represents a wiki- and backend-agnostic container for storing the content of a wiki page.
type Parser ¶
type Parser interface {
	ParsePage(res scrape.Response) (*Page, error)
	// Section should still return the page it belongs to, just with a single section
	ParseSection(res scrape.Response, heading string) (*Page, error)
}
Parser denotes the methods one should implement on a parser struct for a specific wiki in order to handle parsing for different formats.
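As an illustration of what implementing Parser for another format might involve, here is a minimal, self-contained sketch. It does not import this package: Response and Page are local stand-ins for scrape.Response and parse.Page, and their fields (Body, Sections) are assumptions, as is the made-up PlainTextParser and its "== Heading" line format.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Response is a stand-in for scrape.Response; the Body field is an assumption.
type Response struct {
	Body string
}

// Page is a stand-in for parse.Page; the Sections map is an assumption.
type Page struct {
	Sections map[string]string // heading -> body text
}

// Parser mirrors the documented interface, using the stand-in types.
type Parser interface {
	ParsePage(res Response) (*Page, error)
	ParseSection(res Response, heading string) (*Page, error)
}

// PlainTextParser is a hypothetical parser for a made-up format in which a
// line starting with "== " opens a new section.
type PlainTextParser struct{}

// ParsePage collects every section found in the response body.
func (PlainTextParser) ParsePage(res Response) (*Page, error) {
	page := &Page{Sections: map[string]string{}}
	current := ""
	for _, line := range strings.Split(res.Body, "\n") {
		if strings.HasPrefix(line, "== ") {
			current = strings.TrimPrefix(line, "== ")
			page.Sections[current] = ""
			continue
		}
		if current != "" {
			page.Sections[current] += line + "\n"
		}
	}
	return page, nil
}

// ParseSection still returns a Page, but one holding only the requested section.
func (p PlainTextParser) ParseSection(res Response, heading string) (*Page, error) {
	full, err := p.ParsePage(res)
	if err != nil {
		return nil, err
	}
	body, ok := full.Sections[heading]
	if !ok {
		return nil, errors.New("heading not found: " + heading)
	}
	return &Page{Sections: map[string]string{heading: body}}, nil
}

// Compile-time check that PlainTextParser satisfies Parser.
var _ Parser = PlainTextParser{}

func main() {
	var p Parser = PlainTextParser{}
	res := Response{Body: "== History\nFounded long ago.\n== Usage\nRun it."}
	page, err := p.ParseSection(res, "Usage")
	fmt.Println(page.Sections["Usage"], err)
}
```

Code that depends only on the Parser interface can then switch between wiki backends without changing its call sites.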
type ParsingError ¶
ParsingError represents an error encountered during the parsing of a goquery document generated from a scrape.Response body.
func (*ParsingError) Error ¶
func (e *ParsingError) Error() string
Error returns a formatted parsing error message that includes the error code and additional information.