Documentation
Index
- func Parse(fname, url, page string) ([]map[string]interface{}, error)
- func ParseExt(fname, url, page string) (string, error)
- func ParseLinks(page, url string) ([]string, error)
- func ParseNewLinks(page, url string) ([]string, error)
- type DomNode
- type Parser
- func (p *Parser) Do() ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) Parse(page, pageUrl string) ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) ParseURL(url string) ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) RunJs(items []map[string]interface{}) ([]map[string]interface{}, error)
- type Rule
- type UrlTask
Constants
This section is empty.
Variables
This section is empty.
Functions
func ParseLinks
ParseLinks returns all URLs contained in an HTML page.
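The source does not show how ParseLinks is implemented, so the following is only a minimal sketch of the general technique under stated assumptions: it collects href values with a regular expression and resolves them against the page URL using the standard library. The helper name extractLinks, the regular expression, and the deduplication step are all hypothetical and are not part of this package.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// hrefPattern matches quoted href attribute values. A regular expression is
// only a rough stand-in for a real HTML parser and will miss unquoted values.
var hrefPattern = regexp.MustCompile(`(?i)href\s*=\s*["']([^"']+)["']`)

// extractLinks is a hypothetical helper (not this package's ParseLinks).
// It returns the absolute form of every href found in page, resolved
// against base, in order of first appearance.
func extractLinks(page, base string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	seen := make(map[string]bool) // assumption: duplicates are dropped
	var links []string
	for _, m := range hrefPattern.FindAllStringSubmatch(page, -1) {
		ref, err := url.Parse(m[1])
		if err != nil {
			continue // skip malformed hrefs
		}
		abs := baseURL.ResolveReference(ref).String()
		if !seen[abs] {
			seen[abs] = true
			links = append(links, abs)
		}
	}
	return links, nil
}

func main() {
	page := `<a href="/a">A</a> <a href="b.html">B</a> <a href="/a">A again</a>`
	links, err := extractLinks(page, "https://example.com/dir/index.html")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Relative links are resolved with url.ResolveReference, so a path such as b.html becomes an absolute URL under the directory of the page it was found on.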
func ParseNewLinks
ParseNewLinks returns the new URLs contained in an HTML page.
Types
type Parser
type Parser struct {
	Name          string             `json:"name"`
	DefaultFields bool               `json:"default_fields"`
	ZipContent    bool               `json:"zip_content"`
	ExampleUrl    string             `json:"example_url"`
	UA            string             `json:"ua"`
	Urls          []string           `json:"urls"`
	Rules         map[string][]*Rule `json:"rules"`
	Js            string             `json:"js"`
}
Parser contains a set of cascaded rules and optional JavaScript code used to parse the corresponding HTML pages.
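Because every Parser field carries a JSON tag, a parser definition can presumably be stored as JSON. The sketch below illustrates that idea with a local mirror of the struct rather than the package type itself. The names parserConfig and loadConfig, the "root" rule key, and the sample values are hypothetical. Rules is decoded as raw JSON because the fields of Rule are not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parserConfig mirrors the documented Parser fields for illustration only.
// Rules holds raw JSON because the layout of Rule is not documented here.
type parserConfig struct {
	Name          string                       `json:"name"`
	DefaultFields bool                         `json:"default_fields"`
	ZipContent    bool                         `json:"zip_content"`
	ExampleUrl    string                       `json:"example_url"`
	UA            string                       `json:"ua"`
	Urls          []string                     `json:"urls"`
	Rules         map[string][]json.RawMessage `json:"rules"`
	Js            string                       `json:"js"`
}

// loadConfig is a hypothetical helper that decodes a JSON parser definition.
func loadConfig(data []byte) (*parserConfig, error) {
	var cfg parserConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	// Sample definition; the "root" key and empty rule object are placeholders.
	raw := []byte(`{
		"name": "example",
		"default_fields": true,
		"example_url": "https://example.com/list",
		"urls": ["https://example.com/list"],
		"rules": {"root": [{}]}
	}`)
	cfg, err := loadConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Name, cfg.DefaultFields, len(cfg.Urls), len(cfg.Rules["root"]))
}
```

Fields missing from the JSON input, such as zip_content, ua, and js in this sample, keep their zero values after decoding.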
Source Files