Directories
Path | Synopsis |
---|---|
core | |
common/com_interfaces | Package com_interfaces contains common interfaces of the GO_SPIDER project. |
common/config | Package config provides config file parsing. |
common/etc_config | Package etc_config implements config initialization for a single spider. |
common/mlog | Package mlog implements log operations. |
common/page | Package page contains the result fetched by the Downloader. |
common/page_items | Package page_items contains the result parsed by PageProcesser. |
common/request | Package request implements the request entity, which contains the URL and other relevant information. |
common/resource_manage | Package resource_manage implements resource management. |
common/util | Package util contains common functions of the GO_SPIDER project. |
downloader | Package downloader is the main module of GO_SPIDER for downloading pages. |
pipeline | Package pipeline is the persistence and offline processing part of the crawler. |
scheduler | The package is useless. |
spider | Crawl master module. |
example | |
sina_stock_json_processor | The example gets stock news from the site sina.com (http://live.sina.com.cn/zt/f/v/finance/globalnews1). |
extension | |