Documentation ¶
Overview ¶
Pholcus(幽灵蛛)是一款纯Go语言编写的支持分布式的高并发、重量级爬虫软件,定位于互联网数据采集,为具备一定Go或JS编程基础的人提供一个只需关注规则定制的功能强大的爬虫工具。 它支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/kafka/csv/excel等)、有大量Demo共享;另外它还支持横纵向两种抓取模式,支持模拟登录和任务暂停、取消等一系列高级功能。 (官方QQ群:Go大数据 42731170)。
Directories ¶
Path | Synopsis |
---|---|
app interface for graphical user interface.
|
app interface for graphical user interface. |
downloader/surfer
surfer是一款Go语言编写的高并发web下载器,支持 GET/POST/HEAD 方法及 http/https 协议,同时支持固定UserAgent自动保存cookie与随机大量UserAgent禁用cookie两种模式,高度模拟浏览器行为,可实现模拟登录等功能。
|
surfer是一款Go语言编写的高并发web下载器,支持 GET/POST/HEAD 方法及 http/https 协议,同时支持固定UserAgent自动保存cookie与随机大量UserAgent禁用cookie两种模式,高度模拟浏览器行为,可实现模拟登录等功能。 |
downloader/surfer/agent
Package agent generates user agents strings for well known browsers and for custom browsers.
|
Package agent generates user agents strings for well known browsers and for custom browsers. |
pipeline
数据收集
|
数据收集 |
pipeline/collector
结果收集与输出
|
结果收集与输出 |
[spider frame (golang)] Pholcus(幽灵蛛)是一款纯Go语言编写的高并发、分布式、重量级爬虫软件,支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/csv/excel等)、有大量Demo共享;同时她还支持横纵向两种抓取模式,支持模拟登录和任务暂停、取消等一系列高级功能; (官方QQ群:Go大数据 42731170,欢迎加入我们的讨论)。
|
[spider frame (golang)] Pholcus(幽灵蛛)是一款纯Go语言编写的高并发、分布式、重量级爬虫软件,支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/csv/excel等)、有大量Demo共享;同时她还支持横纵向两种抓取模式,支持模拟登录和任务暂停、取消等一系列高级功能; (官方QQ群:Go大数据 42731170,欢迎加入我们的讨论)。 |
common
|
|
config
Package config is used to parse config Usage: import( "github.com/astaxie/beego/config" ) cnf, err := config.NewConfig("ini", "config.conf") cnf APIS: cnf.Set(key, val string) error cnf.String(key string) string cnf.Strings(key string) []string cnf.Int(key string) (int, error) cnf.Int64(key string) (int64, error) cnf.Bool(key string) (bool, error) cnf.Float(key string) (float64, error) cnf.DefaultString(key string, defaultVal string) string cnf.DefaultStrings(key string, defaultVal []string) []string cnf.DefaultInt(key string, defaultVal int) int cnf.DefaultInt64(key string, defaultVal int64) int64 cnf.DefaultBool(key string, defaultVal bool) bool cnf.DefaultFloat(key string, defaultVal float64) float64 cnf.DIY(key string) (interface{}, error) cnf.GetSection(section string) (map[string]string, error) cnf.SaveConfigFile(filename string) error more docs http://beego.me/docs/module/config.md
|
Package config is used to parse config Usage: import( "github.com/astaxie/beego/config" ) cnf, err := config.NewConfig("ini", "config.conf") cnf APIS: cnf.Set(key, val string) error cnf.String(key string) string cnf.Strings(key string) []string cnf.Int(key string) (int, error) cnf.Int64(key string) (int64, error) cnf.Bool(key string) (bool, error) cnf.Float(key string) (float64, error) cnf.DefaultString(key string, defaultVal string) string cnf.DefaultStrings(key string, defaultVal []string) []string cnf.DefaultInt(key string, defaultVal int) int cnf.DefaultInt64(key string, defaultVal int64) int64 cnf.DefaultBool(key string, defaultVal bool) bool cnf.DefaultFloat(key string, defaultVal float64) float64 cnf.DIY(key string) (interface{}, error) cnf.GetSection(section string) (map[string]string, error) cnf.SaveConfigFile(filename string) error more docs http://beego.me/docs/module/config.md |
goquery
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document.
|
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. |
mahonia
This package is a character-set conversion library for Go.
|
This package is a character-set conversion library for Go. |
ping
taken from http://golang.org/src/pkg/net/ipraw_test.go notes: in MAC system, use root user to run.
|
taken from http://golang.org/src/pkg/net/ipraw_test.go notes: in MAC system, use root user to run. |
pool
通用资源池,动态增加资源实例,并支持空闲资源定时回收功能。
|
通用资源池,动态增加资源实例,并支持空闲资源定时回收功能。 |
session
Package session provider Usage: import( "github.com/astaxie/beego/session" ) func init() { globalSessions, _ = session.NewManager("memory", `{"cookieName":"gosessionid", "enableSetCookie,omitempty": true, "gclifetime":3600, "maxLifetime": 3600, "secure": false, "cookieLifeTime": 3600, "providerConfig": ""}`) go globalSessions.GC() } more docs: http://beego.me/docs/module/session.md
|
Package session provider Usage: import( "github.com/astaxie/beego/session" ) func init() { globalSessions, _ = session.NewManager("memory", `{"cookieName":"gosessionid", "enableSetCookie,omitempty": true, "gclifetime":3600, "maxLifetime": 3600, "secure": false, "cookieLifeTime": 3600, "providerConfig": ""}`) go globalSessions.GC() } more docs: http://beego.me/docs/module/session.md |
websocket
Package websocket implements a client and server for the WebSocket protocol as specified in RFC 6455.
|
Package websocket implements a client and server for the WebSocket protocol as specified in RFC 6455. |
runtime
|
|
[spider frame (golang)] Pholcus(幽灵蛛)是一款纯Go语言编写的高并发、分布式、重量级爬虫软件,支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/csv/excel等)、有大量Demo共享;同时她还支持横纵向两种抓取模式,支持模拟登录和任务暂停、取消等一系列高级功能; (官方QQ群:Go大数据 42731170,欢迎加入我们的讨论)。
|
[spider frame (golang)] Pholcus(幽灵蛛)是一款纯Go语言编写的高并发、分布式、重量级爬虫软件,支持单机、服务端、客户端三种运行模式,拥有Web、GUI、命令行三种操作界面;规则简单灵活、批量任务并发、输出方式丰富(mysql/mongodb/csv/excel等)、有大量Demo共享;同时她还支持横纵向两种抓取模式,支持模拟登录和任务暂停、取消等一系列高级功能; (官方QQ群:Go大数据 42731170,欢迎加入我们的讨论)。 |
Click to show internal directories.
Click to hide internal directories.