request

package
v1.3.4
Published: Feb 15, 2020 | License: Apache-2.0 | Imports: 11 | Imported by: 584

Documentation

Constants

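const (
	DefaultDialTimeout = 2 * time.Minute // default timeout for requesting the server
	DefaultConnTimeout = 2 * time.Minute // default download timeout
	DefaultTryTimes    = 3               // default maximum number of download attempts
	DefaultRetryPause  = 2 * time.Second // default pause before retrying a download
)

const (
	SURF_ID    = 0 // default surf download kernel (native Go); this value must not be changed
	PHANTOM_ID = 1 // fallback phantomjs download kernel; rarely used (inefficient, incomplete header support)
)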
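
The retry constants above describe a bounded retry loop: try a download up to a maximum number of times and pause between failures. The following is a minimal sketch of that idea, not the package's implementation. fetchWithRetry and errTemporary are hypothetical names introduced for illustration, and the handling of zero and negative attempt counts follows the defaults described in the Prepare documentation.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Values copied from the constants documented above.
const (
	DefaultTryTimes   = 3
	DefaultRetryPause = 2 * time.Second
)

// errTemporary is a hypothetical error used to simulate a failed download.
var errTemporary = errors.New("temporary failure")

// fetchWithRetry is an illustrative helper, not part of the package.
// It calls fetch until it succeeds or tryTimes attempts have been made,
// sleeping for pause between failed attempts.
// tryTimes == 0 falls back to DefaultTryTimes; tryTimes < 0 retries without limit.
func fetchWithRetry(tryTimes int, pause time.Duration, fetch func() error) (attempts int, err error) {
	if tryTimes == 0 {
		tryTimes = DefaultTryTimes
	}
	for tryTimes < 0 || attempts < tryTimes {
		attempts++
		if err = fetch(); err == nil {
			return attempts, nil
		}
		// Only pause when another attempt will follow.
		if tryTimes < 0 || attempts < tryTimes {
			time.Sleep(pause)
		}
	}
	return attempts, err
}

func main() {
	calls := 0
	// A short pause keeps the demo fast; DefaultRetryPause would be the documented default.
	attempts, err := fetchWithRetry(DefaultTryTimes, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTemporary // the first two attempts fail
		}
		return nil
	})
	fmt.Println(attempts, err) // prints: 3 <nil>
}
```

The pause is skipped after the final failed attempt so that a caller is not delayed once no further download will be tried.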

Variables

This section is empty.

Functions

This section is empty.

Types

type Request

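type Request struct {
	Spider        string          // spider (rule set) name; set automatically, do not fill in manually
	Url           string          // target URL; required
	Rule          string          // name of the rule node used to parse the response; required
	Method        string          // GET POST POST-M HEAD
	Header        http.Header     // request headers
	EnableCookie  bool            // whether to use cookies; configured through the Spider's EnableCookie
	PostData      string          // POST values
	DialTimeout   time.Duration   // timeout for establishing the connection (dial tcp: i/o timeout)
	ConnTimeout   time.Duration   // timeout on the established connection (WSARecv tcp: i/o timeout)
	TryTimes      int             // maximum number of download attempts
	RetryPause    time.Duration   // wait before the next attempt after a failed download
	RedirectTimes int             // maximum number of redirects; 0 means unlimited, less than 0 disables redirects
	Temp          Temp            // temporary data
	TempIsJson    map[string]bool // marks Temp fields stored as JSON as true; set automatically, do not fill in manually
	Priority      int             // scheduling priority; defaults to 0 (0 is the lowest priority)
	Reloadable    bool            // whether this link may be downloaded more than once
	// Surfer downloader kernel ID
	// 0 is the Surf high-concurrency downloader, with a full set of control features
	// 1 is the PhantomJS downloader: strong against anti-crawling defenses, slow, low concurrency
	DownloaderID int
	// contains filtered or unexported fields
}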

Request represents an object waiting to be crawled.

func UnSerialize

func UnSerialize(s string) (*Request, error)

UnSerialize deserializes a Request from a string.

func (*Request) AddHeader

func (self *Request) AddHeader(key, value string) *Request

func (*Request) Copy

func (self *Request) Copy() *Request

Copy returns a copy of the Request.

func (*Request) GetConnTimeout

func (self *Request) GetConnTimeout() time.Duration

func (*Request) GetCookies

func (self *Request) GetCookies() string

func (*Request) GetDialTimeout

func (self *Request) GetDialTimeout() time.Duration

func (*Request) GetDownloaderID

func (self *Request) GetDownloaderID() int

func (*Request) GetEnableCookie

func (self *Request) GetEnableCookie() bool

func (*Request) GetHeader

func (self *Request) GetHeader() http.Header

func (*Request) GetMethod

func (self *Request) GetMethod() string

GetMethod returns the name of the HTTP request method (note: this does not refer to the HTTP GET method).

func (*Request) GetPostData

func (self *Request) GetPostData() string

func (*Request) GetPriority

func (self *Request) GetPriority() int

func (*Request) GetProxy

func (self *Request) GetProxy() string

func (*Request) GetRedirectTimes

func (self *Request) GetRedirectTimes() int
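
The RedirectTimes field comment gives three cases: a positive value caps the number of redirects, 0 means unlimited, and a negative value disables redirects. A minimal sketch of how such a value could be mapped onto the standard library's http.Client.CheckRedirect hook is shown here. redirectPolicy, allowed, and errTooManyRedirects are hypothetical names, and nothing here is taken from the package's downloader.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errTooManyRedirects is a hypothetical error for the capped case.
var errTooManyRedirects = errors.New("redirect limit reached")

// redirectPolicy is an illustrative helper that turns a RedirectTimes-style
// value into a CheckRedirect function for net/http.
func redirectPolicy(redirectTimes int) func(req *http.Request, via []*http.Request) error {
	return func(req *http.Request, via []*http.Request) error {
		switch {
		case redirectTimes < 0:
			// Redirects disabled: hand back the redirect response itself.
			return http.ErrUseLastResponse
		case redirectTimes == 0:
			// Unlimited redirects.
			return nil
		case len(via) > redirectTimes:
			// via holds the requests made so far, so len(via) is the
			// ordinal of the redirect about to be followed.
			return errTooManyRedirects
		}
		return nil
	}
}

// allowed reports whether a redirect would be followed after priorRequests
// requests have already been made.
func allowed(redirectTimes, priorRequests int) bool {
	via := make([]*http.Request, priorRequests)
	return redirectPolicy(redirectTimes)(nil, via) == nil
}

func main() {
	// The policy plugs into a standard client like this.
	client := &http.Client{CheckRedirect: redirectPolicy(2)}
	_ = client

	fmt.Println(allowed(2, 2), allowed(2, 3), allowed(-1, 1), allowed(0, 50))
	// prints: true false false true
}
```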

func (*Request) GetReferer

func (self *Request) GetReferer() string

func (*Request) GetRetryPause

func (self *Request) GetRetryPause() time.Duration

func (*Request) GetRuleName

func (self *Request) GetRuleName() string

func (*Request) GetSpiderName

func (self *Request) GetSpiderName() string

func (*Request) GetTemp

func (self *Request) GetTemp(key string, defaultValue interface{}) interface{}

GetTemp returns a value from the temporary data cache. defaultValue must not be interface{}(nil).

func (*Request) GetTemps

func (self *Request) GetTemps() Temp

func (*Request) GetTryTimes

func (self *Request) GetTryTimes() int

func (*Request) GetUrl

func (self *Request) GetUrl() string

GetUrl returns the URL.

func (*Request) IsReloadable

func (self *Request) IsReloadable() bool

func (*Request) MarshalJSON

func (self *Request) MarshalJSON() ([]byte, error)

func (*Request) Prepare

func (self *Request) Prepare() error

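Prepare does the preparatory work before a request is sent, setting a series of default values.
Request.Url and Request.Rule must be set.
Request.Spider does not need to be set manually (the system sets it automatically).
Request.EnableCookie is set once on the Spider; a value given in a rule's request has no effect.
The following fields have defaults and may be left unset:
Request.Method defaults to GET;
Request.DialTimeout defaults to the constant DefaultDialTimeout; a value less than 0 means no limit on how long to wait for a response;
Request.ConnTimeout defaults to the constant DefaultConnTimeout; a value less than 0 means no download timeout;
Request.TryTimes defaults to the constant DefaultTryTimes; a value less than 0 means no limit on retries after failure;
Request.RedirectTimes defaults to unlimited redirects; a value less than 0 disables redirects;
Request.RetryPause defaults to the constant DefaultRetryPause;
Request.DownloaderID selects the downloader: 0 is the default Surf high-concurrency downloader with full functionality, 1 is the PhantomJS downloader, which is strong against anti-crawling defenses but slow and low-concurrency.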
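
The defaulting rules above can be illustrated with a small stand-in type. This is a sketch under stated assumptions, not the package's code: sketchRequest and prepare are hypothetical names, only a subset of fields is mirrored, and negative values are simply left untouched to represent "no limit".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Constants copied from the package documentation.
const (
	DefaultDialTimeout = 2 * time.Minute
	DefaultConnTimeout = 2 * time.Minute
	DefaultTryTimes    = 3
	DefaultRetryPause  = 2 * time.Second
)

// sketchRequest is a hypothetical, trimmed-down mirror of Request.
type sketchRequest struct {
	Url, Rule, Method string
	DialTimeout       time.Duration
	ConnTimeout       time.Duration
	RetryPause        time.Duration
	TryTimes          int
}

// prepare is an illustrative stand-in for Request.Prepare: it validates the
// required fields and fills in defaults for fields left at their zero value.
func (r *sketchRequest) prepare() error {
	if r.Url == "" {
		return errors.New("Url must be set")
	}
	if r.Rule == "" {
		return errors.New("Rule must be set")
	}
	if r.Method == "" {
		r.Method = "GET"
	}
	// Zero means "unset"; negative values are kept and read as "no limit".
	if r.DialTimeout == 0 {
		r.DialTimeout = DefaultDialTimeout
	}
	if r.ConnTimeout == 0 {
		r.ConnTimeout = DefaultConnTimeout
	}
	if r.TryTimes == 0 {
		r.TryTimes = DefaultTryTimes
	}
	if r.RetryPause == 0 {
		r.RetryPause = DefaultRetryPause
	}
	return nil
}

func main() {
	r := &sketchRequest{Url: "http://example.com/list", Rule: "list", TryTimes: -1}
	if err := r.prepare(); err != nil {
		fmt.Println("prepare failed:", err)
		return
	}
	fmt.Println(r.Method, r.DialTimeout, r.TryTimes, r.RetryPause)
	// prints: GET 2m0s -1 2s
}
```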

func (*Request) Serialize

func (self *Request) Serialize() string

Serialize serializes the Request to a string.
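
The documentation does not state the wire format used by Serialize and UnSerialize. The presence of MarshalJSON suggests JSON, so the sketch below assumes a JSON round trip. sketchRequest, serialize, and unSerialize are hypothetical names, and the real Serialize returns only a string, without an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sketchRequest is a hypothetical, trimmed-down mirror of Request.
type sketchRequest struct {
	Url    string
	Rule   string
	Method string
	Temp   map[string]interface{}
}

// serialize encodes the request as a JSON string (assumed format).
func serialize(r *sketchRequest) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

// unSerialize decodes a JSON string produced by serialize.
func unSerialize(s string) (*sketchRequest, error) {
	r := new(sketchRequest)
	if err := json.Unmarshal([]byte(s), r); err != nil {
		return nil, err
	}
	return r, nil
}

func main() {
	orig := &sketchRequest{
		Url:    "http://example.com/list",
		Rule:   "list",
		Method: "GET",
		Temp:   map[string]interface{}{"page": 2},
	}
	s, err := serialize(orig)
	if err != nil {
		fmt.Println("serialize failed:", err)
		return
	}
	restored, err := unSerialize(s)
	if err != nil {
		fmt.Println("unSerialize failed:", err)
		return
	}
	// Numbers inside interface{} values come back from encoding/json as float64.
	fmt.Println(restored.Url, restored.Temp["page"])
	// prints: http://example.com/list 2
}
```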

func (*Request) SetCookies

func (self *Request) SetCookies(cookie string) *Request

func (*Request) SetDownloaderID

func (self *Request) SetDownloaderID(id int) *Request

func (*Request) SetEnableCookie

func (self *Request) SetEnableCookie(enableCookie bool) *Request

func (*Request) SetHeader

func (self *Request) SetHeader(key, value string) *Request
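
Header is an http.Header, so AddHeader and SetHeader plausibly differ the way the standard library's Add and Set do: Add appends a value, Set replaces all values. That mapping is an assumption, as the methods are undocumented here. The sketch below shows the standard library behavior only, with buildHeader as a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildHeader demonstrates http.Header.Add (append) versus Set (replace).
func buildHeader() http.Header {
	h := http.Header{}
	// Add keeps earlier values: Accept ends up with two entries.
	h.Add("Accept", "text/html")
	h.Add("Accept", "application/json")
	// Set overwrites: only the last User-Agent survives.
	h.Set("User-Agent", "first-agent")
	h.Set("User-Agent", "second-agent")
	return h
}

func main() {
	h := buildHeader()
	fmt.Println(len(h["Accept"]), h.Get("User-Agent"))
	// prints: 2 second-agent
}
```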

func (*Request) SetMethod

func (self *Request) SetMethod(method string) *Request

SetMethod sets the HTTP request method.

func (*Request) SetPriority

func (self *Request) SetPriority(priority int) *Request

func (*Request) SetProxy

func (self *Request) SetProxy(proxy string) *Request

func (*Request) SetReferer

func (self *Request) SetReferer(referer string) *Request

func (*Request) SetReloadable

func (self *Request) SetReloadable(can bool) *Request

func (*Request) SetRuleName

func (self *Request) SetRuleName(ruleName string) *Request

func (*Request) SetSpiderName

func (self *Request) SetSpiderName(spiderName string) *Request

func (*Request) SetTemp

func (self *Request) SetTemp(key string, value interface{}) *Request

func (*Request) SetTemps

func (self *Request) SetTemps(temp map[string]interface{}) *Request

func (*Request) SetUrl

func (self *Request) SetUrl(url string) *Request
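
Every setter listed above returns *Request, which allows calls to be chained. The pattern can be sketched with a stand-in type. sketchRequest is hypothetical and reproduces only four of the documented setters, with bodies that are assumed to do nothing more than assign the field.

```go
package main

import "fmt"

// sketchRequest is a hypothetical, trimmed-down mirror of Request.
type sketchRequest struct {
	Url      string
	Rule     string
	Method   string
	Priority int
}

// Each setter assigns one field and returns the receiver so calls can be chained.
func (r *sketchRequest) SetUrl(url string) *sketchRequest { r.Url = url; return r }

func (r *sketchRequest) SetRuleName(ruleName string) *sketchRequest { r.Rule = ruleName; return r }

func (r *sketchRequest) SetMethod(method string) *sketchRequest { r.Method = method; return r }

func (r *sketchRequest) SetPriority(priority int) *sketchRequest { r.Priority = priority; return r }

func main() {
	req := new(sketchRequest).
		SetUrl("http://example.com/list").
		SetRuleName("list").
		SetMethod("POST").
		SetPriority(1)
	fmt.Printf("%+v\n", *req)
	// prints: {Url:http://example.com/list Rule:list Method:POST Priority:1}
}
```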

func (*Request) Unique

func (self *Request) Unique() string

Unique returns the unique identifier of the request.
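
The documentation does not say how the identifier is derived. One common approach for deduplicating crawl requests is to hash the fields that identify a request. The sketch below assumes a SHA-256 digest over the spider name, rule name, URL, and method. fingerprint is a hypothetical helper, and both the choice of fields and the hash function are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint is an illustrative helper, not the package's algorithm.
// It hashes the identifying fields, separated by a NUL byte so that
// ("ab", "c") and ("a", "bc") do not collide.
func fingerprint(spider, rule, url, method string) string {
	sum := sha256.Sum256([]byte(spider + "\x00" + rule + "\x00" + url + "\x00" + method))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := fingerprint("news", "list", "http://example.com/list", "GET")
	b := fingerprint("news", "list", "http://example.com/list", "GET")
	c := fingerprint("news", "list", "http://example.com/list", "POST")
	fmt.Println(a == b, a == c, len(a)) // prints: true false 64
}
```

A scheduler could keep such fingerprints in a set and skip any request whose fingerprint is already present, unless the request is marked as reloadable.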

type Temp

type Temp map[string]interface{}
