client

package
v0.0.0-...-1e75203 Latest
Warning

This package is not in the latest version of its module.
Published: Jul 19, 2019 | License: Apache-2.0 | Imports: 16 | Imported by: 0

Documentation

Constants

const (
	DefaultUserAgent        = "Geziyor 1.0"
	DefaultMaxBody    int64 = 1024 * 1024 * 1024 // 1GB
	DefaultRetryTimes       = 2
)

Variables

var (
	// ErrNoCookieJar is the error for a missing cookie jar
	ErrNoCookieJar = errors.New("cookie jar is not available")
	ErrWrongStatus = errors.New("wrong response status code")
)
var (
	DefaultRetryHTTPCodes = []int{500, 502, 503, 504, 522, 524, 408}
)
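
The listing does not show how DefaultRetryTimes and DefaultRetryHTTPCodes are consumed. A minimal sketch of one plausible retry decision follows; shouldRetry and its attempt counter are hypothetical names, not part of this package.

```go
package main

import "fmt"

// retryHTTPCodes mirrors the documented DefaultRetryHTTPCodes value.
var retryHTTPCodes = []int{500, 502, 503, 504, 522, 524, 408}

// shouldRetry is a hypothetical helper (not part of the package). It reports
// whether a response with statusCode should be retried, given how many
// retries have already been made and the allowed maximum.
func shouldRetry(statusCode, attempt, maxRetries int, codes []int) bool {
	if attempt >= maxRetries {
		return false // retry budget exhausted
	}
	for _, c := range codes {
		if c == statusCode {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldRetry(503, 0, 2, retryHTTPCodes)) // true
	fmt.Println(shouldRetry(404, 0, 2, retryHTTPCodes)) // false
	fmt.Println(shouldRetry(503, 2, 2, retryHTTPCodes)) // false
}
```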

Functions

func ConvertHeaderToMap

func ConvertHeaderToMap(header http.Header) map[string]interface{}

ConvertHeaderToMap converts http.Header to map[string]interface{}

func ConvertMapToHeader

func ConvertMapToHeader(m map[string]interface{}) http.Header

ConvertMapToHeader converts map[string]interface{} to http.Header
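
The value representation used by ConvertHeaderToMap and ConvertMapToHeader is not documented here. The sketch below shows one possible round trip using only the standard library; headerToMap and mapToHeader are hypothetical stand-ins, and storing values as []string is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// headerToMap is a hypothetical stand-in. It assumes each header's values
// are stored as a []string under the original key.
func headerToMap(h http.Header) map[string]interface{} {
	m := make(map[string]interface{}, len(h))
	for k, vs := range h {
		m[k] = append([]string(nil), vs...) // copy so the map owns its slices
	}
	return m
}

// mapToHeader reverses headerToMap. It accepts string and []string values
// and silently skips anything else (an assumption of this sketch).
func mapToHeader(m map[string]interface{}) http.Header {
	h := make(http.Header, len(m))
	for k, v := range m {
		switch val := v.(type) {
		case string:
			h.Add(k, val)
		case []string:
			for _, s := range val {
				h.Add(k, s)
			}
		}
	}
	return h
}

func main() {
	h := http.Header{}
	h.Add("Accept", "text/html")
	h.Add("Accept", "application/json")

	round := mapToHeader(headerToMap(h))
	fmt.Println(round["Accept"]) // [text/html application/json]
}
```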

func NewRedirectionHandler

func NewRedirectionHandler(maxRedirect int) func(req *http.Request, via []*http.Request) error

NewRedirectionHandler returns a redirect-policy function that allows at most maxRedirect redirects. The returned function has the signature expected by http.Client.CheckRedirect.
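
As an illustration of how such a handler can work, here is a minimal sketch built only on net/http. newRedirectLimiter and simulate are hypothetical names, and the error message is invented for the example; the package's actual implementation may differ.

```go
package main

import (
	"fmt"
	"net/http"
)

// newRedirectLimiter is a hypothetical stand-in for NewRedirectionHandler.
// net/http passes the requests made so far in via, so len(via) is the
// number of redirects already followed.
func newRedirectLimiter(maxRedirect int) func(req *http.Request, via []*http.Request) error {
	return func(req *http.Request, via []*http.Request) error {
		if len(via) >= maxRedirect {
			return fmt.Errorf("stopped after %d redirects", maxRedirect)
		}
		return nil
	}
}

// simulate calls the policy as if hops requests had already been made.
func simulate(maxRedirect, hops int) error {
	check := newRedirectLimiter(maxRedirect)
	return check(nil, make([]*http.Request, hops))
}

func main() {
	// Typical wiring: install the policy on an http.Client.
	client := &http.Client{CheckRedirect: newRedirectLimiter(3)}
	fmt.Println(client.CheckRedirect(nil, make([]*http.Request, 3))) // stopped after 3 redirects

	fmt.Println(simulate(3, 1)) // <nil>
}
```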

func SetDefaultHeader

func SetDefaultHeader(header http.Header, key string, value string) http.Header

SetDefaultHeader sets the header key to value only if the key is not already set.
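
The set-if-absent behavior can be sketched with the standard library as follows. setDefaultHeader is a hypothetical stand-in, and treating an empty value as "not set" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/http"
)

// setDefaultHeader is a hypothetical stand-in for SetDefaultHeader.
// It writes value under key only when the header has no value for key.
func setDefaultHeader(header http.Header, key, value string) http.Header {
	if header.Get(key) == "" {
		header.Set(key, value)
	}
	return header
}

func main() {
	h := http.Header{}
	h = setDefaultHeader(h, "User-Agent", "Geziyor 1.0")
	h = setDefaultHeader(h, "User-Agent", "something-else") // ignored: already set
	fmt.Println(h.Get("User-Agent"))                        // Geziyor 1.0
}
```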

Types

type Client

type Client struct {
	*http.Client
	// contains filtered or unexported fields
}

Client is a small wrapper around *http.Client to provide new methods.

func NewClient

func NewClient(maxBodySize int64, charsetDetectDisabled bool, retryTimes int, retryHTTPCodes []int) *Client

NewClient creates a Client with settings adjusted for a typical web scraper.
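
How maxBodySize is enforced is not shown in this listing. One common way to cap a response body is io.LimitReader; the sketch below assumes that approach. readCapped and capString are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readCapped is a hypothetical helper. It reads at most maxBodySize bytes
// from body and discards nothing explicitly: bytes past the cap are simply
// never read.
func readCapped(body io.Reader, maxBodySize int64) ([]byte, error) {
	return io.ReadAll(io.LimitReader(body, maxBodySize))
}

// capString applies readCapped to an in-memory string, for demonstration.
func capString(s string, maxBodySize int64) (string, error) {
	b, err := readCapped(strings.NewReader(s), maxBodySize)
	return string(b), err
}

func main() {
	got, err := capString("hello, world", 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // hello
}
```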

func (*Client) Cookies

func (c *Client) Cookies(URL string) []*http.Cookie

Cookies returns the cookies to send in a request for the given URL.

func (*Client) DoRequest

func (c *Client) DoRequest(req *Request) (resp *Response, err error)

DoRequest selects the appropriate request handler: the standard HTTP client or Chrome.

func (*Client) DoRequestChrome

func (c *Client) DoRequestChrome(req *Request) (*Response, error)

DoRequestChrome opens a new Chrome instance and makes the request.

func (*Client) DoRequestClient

func (c *Client) DoRequestClient(req *Request) (*Response, error)

DoRequestClient is a simple wrapper that reads the response according to the configured options.

func (*Client) SetCookies

func (c *Client) SetCookies(URL string, cookies []*http.Cookie) error

SetCookies handles the receipt of the cookies in a reply for the given URL
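
Cookies and SetCookies take the URL as a string, whereas http.CookieJar works with *url.URL. A minimal sketch of that adaptation, assuming the client's Jar may be nil (as ErrNoCookieJar suggests), could look like this. setCookies, cookiesFor, and roundTrip are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
)

var errNoCookieJar = errors.New("cookie jar is not available")

// setCookies parses rawURL and stores cs in the client's jar.
func setCookies(c *http.Client, rawURL string, cs []*http.Cookie) error {
	if c.Jar == nil {
		return errNoCookieJar
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	c.Jar.SetCookies(u, cs)
	return nil
}

// cookiesFor returns the cookies the jar would send to rawURL, or nil.
func cookiesFor(c *http.Client, rawURL string) []*http.Cookie {
	u, err := url.Parse(rawURL)
	if c.Jar == nil || err != nil {
		return nil
	}
	return c.Jar.Cookies(u)
}

// roundTrip stores one cookie for setURL and lists "name=value" pairs for getURL.
func roundTrip(setURL, getURL, name, value string) ([]string, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	client := &http.Client{Jar: jar}
	if err := setCookies(client, setURL, []*http.Cookie{{Name: name, Value: value}}); err != nil {
		return nil, err
	}
	var out []string
	for _, ck := range cookiesFor(client, getURL) {
		out = append(out, ck.Name+"="+ck.Value)
	}
	return out, nil
}

func main() {
	fmt.Println(roundTrip("https://example.com/", "https://example.com/page", "session", "abc"))
	fmt.Println(setCookies(&http.Client{}, "https://example.com/", nil)) // cookie jar is not available
}
```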

type Request

type Request struct {
	*http.Request

	// Meta contains arbitrary data.
	// Use this Meta map to store contextual data between your requests
	Meta map[string]interface{}

	// If true, requests will be synchronized
	Synchronized bool

	// If true, the request is opened in Chrome and
	// the fully rendered HTML DOM is returned as the response
	Rendered bool

	// Optional response body encoding. Leave empty for automatic detection.
	// If you're having issues with auto detection, set this.
	Encoding string

	// Set this to true to cancel the request. Intended for use in middlewares.
	Cancelled bool
	// contains filtered or unexported fields
}

Request is a small wrapper around *http.Request that adds metadata and a rendering option.

func NewRequest

func NewRequest(method, url string, body io.Reader) (*Request, error)

NewRequest returns a new Request given a method, URL, and optional body.
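
The wrapping pattern can be sketched with a reduced stand-in type. The request type and newRequest below are hypothetical, and initializing Meta to an empty map is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// request is a reduced, hypothetical stand-in for the package's Request.
type request struct {
	*http.Request
	Meta map[string]interface{}
}

// newRequest builds the embedded *http.Request and attaches an empty Meta map.
func newRequest(method, url string, body io.Reader) (*request, error) {
	req, err := http.NewRequest(method, url, body)
	if err != nil {
		return nil, err
	}
	return &request{Request: req, Meta: make(map[string]interface{})}, nil
}

func main() {
	req, err := newRequest("GET", "https://example.com/list?page=1", nil)
	if err != nil {
		panic(err)
	}
	req.Meta["page"] = 1 // contextual data carried alongside the request

	// Fields of the embedded *http.Request are promoted.
	fmt.Println(req.Method, req.URL.Host, req.Meta["page"]) // GET example.com 1
}
```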

func (*Request) Cancel

func (r *Request) Cancel()

Cancel cancels the request.

type Response

type Response struct {
	*http.Response

	// Response body
	Body []byte

	// goquery Document object. Non-nil if the response IsHTML.
	HTMLDoc *goquery.Document

	Request *Request
}

Response wraps http.Response. It contains the parsed response data and Geziyor functions.

func (*Response) IsHTML

func (r *Response) IsHTML() bool

IsHTML checks whether the response content is HTML by looking at the Content-Type header.
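
The exact matching rule is not documented here. A minimal sketch of a Content-Type check using mime.ParseMediaType follows; isHTML is a hypothetical function, and matching only text/html is an assumption.

```go
package main

import (
	"fmt"
	"mime"
)

// isHTML is a hypothetical stand-in. It parses a Content-Type header value
// and reports whether the media type, ignoring parameters such as charset,
// is text/html.
func isHTML(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false // empty or malformed header
	}
	return mediaType == "text/html"
}

func main() {
	fmt.Println(isHTML("text/html; charset=utf-8")) // true
	fmt.Println(isHTML("application/json"))         // false
	fmt.Println(isHTML(""))                         // false
}
```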

func (*Response) JoinURL

func (r *Response) JoinURL(relativeURL string) string

JoinURL joins base response URL and provided relative URL.
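
Relative-URL resolution of this kind is available in the standard library through url.URL.ResolveReference. The sketch below uses it; joinURL is a hypothetical function that takes the base as a string and returns an error, unlike the method documented above.

```go
package main

import (
	"fmt"
	"net/url"
)

// joinURL is a hypothetical stand-in. It resolves relative against base
// following the usual URL reference-resolution rules.
func joinURL(base, relative string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(relative)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	base := "https://example.com/a/b.html"
	for _, rel := range []string{"c.html", "/x", "../c?page=2", "https://other.org/y"} {
		joined, err := joinURL(base, rel)
		if err != nil {
			panic(err)
		}
		fmt.Println(joined)
	}
	// Prints:
	// https://example.com/a/c.html
	// https://example.com/x
	// https://example.com/c?page=2
	// https://other.org/y
}
```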
