client

package
v0.0.0-...-2d0df81 Latest
Warning

This package is not in the latest version of its module.

Published: Aug 7, 2021 License: MPL-2.0 Imports: 17 Imported by: 0

Documentation

Constants

const (
	DefaultUserAgent        = "Geziyor 1.0"
	DefaultMaxBody    int64 = 1024 * 1024 * 1024 // 1GB
	DefaultRetryTimes       = 2
)

Default values for the client.

Variables

var (
	DefaultRetryHTTPCodes = []int{500, 502, 503, 504, 522, 524, 408}
)
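
DefaultRetryTimes and DefaultRetryHTTPCodes suggest a retry policy keyed on the response status code. The following is a minimal sketch of such a check using only the standard library; `shouldRetry` and its parameters are hypothetical names, and the package's actual retry logic may differ.

```go
package main

import "fmt"

// retryHTTPCodes mirrors the documented DefaultRetryHTTPCodes value.
var retryHTTPCodes = []int{500, 502, 503, 504, 522, 524, 408}

// shouldRetry is a hypothetical helper (not part of the package). It reports
// whether a response with the given status should be retried, given how many
// retries were already made and the maximum number allowed.
func shouldRetry(status, retriesDone, maxRetries int) bool {
	if retriesDone >= maxRetries {
		return false
	}
	for _, code := range retryHTTPCodes {
		if status == code {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldRetry(503, 0, 2)) // true
	fmt.Println(shouldRetry(404, 0, 2)) // false
	fmt.Println(shouldRetry(503, 2, 2)) // false
}
```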
var (
	// ErrNoCookieJar is the error type for missing cookie jar
	ErrNoCookieJar = errors.New("cookie jar is not available")
)

Functions

func ConvertHeaderToMap

func ConvertHeaderToMap(header http.Header) map[string]interface{}

ConvertHeaderToMap converts http.Header to map[string]interface{}

func ConvertMapToHeader

func ConvertMapToHeader(m map[string]interface{}) http.Header

ConvertMapToHeader converts map[string]interface{} to http.Header
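
The two conversions above can be approximated with the standard library alone. This sketch uses hypothetical names (`headerToMap`, `mapToHeader`) and makes two assumptions the documentation does not confirm: only one value is kept per header key, and non-string map values are formatted with `fmt.Sprint`.

```go
package main

import (
	"fmt"
	"net/http"
)

// headerToMap is a hypothetical stand-in for a header-to-map conversion.
// Assumption: only the first value of each header key is kept.
func headerToMap(h http.Header) map[string]interface{} {
	m := make(map[string]interface{}, len(h))
	for key, values := range h {
		if len(values) > 0 {
			m[key] = values[0]
		}
	}
	return m
}

// mapToHeader is a hypothetical stand-in for the reverse conversion.
// Assumption: every value is rendered to a string with fmt.Sprint.
func mapToHeader(m map[string]interface{}) http.Header {
	h := http.Header{}
	for key, value := range m {
		h.Set(key, fmt.Sprint(value))
	}
	return h
}

func main() {
	h := mapToHeader(map[string]interface{}{"Accept": "text/html"})
	fmt.Println(h.Get("Accept"))          // text/html
	fmt.Println(headerToMap(h)["Accept"]) // text/html
}
```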

func NewRedirectionHandler

func NewRedirectionHandler(maxRedirect int) func(req *http.Request, via []*http.Request) error

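NewRedirectionHandler returns a redirect-checking function, with the same signature as http.Client.CheckRedirect, that stops following redirects once the provided maxRedirect limit is reached.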
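
A function with this signature can be assigned to the `CheckRedirect` field of a standard `http.Client`. The sketch below shows one way such a limiter could look; `limitRedirects`, `redirectAllowed`, and the error value are hypothetical, and the exact cut-off rule (here: stop once `maxRedirect` requests have been made, following the standard library's convention) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errTooManyRedirects is a hypothetical error value for this sketch.
var errTooManyRedirects = errors.New("too many redirects")

// limitRedirects returns a function usable as http.Client.CheckRedirect.
// via holds the requests made so far, so the chain is stopped once
// maxRedirect requests have already been issued.
func limitRedirects(maxRedirect int) func(req *http.Request, via []*http.Request) error {
	return func(req *http.Request, via []*http.Request) error {
		if len(via) >= maxRedirect {
			return errTooManyRedirects
		}
		return nil
	}
}

// redirectAllowed exercises a check function without any network traffic.
func redirectAllowed(check func(*http.Request, []*http.Request) error, requestsMade int) bool {
	return check(nil, make([]*http.Request, requestsMade)) == nil
}

func main() {
	check := limitRedirects(2)
	client := &http.Client{CheckRedirect: check}
	fmt.Println(client.CheckRedirect != nil) // true
	fmt.Println(redirectAllowed(check, 1))   // true
	fmt.Println(redirectAllowed(check, 2))   // false
}
```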

func SetDefaultHeader

func SetDefaultHeader(header http.Header, key string, value string) http.Header

SetDefaultHeader sets the header key to value only if the key is not already set.
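
The "set only if absent" behavior can be sketched with the standard `http.Header` methods. `setDefault` is a hypothetical name, and treating an empty value as "not set" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/http"
)

// setDefault is a hypothetical helper: it sets key to value only when the
// header has no non-empty value for key, then returns the same header.
func setDefault(header http.Header, key, value string) http.Header {
	if header.Get(key) == "" {
		header.Set(key, value)
	}
	return header
}

func main() {
	h := http.Header{}
	setDefault(h, "User-Agent", "Geziyor 1.0")
	setDefault(h, "User-Agent", "something else") // ignored: already set
	fmt.Println(h.Get("User-Agent"))              // Geziyor 1.0
}
```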

Types

type Client

type Client struct {
	*http.Client
	// contains filtered or unexported fields
}

Client is a small wrapper around *http.Client to provide new methods.

func NewClient

func NewClient(opt *Options) *Client

NewClient creates a Client whose underlying http.Client is configured with values suited to a typical web scraper.

func (*Client) Cookies

func (c *Client) Cookies(URL string) []*http.Cookie

Cookies returns the cookies to send in a request for the given URL.

func (*Client) DoRequest

func (c *Client) DoRequest(req *Request) (resp *Response, err error)

DoRequest selects the appropriate request handler: the standard HTTP client, or Chrome for rendered requests.

func (*Client) SetCookies

func (c *Client) SetCookies(URL string, cookies []*http.Cookie) error

SetCookies handles the receipt of the cookies in a reply for the given URL
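
Cookies and SetCookies take the URL as a string, whereas the standard `http.CookieJar` interface works with `*url.URL`. The sketch below shows how such string-based wrappers could be built on `net/http/cookiejar`. All function names are hypothetical, and returning a "no cookie jar" error when the client has no jar is an assumption based on the ErrNoCookieJar variable.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
)

// errNoJar mirrors the message of the documented ErrNoCookieJar.
var errNoJar = errors.New("cookie jar is not available")

// setCookies stores cookies for rawURL in the client's jar.
func setCookies(c *http.Client, rawURL string, cookies []*http.Cookie) error {
	if c.Jar == nil {
		return errNoJar
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	c.Jar.SetCookies(u, cookies)
	return nil
}

// cookiesFor returns the cookies the jar would send to rawURL.
func cookiesFor(c *http.Client, rawURL string) []*http.Cookie {
	u, err := url.Parse(rawURL)
	if c.Jar == nil || err != nil {
		return nil
	}
	return c.Jar.Cookies(u)
}

// storeAndRead stores one cookie and reads its value back from the jar.
func storeAndRead(rawURL, name, value string) (string, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return "", err
	}
	c := &http.Client{Jar: jar}
	if err := setCookies(c, rawURL, []*http.Cookie{{Name: name, Value: value}}); err != nil {
		return "", err
	}
	for _, ck := range cookiesFor(c, rawURL) {
		if ck.Name == name {
			return ck.Value, nil
		}
	}
	return "", nil
}

func main() {
	v, err := storeAndRead("https://example.com/", "session", "abc")
	fmt.Println(v, err) // abc <nil>
}
```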

type Options

type Options struct {
	MaxBodySize           int64
	CharsetDetectDisabled bool
	RetryTimes            int
	RetryHTTPCodes        []int
	RemoteAllocatorURL    string
	AllocatorOptions      []chromedp.ExecAllocatorOption
}

Options holds custom options for the client.
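
As an illustration of what a body-size option such as MaxBodySize typically controls, the sketch below caps how many bytes are read from a body with `io.LimitReader`. This is an assumption about the option's purpose, not a description of the package's internals; `readLimited` and `readLimitedString` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readLimited reads at most maxBodySize bytes from r and silently drops
// anything beyond that limit.
func readLimited(r io.Reader, maxBodySize int64) ([]byte, error) {
	return io.ReadAll(io.LimitReader(r, maxBodySize))
}

// readLimitedString is a convenience wrapper for string input.
func readLimitedString(s string, maxBodySize int64) (string, error) {
	b, err := readLimited(strings.NewReader(s), maxBodySize)
	return string(b), err
}

func main() {
	got, err := readLimitedString("hello, world", 5)
	fmt.Println(got, err) // hello <nil>
}
```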

type Request

type Request struct {
	*http.Request

	// Meta contains arbitrary data.
	// Use this Meta map to store contextual data between your requests
	Meta map[string]interface{}

	// If true, requests will be synchronized
	Synchronized bool

	// If true, the request is opened in Chrome and the fully rendered
	// HTML DOM is returned as the response.
	Rendered bool

	// Optional response body encoding. Leave empty for automatic detection.
	// If you're having issues with auto detection, set this.
	Encoding string

	// Set this true to cancel requests. Should be used on middlewares.
	Cancelled bool
	// contains filtered or unexported fields
}

Request is a small wrapper around *http.Request that adds metadata and a rendering option.

func NewRequest

func NewRequest(method, url string, body io.Reader) (*Request, error)

NewRequest returns a new Request given a method, URL, and optional body.
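
The wrapper pattern described here, a struct embedding `*http.Request` plus a metadata map, can be sketched with the standard library. The `request` type and `newRequest` constructor below are hypothetical mirrors, not the package's own types, and pre-allocating the Meta map in the constructor is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// request is a hypothetical mirror of a wrapper around *http.Request.
type request struct {
	*http.Request
	Meta map[string]interface{} // contextual data carried between requests
}

// newRequest builds the wrapped request and allocates an empty Meta map so
// callers can store values without a nil check.
func newRequest(method, rawURL string, body io.Reader) (*request, error) {
	req, err := http.NewRequest(method, rawURL, body)
	if err != nil {
		return nil, err
	}
	return &request{Request: req, Meta: make(map[string]interface{})}, nil
}

func main() {
	req, err := newRequest("GET", "https://example.com/page", nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	req.Meta["depth"] = 1
	// Fields of the embedded *http.Request are promoted to the wrapper.
	fmt.Println(req.Method, req.URL.Host, req.Meta["depth"]) // GET example.com 1
}
```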

func (*Request) Cancel

func (r *Request) Cancel()

Cancel cancels the request.

type Response

type Response struct {
	*http.Response

	// Response body
	Body []byte

	// Goquery Document object. Non-nil if the response is HTML (see IsHTML).
	HTMLDoc *goquery.Document

	Request *Request
}

Response wraps http.Response. It contains parsed response data and Geziyor helper methods.

func (*Response) IsHTML

func (r *Response) IsHTML() bool

IsHTML reports whether the response content is HTML by inspecting the Content-Type header.
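
A Content-Type based HTML check can be sketched with the standard `mime` package. `isHTML` is a hypothetical function, and the set of media types it treats as HTML (`text/html` and `application/xhtml+xml`) is an assumption; the package may recognize a different set.

```go
package main

import (
	"fmt"
	"mime"
)

// isHTML reports whether a Content-Type header value denotes HTML.
// Parameters such as charset are ignored; unparsable values yield false.
func isHTML(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mediaType == "text/html" || mediaType == "application/xhtml+xml"
}

func main() {
	fmt.Println(isHTML("text/html; charset=utf-8")) // true
	fmt.Println(isHTML("application/json"))         // false
	fmt.Println(isHTML(""))                         // false
}
```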

func (*Response) JoinURL

func (r *Response) JoinURL(relativeURL string) (*url.URL, error)

JoinURL joins base response URL and provided relative URL.
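
Resolving a relative link against a base URL is what `(*url.URL).ResolveReference` does in the standard library. The sketch below shows that technique; `joinURL` and `joinStrings` are hypothetical helpers, and whether the package resolves references in exactly this way is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// joinURL resolves relative against base, following RFC 3986 rules.
func joinURL(base *url.URL, relative string) (*url.URL, error) {
	ref, err := url.Parse(relative)
	if err != nil {
		return nil, err
	}
	return base.ResolveReference(ref), nil
}

// joinStrings is a string-based convenience wrapper around joinURL.
func joinStrings(base, relative string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u, err := joinURL(b, relative)
	if err != nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	s, err := joinStrings("https://example.com/a/b", "../c?page=2")
	fmt.Println(s, err) // https://example.com/c?page=2 <nil>
}
```

This is useful when following links extracted from a page, since `href` values are often relative to the page that contains them.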
