Documentation ¶
Index ¶
- type CaptureOptions
- type CaptureResponse
- type CaptureStatus
- type Connector
- func (c Connector) Capture(URL string, options CaptureOptions) (captureResponse CaptureResponse, err error)
- func (c *Connector) Close()
- func (c Connector) GetAvailableCaptureSlot() (err error)
- func (c Connector) GetCaptureStatus(jobID string) (captureStatus CaptureStatus, err error)
- func (c Connector) GetUserStatus() (userStatus UserStatus, err error)
- type UserStatus
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CaptureOptions ¶
type CaptureOptions struct {
	// Capture a web page with errors (HTTP status=4xx or 5xx).
	// By default SPN2 captures only status=200 URLs.
	CaptureAll bool `spn:"capture_all"`

	// Capture web page outlinks automatically.
	// This also applies to PDF, JSON, RSS and MRSS feeds.
	CaptureOutlinks int `spn:"capture_outlinks"`

	// Capture full page screenshot in PNG format.
	// This is also stored in the Wayback Machine as a different capture.
	CaptureScreenshot bool `spn:"capture_screenshot"`

	// The capture becomes available in the Wayback Machine after ~12 hours
	// instead of immediately. This option helps reduce the load on our systems.
	// All API responses remain exactly the same when using this option.
	DelayWBAvailability bool `spn:"delay_wb_availability"`

	// Force the use of a simple HTTP GET request to capture the target URL.
	// By default SPN2 does a HTTP HEAD on the target URL to decide whether to
	// use a headless browser or a simple HTTP GET request.
	// force_get overrides this behavior.
	ForceGet bool `spn:"force_get"`

	// Skip checking if a capture is a first if you don’t need this information.
	// This will make captures run faster.
	SkipFirstArchive bool `spn:"skip_first_archive"`

	// if_not_archived_within=<timedelta>
	//
	// Capture web page only if the latest existing capture at the Archive is
	// older than the <timedelta> limit. Its format could be any datetime
	// expression like “3d 5h 20m” or just a number of seconds, e.g. “120”.
	// If there is a capture within the defined timedelta, SPN2 returns that
	// as a recent capture. The default system <timedelta> is 45 min.
	//
	// if_not_archived_within=<timedelta1>,<timedelta2>
	//
	// When using 2 comma separated <timedelta> values, the first one applies
	// to the main capture and the second one applies to outlinks.
	IfNotArchivedWithin string `spn:"if_not_archived_within"`

	// Return the timestamp of the last capture for all outlinks.
	OutlinksAvailability bool `spn:"outlinks_availability"`

	// Send an email report of the captured URLs to the user’s email.
	EmailResult bool `spn:"email_result"`

	// Run JS code for <N> seconds after page load to trigger target page
	// functionality like image loading on mouse over, scroll down to load
	// more content, etc. The default system <N> is 5 sec.
	//
	// More details on the JS code we execute:
	// https://github.com/internetarchive/brozzler/blob/master/brozzler/behaviors.yaml
	//
	// WARNING: The max <N> value that applies is 30 sec.
	//
	// NOTE: If the target page doesn’t have any JS you need to run, you can
	// use js_behavior_timeout=0 to speed up the capture.
	JsBehaviorTimeout string `spn:"js_behavior_timeout"` // It's hard to determine if int 0 is user input or default value, so we use string instead

	// Use extra HTTP Cookie value when capturing the target page.
	CaptureCookie string `spn:"capture_cookie"`

	// Use custom HTTP User-Agent value when capturing the target page.
	UseUserAgent string `spn:"use_user_agent"`

	// target_username=<XXX>
	// target_password=<YYY>
	//
	// Use your own username and password in the target page’s login forms.
	TargetUsername string `spn:"target_username"`

	// target_username=<XXX>
	// target_password=<YYY>
	//
	// Use your own username and password in the target page’s login forms.
	TargetPassword string `spn:"target_password"`
}
func (CaptureOptions) Encode ¶
func (opts CaptureOptions) Encode() url.Values
Encode converts CaptureOptions to url.Values.
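The package does not show how Encode is implemented. As a rough sketch only, the following shows one way `spn` struct tags could be turned into url.Values with reflection. The names demoOptions and encodeTagged are hypothetical, and two behaviors are assumptions: zero values are skipped, and true booleans are written as "1". It also illustrates why a string-typed JsBehaviorTimeout lets an explicit "0" be told apart from an unset field.

```go
package main

import (
	"fmt"
	"net/url"
	"reflect"
	"strconv"
)

// demoOptions is a hypothetical, trimmed-down stand-in for CaptureOptions.
type demoOptions struct {
	CaptureAll        bool   `spn:"capture_all"`
	CaptureOutlinks   int    `spn:"capture_outlinks"`
	JsBehaviorTimeout string `spn:"js_behavior_timeout"`
}

// encodeTagged copies every non-zero field of a struct into url.Values,
// using the field's `spn` tag as the key.
// Assumption: zero values are omitted and true is encoded as "1";
// the real Encode method may behave differently.
func encodeTagged(opts interface{}) url.Values {
	values := url.Values{}
	v := reflect.ValueOf(opts)
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		key := t.Field(i).Tag.Get("spn")
		if key == "" {
			continue // field carries no spn tag
		}
		field := v.Field(i)
		switch field.Kind() {
		case reflect.Bool:
			if field.Bool() {
				values.Set(key, "1")
			}
		case reflect.Int:
			if field.Int() != 0 {
				values.Set(key, strconv.FormatInt(field.Int(), 10))
			}
		case reflect.String:
			if field.String() != "" {
				values.Set(key, field.String())
			}
		}
	}
	return values
}

func main() {
	opts := demoOptions{CaptureAll: true, JsBehaviorTimeout: "0"}
	// The string "0" is kept, while the int zero of CaptureOutlinks is dropped.
	fmt.Println(encodeTagged(opts).Encode())
}
```

With an int field, a caller-supplied 0 and the Go zero value look identical, which is the ambiguity the comment on JsBehaviorTimeout refers to.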
type CaptureResponse ¶
type CaptureResponse struct {
	URL       string `json:"url"`
	JobID     string `json:"job_id"`
	Status    string `json:"status"`
	StatusExt string `json:"status_ext"`
	Message   string `json:"message"`
}
CaptureResponse represents the JSON response SPN returns when a capture is executed.
type CaptureStatus ¶
type CaptureStatus struct {
	Timestamp   string   `json:"timestamp"`
	DurationSec float64  `json:"duration_sec"`
	OriginalURL string   `json:"original_url"`
	Status      string   `json:"status"`
	StatusExt   string   `json:"status_ext"`
	JobID       string   `json:"job_id"`
	Outlinks    []string `json:"outlinks"`
	Resources   []string `json:"resources"`
	Exception   string   `json:"exception"`
	Message     string   `json:"message"`
}
CaptureStatus represents the data returned by the /save/status/{job_id} endpoint.
type Connector ¶
type Connector struct {
	AccessKey  string
	SecretKey  string
	HTTPClient *http.Client
	// contains filtered or unexported fields
}
Connector represents the data needed to execute SPN requests.
func (Connector) Capture ¶
func (c Connector) Capture(URL string, options CaptureOptions) (captureResponse CaptureResponse, err error)
Capture executes a capture via https://web.archive.org/save and returns the response. Capture options can be passed when calling the method.
func (Connector) GetAvailableCaptureSlot ¶
func (c Connector) GetAvailableCaptureSlot() (err error)
GetAvailableCaptureSlot waits until a capture slot is available.
func (Connector) GetCaptureStatus ¶
func (c Connector) GetCaptureStatus(jobID string) (captureStatus CaptureStatus, err error)
GetCaptureStatus retrieves information about an SPN job.
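The JSON tags on CaptureStatus determine how a status response maps onto the struct. The sketch below decodes a sample payload into a local copy of a subset of those fields. The payload is invented for illustration and is not a real API response; captureStatus and decodeStatus are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// captureStatus is a local copy of a subset of the CaptureStatus fields.
type captureStatus struct {
	Timestamp   string   `json:"timestamp"`
	DurationSec float64  `json:"duration_sec"`
	OriginalURL string   `json:"original_url"`
	Status      string   `json:"status"`
	JobID       string   `json:"job_id"`
	Resources   []string `json:"resources"`
}

// decodeStatus unmarshals a JSON status payload into captureStatus.
func decodeStatus(payload []byte) (captureStatus, error) {
	var status captureStatus
	err := json.Unmarshal(payload, &status)
	return status, err
}

func main() {
	// Invented payload, for illustration only.
	payload := []byte(`{
		"status": "success",
		"job_id": "demo-job",
		"original_url": "https://example.com/",
		"timestamp": "20240101000000",
		"duration_sec": 1.5,
		"resources": ["https://example.com/"]
	}`)

	status, err := decodeStatus(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(status.Status, status.JobID, len(status.Resources))
}
```

Fields missing from a payload keep their zero values, so callers should check Status before relying on Timestamp or Resources.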
func (Connector) GetUserStatus ¶
func (c Connector) GetUserStatus() (userStatus UserStatus, err error)
GetUserStatus retrieves the user status for a given SPN account.
type UserStatus ¶
type UserStatus struct {
	DailyCaptures      int `json:"daily_captures"`
	DailyCapturesLimit int `json:"daily_captures_limit"`
	Available          int `json:"available"`
	Processing         int `json:"processing"`
}
UserStatus represents the data returned by the /save/status/user endpoint.
func (*UserStatus) Update ¶
func (to *UserStatus) Update(from UserStatus)