screenshot

v0.5.1
Published: Dec 15, 2024 License: GPL-3.0 Imports: 24 Imported by: 1

README

ScreenShot


Purpose

Sometimes you don't want just standard web links in your web presentation but a preview image showing the page you're linking to. That is where this package comes in. It generates – by way of calling the external Chrome browser – an image of the web page a given URL addresses. Those image files are stored locally and may be used as often as you want without additional external network traffic.

Installation

You can use Go to install this package for you:

go get github.com/mwat56/screenshot

After that you can import it the usual Go way to use the library.

Usage

There are only two functions you have to worry about:

// `SetImageDir` sets the directory to use for storing the generated
// screenshot images.
//
// If `aDirectory` is empty or invalid the system's temp directory is used.
//
// `aDirectory` The directory to store the generated images.
func SetImageDir(aDirectory string) { ... }

This function should be called before any other one to make sure the generated screenshots end up where you want them to be. The default is the system's temp directory (e.g. /tmp under GNU/Linux).

To actually create the screenshot image you'd call:

// CreateImage generates an image of `aURL` and stores it in `ImageDir()`,
// returning the file name of the saved image or an error in case of problems.
//
//	`aURL` The address of the web page to process.
func CreateImage(aURL string) (string, error) { … }

The returned string is the name of the generated image file (without its path). If you combine it with the directory returned by ImageDir() you get the complete path/filename to locally access the image.

Generating a screenshot image usually takes between one and five seconds, depending on the web page in question; however, it can take considerably longer. To avoid hanging the program the CreateImage() function uses a timeout of about half a minute (32 seconds by default, see MaxProcessTime()).

Finally, not all web pages can be rendered properly and turned into an image. In case of errors (such as network errors or problems while storing the image file) CreateImage() returns an empty filename and an error.
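A minimal sketch of how a caller might handle both outcomes, assuming only the documented signature of CreateImage(). To keep it runnable without Chrome, the creator function is passed in as a parameter; `previewLink`, `fakeCreate` and the `/img/` URL prefix are hypothetical names used for illustration. In a real program you would pass screenshot.CreateImage and serve the files from ImageDir().

```go
package main

import (
	"errors"
	"fmt"
	"html"
)

// previewLink returns an HTML link for aURL. If aCreate succeeds the link
// wraps a preview image, otherwise it falls back to a plain text link.
// aCreate is expected to behave like screenshot.CreateImage.
func previewLink(aCreate func(string) (string, error), aURL string) string {
	safeURL := html.EscapeString(aURL)
	name, err := aCreate(aURL)
	if err != nil || name == "" {
		// No image available: degrade to a standard link.
		return fmt.Sprintf(`<a href="%s">%s</a>`, safeURL, safeURL)
	}
	// "/img/" is an assumed URL prefix mapped to the image directory.
	return fmt.Sprintf(`<a href="%s"><img src="/img/%s" alt="%s"></a>`,
		safeURL, html.EscapeString(name), safeURL)
}

// fakeCreate stands in for screenshot.CreateImage in this sketch.
func fakeCreate(aURL string) (string, error) {
	if aURL == "" {
		return "", errors.New("empty URL")
	}
	return "example.png", nil
}

func main() {
	fmt.Println(previewLink(fakeCreate, "https://example.com/"))
	fmt.Println(previewLink(fakeCreate, ""))
}
```

Falling back to a plain link means a failed or timed-out screenshot never breaks the page that embeds it.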

There are a couple more functions (mostly property GETters and SETters) which you will probably barely need; for details refer to the source code documentation.

Libraries

The Go library ChromeDP, which controls a headless instance of the Chrome browser, is required for this package to work. Under Linux this browser is usually part of your distribution (as chromium-browser).

To resize the screenshot if required by the ImageHeight()/ImageWidth() values, the golang.org/x/image/draw library must be part of your Go installation (if not, run: go get -u golang.org/x/image/draw).

Example

In the source code's sub-directory app/ there's a demo program (screenshot.go) that lets you generate a screenshot image of a URL given on the commandline.

To run it call e.g.

#> cd app
#> go build screenshot.go
#> ./screenshot

It will show you all available commandline options e.g.:

Usage: ./screenshot [OPTIONS]

-bc
	allow the browser to handle web cookies (default false)
-be
	skip sites with Certificate errors (default false)
-bm
	let browser emulate a mobile device (default false)
-bs
	let browser show scrollbars if available (default false)
-bt int
	max. time (seconds) allowed to process a single web page (default 32)
-ia
	accept the respective other image format (default true)
-id string
	directory for storing the screenshot image (default "/tmp")
-ih int
	max. height of the screenshot image (default 768)
-io
	overwrite an existing image (default false)
-iq int
	quality of the screenshot image (default 75)
-is float
	the browser's scale factor for the screenshot image (default 0.00)
-iw int
	max. width of the screenshot image (default 896)
-ja string
	name of text-file that contains sites better avoiding JavaScript
	(default "/home/matthias/devel/Go/src/github.com/mwat56/screenshot/app/hostsavoidjs.list")
-jn string
	name of text-file that contains sites needing JavaScript
	(default "/home/matthias/devel/Go/src/github.com/mwat56/screenshot/app/hostsneedjs.list")
-jp navigator.platform
	Identifier the JavaScript navigator.platform should use (default "Linux x86_64")
-js
	allow browser's use of JavaScript (default false)
-ju string
	description of the UserAgent the browser should report
	(default "Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0")
-u string
	(*required*) the URL for the browser's screenshot
-v	verbose (default false)

As the list shows, only the -u option is required.

You can use this program to generate screenshot images "by hand" and fiddle with the various commandline options to see what difference it makes if you change them.

History

Prior to this a few years back I wrote the pageview package which used the external wkhtmltoimage program; and in most cases it worked just fine. However, once in a while wkhtmltoimage produced a segmentation fault (core dumped) – reproducible. For a while I thought I could live with it, but over time it happened more often (i.e. with additional URLs). Fiddling around with various commandline options provided no improvement. In the end I started to look around, searching for alternative approaches – short of writing my own URL retrieval and rendering system. That's when I found ChromeDP and hence this package came into existence.

Licence

    Copyright © 2022, 2024  M.Watermann, 10247 Berlin, Germany
                    All rights reserved
                EMail : <support@mwat.de>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the GNU General Public License along with this program. If not, see the GNU General Public License for details.


Documentation

Overview

Package screenshot implements a web page link preview (snapshot image).

Constants

const (
	// Default `UserAgent` string:
	DefaultAgent = `Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0`
)

Variables

This section is empty.

Functions

func AcceptOther added in v0.2.0

func AcceptOther() bool

`AcceptOther()` returns whether to accept the respective other image format.

The CreateImage function checks whether a screenshot image already exists and, if so, doesn't create a new one. The filename extension (and its image format) is determined by the ImageQuality setting; see the comments on ImageType.

Now, assume the current ImageType is configured as `png` and CreateImage is called: to check whether a screenshot is already present it looks for the appropriate image file with a `png` extension. If it exists, no further work is done. However, if AcceptOther is `true` (i.e. the default), the other ImageType (`jpeg` in this example) is checked as well; if that file exists, no further work is done and CreateImage returns the already existing filename.

See also ImageOverwrite.

Returns:

  • `bool`: If `true` (i.e. the default) an existing screenshot image will satisfy.

func AvoidJSfile added in v0.4.0

func AvoidJSfile() string

`AvoidJSfile()` returns the name of the path/file containing hosts/domains where to avoid running JavaScript.

NOTE: This value is used only if the `JavaScript()` property is `true`.

Returns:

  • `string`: The path/filename of sites where to avoid JavaScript.

func CertErrors

func CertErrors() bool

`CertErrors()` returns whether to skip sites with certificate errors; it defaults to `false`, in which case such errors are ignored and the page is processed anyway.

Returns:

  • `bool`: Whether to skip a site with certificate errors.

func Cookies

func Cookies() bool

`Cookies()` returns whether to allow web cookies during page retrieval; defaults to `false` for safety and speed reasons.

Returns:

  • `bool`: Whether cookies will be available during page retrieval.

func CreateImage

func CreateImage(aURL string) (string, error)

`CreateImage()` generates an image of `aURL` and stores it in ImageDir, returning the file name of the saved image or an error in case of problems.

If the ImageAge or AcceptOther properties determine that the requested screenshot image already exists, this function does not create another screenshot but returns the existing filename. See also the comments on the AcceptOther function.

Parameters:

  • `aURL`: The address of the web page to process.

Returns:

  • `string`: The file name of the saved image.
  • `error`: A possible error during creation of the screenshot image.

func ImageAge

func ImageAge() int

`ImageAge()` returns the maximum age (in hours) of the locally stored screenshot images.

Returns:

  • `int`: The age a page image can have before requesting it again.

func ImageDir

func ImageDir() string

`ImageDir()` returns the directory to store the generated screenshot images.

Returns:

  • `string`: The directory to store the generated images.

func ImageHeight

func ImageHeight() int

`ImageHeight()` returns the max. height of the virtual screen used to render. The initial default value is `768`.

NOTE: This is the max. height of the screenshot. Depending on the actual web-site and its rendering by the used 'Chrome' instance the generated image's height could be less.

The value `0` (zero) renders the entire page top to bottom, calculating the actual height from the page content.

Returns:

  • `int`: The height of the images to generate.

func ImageOverwrite added in v0.3.0

func ImageOverwrite() bool

`ImageOverwrite()` returns whether an existing file should be overwritten.

By default (i.e. with this value `false`) CreateImage will not replace an already existing image file with a new screenshot. With this property set to `true` the `CreateImage()` function will overwrite any existing file regardless of e.g. age (see ImageAge) or quality (see ImageQuality).

Returns:

  • `bool`: Whether an existing file should be overwritten.

func ImageQuality

func ImageQuality() int

`ImageQuality()` returns the desired image quality.

Returns:

  • `int`: The desired image quality.

func ImageScale

func ImageScale() float64

`ImageScale()` returns the virtual browser's scale factor for the generated screenshot image.

Returns:

  • `float64`: The current scale factor used, `0` disables scaling.

func ImageType

func ImageType() string

`ImageType()` returns the type/format of the screenshot file generated.

NOTE: The image type/format depends on the given ImageQuality: `quality == 100` results in a `png` image, `quality < 100` results in a `jpeg` image.

NOTE: If the URL to shoot points to an image file (i.e. ".gif", ".jpeg", ".jpg", ".png", ".svg") the result of this function might be _wrong_ because the actually generated image depends on the type of the requested image.

Returns:

  • `string`: The image type to use when generating screenshots.
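The quality rule from the first NOTE above, written out as a small sketch. `imageTypeFor` is a hypothetical helper for illustration; the package itself may derive the type differently.

```go
package main

import "fmt"

// imageTypeFor mirrors the documented rule: quality 100 selects PNG,
// any lower quality selects JPEG.
func imageTypeFor(quality int) string {
	if quality == 100 {
		return "png"
	}
	return "jpeg"
}

func main() {
	for _, q := range []int{75, 99, 100} {
		fmt.Printf("quality %d -> %s\n", q, imageTypeFor(q))
	}
}
```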

func ImageWidth

func ImageWidth() int

`ImageWidth()` returns the width in pixels of the imaginary screen used to render. The default value is `896`.

NOTE: This is the max. width of the screenshot. Depending on the actual web-site and its rendering by the running 'Chrome' instance the generated image could be smaller.

Returns:

  • `int`: The width of the images to generate.

func JavaScript

func JavaScript() bool

`JavaScript()` returns whether to allow JavaScript during page retrieval; defaults to `false` for safety and speed reasons.

Returns:

  • `bool`: Whether JavaScript will be available during page retrieval.

func MaxProcessTime

func MaxProcessTime() int

`MaxProcessTime()` returns the timeout (in seconds) used to retrieve & render a requested web page. The initial default value is `32`.

Returns:

  • `int`: The max. seconds allowed to process a web page.

func Mobile

func Mobile() bool

`Mobile()` returns whether the virtual browser should emulate a mobile device.

Returns:

  • `bool`: Whether the virtual browser should emulate a mobile device.

func NeedJSfile added in v0.4.0

func NeedJSfile() string

`NeedJSfile()` returns the name of the path/file containing hosts/domains requiring JavaScript to be active/working.

NOTE: This value is used only if the JavaScript option is set `false`.

Returns:

  • `string`: The path/file with hosts/domains requiring JavaScript.
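Taken together, the NOTEs on AvoidJSfile and NeedJSfile suggest a per-host decision like the one sketched here. The function name `useJavaScript` and the use of in-memory maps instead of the list files are assumptions for illustration; the file format and host matching used by the package are not shown in this documentation.

```go
package main

import "fmt"

// useJavaScript decides whether JavaScript runs for host.
// With the global setting on, hosts in avoid switch it off;
// with the global setting off, hosts in need switch it on.
func useJavaScript(global bool, host string, avoid, need map[string]bool) bool {
	if global {
		return !avoid[host]
	}
	return need[host]
}

func main() {
	avoid := map[string]bool{"heavy.example": true}
	need := map[string]bool{"spa.example": true}

	fmt.Println(useJavaScript(true, "heavy.example", avoid, need))
	fmt.Println(useJavaScript(false, "spa.example", avoid, need))
	fmt.Println(useJavaScript(false, "plain.example", avoid, need))
}
```

Each list only matters for one value of the global setting, which is why the two NOTEs mention opposite values of the JavaScript option.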

func PathFile

func PathFile(aURL string) string

`PathFile()` returns the complete local path/file of `aURL`.

NOTE: This function does not check whether the image file for `aURL` actually exists in the local filesystem but just reports the default path-/filename computed by string operations.

Parameters:

  • `aURL`: The address of the web page to process.

Returns:

  • `string`: The path/file of the screenshot of `aURL`.

func Platform

func Platform() string

`Platform()` returns the text the JS `navigator.platform` should return.

NOTE: This value is used only if the JavaScript option is set `true`.

Returns:

  • `string`: The platform identifier to use with JavaScript.

func ReadWaitTime added in v0.1.2

func ReadWaitTime() int

`ReadWaitTime()` returns the number of minutes to wait before an Avoid/Need hosts file is re-read.

The initial default value is `1`.

Returns:

  • `int`: The number of minutes to wait.

func Scrollbars

func Scrollbars() bool

`Scrollbars()` returns whether the virtual browser will show scrollbars (if available in web-page).

Returns:

  • `bool`: Whether scrollbars should be enabled.

func SetAcceptOther added in v0.2.0

func SetAcceptOther(doUse bool)

`SetAcceptOther()` sets whether to respect the respective other image format.

(See comments to the AcceptOther function.)

Parameters:

  • `doUse`: If `true` (i.e. the default) an existing screenshot image of the "other" format will satisfy.

func SetAvoidJSfile added in v0.4.0

func SetAvoidJSfile(aFilename string)

`SetAvoidJSfile()` configures the name of the file containing hosts/domains where to avoid running JavaScript.

NOTE: This value is used only if the `JavaScript()` property is `true`. An invalid filename disables the feature.

Parameters:

  • `aFilename`: The path/filename of sites with JavaScript to avoid.

func SetCertErrors

func SetCertErrors(doIgnore bool)

`SetCertErrors()` determines whether to reject sites with certificate errors or process the respective page anyway.

Parameters:

  • `doIgnore`: If `false` (i.e. the default) all certificate errors will be ignored and web-sites will be processed regardless of such errors.

func SetCookies

func SetCookies(doAllow bool)

`SetCookies()` determines whether to allow web cookies during page retrieval or not.

Parameters:

  • `doAllow`: If `false` (i.e. the default) no cookies will be available during page retrieval, otherwise (i.e. `true`) they will be used.

func SetImageAge

func SetImageAge(aMaxAge int)

`SetImageAge()` sets the maximum age of locally stored screenshot images before they may get updated by a new call to `CreateImage(…)`.

Usually you'll want this property at its default value (`0`, zero) which disables an age check because usually you want an image of the page at the time you linked to it.

Parameters:

  • `aMaxAge`: The age (in hours) a page image can have before requesting it again.

func SetImageDir

func SetImageDir(aDirectory string)

`SetImageDir()` sets the directory to use for storing the generated screenshot images.

If `aDirectory` is empty or invalid the system's temp directory is used.

Parameters:

  • `aDirectory`: The directory to store the generated images.

func SetImageHeight

func SetImageHeight(aHeight int)

`SetImageHeight()` sets the height in pixels of the screenshot images to generate. The initial default value is `768`.

See comments of ImageHeight.

Setting this value to `0` will result in an image containing the whole web-page (which might be quite long); so the actual height of the generated screenshot would be unpredictable.

Parameters:

  • `aHeight`: The new height of the images to generate.

func SetImageOverwrite added in v0.3.0

func SetImageOverwrite(doAllow bool)

`SetImageOverwrite()` decides whether an existing file should be overwritten.

See comments of ImageOverwrite.

Parameters:

  • `doAllow`: Whether an existing file should be overwritten.

func SetImageQuality

func SetImageQuality(aQuality int)

`SetImageQuality()` changes the quality of the screenshot image to be generated. Values are supported between `1` and `100`; default is `75`.

Parameters:

  • `aQuality`: The new desired image quality.

func SetImageScale

func SetImageScale(aFactor float64)

`SetImageScale()` sets the virtual browser's scale factor for the generated screenshot image.

Parameters:

  • `aFactor`: The new scale factor; `0` disables scaling.

func SetImageWidth

func SetImageWidth(aWidth int)

`SetImageWidth()` sets the width of the images to generate. The initial default value is `896`.

See comments of ImageWidth.

Parameters:

  • `aWidth`: The new width of the images to generate.

func SetJavaScript

func SetJavaScript(doAllow bool)

`SetJavaScript()` determines whether to activate the JavaScript engine during page retrieval or not.

Parameters:

  • `doAllow`: If `false` (i.e. the default) no JavaScript will be available during page retrieval, otherwise (i.e. `true`) it will be activated.

func SetMaxProcessTime

func SetMaxProcessTime(aProcessTime int)

`SetMaxProcessTime()` sets the timeout used to retrieve & render a requested web page.

NOTE: An invalid (i.e. negative) value or `0` (zero) resets the timeout to its default of 32 seconds.

Parameters:

  • `aProcessTime`: The new max. seconds allowed to process a web page.

func SetMobile

func SetMobile(aMobile bool)

`SetMobile()` sets whether to emulate a mobile device. This includes viewport meta tag, overlay scrollbars, text autosizing and more.

Parameters:

  • `aMobile`: Whether the virtual browser should emulate a mobile device.

func SetNeedJSfile added in v0.4.0

func SetNeedJSfile(aFilename string)

`SetNeedJSfile()` configures the name of the file containing hosts/domains requiring JavaScript to be active/working.

NOTE: This value is used only if the JavaScript option is set `false`. An invalid filename disables the feature.

Parameters:

  • `aFilename`: The path/filename of sites with required JavaScript.

func SetPlatform

func SetPlatform(aPlatform string)

`SetPlatform()` sets the text the JS `navigator.platform` should return.

NOTE: This value is used only if the `JavaScript()` option is set `true`.

Parameters:

  • `aPlatform`: The platform identifier to use for `navigator.platform`.

func SetReadWaitTime added in v0.1.2

func SetReadWaitTime(aMinutes int)

`SetReadWaitTime()` sets the number of minutes to wait before an Avoid/Need hosts file is re-read.

Usually you'll want this property at its default value (`1`, one) which seems to be a reasonable compromise between batch processing (i.e. looping through a list of URLs to process) and mitigation of disk accesses. An invalid (i.e. negative) value or `0` (zero) resets this property to its default of `1` (one) minute.

Parameters:

  • `aMinutes`: The number of minutes to wait.

func SetScrollbars

func SetScrollbars(aScrollbar bool)

`SetScrollbars()` sets whether the virtual browser will show scrollbars (if available in web-page).

NOTE: This feature is currently considered EXPERIMENTAL and might not work as expected.

Parameters:

  • `aScrollbar`: Flag whether to show scrollbars (if available).

func SetUserAgent

func SetUserAgent(anAgent string)

`SetUserAgent()` changes the current `User Agent` setting to `anAgent`.

NOTE: This value is used by the virtual browser in its page requests (and showing up in the page provider's logfile); if the `JavaScript()` option is set `true` the JS-engine will return this value if requested.

An invalid (empty) value resets this property to its current default of `Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0`.

Parameters:

  • `anAgent`: The new `User Agent` setting.

func String

func String() string

`String()` returns a string of lines showing the currently configured screenshot options.

Returns:

  • `string`: A stringified representation of the current configuration.

func UserAgent

func UserAgent() string

`UserAgent()` returns the current `User Agent` setting.

NOTE: This value is used only if the `JavaScript()` option is set `true`.

Returns:

  • `string`: The current `User Agent` setting.

Types

type TScreenshotParams added in v0.4.2

type TScreenshotParams struct {
	// Flag whether to accept the respective other image format
	AcceptOther bool

	// Flag whether certificate errors should be processed.
	CertErrors bool

	// Dis-/Allow use of web cookies
	Cookies bool

	// Path/filename of a list of web hosts/domains where JavaScript
	// running should be avoided (defaults to a file in user's homedir).
	HostsAvoidJSfile string

	// Path/filename of a list of web hosts/domains where JavaScript
	// is required to work (defaults to a file in user's homedir).
	HostsNeedJSfile string

	// Max. age of cached page screenshot images (in hours).
	ImageAge int

	// Directory to store the generated screenshot images.
	ImageDir string

	// Max. height of the screenshot image to generate.
	ImageHeight int

	// Dis-/Allow to overwrite pre-existing screenshot files.
	ImageOverwrite bool

	// Quality (in percent) of the screenshot image to generate.
	ImageQuality int

	// The virtual browser's scale factor value.
	// 0 disables the override.
	ImageScale float64

	// Max. width of the screenshot image to generate.
	ImageWidth int

	// Flag whether to dis-/allow JavaScript in retrieved pages.
	JavaScript bool

	// Timeout (in seconds) for page processing.
	MaxProcessTime int

	// Flag whether to emulate a mobile device or not.
	// This includes viewport meta tag, overlay scrollbars, text
	// autosizing and more.
	Mobile bool

	// The identifier the JavaScript `navigator.platform` should return.
	Platform string

	// Flag whether to show the scraped web-page's scrollbars.
	Scrollbars bool

	// User Agent to use when querying external sites.
	UserAgent string
}

TScreenshotParams bundles all available configuration options so that they can be applied in a single call to its `Do()` method.

func Options

func Options() *TScreenshotParams

`Options()` returns the currently configured screenshot options.

Returns:

  • `*TScreenshotParams`: The currently configured screenshot options.

func (*TScreenshotParams) Do added in v0.4.2

`Do()` uses its options' values to configure the runtime options for taking screenshots.

NOTE: While it is perfectly legal (from Go's point of view) to omit the fields you don't care about, be aware that those missing fields will nevertheless be set by Go to the respective data type's zero value. Since there's no way to distinguish the automatically set zero value of a missing field from a user-provided value, you have to handle such a situation carefully.

If you want to set fewer than half of the available options, you might prefer calling the various `SetXxxx()` functions. If you want to set the majority of the options, provide the options you do not want to change with their existing values by calling the respective GETter function of the option in question, like:

myOptions := screenshot.Options()
// set fields …
myOptions.ImageHeight = myHeightValue
myOptions.ImageQuality = myQualityValue
// ...
// say, you don't want to change the width option
myOptions.ImageWidth = screenshot.ImageWidth()
// ...

myOptions.Do()
// continue with your program ...

Returns:

  • `*TScreenshotParams`: The currently configured screenshot options.
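The zero-value pitfall described in the NOTE can be reproduced with plain Go. The `params` struct below is a stand-in that mirrors three fields of TScreenshotParams, and `withHeight` is a hypothetical helper; the default values are the ones documented above.

```go
package main

import "fmt"

// params mirrors three fields of TScreenshotParams for illustration only.
type params struct {
	ImageHeight  int
	ImageQuality int
	ImageWidth   int
}

// withHeight returns a copy of base with only the height changed,
// so every other field keeps its existing value.
func withHeight(base params, height int) params {
	base.ImageHeight = height
	return base
}

func main() {
	// Only ImageHeight is given; Go fills the rest with zero values.
	fresh := params{ImageHeight: 600}
	fmt.Printf("%+v\n", fresh)

	// Starting from the current settings keeps the untouched fields.
	current := params{ImageHeight: 768, ImageQuality: 75, ImageWidth: 896}
	fmt.Printf("%+v\n", withHeight(current, 600))
}
```

The first struct would silently carry a quality and width of `0`, which is why starting from the values returned by `Options()` is the safer pattern.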
