safebrowsing

v0.0.0-...-58d2bdf
Published: Feb 27, 2020 | License: Apache-2.0 | Imports: 27 | Imported by: 0

README

Reference Implementation for the Usage of Google Safe Browsing APIs (v4)

The safebrowsing Go package can be used with the Google Safe Browsing APIs (v4) to access the Google Safe Browsing lists of unsafe web resources. Inside the cmd sub-directory, you can find two programs: sblookup and sbserver. The sbserver program runs a local proxy server for checking URLs, plus a URL redirector that sends users to a warning page for unsafe URLs. The sblookup program is a command-line tool that can also be used to check URLs.

This README.md is a quickstart guide on how to build, deploy, and use the safebrowsing Go package. It can be used out-of-the-box. The GoDoc and API documentation provide more details on fine tuning the parameters if desired.

Setup

To use the safebrowsing Go package you must obtain an API key from the Google Developer Console. For more information, see the Get Started section of the Google Safe Browsing APIs (v4) documentation.

How to Build

To download and install from the source, run the following command:

go get github.com/google/safebrowsing

The programs below execute from your $GOPATH/bin folder. Add that to your $PATH for convenience:

export PATH=$PATH:$GOPATH/bin

Proxy Server

The sbserver server binary runs a Safe Browsing API lookup proxy that allows users to check URLs via a simple JSON API.

  1. Once the Go environment is set up, run the following command with your API key:

    go get github.com/google/safebrowsing/cmd/sbserver
    sbserver -apikey $APIKEY
    

    With the default settings this will start a local server at 127.0.0.1:8080.

  2. The server also uses a URL redirector (listening on /r) to show an interstitial for anything marked unsafe.
    If the URL is safe, the client is automatically redirected to the target. Otherwise, an interstitial warning page is shown, as recommended by Safe Browsing.
    Try these URLs:

    127.0.0.1:8080/r?url=http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/
    127.0.0.1:8080/r?url=http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/SOCIAL_ENGINEERING/URL/
    127.0.0.1:8080/r?url=http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/UNWANTED_SOFTWARE/URL/
    127.0.0.1:8080/r?url=http://www.google.com/
    
  3. The server also has a lightweight implementation of the API v4 threatMatches endpoint.
    To use the local proxy server to check a URL, send a POST request to 127.0.0.1:8080/v4/threatMatches:find with the following JSON body:

    {
    	"threatInfo": {
    		"threatTypes":      ["UNWANTED_SOFTWARE", "MALWARE"],
    		"platformTypes":    ["ANY_PLATFORM"],
    		"threatEntryTypes": ["URL"],
    		"threatEntries": [
    			{"url": "google.com"},
    			{"url": "http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/"}
    		]
    	}
    }
    

    Refer to the Google Safe Browsing APIs (v4) for the format of the JSON request.
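
The request in step 3 can also be sent from Go. The following is a minimal sketch using only the standard library. It assumes sbserver is running on its default address and accepts an application/json request body; the struct and function names (threatEntry, threatInfo, findRequest, buildFindRequest) are made up for this illustration and are not part of the package. The response is decoded generically because its schema is not described in this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// These hypothetical types mirror the JSON body shown in step 3.
type threatEntry struct {
	URL string `json:"url"`
}

type threatInfo struct {
	ThreatTypes      []string      `json:"threatTypes"`
	PlatformTypes    []string      `json:"platformTypes"`
	ThreatEntryTypes []string      `json:"threatEntryTypes"`
	ThreatEntries    []threatEntry `json:"threatEntries"`
}

type findRequest struct {
	ThreatInfo threatInfo `json:"threatInfo"`
}

// buildFindRequest encodes a threatMatches:find body for the given URLs.
func buildFindRequest(urls []string) ([]byte, error) {
	entries := make([]threatEntry, 0, len(urls))
	for _, u := range urls {
		entries = append(entries, threatEntry{URL: u})
	}
	return json.Marshal(findRequest{ThreatInfo: threatInfo{
		ThreatTypes:      []string{"UNWANTED_SOFTWARE", "MALWARE"},
		PlatformTypes:    []string{"ANY_PLATFORM"},
		ThreatEntryTypes: []string{"URL"},
		ThreatEntries:    entries,
	}})
}

func main() {
	body, err := buildFindRequest([]string{
		"google.com",
		"http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/",
	})
	if err != nil {
		fmt.Println("encode:", err)
		return
	}

	// Assumption: sbserver is listening on its default address.
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Post("http://127.0.0.1:8080/v4/threatMatches:find",
		"application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println("request failed (is sbserver running?):", err)
		return
	}
	defer resp.Body.Close()

	// Decode into a generic map rather than guessing the response schema.
	var result map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(resp.Status, result)
}
```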

Command-Line Lookup

The sblookup command-line binary is another example of how the Go Safe Browsing library can be used to protect users from unsafe URLs. This command-line tool filters unsafe URLs piped via STDIN. Example usage:

$ go get github.com/google/safebrowsing/cmd/sblookup
$ echo "http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/" | sblookup -apikey=$APIKEY
  Unsafe URL found:  http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/ [{testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/ {MALWARE ANY_PLATFORM URL}}]

Safe Browsing System Test

To perform an end-to-end test on the package with the Safe Browsing backend, run the following command:

go test github.com/google/safebrowsing -v -run TestSafeBrowser -apikey $APIKEY

Documentation

Overview

Package safebrowsing implements a client for the Safe Browsing API v4.

API v4 emphasizes efficient usage of the network for bandwidth-constrained applications such as mobile devices. It achieves this by maintaining a small portion of the server state locally such that some queries can be answered immediately without any network requests. Fewer API calls mean less bandwidth is used.

At a high-level, the implementation does the following:

            hash(query)
                 |
            _____V_____
           |           | No
           | Database  |-----+
           |___________|     |
                 |           |
                 | Maybe?    |
            _____V_____      |
       Yes |           | No  V
     +-----|   Cache   |---->+
     |     |___________|     |
     |           |           |
     |           | Maybe?    |
     |      _____V_____      |
     V Yes |           | No  V
     +<----|    API    |---->+
     |     |___________|     |
     V                       V
(Yes, unsafe)            (No, safe)

Essentially, the query is presented to three major components: the database, the cache, and the API. Each of these may satisfy the query immediately, or may say that it does not know and that the query should be satisfied by the next component. The goal of the database and cache is to satisfy as many queries as possible to avoid using the API.

Starting with a user query, the query is hashed to preserve privacy regarding its exact nature. For example, if the query is for a URL, then this is the SHA256 hash of the URL in question.

Given a query hash, we first check the local database (which is periodically synced with the global Safe Browsing API servers). This database will either tell us that the query is definitely safe, or that it does not have enough information.

If we are unsure about the query, we check the local cache, which can be used to satisfy queries immediately if the same query had been made recently. The cache will tell us that the query is either safe, unsafe, or unknown (because it is not in the cache or the entry has expired).

If we are still unsure about the query, then we finally query the API server, which is guaranteed to return to us an authoritative answer, assuming no networking failures.
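
The flow described above can be sketched as a toy model. This is not the package's internal implementation: the checker type, the 4-byte hash-prefix set standing in for the database, the fixed 5-minute cache lifetime, and the api callback are all assumptions made for illustration. It only shows the order in which the three components are consulted and which of them may give a final answer.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"time"
)

type verdict int

const (
	unknown verdict = iota
	safe
	unsafe
)

func (v verdict) String() string { return [...]string{"unknown", "safe", "unsafe"}[v] }

type cacheEntry struct {
	v       verdict
	expires time.Time
}

// checker is a hypothetical stand-in for the database, cache, and API.
type checker struct {
	prefixes map[[4]byte]bool                  // "database": prefixes of listed hashes
	cache    map[[32]byte]cacheEntry           // "cache": recent full-hash answers
	api      func(hash [32]byte) (bool, error) // "API": true means unsafe
	ttl      time.Duration
}

func prefix(h [32]byte) (p [4]byte) {
	copy(p[:], h[:])
	return p
}

func newChecker(listed []string, api func([32]byte) (bool, error)) *checker {
	c := &checker{
		prefixes: make(map[[4]byte]bool),
		cache:    make(map[[32]byte]cacheEntry),
		api:      api,
		ttl:      5 * time.Minute, // arbitrary lifetime for this sketch
	}
	for _, s := range listed {
		c.prefixes[prefix(sha256.Sum256([]byte(s)))] = true
	}
	return c
}

// lookup returns the verdict and the name of the component that produced it.
func (c *checker) lookup(query string) (verdict, string, error) {
	h := sha256.Sum256([]byte(query))

	// Database: a miss means definitely safe; a hit only means "maybe".
	if !c.prefixes[prefix(h)] {
		return safe, "database", nil
	}
	// Cache: answers only if an unexpired entry exists.
	if e, ok := c.cache[h]; ok && time.Now().Before(e.expires) {
		return e.v, "cache", nil
	}
	// API: authoritative unless the call itself fails.
	isUnsafe, err := c.api(h)
	if err != nil {
		return unknown, "api", err
	}
	v := safe
	if isUnsafe {
		v = unsafe
	}
	c.cache[h] = cacheEntry{v: v, expires: time.Now().Add(c.ttl)}
	return v, "api", nil
}

func main() {
	listed := []string{"testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/"}
	// Pretend the server confirms every full-hash query.
	c := newChecker(listed, func([32]byte) (bool, error) { return true, nil })
	for _, q := range []string{"www.google.com/", listed[0], listed[0]} {
		v, by, err := c.lookup(q)
		fmt.Println(q, v, by, err)
	}
}
```

In this model the second lookup of the listed entry is answered by the cache, so the api callback runs only once.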

For more information, see the API developer's guide:

https://developers.google.com/safe-browsing/

Index

Constants

const (
	// DefaultServerURL is the default URL for the Safe Browsing API.
	DefaultServerURL = "safebrowsing.googleapis.com"

	// DefaultUpdatePeriod is the default period for how often SafeBrowser will
	// reload its blacklist database.
	DefaultUpdatePeriod = 30 * time.Minute

	// DefaultID and DefaultVersion are the default client ID and Version
	// strings to send with every API call.
	DefaultID      = "GoSafeBrowser"
	DefaultVersion = "1.0.0"

	// DefaultRequestTimeout is the default amount of time a single
	// api request can take.
	DefaultRequestTimeout = time.Minute
)
const (
	ThreatType_Malware                       = ThreatType(pb.ThreatType_MALWARE)
	ThreatType_SocialEngineering             = ThreatType(pb.ThreatType_SOCIAL_ENGINEERING)
	ThreatType_UnwantedSoftware              = ThreatType(pb.ThreatType_UNWANTED_SOFTWARE)
	ThreatType_PotentiallyHarmfulApplication = ThreatType(pb.ThreatType_POTENTIALLY_HARMFUL_APPLICATION)
)

List of ThreatType constants.

const (
	PlatformType_AnyPlatform  = PlatformType(pb.PlatformType_ANY_PLATFORM)
	PlatformType_AllPlatforms = PlatformType(pb.PlatformType_ALL_PLATFORMS)

	PlatformType_Windows = PlatformType(pb.PlatformType_WINDOWS)
	PlatformType_Linux   = PlatformType(pb.PlatformType_LINUX)
	PlatformType_Android = PlatformType(pb.PlatformType_ANDROID)
	PlatformType_OSX     = PlatformType(pb.PlatformType_OSX)
	PlatformType_iOS     = PlatformType(pb.PlatformType_IOS)
	PlatformType_Chrome  = PlatformType(pb.PlatformType_CHROME)
)

List of PlatformType constants.

const (
	ThreatEntryType_URL = ThreatEntryType(pb.ThreatEntryType_URL)

	// These below are not supported yet.
	ThreatEntryType_Executable = ThreatEntryType(pb.ThreatEntryType_EXECUTABLE)
	ThreatEntryType_IPRange    = ThreatEntryType(pb.ThreatEntryType_IP_RANGE)
)

List of ThreatEntryType constants.

Variables

DefaultThreatLists is the default list of threat lists that SafeBrowser will maintain. Do not modify this variable.

Functions

func GenerateLookupHosts

func GenerateLookupHosts(urlStr string) ([]string, error)

func ValidURL

func ValidURL(url string) bool

ValidURL parses the given string and returns true if it is a Safe Browsing compatible URL.

In general, clients can (and should) just call LookupURLs, which performs the same checks internally. This method can be useful when checking a batch of URLs, as the first parse failure will cause LookupURLs to stop processing the request and return an error.
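
A batch can be pre-filtered so that one malformed entry does not abort the whole lookup. The sketch below takes the validity check as a function argument so that it runs with the standard library alone; in real use one would presumably pass safebrowsing.ValidURL. The names partition and looksLikeURL are hypothetical, and looksLikeURL is only a rough stand-in that does not reproduce ValidURL's rules.

```go
package main

import (
	"fmt"
	"net/url"
)

// partition splits urls into those accepted by valid and those rejected,
// preserving input order in both results.
func partition(urls []string, valid func(string) bool) (ok, rejected []string) {
	for _, u := range urls {
		if valid(u) {
			ok = append(ok, u)
		} else {
			rejected = append(rejected, u)
		}
	}
	return ok, rejected
}

// looksLikeURL is a placeholder check used only to make this sketch runnable.
// It is NOT equivalent to safebrowsing.ValidURL.
func looksLikeURL(s string) bool {
	u, err := url.Parse(s)
	return err == nil && u.Host != ""
}

func main() {
	batch := []string{
		"http://www.google.com/",
		"http://[::1", // malformed: unterminated IPv6 literal
	}
	ok, rejected := partition(batch, looksLikeURL)
	fmt.Println("to look up:", ok)
	fmt.Println("skipped:", rejected)
}
```

Only the accepted slice would then be handed to LookupURLs, and the rejected entries can be reported separately.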

Types

type Config

type Config struct {
	// ServerURL is the URL for the Safe Browsing API server.
	// If empty, it defaults to DefaultServerURL.
	ServerURL string

	// ProxyURL is the URL of the proxy to use for all requests.
	// If empty, the underlying library uses $HTTP_PROXY environment variable.
	ProxyURL string

	// APIKey is the key used to authenticate with the Safe Browsing API
	// service. This field is required.
	APIKey string

	// ID and Version are client metadata associated with each API request to
	// identify the specific implementation of the client.
	// They are similar in usage to the "User-Agent" in an HTTP request.
	// If empty, these default to DefaultID and DefaultVersion, respectively.
	ID      string
	Version string

	// DBPath is a path to a persistent database file.
	// If empty, SafeBrowser operates in a non-persistent manner.
	// This means that blacklist results will not be cached beyond the lifetime
	// of the SafeBrowser object.
	DBPath string

	// UpdatePeriod determines how often we update the internal list database.
	// If zero value, it defaults to DefaultUpdatePeriod.
	UpdatePeriod time.Duration

	// ThreatLists determines which threat lists that SafeBrowser should
	// subscribe to. The threats reported by LookupURLs will only be ones that
	// are specified by this list.
	// If empty, it defaults to DefaultThreatLists.
	ThreatLists []ThreatDescriptor

	// RequestTimeout determines the timeout value for the http client.
	RequestTimeout time.Duration

	// Logger is an io.Writer that allows SafeBrowser to write debug information
	// intended for human consumption.
	// If empty, no logs will be written.
	Logger io.Writer
	// contains filtered or unexported fields
}

Config sets up the SafeBrowser object.

type PlatformType

type PlatformType uint16

PlatformType is an enumeration type for platform classes. Examples of platform classes are Windows, Linux, Android, etc.

func (PlatformType) String

func (pt PlatformType) String() string

type SafeBrowser

type SafeBrowser struct {
	// contains filtered or unexported fields
}

SafeBrowser is a client implementation of API v4.

It provides a set of lookup methods that allow the user to query whether certain entries are considered a threat. The implementation manages all of the local database and caching work that would normally be needed to interact with the API server.

func NewSafeBrowser

func NewSafeBrowser(conf Config) (*SafeBrowser, error)

NewSafeBrowser creates a new SafeBrowser.

The conf struct allows the user to configure many aspects of the SafeBrowser's operation.

func (*SafeBrowser) Close

func (sb *SafeBrowser) Close() error

Close cleans up all resources. This method must not be called concurrently with other lookup methods.

func (*SafeBrowser) LookupURLs

func (sb *SafeBrowser) LookupURLs(urls []string) (threats [][]URLThreat, err error)

LookupURLs looks up the provided URLs. It returns a list of threats, one for every URL requested, and an error if any occurred. It is safe to call this method concurrently.

The outer dimension is across all URLs requested, and will always have the same length as urls regardless of whether an error occurs or not. The inner dimension is across every fragment that a given URL produces. For some URL at index i, one can check for a hit on any blacklist by checking if len(threats[i]) > 0. The ThreatEntryType field in the inner ThreatDescriptor will be set to ThreatEntryType_URL as this is a URL lookup.

If an error occurs, the caller should treat the threats list returned as a best-effort response to the query. The results may be stale or be partial.
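
As an illustration of the len(threats[i]) > 0 check, the following sketch collects the URLs that had at least one hit. The helper name flagged is hypothetical. It is written generically (Go 1.18 or later) so it runs without the package; with this package the element type would be URLThreat, and the [][]string value in main is made-up data standing in for a real result.

```go
package main

import "fmt"

// flagged returns the URLs whose corresponding inner slice is non-empty.
// The bounds check is defensive; LookupURLs is documented to return a
// result with the same length as urls.
func flagged[T any](urls []string, threats [][]T) []string {
	var out []string
	for i, u := range urls {
		if i < len(threats) && len(threats[i]) > 0 {
			out = append(out, u)
		}
	}
	return out
}

func main() {
	urls := []string{
		"http://www.google.com/",
		"http://testsafebrowsing.appspot.com/apiv4/ANY_PLATFORM/MALWARE/URL/",
	}
	// Hypothetical results: no hits for the first URL, one for the second.
	threats := [][]string{nil, {"MALWARE"}}
	fmt.Println(flagged(urls, threats))
}
```

Because the returned list is best-effort when an error occurs, a caller may still want to run this check on the partial results before handling the error.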

func (*SafeBrowser) LookupURLsContext

func (sb *SafeBrowser) LookupURLsContext(ctx context.Context, urls []string) (threats [][]URLThreat, err error)

LookupURLsContext looks up the provided URLs. The request will be canceled if the provided Context is canceled, or if Config.RequestTimeout has elapsed. It is safe to call this method concurrently.

See LookupURLs for details on the returned results.

func (*SafeBrowser) Status

func (sb *SafeBrowser) Status() (Stats, error)

Status reports the status of SafeBrowser. It returns some statistics regarding the operation, and an error representing the status of its internal state. Most errors are transient and will recover themselves after some period.

func (*SafeBrowser) WaitUntilReady

func (sb *SafeBrowser) WaitUntilReady(ctx context.Context) error

WaitUntilReady blocks until the database is not in an error state. Returns nil when the database is ready. Returns an error if the provided context is canceled or if the SafeBrowser instance is Closed.
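
Since WaitUntilReady blocks, callers will usually want to bound the wait. A minimal sketch follows, assuming a small interface (readyWaiter, a name invented here) that *SafeBrowser would satisfy according to the signature above. fakeClient is a hypothetical stand-in so the example runs without an API key.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// readyWaiter is the single method this sketch relies on.
type readyWaiter interface {
	WaitUntilReady(ctx context.Context) error
}

// waitReady bounds the wait with a timeout so startup cannot block forever.
func waitReady(w readyWaiter, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return w.WaitUntilReady(ctx)
}

// fakeClient is a hypothetical client that becomes ready after a delay.
type fakeClient struct{ readyAfter time.Duration }

func (f fakeClient) WaitUntilReady(ctx context.Context) error {
	select {
	case <-time.After(f.readyAfter):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	// Ready well within the timeout.
	fmt.Println(waitReady(fakeClient{readyAfter: 10 * time.Millisecond}, time.Second))
	// Not ready in time: the context error is returned.
	fmt.Println(waitReady(fakeClient{readyAfter: time.Second}, 10*time.Millisecond))
}
```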

type Stats

type Stats struct {
	QueriesByDatabase int64         // Number of queries satisfied by the database alone
	QueriesByCache    int64         // Number of queries satisfied by the cache alone
	QueriesByAPI      int64         // Number of queries satisfied by an API call
	QueriesFail       int64         // Number of queries that could not be satisfied
	DatabaseUpdateLag time.Duration // Duration since last *missed* update. 0 if next update is in the future.
}

Stats records statistics regarding SafeBrowser's operation.
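
One way to read these counters is as a measure of how often the database and cache spared an API call. The helper below is a hypothetical sketch, not part of the package; it takes the three counters as plain int64 values so that it runs on its own, and it leaves QueriesFail out of the denominator.

```go
package main

import "fmt"

// localRatio returns the share of satisfied queries that were answered by
// the database or the cache, i.e. without an API call. It returns 0 when no
// queries have been satisfied yet.
func localRatio(byDatabase, byCache, byAPI int64) float64 {
	total := byDatabase + byCache + byAPI
	if total == 0 {
		return 0
	}
	return float64(byDatabase+byCache) / float64(total)
}

func main() {
	// Made-up counter values for illustration.
	fmt.Printf("%.2f\n", localRatio(900, 50, 50))
}
```

With a Stats value s returned by Status, the call would be localRatio(s.QueriesByDatabase, s.QueriesByCache, s.QueriesByAPI).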

type ThreatDescriptor

type ThreatDescriptor struct {
	ThreatType      ThreatType
	PlatformType    PlatformType
	ThreatEntryType ThreatEntryType
}

A ThreatDescriptor describes a given threat, which itself is composed of several parameters along different dimensions: ThreatType, PlatformType, and ThreatEntryType.

type ThreatEntryType

type ThreatEntryType uint16

ThreatEntryType is an enumeration type for threat entries. Examples of threat entries are URLs, binary digests, and IP address ranges.

func (ThreatEntryType) String

func (tet ThreatEntryType) String() string

type ThreatType

type ThreatType uint16

ThreatType is an enumeration type for threat classes. Examples of threat classes are malware, social engineering, etc.

func (ThreatType) String

func (tt ThreatType) String() string

type URLThreat

type URLThreat struct {
	Pattern string
	ThreatDescriptor
}

A URLThreat is a specialized ThreatDescriptor for the URL threat entry type.

Directories

Path Synopsis
cmd
sblookup
Command sblookup is a tool for looking up URLs via the command-line.
sbserver
Command sbserver is an application for serving URL lookups via a simple API.
sbserver/statik
Package statik contains static assets.
internal
