Documentation ¶
Overview ¶
Package crawler implements a STAC resource crawler.
Index ¶
- Variables
- func Crawl(resource string, visitor Visitor, options ...*Options) error
- func LinkTypeAnyJSON(link Link) bool
- func LinkTypeApplicationJSON(link Link) bool
- func LinkTypeGeoJSON(link Link) bool
- func LinkTypeNone(link Link) bool
- type Asset
- type Crawler
- type ErrorHandler
- type Handler
- type Link
- type LinkMatcher
- type Links
- type Options
- type Queue
- type Resource
- type ResourceInfo
- type ResourceType
- type Task
- type Visitor
Variables ¶
var DefaultOptions = &Options{
	ErrorHandler: func(err error) error { return err },
}
DefaultOptions is used when creating a new crawler if no options are provided.
var ErrStopRecursion = errors.New("stop recursion")
ErrStopRecursion is returned by the visitor when it wants to stop recursing.
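The following is a minimal, self-contained sketch of the sentinel-error pattern behind ErrStopRecursion. It does not use this package: `node`, `walk`, `visitedNames`, and the local `errStopRecursion` are hypothetical stand-ins, and the real crawler follows STAC links rather than an in-memory tree. The sketch only shows how a visitor can prune one branch without ending the whole traversal.

```go
package main

import (
	"errors"
	"fmt"
)

// errStopRecursion is a local stand-in for the package's ErrStopRecursion.
var errStopRecursion = errors.New("stop recursion")

// node is a hypothetical in-memory resource with child links.
type node struct {
	name     string
	children []*node
}

// walk calls visit for n and then for its descendants. If visit returns
// errStopRecursion, the children of n are skipped but the walk continues
// elsewhere. Any other error aborts the walk.
func walk(n *node, visit func(*node) error) error {
	if err := visit(n); err != nil {
		if errors.Is(err, errStopRecursion) {
			return nil
		}
		return err
	}
	for _, child := range n.children {
		if err := walk(child, visit); err != nil {
			return err
		}
	}
	return nil
}

// visitedNames walks a small fixed tree and records the visited names,
// pruning everything below the node named "collection-a".
func visitedNames() ([]string, error) {
	tree := &node{name: "catalog", children: []*node{
		{name: "collection-a", children: []*node{{name: "item-a1"}}},
		{name: "collection-b", children: []*node{{name: "item-b1"}}},
	}}
	var names []string
	err := walk(tree, func(n *node) error {
		names = append(names, n.name)
		if n.name == "collection-a" {
			return errStopRecursion
		}
		return nil
	})
	return names, err
}

func main() {
	names, err := visitedNames()
	fmt.Println(names, err)
}
```

In this sketch "item-a1" is never visited, while "collection-b" and "item-b1" still are.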
Functions ¶
func Crawl ¶ added in v0.11.0
Crawl calls the visitor for each resolved resource.
The resource can be a file path or a URL. Any error returned by visitor will stop crawling and be returned by this function. Context cancellation will also stop crawling and the context error will be returned.
This is a shorthand for calling New, Add, and Wait when you only need to crawl a single entry.
func LinkTypeAnyJSON ¶ added in v0.10.0
func LinkTypeApplicationJSON ¶ added in v0.10.0
func LinkTypeGeoJSON ¶ added in v0.10.0
func LinkTypeNone ¶ added in v0.10.0
Types ¶
type Asset ¶ added in v0.12.0
type Asset map[string]interface{}
Asset provides metadata about data for an item.
func (Asset) Description ¶ added in v0.12.0
Description returns the asset's description.
type Crawler ¶
type Crawler struct {
// contains filtered or unexported fields
}
Crawler crawls STAC resources.
func New ¶
New creates a crawler with the provided options (or DefaultOptions if none are provided).
The visitor will be called for each resource added and for every additional resource linked from the initial entry.
type ErrorHandler ¶ added in v0.11.0
ErrorHandler is called with any errors during a crawl. If the function returns nil, the crawl will continue. If the function returns an error, the crawl will stop.
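As a rough illustration of the continue-or-stop contract described above, the sketch below uses a local `errorHandler` type whose shape (`func(error) error`) is inferred from the DefaultOptions value. The `process` loop is a hypothetical stand-in for a crawl and is not part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// errorHandler mirrors the shape of the handler used in DefaultOptions.
type errorHandler func(error) error

// process is a hypothetical loop standing in for a crawl. Each failure is
// passed to handle: a nil result means "keep going", a non-nil result stops
// the loop and is returned. It reports how many locations were processed.
func process(locations []string, handle errorHandler) (int, error) {
	done := 0
	for _, loc := range locations {
		if loc == "" {
			if err := handle(errors.New("empty location")); err != nil {
				return done, err
			}
			continue
		}
		done++
	}
	return done, nil
}

func main() {
	locations := []string{"a.json", "", "b.json"}

	// Strict: return the error, as the default handler does.
	strict := func(err error) error { return err }
	n, err := process(locations, strict)
	fmt.Println(n, err)

	// Lenient: remember the error and continue.
	var seen []error
	lenient := func(err error) error {
		seen = append(seen, err)
		return nil
	}
	n, err = process(locations, lenient)
	fmt.Println(n, err, len(seen))
}
```

A lenient handler like the second one is useful when a few unreadable resources should not abort an otherwise long-running crawl.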
type LinkMatcher ¶ added in v0.10.0
type Options ¶
type Options struct {
	// Optional function to limit which resources to crawl. If provided, the function
	// will be called with the URL or absolute path to a resource before it is crawled.
	// If the function returns false, the resource will not be read and the visitor will
	// not be called.
	Filter func(string) bool

	// Optional function to handle any errors during the crawl. By default, any error
	// will stop the crawl. To continue crawling on error, provide a function that
	// returns nil. The special ErrStopRecursion will stop the crawler from recursing deeper
	// but will not stop the crawl altogether.
	ErrorHandler ErrorHandler

	// Optional queue to use for crawling tasks. If not provided, an in-memory queue
	// will be used. When running a crawl across multiple processes, it can be useful
	// to provide a queue that is shared across processes.
	Queue Queue
}
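A filter is just a `func(string) bool`, so it can be built and exercised without the crawler. The sketch below is one possible filter that restricts a crawl to a single host; `hostFilter` is a hypothetical helper, not part of this package, and "example.com" is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostFilter returns a function with the same shape as the Filter field.
// It accepts only locations that parse as URLs on the given host, so file
// paths and links to other hosts are rejected.
func hostFilter(host string) func(string) bool {
	return func(location string) bool {
		u, err := url.Parse(location)
		if err != nil {
			return false
		}
		return u.Host == host
	}
}

func main() {
	filter := hostFilter("example.com")
	for _, loc := range []string{
		"https://example.com/catalog.json",
		"https://other.example.org/item.json",
		"/data/catalog.json",
	} {
		fmt.Println(loc, filter(loc))
	}
}
```

A function like this could be assigned to the Filter field so that links leading off the chosen host are never read.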
Options for creating a crawler.
type Queue ¶ added in v0.15.0
func NewMemoryQueue ¶ added in v0.15.0
NewMemoryQueue is used if a custom queue is not provided for a crawl.
The crawl will stop if the provided context is cancelled. The limit is used to control the number of resources that will be visited concurrently.
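To make the role of a concurrency limit and a cancellable context more concrete, here is a small sketch of bounded concurrency using a buffered channel as a semaphore. This is not how NewMemoryQueue is implemented; `runLimited` and `peakForLimit` are hypothetical names, and the sleeping task stands in for visiting a resource.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// runLimited runs work once per task with at most limit calls in flight.
// It stops starting new tasks once ctx is cancelled. It returns the highest
// number of concurrent calls observed and the context error, if any.
func runLimited(ctx context.Context, limit int, tasks []string, work func(string)) (int, error) {
	sem := make(chan struct{}, limit) // one slot per concurrent call
	var wg sync.WaitGroup
	var mu sync.Mutex
	active, peak := 0, 0

	for _, task := range tasks {
		if err := ctx.Err(); err != nil {
			wg.Wait()
			return peak, err
		}
		select {
		case <-ctx.Done():
			wg.Wait()
			return peak, ctx.Err()
		case sem <- struct{}{}: // blocks while limit calls are running
		}
		wg.Add(1)
		go func(task string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			work(task)

			mu.Lock()
			active--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak, nil
}

// peakForLimit runs five short hypothetical tasks with the given limit.
func peakForLimit(limit int) (int, error) {
	tasks := []string{"a", "b", "c", "d", "e"}
	return runLimited(context.Background(), limit, tasks, func(string) {
		time.Sleep(5 * time.Millisecond)
	})
}

func main() {
	peak, err := peakForLimit(2)
	fmt.Println(peak <= 2, err)
}
```

Because a slot must be acquired before each goroutine starts, the observed peak can never exceed the limit.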
type Resource ¶
type Resource map[string]interface{}
Resource represents a STAC catalog, collection, or item.
func (Resource) ConformsTo ¶ added in v0.6.0
ConformsTo returns the STAC / OGC Features API conformance classes (if any).
func (Resource) Extensions ¶
Extensions returns the resource extension URLs.
func (Resource) Type ¶
func (r Resource) Type() ResourceType
Type returns the specific resource type.
type ResourceInfo ¶ added in v0.15.0
type ResourceInfo struct {
	// Location is the URL or file path of the resource.
	Location string

	// Entry is the URL or file path of the initial resource that was crawled and pointed to this resource.
	Entry string
}
ResourceInfo includes information about how the resource was accessed.
type ResourceType ¶
type ResourceType string
ResourceType indicates the STAC resource type.
const (
	Item       ResourceType = "item"
	Catalog    ResourceType = "catalog"
	Collection ResourceType = "collection"
)
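A visitor will often branch on the resource type. The sketch below redeclares `Resource`, `ResourceType`, and the three constants locally so that it compiles on its own. `typeOf` is a hypothetical classifier based on the "type" member used by STAC 1.0 documents ("Feature", "Catalog", "Collection"); the package's own Type method may apply different rules, for example for older documents that lack that member.

```go
package main

import "fmt"

// Local stand-ins that mirror the declarations documented above.
type ResourceType string

const (
	Item       ResourceType = "item"
	Catalog    ResourceType = "catalog"
	Collection ResourceType = "collection"
)

type Resource map[string]interface{}

// typeOf is a hypothetical classifier that looks only at the "type" member.
// It returns the empty ResourceType when the member is missing or unknown.
func typeOf(r Resource) ResourceType {
	switch r["type"] {
	case "Feature":
		return Item
	case "Catalog":
		return Catalog
	case "Collection":
		return Collection
	}
	return ""
}

// countByType tallies resources per type, as a visitor might do.
func countByType(resources []Resource) map[ResourceType]int {
	counts := map[ResourceType]int{}
	for _, r := range resources {
		counts[typeOf(r)]++
	}
	return counts
}

func main() {
	counts := countByType([]Resource{
		{"type": "Catalog", "id": "root"},
		{"type": "Collection", "id": "scenes"},
		{"type": "Feature", "id": "scene-1"},
		{"type": "Feature", "id": "scene-2"},
	})
	fmt.Println(counts[Catalog], counts[Collection], counts[Item])
}
```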
type Task ¶ added in v0.9.0
type Task struct {
// contains filtered or unexported fields
}
func (*Task) MarshalJSON ¶ added in v0.15.0
func (*Task) UnmarshalJSON ¶ added in v0.15.0
type Visitor ¶
type Visitor func(Resource, *ResourceInfo) error
Visitor is called for each resource during crawling.
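By default, any returned error will stop crawling and be returned by Crawl. Returning ErrStopRecursion stops the crawler from recursing deeper without stopping the crawl altogether.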