Documentation ¶
Overview ¶
Package biorxivcmd provides support for building command line tools that access api.biorxiv.org.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func OptionsForEndpoint ¶
func OptionsForEndpoint(cfg apicrawlcmd.Crawl[Service]) ([]operations.Option, error)
OptionsForEndpoint returns the operations.Option values derived from the apicrawlcmd configuration.
Types ¶
type Command ¶
type Command struct {
// contains filtered or unexported fields
}
Command implements the command line operations available for api.biorxiv.org.
func NewCommand ¶
func NewCommand(_ context.Context, crawl apicrawlcmd.Crawl[yaml.Node], cfs operations.FS, cacheRoot string, chkp checkpoint.Operation) (*Command, error)
NewCommand returns a new Command instance for the specified API crawl.
func (*Command) Crawl ¶
func (c *Command) Crawl(ctx context.Context, flags CrawlFlags) error
Crawl implements the crawl command. The crawl is incremental: it uses an internal state file to track progress, and a subsequent crawl restarts from the point the previous one reached. This makes it possible to specify a start date that predates the creation of biorxiv and an end date of 'now', with each incremental crawl picking up where the previous one left off. This assumes that biorxiv doesn't add new preprints with dates that predate the point the crawl has already reached.
func (*Command) LookupDownloaded ¶
LookupDownloaded looks up the specified preprints via their 'PreprintDOI', printing out fields using the specified template.
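As a rough illustration of template-driven output, the sketch below renders a record with Go's text/template. It assumes the template flag uses text/template syntax, which the '{{.}}' default suggests but this page does not confirm. The preprint struct, its Title field, the DOI value, and the render helper are hypothetical stand-ins, not the package's Preprint type.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// preprint is a hypothetical stand-in for the downloaded Preprint objects.
// Only PreprintDOI is named in the documentation; Title is an assumption.
type preprint struct {
	PreprintDOI string
	Title       string
}

// render parses tpl as a text/template and executes it against p.
func render(tpl string, p preprint) (string, error) {
	t, err := template.New("lookup").Parse(tpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	p := preprint{PreprintDOI: "10.1101/000000", Title: "An example preprint"}
	// "{{.}}" prints the whole value; the second template selects fields.
	for _, tpl := range []string{"{{.}}", "{{.PreprintDOI}}: {{.Title}}"} {
		out, err := render(tpl, p)
		if err != nil {
			panic(err)
		}
		os.Stdout.WriteString(out + "\n")
	}
}
```

The '{{.}}' form prints every field of the value, which is convenient for inspection, while a field-selecting template gives output suited to further processing.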
type CrawlFlags ¶
type CrawlFlags struct {
Restart bool `subcmd:"restart,false,'restart the crawl, ignoring the saved checkpoint'"`
}
type IndexFlags ¶
type IndexFlags struct{}
type LookupFlags ¶
type LookupFlags struct {
Template string `subcmd:"template,'{{.}}',template to use for printing fields in the downloaded Preprint objects"`
}
type ScanFlags ¶
type ScanFlags struct {
Template string `` /* 126-byte string literal not displayed */
}
type Service ¶
type Service struct {
ServiceURL string `yaml:"service_url" cmd:"rxiv service URL, eg. https://api.biorxiv.org/pubs/biorxiv for biorxiv"`
StartDate cmdyaml.FlexTime `yaml:"start_date" cmd:"start date for crawl, eg. 2020-01-01"`
EndDate cmdyaml.FlexTime `yaml:"end_date" cmd:"end date for crawl, eg. 2020-12-01"`
}
Service represents biorxiv specific configuration parameters.