bcrawler

command module

Version: v0.0.0-...-fa8d76c | Published: Mar 10, 2021 | License: Apache-2.0 | Imports: 15 | Imported by: 0

Repository

github.com/crawlerclub/bcrawler

README

bcrawler

A standalone crawling tool for extracting data from the web.

Usage

Usage of ./bcrawler:
  -alsologtostderr
    	log to standard error as well as files
  -conf string
    	dir for parsers conf (default "./conf")
  -dir string
    	the data dir (default "data")
  -log_dir string
    	If non-empty, write log files in this directory
  -logtostderr
    	log to standard error instead of files
  -q string
    	the queue dir (default "q")
  -sleep int
    	in seconds (default -1)
  -start string
    	the parser name for the start url (default "addr_year")

bcrawler expects a parsers directory under ./conf (the default value of -conf). This directory stores all the parser configurations in JSON format.
