sitemapper

package
v0.0.3 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 1, 2023 License: MIT Imports: 12 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Generate

func Generate(config *Config, cr *gocrawler.Client, elapsed time.Duration)

Generate produces a report in JSON format from the crawler client and config. The report contains the initial crawler info, the network info for each host visited, and the page info for each page visited, including every link found on the page that belongs to the same host as the seed URL.

func SameHostLinkExtractor

func SameHostLinkExtractor(c *gocrawler.Client, currLink string, resp []byte) []string

SameHostLinkExtractor extracts the links in resp that are on the same host as the current link, currLink.
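
The same-host filtering idea can be sketched with the standard library alone. This is not the package's implementation: sameHostLinks and hrefPattern are hypothetical names, the regular expression is a deliberately naive stand-in for a real HTML parser, and deduplication and fragment stripping are assumptions about what a crawler would want.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// hrefPattern is a naive matcher for double-quoted href attributes.
// A real extractor would use an HTML parser instead.
var hrefPattern = regexp.MustCompile(`href="([^"]+)"`)

// sameHostLinks is a hypothetical helper (not part of sitemapper).
// It resolves every href in body against currLink and keeps only the
// absolute URLs whose host matches the host of currLink.
func sameHostLinks(currLink string, body []byte) []string {
	base, err := url.Parse(currLink)
	if err != nil {
		return nil
	}
	var links []string
	seen := make(map[string]bool)
	for _, m := range hrefPattern.FindAllSubmatch(body, -1) {
		ref, err := url.Parse(string(m[1]))
		if err != nil {
			continue // skip hrefs that are not valid URLs
		}
		abs := base.ResolveReference(ref) // turn relative links into absolute ones
		abs.Fragment = ""                 // "#section" variants point at the same page
		if abs.Host != base.Host {
			continue // different host (or no host, e.g. mailto:)
		}
		s := abs.String()
		if !seen[s] {
			seen[s] = true
			links = append(links, s)
		}
	}
	return links
}

func main() {
	page := []byte(`<a href="/about">About</a> ` +
		`<a href="https://other.example/x">Elsewhere</a> ` +
		`<a href="docs/intro#top">Docs</a>`)
	for _, link := range sameHostLinks("https://example.com/blog/", page) {
		fmt.Println(link)
	}
}
```

The comparison uses url.URL.Host, so a different port or subdomain counts as a different host in this sketch. Whether SameHostLinkExtractor applies the same rule is not stated in this documentation.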

Types

type Config

type Config struct {
	gocrawler.Config
	ReportPath string
}

func SetupConfig

func SetupConfig() *Config

SetupConfig returns a Config that wraps gocrawler.Config and adds a ReportPath field, where users specify the path the report is exported to.

func (*Config) PrintConfig

func (c *Config) PrintConfig()

type ReportFormat

type ReportFormat struct {
	Seed      string  `json:"seed"`
	MaxRPS    float64 `json:"max_rps"`
	CrawlTime string  `json:"crawl_time"`

	VisitedNetInfo  map[string][]gocrawler.NetworkInfo `json:"network_info"`
	VisitedPageResp map[string]gocrawler.PageInfo      `json:"page_info"`
}
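
A consumer of the exported report could decode it with a struct that mirrors the JSON tags above. The sketch below is an illustration under assumptions: report and parseReport are hypothetical names, the sample document is made up, and the gocrawler.NetworkInfo and gocrawler.PageInfo values are kept as raw JSON because their fields are not described on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// report is a hypothetical mirror of ReportFormat that reuses its JSON tags.
// Nested gocrawler values stay as json.RawMessage since their layout is
// not documented here.
type report struct {
	Seed      string  `json:"seed"`
	MaxRPS    float64 `json:"max_rps"`
	CrawlTime string  `json:"crawl_time"`

	NetworkInfo map[string][]json.RawMessage `json:"network_info"`
	PageInfo    map[string]json.RawMessage   `json:"page_info"`
}

// parseReport decodes a JSON report into the mirror struct.
func parseReport(data []byte) (*report, error) {
	var r report
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	// Made-up sample: the map keys and the crawl_time format are assumptions.
	sample := []byte(`{
		"seed": "https://example.com",
		"max_rps": 2.5,
		"crawl_time": "1.2s",
		"network_info": {"example.com": [{}]},
		"page_info": {"https://example.com/": {}}
	}`)

	r, err := parseReport(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("seed:", r.Seed)
	fmt.Println("hosts with network info:", len(r.NetworkInfo))
	fmt.Println("pages with page info:", len(r.PageInfo))
}
```

Keeping the nested values as json.RawMessage lets the outer report be read without depending on gocrawler. Code that does import gocrawler could decode straight into ReportFormat instead.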
