weaver

package module
v0.3.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 23, 2024 License: MIT Imports: 15 Imported by: 0

README

Go Reference Go Report Card

Weaver

Weaver logo

weaver is a command-line tool for checking links on websites.

Old stories would tell how Weavers would kill each other over aesthetic disagreements, such as whether it was prettier to destroy an army of a thousand men or to leave it be, or whether a particular dandelion should or should not be plucked. For a Weaver, to think was to think aesthetically. To act—to Weave—was to bring about more pleasing patterns. They did not eat physical food: they seemed to subsist on the appreciation of beauty.
—China Miéville, “Perdido Street Station”

Here's how to install it:

go install github.com/bitfield/weaver/cmd/weaver@latest

To run it:

weaver https://example.com
Links: 2 (2 OK, 0 errors, 0 warnings) [1s]

Verbose mode

To see more information about what's going on, use the -v flag:

weaver -v https://example.com
[OKAY] https://example.com (200 OK) (referrer: START)
[OKAY] https://www.iana.org/domains/example (200 OK) (referrer: https://example.com)

Links: 2 (2 OK, 0 errors, 0 warnings) [800ms]

How it works

The program checks the status of the specified URL. If the server responds with an HTML page, the program will parse this page for links, and check each new link for its status.

If the link points to the same domain as the original URL, it is also parsed for further links, and so on recursively until all links on the site have been visited.

Any broken links will be reported, together with the referring page:

[DEAD] https://example.com/bogus (404 Not Found) (referrer: https://example.com/)

Rate limiting

The program attempts to continuously adapt its request rate to suit the server. On receiving a 429 Too Many Requests response, it will reduce the current request rate. After a while with no further 429 responses, it will steadily increase the rate until it trips the rate limit once again.

Even without receiving any 429 responses, the program limits itself to a maximum of 5 requests per second, to be respectful of server resources.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Main

func Main() int

Types

type AdaptiveRateLimiter added in v0.3.1

type AdaptiveRateLimiter struct {
	// contains filtered or unexported fields
}

func NewAdaptiveRateLimiter added in v0.3.1

func NewAdaptiveRateLimiter() *AdaptiveRateLimiter

func (*AdaptiveRateLimiter) GraduallyIncreaseRateLimit added in v0.3.1

func (a *AdaptiveRateLimiter) GraduallyIncreaseRateLimit() (increased bool)

func (AdaptiveRateLimiter) Limit added in v0.3.1

func (a AdaptiveRateLimiter) Limit() rate.Limit

func (*AdaptiveRateLimiter) ReduceLimit added in v0.3.1

func (a *AdaptiveRateLimiter) ReduceLimit()

func (AdaptiveRateLimiter) SetLimit added in v0.3.1

func (a AdaptiveRateLimiter) SetLimit(r rate.Limit)

func (*AdaptiveRateLimiter) Wait added in v0.3.1

func (a *AdaptiveRateLimiter) Wait(ctx context.Context)

type Checker

type Checker struct {
	Verbose    bool
	Output     io.Writer
	BaseURL    *url.URL
	HTTPClient *http.Client
	Limiter    *AdaptiveRateLimiter
	// contains filtered or unexported fields
}

func NewChecker

func NewChecker() *Checker

func (*Checker) Check

func (c *Checker) Check(ctx context.Context, site string)

func (*Checker) Crawl

func (c *Checker) Crawl(ctx context.Context, page *url.URL, referrer string)

func (*Checker) RecordResult

func (c *Checker) RecordResult(link, referrer string, err error, resp *http.Response)

func (*Checker) Results

func (c *Checker) Results() []Result

type Result

type Result struct {
	Link     string
	Status   Status
	Message  string
	Referrer string
}

func (Result) String

func (r Result) String() string

type Status

type Status string
const (
	StatusOK      Status = "OKAY"
	StatusWarning Status = "WARN"
	StatusError   Status = "DEAD"
	StatusSkipped Status = "SKIP"
)

func (Status) String

func (s Status) String() string

Directories

Path Synopsis
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL