aim

package module
v1.2.0
Published: Sep 4, 2021 License: Apache-2.0 Imports: 6 Imported by: 0

README

🔍 Aim - SEO Meta Crawler

An elegant scraper with sensible defaults for crawling websites and collecting SEO metrics.

Aim provides an interface for crawling websites built on Colly, the lightning-fast scraping framework for Gophers.

Features

🏃  Fast — uses Colly under the hood.

🦄  Easy to use — sensible defaults for most use cases.

🛠  Extensible — hooks make the crawled data available wherever you need it.

👀  Takes a close look — a list of everything aim can extract is available here

Installation

Add aim to your go.mod file:

module github.com/x/y

go 1.17

require (
        github.com/tim-richter/aim v1.2.0
)
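
Alternatively, let the Go tool resolve and record the version in go.mod for you:

go get github.com/tim-richter/aim@latest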

Example

package main

import (
    "fmt"

    "github.com/tim-richter/aim"
)

func main() {
    crawl := aim.NewCrawl(
        "https://test-domain.com",
        aim.MaxDepth(3),
        aim.Limit(aim.LimitRule{DomainGlob: "*", Parallelism: 2}))

    // OnResponse is called for every page the crawler visits.
    crawl.OnResponse = func(response aim.CrawlerResponse) {
        fmt.Printf("%+v\n", response)
    }

    crawl.Start()
}

Bugs

Bugs or suggestions? Visit the issue tracker

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	Domain         string
	UserAgent      string
	MaxDepth       int
	UseSitemap     bool
	AllowedDomains []string
	Limits         []colly.LimitRule
	OnResponse     func(response CrawlerResponse)
}

func NewCrawl

func NewCrawl(domain string, options ...CrawlerOption) *Crawler

func (*Crawler) Init

func (crawler *Crawler) Init()

func (*Crawler) Start

func (crawler *Crawler) Start() *Crawler

type CrawlerOption

type CrawlerOption func(*Crawler)

func AllowedDomains

func AllowedDomains(domains ...string) CrawlerOption

func Limit

func Limit(rule LimitRule) CrawlerOption

func MaxDepth

func MaxDepth(depth int) CrawlerOption

func UseSitemap

func UseSitemap(use bool) CrawlerOption

func UserAgent

func UserAgent(ua string) CrawlerOption
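
Options compose, so a crawler can be configured in a single NewCrawl call. A minimal sketch using the documented options; the URL and user-agent string below are placeholders, not values defined by this package:

crawl := aim.NewCrawl(
    "https://example.com",             // placeholder domain
    aim.UserAgent("my-seo-bot/1.0"),   // hypothetical user-agent string
    aim.AllowedDomains("example.com"),
    aim.MaxDepth(2),
    aim.UseSitemap(true),
)
crawl.Start()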

type CrawlerResponse

type CrawlerResponse struct {
	// contains filtered or unexported fields
}

type LimitRule

type LimitRule = colly.LimitRule
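
Because LimitRule is an alias for colly.LimitRule, any field Colly supports (DomainGlob, DomainRegexp, Parallelism, Delay, RandomDelay) can be set and passed in through the Limit option. A minimal sketch, assuming the placeholder domain from above:

rule := aim.LimitRule{
    DomainGlob:  "*example.com*", // apply the rule to matching domains
    Parallelism: 2,               // at most two concurrent requests
    Delay:       time.Second,     // wait between requests (needs the "time" import)
}
crawl := aim.NewCrawl("https://example.com", aim.Limit(rule))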

