crawler

package module
v0.0.4
Published: Nov 7, 2024 License: MIT Imports: 8 Imported by: 0

README

Crawler

This module crawls the given web page and sends a Page object on the channel configured for each response status code.

You can configure channels for the status codes you care about using the Channels map:

	chans := crawler.Channels{
		404: make(chan crawler.Page),
		200: make(chan crawler.Page),
	}

Documentation

Overview

Package crawler provides functionality to crawl a given website's pages

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Channels

type Channels map[int]chan Page

Channels is a map of Page channels keyed by HTTP response status code, so different behavior can be defined for different status codes

type Crawler

type Crawler struct {
	// contains filtered or unexported fields
}

Crawler is the main package object used to initialize the crawl

func NewCrawler

func NewCrawler(urlString string, chans Channels, parents bool) (*Crawler, error)

NewCrawler is the crawler initialization function

func (*Crawler) Run

func (c *Crawler) Run()

Run starts the crawl

type Page

type Page struct {
	Url  url.URL       // Page url
	Resp http.Response // Page response as returned from the GET request
	Body string        // Response body string
}

Page carries the scanned URL, the HTTP response, and the response body as a string
