crawldb

package
v0.0.0-...-3373bcc
Published: Sep 22, 2021 License: MIT Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CrawlDB

type CrawlDB struct {
	// contains filtered or unexported fields
}

CrawlDB is the main interface for persistent storage in OnionScan.

func (*CrawlDB) DeleteRelationship

func (cdb *CrawlDB) DeleteRelationship(onion string, from string, identiferType string, identifier string) error

DeleteRelationship deletes a relationship given the quad.

func (*CrawlDB) GetAllRelationshipsCount

func (cdb *CrawlDB) GetAllRelationshipsCount() int

GetAllRelationshipsCount returns the total number of relationships stored in the database.

func (*CrawlDB) GetCrawlRecord

func (cdb *CrawlDB) GetCrawlRecord(id int) (CrawlRecord, error)

GetCrawlRecord returns a CrawlRecord from the database given an ID.

func (*CrawlDB) GetRelationshipsCount

func (cdb *CrawlDB) GetRelationshipsCount(identifier string) int

GetRelationshipsCount returns the total number of relationships for a given identifier.

func (*CrawlDB) GetRelationshipsWithIdentifier

func (cdb *CrawlDB) GetRelationshipsWithIdentifier(identifier string) ([]Relationship, error)

GetRelationshipsWithIdentifier returns all relationships associated with a given identifier.

func (*CrawlDB) GetRelationshipsWithOnion

func (cdb *CrawlDB) GetRelationshipsWithOnion(onion string) ([]Relationship, error)

GetRelationshipsWithOnion returns all relationships with an Onion field matching the onion parameter.
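The two lookups above differ only in which field of a Relationship they match. The following is a minimal in-memory sketch of that idea, not the package's implementation: `withOnion`, `withIdentifier`, and `sampleRelationships` are hypothetical helpers, the sample values are made up, and the real methods query the database and also return an error.

```go
package main

import (
	"fmt"
	"time"
)

// Relationship mirrors the exported fields documented for the package type.
type Relationship struct {
	ID         int
	Onion      string
	From       string
	Type       string
	Identifier string
	FirstSeen  time.Time
	LastSeen   time.Time
}

// withOnion is a hypothetical helper: it keeps every relationship whose
// Onion field equals onion.
func withOnion(all []Relationship, onion string) []Relationship {
	var out []Relationship
	for _, r := range all {
		if r.Onion == onion {
			out = append(out, r)
		}
	}
	return out
}

// withIdentifier is a hypothetical helper: it keeps every relationship whose
// Identifier field equals identifier.
func withIdentifier(all []Relationship, identifier string) []Relationship {
	var out []Relationship
	for _, r := range all {
		if r.Identifier == identifier {
			out = append(out, r)
		}
	}
	return out
}

// sampleRelationships returns made-up data for illustration only.
func sampleRelationships() []Relationship {
	return []Relationship{
		{ID: 1, Onion: "aaa.onion", From: "crawl", Type: "email", Identifier: "admin@example.com"},
		{ID: 2, Onion: "aaa.onion", From: "crawl", Type: "bitcoin-address", Identifier: "1ExampleAddr"},
		{ID: 3, Onion: "bbb.onion", From: "crawl", Type: "email", Identifier: "admin@example.com"},
	}
}

func main() {
	all := sampleRelationships()
	fmt.Println("matching onion:", len(withOnion(all, "aaa.onion")))
	fmt.Println("matching identifier:", len(withIdentifier(all, "admin@example.com")))
}
```

In this sketch, a per-identifier count like the one GetRelationshipsCount reports is simply the length of the slice returned by `withIdentifier`.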

func (*CrawlDB) GetUserRelationshipFromOnion

func (cdb *CrawlDB) GetUserRelationshipFromOnion(identifier string, fromonion string) (map[string]Relationship, error)

GetUserRelationshipFromOnion reconstructs a user relationship from a given identifier. fromonion is used as a filter to ensure that only user relationships from a given onion are reconstructed.
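One way to picture the reconstruction described above is as a filter followed by a grouping step. The sketch below makes several assumptions that the documentation does not confirm: that the user identifier is stored in the Onion slot of each record, that the source onion is stored in From, and that the returned map is keyed by Type. `userRelationship` and `sampleUserRecords` are hypothetical names, and the Relationship type is trimmed to the fields the sketch needs.

```go
package main

import "fmt"

// Relationship is a trimmed copy of the documented type (timestamps and ID
// are omitted because this sketch does not use them).
type Relationship struct {
	Onion      string
	From       string
	Type       string
	Identifier string
}

// userRelationship is a hypothetical helper. It keeps the records filed
// under identifier (assumed to live in the Onion field) that came from
// fromonion (assumed to live in the From field) and keys them by Type.
func userRelationship(all []Relationship, identifier, fromonion string) map[string]Relationship {
	out := make(map[string]Relationship)
	for _, r := range all {
		if r.Onion == identifier && r.From == fromonion {
			out[r.Type] = r
		}
	}
	return out
}

// sampleUserRecords returns made-up data for illustration only.
func sampleUserRecords() []Relationship {
	return []Relationship{
		{Onion: "user-1", From: "aaa.onion", Type: "username", Identifier: "alice"},
		{Onion: "user-1", From: "aaa.onion", Type: "email", Identifier: "alice@example.com"},
		{Onion: "user-1", From: "bbb.onion", Type: "username", Identifier: "alice_b"},
		{Onion: "user-2", From: "aaa.onion", Type: "username", Identifier: "bob"},
	}
}

func main() {
	rel := userRelationship(sampleUserRecords(), "user-1", "aaa.onion")
	fmt.Println("fields recovered:", len(rel))
	fmt.Println("username:", rel["username"].Identifier)
}
```

Keying by Type means a later record of the same type overwrites an earlier one; whether the real method behaves that way is not stated in the documentation.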

func (*CrawlDB) HasCrawlRecord

func (cdb *CrawlDB) HasCrawlRecord(url string, duration time.Duration) (bool, int)

HasCrawlRecord returns true if the given URL is associated with a crawl record in the database. Only records whose timestamp falls inside the time window specified by duration are considered.
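The time-based check can be sketched in memory as follows. This is an illustration under stated assumptions, not the package's code: it treats the duration as a positive look-back window measured from a caller-supplied `now`, and it treats the int result as the id of the matching record. `hasRecent`, `sampleRecords`, and `day` are hypothetical names, and the real method may interpret the duration differently.

```go
package main

import (
	"fmt"
	"time"
)

// CrawlRecord is a trimmed copy of the documented type (Page is omitted).
type CrawlRecord struct {
	URL       string
	Timestamp time.Time
}

// day is a convenience constant for the demo.
const day = 24 * time.Hour

// hasRecent is a hypothetical helper. It reports whether records contains an
// entry for url whose Timestamp is later than now minus window, and returns
// the key of one such entry. It returns (false, 0) when nothing matches.
func hasRecent(records map[int]CrawlRecord, url string, window time.Duration, now time.Time) (bool, int) {
	cutoff := now.Add(-window)
	for id, rec := range records {
		if rec.URL == url && rec.Timestamp.After(cutoff) {
			return true, id
		}
	}
	return false, 0
}

// sampleRecords returns made-up records and a fixed "now" so the demo is
// deterministic.
func sampleRecords() (map[int]CrawlRecord, time.Time) {
	now := time.Date(2021, time.September, 22, 12, 0, 0, 0, time.UTC)
	records := map[int]CrawlRecord{
		1: {URL: "http://example.onion/", Timestamp: now.Add(-2 * day)},
		2: {URL: "http://example.onion/", Timestamp: now.Add(-time.Hour)},
	}
	return records, now
}

func main() {
	records, now := sampleRecords()
	found, id := hasRecent(records, "http://example.onion/", day, now)
	fmt.Println("found:", found, "id:", id)
}
```

Passing `now` explicitly keeps the check testable; a caller in production code would typically pass `time.Now()`.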

func (*CrawlDB) Initialize

func (cdb *CrawlDB) Initialize()

Initialize sets up a new database and should only be called when creating one. It builds a large number of indexes, which may seem like overkill, but on a large OnionScan run each index takes up less than 100MB, which is cheap compared with the search capability it provides.

func (*CrawlDB) InsertCrawlRecord

func (cdb *CrawlDB) InsertCrawlRecord(url string, page *model.Page) (int, error)

InsertCrawlRecord adds a new spider entry to the database and returns the record id.

func (*CrawlDB) InsertRelationship

func (cdb *CrawlDB) InsertRelationship(onion string, from string, identiferType string, identifier string) (int, error)

InsertRelationship creates a new Relationship in the database.
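InsertRelationship and DeleteRelationship both address a relationship by the same four values (onion, from, identifier type, identifier). The sketch below models that quad with an in-memory map; it is not how the package stores data. `memStore`, `quad`, and `errNotFound` are hypothetical, and two behaviours are assumptions: ids are handed out sequentially, and deleting a quad that is not present returns an error.

```go
package main

import (
	"errors"
	"fmt"
)

// quad holds the four values that identify a relationship in this sketch.
type quad struct {
	Onion, From, Type, Identifier string
}

// memStore is a hypothetical in-memory stand-in for the database.
type memStore struct {
	nextID int
	rows   map[int]quad
}

// errNotFound is returned by remove when no stored quad matches (assumed
// behaviour; the real method's error conditions are not documented).
var errNotFound = errors.New("relationship not found")

func newMemStore() *memStore {
	return &memStore{nextID: 1, rows: make(map[int]quad)}
}

// insert stores the quad and returns its newly assigned id.
func (s *memStore) insert(onion, from, identifierType, identifier string) int {
	id := s.nextID
	s.nextID++
	s.rows[id] = quad{onion, from, identifierType, identifier}
	return id
}

// remove deletes every stored row equal to the given quad.
func (s *memStore) remove(onion, from, identifierType, identifier string) error {
	target := quad{onion, from, identifierType, identifier}
	removed := false
	for id, q := range s.rows {
		if q == target {
			delete(s.rows, id) // deleting during range is safe in Go
			removed = true
		}
	}
	if !removed {
		return errNotFound
	}
	return nil
}

// count returns the number of stored relationships.
func (s *memStore) count() int { return len(s.rows) }

func main() {
	s := newMemStore()
	id := s.insert("aaa.onion", "crawl", "email", "admin@example.com")
	fmt.Println("inserted id:", id, "count:", s.count())
	err := s.remove("aaa.onion", "crawl", "email", "admin@example.com")
	fmt.Println("delete error:", err, "count:", s.count())
}
```

Because the quad is a comparable struct, equality on all four fields is a single `==`, which is why the delete path needs no id.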

func (*CrawlDB) NewDB

func (cdb *CrawlDB) NewDB(dbdir string)

NewDB creates a new CrawlDB instance. If the database does not exist at the given dbdir, it will be created.

type CrawlRecord

type CrawlRecord struct {
	URL       string
	Timestamp time.Time
	Page      model.Page
}

CrawlRecord defines a spider entry in the database.

type Relationship

type Relationship struct {
	ID         int
	Onion      string
	From       string
	Type       string
	Identifier string
	FirstSeen  time.Time
	LastSeen   time.Time
}

Relationship defines a correlation record in the database.
