Documentation ¶
Overview ¶
Package dor is a domain rank data collection library and fast HTTP service that shows a specified domain's rank from the following providers:
- Alexa
- Majestic
- Umbrella OpenDNS
- Open PageRank
- Tranco
- Quantcast
It can be used as a base for domain categorization, network filters, or suspicious domain detection. By default, the data is updated automatically by dor-insert once a day.
See service/dor-http/dor.go for an example of the Dor HTTP service and cmd/dor-insert/dor-insert.go for the data insertion script.
Client request example:
curl 127.0.0.1:8080/rank/github.com
Server response:
{
  "data": "github.com",
  "ranks": [
    {
      "domain": "github.com",
      "rank": 33,
      "date": "2018-01-11T18:01:27.251103268Z",
      "source": "majestic",
      "raw": "29,23,github.com,com,179825,518189,github.com,com,29,23,179994,518726"
    },
    {
      "domain": "github.com",
      "rank": 72,
      "date": "2018-01-11T18:04:26.267833256Z",
      "source": "alexa",
      "raw": ""
    },
    {
      "domain": "github.com",
      "rank": 2367,
      "last_update": "2018-01-11T18:06:50.866600102Z",
      "source": "umbrella",
      "raw": ""
    },
    {
      "domain": "github.com",
      "rank": 115,
      "last_update": "2018-03-27T17:01:13.535Z",
      "source": "pagerank",
      "raw": ""
    },
    {
      "domain": "github.com",
      "rank": 68,
      "last_update": "2018-03-27T17:01:13.535Z",
      "source": "tranco",
      "raw": ""
    },
    {
      "domain": "github.com",
      "rank": 114,
      "date": "2019-05-04T00:00:00Z",
      "source": "quantcast",
      "raw": ""
    }
  ],
  "timestamp": "2018-01-11T18:07:09.186271429Z"
}
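The request and response above could be consumed from Go roughly as follows. This is a minimal sketch, not part of the package: the names rankEntry, rankResponse, parseRanks and fetchRanks are hypothetical, and the base URL assumes the service listens on 127.0.0.1:8080 as in the curl example. Entries that carry a "last_update" key instead of "date" would decode with a zero Date in this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

// rankEntry and rankResponse are hypothetical local types that mirror
// the JSON keys shown in the sample response.
type rankEntry struct {
	Domain string    `json:"domain"`
	Rank   uint32    `json:"rank"`
	Date   time.Time `json:"date"`
	Source string    `json:"source"`
	Raw    string    `json:"raw"`
}

type rankResponse struct {
	Data      string      `json:"data"`
	Ranks     []rankEntry `json:"ranks"`
	Timestamp time.Time   `json:"timestamp"`
}

// parseRanks decodes a response body into rankResponse.
func parseRanks(body []byte) (*rankResponse, error) {
	var out rankResponse
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

// fetchRanks requests baseURL/rank/<domain> and decodes the JSON body.
func fetchRanks(client *http.Client, baseURL, domain string) (*rankResponse, error) {
	resp, err := client.Get(baseURL + "/rank/" + url.PathEscape(domain))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseRanks(body)
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	res, err := fetchRanks(client, "http://127.0.0.1:8080", "github.com")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	for _, r := range res.Ranks {
		fmt.Printf("%s: %d\n", r.Source, r.Rank)
	}
}
```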
Index ¶
- Variables
- type AlexaIngester
- type App
- type ClickhouseStorage
- type Entry
- type FindResponse
- type Ingester
- type IngesterConf
- type LookupMap
- type MajesticIngester
- type MemoryStorage
- type MongoStorage
- type PageRankIngester
- type QuantcastIngester
- type Storage
- type TrancoIngester
- type UmbrellaIngester
- type YandexRadarIngester
Variables ¶
var DefaultTTL = 30
DefaultTTL for records in days.
Types ¶
type AlexaIngester ¶
type AlexaIngester struct {
IngesterConf
}
AlexaIngester represents Ingester implementation for Alexa Top 1 Million websites
func (*AlexaIngester) Do ¶
func (in *AlexaIngester) Do() (chan *Entry, error)
Do implements Ingester Do func with the data from Alexa Top 1M CSV file
type App ¶
App represents Dor configuration options
func New ¶
New bootstraps App struct.
stn - storage name
stl - storage location string
keep - keep new data or overwrite old one (always false for MemoryStorage)
func (*App) FillByTimer ¶
FillByTimer combines filling and updating at a specified interval.
type ClickhouseStorage ¶
type ClickhouseStorage struct {
// contains filtered or unexported fields
}
ClickhouseStorage is a dor.Storage that uses Clickhouse database.
func NewClickhouseStorage ¶
func NewClickhouseStorage(location, table string, batch int) (*ClickhouseStorage, error)
NewClickhouseStorage bootstraps ClickhouseStorage.
func (*ClickhouseStorage) Get ¶
func (c *ClickhouseStorage) Get(d string, sources ...string) ([]*Entry, error)
Get ranks for specified domain and sources.
type Entry ¶
type Entry struct {
	Domain  string    `json:"domain" db:"domain" bson:"domain"`
	Rank    uint32    `json:"rank" db:"rank" bson:"rank"`
	Date    time.Time `json:"date" bson:"date"`
	Source  string    `json:"source" bson:"source"`
	RawData string    `json:"raw" bson:"raw"`
}
Entry is a SimpleRank with extended fields
type FindResponse ¶
type FindResponse struct {
	RequestData string    `json:"data"`
	Hits        []*Entry  `json:"ranks"`
	Timestamp   time.Time `json:"timestamp"`
}
FindResponse is a find request response.
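A caller that uses these ranks for categorization or suspicious domain detection will typically reduce the hits to a single signal. The sketch below picks the lowest (best) rank across providers and reports when a domain has no hits at all. The Entry and FindResponse types are local copies of the definitions above so that the example compiles on its own, and bestRank is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// Local copies of the package types, reduced to their JSON tags.
type Entry struct {
	Domain  string    `json:"domain"`
	Rank    uint32    `json:"rank"`
	Date    time.Time `json:"date"`
	Source  string    `json:"source"`
	RawData string    `json:"raw"`
}

type FindResponse struct {
	RequestData string    `json:"data"`
	Hits        []*Entry  `json:"ranks"`
	Timestamp   time.Time `json:"timestamp"`
}

// bestRank returns the lowest rank among the hits and the source that
// reported it. ok is false when the response contains no hits.
func bestRank(resp *FindResponse) (rank uint32, source string, ok bool) {
	for _, h := range resp.Hits {
		if h == nil {
			continue
		}
		if !ok || h.Rank < rank {
			rank, source, ok = h.Rank, h.Source, true
		}
	}
	return rank, source, ok
}

func main() {
	resp := &FindResponse{
		RequestData: "github.com",
		Hits: []*Entry{
			{Domain: "github.com", Rank: 72, Source: "alexa"},
			{Domain: "github.com", Rank: 33, Source: "majestic"},
			{Domain: "github.com", Rank: 2367, Source: "umbrella"},
		},
	}
	if rank, source, ok := bestRank(resp); ok {
		fmt.Println(source, rank) // prints "majestic 33"
	} else {
		fmt.Println("domain is not ranked by any provider")
	}
}
```

Whether an unranked domain should be treated as suspicious is a policy decision for the caller; absence from the lists only means the domain is not among the most popular ones.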
type Ingester ¶
type Ingester interface {
	Do() (chan *Entry, error) // returns a channel for consumers
	GetDesc() string          // simple getter for the source
}
Ingester fetches data and uploads it to the Storage
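To illustrate the shape of the interface, here is a sketch of a custom ingester that parses "rank,domain" lines from an in-memory string and streams them on a channel. The Entry and Ingester types are local copies of the definitions above; staticIngester and its fields are hypothetical and do not exist in the package. The built-in ingesters fetch their lists from the providers instead.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
	"time"
)

type Entry struct {
	Domain  string
	Rank    uint32
	Date    time.Time
	Source  string
	RawData string
}

type Ingester interface {
	Do() (chan *Entry, error)
	GetDesc() string
}

// staticIngester is a hypothetical ingester backed by a CSV string.
type staticIngester struct {
	desc string
	csv  string
}

func (s *staticIngester) GetDesc() string { return s.desc }

// Do parses the CSV in a goroutine and closes the channel when done.
// Malformed lines are skipped.
func (s *staticIngester) Do() (chan *Entry, error) {
	ch := make(chan *Entry)
	go func() {
		defer close(ch)
		now := time.Now().UTC()
		sc := bufio.NewScanner(strings.NewReader(s.csv))
		for sc.Scan() {
			parts := strings.SplitN(sc.Text(), ",", 2)
			if len(parts) != 2 {
				continue
			}
			rank, err := strconv.ParseUint(strings.TrimSpace(parts[0]), 10, 32)
			if err != nil {
				continue
			}
			ch <- &Entry{
				Domain:  strings.TrimSpace(parts[1]),
				Rank:    uint32(rank),
				Date:    now,
				Source:  s.desc,
				RawData: sc.Text(),
			}
		}
	}()
	return ch, nil
}

func main() {
	var in Ingester = &staticIngester{desc: "static", csv: "1,example.com\n2,github.com\n"}
	ch, err := in.Do()
	if err != nil {
		fmt.Println("ingest failed:", err)
		return
	}
	for e := range ch {
		fmt.Println(in.GetDesc(), e.Rank, e.Domain)
	}
}
```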
type IngesterConf ¶
IngesterConf represents a top popular domains provider configuration.
Ingesters implemented so far:
- Alexa Top 1 Million
- Majestic Top 1 Million
- Umbrella Top 1 Million
- PageRank Top 10 Million
- Tranco Top 1 Million
- Quantcast Top 1 Million
- Yandex Radar
func (*IngesterConf) GetDesc ¶
func (in *IngesterConf) GetDesc() string
GetDesc is a simple getter for a collection's description
type MajesticIngester ¶
type MajesticIngester struct {
	IngesterConf
	// contains filtered or unexported fields
}
MajesticIngester is a List implementation which downloads data and translates it to LookupMap
More info: https://blog.majestic.com/development/alexa-top-1-million-sites-retired-heres-majestic-million/
func (*MajesticIngester) Do ¶
func (in *MajesticIngester) Do() (chan *Entry, error)
Do implements Ingester interface with the data from Majestic CSV file
type MemoryStorage ¶
type MemoryStorage struct {
Maps map[string]*memoryCollection
}
MemoryStorage implements Storage interface as in-memory storage
func (*MemoryStorage) Get ¶
func (ms *MemoryStorage) Get(d string, sources ...string) ([]*Entry, error)
Get implements Get method of the Storage interface
type MongoStorage ¶
type MongoStorage struct {
// contains filtered or unexported fields
}
MongoStorage implements the Storage interface for MongoDB
func NewMongoStorage ¶
func NewMongoStorage(u string, db string, col string, size int, w int, ret bool) (*MongoStorage, error)
NewMongoStorage bootstraps MongoStorage, creates indexes
u is the Mongo URL
db is the database name
col is the collection name
size is the bulk message size
w is the number of workers
ret is the data retention option
func (*MongoStorage) Get ¶
func (m *MongoStorage) Get(d string, sources ...string) ([]*Entry, error)
Get implements Storage interface method Get
type PageRankIngester ¶
type PageRankIngester struct {
IngesterConf
}
PageRankIngester represents Ingester implementation for DomCop PageRank top 10M domains
func (*PageRankIngester) Do ¶
func (in *PageRankIngester) Do() (chan *Entry, error)
Do implements Ingester Do func with the data from DomCop
type QuantcastIngester ¶
type QuantcastIngester struct {
IngesterConf
}
QuantcastIngester represents Ingester implementation for Quantcast Top 1 Million websites.
func NewQuantcast ¶
func NewQuantcast() *QuantcastIngester
NewQuantcast bootstraps QuantcastIngester.
func (*QuantcastIngester) Do ¶
func (in *QuantcastIngester) Do() (chan *Entry, error)
Do gets the data from Quantcast Top 1M txt file.
type Storage ¶
type Storage interface {
	// Put is usually a bulk inserter that reads from the channel and works in a goroutine.
	// The second argument is the source of the data and the third is the last update time.
	Put(<-chan *Entry, string, time.Time) error
	// Get is a simple getter for the latest rank of the domain in a particular
	// domain rank provider, or in all of them if none is selected.
	Get(domain string, sources ...string) ([]*Entry, error)
}
Storage represents an interface to store and query ranks.
type TrancoIngester ¶
type TrancoIngester struct {
IngesterConf
}
TrancoIngester represents Ingester implementation for Tranco Top 1 Million websites. About: https://tranco-list.eu/
func (*TrancoIngester) Do ¶
func (in *TrancoIngester) Do() (chan *Entry, error)
Do implements Ingester Do func with the data from Tranco Top 1M CSV file
type UmbrellaIngester ¶
type UmbrellaIngester struct {
IngesterConf
}
UmbrellaIngester represents Ingester implementation for OpenDNS Umbrella Top 1M domains
More info: https://umbrella.cisco.com/blog/2016/12/14/cisco-umbrella-1-million/
func (*UmbrellaIngester) Do ¶
func (in *UmbrellaIngester) Do() (chan *Entry, error)
Do implements Ingester Do func with the data from OpenDNS
type YandexRadarIngester ¶ added in v2.5.0
type YandexRadarIngester struct {
IngesterConf
}
YandexRadarIngester represents Ingester implementation for Yandex Radar.
func NewYandexRadar ¶ added in v2.5.0
func NewYandexRadar() *YandexRadarIngester
NewYandexRadar bootstraps YandexRadarIngester.
func (*YandexRadarIngester) Do ¶ added in v2.5.0
func (in *YandexRadarIngester) Do() (chan *Entry, error)
Do implements Ingester Do func with the data from Yandex Radar.