dht

v1.3.5
Published: Apr 10, 2022 License: MIT Imports: 27 Imported by: 2

README

What's new

  • ✅ Updated all module dependencies to their latest versions
  • ✅ Updated to Go 1.18
  • ✅ Added config.LocalNodeId
  • ✅ 45396 DHT tracker server IPs are now built in, so the node joins the DHT network at high speed
  • ✅ Rich annotations
  • ✅ Friendly UML diagram rendering
  • ✅ In China, please use a VPN to get past the GFW
  • ✅ Fixed stuttering at startup
  • ✅ Fixed a bug where a task ran only once; it now runs on a 30-second tick
  • ✅ Fixed handling of a changed public IP: all blacklisted IPs are cleared and the node rejoins

Introduction

DHT implements the BitTorrent DHT protocol in Go.

It has two modes: standard mode and crawling mode. Standard mode follows the BEPs, so you can use it as a standard DHT server. Crawling mode aims to collect as much metadata as possible and does not follow the standard BEPs. With crawling mode, you can build another BTDigg.

bthub.io is a BT search engine based on the crawling mode.

Installation


go get -u github.com/hktalent/dht@latest

Example

Below is a simple spider. More samples are in the sample directory.

cd sample/spider
go build spider.go
# your Elasticsearch is http://127.0.0.1:9200/dht_index
./spider -resUrl="http://127.0.0.1:9200/dht_index/_doc/" -address=":0"
# quote the URLs so the shell does not treat "&" as a background operator
open "http://127.0.0.1:9200/dht_index/_search?q=GB%20and%20mp4&pretty=true"
open "http://127.0.0.1:9200/dht_index/_search?q=1080P%20GB%20and%20mp4&pretty=true"
open "http://127.0.0.1:9200/dht_index/_search?q=pentest%20pdf&pretty=true"

package main

import (
    "fmt"

    "github.com/hktalent/dht"
)

func main() {
    // NewWire takes blackListSize, requestQueueSize and workerQueueSize;
    // the values here are illustrative.
    downloader := dht.NewWire(65536, 1024, 256)
    go func() {
        // print each metadata result as it arrives
        for resp := range downloader.Response() {
            fmt.Println(resp.InfoHash, resp.MetadataInfo)
        }
    }()
    go downloader.Run()

    config := dht.NewCrawlConfig()
    config.OnAnnouncePeer = func(infoHash, ip string, port int) {
        // request to download the metadata info
        downloader.Request([]byte(infoHash), ip, port)
    }
    d := dht.New(config)

    // Run blocks while the node is running
    d.Run()
}

Download

A compiled demo binary is available for download.

Note

  • The default crawl-mode configuration uses about 300 MB of RAM. Adjust MaxNodes and BlackListMaxSize to fit your machine.
  • It currently can't run inside a LAN because of NAT.

TODO

  • ✅ NAT Traversal.
  • ✅ Implements the full BEP-3.
  • ✅ Optimization.

FAQ

Why is it slow compared to other spiders?

There may be several reasons.

  • DHT aims to implement the standard BitTorrent DHT protocol; it was not built for crawling the DHT network.
  • NAT traversal issues, if you run the crawler in a local network.
  • It blocks IPs that look bad, so a good IP may be misjudged.

License

MIT

Documentation

Overview

Package dht implements the bittorrent dht protocol. For more information see http://www.bittorrent.org/beps/bep_0005.html.

Constants

const (
	// StandardMode follows the standard protocol
	StandardMode = iota
	// CrawlMode is for crawling the dht network. Its value is 1.
	CrawlMode
)
const (
	// REQUEST represents request message type
	REQUEST = iota
	// DATA represents data message type
	DATA
	// REJECT represents reject message type
	REJECT
)
const (
	// BLOCK is 2 ^ 14
	BLOCK = 16384
	// MaxMetadataSize represents the max metadata size it can accept
	MaxMetadataSize = BLOCK * 1000
	// EXTENDED represents it is an extended message
	EXTENDED = 20
	// HANDSHAKE represents handshake bit
	HANDSHAKE = 0
)
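
The constants above tie together: in the BEP 9 metadata exchange, metadata travels in 16 KiB pieces, which matches BLOCK, and MaxMetadataSize caps accepted metadata at 1000 such pieces. A minimal sketch of that arithmetic follows. pieceCount is a hypothetical helper, not part of this package, and the constants are copied locally so the snippet stands alone.

```go
package main

import "fmt"

const (
	block           = 16384        // mirrors BLOCK (2^14 bytes)
	maxMetadataSize = block * 1000 // mirrors MaxMetadataSize
)

// pieceCount returns how many block-sized pieces are needed to carry
// size bytes of metadata, rounding up. It returns 0 for sizes that are
// non-positive or larger than maxMetadataSize.
func pieceCount(size int) int {
	if size <= 0 || size > maxMetadataSize {
		return 0
	}
	return (size + block - 1) / block
}

func main() {
	for _, size := range []int{1, block, block + 1, maxMetadataSize} {
		fmt.Printf("%d bytes -> %d piece(s)\n", size, pieceCount(size))
	}
}
```

Rejecting oversized values before allocating buffers is the usual reason for a cap like MaxMetadataSize; how the package itself applies the cap is not shown in this documentation.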

Variables

var (
	// ErrNotReady is the error when DHT is not initialized.
	ErrNotReady = errors.New("dht is not ready")
	// ErrOnGetPeersResponseNotSet is the error returned when
	// config.OnGetPeersResponse is not set and dht.GetPeers is called.
	ErrOnGetPeersResponseNotSet = errors.New("OnGetPeersResponse is not set")
	// ErrOnAnnouncePeerNotSet reports that config.OnAnnouncePeer is not set.
	ErrOnAnnouncePeerNotSet     = errors.New("OnAnnouncePeer is not set")
)
var (
	LocalNodeId = hex.EncodeToString([]byte("https://ee.51pwn.com"))[:20]
)
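
To make the LocalNodeId expression concrete, the sketch below evaluates it with the standard library only. defaultNodeID is a hypothetical name used here for illustration. Because each byte becomes two hex characters, only the first 10 bytes of the seed ("https://ee") end up in the 20-character result.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// defaultNodeID reproduces the expression used for LocalNodeId: hex-encode
// the seed string and keep the first 20 characters. It panics if the seed
// is shorter than 10 bytes, just as the plain slice expression would.
func defaultNodeID(seed string) string {
	return hex.EncodeToString([]byte(seed))[:20]
}

func main() {
	id := defaultNodeID("https://ee.51pwn.com")
	fmt.Println(id, len(id)) // prints "68747470733a2f2f6565 20"
}
```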

Functions

func Decode

func Decode(data []byte) (result interface{}, err error)

Decode decodes a bencoded string to string, int, list or map.

func DecodeDict

func DecodeDict(data []byte, start int) (
	result interface{}, index int, err error)

DecodeDict decodes a map value.

func DecodeInt

func DecodeInt(data []byte, start int) (
	result interface{}, index int, err error)

DecodeInt decodes an int value in the data.

func DecodeList

func DecodeList(data []byte, start int) (
	result interface{}, index int, err error)

DecodeList decodes a list value.

func DecodeString

func DecodeString(data []byte, start int) (
	result interface{}, index int, err error)

DecodeString decodes a string in the data. It returns a tuple (decoded result, the end position, error).
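
The (result, end position, error) shape lets a caller walk through a buffer one value at a time. The sketch below is an independent, standard-library illustration of decoding a bencoded string in that style; decodeString is a hypothetical function, and treating the returned index as "first byte after the string" is an assumption that may differ from how this package counts.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// decodeString reads one bencoded string ("<length>:<bytes>") from data,
// beginning at start. It returns the decoded string and the index of the
// first byte after it.
func decodeString(data []byte, start int) (string, int, error) {
	if start < 0 || start >= len(data) {
		return "", start, errors.New("start out of range")
	}
	// Find the colon that ends the decimal length prefix.
	colon := start
	for colon < len(data) && data[colon] != ':' {
		colon++
	}
	if colon == len(data) {
		return "", start, errors.New("missing ':' after length")
	}
	n, err := strconv.Atoi(string(data[start:colon]))
	if err != nil || n < 0 {
		return "", start, errors.New("invalid length prefix")
	}
	if n > len(data)-colon-1 {
		return "", start, errors.New("string runs past end of data")
	}
	end := colon + 1 + n
	return string(data[colon+1 : end]), end, nil
}

func main() {
	s, next, err := decodeString([]byte("4:spam3:egg"), 0)
	fmt.Println(s, next, err) // prints "spam 6 <nil>"
}
```

Passing the returned index back in as the next start decodes the following value, which is how list and dict decoders are typically built on top of a primitive like this.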

func Encode

func Encode(data interface{}) string

Encode encodes a string, int, dict or list value to a bencoded string.

func EncodeDict

func EncodeDict(data map[string]interface{}) string

EncodeDict encodes a dict value.

func EncodeInt

func EncodeInt(data int) string

EncodeInt encodes an int value.

func EncodeList

func EncodeList(data []interface{}) string

EncodeList encodes a list value.

func EncodeString

func EncodeString(data string) string

EncodeString encodes a string value.
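
For readers unfamiliar with bencoding, the following standard-library sketch shows the four encodings the functions above cover. It is an independent illustration, not this package's implementation; encode is a hypothetical name, and returning an empty string for unsupported types is a choice made only for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// encode bencodes a string, int, []interface{} or map[string]interface{}.
// Unsupported types produce an empty string in this sketch.
func encode(v interface{}) string {
	switch x := v.(type) {
	case string:
		return strconv.Itoa(len(x)) + ":" + x
	case int:
		return "i" + strconv.Itoa(x) + "e"
	case []interface{}:
		var b strings.Builder
		b.WriteByte('l')
		for _, item := range x {
			b.WriteString(encode(item))
		}
		b.WriteByte('e')
		return b.String()
	case map[string]interface{}:
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // bencode requires dictionary keys in sorted order
		var b strings.Builder
		b.WriteByte('d')
		for _, k := range keys {
			b.WriteString(encode(k))
			b.WriteString(encode(x[k]))
		}
		b.WriteByte('e')
		return b.String()
	}
	return ""
}

func main() {
	// A BEP 5 style ping query with a placeholder 20-byte node ID.
	query := map[string]interface{}{
		"t": "aa",
		"y": "q",
		"q": "ping",
		"a": map[string]interface{}{"id": "abcdefghij0123456789"},
	}
	fmt.Println(encode(query))
	// prints "d1:ad2:id20:abcdefghij0123456789e1:q4:ping1:t2:aa1:y1:qe"
}
```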

func Log added in v1.3.0

func Log(a ...interface{})

Log handles log output.

func ParseKey

func ParseKey(data map[string]interface{}, key string, t string) error

ParseKey parses the key in dict data. `t` is type of the keyed value. It's one of "int", "string", "map", "list".

func ParseKeys

func ParseKeys(data map[string]interface{}, pairs [][]string) error

ParseKeys parses keys. It just wraps ParseKey.
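
As an illustration of this kind of validation, the sketch below checks that a decoded message contains the expected keys with the expected types. It is written from the descriptions above, not from the package source: checkKey and checkKeys are hypothetical names, the {key, type} layout of each pair is an assumption, and so is the use of Go's int for bencoded integers.

```go
package main

import (
	"errors"
	"fmt"
)

// checkKey reports an error if key is missing from data or its value does
// not have the type named by t ("int", "string", "map" or "list").
func checkKey(data map[string]interface{}, key, t string) error {
	val, ok := data[key]
	if !ok {
		return errors.New("missing key: " + key)
	}
	switch t {
	case "int":
		_, ok = val.(int)
	case "string":
		_, ok = val.(string)
	case "map":
		_, ok = val.(map[string]interface{})
	case "list":
		_, ok = val.([]interface{})
	default:
		return errors.New("unknown type name: " + t)
	}
	if !ok {
		return errors.New("wrong type for key: " + key)
	}
	return nil
}

// checkKeys applies checkKey to each {key, type} pair and stops at the
// first failure.
func checkKeys(data map[string]interface{}, pairs [][]string) error {
	for _, p := range pairs {
		if len(p) != 2 {
			return errors.New("each pair must be {key, type}")
		}
		if err := checkKey(data, p[0], p[1]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	msg := map[string]interface{}{"t": "aa", "y": "q", "a": map[string]interface{}{}}
	fmt.Println(checkKeys(msg, [][]string{{"t", "string"}, {"y", "string"}, {"a", "map"}}))
	fmt.Println(checkKeys(msg, [][]string{{"q", "string"}}))
}
```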

Types

type Config

type Config struct {
	// local node ID
	LocalNodeId string
	// in mainline dht, k = 8
	K int
	// for crawling mode, we put all nodes in one bucket, so KBucketSize may
	// not be K
	KBucketSize int
	// candidates are udp, udp4, udp6
	Network string
	// format is `ip:port`
	Address string
	// the prime nodes through which we can join the dht network
	PrimeNodes []string
	// the kbucket expired duration
	KBucketExpiredAfter time.Duration
	// the node expired duration
	NodeExpriedAfter time.Duration
	// how often it checks whether a kbucket has expired
	CheckKBucketPeriod time.Duration
	// peer token expired duration
	TokenExpiredAfter time.Duration
	// the max transaction id
	MaxTransactionCursor uint64
	// how many nodes routing table can hold
	MaxNodes int
	// callback when got get_peers request
	OnGetPeers func(string, string, int)
	// callback when receive get_peers response
	OnGetPeersResponse func(string, *Peer)
	// callback when got announce_peer request
	OnAnnouncePeer func(string, string, int)
	// blocked IPs
	BlockedIPs []string
	// blacklist size
	BlackListMaxSize int
	// StandardMode or CrawlMode
	Mode int
	// how many times it retries when a send fails
	Try int
	// the max number of packets waiting to be handled
	PacketJobLimit int
	// the max number of packet handlers
	PacketWorkerLimit int
	// the number of nodes to refresh in a kbucket
	RefreshNodeNum int
	// the resource info to publish
	AnnouncePeerLists []string
	GetPeerLists      []string
	StunList          StunList
	PublicIp          string
	QueryWorkLimit    int
	Log               *log.Logger
}

Config represents the configuration of dht.

func NewCrawlConfig

func NewCrawlConfig() *Config

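NewCrawlConfig returns a config in crawling mode. Crawler configuration:

1. The node and kbucket expiry durations are 0.
2. kbuckets are checked every 5 seconds.
3. The current node is an empty node.
4. The configuration is built by taking NewStandardConfig as a template and then modifying it.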

func NewStandardConfig

func NewStandardConfig() *Config

NewStandardConfig returns a Config pointer with default values. Defaults:

BlackListMaxSize:     65536
MaxTransactionCursor: math.MaxUint32
Address:              ":0"
Network:              "udp4"
K:                    8
KBucketSize:          8
// The time parameters below should generally not be changed; they are constraints of the DHT protocol specification.
KBucketExpiredAfter, NodeExpriedAfter: 15 minutes
CheckKBucketPeriod:   30 seconds
TokenExpiredAfter:    10 minutes

type DHT

type DHT struct {
	*Config

	Ready bool
	// contains filtered or unexported fields
}

DHT represents a DHT node.

func New

func New(config *Config) *DHT

New returns a DHT pointer. If config is nil, then config will be set to the default config. Notes: a node with a random ID is created; when workerTokens is full (its count equals PacketWorkerLimit), incoming data is dropped.

func (*DHT) AnnouncePeer added in v1.0.6

func (dht *DHT) AnnouncePeer(infoHash string) error

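AnnouncePeer announces infoHash to the network:

1. It tells neighboring nodes, by infoHash, that this node provides the resource (it has the download, or follows the torrent for that infoHash).
2. Only the neighboring nodes currently held in memory are notified.
3. Feedback arrives through config.OnAnnouncePeer.
4. The infoHash is added to the publish list, and a timer re-announces it every 10 seconds rather than only once.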

func (*DHT) DoAllGetPeers added in v1.2.5

func (dht *DHT) DoAllGetPeers()

DoAllGetPeers runs a get_peers query for every infoHash of interest. Results are delivered through the OnGetPeersResponse callback.

func (*DHT) GetPeers

func (dht *DHT) GetPeers(infoHash string) error

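GetPeers returns peers who have announced having infoHash. It sends an anonymous infoHash query to neighboring nodes. Notes:

1. This kind of query has to be repeated at intervals until a result arrives.
2. It sends only a single get_peers query to the nearby nodes in the current in-memory routing table; nothing further happens if nothing is found.
3. Results are delivered through OnGetPeersResponse.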

func (*DHT) IsCrawlMode

func (dht *DHT) IsCrawlMode() bool

IsCrawlMode returns whether mode is CrawlMode.

func (*DHT) IsStandardMode

func (dht *DHT) IsStandardMode() bool

IsStandardMode returns whether mode is StandardMode.

func (*DHT) Join2addr added in v1.2.7

func (dht *DHT) Join2addr(addr string)

func (*DHT) Log added in v1.2.5

func (dht *DHT) Log(args ...interface{})

func (*DHT) RemoveAnnouncePeer added in v1.2.3

func (dht *DHT) RemoveAnnouncePeer(infoHash string) bool

RemoveAnnouncePeer removes a published (announced) peer entry.

func (*DHT) Run

func (dht *DHT) Run()

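Run starts the dht:

1. It initializes and starts listening.
2. It receives UDP data continuously in a concurrent goroutine.
3. It keeps joining nearby, active nodes concurrently, that is, it joins the DHT network.
4. Depending on the state of the routing table, it keeps joining the DHT network.
5. When the transaction management table is empty (size == 0), it refreshes the routing table lifecycle.
6. Every CheckKBucketPeriod (30) seconds it runs join to join the DHT network.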

func (*DHT) Stop added in v1.3.3

func (dht *DHT) Stop()

type Fn added in v1.2.3

type Fn func()

Fn defines a function type.

type IStunList added in v1.1.2

type IStunList interface {
	GetStunList() []string
	GetDhtList() []string
	GetDhtMultiaddr() []multiaddr.Multiaddr
}

type MyTicker added in v1.2.3

type MyTicker struct {
	MyTick *time.Ticker
	Runner Fn
	// contains filtered or unexported fields
}

MyTicker holds the members of a timer.

func NewMyTick added in v1.2.3

func NewMyTick(interval int, f Fn) *MyTicker

func (*MyTicker) Start added in v1.2.3

func (t *MyTicker) Start()

Start starts the task that the timer needs to run.

func (*MyTicker) Stop added in v1.2.3

func (t *MyTicker) Stop()

type Peer

type Peer struct {
	IP   net.IP
	Port int
	// contains filtered or unexported fields
}

Peer represents a peer contact. Each peer has an IP, a port, and a token.

func (*Peer) CompactIPPortInfo

func (p *Peer) CompactIPPortInfo() string

CompactIPPortInfo returns "Compact node info". See http://www.bittorrent.org/beps/bep_0005.html.

type Request

type Request struct {
	InfoHash []byte
	IP       string
	Port     int
}

Request represents the request context.

type Response

type Response struct {
	Request
	MetadataInfo []byte
}

Response contains the request context and the metadata info.

type StunList added in v1.1.2

type StunList struct {
}

func (StunList) GetDhtList added in v1.1.2

func (r StunList) GetDhtList() []string

func (StunList) GetDhtMma added in v1.1.2

func (r StunList) GetDhtMma() []multiaddr.Multiaddr

func (StunList) GetDhtUdpLists added in v1.1.2

func (r StunList) GetDhtUdpLists() []string

func (StunList) GetSelfPublicIpPort added in v1.2.2

func (r StunList) GetSelfPublicIpPort() (string, int)

GetSelfPublicIpPort gets this machine's public IP and port behind NAT.

func (StunList) GetSelfPublicIpPort1 added in v1.2.7

func (r StunList) GetSelfPublicIpPort1() (string, int)

func (StunList) GetStunLists added in v1.2.2

func (r StunList) GetStunLists() []string

GetStunLists gets the list of STUN servers.

func (StunList) SliceIndex added in v1.3.0

func (r StunList) SliceIndex(element string, data []string) int

SliceIndex looks up element in data.

type Wire

type Wire struct {
	// contains filtered or unexported fields
}

Wire represents the wire protocol.

func NewWire

func NewWire(blackListSize, requestQueueSize, workerQueueSize int) *Wire

NewWire returns a Wire pointer.

  • blackListSize: the blacklist size
  • requestQueueSize: the max number of requests it can buffer
  • workerQueueSize: the max number of downloading worker goroutines

func (*Wire) Request

func (wire *Wire) Request(infoHash []byte, ip string, port int)

Request pushes the request to the queue.

func (*Wire) Response

func (wire *Wire) Response() <-chan Response

Response returns a chan of Response.

func (*Wire) Run

func (wire *Wire) Run()

Run starts the peer wire protocol.

Directories

Path Synopsis
sample
