Documentation ¶
Index ¶
- Constants
- func MustParsePeerList(path string) []string
- func ParsePeerList(path string) ([]string, error)
- type BMSOption
- type BaseCrawler
- type CrawlResult
- type CrawlerOption
- func CreateOptionsFromEnv() ([]CrawlerOption, error)
- func MustCreateOptionsFromEnv() []CrawlerOption
- func WithAdditionalCrawlerConfig(crawlerConfig map[string]any) CrawlerOption
- func WithBMS(server string, monitor string, authToken string, botnet string, ...) CrawlerOption
- func WithCustomBlacklist(blacklist []string) CrawlerOption
- func WithCustomCrawlIntervals(discoveryInterval uint32, trackingInterval uint32, ...) CrawlerOption
- func WithCustomWorkerCounts(discoveryWorkerCount uint32, trackingWorkerCount uint32, ...) CrawlerOption
- func WithIncludeSpecialUseIPs(includeSpecialUseIPs bool) CrawlerOption
- func WithLogger(logger *slog.Logger) CrawlerOption
- type PeerFinder
- type TCPBotnetCrawler
- type UDPBotnetCrawler
Examples ¶
Constants ¶
const (
    // The context key that is used to store as which loop something is executed in (e.g. discovery_loop)
    // Mostly useful for logging
    ContextKeyLoop contextKey = iota

    // The context key that is used to store as which worker ID is executing something
    // Mostly useful for logging
    ContextKeyWorkerID contextKey = iota
)
Variables ¶
This section is empty.
Functions ¶
func MustParsePeerList ¶
MustParsePeerList is like ParsePeerList but will panic if it encounters an error.
func ParsePeerList ¶
ParsePeerList takes a CSV file of peers (in ip:port format), checks that it's valid and transforms it into a string slice.
It can be used by crawler implementations to parse a peerlist needed to bootstrap a crawler in a common way.
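A minimal usage sketch (the CSV path is illustrative):

    // peers.csv contains one peer per line, in ip:port format
    bootstrapPeers, err := basecrawler.ParsePeerList("peers.csv")
    if err != nil {
        panic(err)
    }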
Types ¶
type BMSOption ¶
type BMSOption func(*bmsConfig)
Functions which implement BMSOption can be passed to WithBMS as additional options.
func WithBMSCampaign ¶
WithBMSCampaign can be used to set a campaign when sending results to BMS. It has to be passed to WithBMS, as it can only be used in combination with a BMS connection.
func WithBMSPublicIP ¶
WithBMSPublicIP can be used to set the public IP address of a crawler, which will be written to the BMS database. It has to be passed to WithBMS, as it can only be used in combination with a BMS connection.
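A minimal sketch of passing BMS options to WithBMS. The parameter types of WithBMSCampaign and WithBMSPublicIP are assumptions here (both taken to be plain strings); check the actual signatures in this package:

    // Assumed signatures: WithBMSCampaign(campaign string), WithBMSPublicIP(ip string)
    bmsOption := basecrawler.WithBMS(
        "localhost:8083",                 // BMS server as ip:port
        "some-monitor-id",                // hypothetical monitor ID, has to exist on the server
        "<base64-encoded-32-byte-token>", // placeholder auth token
        "some-botnet-id",                 // hypothetical botnet ID, has to exist on the server
        basecrawler.WithBMSCampaign("some-campaign-id"), // hypothetical campaign ID
        basecrawler.WithBMSPublicIP("203.0.113.1"),      // the crawler's public IP
    )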
type BaseCrawler ¶
type BaseCrawler struct {
// contains filtered or unexported fields
}
The BaseCrawler struct represents a basecrawler instance.
To create a new instance, use NewCrawler.
func NewCrawler ¶
func NewCrawler(botnetImplementation any, bootstrapPeers []string, options ...CrawlerOption) (*BaseCrawler, error)
NewCrawler creates a new crawler based on the basecrawler which embeds the given specific implementation.
The first parameter is the actual implementation of the botnet protocol. The passed struct has to either implement the TCPBotnetCrawler or the UDPBotnetCrawler interface.
You can make sure your struct implements e.g. the UDPBotnetCrawler interface at compile time by putting the following in your code:
    // Make sure that someImplementation is implementing the UDPBotnetCrawler interface (at compile time)
    var _ basecrawler.UDPBotnetCrawler = &someImplementation{}
The entries of the bootstrap peerlist have to be in a format parsable by net.SplitHostPort (oftentimes simply ip:port).
You can pass optional configuration to the crawler via all further parameters. All options have to implement the CrawlerOption type (see CrawlerOption for available options). If you want to read optional crawler configuration from environment variables (in a common way), you can use CreateOptionsFromEnv (see the configuration section in the readme for possible environment variables).
Although calling this method will already start a BMS session (if configured with BMS), it will not start the actual crawling. To start it, call BaseCrawler.Start. The main reason for this layout is that at some point we want to introduce crawler instrumentation, i.e. that you have a piece of software that is able to crawl multiple botnets (and therefore contains multiple crawler instances) which is controlled by a central management server.
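A minimal sketch of putting this together, assuming someImplementation satisfies the UDPBotnetCrawler interface, peers.csv exists and log/slog is imported:

    // Parse the bootstrap peerlist and create (but don't yet start) the crawler
    bootstrapPeers := basecrawler.MustParsePeerList("peers.csv")

    crawler, err := basecrawler.NewCrawler(
        &someImplementation{},
        bootstrapPeers,
        basecrawler.WithLogger(slog.Default()),
    )
    if err != nil {
        panic(err)
    }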
func (*BaseCrawler) IsBlacklisted ¶
func (bc *BaseCrawler) IsBlacklisted(ipPort string) bool
IsBlacklisted returns whether the crawler considers a bot blacklisted and therefore won't crawl it.
The bot should be provided as ip:port (so that net.SplitHostPort can parse it). If the given bot can't be parsed, it will be considered blacklisted (to make sure the crawler won't crawl any bogon IPs).
By default the blacklist contains all special-use IP addresses. This can be changed by providing the WithIncludeSpecialUseIPs option. Additional IP ranges can be blacklisted by providing the WithCustomBlacklist option.
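For example, with the default blacklist in place (the addresses are illustrative):

    // 192.0.2.0/24 (TEST-NET-1) is a special-use range, so this returns true by default
    blacklisted := crawler.IsBlacklisted("192.0.2.1:20001")

    // Unparsable input is also considered blacklisted
    alsoBlacklisted := crawler.IsBlacklisted("not-an-ip")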
func (*BaseCrawler) Logger ¶
func (bc *BaseCrawler) Logger() *slog.Logger
Logger returns the internal logger of the basecrawler.
You may want to pass this logger to your crawler implementation manually, so that you can use the same logger (which is mostly needed if you let CreateOptionsFromEnv create the logger for you).
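A sketch of reusing the crawler's logger, assuming your implementation has a logger field (impl and its field are hypothetical):

    crawler, err := basecrawler.NewCrawler(&impl, bootstrapPeers, basecrawler.MustCreateOptionsFromEnv()...)
    if err != nil {
        panic(err)
    }

    // Hand the (possibly env-created) logger to the implementation
    impl.logger = crawler.Logger()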
func (*BaseCrawler) Start ¶
func (bc *BaseCrawler) Start(ctx context.Context) error
Start starts the crawler instance. Depending on the crawler configuration, it will spawn several goroutines:
- A goroutine that manages the discovery loop, which itself starts more worker goroutines.
- A goroutine that manages the tracking loop, which itself starts more worker goroutines.
- A goroutine that manages the find-peer loop, which itself starts more worker goroutines (if the crawler implementation implements the PeerFinder interface).
- A goroutine that sends crawled bot replies, edges and failed tries to BMS (if configured to use BMS).
- A goroutine that tries to reconnect to BMS if the connection broke (if configured to use BMS).
You can stop these goroutines by passing a cancelable context (as in the CancelContext example).
This method is meant for instrumentation of multiple crawler instances (though we never got around to using it). The basic idea is to have multiple crawler instances that can be exchanged on the fly (stop one crawler, start another one) by a central management instance.
Example (CancelContext) ¶
    package main

    import (
        "context"
        "time"

        "github.com/botnet-monitoring/basecrawler"
    )

    func main() {
        // Providing this to NewCrawler will actually result in an error since it
        // implements neither TCPBotnetCrawler nor UDPBotnetCrawler
        someProperImplementation := struct{}{}

        crawler, err := basecrawler.NewCrawler(
            someProperImplementation,
            []string{
                "192.0.2.1:20001",
                "192.0.2.2:20002",
                "192.0.2.3:20003",
            },
        )
        if err != nil {
            panic(err)
        }

        ctx, cancel := context.WithCancel(context.Background())
        err = crawler.Start(ctx)
        if err != nil {
            panic(err)
        }

        time.Sleep(5 * time.Second)
        cancel()

        // Do other stuff
    }
Output:
func (*BaseCrawler) Stop ¶
func (bc *BaseCrawler) Stop(ctx context.Context, reason bmsclient.DisconnectReason) error
Stop currently just ends the internal BMS session with the provided disconnect reason (if there's an active BMS session). If you want to stop the crawl loops, pass a cancelable context to BaseCrawler.Start and cancel it.
This method is meant for instrumentation of multiple crawler instances (though we never got around to using it). Please also note that the crawler very likely won't start again once stopped (which we would need to fix before being able to do proper instrumentation).
type CrawlResult ¶
type CrawlResult int
CrawlResult represents the result of a crawl attempt of a bot.
const (
    // Crawled host responded but response classified as benign
    BENIGN_REPLY CrawlResult = iota

    // Crawled host responded and response is definitely from a malicious bot
    BOT_REPLY CrawlResult = iota

    // Crawled host did not respond
    NO_REPLY CrawlResult = iota
)
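A hypothetical helper in a crawler implementation might map raw reply data to these values (classifyReply and looksLikeBotProtocol are made up for illustration; the actual check depends on the botnet protocol):

    func classifyReply(data []byte) basecrawler.CrawlResult {
        if len(data) == 0 {
            return basecrawler.NO_REPLY
        }
        if looksLikeBotProtocol(data) { // hypothetical protocol check
            return basecrawler.BOT_REPLY
        }
        return basecrawler.BENIGN_REPLY
    }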
type CrawlerOption ¶
type CrawlerOption func(*config)
Functions which implement CrawlerOption can be passed to NewCrawler as additional options.
func CreateOptionsFromEnv ¶
func CreateOptionsFromEnv() ([]CrawlerOption, error)
CreateOptionsFromEnv creates a collection of crawler options based on environment variables. It can be used by crawler implementations to configure the basecrawler and is part of this module as every crawler implementation would need to re-implement it otherwise.
See the configuration section in the readme for available environment variables.
If LOG_LEVEL is provided, this function will create a logger. Please note that this might overwrite the logger provided by the crawler implementation (as options are applied in order). To get the created logger you can use BaseCrawler.Logger.
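A sketch of combining env-based options with an explicitly passed logger; someImplementation and bootstrapPeers are assumed to exist, and because options are applied in order, an env-created logger would win here:

    envOptions, err := basecrawler.CreateOptionsFromEnv()
    if err != nil {
        panic(err)
    }

    // Explicit options first, env options last: if LOG_LEVEL is set, the
    // env-created logger overwrites the one passed via WithLogger
    options := append([]basecrawler.CrawlerOption{basecrawler.WithLogger(slog.Default())}, envOptions...)

    crawler, err := basecrawler.NewCrawler(&someImplementation{}, bootstrapPeers, options...)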
func MustCreateOptionsFromEnv ¶
func MustCreateOptionsFromEnv() []CrawlerOption
MustCreateOptionsFromEnv is like CreateOptionsFromEnv but will panic if it encounters an error.
func WithAdditionalCrawlerConfig ¶
func WithAdditionalCrawlerConfig(crawlerConfig map[string]any) CrawlerOption
WithAdditionalCrawlerConfig can be used to provide custom crawler configuration to the crawler implementation (e.g. SendPeerRequest or ReadReply).
The given config map will be passed as-is to all functions contained in TCPBotnetCrawler, UDPBotnetCrawler and PeerFinder.
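For example, to hand protocol-specific values to your implementation (the keys and values are arbitrary):

    option := basecrawler.WithAdditionalCrawlerConfig(map[string]any{
        "magic-bytes": []byte{0xde, 0xad, 0xbe, 0xef}, // hypothetical key
        "xor-key":     byte(0x42),                     // hypothetical key
    })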
func WithBMS ¶
func WithBMS(server string, monitor string, authToken string, botnet string, options ...BMSOption) CrawlerOption
WithBMS can be used to configure the crawler to send the crawling results to a BMS server.
The server has to be passed as ip:port (e.g. localhost:8083), the authToken as a base64-encoded string (which has to contain exactly 32 bytes). The given monitor ID and botnet ID have to exist on the BMS server, otherwise trying to start the crawler will return an error.
You can pass optional BMS config as last parameter via BMSOption.
Please note that this option cannot check the validity of the given values, so creating the BMS client or connecting to the BMS server might fail when the crawler is started.
func WithCustomBlacklist ¶
func WithCustomBlacklist(blacklist []string) CrawlerOption
WithCustomBlacklist can be used to add IP address ranges to the crawler's blacklist (see BaseCrawler.IsBlacklisted).
The given strings have to be in CIDR notation. You might want to exclude your own crawlers from the crawling, so e.g. if you have two crawlers running on 198.51.100.1 and 198.51.100.2, you probably want to pass a string slice with 198.51.100.1/32 and 198.51.100.2/32.
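For instance, to exclude the two crawlers from the example above:

    option := basecrawler.WithCustomBlacklist([]string{
        "198.51.100.1/32", // first crawler
        "198.51.100.2/32", // second crawler
    })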
func WithCustomCrawlIntervals ¶
func WithCustomCrawlIntervals(discoveryInterval uint32, trackingInterval uint32, discoveryRemoveInterval uint32, trackingRemoveInterval uint32) CrawlerOption
WithCustomCrawlIntervals can be used to change how often the crawler will crawl potential bots and after how long of being unresponsive it will drop them.
All intervals are in seconds. If you only want to change one of the intervals, pass zero for the other parameters. Defaults are 300s for the discovery and tracking intervals, and 900s for the discovery and tracking remove intervals.
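For example, to crawl tracked bots every 60 seconds while keeping all other intervals at their defaults:

    // Parameter order: discovery, tracking, discovery remove, tracking remove.
    // Zero keeps the respective default (300s, 300s, 900s, 900s).
    option := basecrawler.WithCustomCrawlIntervals(0, 60, 0, 0)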
func WithCustomWorkerCounts ¶
func WithCustomWorkerCounts(discoveryWorkerCount uint32, trackingWorkerCount uint32, findPeerWorkerCount uint32) CrawlerOption
WithCustomWorkerCounts can be used to change the number of workers the various loops will spawn.
If you only want to change the worker count of one of the loops, pass zero for the other parameters. By default, the discovery loop and the tracking loop will each spawn 10000 worker goroutines, and the find-peer loop will spawn 100 worker goroutines (if it's used).
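For example, to lower only the discovery worker count:

    // Zero keeps the respective default (10000, 10000, 100)
    option := basecrawler.WithCustomWorkerCounts(1000, 0, 0)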
func WithIncludeSpecialUseIPs ¶
func WithIncludeSpecialUseIPs(includeSpecialUseIPs bool) CrawlerOption
WithIncludeSpecialUseIPs can be used to include special-use IP addresses into the crawling (see BaseCrawler.IsBlacklisted).
You probably want to leave this setting at its default (so that the blacklist contains the special-use IP addresses); however, e.g. for local testing you might want to include them.
func WithLogger ¶
func WithLogger(logger *slog.Logger) CrawlerOption
WithLogger can be used to pass a custom logger to the basecrawler. By default, the basecrawler doesn't log anything, so if you want to have any logs, you have to pass this option.
If you also use CreateOptionsFromEnv, it might create its own logger (depending on the value of LOG_LEVEL) which might overwrite another provided logger (options are applied in the order they are passed). If you want to pass the created logger to your crawler implementation, use BaseCrawler.Logger.
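A minimal sketch of passing a custom slog logger (assumes log/slog and os are imported; someImplementation and bootstrapPeers are placeholders):

    logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
        Level: slog.LevelDebug,
    }))

    crawler, err := basecrawler.NewCrawler(
        &someImplementation{},
        bootstrapPeers,
        basecrawler.WithLogger(logger),
    )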