engine

package
v3.86.1
Published: Dec 11, 2024 License: AGPL-3.0 Imports: 48 Imported by: 1

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func FragmentFirstLineAndLink(chunk *sources.Chunk) (int64, *int64, string)

FragmentFirstLineAndLink extracts the first line number and the link from the chunk metadata. It returns:

  • The first line number of the fragment.
  • A pointer to the line number, facilitating direct updates.
  • The link associated with the fragment. This link may be updated in the chunk metadata if there's a change in the line number.

func FragmentLineOffset added in v3.4.3

func FragmentLineOffset(chunk *sources.Chunk, result *detectors.Result) (int64, bool)

FragmentLineOffset sets the line number for a provided source chunk with a given detector result.

func SetResultLineNumber added in v3.19.0

func SetResultLineNumber(chunk *sources.Chunk, result *detectors.Result, fragStart int64, mdLine *int64) bool

SetResultLineNumber sets the line number in the provided result.

func SupportsLineNumbers added in v3.25.0

func SupportsLineNumbers(sourceType sourcespb.SourceType) bool

SupportsLineNumbers determines if a line number can be found for a source type.

func UpdateLink(ctx context.Context, metadata *source_metadatapb.MetaData, link string, line int64) error

UpdateLink updates the link of the provided source metadata.

Types

type Config added in v3.78.2

type Config struct {
	// Number of concurrent scanner workers,
	// also serves as a multiplier for other worker types (e.g., detector workers, notifier workers)
	Concurrency int

	Decoders                      []decoders.Decoder
	Detectors                     []detectors.Detector
	DetectorVerificationOverrides map[config.DetectorID]bool
	IncludeDetectors              string
	ExcludeDetectors              string
	CustomVerifiersOnly           bool
	VerifierEndpoints             map[string]string

	// Verify determines whether the scanner will verify candidate secrets.
	Verify bool

	// Defines which results will be notified by the engine
	// (e.g., verified, unverified, unknown)
	Results               map[string]struct{}
	LogFilteredUnverified bool

	// FilterEntropy filters out unverified results using Shannon entropy.
	FilterEntropy float64
	// FilterUnverified sets the filterUnverified flag on the engine. If set to
	// true, the engine will only return the first unverified result for a chunk for a detector.
	FilterUnverified      bool
	ShouldScanEntireChunk bool

	Dispatcher ResultsDispatcher

	// SourceManager is used to manage the sources and units.
	// TODO (ahrav): Update this comment, i'm dumb and don't really know what else it does.
	SourceManager *sources.SourceManager

	// PrintAvgDetectorTime sets the printAvgDetectorTime flag on the engine. If set to
	// true, the engine will print the average time taken by each detector.
	// This option allows us to measure the time taken for each detector ONLY if
	// the engine is configured to print the results.
	// Calculating the average time taken by each detector is an expensive operation
	// and should be avoided unless specified by the user.
	PrintAvgDetectorTime bool

	// VerificationOverlap determines whether the scanner will attempt to verify candidate secrets
	// that have been detected by multiple detectors.
	// By default, it is set to true.
	VerificationOverlap bool

	// DetectorWorkerMultiplier is used to determine the number of detector workers to spawn.
	DetectorWorkerMultiplier int

	// NotificationWorkerMultiplier is used to determine the number of notification workers to spawn.
	NotificationWorkerMultiplier int

	// VerificationOverlapWorkerMultiplier is used to determine the number of verification overlap workers to spawn.
	VerificationOverlapWorkerMultiplier int
}

Config used to configure the engine.
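
The FilterEntropy field above refers to Shannon entropy. As a rough illustration of that measure only (this is not the engine's implementation, and the helper name shannonEntropy is hypothetical), the following sketch computes the entropy of a string in bits per byte. How the engine compares this value against FilterEntropy is not shown here.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy is a hypothetical helper, not part of the engine package.
// It returns the Shannon entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	// Count how often each byte value occurs.
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	// A repeated character carries no information; distinct characters carry more.
	fmt.Printf("%.2f\n", shannonEntropy("aaaaaaaa"))
	fmt.Printf("%.2f\n", shannonEntropy("abcdefgh"))
}
```

Low-entropy strings such as repeated characters are less likely to be real secrets, which is presumably why an entropy threshold is offered as a filter for unverified results.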

type Engine

type Engine struct {
	WgNotifier sync.WaitGroup
	// contains filtered or unexported fields
}

Engine represents the core scanning engine responsible for detecting secrets in input data. It manages the lifecycle of the scanning process, including initialization, worker management, and result notification. The engine is designed to be flexible and configurable, allowing for customization through various options and configurations.

func NewEngine added in v3.78.2

func NewEngine(ctx context.Context, cfg *Config) (*Engine, error)

NewEngine creates a new Engine instance with the provided configuration.

func (*Engine) ChunksChan

func (e *Engine) ChunksChan() <-chan *sources.Chunk

func (*Engine) DetectorAvgTime

func (e *Engine) DetectorAvgTime() map[string][]time.Duration

DetectorAvgTime returns the average time taken by each detector.

func (*Engine) Finish added in v3.6.1

func (e *Engine) Finish(ctx context.Context) error

Finish waits for running sources to complete and workers to finish scanning chunks before closing their respective channels. Once Finish is called, no more sources may be scanned by the engine.
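
As a loose illustration of why no further work can be submitted after a shutdown call like Finish, here is a minimal worker-pool sketch. The pipeline type and its methods are hypothetical and far simpler than the engine; the sketch closes the input channel and then waits for the workers to drain it, which may not match the engine's exact ordering.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pipeline is a hypothetical, heavily simplified stand-in for a scanning engine.
type pipeline struct {
	scanned int64 // updated atomically by workers
	chunks  chan string
	workers sync.WaitGroup
}

// newPipeline starts workerCount goroutines that consume chunks.
func newPipeline(workerCount int) *pipeline {
	p := &pipeline{chunks: make(chan string)}
	for i := 0; i < workerCount; i++ {
		p.workers.Add(1)
		go func() {
			defer p.workers.Done()
			for range p.chunks {
				atomic.AddInt64(&p.scanned, 1) // "scan" the chunk
			}
		}()
	}
	return p
}

// submit hands a chunk to the workers. It must not be called after finish.
func (p *pipeline) submit(chunk string) { p.chunks <- chunk }

// finish closes the input channel, waits for all workers to drain it,
// and returns the number of chunks processed.
func (p *pipeline) finish() int64 {
	close(p.chunks)
	p.workers.Wait()
	return atomic.LoadInt64(&p.scanned)
}

func main() {
	p := newPipeline(4)
	for i := 0; i < 10; i++ {
		p.submit(fmt.Sprintf("chunk-%d", i))
	}
	fmt.Println(p.finish())
}
```

Sending on a closed channel panics in Go, so in a design like this the caller has to treat the shutdown call as final.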

func (*Engine) GetDetectorsMetrics added in v3.46.0

func (e *Engine) GetDetectorsMetrics() map[string]time.Duration

GetDetectorsMetrics returns a copy of the average time taken by each detector.

func (*Engine) GetMetrics added in v3.46.0

func (e *Engine) GetMetrics() Metrics

GetMetrics returns a copy of Metrics. It's safe for concurrent use, and the caller can't modify the original data.

func (*Engine) HasFoundResults added in v3.46.0

func (e *Engine) HasFoundResults() bool

HasFoundResults returns true if any results are found.

func (*Engine) ResultsChan

func (e *Engine) ResultsChan() chan detectors.ResultWithMetadata

func (*Engine) ScanChunk added in v3.51.0

func (e *Engine) ScanChunk(chunk *sources.Chunk)

ScanChunk injects a chunk into the output stream of chunks to be scanned. This method should rarely be used. TODO(THOG-1577): Remove when dependencies no longer rely on this functionality.

func (*Engine) ScanCircleCI added in v3.23.0

func (e *Engine) ScanCircleCI(ctx context.Context, token string) (sources.JobProgressRef, error)

ScanCircleCI scans CircleCI logs.

func (*Engine) ScanDocker added in v3.41.0

ScanDocker scans a given docker connection.

func (*Engine) ScanElasticsearch added in v3.77.0

func (e *Engine) ScanElasticsearch(ctx context.Context, c sources.ElasticsearchConfig) (sources.JobProgressRef, error)

ScanElasticsearch scans an Elasticsearch installation.

func (*Engine) ScanFileSystem

ScanFileSystem scans a given file system.

func (*Engine) ScanGCS added in v3.29.0

ScanGCS scans GCS with the provided options.

func (*Engine) ScanGit

ScanGit scans any git source.

func (*Engine) ScanGitHub

ScanGitHub scans GitHub with the provided options.

func (*Engine) ScanGitHubExperimental added in v3.80.6

func (e *Engine) ScanGitHubExperimental(ctx context.Context, c sources.GitHubExperimentalConfig) (sources.JobProgressRef, error)

ScanGitHubExperimental scans GitHub using an experimental feature. Consider all functionality to be in an alpha release here.

func (*Engine) ScanGitLab

ScanGitLab scans GitLab with the provided configuration.

func (*Engine) ScanHuggingface added in v3.80.0

func (e *Engine) ScanHuggingface(ctx context.Context, c HuggingfaceConfig) (sources.JobProgressRef, error)

ScanHuggingface scans HuggingFace with the provided options.

func (*Engine) ScanJenkins added in v3.78.0

func (e *Engine) ScanJenkins(ctx context.Context, jenkinsConfig JenkinsConfig) (sources.JobProgressRef, error)

ScanJenkins scans Jenkins logs.

func (*Engine) ScanPostman added in v3.71.0

ScanPostman scans Postman with the provided options.

func (*Engine) ScanS3

ScanS3 scans S3 buckets.

func (*Engine) ScanSyslog added in v3.4.3

ScanSyslog is a source that scans syslog files.

func (*Engine) ScanTravisCI added in v3.62.0

func (e *Engine) ScanTravisCI(ctx context.Context, token string) (sources.JobProgressRef, error)

ScanTravisCI scans TravisCI logs.

func (*Engine) Start added in v3.78.2

func (e *Engine) Start(ctx context.Context)

Start initializes and activates the engine's processing pipeline. It sets up various default configurations, prepares lookup structures for detectors, and kickstarts all necessary workers. Once started, the engine begins processing input data to identify secrets.

type HuggingfaceConfig added in v3.80.0

type HuggingfaceConfig struct {
	Endpoint           string
	Models             []string
	Spaces             []string
	Datasets           []string
	Organizations      []string
	Users              []string
	IncludeModels      []string
	IgnoreModels       []string
	IncludeSpaces      []string
	IgnoreSpaces       []string
	IncludeDatasets    []string
	IgnoreDatasets     []string
	SkipAllModels      bool
	SkipAllSpaces      bool
	SkipAllDatasets    bool
	IncludeDiscussions bool
	IncludePrs         bool
	Token              string
	Concurrency        int
}

HuggingfaceConfig represents the configuration for HuggingFace.

type JenkinsConfig added in v3.78.0

type JenkinsConfig struct {
	Endpoint              string
	Username              string
	Password              string
	Header                string
	InsecureSkipVerifyTLS bool
}

type Metrics added in v3.46.0

type Metrics struct {
	BytesScanned           uint64
	ChunksScanned          uint64
	VerifiedSecretsFound   uint64
	UnverifiedSecretsFound uint64
	AvgDetectorTime        map[string]time.Duration

	ScanDuration time.Duration
	// contains filtered or unexported fields
}

Metrics for the scan engine for external consumption.

type Printer added in v3.46.0

type Printer interface {
	Print(ctx context.Context, r *detectors.ResultWithMetadata) error
}

Printer is used to format found results and output them to the user (e.g., as JSON or plain text). Printer implementations should be thread-safe.
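
To illustrate the thread-safety requirement, here is a minimal sketch of a printer that serializes writes with a mutex. The result type is a hypothetical stand-in for detectors.ResultWithMetadata, and jsonPrinter is an assumed example, not one of the printers shipped with the module.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"sync"
)

// result is a hypothetical stand-in for detectors.ResultWithMetadata.
type result struct {
	Detector string `json:"detector"`
	Verified bool   `json:"verified"`
}

// jsonPrinter is a hypothetical printer. The mutex serializes writes so
// output from concurrent Print calls does not interleave.
type jsonPrinter struct {
	mu  sync.Mutex
	out io.Writer
}

// Print writes one JSON object per line.
func (p *jsonPrinter) Print(_ context.Context, r *result) error {
	b, err := json.Marshal(r)
	if err != nil {
		return err
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	_, err = fmt.Fprintln(p.out, string(b))
	return err
}

// printConcurrently calls Print from n goroutines and returns the number
// of lines that reached the writer.
func printConcurrently(n int) int {
	var buf bytes.Buffer
	p := &jsonPrinter{out: &buf}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = p.Print(context.Background(), &result{Detector: "example"})
		}()
	}
	wg.Wait()
	return bytes.Count(buf.Bytes(), []byte("\n"))
}

func main() {
	fmt.Println(printConcurrently(8))
}
```

Marshaling happens outside the lock so that only the write itself is serialized.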

type PrinterDispatcher added in v3.78.2

type PrinterDispatcher struct {
	// contains filtered or unexported fields
}

PrinterDispatcher wraps an existing Printer implementation and adapts it to the ResultsDispatcher interface.

func NewPrinterDispatcher added in v3.78.2

func NewPrinterDispatcher(printer Printer) *PrinterDispatcher

NewPrinterDispatcher creates a new PrinterDispatcher instance with the provided Printer.

func (*PrinterDispatcher) Dispatch added in v3.78.2

Dispatch sends the result to the printer.

type ResultsDispatcher added in v3.78.2

type ResultsDispatcher interface {
	Dispatch(ctx context.Context, result detectors.ResultWithMetadata) error
}

ResultsDispatcher is an interface for dispatching findings of detected results. Implementations can vary from printing results to the console to sending results to an external system.
