ahocorasick

package
v3.80.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jul 17, 2024 License: AGPL-3.0 Imports: 6 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Core added in v3.78.1

type Core struct {
	// contains filtered or unexported fields
}

Core encapsulates the operations and data structures used for keyword matching via the Aho-Corasick algorithm. It is responsible for constructing and managing the trie for efficient substring searches, as well as mapping keywords to their associated detectors for rapid lookups.
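The keyword-to-detector lookup described above can be illustrated with a minimal sketch. It is not the package's implementation: the names `keywordIndex`, `buildIndex`, and `candidates` are hypothetical, detectors are reduced to plain strings, case folding is an assumption of the sketch, and a naive `strings.Contains` scan per keyword stands in for the Aho-Corasick trie, which would find all keywords in a single pass.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keywordIndex maps a lowercased keyword to the (hypothetical) detector
// names that registered it.
type keywordIndex map[string][]string

// buildIndex inverts a detector -> keywords table into keyword -> detectors.
func buildIndex(detectorKeywords map[string][]string) keywordIndex {
	idx := make(keywordIndex)
	for name, keywords := range detectorKeywords {
		for _, kw := range keywords {
			kw = strings.ToLower(kw)
			idx[kw] = append(idx[kw], name)
		}
	}
	return idx
}

// candidates returns the sorted, de-duplicated detector names whose
// keywords occur in chunk. The per-keyword scan is a stand-in for a
// single-pass automaton.
func candidates(idx keywordIndex, chunk []byte) []string {
	lower := strings.ToLower(string(chunk))
	seen := make(map[string]struct{})
	for kw, names := range idx {
		if !strings.Contains(lower, kw) {
			continue
		}
		for _, n := range names {
			seen[n] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for n := range seen {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	idx := buildIndex(map[string][]string{
		"aws":    {"AKIA"},
		"github": {"ghp_", "github"},
	})
	fmt.Println(candidates(idx, []byte("token=GHP_abc123"))) // prints [github]
}
```

Only the detectors returned by the prefilter would then need to run their more expensive checks on the chunk.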

func NewAhoCorasickCore

func NewAhoCorasickCore(allDetectors []detectors.Detector, opts ...CoreOption) *Core

NewAhoCorasickCore allocates and initializes a new Core. It uses the provided detector slice to build a map from keywords to detectors and to construct the Aho-Corasick prefilter trie.

func (*Core) FindDetectorMatches added in v3.78.1

func (ac *Core) FindDetectorMatches(chunkData []byte) []*DetectorMatch

FindDetectorMatches finds the matching detectors for a given chunk of data using the Aho-Corasick algorithm. It returns a slice of DetectorMatch instances, each containing the detector key, detector, a slice of matchSpans, and the corresponding matched portions of the chunk data.

Each matchSpan records a position in the chunk data where a keyword was found, together with the corresponding span (start and end positions). The span is determined by the configured spanCalculator strategy. Adjacent or overlapping spans are merged so that the matched portions of the chunk data are neither duplicated nor overlapping.

The matches field contains the actual byte slices of the matched portions from the chunk data.
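The merging of adjacent or overlapping spans can be sketched as follows. This is an illustration under assumptions, not the package's code: the `span` type and `mergeSpans` function are hypothetical, and spans are taken to be half-open `[start, end)` offsets into the chunk.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical half-open interval [start, end) into a chunk.
type span struct{ start, end int }

// mergeSpans sorts a copy of spans by start offset and coalesces any
// span that overlaps or touches the previous one.
func mergeSpans(spans []span) []span {
	if len(spans) == 0 {
		return nil
	}
	sorted := append([]span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })

	merged := []span{sorted[0]}
	for _, s := range sorted[1:] {
		last := &merged[len(merged)-1]
		if s.start <= last.end { // overlapping or adjacent
			if s.end > last.end {
				last.end = s.end
			}
			continue
		}
		merged = append(merged, s)
	}
	return merged
}

func main() {
	in := []span{{40, 50}, {0, 10}, {8, 20}, {20, 25}}
	fmt.Println(mergeSpans(in)) // prints [{0 25} {40 50}]
}
```

Slicing the chunk with each merged span (`chunk[s.start:s.end]`) would then give non-overlapping matched portions.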

func (*Core) KeywordsToDetectors added in v3.78.1

func (ac *Core) KeywordsToDetectors() map[string][]DetectorKey

type CoreOption added in v3.78.1

type CoreOption func(*Core)

CoreOption is a functional option for configuring a Core instance.

func WithSpanCalculator added in v3.78.1

func WithSpanCalculator(spanCalculator spanCalculator) CoreOption

WithSpanCalculator sets the span calculator used by the Core.
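The functional-option pattern behind CoreOption and WithSpanCalculator can be sketched in isolation. Everything below is hypothetical: `matcher`, `option`, `withSpanFunc`, and `newMatcher` are stand-ins, `spanFunc` replaces the package's unexported spanCalculator (whose actual signature is not shown on this page), and the default chosen here is an assumption for illustration only.

```go
package main

import "fmt"

// spanFunc is a hypothetical stand-in for the unexported spanCalculator.
type spanFunc func(chunkLen, keywordStart int) (start, end int)

// matcher plays the role of Core in this sketch.
type matcher struct {
	span spanFunc
}

// option mirrors the shape of CoreOption: a function that mutates the
// value under construction.
type option func(*matcher)

// withSpanFunc returns an option that overrides the span strategy.
func withSpanFunc(f spanFunc) option {
	return func(m *matcher) { m.span = f }
}

// newMatcher sets a default (assumed for this sketch) and then applies
// each option in order, so later options win.
func newMatcher(opts ...option) *matcher {
	m := &matcher{
		span: func(chunkLen, keywordStart int) (int, int) {
			return keywordStart, chunkLen
		},
	}
	for _, opt := range opts {
		opt(m)
	}
	return m
}

func main() {
	def := newMatcher()
	s, e := def.span(100, 40)
	fmt.Println(s, e) // prints 40 100

	whole := newMatcher(withSpanFunc(func(chunkLen, _ int) (int, int) {
		return 0, chunkLen
	}))
	s, e = whole.span(100, 40)
	fmt.Println(s, e) // prints 0 100
}
```

The variadic options keep the constructor signature stable as new settings are added, which is the usual motivation for this pattern.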

type DetectorKey

type DetectorKey struct {
	// contains filtered or unexported fields
}

DetectorKey identifies a detector in the keywordsToDetectors map. Multiple detectors can share a detector type but differ in version, so the key combines type and version. An additional, optional field disambiguates multiple custom detectors. The type is exported, even though none of its fields are, so that the Core can populate passed-in maps keyed on it without exposing any of its internals to consumers.

func CreateDetectorKey added in v3.67.0

func CreateDetectorKey(d detectors.Detector) DetectorKey

CreateDetectorKey creates a unique key for each detector from its type, version, and, for custom regex detectors, its name.
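A minimal sketch of why such a key works as a map key: a Go struct whose fields are all comparable is itself comparable. The `detector` and `detectorKey` types and the `createKey` function below are hypothetical simplifications, not the package's definitions; in particular, the real key is built from a detectors.Detector rather than a plain struct.

```go
package main

import "fmt"

// detector is a hypothetical, simplified detector description.
type detector struct {
	typ     string
	version int
	name    string
	custom  bool // true for custom regex detectors
}

// detectorKey has only comparable fields, so it can be used directly
// as a map key.
type detectorKey struct {
	typ        string
	version    int
	customName string
}

// createKey derives a key from type and version, adding the name only
// for custom detectors.
func createKey(d detector) detectorKey {
	k := detectorKey{typ: d.typ, version: d.version}
	if d.custom {
		k.customName = d.name
	}
	return k
}

func main() {
	v1 := detector{typ: "GitHub", version: 1}
	v2 := detector{typ: "GitHub", version: 2}

	byKey := map[detectorKey]string{
		createKey(v1): "github v1",
		createKey(v2): "github v2",
	}
	fmt.Println(len(byKey))           // prints 2
	fmt.Println(byKey[createKey(v2)]) // prints github v2
}
```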

func (DetectorKey) Type added in v3.67.2

Type returns the detector type of the key.

type DetectorMatch added in v3.78.1

type DetectorMatch struct {
	Key DetectorKey
	detectors.Detector
	// contains filtered or unexported fields
}

DetectorMatch represents a detected pattern's metadata in a data chunk. It encapsulates the key identifying a specific detector, the detector instance itself, the start and end offsets of the matched keyword in the chunk, and the matched portions of the chunk data.

func (*DetectorMatch) Matches added in v3.78.1

func (d *DetectorMatch) Matches() [][]byte

Matches returns a slice of byte slices, each representing a matched portion of the chunk data.

type EntireChunkSpanCalculator added in v3.78.1

type EntireChunkSpanCalculator struct{}

EntireChunkSpanCalculator is a strategy that sets the match span to cover the entire chunk data. Use it when detectors should match against the full length of the provided chunk.
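To show how an entire-chunk strategy differs from a narrower one, here is a hedged sketch. The `spanStrategy` interface, its `span` method, and the `window` alternative are all hypothetical; the page does not show the package's unexported spanCalculator interface, so its method set is not reproduced here.

```go
package main

import "fmt"

// spanStrategy is a hypothetical strategy interface for this sketch.
type spanStrategy interface {
	span(chunkLen, keywordStart int) (start, end int)
}

// entireChunk ignores the keyword position and covers the whole chunk.
type entireChunk struct{}

func (entireChunk) span(chunkLen, _ int) (int, int) { return 0, chunkLen }

// window is a hypothetical alternative that covers radius bytes on
// either side of the keyword start, clamped to the chunk bounds.
type window struct{ radius int }

func (w window) span(chunkLen, keywordStart int) (int, int) {
	start := keywordStart - w.radius
	if start < 0 {
		start = 0
	}
	end := keywordStart + w.radius
	if end > chunkLen {
		end = chunkLen
	}
	return start, end
}

func main() {
	chunk := []byte("prefix AKIA1234 suffix") // 22 bytes, keyword at offset 7
	for _, s := range []spanStrategy{entireChunk{}, window{radius: 8}} {
		start, end := s.span(len(chunk), 7)
		fmt.Println(start, end)
	}
	// prints:
	// 0 22
	// 0 15
}
```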
