Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Core ¶ added in v3.78.1
type Core struct {
// contains filtered or unexported fields
}
Core encapsulates the operations and data structures used for keyword matching via the Aho-Corasick algorithm. It is responsible for constructing and managing the trie for efficient substring searches, as well as mapping keywords to their associated detectors for rapid lookups.
func NewAhoCorasickCore ¶
func NewAhoCorasickCore(allDetectors []detectors.Detector, opts ...CoreOption) *Core
NewAhoCorasickCore allocates and initializes a new Core. It uses the provided detector slice to build a map from keywords to detectors and to construct the Aho-Corasick prefilter trie.
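The keyword-to-detector mapping described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the `detector` struct, `buildKeywordIndex`, and `candidates` are hypothetical names, lowercasing the keywords is an assumption, and a plain substring scan stands in for the Aho-Corasick trie.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// detector is a hypothetical stand-in for detectors.Detector; only the
// keyword list matters for this sketch.
type detector struct {
	name     string
	keywords []string
}

// buildKeywordIndex maps each lowercased keyword to the names of the
// detectors that declared it. Lowercasing is an assumption of this sketch.
func buildKeywordIndex(all []detector) map[string][]string {
	index := make(map[string][]string)
	for _, d := range all {
		for _, kw := range d.keywords {
			kw = strings.ToLower(kw)
			index[kw] = append(index[kw], d.name)
		}
	}
	return index
}

// candidates returns the sorted, de-duplicated names of detectors whose
// keywords occur in data. A naive substring scan replaces the trie here.
func candidates(index map[string][]string, data string) []string {
	lower := strings.ToLower(data)
	seen := make(map[string]bool)
	var out []string
	for kw, names := range index {
		if !strings.Contains(lower, kw) {
			continue
		}
		for _, n := range names {
			if !seen[n] {
				seen[n] = true
				out = append(out, n)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	index := buildKeywordIndex([]detector{
		{name: "aws", keywords: []string{"AKIA"}},
		{name: "github", keywords: []string{"ghp_", "github"}},
	})
	fmt.Println(candidates(index, "export TOKEN=ghp_example"))
}
```

The point of the prefilter is that only the detectors returned by the keyword lookup need to run their more expensive checks against the chunk.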
func (*Core) FindDetectorMatches ¶ added in v3.78.1
func (ac *Core) FindDetectorMatches(chunkData []byte) []*DetectorMatch
FindDetectorMatches finds the matching detectors for a given chunk of data using the Aho-Corasick algorithm. It returns a slice of DetectorMatch instances, each containing the detector key, detector, a slice of matchSpans, and the corresponding matched portions of the chunk data.
Each matchSpan represents a position in the chunk data where a keyword was found, along with the corresponding span (start and end positions). The span is determined based on the configured spanCalculator strategy. Adjacent or overlapping matches are merged to avoid duplicating or overlapping the matched portions of the chunk data.
The matches field contains the actual byte slices of the matched portions from the chunk data.
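The merging of adjacent or overlapping spans can be illustrated with a standard interval merge. This is a sketch under assumptions: `span` is a hypothetical stand-in for the unexported matchSpan, offsets are treated as half-open `[start, end)`, and the real merge logic may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical stand-in for the package's unexported matchSpan.
// Offsets are half-open: [start, end).
type span struct{ start, end int }

// mergeSpans sorts a copy of spans by start offset and merges any spans
// that overlap or touch, so no byte of the chunk is covered twice.
func mergeSpans(spans []span) []span {
	if len(spans) == 0 {
		return nil
	}
	sorted := append([]span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })

	merged := []span{sorted[0]}
	for _, s := range sorted[1:] {
		last := &merged[len(merged)-1]
		if s.start <= last.end { // overlapping or adjacent
			if s.end > last.end {
				last.end = s.end
			}
			continue
		}
		merged = append(merged, s)
	}
	return merged
}

func main() {
	data := []byte("password=hunter2; api_key=abcdef")
	spans := []span{{5, 16}, {0, 8}, {25, 32}, {18, 25}}
	for _, s := range mergeSpans(spans) {
		// Each merged span yields one matched portion of the chunk.
		fmt.Printf("[%d,%d) %q\n", s.start, s.end, data[s.start:s.end])
	}
}
```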
func (*Core) KeywordsToDetectors ¶ added in v3.78.1
func (ac *Core) KeywordsToDetectors() map[string][]DetectorKey
type CoreOption ¶ added in v3.78.1
type CoreOption func(*Core)
CoreOption is a functional option type for configuring a Core instance.
func WithSpanCalculator ¶ added in v3.78.1
func WithSpanCalculator(spanCalculator spanCalculator) CoreOption
WithSpanCalculator sets the span calculator that the Core uses to determine match spans.
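The functional option pattern behind CoreOption and WithSpanCalculator can be sketched as below. All names here (`core`, `option`, `withSpan`, `defaultSpan`, `window`) are hypothetical, and the 8-byte default window is an arbitrary value chosen for illustration, not the package's default.

```go
package main

import "fmt"

// spanFunc is a hypothetical strategy: given the chunk length and the
// index where a keyword was found, it returns the span to examine.
type spanFunc func(chunkLen, keywordIdx int) (start, end int)

// core is a hypothetical stand-in for Core.
type core struct{ span spanFunc }

// option mirrors the shape of CoreOption: a function that mutates the core.
type option func(*core)

// withSpan mirrors the shape of WithSpanCalculator.
func withSpan(f spanFunc) option {
	return func(c *core) { c.span = f }
}

const window = 8 // arbitrary illustrative default

// defaultSpan covers a fixed window starting at the keyword.
func defaultSpan(chunkLen, keywordIdx int) (int, int) {
	end := keywordIdx + window
	if end > chunkLen {
		end = chunkLen
	}
	return keywordIdx, end
}

// newCore applies the default first, then each option in order.
func newCore(opts ...option) *core {
	c := &core{span: defaultSpan}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	whole := withSpan(func(chunkLen, _ int) (int, int) { return 0, chunkLen })
	for _, c := range []*core{newCore(), newCore(whole)} {
		start, end := c.span(100, 10)
		fmt.Println(start, end)
	}
}
```

Because options are applied after the defaults are set, callers that pass no options get a working configuration, and new options can be added later without changing the constructor's signature.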
type DetectorKey ¶
type DetectorKey struct {
// contains filtered or unexported fields
}
DetectorKey identifies a detector in the keywordsToDetectors map. Multiple detectors can share a detector type but differ in version, so the key combines type and version. An additional, optional field disambiguates multiple custom detectors. This type is exported even though none of its fields are, so that a Core can populate passed-in maps keyed on this type without exposing any of its internals to consumers.
func CreateDetectorKey ¶ added in v3.67.0
func CreateDetectorKey(d detectors.Detector) DetectorKey
CreateDetectorKey creates a unique key for each detector from its type, version, and, for custom regex detectors, its name.
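A composite key of this kind works as a Go map key because a struct whose fields are all comparable is itself comparable. The sketch below is illustrative only: `detectorKey`, its field names, and the numeric type values are hypothetical, not the package's actual definitions.

```go
package main

import "fmt"

// detectorKey is a hypothetical stand-in for DetectorKey. Every field is
// comparable, so the struct can be used directly as a map key.
type detectorKey struct {
	detectorType int32  // stands in for detectorspb.DetectorType
	version      int    // distinguishes versions of the same type
	customName   string // set only for custom regex detectors
}

func main() {
	// Hypothetical enum values, chosen only for illustration.
	const github, customRegex int32 = 1, 2

	names := map[detectorKey]string{
		{detectorType: github, version: 1}:                    "github v1",
		{detectorType: github, version: 2}:                    "github v2",
		{detectorType: customRegex, customName: "internal-a"}: "custom a",
		{detectorType: customRegex, customName: "internal-b"}: "custom b",
	}

	// Same type, different version or name: four distinct entries.
	fmt.Println(len(names))
	fmt.Println(names[detectorKey{detectorType: github, version: 2}])
}
```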
func (DetectorKey) Type ¶ added in v3.67.2
func (k DetectorKey) Type() detectorspb.DetectorType
Type returns the detector type of the key.
type DetectorMatch ¶ added in v3.78.1
type DetectorMatch struct {
Key DetectorKey
detectors.Detector
// contains filtered or unexported fields
}
DetectorMatch represents a detected pattern's metadata in a data chunk. It encapsulates the key identifying a specific detector, the detector instance itself, the start and end offsets of the matched keyword in the chunk, and the matched portions of the chunk data.
func (*DetectorMatch) Matches ¶ added in v3.78.1
func (d *DetectorMatch) Matches() [][]byte
Matches returns a slice of byte slices, each representing a matched portion of the chunk data.
type EntireChunkSpanCalculator ¶ added in v3.78.1
type EntireChunkSpanCalculator struct{}
EntireChunkSpanCalculator is a strategy that sets the match span to cover the entire chunk. Use it when a match should be evaluated against the full length of the provided chunk data.
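To make the strategy concrete, here is a rough sketch that contrasts an entire-chunk calculator with a fixed-window one. The `spanCalculator` interface shown here, its method signature, and the `fixedWindow` type are assumptions made for illustration; the package's unexported interface may look different.

```go
package main

import (
	"bytes"
	"fmt"
)

// spanCalculator is a hypothetical version of the unexported strategy
// interface: given where a keyword was found, return the span to examine.
type spanCalculator interface {
	calculateSpan(keywordIdx int, chunk []byte) (start, end int)
}

// entireChunk ignores the keyword position and spans the whole chunk.
type entireChunk struct{}

func (entireChunk) calculateSpan(_ int, chunk []byte) (int, int) {
	return 0, len(chunk)
}

// fixedWindow is a hypothetical alternative that spans size bytes
// starting at the keyword, clamped to the end of the chunk.
type fixedWindow struct{ size int }

func (w fixedWindow) calculateSpan(keywordIdx int, chunk []byte) (int, int) {
	end := keywordIdx + w.size
	if end > len(chunk) {
		end = len(chunk)
	}
	return keywordIdx, end
}

func main() {
	chunk := []byte("header\napi_key=abcdef\nfooter")
	idx := bytes.Index(chunk, []byte("api_key"))
	for _, calc := range []spanCalculator{entireChunk{}, fixedWindow{size: 14}} {
		start, end := calc.calculateSpan(idx, chunk)
		fmt.Printf("%T: %q\n", calc, chunk[start:end])
	}
}
```

The entire-chunk strategy trades a larger search area for the guarantee that nothing near the keyword is cut off by a window boundary.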