parser

package
v1.6.3
Warning

This package is not in the latest version of its module.

Published: Sep 9, 2024 License: MIT Imports: 32 Imported by: 1

README

Parser

The parser turns raw log lines into objects that heuristics can manipulate. Parsing happens in several stages, each represented by a directory under config/stage. Stages and their parsers are processed in alphabetical order.
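As a rough illustration of the ordering rule, the sketch below sorts a set of stage directory names alphabetically. The directory names and the orderStages helper are hypothetical and are not taken from this package.

```go
package main

import (
	"fmt"
	"sort"
)

// orderStages returns a sorted copy of the stage names, mirroring the
// "alphabetical order" rule described above. Illustration only: this is
// not the package's loader.
func orderStages(stages []string) []string {
	ordered := append([]string(nil), stages...)
	sort.Strings(ordered)
	return ordered
}

func main() {
	// Hypothetical stage directory names.
	stages := []string{"s02-enrich", "s00-raw", "s01-parse"}
	for i, s := range orderStages(stages) {
		fmt.Printf("%d: %s\n", i, s)
	}
}
```

Prefixing directory names with a sortable token is one way to make the processing order explicit.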

The runtime representation of a line being parsed (or of an overflow) is an Event. It has fields that the user can manipulate:

  • Parsed : a string dict containing parser outputs
  • Meta : a string dict containing meta information about the event
  • Line : a raw line representation
  • Overflow : a representation of the overflow if applicable

The Event structure goes through the stages, being altered with each parsing step. It's the same object that will be later poured into buckets.
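The idea of one object travelling through every stage can be sketched as follows. The Event type here is a simplified, hypothetical stand-in (the real one lives in another package and has richer fields), and the stage and runStages names are assumptions made for this example.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the runtime event.
type Event struct {
	Line     string            // raw line (the real Line is a richer structure)
	Parsed   map[string]string // parser outputs
	Meta     map[string]string // meta information
	Overflow string            // placeholder for the overflow representation
}

// stage is one parsing step; it mutates the event in place.
type stage func(evt *Event)

// runStages passes the same Event pointer through every stage, in order.
func runStages(evt *Event, stages []stage) {
	for _, s := range stages {
		s(evt)
	}
}

func main() {
	evt := &Event{
		Line:   "GET /index.html",
		Parsed: map[string]string{},
		Meta:   map[string]string{},
	}
	runStages(evt, []stage{
		func(e *Event) { e.Parsed["request"] = e.Line },
		func(e *Event) { e.Meta["log_type"] = "http_access-log" },
	})
	fmt.Println(evt.Parsed["request"], evt.Meta["log_type"])
}
```

Because every stage receives the same pointer, later stages see what earlier stages wrote.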

Parser configuration

A parser configuration is a Node object that can contain grok patterns and enrichment instructions.

For example :

filter: "evt.Line.Labels.type == 'testlog'"
debug: true
onsuccess: next_stage
name: tests/base-grok
pattern_syntax:
  MYCAP: ".*"
nodes:
  - grok:
      pattern: ^xxheader %{MYCAP:extracted_value} trailing stuff$
      apply_on: Line.Raw
statics:
  - meta: log_type
    value: parsed_testlog

Name

Optional. If present, and Prometheus or profiling is activated, stats are generated for this node.

Filter

filter: "Line.Src endsWith '/foobar'"

  • optional filter: an expression evaluated against the runtime representation of a line (the Event)
    • if the filter is present and returns false, the node is not evaluated
    • if the filter is absent, or present and returns true, the node is evaluated

Debug flag

debug: true

  • optional debug: a bool that enables debug output for the node (applies both at runtime and during configuration parsing)

OnSuccess flag

onsuccess: next_stage|continue

  • mandatory: indicates the behavior to follow if the node succeeds. next_stage makes the line go to the next stage, while continue keeps processing the current stage.

Statics

statics:
    - meta: service
      value: tcp
    - meta: source_ip
      expression: "Event['source_ip']"
    - parsed: "new_connection"
      expression: "Event['tcpflags'] contains 'S' ? 'true' : 'false'"
    - target: Parsed.this_is_a_test
      value: foobar

Statics apply when a node is considered successful, and are used to alter the Event structure. An empty node, a node with a grok pattern that succeeded or an enrichment directive that worked are successful nodes. Statics can :

  • meta: add/alter an entry in the Meta dict
  • parsed: add/alter an entry in the Parsed dict
  • target: indicate a destination field by name, such as Meta.my_key

The source of data can be :

  • value: a static value
  • expression: the result of an expression
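The destination and source rules above can be sketched in Go. This is a minimal illustration under stated assumptions, not the package's ProcessStatics: Event and Static are simplified, hypothetical types, and a plain Go func stands in for a compiled expression.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a simplified, hypothetical stand-in for the runtime event.
type Event struct {
	Parsed map[string]string
	Meta   map[string]string
}

// Static is a simplified, hypothetical version of one statics entry.
// One of Meta, Parsed or Target names the destination; the source is
// either Value or the result of Expr.
type Static struct {
	Meta, Parsed, Target string
	Value                string
	Expr                 func(evt *Event) string
}

// applyStatics writes each static into the event.
func applyStatics(evt *Event, statics []Static) error {
	for _, s := range statics {
		val := s.Value
		if s.Expr != nil {
			val = s.Expr(evt)
		}
		switch {
		case s.Meta != "":
			evt.Meta[s.Meta] = val
		case s.Parsed != "":
			evt.Parsed[s.Parsed] = val
		case s.Target != "":
			// Only "Meta.<key>" and "Parsed.<key>" are handled in this sketch.
			dict, key, ok := strings.Cut(s.Target, ".")
			switch {
			case ok && dict == "Meta":
				evt.Meta[key] = val
			case ok && dict == "Parsed":
				evt.Parsed[key] = val
			default:
				return fmt.Errorf("unsupported target %q", s.Target)
			}
		default:
			return fmt.Errorf("static has no destination")
		}
	}
	return nil
}

func main() {
	evt := &Event{Parsed: map[string]string{"tcpflags": "S"}, Meta: map[string]string{}}
	err := applyStatics(evt, []Static{
		{Meta: "service", Value: "tcp"},
		{Parsed: "new_connection", Expr: func(e *Event) string {
			if strings.Contains(e.Parsed["tcpflags"], "S") {
				return "true"
			}
			return "false"
		}},
		{Target: "Parsed.this_is_a_test", Value: "foobar"},
	})
	fmt.Println(evt.Meta, evt.Parsed, err)
}
```

The entries in main mirror the statics example above, with the ternary expression rewritten as a Go func.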

Grok patterns

Grok patterns are used to parse one field of Event into one or several others :

grok:
  name: "TCPDUMP_OUTPUT"
  apply_on: message

name is the name of a pattern loaded from patterns/. Base patterns can be seen on the repo : https://github.com/crowdsecurity/grokky/blob/master/base.go


grok:
  pattern: "^%{GREEDYDATA:request}\\?%{GREEDYDATA:http_args}$"
  apply_on: request

pattern is a valid grok pattern, optionally accompanied by an apply_on that indicates the field it should be applied to.
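To show what such a pattern extracts, here is a sketch using Go's standard regexp package rather than the grok engine. It assumes that GREEDYDATA stands for ".*" and that each %{NAME:field} behaves like a named capture group; the captures helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumed regexp equivalent of the grok pattern above, taking GREEDYDATA
// to mean ".*" and each %{NAME:field} to be a named capture group.
var requestRe = regexp.MustCompile(`^(?P<request>.*)\?(?P<http_args>.*)$`)

// captures returns the named groups of a match, or nil when the input
// does not match.
func captures(re *regexp.Regexp, input string) map[string]string {
	m := re.FindStringSubmatch(input)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	fmt.Println(captures(requestRe, "/index.php?id=1"))
}
```

In this sketch an input without a "?" does not match at all, so no field is produced.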

Patterns syntax

Declared at the Node level, pattern_syntax is a list of named sub-patterns (subgroks) that can then be reused in grok patterns.

pattern_syntax:
  DIR: "^.*/"
  FILE: "[^/].*$"

Enrichment

The Enrichment mechanism is exposed via statics :

statics:
  - method: GeoIpCity
    expression: Meta.source_ip
  - meta: IsoCode
    expression: Enriched.IsoCode
  - meta: IsInEU
    expression: Enriched.IsInEU

The GeoIpCity method is called with the value of Meta.source_ip. Enrichment plugins can output one or more key/value pairs in the Enriched map, and it is up to the user to copy the relevant values to Meta or another field.
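The flow can be sketched as a lookup in a table of registered methods followed by a merge into Enriched. Everything below is a simplified assumption: Event is a stand-in type, enrichFunc drops the logger argument of the real EnrichFunc signature, and fakeGeo is a hypothetical enricher backed by a hard-coded table instead of a GeoIP database.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a simplified, hypothetical stand-in for the runtime event.
type Event struct {
	Meta     map[string]string
	Enriched map[string]string
}

// enrichFunc is a simplified version of the EnrichFunc signature
// (the logger argument is omitted).
type enrichFunc func(field string, evt *Event) (map[string]string, error)

// fakeGeo is a hypothetical enricher using a hard-coded table.
// 192.0.2.1 belongs to a documentation-only address range.
func fakeGeo(field string, evt *Event) (map[string]string, error) {
	table := map[string]string{"192.0.2.1": "FR"}
	iso, ok := table[field]
	if !ok {
		return nil, errors.New("no data for " + field)
	}
	return map[string]string{"IsoCode": iso}, nil
}

// enrich calls the registered method with the given value and merges
// its output into evt.Enriched.
func enrich(registered map[string]enrichFunc, method, value string, evt *Event) error {
	fn, ok := registered[method]
	if !ok {
		return fmt.Errorf("unknown enrichment method %q", method)
	}
	out, err := fn(value, evt)
	if err != nil {
		return err
	}
	for k, v := range out {
		evt.Enriched[k] = v
	}
	return nil
}

func main() {
	evt := &Event{
		Meta:     map[string]string{"source_ip": "192.0.2.1"},
		Enriched: map[string]string{},
	}
	registered := map[string]enrichFunc{"GeoIpCity": fakeGeo}
	if err := enrich(registered, "GeoIpCity", evt.Meta["source_ip"], evt); err != nil {
		fmt.Println("enrichment failed:", err)
		return
	}
	// Copy the relevant value to Meta, as the statics in the example do.
	evt.Meta["IsoCode"] = evt.Enriched["IsoCode"]
	fmt.Println(evt.Meta["IsoCode"])
}
```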

Trees

The Node object also accepts a nodes entry, which is a list of Node entries, allowing you to build trees.

filter: "Event['program'] == 'nginx'" #A
nodes: #A'
  - grok: #B
      name: "NGINXACCESS"
      # this statics will apply only if the above grok pattern matched
      statics: #B'
        - meta: log_type
          value: "http_access-log"
  - grok: #C
      name: "NGINXERROR"
      statics:
        - meta: log_type
          value: "http_error-log"
statics: #D
  - meta: service
    value: http

The evaluation process of a node is as follows:

  • apply the filter (A), if it doesn't match, exit
  • iterate over the list of nodes (A') and apply the node process to each.
  • if a grok entry is present, process it
    • if the grok entry returned data, apply the local statics of the node (if the grok 'B' was successful, apply B' statics)
  • if any of the nodes or the grok was successful, apply the statics (D)
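The steps above can be sketched as a small recursive function. This is an illustration under stated assumptions rather than the package's implementation: Event and Node are simplified, hypothetical types, Go funcs stand in for compiled filters and grok patterns, statics are reduced to Meta key/value pairs, and any early exit from the list of child nodes is not modeled.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the runtime event.
type Event struct {
	Parsed map[string]string
	Meta   map[string]string
}

// Node is a simplified, hypothetical version of a parser node.
type Node struct {
	Filter      func(evt *Event) bool // nil means "no filter" (A)
	Grok        func(evt *Event) bool // nil means "no grok"; true means it returned data
	GrokStatics map[string]string     // applied only if Grok succeeded (B')
	Children    []Node                // the "nodes" entry (A')
	Statics     map[string]string     // applied if the node succeeded (D)
}

// process follows the steps listed above and reports whether the node
// succeeded.
func process(n Node, evt *Event) bool {
	if n.Filter != nil && !n.Filter(evt) {
		return false // filter did not match: exit
	}
	// An empty node (no children, no grok) counts as successful.
	success := len(n.Children) == 0 && n.Grok == nil
	for _, child := range n.Children {
		if process(child, evt) {
			success = true
		}
	}
	if n.Grok != nil && n.Grok(evt) {
		success = true
		for k, v := range n.GrokStatics {
			evt.Meta[k] = v
		}
	}
	if success {
		for k, v := range n.Statics {
			evt.Meta[k] = v
		}
	}
	return success
}

func main() {
	tree := Node{
		Filter: func(e *Event) bool { return e.Parsed["program"] == "nginx" },
		Children: []Node{
			{Grok: func(*Event) bool { return false }, GrokStatics: map[string]string{"log_type": "http_access-log"}},
			{Grok: func(*Event) bool { return true }, GrokStatics: map[string]string{"log_type": "http_error-log"}},
		},
		Statics: map[string]string{"service": "http"},
	}
	evt := &Event{Parsed: map[string]string{"program": "nginx"}, Meta: map[string]string{}}
	fmt.Println(process(tree, evt), evt.Meta)
}
```

The tree in main mirrors the nginx example, with the second grok pretending to match.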

Code Organisation

Main structs :

  • Node (config.go) : the runtime representation of parser configuration
  • Event (runtime.go) : the runtime representation of the line being parsed

Main funcs :

  • CompileNode : turns YAML into runtime-ready tree (Node)
  • ProcessNode : processes the raw line against the parser tree, and produces ready-for-buckets data

Documentation

Constants

This section is empty.

Variables

var DumpFolder string
var NodesHits = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_total",
		Help: "Total events entered node.",
	},
	[]string{"source", "type", "name"},
)
var NodesHitsKo = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_ko_total",
		Help: "Total events unsuccessfully exited node.",
	},
	[]string{"source", "type", "name"},
)
var NodesHitsOk = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_ok_total",
		Help: "Total events successfully exited node.",
	},
	[]string{"source", "type", "name"},
)
var NodesWlHits = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_wl_hits_total",
		Help: "Total events processed by whitelist node.",
	},
	[]string{"source", "type", "name", "reason"},
)
var NodesWlHitsOk = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_wl_hits_ok_total",
		Help: "Total events successfully whitelisted by node.",
	},
	[]string{"source", "type", "name", "reason"},
)
var ParseDump bool
var StageParseCache dumps.ParserResults
var StageParseMutex sync.Mutex

Functions

func GenDateParse

func GenDateParse(date string) (string, time.Time)

func GeoIpASN

func GeoIpASN(field string, p *types.Event, plog *log.Entry) (map[string]string, error)

func GeoIpCity

func GeoIpCity(field string, p *types.Event, plog *log.Entry) (map[string]string, error)

func IpToRange

func IpToRange(field string, p *types.Event, plog *log.Entry) (map[string]string, error)

func Parse

func Parse(ctx UnixParserCtx, xp types.Event, nodes []Node) (types.Event, error)

func ParseDate

func ParseDate(in string, p *types.Event, plog *log.Entry) (map[string]string, error)

func SetTargetByName

func SetTargetByName(target string, value string, evt *types.Event) bool

ok, this is kinda experimental, I don't know how bad of an idea it is ..

Types

type DataCapture added in v1.5.3

type DataCapture struct {
	Name            string        `yaml:"name,omitempty"`
	Key             string        `yaml:"key,omitempty"`
	KeyExpression   *vm.Program   `yaml:"-"`
	Value           string        `yaml:"value,omitempty"`
	ValueExpression *vm.Program   `yaml:"-"`
	TTL             string        `yaml:"ttl,omitempty"`
	TTLVal          time.Duration `yaml:"-"`
	MaxMapSize      int           `yaml:"size,omitempty"`
	Strategy        string        `yaml:"strategy,omitempty"`
}

type EnrichFunc

type EnrichFunc func(string, *types.Event, *log.Entry) (map[string]string, error)

should be part of a package shared with enrich/geoip.go

type Enricher added in v1.2.0

type Enricher struct {
	Name       string
	EnrichFunc EnrichFunc
}

type EnricherCtx

type EnricherCtx struct {
	Registered map[string]*Enricher
}

func Loadplugin

func Loadplugin() (EnricherCtx, error)

mimic plugin loading

type ExprWhitelist added in v1.4.0

type ExprWhitelist struct {
	Filter *vm.Program
}

type ExtraField added in v1.5.3

type ExtraField struct {
	//if the target is indicated by name Struct.Field etc,
	TargetByName string `yaml:"target,omitempty"`
	//if the target field is in Event map
	Parsed string `yaml:"parsed,omitempty"`
	//if the target field is in Meta map
	Meta string `yaml:"meta,omitempty"`
	//if the target field is in Enriched map
	Enriched string `yaml:"enriched,omitempty"`
	//the source is a static value
	Value string `yaml:"value,omitempty"`
	//or the result of an Expression
	ExpValue     string      `yaml:"expression,omitempty"`
	RunTimeValue *vm.Program `json:"-"` //the actual compiled filter
	//or an enrichment method
	Method string `yaml:"method,omitempty"`
}

Used mostly for statics

type GrokPattern added in v1.5.3

type GrokPattern struct {
	//the field to which regexp is going to apply
	TargetField string `yaml:"apply_on,omitempty"`
	//the grok/regexp by name (loaded from patterns/*)
	RegexpName string `yaml:"name,omitempty"`
	//a proper grok pattern
	RegexpValue string `yaml:"pattern,omitempty"`
	//the runtime form of regexpname / regexpvalue
	RunTimeRegexp grokky.Pattern `json:"-"` //the actual regexp
	//the output of the expression is going to be the source for regexp
	ExpValue     string      `yaml:"expression,omitempty"`
	RunTimeValue *vm.Program `json:"-"` //the actual compiled filter
	//a grok can contain statics that apply if pattern is successful
	Statics []ExtraField `yaml:"statics,omitempty"`
}

type InitFunc

type InitFunc func(map[string]string) (interface{}, error)

type Node

type Node struct {
	FormatVersion string `yaml:"format"`
	// Enable config + runtime debug of node via config o/
	Debug bool `yaml:"debug,omitempty"`
	// If enabled, the node (and its child) will report their own statistics
	Profiling bool `yaml:"profiling,omitempty"`
	// Name, author, description and reference(s) for parser pattern
	Name        string   `yaml:"name,omitempty"`
	Author      string   `yaml:"author,omitempty"`
	Description string   `yaml:"description,omitempty"`
	References  []string `yaml:"references,omitempty"`
	// if debug is present in the node, keep its specific Logger in runtime structure
	Logger *log.Entry `yaml:"-"`
	// This is mostly a hack to make writing less repetitive.
	// relying on stage, we know which field to parse, and we
	// can also promote log to next stage on success
	Stage string `yaml:"stage,omitempty"`
	// OnSuccess allows to tag a node to be able to move log to next stage on success
	OnSuccess string `yaml:"onsuccess,omitempty"`

	// Filter is executed at runtime (with current log line as context)
	// and must succeed or node is exited
	Filter        string      `yaml:"filter,omitempty"`
	RunTimeFilter *vm.Program `yaml:"-" json:"-"` // the actual compiled filter
	// If node has leafs, execute all of them until one asks for a 'break'
	LeavesNodes []Node `yaml:"nodes,omitempty"`
	// Flag used to describe when to 'break' or return an 'error'
	EnrichFunctions EnricherCtx

	/* If the node is actually a leaf, it can have : grok, enrich, statics */
	// pattern_syntax are named grok patterns that are re-utilized over several grok patterns
	SubGroks yaml.MapSlice `yaml:"pattern_syntax,omitempty"`

	// Holds a grok pattern
	Grok GrokPattern `yaml:"grok,omitempty"`
	// Statics can be present in any type of node and is executed last
	Statics []ExtraField `yaml:"statics,omitempty"`
	// Stash allows to capture data from the log line and store it in an accessible cache
	Stash []DataCapture `yaml:"stash,omitempty"`
	// Whitelists
	Whitelist Whitelist           `yaml:"whitelist,omitempty"`
	Data      []*types.DataSource `yaml:"data,omitempty"`
	// contains filtered or unexported fields
}

func LoadStages

func LoadStages(stageFiles []Stagefile, pctx *UnixParserCtx, ectx EnricherCtx) ([]Node, error)

func (*Node) CheckExprWL added in v1.5.5

func (n *Node) CheckExprWL(cachedExprEnv map[string]interface{}, p *types.Event) (bool, error)

func (*Node) CheckIPsWL added in v1.5.5

func (n *Node) CheckIPsWL(p *types.Event) bool

func (*Node) CompileWLs added in v1.5.5

func (n *Node) CompileWLs() (bool, error)

func (*Node) ContainsExprLists added in v1.5.5

func (n *Node) ContainsExprLists() bool

func (*Node) ContainsIPLists added in v1.5.5

func (n *Node) ContainsIPLists() bool

func (*Node) ContainsWLs added in v1.5.5

func (n *Node) ContainsWLs() bool

func (*Node) ProcessStatics added in v1.0.0

func (n *Node) ProcessStatics(statics []ExtraField, event *types.Event) error

type Parsers added in v1.0.0

type Parsers struct {
	Ctx             *UnixParserCtx
	Povfwctx        *UnixParserCtx
	StageFiles      []Stagefile
	PovfwStageFiles []Stagefile
	Nodes           []Node
	Povfwnodes      []Node
	EnricherCtx     EnricherCtx
}

func LoadParsers added in v1.0.0

func LoadParsers(cConfig *csconfig.Config, parsers *Parsers) (*Parsers, error)

func NewParsers added in v1.5.0

func NewParsers(hub *cwhub.Hub) *Parsers

NewParsers returns a new Parsers value. Nodes and Povfwnodes are already initialized in parser.LoadStages.

type Stagefile

type Stagefile struct {
	Filename string `yaml:"filename"`
	Stage    string `yaml:"stage"`
}

type UnixParserCtx

type UnixParserCtx struct {
	Grok       grokky.Host
	Stages     []string
	Profiling  bool
	DataFolder string
}

func Init added in v1.0.0

func Init(c map[string]interface{}) (*UnixParserCtx, error)

type Whitelist added in v1.4.0

type Whitelist struct {
	Reason  string   `yaml:"reason,omitempty"`
	Ips     []string `yaml:"ip,omitempty"`
	B_Ips   []net.IP
	Cidrs   []string `yaml:"cidr,omitempty"`
	B_Cidrs []*net.IPNet
	Exprs   []string `yaml:"expression,omitempty"`
	B_Exprs []*ExprWhitelist
}
