parser

package
v1.5.11
Published: Jul 20, 2023 License: MIT Imports: 33 Imported by: 0

README

Parser

The parser turns raw log lines into objects that heuristics can manipulate. Parsing runs in several stages, each represented by a directory under config/stage. Stages, and the parsers within them, are processed in alphabetical order.
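
As a rough illustration of the ordering rule, the sketch below sorts a set of stage directory names the way an alphabetical listing would. The directory names and the `orderStages` helper are assumptions made for this example; they are not taken from this README or from the package API.

```go
package main

import (
	"fmt"
	"sort"
)

// orderStages returns a sorted copy of the stage names and leaves the
// input slice untouched. Hypothetical helper, not part of the package.
func orderStages(stages []string) []string {
	out := append([]string(nil), stages...)
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative directory names; a numeric prefix pins the order.
	fmt.Println(orderStages([]string{"s02-enrich", "s00-raw", "s01-parse"}))
}
```

Prefixing directory names with a number is one simple way to make the alphabetical order match the intended processing order.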

The runtime representation of a line being parsed (or an overflow) is an Event, which has fields the user can manipulate:

  • Parsed : a string dict containing parser outputs
  • Meta : a string dict containing meta information about the event
  • Line : a raw line representation
  • Overflow : a representation of the overflow if applicable

The Event passes through the stages and is altered at each parsing step. The same object is later poured into buckets.
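
The idea of one object being altered step by step can be sketched as follows. The `Event` and `Line` types here are reduced stand-ins written for this example (the real ones live in the types package and carry more fields), and `step`, `runSteps` and `newEvent` are hypothetical names.

```go
package main

import "fmt"

// Line is a simplified stand-in for the raw line representation.
type Line struct {
	Raw    string
	Labels map[string]string
}

// Event is a reduced, hypothetical model of the runtime event.
type Event struct {
	Line   Line
	Parsed map[string]string
	Meta   map[string]string
}

// step is one parsing step; it alters the event in place.
type step func(evt *Event)

// runSteps passes the same *Event through every step, in order.
func runSteps(evt *Event, steps []step) {
	for _, s := range steps {
		s(evt)
	}
}

// newEvent builds an event with initialized maps so steps can write to them.
func newEvent(raw string) *Event {
	return &Event{
		Line:   Line{Raw: raw, Labels: map[string]string{}},
		Parsed: map[string]string{},
		Meta:   map[string]string{},
	}
}

func main() {
	evt := newEvent("xxheader hello trailing stuff")
	runSteps(evt, []step{
		func(e *Event) { e.Parsed["extracted_value"] = "hello" },
		func(e *Event) { e.Meta["log_type"] = "parsed_testlog" },
	})
	fmt.Println(evt.Parsed, evt.Meta)
}
```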

Parser configuration

A parser configuration is a Node object that can contain grok patterns and enrichment instructions.

For example:

filter: "evt.Line.Labels.type == 'testlog'"
debug: true
onsuccess: next_stage
name: tests/base-grok
pattern_syntax:
  MYCAP: ".*"
nodes:
  - grok:
      pattern: ^xxheader %{MYCAP:extracted_value} trailing stuff$
      apply_on: Line.Raw
statics:
  - meta: log_type
    value: parsed_testlog

Name

Optional. If present, and Prometheus or profiling is enabled, statistics are generated for this node.

Filter

filter: "Line.Src endsWith '/foobar'"

  • optional filter: an expression evaluated against the runtime representation of a line (the Event)
    • if the filter is present and returns false, the node is not evaluated
    • if the filter is absent, or present and returns true, the node is evaluated
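
The two rules above reduce to a small decision function. This is a minimal sketch: the real filter is an expression compiled at load time, whereas here it is modeled as a plain Go function, and `Filter`, `shouldEvaluate` and `srcEndsWithFoobar` are names invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Src string
}

// Filter is a hypothetical compiled filter; nil means no filter is configured.
type Filter func(evt Event) bool

// srcEndsWithFoobar plays the role of: Line.Src endsWith '/foobar'.
func srcEndsWithFoobar(evt Event) bool {
	return strings.HasSuffix(evt.Src, "/foobar")
}

// shouldEvaluate reports whether the node is evaluated:
// an absent filter, or a filter returning true, lets the node run.
func shouldEvaluate(f Filter, evt Event) bool {
	if f == nil {
		return true
	}
	return f(evt)
}

func main() {
	evt := Event{Src: "/var/log/foobar"}
	fmt.Println(shouldEvaluate(nil, evt))
	fmt.Println(shouldEvaluate(srcEndsWithFoobar, evt))
	fmt.Println(shouldEvaluate(srcEndsWithFoobar, Event{Src: "/var/log/other"}))
}
```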
Debug flag

debug: true

  • optional debug: a bool that enables debug output for the node (applies at runtime and during configuration parsing)

OnSuccess flag

onsuccess: next_stage|continue

  • mandatory: indicates the behavior to follow if the node succeeds. next_stage makes the line go to the next stage, while continue keeps processing the current stage.
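
A minimal sketch of how the two values could differ inside one stage is shown below. It only models the behavior described above; what happens to a line when no node asks for next_stage is deliberately left to the caller. `Node`, `Match` and `runStage` are simplified, hypothetical stand-ins rather than the package's own types.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a reduced stand-in for a parser node.
type Node struct {
	Name      string
	OnSuccess string                 // "next_stage" or "continue"
	Match     func(line string) bool // stand-in for filter + grok
}

// runStage runs the nodes of one stage in order. It returns the names of
// the nodes that succeeded and whether one of them asked for next_stage.
func runStage(nodes []Node, line string) (succeeded []string, nextStage bool) {
	for _, n := range nodes {
		if !n.Match(line) {
			continue
		}
		succeeded = append(succeeded, n.Name)
		if n.OnSuccess == "next_stage" {
			// Stop here: the line leaves the current stage.
			return succeeded, true
		}
		// "continue": keep going through the remaining nodes of this stage.
	}
	return succeeded, false
}

func main() {
	has := func(s string) func(string) bool {
		return func(line string) bool { return strings.Contains(line, s) }
	}
	nodes := []Node{
		{Name: "tag-ssh", OnSuccess: "continue", Match: has("sshd")},
		{Name: "parse-ssh", OnSuccess: "next_stage", Match: has("Failed password")},
		{Name: "not-reached", OnSuccess: "continue", Match: has("sshd")},
	}
	fmt.Println(runStage(nodes, "sshd: Failed password for root"))
}
```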
Statics

statics:
    - meta: service
      value: tcp
    - meta: source_ip
      expression: "Event['source_ip']"
    - parsed: "new_connection"
      expression: "Event['tcpflags'] contains 'S' ? 'true' : 'false'"
    - target: Parsed.this_is_a_test
      value: foobar

Statics apply when a node is considered successful, and are used to alter the Event structure. A node is successful if it is empty, if its grok pattern matched, or if its enrichment directive worked. A static names a destination:

  • meta: add/alter an entry in the Meta dict
  • parsed: add/alter an entry in the Parsed dict
  • target: indicate a destination field by name, such as Meta.my_key

The source of data can be:

  • value: a static value
  • expression: the result of an expression
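
The destination/source pairing can be sketched in a few lines of Go. This is an illustration under assumptions: expressions are modeled as Go functions instead of the expression language used in the YAML, `Event['source_ip']` is assumed to read from the Parsed dict, and `Static`, `applyStatics` and `exampleStatics` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Parsed map[string]string
	Meta   map[string]string
}

// Static mirrors the YAML keys: one destination (Meta or Parsed) and one
// source (Value or Expression). Expression is a Go func for this sketch.
type Static struct {
	Meta       string
	Parsed     string
	Value      string
	Expression func(evt *Event) string
}

// applyStatics resolves each source and writes it to its destination.
func applyStatics(statics []Static, evt *Event) {
	for _, s := range statics {
		v := s.Value
		if s.Expression != nil {
			v = s.Expression(evt)
		}
		switch {
		case s.Meta != "":
			evt.Meta[s.Meta] = v
		case s.Parsed != "":
			evt.Parsed[s.Parsed] = v
		}
	}
}

// exampleStatics mirrors the first three entries of the YAML example.
func exampleStatics() []Static {
	return []Static{
		{Meta: "service", Value: "tcp"},
		{Meta: "source_ip", Expression: func(e *Event) string { return e.Parsed["source_ip"] }},
		{Parsed: "new_connection", Expression: func(e *Event) string {
			if strings.Contains(e.Parsed["tcpflags"], "S") {
				return "true"
			}
			return "false"
		}},
	}
}

func main() {
	evt := &Event{
		Parsed: map[string]string{"source_ip": "192.0.2.1", "tcpflags": "S"},
		Meta:   map[string]string{},
	}
	applyStatics(exampleStatics(), evt)
	fmt.Println(evt.Meta, evt.Parsed)
}
```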
Grok patterns

Grok patterns are used to parse one field of the Event into one or several others:

grok:
  name: "TCPDUMP_OUTPUT"
  apply_on: message

name is the name of a pattern loaded from patterns/. Base patterns can be seen in the repo: https://github.com/crowdsecurity/grokky/blob/master/base.go


grok:
  pattern: "^%{GREEDYDATA:request}\\?%{GREEDYDATA:http_args}$"
  apply_on: request

pattern is an inline grok pattern. Both forms accept an optional apply_on that indicates which field the pattern is applied to.
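
To make the "one field into several others" idea concrete, here is a hand-written regular expression that plays the role of the inline pattern above, assuming GREEDYDATA expands to `.*`. It uses only the standard `regexp` package, not the grok engine; `requestRe` and `extract` are names made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// requestRe is a manual translation of
// ^%{GREEDYDATA:request}\?%{GREEDYDATA:http_args}$
// into named capture groups (assumption: GREEDYDATA is ".*").
var requestRe = regexp.MustCompile(`^(?P<request>.*)\?(?P<http_args>.*)$`)

// extract returns the named captures of re on input, or nil if there is no match.
func extract(re *regexp.Regexp, input string) map[string]string {
	m := re.FindStringSubmatch(input)
	if m == nil {
		return nil
	}
	out := map[string]string{}
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	fmt.Println(extract(requestRe, "/index.php?id=1"))
	fmt.Println(extract(requestRe, "/no-query-string") == nil)
}
```

Each named capture becomes a new key, which is how a single source field (here request) fans out into several parsed values.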

Patterns syntax

Declared at the top level of a Node, pattern_syntax is a map of named sub-patterns (subgroks) that the node's grok patterns can reference.

pattern_syntax:
  DIR: "^.*/"
  FILE: "[^/].*$"
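
How named sub-patterns might be substituted into a larger pattern can be sketched with the standard `regexp` package. This is not the grok engine's implementation: `expand` and `capture` are hypothetical helpers, only one level of `%{NAME}` or `%{NAME:field}` references is handled, and the combined pattern `%{DIR:dir}%{FILE:file}` is an example invented here.

```go
package main

import (
	"fmt"
	"regexp"
)

// ref matches %{NAME} and %{NAME:field}.
var ref = regexp.MustCompile(`%\{(\w+)(?::(\w+))?\}`)

// expand replaces each reference with its sub-pattern, wrapped in a named
// group when a field name is given and in a plain group otherwise.
func expand(pattern string, subs map[string]string) (string, error) {
	missing := ""
	out := ref.ReplaceAllStringFunc(pattern, func(tok string) string {
		m := ref.FindStringSubmatch(tok)
		sub, ok := subs[m[1]]
		if !ok {
			missing = m[1]
			return tok
		}
		if m[2] != "" {
			return "(?P<" + m[2] + ">" + sub + ")"
		}
		return "(?:" + sub + ")"
	})
	if missing != "" {
		return "", fmt.Errorf("unknown sub-pattern %q", missing)
	}
	return out, nil
}

// capture compiles expr and returns its named captures on input (nil if no match).
func capture(expr, input string) (map[string]string, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	m := re.FindStringSubmatch(input)
	if m == nil {
		return nil, nil
	}
	out := map[string]string{}
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out, nil
}

func main() {
	subs := map[string]string{"DIR": "^.*/", "FILE": "[^/].*$"}
	expr, err := expand("%{DIR:dir}%{FILE:file}", subs)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(expr)
	fmt.Println(capture(expr, "/var/log/nginx/access.log"))
}
```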
Enrichment

The enrichment mechanism is exposed via statics:

statics:
  - method: GeoIpCity
    expression: Meta.source_ip
  - meta: IsoCode
    expression: Enriched.IsoCode
  - meta: IsInEU
    expression: Enriched.IsInEU

The GeoIpCity method is called with the value of Meta.source_ip. Enrichment plugins can output one or more key/value pairs in the Enriched map, and it is up to the user to copy the relevant values to Meta or elsewhere.
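
The two-step flow (a method fills Enriched, then statics copy selected keys to Meta) can be sketched as below. The lookup table is fabricated for the example and uses a documentation-range address; `fakeGeoIpCity`, `enrich` and `copyToMeta` are hypothetical names and do not reflect how the real GeoIpCity method is implemented.

```go
package main

import "fmt"

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Meta     map[string]string
	Enriched map[string]string
}

// fakeGeoIpCity stands in for a GeoIP lookup, backed by a hard-coded,
// made-up table instead of a database.
func fakeGeoIpCity(ip string) map[string]string {
	table := map[string]map[string]string{
		"192.0.2.10": {"IsoCode": "FR", "IsInEU": "true"},
	}
	return table[ip]
}

// enrich calls the method with value and merges its output into evt.Enriched.
func enrich(evt *Event, method func(string) map[string]string, value string) {
	for k, v := range method(value) {
		evt.Enriched[k] = v
	}
}

// copyToMeta copies the selected keys from Enriched to Meta, as the
// "meta: ... / expression: Enriched...." statics do in the YAML example.
func copyToMeta(evt *Event, keys ...string) {
	for _, k := range keys {
		if v, ok := evt.Enriched[k]; ok {
			evt.Meta[k] = v
		}
	}
}

func main() {
	evt := &Event{
		Meta:     map[string]string{"source_ip": "192.0.2.10"},
		Enriched: map[string]string{},
	}
	enrich(evt, fakeGeoIpCity, evt.Meta["source_ip"])
	copyToMeta(evt, "IsoCode", "IsInEU")
	fmt.Println(evt.Meta)
}
```

Keeping the enrichment output in a separate map means a plugin can return many keys while the parser author decides which ones end up in Meta.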

Trees

The Node object also accepts a nodes entry, which is a list of Node entries, allowing you to build trees.

filter: "Event['program'] == 'nginx'" #A
nodes: #A'
  - grok: #B
      name: "NGINXACCESS"
      # these statics apply only if the above grok pattern matched
      statics: #B'
        - meta: log_type
          value: "http_access-log"
  - grok: #C
      name: "NGINXERROR"
      statics:
        - meta: log_type
          value: "http_error-log"
statics: #D
  - meta: service
    value: http

The evaluation process of a node is as follows:

  • apply the filter (A); if it doesn't match, exit
  • iterate over the list of nodes (A') and apply the node process to each
  • if a grok entry is present, process it
    • if the grok entry returned data, apply the local statics of the node (if grok B was successful, apply the B' statics)
  • if any of the nodes or the grok was successful, apply the statics (D)
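
The steps above can be sketched as a short recursive function. This follows the README's description only; it is not the package's actual node processing code, and `Node`, `Grok`, `process`, `applyMeta` and `nginxTree` are simplified, hypothetical stand-ins in which grok matching is reduced to a substring test and statics to Meta entries.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Program string
	Line    string
	Meta    map[string]string
}

// Grok is a stand-in for a grok entry with its local statics (B').
type Grok struct {
	Match   func(line string) bool
	Statics map[string]string
}

// Node is a reduced parser node: filter (A), children (A'), grok, statics (D).
type Node struct {
	Filter  func(evt *Event) bool
	Nodes   []Node
	Grok    *Grok
	Statics map[string]string
}

func applyMeta(statics map[string]string, evt *Event) {
	for k, v := range statics {
		evt.Meta[k] = v
	}
}

// process applies the evaluation steps and reports whether the node succeeded.
func process(n Node, evt *Event) bool {
	if n.Filter != nil && !n.Filter(evt) {
		return false // filter (A) did not match: exit
	}
	// An empty node counts as successful.
	success := len(n.Nodes) == 0 && n.Grok == nil
	for _, child := range n.Nodes {
		if process(child, evt) {
			success = true
		}
	}
	if n.Grok != nil && n.Grok.Match(evt.Line) {
		applyMeta(n.Grok.Statics, evt) // local statics (B')
		success = true
	}
	if success {
		applyMeta(n.Statics, evt) // node statics (D)
	}
	return success
}

// nginxTree mirrors the shape of the YAML tree above.
func nginxTree() Node {
	contains := func(s string) func(string) bool {
		return func(line string) bool { return strings.Contains(line, s) }
	}
	return Node{
		Filter: func(evt *Event) bool { return evt.Program == "nginx" },
		Nodes: []Node{
			{Grok: &Grok{Match: contains("GET "), Statics: map[string]string{"log_type": "http_access-log"}}},
			{Grok: &Grok{Match: contains("[error]"), Statics: map[string]string{"log_type": "http_error-log"}}},
		},
		Statics: map[string]string{"service": "http"},
	}
}

func main() {
	evt := &Event{Program: "nginx", Line: "GET /index.html 200", Meta: map[string]string{}}
	fmt.Println(process(nginxTree(), evt), evt.Meta)
}
```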

Code Organisation

Main structs:

  • Node (config.go) : the runtime representation of parser configuration
  • Event (runtime.go) : the runtime representation of the line being parsed

Main funcs:

  • CompileNode : turns YAML into a runtime-ready tree (Node)
  • ProcessNode : processes the raw line against the parser tree, and produces ready-for-buckets data

Documentation

Constants

This section is empty.

Variables

var DumpFolder string
var NodesHits = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_total",
		Help: "Total events entered node.",
	},
	[]string{"source", "type", "name"},
)
var NodesHitsKo = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_ko_total",
		Help: "Total events unsuccessfully exited node.",
	},
	[]string{"source", "type", "name"},
)
var NodesHitsOk = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "cs_node_hits_ok_total",
		Help: "Total events successfully exited node.",
	},
	[]string{"source", "type", "name"},
)
var ParseDump bool
var StageParseCache map[string]map[string][]ParserResult
var StageParseMutex sync.Mutex

Functions

func GenDateParse

func GenDateParse(date string) (string, time.Time)

func GeoIPASNInit

func GeoIPASNInit(cfg map[string]string) (interface{}, error)

func GeoIPCityInit

func GeoIPCityInit(cfg map[string]string) (interface{}, error)

func GeoIpASN

func GeoIpASN(field string, p *types.Event, ctx interface{}, plog *log.Entry) (map[string]string, error)

func GeoIpCity

func GeoIpCity(field string, p *types.Event, ctx interface{}, plog *log.Entry) (map[string]string, error)

func IpToRange

func IpToRange(field string, p *types.Event, ctx interface{}, plog *log.Entry) (map[string]string, error)

func IpToRangeInit

func IpToRangeInit(cfg map[string]string) (interface{}, error)

func Parse

func Parse(ctx UnixParserCtx, xp types.Event, nodes []Node) (types.Event, error)

func ParseDate

func ParseDate(in string, p *types.Event, x interface{}, plog *log.Entry) (map[string]string, error)

func SetTargetByName

func SetTargetByName(target string, value string, evt *types.Event) bool

SetTargetByName is experimental; it is unclear how good an idea it is.
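
For intuition about what setting a target "by name" involves, here is a minimal map-based sketch for names such as Meta.my_key. It is not how SetTargetByName is implemented: the `Event` type is a reduced stand-in, only three top-level maps are supported, and `setTarget` is a hypothetical function written for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Parsed   map[string]string
	Meta     map[string]string
	Enriched map[string]string
}

// setTarget writes value to the "Section.key" destination and reports
// whether the destination could be resolved.
func setTarget(target, value string, evt *Event) bool {
	section, key, ok := strings.Cut(target, ".")
	if !ok || key == "" {
		return false
	}
	var m map[string]string
	switch section {
	case "Parsed":
		m = evt.Parsed
	case "Meta":
		m = evt.Meta
	case "Enriched":
		m = evt.Enriched
	default:
		return false
	}
	if m == nil {
		return false // destination map was never initialized
	}
	m[key] = value
	return true
}

func main() {
	evt := &Event{Parsed: map[string]string{}, Meta: map[string]string{}}
	fmt.Println(setTarget("Parsed.this_is_a_test", "foobar", evt), evt.Parsed)
	fmt.Println(setTarget("Unknown.field", "x", evt))
}
```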

Types

type EnrichFunc

type EnrichFunc func(string, *types.Event, interface{}, *log.Entry) (map[string]string, error)

EnrichFunc should be part of a package shared with enrich/geoip.go.

type Enricher

type Enricher struct {
	Name       string
	InitFunc   InitFunc
	EnrichFunc EnrichFunc
	Ctx        interface{}
}

type EnricherCtx

type EnricherCtx struct {
	Registered map[string]*Enricher
}

func Loadplugin

func Loadplugin(path string) (EnricherCtx, error)

Loadplugin mimics plugin loading.
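
One way the pieces above could fit together is a registry keyed by name: the init function produces a context once, and that context is handed back on every enrich call. The sketch below redefines the types locally with a simplified `EnrichFunc` (no logger argument) and a reduced `Event`; `register`, `call`, `taggerInit` and `tagger` are hypothetical and say nothing about what Loadplugin actually does.

```go
package main

import "fmt"

// Event is a reduced stand-in for the runtime event.
type Event struct {
	Meta map[string]string
}

// Simplified local copies of the documented types (logger argument dropped).
type InitFunc func(cfg map[string]string) (interface{}, error)
type EnrichFunc func(field string, evt *Event, ctx interface{}) (map[string]string, error)

type Enricher struct {
	Name       string
	InitFunc   InitFunc
	EnrichFunc EnrichFunc
	Ctx        interface{}
}

type EnricherCtx struct {
	Registered map[string]*Enricher
}

// register runs the init function, keeps its context, and stores the enricher by name.
func register(ectx *EnricherCtx, e *Enricher, cfg map[string]string) error {
	ctx, err := e.InitFunc(cfg)
	if err != nil {
		return fmt.Errorf("init %s: %w", e.Name, err)
	}
	e.Ctx = ctx
	if ectx.Registered == nil {
		ectx.Registered = map[string]*Enricher{}
	}
	ectx.Registered[e.Name] = e
	return nil
}

// call looks an enricher up by name and invokes it with its stored context.
func call(ectx *EnricherCtx, name, field string, evt *Event) (map[string]string, error) {
	e, ok := ectx.Registered[name]
	if !ok {
		return nil, fmt.Errorf("enricher %q not registered", name)
	}
	return e.EnrichFunc(field, evt, e.Ctx)
}

// taggerInit and tagger form a toy enricher that prefixes the field value.
func taggerInit(cfg map[string]string) (interface{}, error) {
	return cfg["prefix"], nil
}

func tagger(field string, evt *Event, ctx interface{}) (map[string]string, error) {
	prefix, ok := ctx.(string)
	if !ok {
		return nil, fmt.Errorf("unexpected context type %T", ctx)
	}
	return map[string]string{"Tagged": prefix + field}, nil
}

func main() {
	ectx := &EnricherCtx{}
	e := &Enricher{Name: "Tagger", InitFunc: taggerInit, EnrichFunc: tagger}
	if err := register(ectx, e, map[string]string{"prefix": "x-"}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(call(ectx, "Tagger", "abc", &Event{Meta: map[string]string{}}))
}
```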

type ExprWhitelist

type ExprWhitelist struct {
	Filter       *vm.Program
	ExprDebugger *exprhelpers.ExprDebugger // used to debug expression by printing the content of each variable of the expression
}

type InitFunc

type InitFunc func(map[string]string) (interface{}, error)

type Node

type Node struct {
	FormatVersion string `yaml:"format"`
	//Enable config + runtime debug of node via config o/
	Debug bool `yaml:"debug,omitempty"`
	//If enabled, the node (and its child) will report their own statistics
	Profiling bool `yaml:"profiling,omitempty"`
	//Name, author, description and reference(s) for parser pattern
	Name        string   `yaml:"name,omitempty"`
	Author      string   `yaml:"author,omitempty"`
	Description string   `yaml:"description,omitempty"`
	References  []string `yaml:"references,omitempty"`
	//if debug is present in the node, keep its specific Logger in runtime structure
	Logger *log.Entry `yaml:"-"`
	//This is mostly a hack to make writing less repetitive.
	//relying on stage, we know which field to parse, and we
	//can also promote log to next stage on success
	Stage string `yaml:"stage,omitempty"`
	//OnSuccess allows to tag a node to be able to move log to next stage on success
	OnSuccess string `yaml:"onsuccess,omitempty"`

	//Filter is executed at runtime (with current log line as context)
	//and must succeed or node is exited
	Filter        string                    `yaml:"filter,omitempty"`
	RunTimeFilter *vm.Program               `yaml:"-" json:"-"` //the actual compiled filter
	ExprDebugger  *exprhelpers.ExprDebugger `yaml:"-" json:"-"` //used to debug expression by printing the content of each variable of the expression
	//If node has leafs, execute all of them until one asks for a 'break'
	LeavesNodes []Node `yaml:"nodes,omitempty"`
	//Flag used to describe when to 'break' or return an 'error'
	EnrichFunctions EnricherCtx

	/* If the node is actually a leaf, it can have : grok, enrich, statics */
	//pattern_syntax are named grok patterns that are re-utilized over several grok patterns
	SubGroks yaml.MapSlice `yaml:"pattern_syntax,omitempty"`

	//Holds a grok pattern
	Grok types.GrokPattern `yaml:"grok,omitempty"`
	//Statics can be present in any type of node and is executed last
	Statics []types.ExtraField `yaml:"statics,omitempty"`
	//Stash allows to capture data from the log line and store it in an accessible cache
	Stash []types.DataCapture `yaml:"stash,omitempty"`
	//Whitelists
	Whitelist Whitelist           `yaml:"whitelist,omitempty"`
	Data      []*types.DataSource `yaml:"data,omitempty"`
	// contains filtered or unexported fields
}

func LoadStages

func LoadStages(stageFiles []Stagefile, pctx *UnixParserCtx, ectx EnricherCtx) ([]Node, error)

func (*Node) ProcessStatics

func (n *Node) ProcessStatics(statics []types.ExtraField, event *types.Event) error

type ParserResult

type ParserResult struct {
	Evt     types.Event
	Success bool
}

type Parsers

type Parsers struct {
	Ctx             *UnixParserCtx
	Povfwctx        *UnixParserCtx
	StageFiles      []Stagefile
	PovfwStageFiles []Stagefile
	Nodes           []Node
	Povfwnodes      []Node
	EnricherCtx     EnricherCtx
}

func LoadParsers

func LoadParsers(cConfig *csconfig.Config, parsers *Parsers) (*Parsers, error)

func NewParsers

func NewParsers() *Parsers

NewParsers returns a new Parsers value. Its Nodes and Povfwnodes are initialized in parser.LoadStages.

type Stagefile

type Stagefile struct {
	Filename string `yaml:"filename"`
	Stage    string `yaml:"stage"`
}

type UnixParserCtx

type UnixParserCtx struct {
	Grok       grokky.Host
	Stages     []string
	Profiling  bool
	DataFolder string
}

func Init

func Init(c map[string]interface{}) (*UnixParserCtx, error)

type Whitelist

type Whitelist struct {
	Reason  string   `yaml:"reason,omitempty"`
	Ips     []string `yaml:"ip,omitempty"`
	B_Ips   []net.IP
	Cidrs   []string `yaml:"cidr,omitempty"`
	B_Cidrs []*net.IPNet
	Exprs   []string `yaml:"expression,omitempty"`
	B_Exprs []*ExprWhitelist
}
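
The paired fields (Ips/B_Ips, Cidrs/B_Cidrs) suggest that string entries are parsed once and matched later in binary form. The sketch below illustrates that idea with the standard `net` package only; it is an assumption about intent rather than the package's code, expression whitelists are left out, and `compile` and `matches` are hypothetical method names on a local copy of the struct.

```go
package main

import (
	"fmt"
	"net"
)

// Whitelist is a local, reduced copy of the documented struct.
type Whitelist struct {
	Reason string
	Ips    []string
	Cidrs  []string
	bIps   []net.IP
	bCidrs []*net.IPNet
}

// compile parses the string entries once so that matching stays cheap.
func (w *Whitelist) compile() error {
	for _, s := range w.Ips {
		ip := net.ParseIP(s)
		if ip == nil {
			return fmt.Errorf("invalid ip %q", s)
		}
		w.bIps = append(w.bIps, ip)
	}
	for _, s := range w.Cidrs {
		_, ipnet, err := net.ParseCIDR(s)
		if err != nil {
			return fmt.Errorf("invalid cidr %q: %w", s, err)
		}
		w.bCidrs = append(w.bCidrs, ipnet)
	}
	return nil
}

// matches reports whether addr equals a listed IP or falls inside a listed range.
func (w *Whitelist) matches(addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false
	}
	for _, allowed := range w.bIps {
		if allowed.Equal(ip) {
			return true
		}
	}
	for _, ipnet := range w.bCidrs {
		if ipnet.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	w := &Whitelist{
		Reason: "documentation ranges",
		Ips:    []string{"192.0.2.1"},
		Cidrs:  []string{"198.51.100.0/24"},
	}
	if err := w.compile(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(w.matches("192.0.2.1"), w.matches("198.51.100.42"), w.matches("203.0.113.5"))
}
```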