languagemodels

package
v0.0.0-...-8bf0527 Latest
Published: Feb 3, 2025 License: Apache-2.0 Imports: 7 Imported by: 0

README

Language Detection And Library Injection

Overview

The language detection and library injection feature is composed of several components:

  • Process Collector:
    • runs in the agent pod on every node of the cluster.
    • collects languages of host processes detected by the process agent and stores them in workloadmeta
  • client:
    • runs within the agent pod on every node of the cluster.
    • subscribes to process events in workloadmeta
    • reports detected languages (aggregated by pods and containers) to the language detection handler.
    • refreshes language TTL by periodically reporting all actively detected languages to the handler
  • handler:
    • runs within the cluster agent.
    • responsible for processing requests received from the language detection client
    • keeps track of a TTL (expiration timestamp) for each detected language
    • periodically checks for expired languages and removes them
    • parses the requests, extracts detected languages, and pushes them to workload metadata store on the appropriate resource type.
    • for a pod that is a child of a deployment, the API handler will push its detected languages to the corresponding deployment entity in workload metadata store.
  • patcher:
    • runs within the cluster agent.
    • responsible for patching pod owner resources (such as deployments, statefulsets, daemonsets, etc.) with language annotations based on languages reported by the language detection client.
    • subscribes to workload metadata store events.

Detected Languages vs Injectable Languages

We make a distinction between detected languages and injectable languages:

  • Detected languages are:
    • detected and reported by the language detection client
    • have a valid TTL (a TTL that didn't expire yet)
    • populated in workload metadata store by the language detection API handler
  • Injectable languages are:
    • languages parsed from language annotations existing on a kubernetes resource
    • populated in workload metadata store by a kubernetes informer

It is the responsibility of the patcher to make sure that injectable languages always converge to detected languages. To do so, the patcher observes the current state of injectable languages and detected languages and constructs a patch that modifies the language annotations so that they contain only the detected languages. Once the language annotations are updated, the informer updates the injectable languages, which then become identical to the detected languages.

For example, when deploying a python application for the first time, the following will happen:

  • Initial deployment:
    • detected languages = []
    • injectable languages = []
    • language annotations: {}
  • When language detection client reports detecting python:
    • detected languages = ["python",]
    • injectable languages = []
    • language annotations: {}
  • When patcher receives update event and patches deployment:
    • detected languages = ["python",]
    • injectable languages = []
    • language annotations: {"container-name": "python"}
  • When informer receives annotation update:
    • detected languages = ["python",]
    • injectable languages = ["python",]
    • language annotations: {"container-name": "python"}
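
The convergence step can be sketched as a comparison between the two states. The following is a minimal illustration, not the patcher's actual code: `languagesByContainer`, `joinSorted`, and `computePatch` are hypothetical names, and the comma-separated annotation value is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// languagesByContainer maps a container name to the set of languages known
// for it. Hypothetical type for this sketch only.
type languagesByContainer map[string]map[string]struct{}

// joinSorted renders a language set as a sorted, comma-separated string so
// that two sets with the same members always compare equal.
func joinSorted(langs map[string]struct{}) string {
	names := make([]string, 0, len(langs))
	for name := range langs {
		names = append(names, name)
	}
	sort.Strings(names)
	return strings.Join(names, ",")
}

// computePatch returns the annotation values to set and the containers whose
// annotation should be removed so that injectable languages converge to
// detected languages.
func computePatch(detected, injectable languagesByContainer) (set map[string]string, remove []string) {
	set = map[string]string{}
	for container, langs := range detected {
		if len(langs) == 0 {
			continue
		}
		want := joinSorted(langs)
		if joinSorted(injectable[container]) != want {
			set[container] = want
		}
	}
	for container := range injectable {
		if len(detected[container]) == 0 {
			remove = append(remove, container)
		}
	}
	sort.Strings(remove)
	return set, remove
}

func main() {
	// State right after the client reports python for the first time.
	detected := languagesByContainer{"container-name": {"python": {}}}
	injectable := languagesByContainer{}

	set, remove := computePatch(detected, injectable)
	fmt.Println("set:", set, "remove:", remove)
}
```

Once the patch is applied and the informer has refreshed the injectable languages, running the same comparison again yields an empty patch, which is the converged state shown in the last step of the example.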

Cleanup Mechanism

After multiple rollouts to their deployments, applications might be modified. Some containers might be removed, others might be added. A cleanup mechanism is implemented in order to make sure that when a language is removed due to modifying the application, the language is removed from the language detection annotations and also from workload metadata store.

As mentioned previously, the language detection handler keeps track of a TTL (expiration time) for each detected language. The language detection client sends periodic requests to the handler to refresh the TTL of languages that are still being detected by the node agent.

The cluster agent periodically scans the TTLs of the detected languages and removes the expired languages from workload metadata store.

Consequently, the patcher will receive an event indicating that Detected Languages have been modified. It then takes action by adjusting the language annotations so that expired languages are removed from language annotations and therefore excluded from the injection process in the admission controller.

Sequence Diagram

The sequence diagram below shows the flow of execution of the feature:

sequenceDiagram
    box Datadog Agent
    participant PC as Process Collector
    participant DAW as Workload Metadata Store
    participant LDC as Language Detection Client
    end

    box Cluster Agent
    participant DCAW as Workload Metadata Store
    participant LDH as Language Detection Handler
    participant LDP as Language Detection Patcher
    end

    box Kubernetes
    participant KAS as Kubernetes API Server
    end


    loop Every TTL_REFRESH
        LDC->>LDH: Periodically Refresh TTL of languages that are still detected
    end


    loop Every CLEANUP_PERIOD
        LDH->>LDH: Periodically clear expired languages
        LDH->>DCAW: Unset expired languages
    end
    PC->>DAW: Store detected languages
    DAW->>LDC: Notify process language events

    loop Every CLIENT_PERIOD
        LDC->>LDH: Reports Newly Detected Languages
    end

    LDH->>DCAW: Push Detected Languages
    DCAW->>LDP: Notify changes in `Detected Languages`
    LDP->>DCAW: Checks Detected and Injectable Languages
    LDP->>KAS: Sends Annotations Patch
    KAS->>DCAW: Update Injectable Languages

Documentation

Constants

const (

	// AnnotationPrefix represents a prefix of the language detection annotations
	AnnotationPrefix string = "internal.dd.datadoghq.com/"
)

Variables

var AnnotationRegex = regexp.MustCompile(`internal\.dd\.datadoghq\.com\/(init\.)?(.+?)\.detected_langs`)

AnnotationRegex defines the regex pattern of language detection annotations

Functions

func ExtractContainerFromAnnotationKey

func ExtractContainerFromAnnotationKey(annotationKey string) (string, bool)

ExtractContainerFromAnnotationKey extracts the container name from an annotation key and indicates whether it is an init container. If the annotation key is not a language annotation, it returns an empty container name.

func GetLanguageAnnotationKey

func GetLanguageAnnotationKey(containerName string) string

GetLanguageAnnotationKey returns the language annotation key for the specified container
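
To illustrate how the key helpers relate to AnnotationRegex, here is a minimal standalone sketch. The prefix and the regex are copied from the declarations above; `languageAnnotationKey` and `extractContainer` are hypothetical stand-ins, and the key layout (`<prefix>[init.]<container>.detected_langs`) is inferred from the regex rather than taken from the package source.

```go
package main

import (
	"fmt"
	"regexp"
)

// Copied from the package constants and variables documented above.
const annotationPrefix = "internal.dd.datadoghq.com/"

var annotationRegex = regexp.MustCompile(`internal\.dd\.datadoghq\.com\/(init\.)?(.+?)\.detected_langs`)

// languageAnnotationKey builds an annotation key for a container.
// Assumption: the layout is inferred from the regex above.
func languageAnnotationKey(containerName string, isInit bool) string {
	if isInit {
		return annotationPrefix + "init." + containerName + ".detected_langs"
	}
	return annotationPrefix + containerName + ".detected_langs"
}

// extractContainer returns the container name and whether the key refers to
// an init container. The name is empty when the key is not a language
// annotation.
func extractContainer(key string) (string, bool) {
	m := annotationRegex.FindStringSubmatch(key)
	if m == nil {
		return "", false
	}
	// m[1] is the optional "init." group, m[2] is the container name.
	return m[2], m[1] != ""
}

func main() {
	for _, key := range []string{
		languageAnnotationKey("web", false),
		languageAnnotationKey("setup", true),
		"app.kubernetes.io/name",
	} {
		name, isInit := extractContainer(key)
		fmt.Printf("%q -> name=%q init=%v\n", key, name, isInit)
	}
}
```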

Types

type Container

type Container struct {
	Name string
	Init bool
}

Container identifies a pod container by its name and an init boolean flag

func NewContainer

func NewContainer(containerName string) *Container

NewContainer creates and returns a new Container object with unset init flag

func NewInitContainer

func NewInitContainer(containerName string) *Container

NewInitContainer creates and returns a new Container object with set init flag

type ContainersLanguages

type ContainersLanguages map[Container]LanguageSet

ContainersLanguages handles mapping containers to language sets

func (ContainersLanguages) ToAnnotations

func (c ContainersLanguages) ToAnnotations() map[string]string

ToAnnotations converts the containers languages to language annotations
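
As a rough illustration of this conversion, the sketch below redeclares the documented types locally and builds one annotation per container. It is not the package implementation: `toAnnotations` is a hypothetical free function, and both the key layout (inferred from AnnotationRegex) and the sorted, comma-separated value format are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Local copies of the documented types, so the sketch is self-contained.
type LanguageName string

type LanguageSet map[LanguageName]struct{}

type Container struct {
	Name string
	Init bool
}

type ContainersLanguages map[Container]LanguageSet

// toAnnotations converts containers languages into annotation key/value
// pairs. Containers with no languages produce no annotation.
func toAnnotations(c ContainersLanguages) map[string]string {
	annotations := make(map[string]string, len(c))
	for container, langs := range c {
		if len(langs) == 0 {
			continue
		}
		names := make([]string, 0, len(langs))
		for lang := range langs {
			names = append(names, string(lang))
		}
		sort.Strings(names) // stable output regardless of map iteration order

		key := "internal.dd.datadoghq.com/"
		if container.Init {
			key += "init."
		}
		key += container.Name + ".detected_langs"
		annotations[key] = strings.Join(names, ",")
	}
	return annotations
}

func main() {
	langs := ContainersLanguages{
		{Name: "web"}:               {"python": {}, "java": {}},
		{Name: "setup", Init: true}: {"go": {}},
		{Name: "sidecar"}:           {},
	}
	for key, value := range toAnnotations(langs) {
		fmt.Printf("%s: %s\n", key, value)
	}
}
```

Because Container is a comparable struct, it can be used directly as a map key, which keeps standard and init containers with the same name distinct.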

func (ContainersLanguages) ToProto

func (c ContainersLanguages) ToProto() (containersDetailsProto, initContainersDetailsProto []*pbgo.ContainerLanguageDetails)

ToProto returns two slices of ContainerLanguageDetails proto messages. The first one contains standard containers. The second one contains init containers.

type Detector

type Detector interface {
	DetectLanguage(proc Process) (Language, error)
}

Detector is an interface for detecting the language of a process
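
A toy implementation can make the contract concrete. The sketch below redeclares the documented Detector, Process, Language, and LanguageName types locally; `commandDetector` and `fakeProcess` are hypothetical, and matching on the command name alone is a simplification for illustration, not how the agent's detectors work.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// Local copies of the documented types, so the sketch is self-contained.
type LanguageName string

const (
	Python  LanguageName = "python"
	Java    LanguageName = "java"
	Unknown LanguageName = ""
)

type Language struct {
	Name    LanguageName
	Version string
}

type Process interface {
	GetPid() int32
	GetCommand() string
	GetCmdline() []string
}

type Detector interface {
	DetectLanguage(proc Process) (Language, error)
}

// fakeProcess is a hypothetical in-memory Process used for the example.
type fakeProcess struct {
	pid     int32
	cmdline []string
}

func (p fakeProcess) GetPid() int32        { return p.pid }
func (p fakeProcess) GetCmdline() []string { return p.cmdline }
func (p fakeProcess) GetCommand() string {
	if len(p.cmdline) == 0 {
		return ""
	}
	return filepath.Base(p.cmdline[0])
}

var errNoCommand = errors.New("process has no command")

// commandDetector guesses the language from the executable name only.
type commandDetector struct{}

func (commandDetector) DetectLanguage(proc Process) (Language, error) {
	cmd := proc.GetCommand()
	switch {
	case cmd == "":
		return Language{Name: Unknown}, errNoCommand
	case strings.HasPrefix(cmd, "python"):
		return Language{Name: Python}, nil
	case cmd == "java":
		return Language{Name: Java}, nil
	default:
		return Language{Name: Unknown}, nil
	}
}

func main() {
	var d Detector = commandDetector{}
	proc := fakeProcess{pid: 42, cmdline: []string{"/usr/bin/python3", "app.py"}}
	lang, err := d.DetectLanguage(proc)
	fmt.Printf("pid=%d language=%q err=%v\n", proc.GetPid(), lang.Name, err)
}
```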

type Language

type Language struct {
	Name    LanguageName
	Version string
}

Language contains metadata collected from the call to `DetectLanguage`

type LanguageName

type LanguageName string

LanguageName is a string enum that represents a detected language name.

const (
	// Go language name.
	Go LanguageName = "go"

	// Node language name.
	Node LanguageName = "node"

	// Dotnet language name.
	Dotnet LanguageName = "dotnet"

	// Python language name.
	Python LanguageName = "python"

	// Java language name.
	Java LanguageName = "java"

	// Ruby language name.
	Ruby LanguageName = "ruby"

	// PHP language name.
	PHP LanguageName = "php"

	// Unknown language name.
	Unknown LanguageName = ""
)

type LanguageSet

type LanguageSet map[LanguageName]struct{}

LanguageSet represents a set of languages

func (LanguageSet) Add

func (s LanguageSet) Add(language LanguageName) bool

Add adds a new language to the language set. It returns false if the language is already included in the set, and true otherwise.

func (LanguageSet) ToProto

func (s LanguageSet) ToProto() []*pbgo.Language

ToProto returns a proto message Language

type Process

type Process interface {
	GetPid() int32
	GetCommand() string
	GetCmdline() []string
}

Process is an interface that exposes the fields necessary to detect a language

type TimedContainersLanguages

type TimedContainersLanguages map[Container]TimedLanguageSet

TimedContainersLanguages handles mapping containers to timed language sets

func (TimedContainersLanguages) EqualTo

EqualTo checks whether the current TimedContainersLanguages object has identical content to another TimedContainersLanguages object

func (TimedContainersLanguages) GetOrInitialize

func (c TimedContainersLanguages) GetOrInitialize(container Container) *TimedLanguageSet

GetOrInitialize returns the language set of a container if it exists, or initializes it otherwise

func (TimedContainersLanguages) Merge

Merge merges another containers languages object into the current object. It returns true if new languages were added, and false otherwise.

func (TimedContainersLanguages) RemoveExpiredLanguages

func (c TimedContainersLanguages) RemoveExpiredLanguages() bool

RemoveExpiredLanguages removes expired languages from each container language set. It returns true if at least one language has expired and been removed.

type TimedLanguageSet

type TimedLanguageSet map[LanguageName]time.Time

TimedLanguageSet handles storing sets of languages along with their expiration times

func (TimedLanguageSet) Add

func (s TimedLanguageSet) Add(language LanguageName, expiration time.Time) bool

Add adds a new language to the language set with an expiration time. It returns false if the language is already included in the set, and true otherwise.

func (TimedLanguageSet) EqualTo

func (s TimedLanguageSet) EqualTo(other TimedLanguageSet) bool

EqualTo determines if the current timed languageset has the same languages as another timed languageset

func (TimedLanguageSet) Has

func (s TimedLanguageSet) Has(language LanguageName) bool

Has returns whether the set contains a specific language

func (TimedLanguageSet) Merge

func (s TimedLanguageSet) Merge(other TimedLanguageSet) bool

Merge merges another timed language set into the current language set. It returns true if new languages were introduced, and false otherwise.

func (TimedLanguageSet) Remove

func (s TimedLanguageSet) Remove(language LanguageName)

Remove deletes a language from the language set

func (TimedLanguageSet) RemoveExpired

func (s TimedLanguageSet) RemoveExpired() bool

RemoveExpired removes all expired languages from the set. It returns true if at least one language has expired and been removed.
