e2e

package module

Version: v0.0.0-...-c0724c2 | Published: Nov 23, 2024 | License: MIT | Imports: 59 | Imported by: 0

Repository

github.com/Azure/agentbaker

README ¶

AgentBaker E2E Testing

This directory contains files related to the AgentBaker E2E testing framework.

Overview

AgentBaker E2E tests verify that node bootstrapping artifacts generated by the AgentBaker API are correct and capable of integrating Azure VMs into Azure Kubernetes Service (AKS) clusters.

At a high level, each E2E scenario calls the primary node-bootstrapping API, GetLatestNodeBootstrapping, with a set of parameters (represented by a NodeBootstrappingConfiguration) that define the scenario, in order to generate CSE and custom data. A new VMSS containing a single VM is then created and associated with an AKS cluster that is already running in Azure. The CSE and custom data generated by AgentBaker are applied to the new VM so it can bootstrap and register itself with the apiserver of the running cluster. Liveness and health checks are then run to make sure the new VM's kubelet is posting NodeReady to the cluster's apiserver, and that workload pods can run on it successfully. Lastly, a set of validation commands is executed remotely on the VM to ensure its live state (file existence, sysctl settings, etc.) is as expected.

sequenceDiagram
    E2E->>+ARM: Get or Create AKS Cluster
    ARM-->>-E2E: Cluster details
    E2E->>+AgentBakerCode: Fetch VM Configuration (include CSE)
    AgentBakerCode-->>-E2E: VM Configuration
    E2E->>+ARM: Create VM using fetched VM Config in cluster network
    ARM-->>-E2E: VM instance
    E2E->>+KubeAPI: Create test Pod
    KubeAPI->>+TestPod: Perform healthcheck
    TestPod-->>-KubeAPI: Healthcheck OK
    KubeAPI-->>-E2E: Test Pod ready
    E2E->>+KubeAPI: Execute test validators
    KubeAPI->>+DebugPod: Execute test validator
    DebugPod->>+VM: Execute test validator
    VM-->>-DebugPod: Test results
    DebugPod-->>-KubeAPI: Test results
    KubeAPI-->>-E2E: Final results

Running Locally

Note: if you have changed code or artifacts used to generate custom data or custom script extension payloads, you should first run make generate from the root of the AgentBaker repository.

To run the E2E test suite locally, use e2e-local.sh. This script sets up the go test command.

Check config.go for the default configuration parameters. You can override these parameters by setting ENV variables.

Create a .env file in the e2e directory to set environment variables and avoid manual setup each time you run tests. Refer to .env.sample for an example.

Running Specific Tests

Use TAGS_TO_RUN= to specify scenarios based on tags. By default, all scenarios run. Multiple tags should be comma-separated and are case-insensitive. Check logs for test tags.

Example:

TAGS_TO_RUN="os=ubuntu,arch=amd64,wasm=false,gpu=false,imagename=1804gen2containerd" ./e2e-local.sh

To exclude scenarios, use TAGS_TO_SKIP=. A scenario that matches any of the specified tags is skipped (this logic differs from that of TAGS_TO_RUN).

To run a specific test, use the test name:

TAGS_TO_RUN="name=Test_azurelinuxv2" ./e2e-local.sh
# or
go test -run Test_azurelinuxv2 -v -timeout 90m

Debugging

Set KEEP_VMSS=true to retain bootstrapped VMs for debugging. Setting this also includes the VM's private SSH key in each scenario's log bundle. When using this flag, make sure to run only the tests you need to debug, as the VMs will not be deleted after the test run.

Running Tests Manually

Run tests with custom arguments after setting required environment variables:

go test -parallel 100 -timeout 90m -v -count 1

Important go test flags:

-v: Verbose output
-parallel 100: Run 100 tests in parallel, default is limited to the number of cores
-timeout 90m: Set timeout, default is 10 minutes which is often exceeded
-count 1: Disable test caching

Cleanup

Azure resources are deleted periodically by an external garbage collector. Locally stopped tests attempt a graceful shutdown to clean up resources. Old VMs are deleted on startup unless created with KEEP_VMSS=true.

IDE Configuration

Global Settings

Set GOFLAGS="-timeout=90m -parallel=100" in your shell configuration file.

GoLand

In Run > Edit Configurations..., set -timeout=90m -parallel=100 in the Go tool arguments field.

VSCode

Add to settings.json:

{
  "go.testFlags": ["-parallel=100", "-v"],
  "go.testTimeout": "90m"
}

Package Structure

The top-level package of the Golang E2E implementation is named e2e and is entirely separate from all AgentBaker packages.

The definitions and entry points for each test scenario, run by go test, are located in scenario_test.go.

E2E VHDs

Node images are pushed to a Shared Image Gallery (SIG). Each image is tagged with its branch name and build ID. By default, E2E tests use the latest version of each image from the SIG that carries the branch=refs/heads/master tag.

Using VHD Images from Custom ADO Builds

Set SIG_VERSION_TAG_NAME and SIG_VERSION_TAG_VALUE to specify custom VHD builds:

SIG_VERSION_TAG_NAME=buildId SIG_VERSION_TAG_VALUE=123456789 TAGS_TO_RUN="os=ubuntu2204" ./e2e-local.sh

Registering New VHD SKUs

When adding tests for a new VHD image, make sure to add a delete-lock to prevent the garbage collector from deleting the image version.

Scenarios

E2E scenarios can be configured with VMSS configuration mutators that change/set properties on the VMSS model used to deploy the new VM to be bootstrapped. This is primarily useful when testing out different VM SKUs, especially for GPU-enabled scenarios which affect which code paths AgentBaker will use to generate CSE and custom data.
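As a rough illustration of the mutator pattern, the sketch below applies an optional function to a base model before it is used. The types vmssModel and vmssMutator, the functions baseModel and prepareModel, and the SKU names are stand-ins chosen for this example; the real framework mutates an armcompute.VirtualMachineScaleSet.

```go
package main

import "fmt"

// vmssModel is a hypothetical, heavily reduced stand-in for the VMSS model.
type vmssModel struct {
	SKU string
}

// vmssMutator changes properties on the model in place.
type vmssMutator func(*vmssModel)

// baseModel returns the defaults shared by all scenarios (illustrative values).
func baseModel() *vmssModel {
	return &vmssModel{SKU: "Standard_D2ds_v5"}
}

// prepareModel builds the base model and applies the scenario's mutator, if any.
func prepareModel(mutate vmssMutator) *vmssModel {
	m := baseModel()
	if mutate != nil {
		mutate(m)
	}
	return m
}

func main() {
	// A GPU scenario swaps in a GPU-capable SKU; other scenarios keep the default.
	gpu := prepareModel(func(m *vmssModel) { m.SKU = "Standard_NC6s_v3" })
	plain := prepareModel(nil)
	fmt.Println(gpu.SKU, plain.SKU)
}
```

Keeping the mutator optional means a scenario only has to describe how it differs from the shared base model.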

Further, in order to support E2E scenarios which test different underlying AKS cluster configurations, such as the cluster's network plugin, each E2E scenario uses one of the predefined clusters. The same cluster can be reused across test runs. If the cluster doesn't exist, a new one is created automatically.

Lastly, E2E scenarios also consist of a list of live VM validators. Each live VM validator consists of a description, a bash command which will actually be run on the newly bootstrapped VM, and an "asserter" function that will perform assertions on the contents of both the stdout and stderr streams that result from the execution of the command. The validators can be used to assert on numerous types of properties of the live VM, such as the live file system and kernel state.
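A validator of this shape might look like the following sketch. The types asserterFn and validator are local stand-ins modelled on VMCommandOutputAsserterFn and LiveVMValidator, fileContains is a hypothetical helper, and treating the code argument as the command's exit code is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// asserterFn mirrors the shape of VMCommandOutputAsserterFn.
// Assumption: code carries the command's exit code as a string.
type asserterFn func(code, stdout, stderr string) error

// validator is a reduced stand-in for LiveVMValidator.
type validator struct {
	Description string
	Command     string
	Asserter    asserterFn
}

// fileContains is a hypothetical helper that builds a validator
// asserting that the file at path contains the substring want.
func fileContains(path, want string) *validator {
	return &validator{
		Description: fmt.Sprintf("assert %s contains %q", path, want),
		Command:     fmt.Sprintf("cat %s", path),
		Asserter: func(code, stdout, stderr string) error {
			if code != "0" {
				return fmt.Errorf("command exited with code %s, stderr: %s", code, stderr)
			}
			if !strings.Contains(stdout, want) {
				return fmt.Errorf("expected %s to contain %q", path, want)
			}
			return nil
		},
	}
}

func main() {
	v := fileContains("/etc/sysctl.d/99-custom.conf", "net.ipv4.ip_forward")
	// In the framework the output would come from the VM; here it is hard-coded.
	err := v.Asserter("0", "net.ipv4.ip_forward = 1\n", "")
	fmt.Println(v.Description, "->", err)
}
```

Separating the command from the asserter keeps the remote execution generic, while each validator decides what counts as a pass.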

Log Collection

Each E2E scenario will generate its own logs after execution. Currently, these logs consist of:

cluster-provision.log - CSE execution log, retrieved from /var/log/azure/aks/cluster-provision.log (collected in success and CSE failure cases)
kubelet.log - the kubelet systemd unit's logs, retrieved by running journalctl -u kubelet on the VM after bootstrapping has finished (collected in success and CSE failure cases)
vmssId.txt - a single line text file containing the unique resource ID of the VMSS created by the respective scenario, mainly collected for the purposes of posthoc resource deletion (collected in all cases where the VMSS is able to be created)

These logs will be uploaded in a bundle of the format:

└── scenario-logs
    └── <scenario>
        ├── cluster-provision.log
        ├── kubelet.log
        └── vmssId.txt

Coverage report

After a PR is created in AgentBaker's repo on GitHub, a pipeline calculating code coverage changes will automatically run.

We are utilizing coveralls to display the coverage report. The coverage report will be available in the PR's description. You can also view previous runs for the AgentBaker repo here.

We calculate code coverage for both unit tests and E2E tests.

E2E coverage report

To generate E2E coverage reports, we use the support for building coverage-instrumented binaries introduced in Go 1.20.

The coverage report is generated by running AgentBaker's API server locally as a binary built with the -cover flag. E2E tests are then run against that binary.

The following packages are used during calculation of coverage for E2E tests:

- github.com/Azure/agentbaker/apiserver
- github.com/Azure/agentbaker/cmd
- github.com/Azure/agentbaker/cmd/starter
- github.com/Azure/agentbaker/pkg/agent
- github.com/Azure/agentbaker/pkg/agent/datamodel
- github.com/Azure/agentbaker/pkg/templates

Generating E2E coverage report locally

You can generate an E2E coverage report while running the E2E tests locally. To do so, follow the steps below:

Build the AgentBaker server binary with the -cover flag:

  cd cmd
  go build -cover -o baker -covermode count

Create directory for coverage report files

  mkdir -p covdatafiles

Run the binary

  GOCOVERDIR=covdatafiles ./baker start &

Run the E2E tests locally

  /bin/bash e2e/e2e-local.sh

Stop the binary - once the tests finish executing, you have to stop the binary with exit code 0 to generate the report. See the docs here.

  kill $(pgrep baker)

Display the coverage report within the terminal

  go tool covdata percent -i=./cmd/covdatafiles

Documentation ¶

Index ¶

type Cluster
- func (c *Cluster) IsAzureCNI() (bool, error)
- func (c *Cluster) MaxPodsPerNode() (int, error)
type ClusterParams
type Config
type Kubeclient
type LiveVMValidator
type Scenario
type ScenarioRuntime
type Tags
- func (t Tags) MatchesAnyFilter(filters string) (bool, error)
- func (t Tags) MatchesFilters(filters string) (bool, error)
type VMCommandOutputAsserterFn
type VNet

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Cluster ¶

type Cluster struct {
	Model                          *armcontainerservice.ManagedCluster
	Kube                           *Kubeclient
	SubnetID                       string
	NodeBootstrappingConfiguration *datamodel.NodeBootstrappingConfiguration
	AKSNodeConfig                  *aksnodeconfigv1.Configuration
	Maintenance                    *armcontainerservice.MaintenanceConfiguration
}

func ClusterAzureNetwork ¶

func ClusterAzureNetwork(ctx context.Context, t *testing.T) (*Cluster, error)

func ClusterKubenet ¶

func ClusterKubenet(ctx context.Context, t *testing.T) (*Cluster, error)

Different tests may attempt to create the same cluster concurrently. sync.Once is used to ensure that only one cluster is created for the set of tests.
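A minimal sketch of this pattern is shown below. The cluster type, createCluster, and getKubenetCluster are hypothetical names used only for illustration; the real functions talk to Azure and return *Cluster.

```go
package main

import (
	"fmt"
	"sync"
)

// cluster is a hypothetical placeholder for the real *Cluster type.
type cluster struct{ name string }

var (
	kubenetOnce    sync.Once
	kubenetCluster *cluster
	kubenetErr     error
	createCalls    int // counts how often the expensive creation ran
)

// createCluster stands in for the expensive get-or-create call.
func createCluster(name string) (*cluster, error) {
	createCalls++
	return &cluster{name: name}, nil
}

// getKubenetCluster may be called from many tests at once;
// sync.Once guarantees createCluster runs only one time.
func getKubenetCluster() (*cluster, error) {
	kubenetOnce.Do(func() {
		kubenetCluster, kubenetErr = createCluster("kubenet")
	})
	return kubenetCluster, kubenetErr
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := getKubenetCluster(); err != nil {
				fmt.Println("error:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("create calls:", createCalls) // prints "create calls: 1"
}
```

Callers that arrive while the first creation is still in progress block inside Do until it finishes, then all receive the same cluster and error.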

func ClusterKubenetAirgap ¶

func ClusterKubenetAirgap(ctx context.Context, t *testing.T) (*Cluster, error)

func (*Cluster) IsAzureCNI ¶

func (c *Cluster) IsAzureCNI() (bool, error)

Returns true if the cluster is configured with Azure CNI

func (*Cluster) MaxPodsPerNode ¶

func (c *Cluster) MaxPodsPerNode() (int, error)

Returns the maximum number of pods per node of the cluster's agentpool

type ClusterParams ¶

type ClusterParams struct {
	CACert         []byte
	BootstrapToken string
	FQDN           string
}

type Config ¶

type Config struct {
	// Cluster creates, updates or re-uses an AKS cluster for the scenario
	Cluster func(ctx context.Context, t *testing.T) (*Cluster, error)

	// VHD is the function called by the e2e suite on the given scenario to get its VHD selection
	VHD *config.Image

	// BootstrapConfigMutator is a function which mutates the base NodeBootstrappingConfig according to the scenario's requirements
	BootstrapConfigMutator func(*datamodel.NodeBootstrappingConfiguration)

	// AKSNodeConfigMutator if defined then aks-node-controller will be used to provision nodes
	AKSNodeConfigMutator func(*aksnodeconfigv1.Configuration)

	// VMConfigMutator is a function which mutates the base VMSS model according to the scenario's requirements
	VMConfigMutator func(*armcompute.VirtualMachineScaleSet)

	// LiveVMValidators is a slice of LiveVMValidator objects for performing any live VM validation
	// specific to the scenario that isn't covered in the set of common validators run with all scenarios
	LiveVMValidators []*LiveVMValidator
}

Config represents the configuration of an AgentBaker E2E scenario.

type Kubeclient ¶

type Kubeclient struct {
	Dynamic client.Client
	Typed   kubernetes.Interface
	Rest    *rest.Config
}

type LiveVMValidator ¶

type LiveVMValidator struct {
	// Description is the description of the validator and what it actually validates on the VM
	Description string

	// Command is the command string to be run on the live VM after node bootstrapping has succeeded
	Command string

	// Asserter is the validator's VMCommandOutputAsserterFn which will be run against command output
	Asserter VMCommandOutputAsserterFn

	// IsShellBuiltIn is a boolean flag which indicates whether or not the command is a shell built-in
	// that will fail when executed with sudo - requires separate command to avoid command not found error on node
	IsShellBuiltIn bool

	// TODO - extract this out of LiveVMValidator into a separate Pod level validator
	// IsPodNetwork is a boolean flag which indicates whether or not the validator should run on a pod that is NOT using
	// host's network interface. For example when testing connectivity from user pods to certain endpoints, we will set it to true
	IsPodNetwork bool
}

LiveVMValidator represents a command to be run on a live VM after node bootstrapping has succeeded that generates output which can be asserted against to make sure that the live VM itself is in the correct state

func CommandHasOutputValidator ¶

func CommandHasOutputValidator(commandToExecute string, expectedOutput string) *LiveVMValidator

func DirectoryValidator ¶

func DirectoryValidator(path string, files []string) *LiveVMValidator

func FileExcludesContentsValidator ¶

func FileExcludesContentsValidator(fileName string, contents string, contentsName string) *LiveVMValidator

func FileHasContentsValidator ¶

func FileHasContentsValidator(fileName string, contents string) *LiveVMValidator

func KubeletHasConfigFlagsValidator ¶

func KubeletHasConfigFlagsValidator(filePath string) *LiveVMValidator

KubeletHasConfigFlagsValidator checks kubelet is started with the right flags and configs.

func KubeletHasNotStoppedValidator ¶

func KubeletHasNotStoppedValidator() *LiveVMValidator

Ensures kubelet does not restart, which can result in delays deploying pods and unnecessary nodepool scaling while the node is incapacitated. This is intended to catch services (e.g. nvidia-modprobe) that restart kubelet rather than specifying a dependency order to run before kubelet.service.

func NonEmptyDirectoryValidator ¶

func NonEmptyDirectoryValidator(dirName string) *LiveVMValidator

func NvidiaModProbeInstalledValidator ¶

func NvidiaModProbeInstalledValidator() *LiveVMValidator

func NvidiaSMIInstalledValidator ¶

func NvidiaSMIInstalledValidator() *LiveVMValidator

func NvidiaSMINotInstalledValidator ¶

func NvidiaSMINotInstalledValidator() *LiveVMValidator

func ServiceCanRestartValidator ¶

func ServiceCanRestartValidator(serviceName string, restartTimeoutInSeconds int) *LiveVMValidator

func SysctlConfigValidator ¶

func SysctlConfigValidator(customSysctls map[string]string) *LiveVMValidator

func UlimitValidator ¶

func UlimitValidator(ulimits map[string]string) *LiveVMValidator

type Scenario ¶

type Scenario struct {
	// Description is a short description of what the scenario does and tests for
	Description string

	// Tags are used for filtering scenarios to run based on the tags provided
	Tags Tags

	// Config contains the configuration of the scenario
	Config

	// Runtime contains the runtime state of the scenario. It's populated in the beginning of the test run
	Runtime *ScenarioRuntime
}

Scenario represents an AgentBaker E2E scenario.

func (*Scenario) PrepareAKSNodeConfig ¶

func (s *Scenario) PrepareAKSNodeConfig()

func (*Scenario) PrepareNodeBootstrappingConfiguration ¶

func (s *Scenario) PrepareNodeBootstrappingConfiguration(nbc *datamodel.NodeBootstrappingConfiguration) (*datamodel.NodeBootstrappingConfiguration, error)

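PrepareNodeBootstrappingConfiguration mutates the input NodeBootstrappingConfiguration by calling the scenario's BootstrapConfigMutator on it, if configured.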

func (*Scenario) PrepareRuntime ¶

func (s *Scenario) PrepareRuntime(ctx context.Context, t *testing.T)

func (*Scenario) PrepareVMSSModel ¶

func (s *Scenario) PrepareVMSSModel(ctx context.Context, t *testing.T, vmss *armcompute.VirtualMachineScaleSet)

PrepareVMSSModel mutates the input VirtualMachineScaleSet based on the scenario's VMConfigMutator, if configured. This method will also use the scenario's configured VHD selector to modify the input VMSS to reference the correct VHD resource.

type ScenarioRuntime ¶

type ScenarioRuntime struct {
	NBC           *datamodel.NodeBootstrappingConfiguration
	AKSNodeConfig *aksnodeconfigv1.Configuration
	Cluster       *Cluster
}

type Tags ¶

type Tags struct {
	Name                   string
	ImageName              string
	OS                     string
	Arch                   string
	Airgap                 bool
	GPU                    bool
	WASM                   bool
	ServerTLSBootstrapping bool
	Scriptless             bool
	KubeletCustomConfig    bool
}

func (Tags) MatchesAnyFilter ¶

func (t Tags) MatchesAnyFilter(filters string) (bool, error)

MatchesAnyFilter checks if the Tags struct matches at least one of the given filters. Filters are comma-separated "key=value" pairs (e.g., "gpu=true,os=x64"). Returns true if any filter matches, false if none match. Errors on invalid input.

func (Tags) MatchesFilters ¶

func (t Tags) MatchesFilters(filters string) (bool, error)

MatchesFilters checks if the Tags struct matches all given filters. Filters are comma-separated "key=value" pairs (e.g., "gpu=true,os=x64"). Returns true if all filters match, false otherwise. Errors on invalid input.

type VMCommandOutputAsserterFn ¶

type VMCommandOutputAsserterFn func(code, stdout, stderr string) error

VMCommandOutputAsserterFn is a function which takes in the command's code together with its stdout and stderr stream content as strings and performs arbitrary assertions on them, returning an error in the case where an assertion fails.

type VNet ¶

type VNet struct {
	// contains filtered or unexported fields
}

Directories ¶

Path	Synopsis
config
kubelet
toolkit