gVisor Runtime Tests

These tests execute language runtime test suites inside gVisor. They serve as high-level integration tests for the various runtimes.

Runtime Test Components

The runtime tests have the following components:

images - These are Docker images for each language runtime we test. Each image contains all of that runtime's tests, plus whatever other libraries or utilities are required to run them.
proctor - This is a binary that acts as an agent inside the container and provides a uniform command-line API to list and run the various language tests.
runner - This is the test entrypoint invoked by bazel run. This binary spawns Docker (using the runsc runtime) and runs the language image with the proctor binary mounted.
exclude - Holds a CSV file for each language runtime containing the full path of tests that should be excluded from running along with a reason for exclusion.
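As a rough illustration of how an exclude file could be consumed, the Go sketch below parses CSV content into a map from test path to exclusion reason. The two-column layout (`test name,reason`), the header row, the function name `parseExcludes`, and the sample rows are all assumptions made for this example; the real exclude files may use different columns.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseExcludes parses CSV data of the ASSUMED form "test name,reason"
// and returns a map from test path to exclusion reason.
// The first row is treated as a header and skipped.
func parseExcludes(data string) (map[string]string, error) {
	reader := csv.NewReader(strings.NewReader(data))
	reader.FieldsPerRecord = 2 // reject rows that do not have exactly two fields
	records, err := reader.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("reading exclude CSV: %w", err)
	}
	excludes := make(map[string]string)
	for i, rec := range records {
		if i == 0 {
			continue // skip the header row
		}
		excludes[rec[0]] = rec[1]
	}
	return excludes, nil
}

func main() {
	// Hypothetical sample content; not taken from a real exclude file.
	const sample = `test name,reason
hypothetical/tests/a.phpt,Flaky
hypothetical/tests/b.phpt,Broken test
`
	excludes, err := parseExcludes(sample)
	if err != nil {
		panic(err)
	}
	for test, reason := range excludes {
		fmt.Printf("%s excluded: %s\n", test, reason)
	}
}
```

Keying the map by test path makes the "is this test excluded?" check a single lookup when the test list is filtered.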

Testing Locally

The following make targets will run an entire runtime test suite locally.

Note: The Java runtime tests take 1+ hours with 16 cores.

Language	Version	Running the test suite
Go	1.22	`make go1.22-runtime-tests`
Java	21	`make java21-runtime-tests`
NodeJS	22.2.0	`make nodejs22.2.0-runtime-tests`
PHP	8.3.7	`make php8.3.7-runtime-tests`
Python	3.12.3	`make python3.12.3-runtime-tests`

You can modify the runtime test behaviors by passing in the following make variables:

RUNTIME_TESTS_FILTER: Comma-separated list of tests to run, even if otherwise excluded. Useful to debug single failing test cases.
RUNTIME_TESTS_PER_TEST_TIMEOUT: Modify per-test timeout. Useful when debugging a test that has a tendency to get stuck, in order to make it fail faster.
RUNTIME_TESTS_RUNS_PER_TEST: Number of times to run each test. Useful to find flaky tests.
RUNTIME_TESTS_FLAKY_IS_ERROR: Boolean indicating whether tests found flaky (i.e. running them multiple times has sometimes succeeded, sometimes failed) should be considered a test suite failure (true) or success (false).
RUNTIME_TESTS_FLAKY_SHORT_CIRCUIT: If true, when running tests multiple times, and a test has been found flaky (i.e. running it multiple times has succeeded at least once and failed at least once), exit immediately, rather than running all RUNTIME_TESTS_RUNS_PER_TEST attempts.
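The interaction between the run count, flakiness detection, and short-circuiting described above can be sketched in Go as follows. This is a minimal illustration of the described semantics, not the runner's actual code; `runRepeatedly`, `suiteFails`, `outcome`, and `errSimulated` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

type outcome int

const (
	passed outcome = iota
	failed
	flaky
)

// errSimulated is a placeholder error used by the demo below.
var errSimulated = errors.New("simulated failure")

// runRepeatedly calls run up to runs times and classifies the result.
// A test is flaky when it has succeeded at least once and failed at least once.
// If shortCircuit is true, it stops as soon as flakiness is established.
// It returns the outcome and the number of attempts actually made.
func runRepeatedly(run func() error, runs int, shortCircuit bool) (outcome, int) {
	var successes, failures, attempts int
	for attempts < runs {
		attempts++
		if err := run(); err != nil {
			failures++
		} else {
			successes++
		}
		if shortCircuit && successes > 0 && failures > 0 {
			break
		}
	}
	switch {
	case successes > 0 && failures > 0:
		return flaky, attempts
	case failures > 0:
		return failed, attempts
	default:
		return passed, attempts
	}
}

// suiteFails reports whether an outcome should fail the suite, mirroring
// the described RUNTIME_TESTS_FLAKY_IS_ERROR semantics.
func suiteFails(o outcome, flakyIsError bool) bool {
	return o == failed || (o == flaky && flakyIsError)
}

func main() {
	calls := 0
	// Hypothetical test that fails on every third call.
	run := func() error {
		calls++
		if calls%3 == 0 {
			return errSimulated
		}
		return nil
	}
	o, attempts := runRepeatedly(run, 100, true)
	fmt.Println(o == flaky, attempts, suiteFails(o, false)) // prints: true 3 false
}
```

With short-circuiting enabled, the demo stops after the third attempt because one success and one failure are already enough to establish flakiness.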

Example invocation:

$ make php8.3.7-runtime-tests \
    RUNTIME_TESTS_FILTER=ext/standard/tests/file/bug60120.phpt \
    RUNTIME_TESTS_PER_TEST_TIMEOUT=10s \
    RUNTIME_TESTS_RUNS_PER_TEST=100

Clean Up

Sometimes, when runtime tests fail or the testing container itself crashes unexpectedly, containers are not removed, or do not even exit. This can cause some docker commands, such as docker system prune, to hang indefinitely.

Here are some helpful commands (should be executed in order):

docker ps -a  # Lists all containers, including stopped ones; useful when investigating hanging containers.
docker kill $(docker ps -q)  # Kills all running containers.
docker rm $(docker ps -a -q)  # Removes all exited containers.
docker system prune  # Removes unused data.

Updating Runtime Tests

To bump the version of an existing runtime test:

1. Update the Docker image with the new runtime version. Rename the Dockerfile directory and update any packages or download URLs to point to the new version. Test building the image with docker build images/runtimes/<new_runtime>.
2. Update the runtime_test target. The name field must be the directory name for the Dockerfile created in Step 1.
3. Update the Buildkite pipeline.
4. Run the tests and triage any failures. Some language tests are flaky (or never pass at all); other failures may indicate a gVisor bug or a divergence from Linux behavior.
5. Update the exclude file by renaming it to match the new version and adding any failing tests to it, each with a reason.

Cleaning up exclude files

Usually, when the runtime is updated, a lot has changed. Tests may have been deleted, modified (fixed or broken), or added. Once the final step above has produced an exclude list with which all runtime tests pass, it is useful to clean up the exclude files with the following steps:

1. Check for the existence of tests in the runtime image. See how each runtime lists all its tests (see the ListTests() implementations in the proctor/lib directory). Then compare against that list and remove any excluded tests that no longer exist.
2. Run all excluded tests with runc (native) for each runtime. If a test fails natively, consider it broken and mark it with Broken test in the reason column. Such tests provide no compatibility-gap signal for gVisor and can safely be ignored. Some tests that were previously broken may now be unbroken; for these, the reason field should be cleared.
3. Run all the unbroken and non-flaky tests on runsc (gVisor). If a test is now passing, remove it from the exclude list. This effectively increases our testing surface: the test once failed and now passes, so something was fixed in between, and enabling the test is equivalent to adding a regression test for that fix.
4. Some tests are excluded and marked flaky. Run these tests 100 times on runsc (gVisor). If a test does not flake, you can remove it from the exclude list.
5. Finally, close all corresponding bugs for tests that are now passing. These bugs are stale.
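The first cleanup step, finding excluded tests that no longer exist in the image, amounts to a set difference. A minimal Go sketch, assuming you have already collected the excluded test names and the names reported by the runtime's test listing into two slices (`staleExcludes` and the sample names are hypothetical):

```go
package main

import (
	"fmt"
	"sort"
)

// staleExcludes returns the excluded test names that are absent from the
// list of tests present in the runtime image, sorted for stable output.
func staleExcludes(excluded, listed []string) []string {
	present := make(map[string]struct{}, len(listed))
	for _, t := range listed {
		present[t] = struct{}{}
	}
	var stale []string
	for _, t := range excluded {
		if _, ok := present[t]; !ok {
			stale = append(stale, t)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	// Hypothetical test names, for illustration only.
	listed := []string{"tests/a.phpt", "tests/b.phpt"}
	excluded := []string{"tests/b.phpt", "tests/removed.phpt"}
	fmt.Println(staleExcludes(excluded, listed)) // prints: [tests/removed.phpt]
}
```

Entries returned by this function are candidates for removal from the exclude file, since the tests they name are no longer shipped in the image.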

Creating new runtime tests for an entirely new language is similar to the above, except that Step 1 is a bit harder. You have to figure out how to download and run the language tests in a Docker container. Once you have that, you must also implement the proctor/TestRunner interface for that language, so that proctor can list and run the tests in the image you created.
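To give a feel for what such an implementation might look like, here is a hedged Go sketch. The shape of the `TestRunner` interface shown here (a `ListTests` method plus a `TestCmds` method) is an assumption for illustration; check the real definition in proctor before relying on it. The `exampleRunner` type, the `*.test` file layout, and the `examplelang` binary are entirely hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// TestRunner is an ASSUMED shape for the interface; the real definition
// in proctor may differ.
type TestRunner interface {
	// ListTests returns the names of all tests available in the image.
	ListTests() ([]string, error)
	// TestCmds returns the commands that run the given tests.
	TestCmds(tests []string) []*exec.Cmd
}

// exampleRunner is a hypothetical runner for a language whose tests are
// "*.test" files in testDir, executed by a hypothetical "examplelang" binary.
type exampleRunner struct {
	testDir string
}

// Compile-time check that exampleRunner satisfies the assumed interface.
var _ TestRunner = exampleRunner{}

// ListTests lists the base names of all "*.test" files in testDir.
func (r exampleRunner) ListTests() ([]string, error) {
	matches, err := filepath.Glob(filepath.Join(r.testDir, "*.test"))
	if err != nil {
		return nil, err
	}
	tests := make([]string, 0, len(matches))
	for _, m := range matches {
		tests = append(tests, filepath.Base(m))
	}
	return tests, nil
}

// TestCmds builds one command per test; the commands are not started here.
func (r exampleRunner) TestCmds(tests []string) []*exec.Cmd {
	cmds := make([]*exec.Cmd, 0, len(tests))
	for _, t := range tests {
		cmds = append(cmds, exec.Command("examplelang", "run", filepath.Join(r.testDir, t)))
	}
	return cmds
}

func main() {
	r := exampleRunner{testDir: "/hypothetical/tests"}
	for _, cmd := range r.TestCmds([]string{"a.test", "b.test"}) {
		fmt.Println(cmd.Args)
	}
}
```

Separating listing from command construction lets a single agent binary expose the same list-and-run command-line API for every language, with only these two methods varying per runtime.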

Directories

Path	Synopsis
proctor	Binary proctor runs the test for a particular runtime.
proctor/lib	Package lib contains proctor functions.
runner	Binary runner runs the runtime tests in a Docker container.
runner/lib	Package lib provides utilities for runner.