Upgrade Tests
In order to get coverage for the upgrade process from an operator’s perspective,
we need an additional suite of tests that perform a complete knative upgrade.
Running these tests on every commit will ensure that we don’t introduce any
non-upgradeable changes, so every commit should be releasable.
This is inspired by Kubernetes upgrade testing.
These tests are a pretty big hammer in that they cover more than just version
changes, but for now they are one of the only ways to make sure we don’t
accidentally make breaking changes.
Flow
We’d like to validate that the upgrade doesn’t break any resources (they still
propagate events) and doesn't break our installation (we can still update
resources).
At a high level, we want to do this:
- Install the latest knative release.
- Create some resources.
- Install knative at HEAD.
- Run any post-install jobs that apply to the upcoming release.
- Test those resources, verify that we didn’t break anything.
To achieve that, we created an upgrade framework (knative.dev/pkg/test/upgrade).
This framework enforces running upgrade tests in a specific order and supports
continual verification of the system under test. For Eventing Kafka, the order
is:
- Install the latest release from GitHub.
- Run the preupgrade smoke tests.
- Start continual tests that will propagate events in the background, while
upgrading and downgrading.
- Install at HEAD (ko apply -f config/) and run the post-install jobs.
- Run the postupgrade smoke tests.
- Install the latest release from GitHub.
- Run the postdowngrade smoke tests.
- Stop and verify continual tests, checking if every event propagated well.
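The ordering above can be illustrated with a minimal, standard-library-only sketch. It is not the knative.dev/pkg/test/upgrade API: the `step` type and the `runInOrder` and `fail` helpers are hypothetical names made up for this example, and each step body is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one named action in the upgrade flow (hypothetical type).
type step struct {
	name string
	run  func() error
}

// runInOrder executes steps sequentially and stops at the first failure,
// returning the names of the steps that completed before it.
func runInOrder(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("step %q failed: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// fail returns a placeholder step body that always errors (demo helper).
func fail(msg string) func() error {
	return func() error { return errors.New(msg) }
}

func main() {
	ok := func() error { return nil } // placeholder for real work
	steps := []step{
		{"install latest release", ok},
		{"preupgrade smoke tests", ok},
		{"start continual tests", ok},
		{"install HEAD and run post-install jobs", ok},
		{"postupgrade smoke tests", fail("smoke test regression")},
		{"install latest release again", ok},
		{"postdowngrade smoke tests", ok},
		{"stop and verify continual tests", ok},
	}
	done, err := runInOrder(steps)
	fmt.Println("completed:", len(done), "error:", err)
}
```

In this sketch a failing step aborts the remaining ones, so a broken upgrade is reported before the downgrade phase starts. Whether the real framework behaves the same way on failure is not stated here.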
Tests
Smoke tests
These are borrowed from the e2e tests as some of the simplest cases.
preupgrade, postupgrade, postdowngrade
Run the selected smoke tests for channel and source.
Probe test
In order to verify that we don't have data-plane unavailability during our
control-plane outages (when we're upgrading the eventing-kafka installation), we
run a prober test that continually sends events to a Kafka channel and to a
Kafka topic during the entire upgrade/downgrade process. When the process
completes, we make sure that all of those events propagated at least once.
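The "at least once" check can be sketched as follows. This is an illustration only, assuming events carry consecutive sequence numbers starting at 1; `missingEvents` is a hypothetical helper, not part of the prober.

```go
package main

import "fmt"

// missingEvents returns the sequence numbers in 1..sent that never showed up
// in received. Duplicates are tolerated, matching at-least-once delivery.
func missingEvents(sent int, received []int) []int {
	seen := make(map[int]bool, len(received))
	for _, n := range received {
		seen[n] = true
	}
	var missing []int
	for n := 1; n <= sent; n++ {
		if !seen[n] {
			missing = append(missing, n)
		}
	}
	return missing
}

func main() {
	// Event 2 arrived twice and event 4 never arrived.
	fmt.Println(missingEvents(5, []int{1, 2, 2, 3, 5})) // prints [4]
}
```

An empty result means every sent event was observed at least once, regardless of ordering or duplication.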
To achieve that, a wathola tool has been created. It consists of 4 components:
sender, forwarder, receiver, and fetcher. Sender is the usual Kubernetes
deployment that publishes events to the system under test (KafkaSource or
KafkaChannel) at a given interval. When it terminates (by either SIGTERM or
SIGINT), a finished event is generated. Forwarder is a Knative Serving service
that scales up from zero to receive the sent events and forward them to a given
target, which is the receiver in our case. Receiver is an ordinary deployment
that collects events from multiple forwarders and has an endpoint /report that
can be polled to get the status of received events. To fetch the report from
within the cluster, the fetcher comes in. It's a simple one-time job that
fetches the report from the receiver and prints it on stdout as JSON. That
enables the test client to download the fetcher logs and parse the JSON to get
the final report.
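The sender's behavior (emit at an interval, then emit a finished event on SIGTERM or SIGINT) might look roughly like the sketch below. It is not the wathola source: `runSender` and its callbacks are hypothetical, and printing stands in for publishing events.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runSender calls send with an increasing step number every interval until
// done is closed, then calls finished once with the number of events sent.
func runSender(done <-chan struct{}, interval time.Duration, send func(step int), finished func(total int)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	total := 0
	for {
		select {
		case <-done:
			finished(total)
			return
		case <-ticker.C:
			total++
			send(total)
		}
	}
}

func main() {
	// The context is cancelled on SIGINT or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	// The timeout only keeps this demo short; a real sender would run until signalled.
	ctx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
	defer cancel()

	runSender(ctx.Done(), 10*time.Millisecond,
		func(step int) { fmt.Println("step", step) },
		func(total int) { fmt.Println("finished after", total, "events") })
}
```

Carrying the total in the finished event is an assumption made here so that a receiver could know how many events to expect.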
The diagram below describes the setup:
  K8s cluster                                                 |  Test machine
                                                              |
(deploym.)          (ksvc)            (deploym.)              |
+--------+       +-----------+       +----------+             |  +------------+
|        |       |           ++      |          |             |  |            |
| Sender |   +-->| Forwarder ||----->+ Receiver |             |  + TestProber |
|        |   |   |           ||      |          |<---+        |  |            |
+---+----+   |   +------------|      +----------+    |        |  +------------+
    |        |    +-----------+                      |        |
    |        |                                       |        |
    |        |                                  +---------+   |
    |     +--+------+       +---------+         |         |   |
    +----->         |       |         |         | Fetcher |   |
    |     | Channel <-------+ Source  |         |         |   |
    |     |         |       |         |         +---------+   |
    |     +---------+       +----^----+            (job)      |
    |                            |
    +----------------------------+
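The last hop in the diagram, where the test client parses the JSON printed by the fetcher, can be sketched as below. The `report` struct and its field names (`state`, `events`, `thrown`) are hypothetical and chosen only for illustration; the real wathola report may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// report is a hypothetical shape for the receiver's JSON report; the real
// wathola field names may differ.
type report struct {
	State  string   `json:"state"`
	Events int      `json:"events"`
	Thrown []string `json:"thrown"`
}

// parseReport decodes fetcher log output and returns an error unless the
// report says the run succeeded with no recorded problems.
func parseReport(logs []byte) (report, error) {
	var r report
	if err := json.Unmarshal(logs, &r); err != nil {
		return r, fmt.Errorf("decoding fetcher logs: %w", err)
	}
	if r.State != "success" || len(r.Thrown) > 0 {
		return r, fmt.Errorf("probe failed: state=%q errors=%v", r.State, r.Thrown)
	}
	return r, nil
}

func main() {
	logs := []byte(`{"state":"success","events":120,"thrown":[]}`)
	r, err := parseReport(logs)
	fmt.Println(r.Events, err) // prints 120 <nil>
}
```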
Probe test configuration
Probe test behavior can be influenced from outside without modifying its source
code. That can be beneficial if one would like to run upgrade tests in a
different context. One such example might be running Eventing upgrade tests in
an environment that has both Serving and Eventing installed. In such an
environment one can set the environment variable E2E_UPGRADE_TESTS_SERVING_USE
to enable usage of the ksvc forwarder (which is disabled by default):
$ export E2E_UPGRADE_TESTS_SERVING_USE=true
Any option in the knative.dev/eventing/test/upgrade/prober.Config struct, apart
from namespace, can be set through an environment variable with the
E2E_UPGRADE_TESTS_XXXXX prefix (see the kelseyhightower/envconfig usage
documentation).
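To show how a prefixed variable can map to an option, here is a small standard-library stand-in. It does not use kelseyhightower/envconfig and is not how prober.Config is actually loaded; `boolOption` and `envPrefix` are hypothetical names, and only the variable E2E_UPGRADE_TESTS_SERVING_USE comes from the text above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envPrefix is the prefix shared by all probe test options.
const envPrefix = "E2E_UPGRADE_TESTS_"

// boolOption reads envPrefix+name through lookup and falls back to def when
// the variable is unset. Passing the lookup function keeps it easy to test.
func boolOption(lookup func(string) (string, bool), name string, def bool) (bool, error) {
	raw, ok := lookup(envPrefix + name)
	if !ok {
		return def, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return def, fmt.Errorf("%s%s: %w", envPrefix, name, err)
	}
	return v, nil
}

func main() {
	// The ksvc forwarder is disabled by default, so the fallback is false.
	useServing, err := boolOption(os.LookupEnv, "SERVING_USE", false)
	fmt.Println("use ksvc forwarder:", useServing, "error:", err)
}
```

Running this after `export E2E_UPGRADE_TESTS_SERVING_USE=true` reports the forwarder as enabled; with the variable unset it falls back to the default.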