Knative Monitoring
Knative monitoring is a service that listens to prow job pubsub messages. It
scrapes through all the failure logs to catch test infrastructure failures.
System Diagram
Setup
Create the Cluster
gcloud container clusters create monitoring --enable-ip-alias --zone=us-central1-a
Note: The cluster connects to the CloudSQL instance via private IP. Thus, it is
required that the cluster is in the same zone as the CloudSQL instance.
Build and Deploy Changes
Update the Kubernetes components
monitoring_service.yaml
is the config to set up all the Kubernetes resources. Use kubectl apply
on the
monitoring_service file to make any updates.
Update Image
-
images/monitoring/Makefile
Commands to build and deploy the monitoring
images.
-
Update to use the latest image on GKE
kubectl rollout restart deployment.apps/monitoring-app
Check the rollout status
kubectl rollout status deployment.apps/monitoring-app
Clearing the alerts
From tools/monitoring/clearalerts
directory, run run_clear_alerts.sh
script.
Note: run_clear_alerts.sh
only works on linux machine. It builds the binary on
the local machine and copies it to the monitoring pod. On MacOS, the binary it
built returns error cannot execute binary file: Exec format error
when it runs
on the monitoring pod.