Cluster Monitoring Operator
The Cluster Monitoring Operator manages and updates the Prometheus-based monitoring stack deployed on top of OpenShift.
It contains the following components:
The deployed Prometheus Operator is meant to be used by users to easily deploy new Prometheus setups for monitoring their own applications.
The Prometheus instance (prometheus-k8s) is responsible for monitoring and alerting on cluster and OpenShift components. It should not be extended to monitor user applications.
Alertmanager is a cluster-global component for handling alerts generated by all Prometheus instances deployed in that cluster.
Metrics are collected from the following components
Important: The Prometheus Operator managed by the Cluster Monitoring Operator will by default only look for ServiceMonitor resources in namespaces containing an openshift.io/cluster-monitoring label (with any value).
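As a rough illustration of the selection rule above, the following Python sketch models it as a key-existence check on a namespace's labels. The function name is_watched_namespace and the sample namespaces are hypothetical and exist only for this example; this is not the operator's actual code, only the "label present, any value" behaviour described in the note.

```python
from typing import Mapping

# Label key taken from the note above; everything else here is illustrative.
CLUSTER_MONITORING_LABEL = "openshift.io/cluster-monitoring"


def is_watched_namespace(labels: Mapping[str, str]) -> bool:
    """Return True if the labels include the cluster-monitoring label key.

    Only the presence of the key matters; its value is ignored.
    """
    return CLUSTER_MONITORING_LABEL in labels


# Hypothetical namespaces and their labels.
namespaces = {
    "openshift-example": {"openshift.io/cluster-monitoring": "true"},
    "empty-value": {"openshift.io/cluster-monitoring": ""},
    "user-app": {"team": "payments"},
}

# Keep only the namespaces whose ServiceMonitor resources would be considered.
watched = [name for name, labels in namespaces.items() if is_watched_namespace(labels)]
print(watched)
```

Under this reading, a namespace that carries the label with an empty value is still selected, while a namespace without the label is skipped regardless of its other labels.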
Contributing new component integrations
The Cluster Monitoring Operator has many built-in ServiceMonitor resources, which enable discovery of the metrics endpoints of a variety of well-known components.
To register a new built-in component, make the following changes:
To add a new built-in alerting rule:
Run make generate after you modify the files and make sure to add the modified files to the commit.
Roadmap
- Monitor etcd
- Adapt alerts inherited from Tectonic using OpenShift operational knowledge
Testing
End-to-end tests
Run e2e-tests with make e2e-test.
Clean up after e2e-tests with make e2e-clean.