Red Hat Observability Service
This project holds the configuration files for our internal Red Hat Observability Service based on Observatorium.
See our website for more information about RHOBS.
Requirements
macOS
- findutils (for GNU xargs)
- gnu-sed
Both can be installed using Homebrew: brew install gnu-sed findutils. Afterwards, update the SED and XARGS variables in the Makefile to use gsed and gxargs, or replace them in your environment.
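As an alternative to editing the Makefile, the GNU tools can be chosen on the command line, since variables passed as arguments to make take precedence over ordinary assignments in a Makefile. The sketch below assumes the Makefile reads SED and XARGS as described above; the pick_tool helper and the SED_BIN and XARGS_BIN names are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper: print the first command from the arguments that exists on PATH.
pick_tool() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Prefer the Homebrew GNU variants, fall back to the system tools.
SED_BIN=$(pick_tool gsed sed)
XARGS_BIN=$(pick_tool gxargs xargs)

# Print the command instead of running it, so the choice can be reviewed first.
echo "make SED=$SED_BIN XARGS=$XARGS_BIN manifests"
```

On a Linux machine this would typically select plain sed and xargs, so the same invocation can be shared across platforms.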
Usage
This repository contains Jsonnet configuration for generating the Kubernetes objects that make up the RHOBS service and its observability.
RHOBS service
The Jsonnet files for the RHOBS service can be found in the services directory. To compose the RHOBS service, we import Jsonnet libraries from several open source repositories: kube-thanos for the Thanos components; Observatorium for the Observatorium, Minio, Memcached, Gubernator, and Dex components; thanos-receive-controller for the Thanos receive controller component; parca for the Parca component; observatorium api for the API component; observatorium up for the up component; and rules-objstore for the rules-objstore component.
Currently, RHOBS components are rendered as OpenShift Templates that allow parameters. This is how we deploy to multiple clusters, sharing the same configuration core while varying details such as resources or names.
This is why there might be a gap between vanilla Observatorium and RHOBS. We have plans to resolve this gap in the future.
Running make manifests generates all required files into the resources/services directory.
Observability
Similarly, to provide observability (alerts, recording rules, dashboards) for our service, we import mixins from various projects and compose them together in the observability directory.
Running make prometheusrules grafana generates all required files into the resources/observability directory.
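Both generation steps can be chained and followed by a check for uncommitted changes in the generated output. This is only a sketch: the regenerate_all function name is hypothetical, and it assumes it is run from the repository root with make and git available.

```shell
#!/bin/sh
# Hypothetical wrapper: regenerate the service and observability files,
# then list generated paths that differ from what is committed.
regenerate_all() {
  make manifests || return 1
  make prometheusrules grafana || return 1
  # --porcelain prints one line per changed or untracked path;
  # empty output means the generated files are in sync.
  git status --porcelain -- resources/
}

# Usage (from the repository root):
#   regenerate_all
```

Stopping at the first failing make target keeps a broken render from being mistaken for a clean one.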
Updating Dependencies
An up-to-date list of Jsonnet dependencies can be found in jsonnetfile.json. Fetching all dependencies is done with make vendor_jsonnet.
To update a dependency, the usual process is:
make vendor_jsonnet # This installs dependencies like `jb` thanks to the Bingo project.
# Pick the most recently modified `jb` binary installed under GOPATH.
JB=$(ls -t "$(go env GOPATH)"/bin/jb-* | head -1)
# Updates `kube-thanos` to main and sets the new hash in `jsonnetfile.lock.json`.
$JB update https://github.com/thanos-io/kube-thanos/jsonnet/kube-thanos@main
# Updates all dependencies and sets the new hashes in `jsonnetfile.lock.json`.
$JB update
Testing cluster
The purpose of the RHOBS testing cluster is to experiment before changes are rolled out to the staging and production environments. The objects in the cluster are managed by app-interface; however, the testing cluster uses a different set of namespaces: observatorium{-logs,-metrics,-traces}-testing.
Changes can be applied to the cluster manually, but they will be overridden by app-interface during the next deployment cycle.
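The namespace pattern above is written in brace notation and stands for three namespaces, one per signal. A minimal sketch that spells them out (the testing_namespaces function name is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: print the testing namespaces covered by
# observatorium{-logs,-metrics,-traces}-testing, one per line.
testing_namespaces() {
  for signal in logs metrics traces; do
    printf 'observatorium-%s-testing\n' "$signal"
  done
}

testing_namespaces
```

A plain loop is used rather than shell brace expansion so the sketch also works in shells that do not support braces.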
Refresh token
The refresh token can be obtained via token-refresher.
./token-refresher \
  --url=https://observatorium.apps.rhobs-testing.qqzf.p1.openshiftapps.com \
  --oidc.client-id=observatorium-rhobs-testing \
  --oidc.client-secret=<token> \
  --log.level=debug \
  --oidc.issuer-url=https://sso.redhat.com/auth/realms/redhat-external \
  --oidc.audience=observatorium-telemeter-testing \
  --file /tmp/token
cat /tmp/token
App Interface
Our deployments are managed by the Red Hat AppSRE team.
Updating Dashboards
Staging: Once the PR containing the dashboard changes is merged to main, it goes directly to the stage environment, because the telemeter-dashboards resourceTemplate refers to the main branch.
Production: Update the commit hash ref in the saas file, in the telemeterDashboards resourceTemplate, for the production environment.
Prometheus Rules and Alerts
Use synchronize.sh to create an MR against app-interface to update dashboards.
Components - Deployments, ServiceMonitors, ConfigMaps etc...
Staging: update the commit hash ref in https://gitlab.cee.redhat.com/service/app-interface/blob/master/data/services/telemeter/cicd/saas.yaml
Production: update the commit hash ref in https://gitlab.cee.redhat.com/service/app-interface/blob/master/data/services/telemeter/cicd/saas.yaml
CI Jobs
Job runs are posted in #sd-app-sre-info for Grafana dashboards and in #team-monitoring-info for everything else.
Troubleshooting
- Enable port forwarding for a user - example
- Add a pod name to the allowed list for port forwarding - example