authz-operator
An operator to register k8s apps with the CERN Authorization Service, including Application and SSO (OIDC) registrations, and lifecycle policy enforcement, such as ownership transfer or expiration when the owner leaves CERN.
APIs
The authz-operator exposes several CRDs.
For interaction with the CERN Authorization Service API (AuthzAPI):
ApplicationRegistration
OIDCReturnUri
BootstrapApplicationRole
For managing the lifecycle of OKD4 projects from ApplicationRegistration:
ProjectLifecyclePolicy
ApplicationRegistration
The ApplicationRegistration creates and maintains the Application and OIDC registration objects in the AuthzAPI.
Once the objects are created in the AuthzAPI, they're linked to this CR by the ID fields in its status.
The information in the Spec is the source of truth for these objects, meaning that changes made to those fields on the Authz side will be overwritten by the operator. Fields labelled "initial*" only initialize a value and don't maintain it afterwards. One such field is the application owner, for which the source of truth is the AuthzAPI (to allow for lifecycle actions).
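The difference between maintained fields and "initial*" fields can be illustrated with a minimal sketch. The `Spec` and `RemoteApp` types and their field names below are hypothetical simplifications, not the operator's actual CRD types.

```go
package main

import "fmt"

// Spec is a hypothetical, simplified stand-in for the ApplicationRegistration spec.
type Spec struct {
	Description  string // maintained: the spec is the source of truth
	InitialOwner string // "initial*": only used when the Application is created
}

// RemoteApp is a hypothetical view of the Application as stored in the AuthzAPI.
type RemoteApp struct {
	Exists      bool
	Description string
	Owner       string
}

// desiredRemote computes what the AuthzAPI object should look like after a reconcile.
func desiredRemote(spec Spec, remote RemoteApp) RemoteApp {
	out := remote
	out.Description = spec.Description // drift on the Authz side is overwritten
	if !remote.Exists {
		out.Exists = true
		out.Owner = spec.InitialOwner // set once, afterwards owned by the AuthzAPI
	}
	return out
}

func main() {
	spec := Spec{Description: "my site", InitialOwner: "alice"}
	// The owner was changed in the AuthzAPI by a lifecycle action and the description drifted.
	remote := RemoteApp{Exists: true, Description: "edited in portal", Owner: "bob"}
	fmt.Printf("%+v\n", desiredRemote(spec, remote))
}
```

In this sketch the description is reset to the spec value while the owner set on the Authz side is left untouched.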
The Status holds the last-known state of the linked objects in the AuthzAPI. It is also the place to look for errors reported by the API. The CRD was designed without the "conditions" pattern in mind, but a redesign is planned to better convey the relevant information.
Only 1 ApplicationRegistration is expected to exist per OKD project/namespace.
Since the project is the administrative/ownership domain in our services, it corresponds directly to the concept of an Application in the Authorization service.
Note that for performance reasons, we run 2 instances of the ApplicationRegistrationReconciler operating in different modes:
- the foreground instance processes regular Kubernetes resource events and applies changes to the Authz API. However, it does not read state from the Authz API if the state in Kubernetes is consistent. This instance of the reconciler thus returns quickly in most cases. With every reconcile, it requests a refresh of the ApplicationRegistration state from the background reconciler.
- the background instance only processes requests to fully refresh an ApplicationRegistration's status when requested by another controller (the foreground reconciler, or the lifecycle controller when it finds that an ApplicationRegistration's status is out of sync with the Authz API). These requests are received via an event channel.
This foreground/background separation is necessary because refreshing the status with each reconciliation takes too long: the controller's default SyncPeriod is 10h, and re-syncing all ApplicationRegistrations on the webeos cluster (about 5k ApplicationRegistrations) would take more than 20 minutes, during which site creation from the webservices portal fails because it times out waiting for the ApplicationRegistration to be Created (cf. INC3490535).
We went with 2 instances of the ApplicationRegistrationReconciler
operating in different modes, rather than 2 separate controllers, to allow
the foreground/background separation while minimizing changes to the existing logic.
Export of application details
Users may configure additional settings in the Application portal, such as the roles and to which groups they are mapped.
The ApplicationRegistrationExport controller regularly exports this information from the Authz API into the status of each ApplicationRegistration.
This is done to help with recovery of deleted projects / applications, since OKD admins don't have access to the Authz internal database and the Authz team does not support restoring individual applications.
Ref: https://gitlab.cern.ch/webservices/webframeworks-planning/-/issues/483
OIDCReturnUri
OIDC return URIs are the valid addresses to which the Identity Provider (keycloak) is allowed to redirect the user after successful authentication. For this reason they're also referred to as "redirect URIs". They are registered for each OIDC registration, and there can be several of them. To better express this multiplicity and make them dynamically adjustable, they are expressed with a separate CR.
In a project with only an ApplicationRegistration, the OIDC registration will initially have no redirect URIs (and thus be non-functional).
Adding an OIDCReturnUri will trigger the Application controller to include it in the OIDC registration.
The OIDCReturnUris in the project are kept in sync with the values in the AuthzAPI.
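Keeping the redirect URIs in sync boils down to a set difference between what the CRs in the project declare and what is registered remotely. The following is a minimal sketch of that comparison; `diffRedirectURIs` is a hypothetical helper, not a function of the operator.

```go
package main

import (
	"fmt"
	"sort"
)

// diffRedirectURIs compares the URIs declared by the OIDCReturnUri CRs of a
// project (desired) with those currently registered in the AuthzAPI
// (registered) and returns what has to be added and removed remotely.
func diffRedirectURIs(desired, registered []string) (toAdd, toRemove []string) {
	want := make(map[string]bool, len(desired))
	for _, u := range desired {
		want[u] = true
	}
	have := make(map[string]bool, len(registered))
	for _, u := range registered {
		have[u] = true
	}
	for u := range want {
		if !have[u] {
			toAdd = append(toAdd, u)
		}
	}
	for u := range have {
		if !want[u] {
			toRemove = append(toRemove, u)
		}
	}
	// Map iteration order is random; sort for deterministic results.
	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	desired := []string{"https://mysite.web.cern.ch/callback", "https://mysite-dev.web.cern.ch/callback"}
	registered := []string{"https://mysite.web.cern.ch/callback", "https://old.web.cern.ch/callback"}
	add, remove := diffRedirectURIs(desired, registered)
	fmt.Println("add:", add)
	fmt.Println("remove:", remove)
}
```

The example hostnames are made up for illustration.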
BootstrapApplicationRole
BootstrapApplicationRole creates, but does not maintain, Roles for the existing Application in the same namespace in the AuthzAPI.
Each Role is bound to the Application (a Role makes no sense without an Application). Once the role is created in the AuthzAPI, the role ID is stored in the status and the controller takes no further action.
Details on motivation can be seen here.
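The "create once, never maintain" behavior can be sketched as follows. The `BootstrapRole` type, its fields and the `create` callback are hypothetical stand-ins for the CR and the AuthzAPI call.

```go
package main

import (
	"errors"
	"fmt"
)

// BootstrapRole is a hypothetical, flattened view of the CR:
// Name comes from the spec, RoleID is what gets stored in the status.
type BootstrapRole struct {
	Name   string
	RoleID string
}

// reconcileRole creates the role exactly once. As soon as a role ID is
// recorded, later reconciles are no-ops, even if the role was changed remotely.
func reconcileRole(r *BootstrapRole, create func(name string) (string, error)) error {
	if r.RoleID != "" {
		return nil // already bootstrapped: do not maintain
	}
	id, err := create(r.Name)
	if err != nil {
		return fmt.Errorf("creating role %q: %w", r.Name, err)
	}
	if id == "" {
		return errors.New("empty role ID returned")
	}
	r.RoleID = id
	return nil
}

func main() {
	calls := 0
	create := func(name string) (string, error) {
		calls++
		return "role-" + name, nil
	}
	role := &BootstrapRole{Name: "admins"}
	for i := 0; i < 3; i++ {
		if err := reconcileRole(role, create); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("role ID:", role.RoleID, "create calls:", calls)
}
```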
Synchronization of lifecycle-related application properties
Applications at CERN have an ownership "lifecycle" policy. It involves automatically updating resource ownership (owner & admin group) when somebody leaves the Organization and automatically deleting leftover resources linked to terminated computing accounts. The source of truth for the Owner and whether an application still exists is the AuthzAPI.
Although not an API, the operator includes a controller that enforces the lifecycle policy.
The operator periodically checks the ownership status of each ApplicationRegistration at the AuthzAPI and, if ownership has changed or the linked Application has been deleted, updates the ApplicationRegistration's .status with the updated information.
A stress test was conducted with around 2500 ApplicationRegistrations, and the average reconciliation time was around 2 min 30 s.
Based on this, the default reconciliation period for the lifecycle controller has been set to 5 minutes.
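The periodic check described above can be sketched as a comparison between the last-known status and the state returned by the AuthzAPI. `LifecycleInfo` and `syncLifecycle` are hypothetical names, and the real status contains more fields.

```go
package main

import "fmt"

// LifecycleInfo is a hypothetical subset of the lifecycle-related status fields.
type LifecycleInfo struct {
	Owner      string
	AdminGroup string
	Deleted    bool
}

// syncLifecycle returns the status to store and whether it differs from the
// current one. remote == nil means the Application no longer exists in the AuthzAPI.
func syncLifecycle(status LifecycleInfo, remote *LifecycleInfo) (LifecycleInfo, bool) {
	next := status
	if remote == nil {
		next.Deleted = true // keep last-known owner/group, flag the deletion
	} else {
		next = *remote
	}
	return next, next != status
}

func main() {
	status := LifecycleInfo{Owner: "alice", AdminGroup: "web-admins"}

	// The owner left the Organization and ownership was transferred.
	next, changed := syncLifecycle(status, &LifecycleInfo{Owner: "bob", AdminGroup: "web-admins"})
	fmt.Printf("%+v changed=%v\n", next, changed)

	// The Application was deleted in the AuthzAPI.
	next, changed = syncLifecycle(next, nil)
	fmt.Printf("%+v changed=%v\n", next, changed)
}
```

Only when `changed` is true would the status need to be written back, which keeps a periodic pass over thousands of resources cheap on the Kubernetes side.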
ProjectLifecyclePolicy
The ProjectLifecyclePolicy CR controls how the authz-operator applies changes to lifecycle-related properties of the application in the AuthzAPI to the OKD project/namespace containing an ApplicationRegistration:
- manages a rolebinding granting admin permissions on the project to the owner/administrator group declared in the Application Portal. Important note: the authz-operator serviceaccount MUST itself be granted this cluster role so it can grant it to other users!
- controls whether to delete the OKD project when the Application is deleted from the AuthzAPI
- maintains ConsoleLinks so the OKD console will show information and a link to the Application's management page in the Application Portal
- can propagate changes to the application's Description in the Application Portal to the OKD project's description
We expect exactly one ProjectLifecyclePolicy
per OKD project/namespace. Behavior is undefined if multiple CRs exist.
Setup & Deployment
Configuration
The authz-operator is configured with a set of environment variables:
env var | example | description |
---|---|---|
CLUSTER_NAME | okd4-prod1 | Name of the k8s cluster where the operator is deployed. Used in the ApplicationRegistration naming convention. |
AUTHZAPI_URL | https://authorization-service-api.web.cern.ch | API base URL for interacting with the Authorization service |
KC_ISSUER_URL | https://auth.cern.ch/auth/realms/cern | Identity provider (keycloak) issuer URL to fetch API access tokens from |
KC_CLIENT_ID | authz-operator-okd4-prod | For OAuth client credentials flow to get API access token |
KC_CLIENT_SECRET | 0789a3b8-fc2c-49d4-bfc9-eb1943f5977b | For OAuth client credentials flow to get API access token |
LIFECYCLE_RECONCILE_PERIOD_MIN | 5 | Time in minutes between periodic reconciliations of all ApplicationRegistrations by the lifecycle controller. Defaults to 5 minutes if unset. |
AUTHZ_APPLICATIONS_PER_PAGE | 1000 | Number of Applications per page when retrieving all Applications in the GetMyApplications method. If unset, the AuthzAPI default is used (1000 Applications as of May 2021). |
For the internal tests to work, two extra environment variables need to be set:
env var | description |
---|---|
SVC_ACCOUNT_ID | The OwnerID provided by the AuthzAPI; it should be available in the CI variables. To retrieve it, go to the AuthzAPI documentation (this will not work with production credentials), get an Application owned by the service account by ID, and read the value returned as OwnerID. |
MANAGER_ID | The ID of our Manager Application; it should be available in the CI variables and should not change. |
In Helm deployments, these values correspond to parameters explained in deploy/values.yaml
Deployment
Standard deployment is with the Helm chart.
For a new deployment we need to create new Keycloak credentials.
Create Keycloak credentials
The authz-operator's deployment needs to be known to the CERN Authorization service (AuthzSvc) as an Application. The AuthzSvc supports the concept of a "manager" for each Application. That's the role this operator plays for the resources it creates/manages from the AuthzSvc perspective.
The cluster admin needs to register an Application at the CERN Application portal and set up the OAuth client credentials flow:
- Create a new application
  - An appropriate admin group should be specified
- Create an OIDC registration with client credentials
  - Edit the Application -> SSO Registration
  - New OIDC registration
  - Set a random redirectURI, e.g. "https://example.cern.ch"
  - Advanced Options -> check "My application will need to get tokens using its own client ID and secret"
- Request group memberships:
  - authorization-service-identity-readers
  - authorization-service-applications-managers
Images
Gitlab CI is set up to automatically tag images in this repo's registry whenever any branch is pushed.
Development
Generated authorization-service-api client
Initially the authz-operator contained a hand-written Authorization Service API client (located in internal/authzapireq).
As of June 2024 we use OpenAPI Generator that generates the Go code based on the OpenAPI spec ("Swagger").
This generated code might need to be updated from time to time to get access to new endpoints in the API.
# fetch API schema (commit it to Git so we can track changes)
curl -sSL https://authorization-service-api.web.cern.ch/swagger/v1.0/swagger.json -o authz-api-v1-swagger.json
# generate Go client and models (see details in the config file)
rm -rf internal/authzapiv1/
podman run --rm -v "${PWD}:/local" docker.io/openapitools/openapi-generator-cli:v7.6.0 generate \
--input-spec /local/authz-api-v1-swagger.json \
--generator-name go \
--output /local/internal/authzapiv1/ \
--config /local/openapi-generator-config.yaml
# remove common prefix from methods and structs
sed -i 's/ApiApiV10//g' internal/authzapiv1/*.go
sed -i 's/ApiV10//g' internal/authzapiv1/*.go
# fetch dependencies
go get -u ./internal/authzapiv1/...
go mod tidy
go mod vendor
# format the generated code and stage all changes in Git
go fmt ./internal/authzapiv1/...
git add -f authz-api-v1-swagger.json openapi-generator-config.yaml go.mod go.sum internal/authzapiv1/ vendor/
Design
For design documentation regarding the Application Role CRD, see here.
Discussions with Authorization service
Support applications managed by another service
- Add
application.managerId
- Define which properties can be modified on the authz / operator side
Support blocking applications (security)
- Resources lifecycle
OIDC: where is the SoT?
We discussed whether OIDC-related information (especially redirectURIs) should be a new CRD or part of AppReg: discussion
In particular, we discussed whether redirectURIs should be read from this operator's CRDs at all, or directly from an external source.
Supporting only a core OIDC flow type
This operator automates the most common cases as a convenience, without wresting control from the end user. Therefore, marginal use cases don't need to be automated. Following this reasoning, we don't support defining the OIDC flow type: the only flow type supported by this operator is the authorization code flow.
The same applies to UserConsentRequired: it is only relevant for content hosted outside of CERN, which is not a concern for now.
Other OIDC flows
The user can always create a separate Application in the AuthzAPI from the UI and have fine-grained control over the OIDC details.
redirectURIs
webeos
A single redirectURI that can be generated at creation of the ApplicationRegistration (just include it in the webeos project template):
mysite => mysite.web.cern.ch/oidcsso/whatever (decided by the webeos-config-operator) => set ApplicationRegistration.spec.RedirectURI at project creation
PaaS / general use case
Multiple hostnames, default /* (or user-specified) return path
=> Route + annotation; we'll need a separate operator (PaasSite operator?) to update ApplicationRegistration.spec.RedirectURI based on changes to routes
SAML
It is possible to create a SAML registration after the OIDC registration for the same application; therefore we're not complicating the Drupal use case with this decision.
Debugging on existing dev cluster
Uses:
- manual tests
- run one of the okd4-install integration test packages (switching use cases will restart the in-cluster operator, so we can run specific tests but not run the whole suite)
- attach vscode's debugger to running operator process
export CLUSTER_NAME=<...>
export KUBECONFIG=<...>
export KC_ISSUER_URL="https://keycloak-qa.cern.ch/auth/realms/cern"
export AUTHZAPI_URL="https://authorization-service-api-qa.web.cern.ch"
# Secrets
export KC_CLIENT_ID=$(oc get secret -n openshift-cern-authz-operator operator-keycloak-credentials -o json | jq -r '.data.CLIENT_ID' | base64 -d)
export KC_CLIENT_SECRET=$(oc get secret -n openshift-cern-authz-operator operator-keycloak-credentials -o json | jq -r '.data.CLIENT_SECRET' | base64 -d)
# disable in-cluster operator
oc edit application/authz-operator -n openshift-cern-argocd # <-- set replicas: 0
# If changes to CRDs...
# oc edit application/authz-operator-crds -n openshift-cern-argocd # <-- remove spec.syncPolicy.automated
# make manifests && oc replace -f config/crd/bases
# build & run...
make
bin/manager --zap-log-level=3 # use 6 for debug logs, 8 for trace logs
# run integration tests from okd4-install
# E.g. if okd4-install is checked out in ~/git/okd4-install
# (docker options must come before the image name, otherwise they are passed to the container as the command)
docker run --rm -v ~/git/okd4-install:/project -v ${KUBECONFIG:-~/.kube/kubeconfig}:/root/.kube/config -w /project -e CI_JOB_ID=${RANDOM} registry.cern.ch/paas-tools/okd4-install bats -tpr tests/1-common/1-authz-operator.bats
# When done with testing, revert changes to argocd Applications to put things back as they were
# or just sync the cluster-config Argocd application