opendatahub-operator

command module

v2.25.0 (latest) | Published: Feb 26, 2025 | License: Apache-2.0 | Imports: 77 | Imported by: 0

Repository

github.com/opendatahub-io/opendatahub-operator

README

This is the primary operator for Open Data Hub. It enables data science applications such as Jupyter Notebooks, ModelMesh Serving, and Data Science Pipelines. The operator uses the DataScienceCluster CRD to deploy and configure these applications.

Usage

Prerequisites

If the single model serving configuration or the KServe component is used, please make sure to install the following operators before creating the DSCI and DSC instances.

Additionally, installing the Authorino operator and the Service Mesh operator improves the user experience by providing single sign-on.

Installation

The latest version of the operator can be installed from the community-operators catalog on OperatorHub.

Please note that the latest releases are made in the Fast channel.

It can also be built and installed from source manually; see the Developer Guide for further instructions.

1. Subscribe to the operator by creating the following subscription:
```
cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: fast
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
EOF
```
2. Create the DSCInitialization CR manually. Alternatively, let the operator create a default DSCI CR by removing the env variable DISABLE_DSC_CONFIG from the CSV or changing its value to "false", and then restarting the operator pod.
3. Create a DataScienceCluster CR to enable components.

Configuration

ODH 2.23.1 introduced a feature that allows users to use their own application namespace instead of the default one, "opendatahub".

For a new cluster, meaning one that has not been used for ODH or RHOAI, follow the steps below before installing the ODH operator. Namespace A is used here as the example target application namespace:
- Create namespace A.
- Add the label opendatahub.io/application-namespace: true to namespace A. Only one namespace in the cluster can have this label.
- Install the ODH operator, either from the UI or by GitOps/CLI.
- Once the operator is up and running, manually create the DSCI CR with .spec.applicationsNamespace set to A.
- Wait until the DSCI status updates to "Ready".
- Continue by creating the DSC CR.
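
The namespace preparation and the DSCI setting from the steps above could look roughly like the following. This is a sketch under assumptions, not an official procedure: the helper names `prepare_app_namespace` and `render_dsci_fragment` and the namespace `my-odh-apps` are hypothetical, and the rendered manifest shows only the field discussed here.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; APP_NAMESPACE and both helper names are hypothetical.
APP_NAMESPACE="${APP_NAMESPACE:-my-odh-apps}"

# Create the namespace and label it (requires a logged-in oc client).
# Defined here but not invoked, so the sketch has no side effects on a cluster.
prepare_app_namespace() {
  oc create namespace "$1"
  oc label namespace "$1" opendatahub.io/application-namespace=true
}

# Print a DSCI fragment that points the operator at the chosen namespace.
# Only applicationsNamespace is shown; a real DSCI CR carries more settings.
render_dsci_fragment() {
  cat <<EOF
apiVersion: dscinitialization.opendatahub.io/v1
kind: DSCInitialization
metadata:
  name: default-dsci
spec:
  applicationsNamespace: $1
EOF
}

render_dsci_fragment "$APP_NAMESPACE"
```

After the operator is installed, the fragment would be merged into a full DSCI CR (see the default example in this README) before it is applied.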
For the upgrade case, where ODH is already running in the cluster, be aware that switching to a different application namespace can cause further issues and require manual cleanup. We therefore suggest doing this only on a new cluster.

Developer Guide

Pre-requisites

Go version go1.22
operator-sdk version can be updated to v1.31.1

Download manifests

The get_all_manifests.sh script facilitates the process of fetching manifests from remote git repositories. It is configured to work with a predefined map of components and their corresponding manifest locations.

Structure of `COMPONENT_MANIFESTS`

Each component is associated with its manifest location in the COMPONENT_MANIFESTS map. The key is the component's name, and the value is its location, formatted as <repo-org>:<repo-name>:<branch-name>:<source-folder>:<target-folder>

Workflow

The script clones the remote repository <repo-org>/<repo-name> from the specified <branch-name>.
It then copies the content from the relative path <source-folder> to the local opt/manifests/<target-folder> folder.
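
To make the location format concrete, here is a minimal sketch of how such a value splits into its five fields. It is not the code of get_all_manifests.sh; the function name `parse_manifest_location` is hypothetical and the sketch only prints what would be cloned and copied.

```shell
#!/usr/bin/env bash
# Illustrative only: split a COMPONENT_MANIFESTS value into its five fields.
# parse_manifest_location is a hypothetical helper, not part of the real script.
parse_manifest_location() {
  local location="$1"
  local org repo branch source_folder target_folder
  # The fields are colon-separated, in the documented order.
  IFS=':' read -r org repo branch source_folder target_folder <<< "$location"
  echo "repo=${org}/${repo}"
  echo "branch=${branch}"
  echo "copy ${source_folder} -> opt/manifests/${target_folder}"
}

parse_manifest_location "maistra:odh-dashboard:test-manifests:manifests:odh-dashboard"
```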

Local Storage

The script utilizes a local, empty folder named opt/manifests to host all required manifests, sourced directly from each component’s source repository.

Adding New Components

To include a new component in the list of manifest repositories, simply extend the COMPONENT_MANIFESTS map with a new entry, as shown below:

declare -A COMPONENT_MANIFESTS=(
  # existing components ...
  ["new-component"]="<repo-org>:<repo-name>:<branch-name>:<source-folder>:<target-folder>"
)

Customizing Manifests Source

You have the flexibility to change the source of the manifests. Invoke the get_all_manifests.sh script with specific flags, as illustrated below:

./get_all_manifests.sh --odh-dashboard="maistra:odh-dashboard:test-manifests:manifests:odh-dashboard"

If the flag name matches a component key defined in COMPONENT_MANIFESTS, the flag overrides that component's location; otherwise the command fails.
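
The override rule can be illustrated with a toy map. This is a sketch of the documented behavior only, not the real script's implementation; `apply_override` is a hypothetical helper and the map holds a single placeholder entry. It needs bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Illustrative only: mimic the documented override rule with a toy map.
declare -A COMPONENT_MANIFESTS=(
  ["odh-dashboard"]="<repo-org>:<repo-name>:<branch-name>:<source-folder>:<target-folder>"
)

# apply_override is a hypothetical helper; it accepts one --<key>=<location> flag.
apply_override() {
  local flag="$1"
  [[ "$flag" == --*=* ]] || { echo "malformed flag: $flag" >&2; return 1; }
  local key="${flag#--}"    # drop the leading --
  key="${key%%=*}"          # keep the text before the first =
  local value="${flag#*=}"  # keep the text after the first =
  if [[ -n "${COMPONENT_MANIFESTS[$key]+set}" ]]; then
    COMPONENT_MANIFESTS["$key"]="$value"   # known key: override its location
  else
    echo "unknown component: $key" >&2     # unknown key: fail
    return 1
  fi
}

apply_override --odh-dashboard="maistra:odh-dashboard:test-manifests:manifests:odh-dashboard"
echo "${COMPONENT_MANIFESTS[odh-dashboard]}"
apply_override --no-such-component="a:b:c:d:e" || echo "rejected"
```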

For local development

make get-manifests

This first cleans up your local opt/manifests folder. Back up the folder before running this command if it contains local manifest changes you want to reuse later.

For building the operator image

make image-build

By default, the image is built without any local changes (a clean build). This is what the production build system does.

To build an image with the local opt/manifests folder, set the USE_LOCAL make variable to true, e.g. make image-build USE_LOCAL=true

Build Image

A custom operator image can be built from your local repository:
```
make image IMG=quay.io/<username>/opendatahub-operator:<custom-tag>
```
If no IMG argument is supplied to make image, the default image quay.io/opendatahub/opendatahub-operator:dev-0.0.1 is used.

To build a multi-arch image, set the environment variable PLATFORM:

export PLATFORM=linux/amd64,linux/arm64,linux/ppc64le,linux/s390x
make image

Once the image is created, the operator can be deployed either directly or through OLM. For either deployment method, a kubeconfig should be exported:
```
export KUBECONFIG=<path to kubeconfig>
```

Deployment

Deploying operator locally

Define the operator namespace:

export OPERATOR_NAMESPACE=<namespace-to-install-operator>

Deploy the created image in your cluster using the following command:

make deploy IMG=quay.io/<username>/opendatahub-operator:<custom-tag> OPERATOR_NAMESPACE=<namespace-to-install-operator>

To remove the resources created during installation, use:
```
make undeploy
```

Deploying operator using OLM

To create a new bundle in the defined operator namespace, run the following commands:
```
export OPERATOR_NAMESPACE=<namespace-to-install-operator>
make bundle
```
Note: Skip the above step if you want to run the existing operator bundle.

Build the bundle image:

make bundle-build bundle-push BUNDLE_IMG=quay.io/<username>/opendatahub-operator-bundle:<VERSION>

Run the bundle on a cluster:

operator-sdk run bundle quay.io/<username>/opendatahub-operator-bundle:<VERSION> --namespace $OPERATOR_NAMESPACE --decompression-image quay.io/project-codeflare/busybox:1.36

Test with customized manifests

There are 2 ways to test your changes with modified manifests:

1. Each component in the DataScienceCluster CR has a devFlags.manifests field, which can be used to pull manifests from the remote git repos of the respective components. This method overwrites the manifests and creates customized resources for the respective components.
2. [Under implementation] Build the operator image with local manifests.

Update API docs

Whenever a new API or a new CRD field is added, please make sure to run the command:

make api-docs

This ensures that the API docs are updated accordingly.

Enabled logging

The global logger configuration can be changed with the environment variable ZAP_LOG_LEVEL or the command line switch --log-mode <mode>, for example from the CSV. The command line switch has higher priority. Valid values for <mode>: "" (the default), prod, production, devel, development.

The verbosity level is INFO. To fine-tune the zap backend, the standard Operator SDK zap switches can be used.

The log level can be changed at runtime through the DSCI devFlags by setting .spec.devFlags.logLevel. It accepts the same values as the --zap-log-level command line switch. For example:

apiVersion: dscinitialization.opendatahub.io/v1
kind: DSCInitialization
metadata:
  name: default-dsci
spec:
  devFlags:
    logLevel: debug
  ...

logmode	stacktrace level	verbosity	Output	Comments
devel	WARN	INFO	Console	lowest level, using epoch time
development	WARN	INFO	Console	same as devel
""	ERROR	INFO	JSON	default option
prod	ERROR	INFO	JSON	highest level, using human readable timestamp
production	ERROR	INFO	JSON	same as prod
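
As a quick reference, the table above can be expressed as a lookup. The function below is only a sketch for illustration; `describe_log_mode` is a hypothetical name and is not part of the operator. It folds the alias pairs together and leaves out the timestamp differences noted in the Comments column.

```shell
#!/usr/bin/env bash
# Illustrative only: map a --log-mode value to the settings listed in the table.
# describe_log_mode is a hypothetical helper, not an operator command.
describe_log_mode() {
  case "$1" in
    devel|development)  echo "stacktrace=WARN verbosity=INFO output=Console" ;;
    ""|prod|production) echo "stacktrace=ERROR verbosity=INFO output=JSON" ;;
    *) echo "unknown log mode: $1" >&2; return 1 ;;
  esac
}

describe_log_mode devel
describe_log_mode ""
```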

Example DSCInitialization

Below is the default DSCI CR config

kind: DSCInitialization
apiVersion: dscinitialization.opendatahub.io/v1
metadata:
  name: default-dsci
spec:
  applicationsNamespace: opendatahub
  monitoring:
    managementState: Managed
    namespace: opendatahub
  serviceMesh:
    controlPlane:
      metricsCollection: Istio
      name: data-science-smcp
      namespace: istio-system
    managementState: Managed
  trustedCABundle:
    customCABundle: ''
    managementState: Managed

Apply this example with modification for your usage.

Example DataScienceCluster

Once the operator is installed successfully in the cluster, a user can create a DataScienceCluster CR to enable ODH components. ODH supports only one instance of the CR at a time; it can be updated to enable a custom list of components.

Enable all components

apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    codeflare:
      managementState: Managed
    dashboard:
      managementState: Managed
    datasciencepipelines:
      managementState: Managed
    kserve:
      managementState: Managed
      nim:
        managementState: Managed
      serving:
        ingressGateway:
          certificate:
            type: OpenshiftDefaultIngress
        managementState: Managed
        name: knative-serving
    kueue:
      managementState: Managed
    modelmeshserving:
      managementState: Managed
    modelregistry:
      managementState: Managed
      registriesNamespace: "odh-model-registries"
    ray:
      managementState: Managed
    trainingoperator:
      managementState: Managed
    trustyai:
      managementState: Managed
    workbenches:
      managementState: Managed
    feastoperator:
      managementState: Managed

Enable only Dashboard and Workbenches

apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: example
spec:
  components:
    dashboard:
      managementState: Managed
    workbenches:
      managementState: Managed

Note: Default value for managementState in component is false.

Run functional Tests

The functional tests are written with ginkgo and gomega. To run the tests, the user needs to set up envtest, which provides a mocked Kubernetes cluster. A detailed explanation of how to configure envtest is provided here.

To run the tests for an individual controller, change directory into the controller's folder and run

ginkgo -v

This provides detailed logs of the test spec.

Note: When running tests for an individual controller, make sure to add the BinaryAssetsDirectory attribute to the envtest.Environment in the suite_test.go file. The value should point to the path where the envtest binaries are installed.

To run the tests for all controllers, use the make command

make unit-test

Note: The make command should be executed at the project root.

Run e2e Tests

A user can run the e2e tests in the same namespace as the operator. To deploy opendatahub-operator refer to this section. The following environment variables must be set when running locally:

export KUBECONFIG=/path/to/kubeconfig

When testing the RHODS operator in dev mode, ensure that no ODH CSV exists. Once the above variables are set, run the following:

make e2e-test

Additional flags can be passed to the e2e tests through the E2E_TEST_FLAGS variable. The following table lists all the available flags:

Flag	Description	Default value
--skip-deletion	Skips the `dsc-deletion` test, which deletes `DataScienceCluster` resources. Set this flag to `true` to skip DataScienceCluster deletion.	false
--test-operator-controller	Configures execution of the tests related to the operator pod. Useful for running e2e tests against an operator running outside the cluster, e.g. for debugging.	true
--test-webhook	Configures execution of the tests related to the operator webhooks. Useful for running e2e tests against an operator running outside the cluster, e.g. for debugging.	true
--test-component	A repeatable flag that controls which components are tested. By default, all component-specific tests are executed.	true

Example command to run the full test suite while skipping the DataScienceCluster deletion test:

make e2e-test OPERATOR_NAMESPACE=<namespace> E2E_TEST_FLAGS="--skip-deletion=true"

Example commands to run the test suite for the dashboard component only, with the operator running outside the cluster:

make run-nowebhook

make e2e-test -e OPERATOR_NAMESPACE=<namespace> -e E2E_TEST_FLAGS="--test-operator-controller=false --test-webhook=false --test-component=dashboard"
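
Because --test-component is repeatable, the flag string for several components can be assembled with a small helper. This is a sketch: `build_e2e_flags` is a hypothetical function, and using `workbenches` as a second component value is an assumption based on the component names in this README.

```shell
#!/usr/bin/env bash
# Illustrative only: assemble E2E_TEST_FLAGS for an operator running outside
# the cluster, repeating --test-component once per requested component.
# build_e2e_flags is a hypothetical helper, not part of the repository.
build_e2e_flags() {
  local flags="--test-operator-controller=false --test-webhook=false"
  local component
  for component in "$@"; do
    flags+=" --test-component=${component}"
  done
  echo "$flags"
}

build_e2e_flags dashboard workbenches
# Possible use:
#   make e2e-test -e OPERATOR_NAMESPACE=<namespace> -e E2E_TEST_FLAGS="$(build_e2e_flags dashboard workbenches)"
```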

Run Prometheus Unit Tests for Alerts

Unit tests for Prometheus alerts are included in the repository. You can run them using the following command:

make test-alerts

To check for alerts that don't have unit tests, run the below command:

make check-prometheus-alert-unit-tests

To add a new unit test file, name it the same as the rules file in the prometheus ConfigMap, just with the .rules suffix replaced with .unit-tests.yaml
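
The naming rule amounts to a suffix swap, sketched below. The helper name `unit_test_file_for` and the file name `example-alerting.rules` are hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: derive the unit test file name from a rules file name
# by replacing the .rules suffix with .unit-tests.yaml.
unit_test_file_for() {
  local rules_file="$1"
  echo "${rules_file%.rules}.unit-tests.yaml"
}

unit_test_file_for "example-alerting.rules"
```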

API Overview

Please refer to api documentation

Component Integration

Please refer to components docs

Troubleshooting

Please refer to troubleshooting documentation

Upgrade testing

Please refer to upgrade testing documentation


Source Files

main.go

Directories

Path	Synopsis
apis
common
components	+groupName=datasciencecluster.opendatahub.io
components/v1alpha1	Package v1 contains API Schema definitions for the components v1 API group +kubebuilder:object:generate=true +groupName=components.platform.opendatahub.io
datasciencecluster/v1	Package v1 contains API Schema definitions for the datasciencecluster v1 API group
dscinitialization/v1	Package v1 contains API Schema definitions for the dscinitialization v1 API group
features/v1	Package v1 contains API Schema definitions for the feature v1 API group
infrastructure/v1	+groupName=datasciencecluster.opendatahub.io
services	+groupName=dscinitialization.opendatahub.io
services/v1alpha1	Package v1 contains API Schema definitions for the services v1 API group +kubebuilder:object:generate=true +groupName=services.platform.opendatahub.io
controllers
certconfigmapgenerator	Package certconfigmapgenerator contains generator logic of add cert configmap resource in user namespaces
components/codeflare
components/dashboard
components/datasciencepipelines
components/feastoperator
components/kserve
components/kueue
components/modelcontroller
components/modelmeshserving
components/modelregistry
components/ray
components/trainingoperator
components/trustyai
components/workbenches
datasciencecluster	Package datasciencecluster contains controller logic of CRD DataScienceCluster
dscinitialization	Package dscinitialization contains controller logic of CRD DSCInitialization.
secretgenerator	Package secretgenerator contains generator logic of secret resources used in Open Data Hub operator
services/auth
services/monitoring
setupcontroller
status	Package status provides a generic way to report status and conditions for any resource of type client.Object.
webhook
pkg
cluster	Package cluster contains utility functions used to operate on cluster resources.
cluster/gvk
common	Package common contains utility functions used by different components for cluster related common operations, refer to package cluster
componentsregistry	componentsregistry package is a registry of all components that can be managed by the operator TODO: it may make sense to put it under components/ when it's clear from the old stuff
controller/actions
controller/actions/cacher
controller/actions/deleteresource
controller/actions/deploy
controller/actions/errors
controller/actions/gc
controller/actions/render
controller/actions/render/kustomize
controller/actions/render/template
controller/actions/resourcecacher
controller/actions/status/releases
controller/actions/updatestatus
controller/client
controller/conditions
controller/handlers
controller/manager
controller/predicates
controller/predicates/clusterrole
controller/predicates/component
controller/predicates/dependent
controller/predicates/generation
controller/predicates/hash
controller/predicates/partial
controller/predicates/resources
controller/reconciler
controller/types
conversion
deploy	Package deploy provides utility functions used by each component to deploy manifests to the cluster.
feature
feature/manifest
feature/provider
feature/resource
feature/serverless
feature/servicemesh
logger
manifests/kustomize
metadata/annotations
metadata/labels
plugins
resources
services/gc
trustedcabundle	Package trustedcabundle provides utility functions to create and check trusted CA bundle configmap from DSCI CRD
upgrade	Package upgrade provides functions of upgrade ODH from v1 to v2 and various v2 versions.
utils/test/envt
utils/test/fakeclient
utils/test/matchers
utils/test/matchers/jq
utils/test/mocks
utils/test/scheme
utils/test/testf
tests
envtestutil
integration/features/fixtures
