katib

Module version: v0.12.0-rc.0
Published: Aug 19, 2021 License: Apache-2.0

Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architecture Search.

Katib is agnostic to machine learning (ML) frameworks. It can tune hyperparameters of applications written in any language of the users' choice and natively supports many ML frameworks, such as TensorFlow, MXNet, PyTorch, XGBoost, and others.

Getting Started

Follow the getting-started guide on the Kubeflow website.

Name

Katib means "secretary" in Arabic.

Concepts in Katib

For a detailed description of the concepts in Katib and AutoML, check the Kubeflow documentation.

Katib has the concepts of Experiment, Suggestion, Trial and Worker Job.

Experiment

An Experiment represents a single optimization run over a feasible space. Each Experiment contains a configuration:

  1. Objective: What you want to optimize.
  2. Search Space: Constraints for configurations describing the feasible space.
  3. Search Algorithm: How to find the optimal configurations.

Katib Experiment is defined as a CRD. Check the detailed guide to configuring and running a Katib Experiment in the Kubeflow docs.
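
The three parts of an Experiment configuration can be pictured as fields of a manifest. The following is a minimal sketch, not a complete or authoritative manifest: the helper build_experiment, the metric name "accuracy", the parameter names, and the namespace "my-namespace" are all hypothetical, and the trial template that a real Experiment also needs is omitted. The field names follow the v1beta1 Experiment CRD as I understand it, so check them against the Kubeflow docs before use.

```python
import json


def build_experiment(name, namespace):
    """Return a dict shaped like a Katib v1beta1 Experiment (illustrative only)."""
    return {
        "apiVersion": "kubeflow.org/v1beta1",
        "kind": "Experiment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # 1. Objective: what you want to optimize.
            "objective": {
                "type": "maximize",
                "goal": 0.99,
                "objectiveMetricName": "accuracy",  # hypothetical metric name
            },
            # 2. Search space: constraints describing the feasible space.
            "parameters": [
                {
                    "name": "lr",  # hypothetical parameter
                    "parameterType": "double",
                    "feasibleSpace": {"min": "0.01", "max": "0.05"},
                },
                {
                    "name": "num-layers",  # hypothetical parameter
                    "parameterType": "int",
                    "feasibleSpace": {"min": "2", "max": "5"},
                },
            ],
            # 3. Search algorithm: how to find the optimal configuration.
            "algorithm": {"algorithmName": "random"},
            # Upper bound on the number of Trials the Experiment may run.
            "maxTrialCount": 12,
            # The trial template is intentionally left out of this sketch.
        },
    }


experiment = build_experiment("random-example", "my-namespace")
print(json.dumps(experiment, indent=2))
```

The printed JSON is only a starting point; a real Experiment is normally written as YAML and applied with kubectl.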

Suggestion

A Suggestion is a set of hyperparameter values that the hyperparameter tuning process has proposed. Katib creates a Trial to evaluate the suggested set of values.

Katib Suggestion is defined as a CRD.

Trial

A Trial is one iteration of the hyperparameter tuning process. A Trial corresponds to one worker job instance with a list of parameter assignments. The list of parameter assignments corresponds to a Suggestion.

Each Experiment runs several Trials. The Experiment runs the Trials until it reaches either the objective or the configured maximum number of Trials.
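
The stopping rule described above can be sketched as a small loop. This is a simplified, sequential illustration and not the Katib controller's logic: the function run_experiment, the single parameter "lr", and the random sampling are assumptions made for the example, and a maximization objective is assumed.

```python
import random


def run_experiment(evaluate, goal, max_trials, seed=0):
    """Run trials until the goal is reached or max_trials is exhausted."""
    rng = random.Random(seed)
    trials = []
    best = None
    while len(trials) < max_trials:
        # Stand-in for a Suggestion: one set of parameter assignments.
        assignment = {"lr": rng.uniform(0.01, 0.05)}
        # Stand-in for a worker job: compute the objective value.
        value = evaluate(assignment)
        trials.append((assignment, value))
        if best is None or value > best:
            best = value
        # Stop early once the objective goal is reached.
        if best >= goal:
            break
    return trials, best


# An unreachable goal: the loop stops at the maximum number of trials.
trials, best = run_experiment(lambda a: 1.0 - a["lr"], goal=2.0, max_trials=5)
print(len(trials))  # 5

# A goal that the first trial already meets: the loop stops after one trial.
trials, best = run_experiment(lambda a: 1.0 - a["lr"], goal=0.5, max_trials=5)
print(len(trials))  # 1
```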

Katib Trial is defined as a CRD.

Worker Job

The Worker Job is the process that runs to evaluate a Trial and calculate its objective value.

The Worker Job can be any type of Kubernetes resource or Kubernetes CRD. Follow the Trial template guide to support your own Kubernetes resource in Katib.
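
To illustrate how a Trial's parameter assignments might end up in a Worker Job definition, here is a hedged sketch of template substitution. The placeholder syntax ${trialParameters.<name>} is modeled on the convention I believe Katib Trial templates use, and the function render_trial_spec, the command line, and the parameter names are hypothetical. This is not Katib's implementation; consult the Trial template guide for the actual rules.

```python
import re

# Matches placeholders such as ${trialParameters.learningRate}.
PLACEHOLDER = re.compile(r"\$\{trialParameters\.([A-Za-z0-9_-]+)\}")


def render_trial_spec(template, assignments):
    """Replace each placeholder with the matching parameter assignment."""

    def substitute(match):
        name = match.group(1)
        if name not in assignments:
            raise KeyError(f"no assignment for trial parameter {name!r}")
        return str(assignments[name])

    return PLACEHOLDER.sub(substitute, template)


template = (
    "python train.py "
    "--lr=${trialParameters.learningRate} "
    "--layers=${trialParameters.numLayers}"
)
rendered = render_trial_spec(template, {"learningRate": 0.03, "numLayers": 4})
print(rendered)  # python train.py --lr=0.03 --layers=4
```

Raising an error on a missing assignment, rather than leaving the placeholder in place, makes a misconfigured template fail before any job is created.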

Katib has these CRD examples in upstream:

Thus, Katib supports multiple frameworks with the help of different job kinds.

Search Algorithms

Katib currently supports several search algorithms. See the Kubeflow documentation to learn more about each algorithm.

Hyperparameter Tuning

Components in Katib

Katib consists of several components, listed below. Each component runs on Kubernetes as a Deployment. The components communicate with each other via gRPC, and the API is defined in pkg/apis/manager/v1beta1/api.proto.

  • Katib main components:
    • katib-db-manager - the gRPC API server of Katib, which serves as the DB interface.
    • katib-mysql - the data storage backend of Katib, using MySQL.
    • katib-ui - the user interface of Katib.
    • katib-controller - the controller for the Katib CRDs in Kubernetes.

Web UI

Katib provides a Web UI. During the 1.3 release, we worked on a new iteration of the UI, which is rewritten in Angular and uses the common code of the other Kubeflow dashboards.

Users can currently list, delete, and create Experiments in their cluster via this new UI, as well as inspect the owned Trials. Two important missing features are the ability to edit the Trial template ConfigMaps and the ability to view Neural Architecture Search models. Check this Project to monitor the current progress.

To use the old Katib UI, update the Katib UI image newName in the Kustomize manifests to the previous image tag, docker.io/kubeflowkatib/katib-ui:v0.11.1.

gRPC API documentation

Check the Katib v1beta1 API reference docs.

Installation

For standard installation of Katib with support for all job operators, install Kubeflow. Follow the documentation:

If you install Katib with other Kubeflow components, you can't submit Katib jobs in the Kubeflow namespace. Check the Kubeflow documentation to learn more.

Alternatively, if you want to install Katib manually with TF and PyTorch operator support, follow these steps:

Create Kubeflow namespace:

kubectl create namespace kubeflow

Clone Kubeflow manifest repository:

git clone -b v1.2-branch git@github.com:kubeflow/manifests.git

Set MANIFESTS_DIR to the cloned folder:

export MANIFESTS_DIR=<cloned-folder>

TF operator

To install the TF operator, run the following:

cd "${MANIFESTS_DIR}/tf-training/tf-job-crds/base"
kustomize build . | kubectl apply -f -
cd "${MANIFESTS_DIR}/tf-training/tf-job-operator/base"
kustomize build . | kubectl apply -f -

PyTorch operator

To install the PyTorch operator, run the following:

cd "${MANIFESTS_DIR}/pytorch-job/pytorch-job-crds/base"
kustomize build . | kubectl apply -f -
cd "${MANIFESTS_DIR}/pytorch-job/pytorch-operator/base/"
kustomize build . | kubectl apply -f -

Katib

Note that your kustomize version should be >= 3.2. To install Katib, clone the repository and run make deploy from its root:

git clone git@github.com:kubeflow/katib.git
cd katib
make deploy

Check if all components are running successfully:

kubectl get pods -n kubeflow

Expected output:

NAME                                READY   STATUS    RESTARTS   AGE
katib-controller-858d6cc48c-df9jc   1/1     Running   1          20m
katib-db-manager-7966fbdf9b-w2tn8   1/1     Running   0          20m
katib-mysql-7f8bc6956f-898f9        1/1     Running   0          20m
katib-ui-7cf9f967bf-nm72p           1/1     Running   0          20m
pytorch-operator-55f966b548-9gq9v   1/1     Running   0          20m
tf-job-operator-796b4747d8-4fh82    1/1     Running   0          21m
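
Checking this table by eye is easy for six pods, but the same check can be scripted. The sketch below parses text in the column layout shown above and reports pods that are not fully ready. The function not_ready_pods is a hypothetical helper, and the sample text, including its "Pending" row, is made up for illustration.

```python
def not_ready_pods(kubectl_output):
    """Return names of pods that are not Running or not fully ready."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    problems = []
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            problems.append(name)
    return problems


sample = """
NAME                                READY   STATUS    RESTARTS   AGE
katib-controller-858d6cc48c-df9jc   1/1     Running   1          20m
katib-db-manager-7966fbdf9b-w2tn8   1/1     Running   0          20m
katib-mysql-7f8bc6956f-898f9        0/1     Pending   0          20m
"""
print(not_ready_pods(sample))  # ['katib-mysql-7f8bc6956f-898f9']
```

In practice the input text could come from running kubectl get pods -n kubeflow through subprocess.run with capture_output=True and text=True.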

Running examples

After deploying everything, you can run examples to verify the installation.

This is an example for the TF operator:

kubectl create -f https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/tfjob-example.yaml

This is an example for the PyTorch operator:

kubectl create -f https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/pytorchjob-example.yaml

Check the Kubeflow documentation to learn how to monitor your Experiment status.

You can view your results in the Katib UI. If you used the standard installation, access the Katib UI via the Kubeflow dashboard. Otherwise, port-forward the katib-ui service:

kubectl -n kubeflow port-forward svc/katib-ui 8080:80

You can access the Katib UI using this URL: http://localhost:8080/katib/.

Katib SDK

Katib supports Python SDK:

Run make generate to update the Katib SDK.

Cleanups

To delete the installed TF and PyTorch operators, run kubectl delete -f on the respective folders.

To delete Katib, run make undeploy.

Quick Start

Please follow the Kubeflow documentation to submit your first Katib experiment.

Community

We are always growing our community and invite new users and AutoML enthusiasts to contribute to the Katib project. The following links provide information about getting involved in the community:

Blog posts

Events

Contributing

Please feel free to test the system! developer-guide.md is a good starting point for developers.

Citation

If you use Katib in a scientific publication, we would appreciate citations to the following paper:

A Scalable and Cloud-Native Hyperparameter Tuning System, George et al., arXiv:2006.02085, 2020.

Bibtex entry:

@misc{george2020katib,
    title={A Scalable and Cloud-Native Hyperparameter Tuning System},
    author={Johnu George and Ce Gao and Richard Liu and Hou Gang Liu and Yuan Tang and Ramdoot Pydipaty and Amit Kumar Saha},
    year={2020},
    eprint={2006.02085},
    archivePrefix={arXiv},
    primaryClass={cs.DC}
}

Directories

Path Synopsis
cmd
  katib-controller/v1beta1 - Katib-controller is a controller (operator) for Experiments and Trials
hack
pkg
  apis/controller - Package apis contains Kubernetes API groups.
  apis/controller/common - Package experiment contains experiment API versions
  apis/controller/common/v1beta1 - Package v1beta1 contains API Schema definitions for the common v1beta1 API group
  apis/controller/experiments - Package experiments contains experiment API versions
  apis/controller/experiments/v1beta1 - Package v1beta1 contains API Schema definitions for the experiment v1beta1 API group
  apis/controller/suggestions - Package suggestions contains suggestion API versions
  apis/controller/suggestions/v1beta1 - Package v1beta1 contains API Schema definitions for the suggestion v1beta1 API group
  apis/controller/trials - Package trials contains trial API versions
  apis/controller/trials/v1beta1 - Package v1beta1 contains API Schema definitions for the trial v1beta1 API group
  apis/manager/health - Package grpc_health_v1 is a generated protocol buffer package.
  apis/manager/v1beta1 - Package api_v1_beta1 is a generated protocol buffer package.
  client/controller/clientset/versioned - This package has the automatically generated clientset.
  client/controller/clientset/versioned/fake - This package has the automatically generated fake clientset.
  client/controller/clientset/versioned/scheme - This package contains the scheme of the automatically generated clientset.
  client/controller/clientset/versioned/typed/common/v1beta1 - This package has the automatically generated typed clients.
  client/controller/clientset/versioned/typed/common/v1beta1/fake - Package fake has the automatically generated clients.
  client/controller/clientset/versioned/typed/experiments/v1beta1 - This package has the automatically generated typed clients.
  client/controller/clientset/versioned/typed/experiments/v1beta1/fake - Package fake has the automatically generated clients.
  client/controller/clientset/versioned/typed/suggestions/v1beta1 - This package has the automatically generated typed clients.
  client/controller/clientset/versioned/typed/suggestions/v1beta1/fake - Package fake has the automatically generated clients.
  client/controller/clientset/versioned/typed/trials/v1beta1 - This package has the automatically generated typed clients.
  client/controller/clientset/versioned/typed/trials/v1beta1/fake - Package fake has the automatically generated clients.
  mock/v1beta1/api - Package mock is a generated GoMock package.
  mock/v1beta1/db - Package mock is a generated GoMock package.
  mock/v1beta1/experiment/manifest - Package mock is a generated GoMock package.
  mock/v1beta1/experiment/suggestion - Package mock is a generated GoMock package.
  mock/v1beta1/suggestion/suggestionclient - Package mock is a generated GoMock package.
  mock/v1beta1/trial/managerclient - Package mock is a generated GoMock package.
  mock/v1beta1/util/katibclient - Package mock is a generated GoMock package.
test
