v1

package

v0.0.0-...-ebf0d3c | Published: Oct 23, 2024 | License: Apache-2.0 | Imports: 15 | Imported by: 1

Repository

gitee.com/vak80/training-operator

Documentation ¶

Overview ¶

Package v1 is the v1 version of the API. It contains the API schema definitions for the kubeflow.org v1 API group.

Code-generation markers: +kubebuilder:object:generate=true, +groupName=kubeflow.org

Index ¶

Constants
Variables
func GetOpenAPIDefinitions(ref common.ReferenceCallback) map[string]common.OpenAPIDefinition
func IsChieforMaster(typ commonv1.ReplicaType) bool
func IsEvaluator(typ commonv1.ReplicaType) bool
func IsScheduler(typ commonv1.ReplicaType) bool
func IsWorker(typ commonv1.ReplicaType) bool
func RegisterDefaults(scheme *runtime.Scheme) error
func Resource(resource string) schema.GroupResource
func SetDefaults_MPIJob(mpiJob *MPIJob)
func SetDefaults_MXJob(mxjob *MXJob)
func SetDefaults_PaddleJob(job *PaddleJob)
func SetDefaults_PyTorchJob(job *PyTorchJob)
func SetDefaults_TFJob(tfJob *TFJob)
func SetDefaults_XGBoostJob(xgboostJob *XGBoostJob)
func SetObjectDefaults_MPIJob(in *MPIJob)
func SetObjectDefaults_MPIJobList(in *MPIJobList)
func SetObjectDefaults_MXJob(in *MXJob)
func SetObjectDefaults_MXJobList(in *MXJobList)
func SetObjectDefaults_PaddleJob(in *PaddleJob)
func SetObjectDefaults_PaddleJobList(in *PaddleJobList)
func SetObjectDefaults_PyTorchJob(in *PyTorchJob)
func SetObjectDefaults_PyTorchJobList(in *PyTorchJobList)
func SetObjectDefaults_TFJob(in *TFJob)
func SetObjectDefaults_TFJobList(in *TFJobList)
func SetObjectDefaults_XGBoostJob(in *XGBoostJob)
func SetObjectDefaults_XGBoostJobList(in *XGBoostJobList)
func ValidateV1MXJob(mxJob *MXJob) error
func ValidateV1MpiJobSpec(c *MPIJobSpec) error
func ValidateV1PaddleJob(paddleJob *PaddleJob) error
func ValidateV1PyTorchJob(pytorchJob *PyTorchJob) error
func ValidateV1TFJob(tfjob *TFJob) error
func ValidateV1XGBoostJob(xgboostJob *XGBoostJob) error
type ElasticPolicy
- func (in *ElasticPolicy) DeepCopy() *ElasticPolicy
- func (in *ElasticPolicy) DeepCopyInto(out *ElasticPolicy)
type JobModeType
type MPIJob
- func (in *MPIJob) DeepCopy() *MPIJob
- func (in *MPIJob) DeepCopyInto(out *MPIJob)
- func (in *MPIJob) DeepCopyObject() runtime.Object
type MPIJobList
- func (in *MPIJobList) DeepCopy() *MPIJobList
- func (in *MPIJobList) DeepCopyInto(out *MPIJobList)
- func (in *MPIJobList) DeepCopyObject() runtime.Object
type MPIJobSpec
- func (in *MPIJobSpec) DeepCopy() *MPIJobSpec
- func (in *MPIJobSpec) DeepCopyInto(out *MPIJobSpec)
type MXJob
- func (in *MXJob) DeepCopy() *MXJob
- func (in *MXJob) DeepCopyInto(out *MXJob)
- func (in *MXJob) DeepCopyObject() runtime.Object
type MXJobList
- func (in *MXJobList) DeepCopy() *MXJobList
- func (in *MXJobList) DeepCopyInto(out *MXJobList)
- func (in *MXJobList) DeepCopyObject() runtime.Object
type MXJobSpec
- func (in *MXJobSpec) DeepCopy() *MXJobSpec
- func (in *MXJobSpec) DeepCopyInto(out *MXJobSpec)
type MXJobStatus
- func (in *MXJobStatus) DeepCopy() *MXJobStatus
- func (in *MXJobStatus) DeepCopyInto(out *MXJobStatus)
type PaddleElasticPolicy
- func (in *PaddleElasticPolicy) DeepCopy() *PaddleElasticPolicy
- func (in *PaddleElasticPolicy) DeepCopyInto(out *PaddleElasticPolicy)
type PaddleJob
- func (in *PaddleJob) DeepCopy() *PaddleJob
- func (in *PaddleJob) DeepCopyInto(out *PaddleJob)
- func (in *PaddleJob) DeepCopyObject() runtime.Object
type PaddleJobList
- func (in *PaddleJobList) DeepCopy() *PaddleJobList
- func (in *PaddleJobList) DeepCopyInto(out *PaddleJobList)
- func (in *PaddleJobList) DeepCopyObject() runtime.Object
type PaddleJobSpec
- func (in *PaddleJobSpec) DeepCopy() *PaddleJobSpec
- func (in *PaddleJobSpec) DeepCopyInto(out *PaddleJobSpec)
type PyTorchJob
- func (in *PyTorchJob) DeepCopy() *PyTorchJob
- func (in *PyTorchJob) DeepCopyInto(out *PyTorchJob)
- func (in *PyTorchJob) DeepCopyObject() runtime.Object
type PyTorchJobList
- func (in *PyTorchJobList) DeepCopy() *PyTorchJobList
- func (in *PyTorchJobList) DeepCopyInto(out *PyTorchJobList)
- func (in *PyTorchJobList) DeepCopyObject() runtime.Object
type PyTorchJobSpec
- func (in *PyTorchJobSpec) DeepCopy() *PyTorchJobSpec
- func (in *PyTorchJobSpec) DeepCopyInto(out *PyTorchJobSpec)
type RDZVBackend
type RDZVConf
- func (in *RDZVConf) DeepCopy() *RDZVConf
- func (in *RDZVConf) DeepCopyInto(out *RDZVConf)
type SuccessPolicy
type TFJob
- func (in *TFJob) DeepCopy() *TFJob
- func (in *TFJob) DeepCopyInto(out *TFJob)
- func (in *TFJob) DeepCopyObject() runtime.Object
type TFJobList
- func (in *TFJobList) DeepCopy() *TFJobList
- func (in *TFJobList) DeepCopyInto(out *TFJobList)
- func (in *TFJobList) DeepCopyObject() runtime.Object
type TFJobSpec
- func (in *TFJobSpec) DeepCopy() *TFJobSpec
- func (in *TFJobSpec) DeepCopyInto(out *TFJobSpec)
type XGBoostJob
- func (in *XGBoostJob) DeepCopy() *XGBoostJob
- func (in *XGBoostJob) DeepCopyInto(out *XGBoostJob)
- func (in *XGBoostJob) DeepCopyObject() runtime.Object
type XGBoostJobList
- func (in *XGBoostJobList) DeepCopy() *XGBoostJobList
- func (in *XGBoostJobList) DeepCopyInto(out *XGBoostJobList)
- func (in *XGBoostJobList) DeepCopyObject() runtime.Object
type XGBoostJobSpec
- func (in *XGBoostJobSpec) DeepCopy() *XGBoostJobSpec
- func (in *XGBoostJobSpec) DeepCopyInto(out *XGBoostJobSpec)

Constants ¶

const (
	// MPIJobDefaultPortName is the name of the port used to communicate between Master and Workers.
	MPIJobDefaultPortName = "mpi-port"
	// MPIJobDefaultPort is the default value of the port.
	MPIJobDefaultPort = 9999
	// MPIJobDefaultContainerName is the name of the MPIJob container.
	MPIJobDefaultContainerName = "mpi"
	// MPIJobDefaultRestartPolicy is the default RestartPolicy for ReplicaSpec.
	MPIJobDefaultRestartPolicy = commonv1.RestartPolicyNever
	// MPIJobKind is the kind name.
	MPIJobKind = "MPIJob"
	// MPIJobPlural is the plural for MPIJob.
	MPIJobPlural = "mpijobs"
	// MPIJobSingular is the singular for MPIJob.
	MPIJobSingular = "mpijob"
	// MPIJobFrameworkName is the name of the ML framework.
	MPIJobFrameworkName = "mpi"
	// MPIJobReplicaTypeLauncher is the type for launcher replica.
	MPIJobReplicaTypeLauncher commonv1.ReplicaType = "Launcher"
	// MPIJobReplicaTypeWorker is the type for worker replicas.
	MPIJobReplicaTypeWorker commonv1.ReplicaType = "Worker"
)

const (
	// MXJobDefaultPortName is name of the port used to communicate between scheduler and
	// servers & workers.
	MXJobDefaultPortName = "mxjob-port"
	// MXJobDefaultContainerName is the name of the MXJob container.
	MXJobDefaultContainerName = "mxnet"
	// MXJobDefaultPort is default value of the port.
	MXJobDefaultPort = 9091
	// MXJobDefaultRestartPolicy is default RestartPolicy for MXReplicaSpec.
	MXJobDefaultRestartPolicy = commonv1.RestartPolicyNever
	// MXJobKind is the kind name.
	MXJobKind = "MXJob"
	// MXJobPlural is the plural for MXJob.
	MXJobPlural = "mxjobs"
	// MXJobSingular is the singular for MXJob.
	MXJobSingular = "mxjob"
	// MXJobFrameworkName is the name of the ML Framework
	MXJobFrameworkName = "mxnet"
	// MXJobReplicaTypeScheduler is the type for scheduler replica in MXNet.
	MXJobReplicaTypeScheduler commonv1.ReplicaType = "Scheduler"

	// MXJobReplicaTypeServer is the type for parameter servers of distributed MXNet.
	MXJobReplicaTypeServer commonv1.ReplicaType = "Server"

	// MXJobReplicaTypeWorker is the type for workers of distributed MXNet.
	// This is also used for non-distributed MXNet.
	MXJobReplicaTypeWorker commonv1.ReplicaType = "Worker"

	// MXJobReplicaTypeTunerTracker is the type for the auto-tuning tracker (e.g. the
	// autotvm tracker), which dispatches tuning tasks to the TunerServer.
	MXJobReplicaTypeTunerTracker commonv1.ReplicaType = "TunerTracker"

	// MXJobReplicaTypeTunerServer is the type for the tuner server, which receives
	// tuning tasks from the TunerTracker.
	MXJobReplicaTypeTunerServer commonv1.ReplicaType = "TunerServer"

	// MXJobReplicaTypeTuner is the type for auto-tuning of distributed MXNet.
	// This is also used for non-distributed MXNet.
	MXJobReplicaTypeTuner commonv1.ReplicaType = "Tuner"
)

const (
	// PaddleJobDefaultPortName is name of the port used to communicate between Master and
	// workers.
	PaddleJobDefaultPortName = "master"
	// PaddleJobDefaultContainerName is the name of the PaddleJob container.
	PaddleJobDefaultContainerName = "paddle"
	// PaddleJobDefaultPort is default value of the port.
	PaddleJobDefaultPort = 36543
	// PaddleJobDefaultRestartPolicy is default RestartPolicy for PaddleReplicaSpec.
	PaddleJobDefaultRestartPolicy = commonv1.RestartPolicyOnFailure
	// PaddleJobKind is the kind name.
	PaddleJobKind = "PaddleJob"
	// PaddleJobPlural is the plural for PaddleJob.
	PaddleJobPlural = "paddlejobs"
	// PaddleJobSingular is the singular for paddleJob.
	PaddleJobSingular = "paddlejob"
	// PaddleJobFrameworkName is the name of the ML Framework
	PaddleJobFrameworkName = "paddle"
	// PaddleJobReplicaTypeMaster is the type of Master of distributed Paddle
	PaddleJobReplicaTypeMaster commonv1.ReplicaType = "Master"
	// PaddleJobReplicaTypeWorker is the type for workers of distributed Paddle.
	PaddleJobReplicaTypeWorker commonv1.ReplicaType = "Worker"
)

const (
	// PytorchJobDefaultPortName is name of the port used to communicate between Master and
	// workers.
	PytorchJobDefaultPortName = "pytorchjob-port"
	// PytorchJobDefaultContainerName is the name of the PyTorchJob container.
	PytorchJobDefaultContainerName = "pytorch"
	// PytorchJobDefaultPort is default value of the port.
	PytorchJobDefaultPort = 23456
	// PytorchJobDefaultRestartPolicy is default RestartPolicy for PyTorchReplicaSpec.
	PytorchJobDefaultRestartPolicy = commonv1.RestartPolicyOnFailure
	// PytorchJobKind is the kind name.
	PytorchJobKind = "PyTorchJob"
	// PytorchJobPlural is the plural for PyTorchJob.
	PytorchJobPlural = "pytorchjobs"
	// PytorchJobSingular is the singular for pytorchJob.
	PytorchJobSingular = "pytorchjob"
	// PytorchJobFrameworkName is the name of the ML Framework
	PytorchJobFrameworkName = "pytorch"
	// PyTorchJobReplicaTypeMaster is the type of Master of distributed PyTorch
	PyTorchJobReplicaTypeMaster commonv1.ReplicaType = "Master"
	// PyTorchJobReplicaTypeWorker is the type for workers of distributed PyTorch.
	PyTorchJobReplicaTypeWorker commonv1.ReplicaType = "Worker"
)

const (
	// TFJobDefaultPortName is name of the port used to communicate between PS and
	// workers.
	TFJobDefaultPortName = "tfjob-port"
	// TFJobDefaultContainerName is the name of the TFJob container.
	TFJobDefaultContainerName = "tensorflow"
	// TFJobDefaultPort is default value of the port.
	TFJobDefaultPort = 2222
	// TFJobDefaultRestartPolicy is default RestartPolicy for TFReplicaSpec.
	TFJobDefaultRestartPolicy = commonv1.RestartPolicyNever
	// TFJobKind is the kind name.
	TFJobKind = "TFJob"
	// TFJobPlural is the plural for TFJob.
	TFJobPlural = "tfjobs"
	// TFJobSingular is the singular for TFJob.
	TFJobSingular = "tfjob"
	// TFJobFrameworkName is the name of the ML Framework
	TFJobFrameworkName = "tensorflow"
)

const (
	// TFJobReplicaTypePS is the type for parameter servers of distributed TensorFlow.
	TFJobReplicaTypePS commonv1.ReplicaType = "PS"

	// TFJobReplicaTypeWorker is the type for workers of distributed TensorFlow.
	// This is also used for non-distributed TensorFlow.
	TFJobReplicaTypeWorker commonv1.ReplicaType = "Worker"

	// TFJobReplicaTypeChief is the type for chief worker of distributed TensorFlow.
	// If there is "chief" replica type, it's the "chief worker".
	// Else, worker:0 is the chief worker.
	TFJobReplicaTypeChief commonv1.ReplicaType = "Chief"

	// TFJobReplicaTypeMaster is the type for master worker of distributed TensorFlow.
	// This is similar to chief, and kept just for backwards compatibility.
	TFJobReplicaTypeMaster commonv1.ReplicaType = "Master"

	// TFJobReplicaTypeEval is the type for evaluation replica in TensorFlow.
	TFJobReplicaTypeEval commonv1.ReplicaType = "Evaluator"
)
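
The chief-selection rule described in the comments above can be sketched as follows. This is a minimal illustration, not the operator's own logic: `ReplicaType` is a local stand-in for `commonv1.ReplicaType`, `chiefOf` is a hypothetical helper, and treating a `Master` replica like a `Chief` is an assumption based on the "similar to chief" comment.

```go
package main

import "fmt"

// ReplicaType is a local stand-in for commonv1.ReplicaType.
type ReplicaType string

const (
	TFJobReplicaTypeWorker ReplicaType = "Worker"
	TFJobReplicaTypeChief  ReplicaType = "Chief"
	TFJobReplicaTypeMaster ReplicaType = "Master"
)

// chiefOf is a hypothetical helper. Given the replica count per type, it
// returns the replica type and index that act as the chief worker.
func chiefOf(replicas map[ReplicaType]int) (ReplicaType, int, bool) {
	// A dedicated Chief (or legacy Master) replica takes precedence.
	for _, typ := range []ReplicaType{TFJobReplicaTypeChief, TFJobReplicaTypeMaster} {
		if replicas[typ] > 0 {
			return typ, 0, true
		}
	}
	// Otherwise worker 0 is the chief worker.
	if replicas[TFJobReplicaTypeWorker] > 0 {
		return TFJobReplicaTypeWorker, 0, true
	}
	return "", 0, false
}

func main() {
	typ, idx, ok := chiefOf(map[ReplicaType]int{"PS": 1, "Worker": 2})
	fmt.Println(typ, idx, ok)
}
```

With only PS and Worker replicas present, the sketch falls back to worker 0; adding a Chief entry to the map changes the result to the Chief replica.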

const (
	// XGBoostJobDefaultPortName is name of the port used to communicate between Master and Workers.
	XGBoostJobDefaultPortName = "xgboostjob-port"
	// XGBoostJobDefaultContainerName is the name of the XGBoostJob container.
	XGBoostJobDefaultContainerName = "xgboost"
	// XGBoostJobDefaultPort is default value of the port.
	XGBoostJobDefaultPort = 9999
	// XGBoostJobDefaultRestartPolicy is default RestartPolicy for XGBReplicaSpecs.
	XGBoostJobDefaultRestartPolicy = commonv1.RestartPolicyNever
	// XGBoostJobKind is the kind name.
	XGBoostJobKind = "XGBoostJob"
	// XGBoostJobPlural is the plural for XGBoostJob.
	XGBoostJobPlural = "xgboostjobs"
	// XGBoostJobSingular is the singular for XGBoostJob.
	XGBoostJobSingular = "xgboostjob"
	// XGBoostJobFrameworkName is the name of the ML Framework
	XGBoostJobFrameworkName = "xgboost"
	// XGBoostJobReplicaTypeMaster is the type for master replica.
	XGBoostJobReplicaTypeMaster commonv1.ReplicaType = "Master"
	// XGBoostJobReplicaTypeWorker is the type for worker replicas.
	XGBoostJobReplicaTypeWorker commonv1.ReplicaType = "Worker"
)

Variables ¶

var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "kubeflow.org", Version: "v1"}

	MPIJobSchemeGroupVersionKind = schema.GroupVersionKind{Group: "kubeflow.org", Version: "v1", Kind: MPIJobKind}

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)

var SchemeGroupVersion = GroupVersion

SchemeGroupVersion is group version used to register these objects.

Functions ¶

func GetOpenAPIDefinitions ¶

func GetOpenAPIDefinitions(ref common.ReferenceCallback) map[string]common.OpenAPIDefinition

func IsChieforMaster ¶

func IsChieforMaster(typ commonv1.ReplicaType) bool

IsChieforMaster returns true if the type is Master or Chief.

func IsEvaluator ¶

func IsEvaluator(typ commonv1.ReplicaType) bool

IsEvaluator returns true if the type is Evaluator.

func IsScheduler ¶

func IsScheduler(typ commonv1.ReplicaType) bool

IsScheduler returns true if the type is Scheduler.

func IsWorker ¶

func IsWorker(typ commonv1.ReplicaType) bool

IsWorker returns true if the type is Worker.
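
The four predicates above are simple checks on the replica type. A minimal sketch of how they might behave, assuming plain equality against the replica-type strings listed under Constants; `ReplicaType` is a local stand-in for `commonv1.ReplicaType` and the lowercase function names are hypothetical, not the package's own implementations:

```go
package main

import "fmt"

// ReplicaType is a local stand-in for commonv1.ReplicaType.
type ReplicaType string

const (
	chief     ReplicaType = "Chief"
	master    ReplicaType = "Master"
	evaluator ReplicaType = "Evaluator"
	scheduler ReplicaType = "Scheduler"
	worker    ReplicaType = "Worker"
)

// Hypothetical equivalents of IsChieforMaster, IsEvaluator, IsScheduler, and
// IsWorker, assuming exact string comparison.
func isChiefOrMaster(typ ReplicaType) bool { return typ == chief || typ == master }
func isEvaluator(typ ReplicaType) bool     { return typ == evaluator }
func isScheduler(typ ReplicaType) bool     { return typ == scheduler }
func isWorker(typ ReplicaType) bool        { return typ == worker }

func main() {
	for _, typ := range []ReplicaType{chief, master, evaluator, scheduler, worker, "PS"} {
		fmt.Printf("%-9s chiefOrMaster=%t evaluator=%t scheduler=%t worker=%t\n",
			typ, isChiefOrMaster(typ), isEvaluator(typ), isScheduler(typ), isWorker(typ))
	}
}
```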

func RegisterDefaults ¶

func RegisterDefaults(scheme *runtime.Scheme) error

RegisterDefaults adds defaulter functions to the given scheme. It is public to allow building arbitrary schemes. All generated defaulters are covering: they call all nested defaulters.

func Resource ¶

func Resource(resource string) schema.GroupResource

Resource takes an unqualified resource and returns a Group-qualified GroupResource.
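
To illustrate what "Group-qualified" means here, the sketch below reproduces the idea with local stand-ins for `schema.GroupVersion` and `schema.GroupResource`. The `resource` function and the `String` method are written for this example and are assumptions about the behavior, not the package's code:

```go
package main

import "fmt"

// GroupVersion and GroupResource are local stand-ins for the
// k8s.io/apimachinery schema types of the same names.
type GroupVersion struct{ Group, Version string }
type GroupResource struct{ Group, Resource string }

// String renders the resource as "<resource>.<group>", or just the resource
// name when the group is empty.
func (gr GroupResource) String() string {
	if gr.Group == "" {
		return gr.Resource
	}
	return gr.Resource + "." + gr.Group
}

// groupVersion mirrors the GroupVersion variable documented above.
var groupVersion = GroupVersion{Group: "kubeflow.org", Version: "v1"}

// resource is a hypothetical equivalent of Resource: it qualifies an
// unqualified resource name with the package's API group.
func resource(name string) GroupResource {
	return GroupResource{Group: groupVersion.Group, Resource: name}
}

func main() {
	fmt.Println(resource("pytorchjobs")) // prints "pytorchjobs.kubeflow.org"
}
```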

func SetDefaults_MPIJob ¶

func SetDefaults_MPIJob(mpiJob *MPIJob)

func SetDefaults_MXJob ¶

func SetDefaults_MXJob(mxjob *MXJob)

SetDefaults_MXJob sets any unspecified values to defaults.

func SetDefaults_PaddleJob ¶

func SetDefaults_PaddleJob(job *PaddleJob)

SetDefaults_PaddleJob sets any unspecified values to defaults.

func SetDefaults_PyTorchJob ¶

func SetDefaults_PyTorchJob(job *PyTorchJob)

SetDefaults_PyTorchJob sets any unspecified values to defaults.

func SetDefaults_TFJob ¶

func SetDefaults_TFJob(tfJob *TFJob)

SetDefaults_TFJob sets any unspecified values to defaults.

func SetDefaults_XGBoostJob ¶

func SetDefaults_XGBoostJob(xgboostJob *XGBoostJob)

SetDefaults_XGBoostJob sets any unspecified values to defaults.
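
The phrase "sets any unspecified values to defaults" can be pictured with the sketch below. Which fields the real defaulters touch is not documented on this page, so everything here is an assumption: `ReplicaSpec` is a simplified stand-in, `setDefaults` is a hypothetical helper, and the default of one replica and the `"OnFailure"` string are illustrative values.

```go
package main

import "fmt"

// RestartPolicy and ReplicaSpec are simplified local stand-ins for the
// commonv1 types; the real types carry more fields.
type RestartPolicy string

type ReplicaSpec struct {
	Replicas      *int32
	RestartPolicy RestartPolicy
}

// defaultRestartPolicy is an assumed value used only in this sketch.
const defaultRestartPolicy RestartPolicy = "OnFailure"

// setDefaults is a hypothetical defaulter: it fills in only the fields the
// caller left unset and never overwrites explicit values.
func setDefaults(spec *ReplicaSpec, policy RestartPolicy) {
	if spec.Replicas == nil {
		one := int32(1)
		spec.Replicas = &one
	}
	if spec.RestartPolicy == "" {
		spec.RestartPolicy = policy
	}
}

func main() {
	spec := &ReplicaSpec{}
	setDefaults(spec, defaultRestartPolicy)
	fmt.Println(*spec.Replicas, spec.RestartPolicy)
}
```

Using a pointer for `Replicas` is what lets the defaulter tell "unset" apart from an explicit zero.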

func SetObjectDefaults_MPIJob ¶

func SetObjectDefaults_MPIJob(in *MPIJob)

func SetObjectDefaults_MPIJobList ¶

func SetObjectDefaults_MPIJobList(in *MPIJobList)

func SetObjectDefaults_MXJob ¶

func SetObjectDefaults_MXJob(in *MXJob)

func SetObjectDefaults_MXJobList ¶

func SetObjectDefaults_MXJobList(in *MXJobList)

func SetObjectDefaults_PaddleJob ¶

func SetObjectDefaults_PaddleJob(in *PaddleJob)

func SetObjectDefaults_PaddleJobList ¶

func SetObjectDefaults_PaddleJobList(in *PaddleJobList)

func SetObjectDefaults_PyTorchJob ¶

func SetObjectDefaults_PyTorchJob(in *PyTorchJob)

func SetObjectDefaults_PyTorchJobList ¶

func SetObjectDefaults_PyTorchJobList(in *PyTorchJobList)

func SetObjectDefaults_TFJob ¶

func SetObjectDefaults_TFJob(in *TFJob)

func SetObjectDefaults_TFJobList ¶

func SetObjectDefaults_TFJobList(in *TFJobList)

func SetObjectDefaults_XGBoostJob ¶

func SetObjectDefaults_XGBoostJob(in *XGBoostJob)

func SetObjectDefaults_XGBoostJobList ¶

func SetObjectDefaults_XGBoostJobList(in *XGBoostJobList)

func ValidateV1MXJob ¶

func ValidateV1MXJob(mxJob *MXJob) error

ValidateV1MXJob checks that the kubeflowv1.MXJobSpec is valid.

func ValidateV1MpiJobSpec ¶

func ValidateV1MpiJobSpec(c *MPIJobSpec) error

func ValidateV1PaddleJob ¶

func ValidateV1PaddleJob(paddleJob *PaddleJob) error

func ValidateV1PyTorchJob ¶

func ValidateV1PyTorchJob(pytorchJob *PyTorchJob) error

func ValidateV1TFJob ¶

func ValidateV1TFJob(tfjob *TFJob) error

func ValidateV1XGBoostJob ¶

func ValidateV1XGBoostJob(xgboostJob *XGBoostJob) error

Types ¶

type ElasticPolicy ¶

type ElasticPolicy struct {
	// minReplicas is the lower limit for the number of replicas to which the training job
	// can scale down.  It defaults to null.
	// +optional
	MinReplicas *int32 `json:"minReplicas,omitempty"`
	// upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than MinReplicas, defaults to null.
	// +optional
	MaxReplicas *int32 `json:"maxReplicas,omitempty"`

	RDZVBackend *RDZVBackend `json:"rdzvBackend,omitempty"`
	RDZVPort    *int32       `json:"rdzvPort,omitempty"`
	RDZVHost    *string      `json:"rdzvHost,omitempty"`
	RDZVID      *string      `json:"rdzvId,omitempty"`
	// RDZVConf contains additional rendezvous configuration (<key1>=<value1>,<key2>=<value2>,...).
	RDZVConf []RDZVConf `json:"rdzvConf,omitempty"`
	// Start a local standalone rendezvous backend that is represented by a C10d TCP store
	// on port 29400. Useful when launching single-node, multi-worker job. If specified
	// --rdzv_backend, --rdzv_endpoint, --rdzv_id are auto-assigned; any explicitly set values
	// are ignored.
	Standalone *bool `json:"standalone,omitempty"`
	// Number of workers per node; supported values: [auto, cpu, gpu, int].
	NProcPerNode *int32 `json:"nProcPerNode,omitempty"`

	MaxRestarts *int32 `json:"maxRestarts,omitempty"`

	// Metrics contains the specifications which are used to calculate the
	// desired replica count (the maximum replica count across all metrics will
	// be used).  The desired replica count is calculated by multiplying the
	// ratio between the target value and the current value by the current
	// number of pods. Ergo, metrics used must decrease as the pod count is
	// increased, and vice-versa.  See the individual metric source types for
	// more information about how each type of metric must respond.
	// If not set, the HPA will not be created.
	// +optional
	Metrics []autoscalingv2.MetricSpec `json:"metrics,omitempty"`
}
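
The Metrics comment above describes the replica calculation in words. The sketch below works it through using the formula from the Kubernetes Horizontal Pod Autoscaler documentation, desired = ceil(currentReplicas × currentValue / targetValue), taking the maximum across metrics and clamping to the MinReplicas/MaxReplicas bounds. The `metric` type and `desiredReplicas` function are hypothetical, and the sketch ignores HPA details such as tolerance and stabilization.

```go
package main

import (
	"fmt"
	"math"
)

// metric is a hypothetical, simplified view of one metric reading.
type metric struct{ current, target float64 }

// desiredReplicas applies the HPA scaling formula to each metric, keeps the
// largest result, and clamps it to [minReplicas, maxReplicas].
func desiredReplicas(currentReplicas int32, metrics []metric, minReplicas, maxReplicas int32) int32 {
	if len(metrics) == 0 {
		// Without metrics no autoscaler is created, so nothing changes.
		return currentReplicas
	}
	desired := int32(0)
	for _, m := range metrics {
		d := int32(math.Ceil(float64(currentReplicas) * m.current / m.target))
		if d > desired {
			desired = d
		}
	}
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	metrics := []metric{{current: 90, target: 60}, {current: 50, target: 100}}
	fmt.Println(desiredReplicas(4, metrics, 1, 8))
}
```

In this example the first metric asks for 4 × 90 / 60 = 6 replicas and the second for 4 × 50 / 100 = 2, so the larger value wins before clamping.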

func (*ElasticPolicy) DeepCopy ¶

func (in *ElasticPolicy) DeepCopy() *ElasticPolicy

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ElasticPolicy.

func (*ElasticPolicy) DeepCopyInto ¶

func (in *ElasticPolicy) DeepCopyInto(out *ElasticPolicy)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
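
The DeepCopy/DeepCopyInto pair recurs for every type on this page. The sketch below shows the general shape of such generated functions on a small hypothetical `policy` struct (not a type from this package), to make clear why pointer and slice fields need explicit copying:

```go
package main

import "fmt"

// policy is a hypothetical struct with a pointer field and a slice field,
// loosely modeled on ElasticPolicy.
type policy struct {
	MinReplicas *int32
	Conf        []string
}

// DeepCopyInto copies the receiver into out. in must be non-nil.
func (in *policy) DeepCopyInto(out *policy) {
	*out = *in // copies scalars, but pointers and slices still alias in
	if in.MinReplicas != nil {
		out.MinReplicas = new(int32)
		*out.MinReplicas = *in.MinReplicas
	}
	if in.Conf != nil {
		out.Conf = make([]string, len(in.Conf))
		copy(out.Conf, in.Conf)
	}
}

// DeepCopy returns a new, independent copy of the receiver, or nil for a nil
// receiver.
func (in *policy) DeepCopy() *policy {
	if in == nil {
		return nil
	}
	out := new(policy)
	in.DeepCopyInto(out)
	return out
}

func main() {
	two := int32(2)
	original := &policy{MinReplicas: &two, Conf: []string{"timeout=30"}}
	clone := original.DeepCopy()
	*clone.MinReplicas = 5
	fmt.Println(*original.MinReplicas, *clone.MinReplicas)
}
```

Mutating the clone leaves the original untouched, which is the property controllers rely on when modifying objects read from a shared cache.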

type JobModeType ¶

type JobModeType string

JobModeType is the type for JobMode.

const (
	// MXTrain is the train mode. In this mode the requested MXReplicaSpecs must
	// include Server, Scheduler, and Worker.
	MXTrain JobModeType = "MXTrain"

	// MXTune is the tune mode. In this mode the requested MXReplicaSpecs must
	// include Tuner.
	MXTune JobModeType = "MXTune"
)
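
Read strictly, the comments on the two job modes imply a required set of replica types per mode. The sketch below checks a replica map against that reading. It is not the logic of ValidateV1MXJob, whose rules are not documented on this page; `requiredReplicaTypes` and `missingReplicaTypes` are hypothetical names, and `ReplicaType` is a local stand-in for `commonv1.ReplicaType`.

```go
package main

import "fmt"

// ReplicaType is a local stand-in for commonv1.ReplicaType.
type ReplicaType string

// JobModeType mirrors the type documented above.
type JobModeType string

const (
	MXTrain JobModeType = "MXTrain"
	MXTune  JobModeType = "MXTune"
)

// requiredReplicaTypes encodes a strict reading of the mode comments.
var requiredReplicaTypes = map[JobModeType][]ReplicaType{
	MXTrain: {"Server", "Scheduler", "Worker"},
	MXTune:  {"Tuner"},
}

// missingReplicaTypes returns the replica types the mode requires but the
// given specs (replica type -> replica count) do not contain.
func missingReplicaTypes(mode JobModeType, specs map[ReplicaType]int32) []ReplicaType {
	var missing []ReplicaType
	for _, typ := range requiredReplicaTypes[mode] {
		if _, ok := specs[typ]; !ok {
			missing = append(missing, typ)
		}
	}
	return missing
}

func main() {
	specs := map[ReplicaType]int32{"Scheduler": 1, "Worker": 2}
	fmt.Println(missingReplicaTypes(MXTrain, specs))
}
```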

type MPIJob ¶

type MPIJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              MPIJobSpec         `json:"spec,omitempty"`
	Status            commonv1.JobStatus `json:"status,omitempty"`
}

func (*MPIJob) DeepCopy ¶

func (in *MPIJob) DeepCopy() *MPIJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MPIJob.

func (*MPIJob) DeepCopyInto ¶

func (in *MPIJob) DeepCopyInto(out *MPIJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*MPIJob) DeepCopyObject ¶

func (in *MPIJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type MPIJobList ¶

type MPIJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []MPIJob `json:"items"`
}

func (*MPIJobList) DeepCopy ¶

func (in *MPIJobList) DeepCopy() *MPIJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MPIJobList.

func (*MPIJobList) DeepCopyInto ¶

func (in *MPIJobList) DeepCopyInto(out *MPIJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*MPIJobList) DeepCopyObject ¶

func (in *MPIJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type MPIJobSpec ¶

type MPIJobSpec struct {

	// Specifies the number of slots per worker used in hostfile.
	// Defaults to 1.
	// +optional
	SlotsPerWorker *int32 `json:"slotsPerWorker,omitempty"`

	// CleanPodPolicy defines whether to kill pods after the job completes.
	// Defaults to None.
	CleanPodPolicy *commonv1.CleanPodPolicy `json:"cleanPodPolicy,omitempty"`

	// `MPIReplicaSpecs` contains maps from `MPIReplicaType` to `ReplicaSpec` that
	// specify the MPI replicas to run.
	MPIReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"mpiReplicaSpecs"`

	// MainContainer specifies name of the main container which
	// executes the MPI code.
	MainContainer string `json:"mainContainer,omitempty"`

	// `RunPolicy` encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	RunPolicy commonv1.RunPolicy `json:"runPolicy,omitempty"`
}
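
As an illustration of how SlotsPerWorker and its default of 1 feed into a hostfile, the sketch below renders one line per worker in the Open MPI `host slots=N` style. The `hostfile` function and the worker host names are hypothetical; the operator's actual hostfile generation is not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// hostfile is a hypothetical helper that renders one "host slots=N" line per
// worker. A nil slotsPerWorker falls back to the documented default of 1.
func hostfile(workers []string, slotsPerWorker *int32) string {
	slots := int32(1)
	if slotsPerWorker != nil {
		slots = *slotsPerWorker
	}
	var b strings.Builder
	for _, w := range workers {
		fmt.Fprintf(&b, "%s slots=%d\n", w, slots)
	}
	return b.String()
}

func main() {
	two := int32(2)
	// The host names below are made up for the example.
	fmt.Print(hostfile([]string{"demo-worker-0", "demo-worker-1"}, &two))
}
```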

func (*MPIJobSpec) DeepCopy ¶

func (in *MPIJobSpec) DeepCopy() *MPIJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MPIJobSpec.

func (*MPIJobSpec) DeepCopyInto ¶

func (in *MPIJobSpec) DeepCopyInto(out *MPIJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type MXJob ¶

type MXJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   MXJobSpec          `json:"spec,omitempty"`
	Status commonv1.JobStatus `json:"status,omitempty"`
}

MXJob is the Schema for the mxjobs API

func (*MXJob) DeepCopy ¶

func (in *MXJob) DeepCopy() *MXJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MXJob.

func (*MXJob) DeepCopyInto ¶

func (in *MXJob) DeepCopyInto(out *MXJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*MXJob) DeepCopyObject ¶

func (in *MXJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type MXJobList ¶

type MXJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []MXJob `json:"items"`
}

MXJobList contains a list of MXJob

func (*MXJobList) DeepCopy ¶

func (in *MXJobList) DeepCopy() *MXJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MXJobList.

func (*MXJobList) DeepCopyInto ¶

func (in *MXJobList) DeepCopyInto(out *MXJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*MXJobList) DeepCopyObject ¶

func (in *MXJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type MXJobSpec ¶

type MXJobSpec struct {
	// RunPolicy encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	//+kubebuilder:validation:Optional
	RunPolicy commonv1.RunPolicy `json:"runPolicy"`

	// JobMode specifies the kind of MXJob to run. Different modes may require
	// different MXReplicaSpecs.
	JobMode JobModeType `json:"jobMode"`

	// MXReplicaSpecs is a map from commonv1.ReplicaType to commonv1.ReplicaSpec that
	// specifies the MX replicas to run.
	// For example,
	//   {
	//     "Scheduler": commonv1.ReplicaSpec,
	//     "Server": commonv1.ReplicaSpec,
	//     "Worker": commonv1.ReplicaSpec,
	//   }
	MXReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"mxReplicaSpecs"`
}

MXJobSpec defines the desired state of MXJob

func (*MXJobSpec) DeepCopy ¶

func (in *MXJobSpec) DeepCopy() *MXJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MXJobSpec.

func (*MXJobSpec) DeepCopyInto ¶

func (in *MXJobSpec) DeepCopyInto(out *MXJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type MXJobStatus ¶

type MXJobStatus struct {
}

MXJobStatus defines the observed state of MXJob

func (*MXJobStatus) DeepCopy ¶

func (in *MXJobStatus) DeepCopy() *MXJobStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MXJobStatus.

func (*MXJobStatus) DeepCopyInto ¶

func (in *MXJobStatus) DeepCopyInto(out *MXJobStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type PaddleElasticPolicy ¶

type PaddleElasticPolicy struct {
	// minReplicas is the lower limit for the number of replicas to which the training job
	// can scale down.  It defaults to null.
	// +optional
	MinReplicas *int32 `json:"minReplicas,omitempty"`
	// upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than MinReplicas, defaults to null.
	// +optional
	MaxReplicas *int32 `json:"maxReplicas,omitempty"`

	// MaxRestarts is the limit for restart times of pods in elastic mode.
	// +optional
	MaxRestarts *int32 `json:"maxRestarts,omitempty"`

	// Metrics contains the specifications which are used to calculate the
	// desired replica count (the maximum replica count across all metrics will
	// be used).  The desired replica count is calculated by multiplying the
	// ratio between the target value and the current value by the current
	// number of pods. Ergo, metrics used must decrease as the pod count is
	// increased, and vice-versa.  See the individual metric source types for
	// more information about how each type of metric must respond.
	// If not set, the HPA will not be created.
	// +optional
	Metrics []autoscalingv2.MetricSpec `json:"metrics,omitempty"`
}

func (*PaddleElasticPolicy) DeepCopy ¶

func (in *PaddleElasticPolicy) DeepCopy() *PaddleElasticPolicy

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PaddleElasticPolicy.

func (*PaddleElasticPolicy) DeepCopyInto ¶

func (in *PaddleElasticPolicy) DeepCopyInto(out *PaddleElasticPolicy)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type PaddleJob ¶

type PaddleJob struct {
	// Standard Kubernetes type metadata.
	metav1.TypeMeta `json:",inline"`

	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Specification of the desired state of the PaddleJob.
	Spec PaddleJobSpec `json:"spec,omitempty"`

	// Most recently observed status of the PaddleJob.
	// Read-only (modified by the system).
	Status commonv1.JobStatus `json:"status,omitempty"`
}

PaddleJob represents a PaddleJob resource.

func (*PaddleJob) DeepCopy ¶

func (in *PaddleJob) DeepCopy() *PaddleJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PaddleJob.

func (*PaddleJob) DeepCopyInto ¶

func (in *PaddleJob) DeepCopyInto(out *PaddleJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*PaddleJob) DeepCopyObject ¶

func (in *PaddleJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PaddleJobList ¶

type PaddleJobList struct {
	// Standard type metadata.
	metav1.TypeMeta `json:",inline"`

	// Standard list metadata.
	metav1.ListMeta `json:"metadata,omitempty"`

	// List of PaddleJobs.
	Items []PaddleJob `json:"items"`
}

PaddleJobList is a list of PaddleJobs.

func (*PaddleJobList) DeepCopy ¶

func (in *PaddleJobList) DeepCopy() *PaddleJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PaddleJobList.

func (*PaddleJobList) DeepCopyInto ¶

func (in *PaddleJobList) DeepCopyInto(out *PaddleJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*PaddleJobList) DeepCopyObject ¶

func (in *PaddleJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PaddleJobSpec ¶

type PaddleJobSpec struct {
	// RunPolicy encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	//+kubebuilder:validation:Optional
	RunPolicy commonv1.RunPolicy `json:"runPolicy"`

	// ElasticPolicy holds the elastic policy for paddle job.
	ElasticPolicy *PaddleElasticPolicy `json:"elasticPolicy,omitempty"`

	// A map of PaddleReplicaType (type) to ReplicaSpec (value). Specifies the Paddle cluster configuration.
	// For example,
	//   {
	//     "Master": PaddleReplicaSpec,
	//     "Worker": PaddleReplicaSpec,
	//   }
	PaddleReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"paddleReplicaSpecs"`
}

PaddleJobSpec is a desired state description of the PaddleJob.

func (*PaddleJobSpec) DeepCopy ¶

func (in *PaddleJobSpec) DeepCopy() *PaddleJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PaddleJobSpec.

func (*PaddleJobSpec) DeepCopyInto ¶

func (in *PaddleJobSpec) DeepCopyInto(out *PaddleJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type PyTorchJob ¶

type PyTorchJob struct {
	// Standard Kubernetes type metadata.
	metav1.TypeMeta `json:",inline"`

	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Specification of the desired state of the PyTorchJob.
	Spec PyTorchJobSpec `json:"spec,omitempty"`

	// Most recently observed status of the PyTorchJob.
	// Read-only (modified by the system).
	Status commonv1.JobStatus `json:"status,omitempty"`
}

PyTorchJob represents a PyTorchJob resource.

func (*PyTorchJob) DeepCopy ¶

func (in *PyTorchJob) DeepCopy() *PyTorchJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PyTorchJob.

func (*PyTorchJob) DeepCopyInto ¶

func (in *PyTorchJob) DeepCopyInto(out *PyTorchJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*PyTorchJob) DeepCopyObject ¶

func (in *PyTorchJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PyTorchJobList ¶

type PyTorchJobList struct {
	// Standard type metadata.
	metav1.TypeMeta `json:",inline"`

	// Standard list metadata.
	metav1.ListMeta `json:"metadata,omitempty"`

	// List of PyTorchJobs.
	Items []PyTorchJob `json:"items"`
}

PyTorchJobList is a list of PyTorchJobs.

func (*PyTorchJobList) DeepCopy ¶

func (in *PyTorchJobList) DeepCopy() *PyTorchJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PyTorchJobList.

func (*PyTorchJobList) DeepCopyInto ¶

func (in *PyTorchJobList) DeepCopyInto(out *PyTorchJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*PyTorchJobList) DeepCopyObject ¶

func (in *PyTorchJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PyTorchJobSpec ¶

type PyTorchJobSpec struct {
	// RunPolicy encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	//+kubebuilder:validation:Optional
	RunPolicy commonv1.RunPolicy `json:"runPolicy"`

	// ElasticPolicy holds the elastic policy for the PyTorch job.
	ElasticPolicy *ElasticPolicy `json:"elasticPolicy,omitempty"`

	// A map of PyTorchReplicaType (type) to ReplicaSpec (value). Specifies the PyTorch cluster configuration.
	// For example,
	//   {
	//     "Master": PyTorchReplicaSpec,
	//     "Worker": PyTorchReplicaSpec,
	//   }
	PyTorchReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"pytorchReplicaSpecs"`
}

PyTorchJobSpec is a desired state description of the PyTorchJob.
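To make the serialized shape of the replica map concrete, here is a minimal sketch that marshals a reduced spec using the `pytorchReplicaSpecs` JSON tag shown above. The types are local stand-ins: `RunPolicy` and `ElasticPolicy` are omitted, and the `ReplicaSpec` stand-in assumes a single `replicas` field, so the real output would carry more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ReplicaType is a local stand-in for commonv1.ReplicaType.
type ReplicaType string

// ReplicaSpec is a local stand-in for commonv1.ReplicaSpec.
// Assumption: only the replica count is modelled.
type ReplicaSpec struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

// PyTorchJobSpec is a reduced stand-in that keeps only the replica map,
// with the same JSON tag as the field in the package.
type PyTorchJobSpec struct {
	PyTorchReplicaSpecs map[ReplicaType]*ReplicaSpec `json:"pytorchReplicaSpecs"`
}

// int32Ptr is a hypothetical helper for taking the address of a literal.
func int32Ptr(v int32) *int32 { return &v }

// exampleSpecJSON builds a one-master, three-worker spec and marshals it.
func exampleSpecJSON() (string, error) {
	spec := PyTorchJobSpec{
		PyTorchReplicaSpecs: map[ReplicaType]*ReplicaSpec{
			"Master": {Replicas: int32Ptr(1)},
			"Worker": {Replicas: int32Ptr(3)},
		},
	}
	b, err := json.Marshal(spec)
	return string(b), err
}

func main() {
	out, err := exampleSpecJSON()
	if err != nil {
		panic(err)
	}
	// Prints: {"pytorchReplicaSpecs":{"Master":{"replicas":1},"Worker":{"replicas":3}}}
	fmt.Println(out)
}
```

encoding/json sorts map keys when marshalling, which is why "Master" appears before "Worker" regardless of insertion order.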

func (*PyTorchJobSpec) DeepCopy ¶

func (in *PyTorchJobSpec) DeepCopy() *PyTorchJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PyTorchJobSpec.

func (*PyTorchJobSpec) DeepCopyInto ¶

func (in *PyTorchJobSpec) DeepCopyInto(out *PyTorchJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type RDZVBackend ¶

type RDZVBackend string

const (
	// BackendC10D is the rendezvous backend type for C10d.
	BackendC10D RDZVBackend = "c10d"
	// BackendETCD is the rendezvous backend type for ETCD.
	BackendETCD RDZVBackend = "etcd"
	// BackendETCDV2 is the rendezvous backend type for ETCD v2.
	BackendETCDV2 RDZVBackend = "etcd-v2"
)
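Since RDZVBackend is a plain string type, any string can be converted to it. A caller that wants to reject unknown values could check against the three constants above; the sketch below redeclares them locally so it runs on its own, and `isKnownBackend` is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// RDZVBackend and its constants are redeclared here so the sketch is
// self-contained; the values match the constants listed above.
type RDZVBackend string

const (
	BackendC10D   RDZVBackend = "c10d"
	BackendETCD   RDZVBackend = "etcd"
	BackendETCDV2 RDZVBackend = "etcd-v2"
)

// isKnownBackend is a hypothetical helper that reports whether b is one
// of the three declared rendezvous backends.
func isKnownBackend(b RDZVBackend) bool {
	switch b {
	case BackendC10D, BackendETCD, BackendETCDV2:
		return true
	}
	return false
}

func main() {
	fmt.Println(isKnownBackend("c10d"), isKnownBackend("zookeeper")) // true false
}
```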

type RDZVConf ¶

type RDZVConf struct {
	Key   string `json:"key,omitempty"`
	Value string `json:"value,omitempty"`
}

func (*RDZVConf) DeepCopy ¶

func (in *RDZVConf) DeepCopy() *RDZVConf

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RDZVConf.

func (*RDZVConf) DeepCopyInto ¶

func (in *RDZVConf) DeepCopyInto(out *RDZVConf)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
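The page does not document how RDZVConf entries are consumed. One plausible reading, given the rendezvous naming, is that each entry is one `<key>=<value>` pair of the kind torchrun accepts as a comma-separated list for its rendezvous configuration option. The sketch below shows that flattening under this assumption; `formatRDZVConf` is a hypothetical helper and may not match what the operator does.

```go
package main

import (
	"fmt"
	"strings"
)

// RDZVConf is redeclared here with the same fields and tags as above.
type RDZVConf struct {
	Key   string `json:"key,omitempty"`
	Value string `json:"value,omitempty"`
}

// formatRDZVConf is a hypothetical helper that joins entries as
// "key1=value1,key2=value2", preserving slice order.
func formatRDZVConf(confs []RDZVConf) string {
	parts := make([]string, 0, len(confs))
	for _, c := range confs {
		parts = append(parts, c.Key+"="+c.Value)
	}
	return strings.Join(parts, ",")
}

func main() {
	confs := []RDZVConf{
		{Key: "timeout", Value: "600"},
		{Key: "read_timeout", Value: "60"},
	}
	fmt.Println(formatRDZVConf(confs)) // timeout=600,read_timeout=60
}
```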

type SuccessPolicy ¶

type SuccessPolicy string

SuccessPolicy is the policy used to decide when a TFJob is marked as succeeded.

const (
	SuccessPolicyDefault    SuccessPolicy = ""
	SuccessPolicyAllWorkers SuccessPolicy = "AllWorkers"
)
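TFJobSpec stores this value behind a pointer, so a consumer has to handle both a nil pointer and an explicit empty string. A minimal sketch of that normalization follows; `effectivePolicy` is a hypothetical helper, and the type and constants are redeclared locally so the block runs on its own.

```go
package main

import "fmt"

// SuccessPolicy and its constants are redeclared here with the same
// values as the declarations above.
type SuccessPolicy string

const (
	SuccessPolicyDefault    SuccessPolicy = ""
	SuccessPolicyAllWorkers SuccessPolicy = "AllWorkers"
)

// effectivePolicy is a hypothetical helper: a nil pointer is treated the
// same as an explicit SuccessPolicyDefault.
func effectivePolicy(p *SuccessPolicy) SuccessPolicy {
	if p == nil {
		return SuccessPolicyDefault
	}
	return *p
}

func main() {
	all := SuccessPolicyAllWorkers
	fmt.Printf("%q %q\n", effectivePolicy(nil), effectivePolicy(&all)) // "" "AllWorkers"
}
```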

type TFJob ¶

type TFJob struct {
	// Standard Kubernetes type metadata.
	metav1.TypeMeta `json:",inline"`

	// +optional
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Specification of the desired state of the TFJob.
	// +optional
	Spec TFJobSpec `json:"spec,omitempty"`

	// Most recently observed status of the TFJob.
	// Populated by the system.
	// Read-only.
	// +optional
	Status commonv1.JobStatus `json:"status,omitempty"`
}

TFJob represents a TFJob resource.

func (*TFJob) DeepCopy ¶

func (in *TFJob) DeepCopy() *TFJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new TFJob.

func (*TFJob) DeepCopyInto ¶

func (in *TFJob) DeepCopyInto(out *TFJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*TFJob) DeepCopyObject ¶

func (in *TFJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type TFJobList ¶

type TFJobList struct {
	// Standard type metadata.
	metav1.TypeMeta `json:",inline"`

	// Standard list metadata.
	// +optional
	metav1.ListMeta `json:"metadata,omitempty"`

	// List of TFJobs.
	Items []TFJob `json:"items"`
}

TFJobList is a list of TFJobs.

func (*TFJobList) DeepCopy ¶

func (in *TFJobList) DeepCopy() *TFJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new TFJobList.

func (*TFJobList) DeepCopyInto ¶

func (in *TFJobList) DeepCopyInto(out *TFJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*TFJobList) DeepCopyObject ¶

func (in *TFJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type TFJobSpec ¶

type TFJobSpec struct {
	// RunPolicy encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	//+kubebuilder:validation:Optional
	RunPolicy commonv1.RunPolicy `json:"runPolicy"`

	// SuccessPolicy defines the policy to mark the TFJob as succeeded.
	// Defaults to "", which applies the default rules.
	// +optional
	SuccessPolicy *SuccessPolicy `json:"successPolicy,omitempty"`

	// A map of TFReplicaType (type) to ReplicaSpec (value). Specifies the TF cluster configuration.
	// For example,
	//   {
	//     "PS": ReplicaSpec,
	//     "Worker": ReplicaSpec,
	//   }
	TFReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"tfReplicaSpecs"`

	// EnableDynamicWorker is a switch to enable dynamic workers.
	EnableDynamicWorker bool `json:"enableDynamicWorker,omitempty"`
}

TFJobSpec is a desired state description of the TFJob.
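Reading a spec back from JSON uses the same tags. The sketch below decodes a small document into a reduced stand-in for TFJobSpec; `RunPolicy` and `SuccessPolicy` are omitted, the `ReplicaSpec` stand-in assumes a single `replicas` field, and `decodeSpec` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ReplicaType is a local stand-in for commonv1.ReplicaType.
type ReplicaType string

// ReplicaSpec is a local stand-in for commonv1.ReplicaSpec.
// Assumption: only the replica count is modelled.
type ReplicaSpec struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

// TFJobSpec is a reduced stand-in that keeps two of the fields shown
// above, with their original JSON tags.
type TFJobSpec struct {
	TFReplicaSpecs      map[ReplicaType]*ReplicaSpec `json:"tfReplicaSpecs"`
	EnableDynamicWorker bool                         `json:"enableDynamicWorker,omitempty"`
}

// decodeSpec is a hypothetical helper that unmarshals JSON into the stand-in.
func decodeSpec(data []byte) (TFJobSpec, error) {
	var spec TFJobSpec
	err := json.Unmarshal(data, &spec)
	return spec, err
}

func main() {
	doc := []byte(`{
		"tfReplicaSpecs": {
			"PS":     {"replicas": 1},
			"Worker": {"replicas": 4}
		},
		"enableDynamicWorker": true
	}`)

	spec, err := decodeSpec(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(*spec.TFReplicaSpecs["Worker"].Replicas, spec.EnableDynamicWorker) // 4 true
}
```

Because `enableDynamicWorker` carries `omitempty`, a document that leaves it out decodes to `false`, which is also what a marshalled spec with the switch off would omit.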

func (*TFJobSpec) DeepCopy ¶

func (in *TFJobSpec) DeepCopy() *TFJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new TFJobSpec.

func (*TFJobSpec) DeepCopyInto ¶

func (in *TFJobSpec) DeepCopyInto(out *TFJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type XGBoostJob ¶

type XGBoostJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   XGBoostJobSpec     `json:"spec,omitempty"`
	Status commonv1.JobStatus `json:"status,omitempty"`
}

XGBoostJob is the Schema for the xgboostjobs API.

+k8s:openapi-gen=true

func (*XGBoostJob) DeepCopy ¶

func (in *XGBoostJob) DeepCopy() *XGBoostJob

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new XGBoostJob.

func (*XGBoostJob) DeepCopyInto ¶

func (in *XGBoostJob) DeepCopyInto(out *XGBoostJob)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*XGBoostJob) DeepCopyObject ¶

func (in *XGBoostJob) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type XGBoostJobList ¶

type XGBoostJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []XGBoostJob `json:"items"`
}

XGBoostJobList contains a list of XGBoostJobs.

func (*XGBoostJobList) DeepCopy ¶

func (in *XGBoostJobList) DeepCopy() *XGBoostJobList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new XGBoostJobList.

func (*XGBoostJobList) DeepCopyInto ¶

func (in *XGBoostJobList) DeepCopyInto(out *XGBoostJobList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*XGBoostJobList) DeepCopyObject ¶

func (in *XGBoostJobList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type XGBoostJobSpec ¶

type XGBoostJobSpec struct {
	// RunPolicy encapsulates various runtime policies of the distributed training
	// job, for example how to clean up resources and how long the job can stay
	// active.
	//+kubebuilder:validation:Optional
	RunPolicy commonv1.RunPolicy `json:"runPolicy"`

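	// A map of XGBoost ReplicaType (type) to ReplicaSpec (value). Specifies the XGBoost cluster configuration.
	XGBReplicaSpecs map[commonv1.ReplicaType]*commonv1.ReplicaSpec `json:"xgbReplicaSpecs"`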
}

XGBoostJobSpec defines the desired state of the XGBoostJob.

func (*XGBoostJobSpec) DeepCopy ¶

func (in *XGBoostJobSpec) DeepCopy() *XGBoostJobSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new XGBoostJobSpec.

func (*XGBoostJobSpec) DeepCopyInto ¶

func (in *XGBoostJobSpec) DeepCopyInto(out *XGBoostJobSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

