trainer

package
v0.0.0-...-c377703
Published: Dec 14, 2017 License: Apache-2.0 Imports: 21 Imported by: 0

Documentation

Overview

Package trainer manages MXNet training jobs.

Index

Constants

const (
	NAMESPACE string = "default"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type KubernetesLabels

type KubernetesLabels map[string]string

KubernetesLabels represents a set of labels to apply to a Kubernetes resource.

func (KubernetesLabels) ToSelector

func (l KubernetesLabels) ToSelector() (string, error)

ToSelector converts the labels to a selector matching the labels.
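
The package source is not shown here, so the following is only a minimal sketch of how a label map could be turned into an equality-based Kubernetes selector string (`key=value` pairs joined by commas). The helper name `toSelector`, the empty-key check, and the label keys in `main` are assumptions for illustration, not the package's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KubernetesLabels mirrors the type documented above.
type KubernetesLabels map[string]string

// toSelector is a hypothetical stand-in for ToSelector. It sorts the keys so
// the resulting selector string is deterministic, then joins key=value pairs
// with commas, which is the equality-based selector syntax.
func toSelector(l KubernetesLabels) (string, error) {
	keys := make([]string, 0, len(l))
	for k := range l {
		if k == "" {
			// Assumed validation: an empty key cannot form a valid selector term.
			return "", fmt.Errorf("empty label key")
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+l[k])
	}
	return strings.Join(pairs, ","), nil
}

func main() {
	// The label keys and values below are made up for the example.
	labels := KubernetesLabels{"job_type": "worker", "runtime_id": "abcd"}
	sel, err := toSelector(labels)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sel) // job_type=worker,runtime_id=abcd
}
```

A string of this shape is what list calls typically accept as a label selector, which is presumably why the method returns a `string` rather than a structured selector.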

type MXReplicaSet

type MXReplicaSet struct {
	ClientSet kubernetes.Interface
	// Job is a pointer to the TrainingJob to which this replica belongs.
	Job  *TrainingJob
	Spec spec.MxReplicaSpec
}

MXReplicaSet is a set of MX processes all acting in the same role (e.g. worker).

func NewMXReplicaSet

func NewMXReplicaSet(clientSet kubernetes.Interface, mxReplicaSpec spec.MxReplicaSpec, job *TrainingJob) (*MXReplicaSet, error)

func (*MXReplicaSet) Create

func (s *MXReplicaSet) Create() error

func (*MXReplicaSet) Delete

func (s *MXReplicaSet) Delete() error

Delete deletes the replicas.

func (*MXReplicaSet) GetStatus

func (s *MXReplicaSet) GetStatus() (spec.MxReplicaStatus, error)

GetStatus returns the status of the replica set.

func (*MXReplicaSet) Labels

func (s *MXReplicaSet) Labels() KubernetesLabels

Labels returns the labels for this replica set.

type MXReplicaSetInterface

type MXReplicaSetInterface interface {
	Create() error
	Delete() error
	GetStatus() (spec.MxReplicaStatus, error)
}

MXReplicaSetInterface is an interface for managing a set of replicas.
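
One use of an interface like this is substituting a fake in tests, so that job-level logic can be exercised without a cluster. The sketch below assumes a local stand-in for `spec.MxReplicaStatus` (its real fields are not documented on this page) and invents `fakeReplicaSet` and `deleteAll` purely for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// MxReplicaStatus is a local stand-in for spec.MxReplicaStatus.
// The State field is an assumption made for this example.
type MxReplicaStatus struct {
	State string
}

// MXReplicaSetInterface repeats the documented method set, using the stand-in type.
type MXReplicaSetInterface interface {
	Create() error
	Delete() error
	GetStatus() (MxReplicaStatus, error)
}

// fakeReplicaSet is a hypothetical in-memory implementation for tests.
type fakeReplicaSet struct {
	created    bool
	failDelete bool
}

func (f *fakeReplicaSet) Create() error {
	f.created = true
	return nil
}

func (f *fakeReplicaSet) Delete() error {
	if f.failDelete {
		return errors.New("delete failed")
	}
	f.created = false
	return nil
}

func (f *fakeReplicaSet) GetStatus() (MxReplicaStatus, error) {
	if f.created {
		return MxReplicaStatus{State: "running"}, nil
	}
	return MxReplicaStatus{State: "absent"}, nil
}

// deleteAll attempts to delete every replica set and returns the first error
// encountered, so one failure does not leave the remaining sets untouched.
func deleteAll(sets []MXReplicaSetInterface) error {
	var firstErr error
	for _, s := range sets {
		if err := s.Delete(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

func main() {
	a := &fakeReplicaSet{}
	b := &fakeReplicaSet{failDelete: true}
	_ = a.Create()
	_ = b.Create()

	err := deleteAll([]MXReplicaSetInterface{b, a})
	st, _ := a.GetStatus()
	fmt.Println(err, st.State) // delete failed absent
}
```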

type MxConfig

type MxConfig struct {
	Task map[string]interface{} `json:"task"`
}

MxConfig is a struct representing the MXNet config. This struct is turned into an environment which is used by MXNet processes to configure themselves.
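
The `json:"task"` tag suggests the struct is serialized as JSON before being handed to the process. The sketch below shows only that encoding step; the helper `encodeConfig` and the `type` and `index` keys inside `Task` are assumptions, and how the package actually injects the result into the environment is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MxConfig repeats the documented struct.
type MxConfig struct {
	Task map[string]interface{} `json:"task"`
}

// encodeConfig is a hypothetical helper that renders the config as a JSON
// string, the form in which it could be placed into a container's environment.
func encodeConfig(c MxConfig) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// The "type" and "index" keys are illustrative, not taken from the package.
	cfg := MxConfig{Task: map[string]interface{}{"type": "worker", "index": 0}}
	s, err := encodeConfig(cfg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// encoding/json writes map keys in sorted order.
	fmt.Println(s) // {"task":{"index":0,"type":"worker"}}
}
```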

type TrainingJob

type TrainingJob struct {
	KubeCli kubernetes.Interface

	Replicas []*MXReplicaSet
	// contains filtered or unexported fields
}

TODO(jlewi): We should switch to a New pattern and make TrainingJob private so we can ensure correctness on creation.

func NewJob

func NewJob(kubeCli kubernetes.Interface, mxJobClient k8sutil.MxJobClient, mxjob *spec.MxJob, stopC <-chan struct{}, wg *sync.WaitGroup, config *spec.ControllerConfig) (*TrainingJob, error)

func (*TrainingJob) Delete

func (j *TrainingJob) Delete()

func (*TrainingJob) GetStatus

func (j *TrainingJob) GetStatus() (spec.State, []*spec.MxReplicaStatus, error)

func (*TrainingJob) Update

func (j *TrainingJob) Update(newJob *spec.MxJob)

Update sends an update event for the job.
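
How the event is delivered is not shown on this page. Since `NewJob` accepts a stop channel, one plausible shape is a send on an internal event channel that gives up once the job has been stopped. The sketch below illustrates that pattern only; the `job` and `event` types, their fields, and the boolean return value are all hypothetical.

```go
package main

import "fmt"

// event is a hypothetical message passed to a job's run loop.
type event struct {
	kind    string
	payload string
}

// job is a hypothetical stand-in for TrainingJob's unexported state.
type job struct {
	eventCh chan event
	stopC   <-chan struct{}
}

// update tries to enqueue an update event. It reports false if the job was
// stopped before the event could be enqueued, so callers never block forever
// on a job whose run loop has exited.
func (j *job) update(payload string) bool {
	select {
	case j.eventCh <- event{kind: "update", payload: payload}:
		return true
	case <-j.stopC:
		return false
	}
}

func main() {
	stop := make(chan struct{})
	j := &job{eventCh: make(chan event, 1), stopC: stop}

	fmt.Println(j.update("spec-v2")) // true: the buffer has room

	// The buffer is now full and nothing is draining it. After stop is
	// closed, the send cannot proceed, so update returns false.
	close(stop)
	fmt.Println(j.update("spec-v3")) // false
}
```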
