controller

package
v0.3.0 Latest
Warning

This package is not in the latest version of its module.
Published: Sep 22, 2018 License: Apache-2.0 Imports: 22 Imported by: 0

Documentation

Overview

Package controller provides a Kubernetes controller for a TensorFlow job resource.

Constants

This section is empty.

Variables

var (
	// ErrVersionOutdated is an exported error indicating that the requested version is outdated in the apiserver
	ErrVersionOutdated = errors.New("requested version is outdated in apiserver")

	// DefaultJobBackOff is the default backoff period, exported for the e2e test
	DefaultJobBackOff = 10 * time.Second
	// MaxJobBackOff is the max backoff period, exported for the e2e test
	MaxJobBackOff = 360 * time.Second
)

Functions

This section is empty.

Types

type Controller

type Controller struct {
	KubeClient  kubernetes.Interface
	TFJobClient tfjobclient.Interface

	TFJobLister listers.TFJobLister
	TFJobSynced cache.InformerSynced

	// WorkQueue is a rate-limited work queue. It is used to queue work to be
	// processed instead of performing it as soon as a change happens. This
	// means we can ensure we only process a fixed number of resources at a
	// time, and makes it easy to ensure we are never processing the same item
	// simultaneously in two different workers.
	//
	// Items in the work queue correspond to the name of the job.
	// In response to various events (e.g. Add, Update, Delete), the informer
	// is configured to add events to the queue. Since the item in the queue
	// represents a job and not a particular event, we end up aggregating events for
	// a job and ensure that a particular job isn't being processed by multiple
	// workers simultaneously.
	//
	// We rely on the informer to periodically generate Update events. This ensures
	// we regularly check on each TFJob and take any action needed.
	//
	// If there is a problem processing a job, processNextWorkItem requeues
	// the work item so that it is retried. In this case we rely on the rate
	// limiter in the work queue to retry with exponential backoff.
	WorkQueue workqueue.RateLimitingInterface
	// contains filtered or unexported fields
}

Controller is a structure that manages the various service clients.
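The WorkQueue comment explains that queue items identify a job rather than an event, so several events for one job collapse into a single pending item. The sketch below illustrates only that aggregation idea using the standard library. `keyQueue` and the job keys are hypothetical, and this is not how client-go's `workqueue` is implemented: it omits rate limiting and the tracking that keeps one key from being processed by two workers at once.

```go
package main

import (
	"fmt"
	"sync"
)

// keyQueue is a hypothetical, simplified stand-in for a work queue. It stores
// job keys (not events), so repeated events for one job collapse to one item.
type keyQueue struct {
	mu      sync.Mutex
	pending map[string]bool // keys currently waiting in the queue
	order   []string        // FIFO order of the pending keys
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: make(map[string]bool)}
}

// Add enqueues key unless it is already pending.
func (q *keyQueue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

// Len reports how many distinct keys are pending.
func (q *keyQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.order)
}

func main() {
	q := newKeyQueue()
	// Three events for one (made-up) job and one event for another.
	for _, key := range []string{"default/mnist", "default/mnist", "default/resnet", "default/mnist"} {
		q.Add(key)
	}
	fmt.Println("pending keys:", q.Len())
}
```

Because the handler for each key re-reads the current state of the job, it does not matter which of the collapsed events triggered the work.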

func New

func New(kubeClient kubernetes.Interface, tfJobClient tfjobclient.Interface,
	config tfv1alpha1.ControllerConfig, tfJobInformerFactory informers.SharedInformerFactory,
	enableGangScheduling bool) (*Controller, error)

New sets up the service client handles and returns a Controller object.

func (*Controller) Run

func (c *Controller) Run(threadiness int, stopCh <-chan struct{}) error

Run sets up the event handlers for the types we are interested in, syncs the informer caches, and starts the workers. It blocks until stopCh is closed, at which point it shuts down the work queue and waits for the workers to finish processing their current work items.
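The lifecycle described for Run (start workers, block on stopCh, then wait for in-flight items) can be sketched with the standard library alone. The functions `run` and `demo` are hypothetical and simplified: they leave out event handlers, informer cache syncing, and the error return, and they use a plain channel in place of the work queue.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// run is a hypothetical sketch of the lifecycle described for Run. It starts
// threadiness workers, blocks until stopCh is closed, and then waits for each
// worker to finish the item it is currently processing.
func run(threadiness int, items <-chan string, stopCh <-chan struct{}, process func(string)) {
	var wg sync.WaitGroup
	for i := 0; i < threadiness; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-stopCh:
					return
				case item, ok := <-items:
					if !ok {
						return
					}
					process(item)
				}
			}
		}()
	}
	<-stopCh  // block until the caller asks us to stop
	wg.Wait() // let workers finish their current work item
}

// demo feeds three made-up job keys to two workers, then closes stopCh and
// returns how many items were processed by the time run returned.
func demo() int64 {
	items := make(chan string) // unbuffered: a send returns once a worker holds the item
	stopCh := make(chan struct{})
	var processed int64
	done := make(chan struct{})
	go func() {
		run(2, items, stopCh, func(string) { atomic.AddInt64(&processed, 1) })
		close(done)
	}()
	for _, key := range []string{"job-a", "job-b", "job-c"} {
		items <- key
	}
	close(stopCh)
	<-done // run returns only after every worker has exited
	return atomic.LoadInt64(&processed)
}

func main() {
	fmt.Println("processed:", demo())
}
```

A caller of the real Run would typically create stopCh once, close it on shutdown (for example from a signal handler), and treat the return of Run as the point at which no worker is still touching a job.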
