Documentation ¶
Overview ¶
Package v2alpha1 is the modelzetes API.

+groupName=tensorchord.ai
Index ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// localSchemeBuilder and AddToScheme will stay in k8s.io/kubernetes.
	SchemeBuilder runtime.SchemeBuilder
	AddToScheme   = localSchemeBuilder.AddToScheme

	Kind = "Inference"
)
var SchemeGroupVersion = schema.GroupVersion{Group: controller.GroupName, Version: "v2alpha1"}
SchemeGroupVersion is the group version used to register these objects.
Functions ¶
func Resource ¶
func Resource(resource string) schema.GroupResource
Resource takes an unqualified resource and returns a Group qualified GroupResource
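To illustrate, Resource combines the package's group name with an unqualified resource name. The sketch below mirrors that behavior with a local GroupResource type; the real function returns a schema.GroupResource from k8s.io/apimachinery, and the resource name "inferences" is assumed for illustration only.

```go
package main

import "fmt"

// groupName mirrors the +groupName marker from the package doc.
const groupName = "tensorchord.ai"

// GroupResource is a local stand-in for schema.GroupResource.
type GroupResource struct {
	Group    string
	Resource string
}

// Resource qualifies an unqualified resource name with the API group,
// mirroring what the package-level Resource function does.
func Resource(resource string) GroupResource {
	return GroupResource{Group: groupName, Resource: resource}
}

func main() {
	gr := Resource("inferences")
	fmt.Printf("%s.%s\n", gr.Resource, gr.Group)
	// → inferences.tensorchord.ai
}
```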
Types ¶
type Framework ¶
type Framework string
Framework is the inference framework. It is only used to set the default port and command. For example, if the framework is "gradio", the default port is 7860 and the default command is "python app.py". You can override these defaults by setting the framework to `other` and providing the port and command fields explicitly.
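A minimal sketch of the defaulting just described, assuming the behavior is exactly as documented (framework "gradio" implies port 7860 and command "python app.py", while any other value leaves port and command to the user); the helper name frameworkDefaults is hypothetical and not part of the package:

```go
package main

import "fmt"

// frameworkDefaults returns the documented defaults for a known
// framework; ok is false when the caller must supply port and command
// themselves (the "other" case). The real defaulting logic in
// modelzetes may differ in detail.
func frameworkDefaults(framework string) (port int32, command string, ok bool) {
	switch framework {
	case "gradio":
		return 7860, "python app.py", true
	default:
		return 0, "", false
	}
}

func main() {
	port, cmd, _ := frameworkDefaults("gradio")
	fmt.Println(port, cmd)
	// → 7860 python app.py
}
```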
type Inference ¶
type Inference struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec InferenceSpec `json:"spec"`
}
Inference describes an Inference
func (*Inference) DeepCopy ¶
func (in *Inference) DeepCopy() *Inference
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Inference.
func (*Inference) DeepCopyInto ¶
func (in *Inference) DeepCopyInto(out *Inference)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*Inference) DeepCopyObject ¶
func (in *Inference) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type InferenceList ¶
type InferenceList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata"`

	Items []Inference `json:"items"`
}
InferenceList is a list of inference resources
func (*InferenceList) DeepCopy ¶
func (in *InferenceList) DeepCopy() *InferenceList
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new InferenceList.
func (*InferenceList) DeepCopyInto ¶
func (in *InferenceList) DeepCopyInto(out *InferenceList)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*InferenceList) DeepCopyObject ¶
func (in *InferenceList) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type InferenceSpec ¶
type InferenceSpec struct {
	Name  string `json:"name"`
	Image string `json:"image"`

	// Scaling is the scaling configuration for the inference.
	Scaling *ScalingConfig `json:"scaling,omitempty"`

	// Framework is the inference framework.
	Framework Framework `json:"framework,omitempty"`

	// Port is the port exposed by the inference.
	Port *int32 `json:"port,omitempty"`

	// HTTPProbePath is the path of the HTTP probe.
	HTTPProbePath *string `json:"http_probe_path,omitempty"`

	// Command is the command to run when starting the inference.
	Command *string `json:"command,omitempty"`

	// EnvVars can be provided to set environment variables for the
	// inference runtime.
	EnvVars map[string]string `json:"envVars,omitempty"`

	// Constraints are specific to the operator.
	Constraints []string `json:"constraints,omitempty"`

	// Secrets is a list of secrets to be made available to the inference.
	Secrets []string `json:"secrets,omitempty"`

	// Labels are metadata for inferences which may be used by the
	// faas-provider or the gateway.
	Labels map[string]string `json:"labels,omitempty"`

	// Annotations are metadata for inferences which may be used by the
	// faas-provider or the gateway.
	Annotations map[string]string `json:"annotations,omitempty"`

	// Resources are the resource limits for the inference.
	Resources *v1.ResourceRequirements `json:"resources,omitempty"`
}
InferenceSpec defines the desired state of Inference
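Because most optional fields are pointers with omitempty tags, unset fields disappear from the serialized object. The sketch below mirrors a small subset of InferenceSpec locally to show the resulting JSON wire format; note the mix of snake_case (http_probe_path) and camelCase (envVars) tags in the real type. The field subset and the values are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inferenceSpec mirrors a few fields of InferenceSpec for illustration;
// the real type has many more fields.
type inferenceSpec struct {
	Name          string            `json:"name"`
	Image         string            `json:"image"`
	Port          *int32            `json:"port,omitempty"`
	HTTPProbePath *string           `json:"http_probe_path,omitempty"`
	EnvVars       map[string]string `json:"envVars,omitempty"`
}

// specJSON marshals the spec; unset pointer and map fields are dropped
// by omitempty.
func specJSON(spec inferenceSpec) string {
	b, _ := json.Marshal(spec)
	return string(b)
}

func main() {
	port := int32(7860)
	fmt.Println(specJSON(inferenceSpec{
		Name:  "demo",
		Image: "example/image:latest",
		Port:  &port,
	}))
	// → {"name":"demo","image":"example/image:latest","port":7860}
}
```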
func (*InferenceSpec) DeepCopy ¶
func (in *InferenceSpec) DeepCopy() *InferenceSpec
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new InferenceSpec.
func (*InferenceSpec) DeepCopyInto ¶
func (in *InferenceSpec) DeepCopyInto(out *InferenceSpec)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type ScalingConfig ¶
type ScalingConfig struct {
	// MinReplicas is the lower limit for the number of replicas to which
	// the autoscaler can scale down. It defaults to 0.
	MinReplicas *int32 `json:"min_replicas,omitempty"`

	// MaxReplicas is the upper limit for the number of replicas to which
	// the autoscaler can scale up. It cannot be less than minReplicas.
	// It defaults to 1.
	MaxReplicas *int32 `json:"max_replicas,omitempty"`

	// TargetLoad is the target load. In capacity mode, it is the expected
	// number of inflight requests per replica.
	TargetLoad *int32 `json:"target_load,omitempty"`

	// Type is the scaling type. It can be either "capacity" or "rps".
	// Default is "capacity".
	Type *ScalingType `json:"type,omitempty"`

	// ZeroDuration is the duration of zero load before scaling down to
	// zero. Default is 5 minutes.
	ZeroDuration *int32 `json:"zero_duration,omitempty"`

	// StartupDuration is the duration of startup time.
	StartupDuration *int32 `json:"startup_duration,omitempty"`
}
func (*ScalingConfig) DeepCopy ¶
func (in *ScalingConfig) DeepCopy() *ScalingConfig
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScalingConfig.
func (*ScalingConfig) DeepCopyInto ¶
func (in *ScalingConfig) DeepCopyInto(out *ScalingConfig)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type ScalingType ¶
type ScalingType string
const (
	ScalingTypeCapacity ScalingType = "capacity"
	ScalingTypeRPS      ScalingType = "rps"
)