Documentation ¶
Overview ¶
Package v1alpha1 contains API Schema definitions for the inference v1alpha1 API group.
+kubebuilder:object:generate=true
+groupName=inference.llmaz.io
Index ¶
- Constants
- Variables
- func Resource(resource string) schema.GroupResource
- type BackendName
- type BackendRuntime
- type BackendRuntimeArg
- type BackendRuntimeConfig
- type BackendRuntimeList
- type BackendRuntimeSpec
- type BackendRuntimeStatus
- type ElasticConfig
- type Playground
- type PlaygroundList
- type PlaygroundSpec
- type PlaygroundStatus
- type ResourceRequirements
- type Service
- type ServiceList
- type ServiceSpec
- type ServiceStatus
Constants ¶
const (
	// PlaygroundProgressing means the Playground is progressing now, such as waiting for the
	// inference service creation, rolling update or scaling up and down.
	PlaygroundProgressing = "Progressing"
	// PlaygroundAvailable indicates the corresponding inference service is available now.
	PlaygroundAvailable string = "Available"
)
const (
	// ServiceAvailable means the inferenceService is available and all the
	// workloads are running as expected.
	ServiceAvailable = "Available"
	// ServiceProgressing means the inferenceService is progressing now, such as
	// in creation, rolling update or scaling up and down.
	ServiceProgressing = "Progressing"
)
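These constants are condition types, and the status structs in this package store them in a Conditions slice. The sketch below shows one way a caller might test for availability. Condition is a local stand-in for metav1.Condition with only two fields, and isConditionTrue is a hypothetical helper, not part of this package. Real code would more likely call meta.IsStatusConditionTrue from k8s.io/apimachinery.

```go
package main

import "fmt"

// Condition is a local stand-in for metav1.Condition, reduced to the
// two fields this sketch needs.
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

const (
	ServiceAvailable   = "Available"
	ServiceProgressing = "Progressing"
)

// isConditionTrue (hypothetical helper) reports whether conds holds a
// condition of the given type whose status is "True".
func isConditionTrue(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	conds := []Condition{
		{Type: ServiceProgressing, Status: "False"},
		{Type: ServiceAvailable, Status: "True"},
	}
	fmt.Println(isConditionTrue(conds, ServiceAvailable))   // true
	fmt.Println(isConditionTrue(conds, ServiceProgressing)) // false
}
```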
Variables ¶
var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "inference.llmaz.io", Version: "v1alpha1"}
	// SchemeGroupVersion is alias to GroupVersion for client-go libraries.
	// It is required by pkg/client/informers/externalversions/...
	SchemeGroupVersion = GroupVersion
	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)
Functions ¶
func Resource ¶
func Resource(resource string) schema.GroupResource
Resource is required by pkg/client/listers/...
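As a rough illustration of what a group-qualified resource looks like, the sketch below pairs a resource name with this package's API group. GroupResource is a local stand-in for schema.GroupResource, the lowercase resource function is hypothetical, and the resource name "playgrounds" is an assumed plural form. The actual body of Resource is not shown in this documentation.

```go
package main

import "fmt"

// GroupResource is a local stand-in for schema.GroupResource.
type GroupResource struct {
	Group    string
	Resource string
}

// group matches the Group field of GroupVersion in this package.
const group = "inference.llmaz.io"

// resource (hypothetical) qualifies a resource name with the API group.
func resource(name string) GroupResource {
	return GroupResource{Group: group, Resource: name}
}

func main() {
	gr := resource("playgrounds") // assumed resource name
	fmt.Printf("%s.%s\n", gr.Resource, gr.Group)
	// prints "playgrounds.inference.llmaz.io"
}
```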
Types ¶
type BackendName ¶
type BackendName string
const (
	LLAMACPP       BackendName = "llamacpp"
	SGLANG         BackendName = "sglang"
	VLLM           BackendName = "vllm"
	DefaultBackend BackendName = VLLM
)
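Because BackendRuntimeConfig.Name is an optional pointer, callers usually need a fallback when it is unset. A minimal sketch, assuming a hypothetical helper named backendOrDefault that is not part of this package:

```go
package main

import "fmt"

type BackendName string

const (
	LLAMACPP       BackendName = "llamacpp"
	SGLANG         BackendName = "sglang"
	VLLM           BackendName = "vllm"
	DefaultBackend BackendName = VLLM
)

// backendOrDefault (hypothetical helper) returns DefaultBackend when
// no backend name was provided.
func backendOrDefault(name *BackendName) BackendName {
	if name == nil {
		return DefaultBackend
	}
	return *name
}

func main() {
	fmt.Println(backendOrDefault(nil)) // vllm

	sg := SGLANG
	fmt.Println(backendOrDefault(&sg)) // sglang
}
```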
type BackendRuntime ¶ added in v0.0.7
type BackendRuntime struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   BackendRuntimeSpec   `json:"spec,omitempty"`
	Status BackendRuntimeStatus `json:"status,omitempty"`
}
BackendRuntime is the Schema for the backendRuntime API
func (*BackendRuntime) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntime) DeepCopy() *BackendRuntime
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntime.
func (*BackendRuntime) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntime) DeepCopyInto(out *BackendRuntime)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*BackendRuntime) DeepCopyObject ¶ added in v0.0.7
func (in *BackendRuntime) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type BackendRuntimeArg ¶ added in v0.0.7
type BackendRuntimeArg struct {
	// Name represents the identifier of the backendRuntime argument.
	Name string `json:"name"`
	// Flags represents all the preset configurations.
	// A flag wrapped in {{ .CONFIG }} is a configuration waiting to be rendered.
	Flags []string `json:"flags,omitempty"`
}
BackendRuntimeArg holds preset arguments for ease of use. Do not edit the preset names unless you set the argument name explicitly in the Playground backendRuntimeConfig.
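The {{ .CONFIG }} placeholder syntax matches Go's text/template, so one plausible way to render preset flags is sketched below. This is an assumption about the mechanism, not a description of how llmaz renders flags. The function renderFlags and the key ModelPath are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderFlags (hypothetical) substitutes {{ .KEY }} placeholders in each
// flag with the matching entry from values. A placeholder with no
// matching key is reported as an error rather than rendered empty.
func renderFlags(flags []string, values map[string]string) ([]string, error) {
	out := make([]string, 0, len(flags))
	for _, f := range flags {
		tmpl, err := template.New("flag").Option("missingkey=error").Parse(f)
		if err != nil {
			return nil, err
		}
		var sb strings.Builder
		if err := tmpl.Execute(&sb, values); err != nil {
			return nil, err
		}
		out = append(out, sb.String())
	}
	return out, nil
}

func main() {
	flags := []string{"--model", "{{ .ModelPath }}", "--port", "8080"}
	rendered, err := renderFlags(flags, map[string]string{"ModelPath": "/models/demo"})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(rendered)
}
```

Flags without placeholders pass through unchanged, so preset and literal flags can be mixed in one list.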
func (*BackendRuntimeArg) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntimeArg) DeepCopy() *BackendRuntimeArg
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeArg.
func (*BackendRuntimeArg) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntimeArg) DeepCopyInto(out *BackendRuntimeArg)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
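The deepcopy methods matter for types with slice or pointer fields, where a plain struct assignment would share underlying storage. The sketch below re-declares BackendRuntimeArg locally without JSON tags and hand-writes methods in the usual shape of generated deepcopy code. It is an illustration, not the generated source, and the name "default" is a made-up value.

```go
package main

import "fmt"

// BackendRuntimeArg is re-declared locally for this sketch.
type BackendRuntimeArg struct {
	Name  string
	Flags []string
}

// DeepCopyInto copies the receiver into out, allocating a fresh slice
// so that out.Flags does not alias in.Flags.
func (in *BackendRuntimeArg) DeepCopyInto(out *BackendRuntimeArg) {
	*out = *in
	if in.Flags != nil {
		out.Flags = make([]string, len(in.Flags))
		copy(out.Flags, in.Flags)
	}
}

// DeepCopy returns a newly allocated copy, or nil for a nil receiver.
func (in *BackendRuntimeArg) DeepCopy() *BackendRuntimeArg {
	if in == nil {
		return nil
	}
	out := new(BackendRuntimeArg)
	in.DeepCopyInto(out)
	return out
}

func main() {
	orig := &BackendRuntimeArg{Name: "default", Flags: []string{"--port", "8080"}}
	shallow := *orig        // struct assignment: Flags shares the backing array
	deep := orig.DeepCopy() // independent copy

	orig.Flags[1] = "9090"
	fmt.Println(shallow.Flags[1]) // 9090
	fmt.Println(deep.Flags[1])    // 8080
}
```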
type BackendRuntimeConfig ¶ added in v0.0.7
type BackendRuntimeConfig struct {
	// Name represents the inference backend under the hood, e.g. vLLM.
	// +kubebuilder:default=vllm
	// +optional
	Name *BackendName `json:"name,omitempty"`
	// Version represents the backend version if you want a different one
	// from the default version.
	// +optional
	Version *string `json:"version,omitempty"`
	// Args represents the arguments appended to the backend.
	// You can add new args or overwrite the default args.
	// +optional
	Args []string `json:"args,omitempty"`
	// Envs represents the environments set to the container.
	// +optional
	Envs []corev1.EnvVar `json:"envs,omitempty"`
	// Resources represents the resource requirements for backend, like cpu/mem,
	// accelerators like GPU should not be defined here, but at the model flavors,
	// or the values here will be overwritten.
	Resources *ResourceRequirements `json:"resources,omitempty"`
}
func (*BackendRuntimeConfig) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntimeConfig) DeepCopy() *BackendRuntimeConfig
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeConfig.
func (*BackendRuntimeConfig) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntimeConfig) DeepCopyInto(out *BackendRuntimeConfig)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type BackendRuntimeList ¶ added in v0.0.7
type BackendRuntimeList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []BackendRuntime `json:"items"`
}
BackendRuntimeList contains a list of BackendRuntime
func (*BackendRuntimeList) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntimeList) DeepCopy() *BackendRuntimeList
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeList.
func (*BackendRuntimeList) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntimeList) DeepCopyInto(out *BackendRuntimeList)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*BackendRuntimeList) DeepCopyObject ¶ added in v0.0.7
func (in *BackendRuntimeList) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type BackendRuntimeSpec ¶ added in v0.0.7
type BackendRuntimeSpec struct {
	// Commands represents the default command of the backendRuntime.
	Commands []string `json:"commands"`
	// Image represents the default image registry of the backendRuntime.
	// It will work together with version to make up a real image.
	Image string `json:"image"`
	// Version represents the default version of the backendRuntime.
	// It will be appended to the image as a tag.
	Version string `json:"version"`
	// Args represents the preset arguments of the backendRuntime.
	// They can be appended or overwritten by the Playground backendRuntimeConfig.
	Args []BackendRuntimeArg `json:"args,omitempty"`
	// Envs represents the environments set to the container.
	// +optional
	Envs []corev1.EnvVar `json:"envs,omitempty"`
	// Resources represents the resource requirements for backendRuntime, like cpu/mem,
	// accelerators like GPU should not be defined here, but at the model flavors,
	// or the values here will be overwritten.
	Resources ResourceRequirements `json:"resources"`
}
BackendRuntimeSpec defines the desired state of BackendRuntime
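The field comments say that Version is appended to Image as a tag, and that a Playground's backendRuntimeConfig may request a different version. A minimal sketch of that composition follows, assuming a hypothetical helper imageRef and made-up image and version values. The precedence rule shown (a non-empty override wins) is an assumption.

```go
package main

import "fmt"

// imageRef (hypothetical) joins an image and a version as "image:tag".
// A non-nil, non-empty override replaces the default version.
func imageRef(image, defaultVersion string, override *string) string {
	version := defaultVersion
	if override != nil && *override != "" {
		version = *override
	}
	return fmt.Sprintf("%s:%s", image, version)
}

func main() {
	// Illustrative values only.
	fmt.Println(imageRef("example.com/backend", "v1.0.0", nil))
	// prints "example.com/backend:v1.0.0"

	pinned := "v1.2.3"
	fmt.Println(imageRef("example.com/backend", "v1.0.0", &pinned))
	// prints "example.com/backend:v1.2.3"
}
```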
func (*BackendRuntimeSpec) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntimeSpec) DeepCopy() *BackendRuntimeSpec
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeSpec.
func (*BackendRuntimeSpec) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntimeSpec) DeepCopyInto(out *BackendRuntimeSpec)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type BackendRuntimeStatus ¶ added in v0.0.7
type BackendRuntimeStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
BackendRuntimeStatus defines the observed state of BackendRuntime
func (*BackendRuntimeStatus) DeepCopy ¶ added in v0.0.7
func (in *BackendRuntimeStatus) DeepCopy() *BackendRuntimeStatus
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeStatus.
func (*BackendRuntimeStatus) DeepCopyInto ¶ added in v0.0.7
func (in *BackendRuntimeStatus) DeepCopyInto(out *BackendRuntimeStatus)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type ElasticConfig ¶
type ElasticConfig struct {
	// MinReplicas indicates the minimum number of inference workloads based on the traffic.
	// Defaulting to nil means the instances can be scaled down to 1.
	// Setting minReplicas to 0 requires a serverless component to be installed first.
	// +kubebuilder:default=1
	// +optional
	MinReplicas *int32 `json:"minReplicas,omitempty"`
	// MaxReplicas indicates the maximum number of inference workloads based on the traffic.
	// Defaulting to nil means there is no limit on the number of instances.
	// +optional
	MaxReplicas *int32 `json:"maxReplicas,omitempty"`
}
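To make the nil semantics concrete, the sketch below clamps a desired replica count to the configured bounds, treating a nil MinReplicas as 1 and a nil MaxReplicas as unbounded, as the field comments describe. ElasticConfig is re-declared locally without JSON tags, and clampReplicas is a hypothetical helper. How the actual controller or autoscaler applies these bounds is not covered here.

```go
package main

import "fmt"

// ElasticConfig is re-declared locally for this sketch.
type ElasticConfig struct {
	MinReplicas *int32
	MaxReplicas *int32
}

// clampReplicas (hypothetical) bounds desired by cfg. A nil MinReplicas
// is treated as 1 and a nil MaxReplicas as "no upper limit". The lower
// bound is checked first, so it wins if the bounds are inconsistent.
func clampReplicas(desired int32, cfg ElasticConfig) int32 {
	lo := int32(1)
	if cfg.MinReplicas != nil {
		lo = *cfg.MinReplicas
	}
	if desired < lo {
		return lo
	}
	if cfg.MaxReplicas != nil && desired > *cfg.MaxReplicas {
		return *cfg.MaxReplicas
	}
	return desired
}

func main() {
	maxReplicas := int32(5)
	cfg := ElasticConfig{MaxReplicas: &maxReplicas}

	fmt.Println(clampReplicas(0, cfg)) // 1
	fmt.Println(clampReplicas(3, cfg)) // 3
	fmt.Println(clampReplicas(8, cfg)) // 5
}
```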
func (*ElasticConfig) DeepCopy ¶
func (in *ElasticConfig) DeepCopy() *ElasticConfig
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ElasticConfig.
func (*ElasticConfig) DeepCopyInto ¶
func (in *ElasticConfig) DeepCopyInto(out *ElasticConfig)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type Playground ¶
type Playground struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PlaygroundSpec   `json:"spec,omitempty"`
	Status PlaygroundStatus `json:"status,omitempty"`
}
Playground is the Schema for the playgrounds API
func (*Playground) DeepCopy ¶
func (in *Playground) DeepCopy() *Playground
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Playground.
func (*Playground) DeepCopyInto ¶
func (in *Playground) DeepCopyInto(out *Playground)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*Playground) DeepCopyObject ¶
func (in *Playground) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type PlaygroundList ¶
type PlaygroundList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Playground `json:"items"`
}
PlaygroundList contains a list of Playground
func (*PlaygroundList) DeepCopy ¶
func (in *PlaygroundList) DeepCopy() *PlaygroundList
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundList.
func (*PlaygroundList) DeepCopyInto ¶
func (in *PlaygroundList) DeepCopyInto(out *PlaygroundList)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*PlaygroundList) DeepCopyObject ¶
func (in *PlaygroundList) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type PlaygroundSpec ¶
type PlaygroundSpec struct {
	// Replicas represents the replica number of inference workloads.
	// +kubebuilder:default=1
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`
	// ModelClaim represents a claim for one model; it is a simplified use case
	// of modelClaims. Most of the time, modelClaim is enough.
	// ModelClaim and modelClaims are mutually exclusive.
	// +optional
	ModelClaim *coreapi.ModelClaim `json:"modelClaim,omitempty"`
	// ModelClaims represents claims for multiple models for more complicated
	// use cases like speculative-decoding.
	// ModelClaims and modelClaim are mutually exclusive.
	// +optional
	ModelClaims *coreapi.ModelClaims `json:"modelClaims,omitempty"`
	// BackendRuntimeConfig represents the inference backendRuntime configuration
	// under the hood, e.g. vLLM, which is the default backendRuntime.
	// +optional
	BackendRuntimeConfig *BackendRuntimeConfig `json:"backendRuntimeConfig,omitempty"`
}
PlaygroundSpec defines the desired state of Playground
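The spec allows either modelClaim or modelClaims but not both. A validation check for that rule might look like the sketch below. The empty ModelClaim and ModelClaims structs are stand-ins for the coreapi types, PlaygroundSpec is reduced to the two relevant fields, and validateClaims is a hypothetical function. Where llmaz enforces this rule (for example in a webhook) is not stated in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelClaim and ModelClaims are empty stand-ins for the coreapi types.
type ModelClaim struct{}
type ModelClaims struct{}

// PlaygroundSpec is reduced to the two fields this sketch needs.
type PlaygroundSpec struct {
	ModelClaim  *ModelClaim
	ModelClaims *ModelClaims
}

// validateClaims (hypothetical) rejects a spec that sets both fields.
func validateClaims(spec PlaygroundSpec) error {
	if spec.ModelClaim != nil && spec.ModelClaims != nil {
		return errors.New("modelClaim and modelClaims are mutually exclusive")
	}
	return nil
}

func main() {
	ok := PlaygroundSpec{ModelClaim: &ModelClaim{}}
	fmt.Println(validateClaims(ok)) // <nil>

	bad := PlaygroundSpec{ModelClaim: &ModelClaim{}, ModelClaims: &ModelClaims{}}
	fmt.Println(validateClaims(bad))
}
```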
func (*PlaygroundSpec) DeepCopy ¶
func (in *PlaygroundSpec) DeepCopy() *PlaygroundSpec
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundSpec.
func (*PlaygroundSpec) DeepCopyInto ¶
func (in *PlaygroundSpec) DeepCopyInto(out *PlaygroundSpec)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type PlaygroundStatus ¶
type PlaygroundStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
PlaygroundStatus defines the observed state of Playground
func (*PlaygroundStatus) DeepCopy ¶
func (in *PlaygroundStatus) DeepCopy() *PlaygroundStatus
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundStatus.
func (*PlaygroundStatus) DeepCopyInto ¶
func (in *PlaygroundStatus) DeepCopyInto(out *PlaygroundStatus)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type ResourceRequirements ¶
type ResourceRequirements struct {
	// Limits describes the maximum amount of compute resources allowed.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	// +optional
	Limits corev1.ResourceList `json:"limits,omitempty"`
	// Requests describes the minimum amount of compute resources required.
	// If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
	// otherwise to an implementation-defined value. Requests cannot exceed Limits.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	// +optional
	Requests corev1.ResourceList `json:"requests,omitempty"`
}
TODO: DRA (Dynamic Resource Allocation) is not supported yet; support can be added once it is needed.
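The two rules in the Requests comment (fall back to Limits when Requests is omitted, and never exceed Limits) can be sketched as follows. ResourceList here is a simplified stand-in that maps resource names to plain integers, whereas corev1.ResourceList maps to resource.Quantity. The functions effectiveRequests and validate are hypothetical, and Kubernetes applies these rules itself at admission time.

```go
package main

import "fmt"

// ResourceList is a simplified stand-in for corev1.ResourceList,
// using integer units instead of resource.Quantity.
type ResourceList map[string]int64

// ResourceRequirements is re-declared locally for this sketch.
type ResourceRequirements struct {
	Limits   ResourceList
	Requests ResourceList
}

// effectiveRequests (hypothetical) falls back to Limits when Requests
// is omitted.
func effectiveRequests(r ResourceRequirements) ResourceList {
	if len(r.Requests) == 0 {
		return r.Limits
	}
	return r.Requests
}

// validate (hypothetical) reports an error when a request exceeds the
// limit declared for the same resource.
func validate(r ResourceRequirements) error {
	for name, req := range r.Requests {
		if lim, ok := r.Limits[name]; ok && req > lim {
			return fmt.Errorf("request for %s (%d) exceeds limit (%d)", name, req, lim)
		}
	}
	return nil
}

func main() {
	onlyLimits := ResourceRequirements{Limits: ResourceList{"cpu": 4}}
	fmt.Println(effectiveRequests(onlyLimits)["cpu"]) // 4

	tooHigh := ResourceRequirements{
		Limits:   ResourceList{"cpu": 4},
		Requests: ResourceList{"cpu": 8},
	}
	fmt.Println(validate(tooHigh))
}
```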
func (*ResourceRequirements) DeepCopy ¶
func (in *ResourceRequirements) DeepCopy() *ResourceRequirements
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ResourceRequirements.
func (*ResourceRequirements) DeepCopyInto ¶
func (in *ResourceRequirements) DeepCopyInto(out *ResourceRequirements)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type Service ¶
type Service struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ServiceSpec   `json:"spec,omitempty"`
	Status ServiceStatus `json:"status,omitempty"`
}
Service is the Schema for the services API
func (*Service) DeepCopy ¶
func (in *Service) DeepCopy() *Service
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Service.
func (*Service) DeepCopyInto ¶
func (in *Service) DeepCopyInto(out *Service)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*Service) DeepCopyObject ¶
func (in *Service) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type ServiceList ¶
type ServiceList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Service `json:"items"`
}
ServiceList contains a list of Service
func (*ServiceList) DeepCopy ¶
func (in *ServiceList) DeepCopy() *ServiceList
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceList.
func (*ServiceList) DeepCopyInto ¶
func (in *ServiceList) DeepCopyInto(out *ServiceList)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (*ServiceList) DeepCopyObject ¶
func (in *ServiceList) DeepCopyObject() runtime.Object
DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
type ServiceSpec ¶
type ServiceSpec struct {
	// ModelClaims represents multiple claims for different models.
	ModelClaims coreapi.ModelClaims `json:"modelClaims,omitempty"`
	// WorkloadTemplate defines the underlying workload layout and configuration.
	// Note: the LWS spec might be twisted with various LWS instances to support
	// accelerator fungibility or other cutting-edge research.
	// LWS supports both single-host and multi-host scenarios. For single-host
	// cases, only replicas, rolloutStrategy and workerTemplate matter.
	WorkloadTemplate lws.LeaderWorkerSetSpec `json:"workloadTemplate"`
	// ElasticConfig defines the configuration for elastic usage,
	// e.g. the max/min replicas. Default to 0 ~ Inf+.
	// This requires the HPA to be installed first, or it will not work.
	// +optional
	ElasticConfig *ElasticConfig `json:"elasticConfig,omitempty"`
}
ServiceSpec defines the desired state of Service. The Service controller maintains multiple flavors of workloads with different accelerators for cost or performance considerations.
func (*ServiceSpec) DeepCopy ¶
func (in *ServiceSpec) DeepCopy() *ServiceSpec
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceSpec.
func (*ServiceSpec) DeepCopyInto ¶
func (in *ServiceSpec) DeepCopyInto(out *ServiceSpec)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
type ServiceStatus ¶
type ServiceStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
ServiceStatus defines the observed state of Service
func (*ServiceStatus) DeepCopy ¶
func (in *ServiceStatus) DeepCopy() *ServiceStatus
DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceStatus.
func (*ServiceStatus) DeepCopyInto ¶
func (in *ServiceStatus) DeepCopyInto(out *ServiceStatus)
DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.