v1alpha1

package
v0.0.8
Published: Oct 23, 2024 License: Apache-2.0 Imports: 5 Imported by: 1

Documentation

Overview

Package v1alpha1 contains API Schema definitions for the v1alpha1 API group +kubebuilder:object:generate=true +groupName=llmaz.io

Index

Constants

const (
	ModelFamilyNameLabelKey = "llmaz.io/model-family-name"
	ModelNameLabelKey       = "llmaz.io/model-name"

	HUGGING_FACE = "Huggingface"
	MODEL_SCOPE  = "ModelScope"
)

Variables

var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "llmaz.io", Version: "v1alpha1"}

	// SchemeGroupVersion is alias to GroupVersion for client-go libraries.
	// It is required by pkg/client/informers/externalversions/...
	SchemeGroupVersion = GroupVersion

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)

Functions

func Resource

func Resource(resource string) schema.GroupResource

Resource is required by pkg/client/listers/...
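The snippet below mirrors what `Resource` does, using local stand-in types for illustration (the real package uses `k8s.io/apimachinery/pkg/runtime/schema`): it qualifies a bare resource name with the llmaz.io group so generated listers can build a GroupResource. This is a sketch, not this package's actual code.

```go
package main

import "fmt"

// Local mirrors of the apimachinery schema types, for illustration only.
type GroupResource struct {
	Group    string
	Resource string
}

type GroupVersion struct {
	Group   string
	Version string
}

// Resource mirrors this package's Resource helper: it pairs a bare resource
// name with the group of the given GroupVersion.
func Resource(gv GroupVersion, resource string) GroupResource {
	return GroupResource{Group: gv.Group, Resource: resource}
}

func main() {
	gv := GroupVersion{Group: "llmaz.io", Version: "v1alpha1"}
	gr := Resource(gv, "openmodels")
	fmt.Printf("%s.%s\n", gr.Resource, gr.Group) // openmodels.llmaz.io
}
```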

Types

type Flavor

type Flavor struct {
	// Name represents the flavor name, which will be used in model claim.
	Name FlavorName `json:"name"`
	// Requests defines the required accelerators to serve the model, like nvidia.com/gpu: 8.
	// When the GPU count is greater than 8, like 32, multi-host inference is enabled and
	// 32/8=4 hosts will be grouped as a unit, each host with a resource request of
	// nvidia.com/gpu: 8. This may change in the future if the GPU-count limit is lifted.
	// Setting cpu and memory usage here is not recommended.
	// If using a playground, define the cpu/mem usage in backendConfig.
	// If using a service, define the cpu/mem in the container resources.
	// Note: if you define the same accelerator requests at the playground/service level,
	// the requests here will be overridden.
	// +optional
	Requests v1.ResourceList `json:"requests,omitempty"`
	// NodeSelector represents the node candidates for Pod placement; any node that
	// doesn't meet the nodeSelector is filtered out by the resourceFungibility scheduler plugin.
	// If nodeSelector is empty, every node is a candidate.
	// +optional
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	// Params stores other useful parameters to be consumed by autoscaling components
	// like cluster-autoscaler or Karpenter.
	// E.g. when scaling up nodes with 8x Nvidia A100, the parameter instance-type: p4d.24xlarge
	// can be injected for AWS.
	// +optional
	Params map[string]string `json:"params,omitempty"`
}

Flavor defines the accelerator requirements for a model and the necessary parameters for autoscaling. Right now, it is used in two places: - Pod scheduling, with node selectors specified. - Cluster autoscaling, with essential parameters provided.
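The multi-host grouping described in the Requests comment can be sketched as simple arithmetic. `hostGroup` below is a hypothetical helper (not part of this package) assuming the documented per-host cap of 8 GPUs:

```go
package main

import "fmt"

// perHostLimit is the per-host GPU cap described in the Requests comment.
const perHostLimit = 8

// hostGroup reports how many hosts form one serving unit and the per-host
// request, following the 32/8=4 example in the Flavor documentation.
// It is an illustration, not part of this package.
func hostGroup(requestedGPUs int) (hosts, perHost int) {
	if requestedGPUs <= perHostLimit {
		return 1, requestedGPUs
	}
	return requestedGPUs / perHostLimit, perHostLimit
}

func main() {
	hosts, perHost := hostGroup(32)
	// 4 hosts grouped as a unit, each requesting nvidia.com/gpu: 8
	fmt.Printf("hosts=%d perHost=%d\n", hosts, perHost)
}
```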

func (*Flavor) DeepCopy

func (in *Flavor) DeepCopy() *Flavor

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Flavor.

func (*Flavor) DeepCopyInto

func (in *Flavor) DeepCopyInto(out *Flavor)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type FlavorName

type FlavorName string

type ModelClaim

type ModelClaim struct {
	// ModelName represents the name of the Model.
	ModelName ModelName `json:"modelName,omitempty"`
	// InferenceFlavors represents a list of flavors with fungibility support
	// to serve the model.
	// If set, the flavor names must be a subset of the flavors configured on the model.
	// If not set, the model's configured flavors are used by default.
	// +optional
	InferenceFlavors []FlavorName `json:"inferenceFlavors,omitempty"`
}

ModelClaim represents a claim for one model. It is the standard claim mode of multiModelsClaim, compared to other modes like SpeculativeDecoding.

func (*ModelClaim) DeepCopy

func (in *ModelClaim) DeepCopy() *ModelClaim

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelClaim.

func (*ModelClaim) DeepCopyInto

func (in *ModelClaim) DeepCopyInto(out *ModelClaim)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelClaims added in v0.0.6

type ModelClaims struct {
	// Models represents a list of models with roles specified; there may be
	// multiple models here to support state-of-the-art techniques like
	// speculative decoding, where one model is the main (target) model and
	// another is the draft model.
	// +kubebuilder:validation:MinItems=1
	Models []ModelRefer `json:"models,omitempty"`
	// InferenceFlavors represents a list of flavors with fungibility support
	// to serve the model.
	// - If not set, the flavors of the 0-index model are applied by default.
	// - If set, the flavor names are looked up following the model order.
	// +optional
	InferenceFlavors []FlavorName `json:"inferenceFlavors,omitempty"`
}

ModelClaims represents multiple claims for different models.

func (*ModelClaims) DeepCopy added in v0.0.6

func (in *ModelClaims) DeepCopy() *ModelClaims

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelClaims.

func (*ModelClaims) DeepCopyInto added in v0.0.6

func (in *ModelClaims) DeepCopyInto(out *ModelClaims)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelHub

type ModelHub struct {
	// Name refers to the model registry, such as huggingface.
	// +kubebuilder:default=Huggingface
	// +kubebuilder:validation:Enum={Huggingface,ModelScope}
	// +optional
	Name *string `json:"name,omitempty"`
	// ModelID refers to the model identifier on model hub,
	// such as meta-llama/Meta-Llama-3-8B.
	ModelID string `json:"modelID,omitempty"`
	// Filename refers to a specified model file rather than the whole repo.
	// This is helpful for downloading a specific GGUF model rather than the
	// whole repo, which may include many quantized variants.
	// TODO: this is only supported with Huggingface; add support for ModelScope
	// in the near future.
	// Note: once filename is set, allowPatterns and ignorePatterns must be left unset.
	Filename *string `json:"filename,omitempty"`
	// Revision refers to a Git revision id which can be a branch name, a tag, or a commit hash.
	// +kubebuilder:default=main
	// +optional
	Revision *string `json:"revision,omitempty"`
	// AllowPatterns refers to allow patterns: only files matching at least
	// one pattern will be downloaded.
	// +optional
	AllowPatterns []string `json:"allowPatterns,omitempty"`
	// IgnorePatterns refers to ignore patterns: files matching any of the
	// patterns will not be downloaded.
	// +optional
	IgnorePatterns []string `json:"ignorePatterns,omitempty"`
}

ModelHub represents the model registry for model downloads.
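The documented allow/ignore semantics can be sketched as below. `shouldDownload` is a hypothetical helper, and glob matching via `path.Match` is an assumption for illustration; the actual hub client may match patterns differently.

```go
package main

import (
	"fmt"
	"path"
)

// shouldDownload sketches the documented allow/ignore semantics: a repo file
// is downloaded when it matches at least one allow pattern (an empty allow
// list matches everything) and matches no ignore pattern.
func shouldDownload(file string, allow, ignore []string) bool {
	allowed := len(allow) == 0
	for _, p := range allow {
		if ok, _ := path.Match(p, file); ok {
			allowed = true
			break
		}
	}
	if !allowed {
		return false
	}
	for _, p := range ignore {
		if ok, _ := path.Match(p, file); ok {
			return false
		}
	}
	return true
}

func main() {
	allow := []string{"*.gguf"}
	ignore := []string{"*q2*"}
	fmt.Println(shouldDownload("model.q4_0.gguf", allow, ignore)) // true
	fmt.Println(shouldDownload("model.q2_k.gguf", allow, ignore)) // false
}
```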

func (*ModelHub) DeepCopy

func (in *ModelHub) DeepCopy() *ModelHub

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelHub.

func (*ModelHub) DeepCopyInto

func (in *ModelHub) DeepCopyInto(out *ModelHub)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelName

type ModelName string

type ModelRefer added in v0.0.7

type ModelRefer struct {
	// Name represents the model name.
	Name ModelName `json:"name"`
	// Role represents the model role when more than one model is required,
	// such as the draft role, which means running with speculative decoding;
	// the backend's default arguments are then looked up in the backendRuntime
	// named speculative-decoding.
	// +kubebuilder:validation:Enum={main,draft}
	// +kubebuilder:default=main
	// +optional
	Role *ModelRole `json:"role,omitempty"`
}

ModelRefer refers to a created Model with its role.

func (*ModelRefer) DeepCopy added in v0.0.7

func (in *ModelRefer) DeepCopy() *ModelRefer

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelRefer.

func (*ModelRefer) DeepCopyInto added in v0.0.7

func (in *ModelRefer) DeepCopyInto(out *ModelRefer)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelRole added in v0.0.6

type ModelRole string

const (
	// Main represents the main model, if only one model is required,
	// it must be the main model. Only one main model is allowed.
	MainRole ModelRole = "main"
	// Draft represents the draft model in speculative decoding,
	// the main model is the target model then.
	DraftRole ModelRole = "draft"
)
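The constraint in the MainRole comment (exactly one main model) can be sketched as a validation helper. `validateRoles` is hypothetical and not part of this package; it also treats a nil role as main, mirroring the `+kubebuilder:default=main` marker on ModelRefer.

```go
package main

import (
	"errors"
	"fmt"
)

type ModelRole string

const (
	MainRole  ModelRole = "main"
	DraftRole ModelRole = "draft"
)

// validateRoles enforces the documented rule: exactly one model must carry
// the main role. A nil role counts as main (the kubebuilder default).
func validateRoles(roles []*ModelRole) error {
	mains := 0
	for _, r := range roles {
		if r == nil || *r == MainRole {
			mains++
		}
	}
	if mains != 1 {
		return errors.New("exactly one main model is required")
	}
	return nil
}

func main() {
	draft := DraftRole
	fmt.Println(validateRoles([]*ModelRole{nil, &draft})) // <nil>
	fmt.Println(validateRoles([]*ModelRole{nil, nil}))    // error: two mains
}
```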

type ModelSource

type ModelSource struct {
	// ModelHub represents the model registry for model downloads.
	// +optional
	ModelHub *ModelHub `json:"modelHub,omitempty"`
	// URI represents various kinds of model sources following the URI protocol, e.g.:
	// - OSS: oss://<bucket>.<endpoint>/<path-to-your-model>
	//
	// +optional
	URI *URIProtocol `json:"uri,omitempty"`
}

ModelSource represents the source of the model. Only one model source will be used.
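The "only one model source will be used" rule can be sketched as a one-of check. `validateSource` and the trimmed-down local types are hypothetical illustrations, not this package's code:

```go
package main

import (
	"errors"
	"fmt"
)

type URIProtocol string

type ModelHub struct{ ModelID string }

// ModelSource mirrors the API type with only the fields needed here.
type ModelSource struct {
	ModelHub *ModelHub
	URI      *URIProtocol
}

// validateSource sketches the one-of rule: exactly one of ModelHub or URI
// must be set; neither or both is invalid.
func validateSource(s ModelSource) error {
	if (s.ModelHub != nil) == (s.URI != nil) {
		return errors.New("exactly one of modelHub or uri must be set")
	}
	return nil
}

func main() {
	uri := URIProtocol("oss://bucket.endpoint/models/llama")
	fmt.Println(validateSource(ModelSource{URI: &uri}))                        // <nil>
	fmt.Println(validateSource(ModelSource{URI: &uri, ModelHub: &ModelHub{}})) // error
}
```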

func (*ModelSource) DeepCopy

func (in *ModelSource) DeepCopy() *ModelSource

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelSource.

func (*ModelSource) DeepCopyInto

func (in *ModelSource) DeepCopyInto(out *ModelSource)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelSpec

type ModelSpec struct {
	// FamilyName represents the model type, like llama2, which will be auto-injected
	// into the labels under the key `llmaz.io/model-family-name`.
	FamilyName ModelName `json:"familyName"`
	// Source represents the source of the model; there are several ways to load
	// the model, such as from Huggingface, an OCI registry, S3, or a host path.
	Source ModelSource `json:"source"`
	// InferenceFlavors represents the accelerator requirements to serve the model.
	// Flavors are fungible following the priority represented by the slice order.
	// +kubebuilder:validation:MaxItems=8
	// +optional
	InferenceFlavors []Flavor `json:"inferenceFlavors,omitempty"`
}

ModelSpec defines the desired state of Model
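Flavor fungibility follows the slice order, so selection can be sketched as a first-match walk against a node's labels. `pickFlavor` is a hypothetical illustration of that priority semantics, not the scheduler plugin itself:

```go
package main

import "fmt"

// Flavor mirrors the API type with only the fields needed here.
type Flavor struct {
	Name         string
	NodeSelector map[string]string
}

// pickFlavor walks the flavors in priority (slice) order and returns the
// first one whose nodeSelector is satisfied by the node's labels; an empty
// selector matches every node, as the NodeSelector docs describe.
func pickFlavor(flavors []Flavor, nodeLabels map[string]string) (string, bool) {
	for _, f := range flavors {
		matched := true
		for k, v := range f.NodeSelector {
			if nodeLabels[k] != v {
				matched = false
				break
			}
		}
		if matched {
			return f.Name, true
		}
	}
	return "", false
}

func main() {
	flavors := []Flavor{
		{Name: "a100", NodeSelector: map[string]string{"gpu": "a100"}},
		{Name: "t4", NodeSelector: map[string]string{"gpu": "t4"}},
	}
	name, ok := pickFlavor(flavors, map[string]string{"gpu": "t4"})
	fmt.Println(name, ok) // t4 true
}
```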

func (*ModelSpec) DeepCopy

func (in *ModelSpec) DeepCopy() *ModelSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelSpec.

func (*ModelSpec) DeepCopyInto

func (in *ModelSpec) DeepCopyInto(out *ModelSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelStatus

type ModelStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

ModelStatus defines the observed state of Model

func (*ModelStatus) DeepCopy

func (in *ModelStatus) DeepCopy() *ModelStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelStatus.

func (*ModelStatus) DeepCopyInto

func (in *ModelStatus) DeepCopyInto(out *ModelStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type OpenModel

type OpenModel struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ModelSpec   `json:"spec,omitempty"`
	Status ModelStatus `json:"status,omitempty"`
}

OpenModel is the Schema for the open models API

func (*OpenModel) DeepCopy

func (in *OpenModel) DeepCopy() *OpenModel

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new OpenModel.

func (*OpenModel) DeepCopyInto

func (in *OpenModel) DeepCopyInto(out *OpenModel)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*OpenModel) DeepCopyObject

func (in *OpenModel) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type OpenModelList

type OpenModelList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []OpenModel `json:"items"`
}

OpenModelList contains a list of OpenModel

func (*OpenModelList) DeepCopy

func (in *OpenModelList) DeepCopy() *OpenModelList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new OpenModelList.

func (*OpenModelList) DeepCopyInto

func (in *OpenModelList) DeepCopyInto(out *OpenModelList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*OpenModelList) DeepCopyObject

func (in *OpenModelList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type URIProtocol

type URIProtocol string

URIProtocol represents the protocol of the URI.
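A URIProtocol value such as oss://&lt;bucket&gt;.&lt;endpoint&gt;/&lt;path-to-your-model&gt; splits into a protocol and an address at the "://" separator. `splitURI` is a hypothetical helper for illustration, not part of this package:

```go
package main

import (
	"fmt"
	"strings"
)

// splitURI splits a URI-protocol string into its protocol and address parts
// at the first "://"; ok is false when the separator is absent.
func splitURI(uri string) (protocol, address string, ok bool) {
	parts := strings.SplitN(uri, "://", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	p, addr, ok := splitURI("oss://bucket.endpoint/models/llama")
	fmt.Println(p, addr, ok) // oss bucket.endpoint/models/llama true
}
```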
