serving

package
v0.26.2
Published: Dec 7, 2023 License: Apache-2.0 Imports: 9 Imported by: 6

Documentation

Overview

These APIs allow you to manage Apps, Serving Endpoints, etc.

Types

type AppEvents added in v0.25.0

type AppEvents struct {
	EventName string `json:"event_name,omitempty"`

	EventTime string `json:"event_time,omitempty"`

	EventType string `json:"event_type,omitempty"`

	Message string `json:"message,omitempty"`

	ServiceName string `json:"service_name,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (AppEvents) MarshalJSON added in v0.25.0

func (s AppEvents) MarshalJSON() ([]byte, error)

func (*AppEvents) UnmarshalJSON added in v0.25.0

func (s *AppEvents) UnmarshalJSON(b []byte) error

type AppManifest added in v0.24.0

type AppManifest struct {
	// Workspace dependencies.
	Dependencies []any `json:"dependencies,omitempty"`
	// application description
	Description string `json:"description,omitempty"`
	// Ingress rules for app public endpoints
	Ingress any `json:"ingress,omitempty"`
	// Only a-z and dashes (-). Max length of 30.
	Name string `json:"name,omitempty"`
	// Container private registry
	Registry any `json:"registry,omitempty"`
	// list of app services. Restricted to one for now.
	Services any `json:"services,omitempty"`
	// The manifest format version. Must be set to 1.
	Version any `json:"version,omitempty"`

	ForceSendFields []string `json:"-"`
}
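
The field comments above state two checkable constraints: the name may contain only a-z and dashes with a maximum length of 30, and the version must be set to 1. A minimal sketch of a client-side check follows. `validateManifest` is a hypothetical helper, not part of this package, and it assumes the version is supplied as a Go `int` and that an empty name is invalid (the source does not say either way).

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// manifestNameRE encodes the documented rule: only a-z and dashes.
var manifestNameRE = regexp.MustCompile(`^[a-z-]+$`)

// validateManifest is a hypothetical helper (not provided by the SDK).
// Assumption: version arrives as an int; a value decoded from JSON
// would be a float64 and would need its own case.
func validateManifest(name string, version any) error {
	if len(name) == 0 || len(name) > 30 {
		return errors.New("name must be 1 to 30 characters long")
	}
	if !manifestNameRE.MatchString(name) {
		return errors.New("name may contain only a-z and dashes")
	}
	if v, ok := version.(int); !ok || v != 1 {
		return errors.New("version must be set to 1")
	}
	return nil
}

func main() {
	fmt.Println(validateManifest("my-app", 1))
	fmt.Println(validateManifest("My_App", 1))
}
```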

func (AppManifest) MarshalJSON added in v0.24.0

func (s AppManifest) MarshalJSON() ([]byte, error)

func (*AppManifest) UnmarshalJSON added in v0.24.0

func (s *AppManifest) UnmarshalJSON(b []byte) error

type AppServiceStatus added in v0.25.0

type AppServiceStatus struct {
	Deployment any `json:"deployment,omitempty"`

	Name string `json:"name,omitempty"`

	Template any `json:"template,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (AppServiceStatus) MarshalJSON added in v0.25.0

func (s AppServiceStatus) MarshalJSON() ([]byte, error)

func (*AppServiceStatus) UnmarshalJSON added in v0.25.0

func (s *AppServiceStatus) UnmarshalJSON(b []byte) error

type AppsAPI added in v0.24.0

type AppsAPI struct {
	// contains filtered or unexported fields
}

Lakehouse Apps run directly on a customer’s Databricks instance, integrate with their data, use and extend Databricks services, and enable users to interact through single sign-on.

func NewApps added in v0.24.0

func NewApps(client *client.DatabricksClient) *AppsAPI

func (*AppsAPI) Create added in v0.24.0

func (a *AppsAPI) Create(ctx context.Context, request DeployAppRequest) (*DeploymentStatus, error)

Creates and deploys an application.

func (*AppsAPI) DeleteApp added in v0.25.0

func (a *AppsAPI) DeleteApp(ctx context.Context, request DeleteAppRequest) (*DeleteAppResponse, error)

Deletes an application definition.

func (*AppsAPI) DeleteAppByName added in v0.25.0

func (a *AppsAPI) DeleteAppByName(ctx context.Context, name string) (*DeleteAppResponse, error)

Deletes the application definition with the given name.

func (*AppsAPI) GetApp added in v0.25.0

func (a *AppsAPI) GetApp(ctx context.Context, request GetAppRequest) (*GetAppResponse, error)

Gets an application definition.

func (*AppsAPI) GetAppByName added in v0.25.0

func (a *AppsAPI) GetAppByName(ctx context.Context, name string) (*GetAppResponse, error)

Gets the application definition with the given name.

func (*AppsAPI) GetAppDeploymentStatus added in v0.25.0

func (a *AppsAPI) GetAppDeploymentStatus(ctx context.Context, request GetAppDeploymentStatusRequest) (*DeploymentStatus, error)

Gets the deployment status for an application.

func (*AppsAPI) GetAppDeploymentStatusByDeploymentId added in v0.25.0

func (a *AppsAPI) GetAppDeploymentStatusByDeploymentId(ctx context.Context, deploymentId string) (*DeploymentStatus, error)

Gets the deployment status for the given deployment ID.

func (*AppsAPI) GetApps added in v0.25.0

func (a *AppsAPI) GetApps(ctx context.Context) (*ListAppsResponse, error)

Lists all available applications.

func (*AppsAPI) GetEvents added in v0.25.0

func (a *AppsAPI) GetEvents(ctx context.Context, request GetEventsRequest) (*ListAppEventsResponse, error)

Gets deployment events for an application.

func (*AppsAPI) GetEventsByName added in v0.25.0

func (a *AppsAPI) GetEventsByName(ctx context.Context, name string) (*ListAppEventsResponse, error)

Gets deployment events for the application with the given name.

func (*AppsAPI) Impl added in v0.24.0

func (a *AppsAPI) Impl() AppsService

Impl returns the low-level Apps API implementation.

func (*AppsAPI) WithImpl added in v0.24.0

func (a *AppsAPI) WithImpl(impl AppsService) *AppsAPI

WithImpl can be used to override the low-level API implementation for unit testing, for example with github.com/golang/mock or another mocking framework.

type AppsService added in v0.24.0

type AppsService interface {

	// Create and deploy an application.
	//
	// Creates and deploys an application.
	Create(ctx context.Context, request DeployAppRequest) (*DeploymentStatus, error)

	// Delete an application.
	//
	// Delete an application definition
	DeleteApp(ctx context.Context, request DeleteAppRequest) (*DeleteAppResponse, error)

	// Get definition for an application.
	//
	// Get an application definition
	GetApp(ctx context.Context, request GetAppRequest) (*GetAppResponse, error)

	// Get deployment status for an application.
	//
	// Get deployment status for an application
	GetAppDeploymentStatus(ctx context.Context, request GetAppDeploymentStatusRequest) (*DeploymentStatus, error)

	// List all applications.
	//
	// List all available applications
	GetApps(ctx context.Context) (*ListAppsResponse, error)

	// Get deployment events for an application.
	//
	// Get deployment events for an application
	GetEvents(ctx context.Context, request GetEventsRequest) (*ListAppEventsResponse, error)
}

Lakehouse Apps run directly on a customer’s Databricks instance, integrate with their data, use and extend Databricks services, and enable users to interact through single sign-on.
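
Because AppsService is an interface, a hand-written stub can stand in for the real service in tests. The sketch below is self-contained and does not import the SDK: `DeploymentStatus`, `GetAppDeploymentStatusRequest`, and the state constants are trimmed local mirrors of the types on this page, while `statusGetter`, `fakeApps`, and `waitForDeployment` are hypothetical names. It also assumes that SUCCESS and FAILURE are the terminal states, which the source does not state explicitly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Local mirrors of the documented types, trimmed to what the sketch needs.
type DeploymentStatusState string

const (
	StateDeploying DeploymentStatusState = "DEPLOYING"
	StateSuccess   DeploymentStatusState = "SUCCESS"
	StateFailure   DeploymentStatusState = "FAILURE"
)

type DeploymentStatus struct {
	DeploymentId string
	State        DeploymentStatusState
}

type GetAppDeploymentStatusRequest struct{ DeploymentId string }

// statusGetter is the single AppsService method this sketch depends on.
type statusGetter interface {
	GetAppDeploymentStatus(ctx context.Context, request GetAppDeploymentStatusRequest) (*DeploymentStatus, error)
}

// fakeApps replays a scripted (non-empty) list of states, repeating the last one.
type fakeApps struct {
	states []DeploymentStatusState
	calls  int
}

func (f *fakeApps) GetAppDeploymentStatus(ctx context.Context, request GetAppDeploymentStatusRequest) (*DeploymentStatus, error) {
	i := f.calls
	if i >= len(f.states) {
		i = len(f.states) - 1
	}
	f.calls++
	return &DeploymentStatus{DeploymentId: request.DeploymentId, State: f.states[i]}, nil
}

// waitForDeployment polls until the state is SUCCESS or FAILURE, or ctx ends.
func waitForDeployment(ctx context.Context, svc statusGetter, id string, interval time.Duration) (*DeploymentStatus, error) {
	for {
		st, err := svc.GetAppDeploymentStatus(ctx, GetAppDeploymentStatusRequest{DeploymentId: id})
		if err != nil {
			return nil, err
		}
		switch st.State {
		case StateSuccess:
			return st, nil
		case StateFailure:
			return st, errors.New("deployment failed")
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(interval):
		}
	}
}

// runDemo drives the poller against a stub that succeeds on the third call.
func runDemo() (DeploymentStatusState, int, error) {
	fake := &fakeApps{states: []DeploymentStatusState{StateDeploying, StateDeploying, StateSuccess}}
	st, err := waitForDeployment(context.Background(), fake, "d-1", time.Millisecond)
	if err != nil {
		return "", fake.calls, err
	}
	return st.State, fake.calls, nil
}

func main() {
	state, calls, err := runDemo()
	fmt.Println(state, calls, err)
}
```

With the real package, the same stub shape would be passed to WithImpl, provided it implements every method of AppsService.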

type BuildLogsRequest

type BuildLogsRequest struct {
	// The name of the serving endpoint that the served model belongs to. This
	// field is required.
	Name string `json:"-" url:"-"`
	// The name of the served model that build logs will be retrieved for. This
	// field is required.
	ServedModelName string `json:"-" url:"-"`
}

Retrieve the logs associated with building the model's environment for a given serving endpoint's served model.

type BuildLogsResponse

type BuildLogsResponse struct {
	// The logs associated with building the served model's environment.
	Logs string `json:"logs"`
}

type CreateServingEndpoint

type CreateServingEndpoint struct {
	// The core config of the serving endpoint.
	Config EndpointCoreConfigInput `json:"config"`
	// The name of the serving endpoint. This field is required and must be
	// unique across a Databricks workspace. An endpoint name can consist of
	// alphanumeric characters, dashes, and underscores.
	Name string `json:"name"`
	// Tags to be attached to the serving endpoint and automatically propagated
	// to billing logs.
	Tags []EndpointTag `json:"tags,omitempty"`
}
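
To make the JSON tags above concrete, the following sketch encodes a create request with the standard library. The lowercase structs are local mirrors that copy only a subset of the documented fields and tags; the real types may marshal differently because some of them define their own MarshalJSON and ForceSendFields handling. The endpoint and model names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of a subset of the documented fields and JSON tags.
type servedModelInput struct {
	ModelName          string `json:"model_name"`
	ModelVersion       string `json:"model_version"`
	ScaleToZeroEnabled bool   `json:"scale_to_zero_enabled"`
	WorkloadSize       string `json:"workload_size"`
}

type endpointCoreConfigInput struct {
	Name         string             `json:"-"` // excluded from the JSON body
	ServedModels []servedModelInput `json:"served_models"`
}

type endpointTag struct {
	Key   string `json:"key"`
	Value string `json:"value,omitempty"`
}

type createServingEndpoint struct {
	Config endpointCoreConfigInput `json:"config"`
	Name   string                  `json:"name"`
	Tags   []endpointTag           `json:"tags,omitempty"`
}

// encodeCreate builds a single-model request and returns its JSON body.
func encodeCreate(name, model, version string) (string, error) {
	req := createServingEndpoint{
		Name: name,
		Config: endpointCoreConfigInput{
			ServedModels: []servedModelInput{{
				ModelName:          model,
				ModelVersion:       version,
				ScaleToZeroEnabled: true,
				WorkloadSize:       "Small",
			}},
		},
	}
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	body, err := encodeCreate("fraud-scorer", "fraud", "3")
	fmt.Println(body, err)
}
```

Because `tags` carries `omitempty`, the key is absent from the body when no tags are set, while `config` and `name` are always present.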

type DataframeSplitInput added in v0.21.0

type DataframeSplitInput struct {
	Columns []any `json:"columns,omitempty"`

	Data []any `json:"data,omitempty"`

	Index []int `json:"index,omitempty"`
}

type DeleteAppRequest added in v0.24.0

type DeleteAppRequest struct {
	// The name of an application. This field is required.
	Name string `json:"-" url:"-"`
}

Delete an application

type DeleteAppResponse added in v0.25.0

type DeleteAppResponse struct {
	Name string `json:"name,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (DeleteAppResponse) MarshalJSON added in v0.25.0

func (s DeleteAppResponse) MarshalJSON() ([]byte, error)

func (*DeleteAppResponse) UnmarshalJSON added in v0.25.0

func (s *DeleteAppResponse) UnmarshalJSON(b []byte) error

type DeleteServingEndpointRequest

type DeleteServingEndpointRequest struct {
	// The name of the serving endpoint. This field is required.
	Name string `json:"-" url:"-"`
}

Delete a serving endpoint

type DeployAppRequest added in v0.24.0

type DeployAppRequest struct {
	// Manifest that specifies the application requirements
	Manifest AppManifest `json:"manifest"`
	// Information passed at app deployment time to fulfill app dependencies
	Resources any `json:"resources,omitempty"`
}

type DeploymentStatus added in v0.24.0

type DeploymentStatus struct {
	// Container logs.
	ContainerLogs []any `json:"container_logs,omitempty"`
	// The deployment id for the application.
	DeploymentId string `json:"deployment_id,omitempty"`
	// Supplementary information about the pod.
	ExtraInfo string `json:"extra_info,omitempty"`
	// State: one of DEPLOYING, SUCCESS, FAILURE, DEPLOYMENT_STATE_UNSPECIFIED
	State DeploymentStatusState `json:"state,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (DeploymentStatus) MarshalJSON added in v0.24.0

func (s DeploymentStatus) MarshalJSON() ([]byte, error)

func (*DeploymentStatus) UnmarshalJSON added in v0.24.0

func (s *DeploymentStatus) UnmarshalJSON(b []byte) error

type DeploymentStatusState added in v0.24.0

type DeploymentStatusState string

State: one of DEPLOYING, SUCCESS, FAILURE, DEPLOYMENT_STATE_UNSPECIFIED

const DeploymentStatusStateDeploying DeploymentStatusState = `DEPLOYING`
const DeploymentStatusStateDeploymentStateUnspecified DeploymentStatusState = `DEPLOYMENT_STATE_UNSPECIFIED`
const DeploymentStatusStateFailure DeploymentStatusState = `FAILURE`
const DeploymentStatusStateSuccess DeploymentStatusState = `SUCCESS`

func (*DeploymentStatusState) Set added in v0.24.0

func (f *DeploymentStatusState) Set(v string) error

Set raw string value and validate it against allowed values

func (*DeploymentStatusState) String added in v0.24.0

func (f *DeploymentStatusState) String() string

String representation for fmt.Print

func (*DeploymentStatusState) Type added in v0.24.0

func (f *DeploymentStatusState) Type() string

Type always returns DeploymentStatusState to satisfy [pflag.Value] interface
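
The three methods above are what a flag library expects from a value type. The sketch below re-implements them on a local copy of the type to show the validation behaviour that Set describes; it is an illustration written for this page, not the package's source, and the error text is an assumption.

```go
package main

import "fmt"

// DeploymentStatusState is a local copy of the documented string type.
type DeploymentStatusState string

// Set stores v only if it is one of the four documented values.
func (f *DeploymentStatusState) Set(v string) error {
	switch v {
	case "DEPLOYING", "DEPLOYMENT_STATE_UNSPECIFIED", "FAILURE", "SUCCESS":
		*f = DeploymentStatusState(v)
		return nil
	default:
		return fmt.Errorf("value %q is not one of DEPLOYING, DEPLOYMENT_STATE_UNSPECIFIED, FAILURE, SUCCESS", v)
	}
}

// String returns the raw string value.
func (f *DeploymentStatusState) String() string { return string(*f) }

// Type returns the type name, as described for the pflag.Value interface.
func (f *DeploymentStatusState) Type() string { return "DeploymentStatusState" }

func main() {
	var s DeploymentStatusState
	fmt.Println(s.Set("SUCCESS"), s.String())
	fmt.Println(s.Set("DONE"), s.String()) // rejected; the previous value is kept
}
```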

type EndpointCoreConfigInput

type EndpointCoreConfigInput struct {
	// The name of the serving endpoint to update. This field is required.
	Name string `json:"-" url:"-"`
	// A list of served models for the endpoint to serve. A serving endpoint can
	// have up to 10 served models.
	ServedModels []ServedModelInput `json:"served_models"`
	// The traffic config defining how invocations to the serving endpoint
	// should be routed.
	TrafficConfig *TrafficConfig `json:"traffic_config,omitempty"`
}

type EndpointCoreConfigOutput

type EndpointCoreConfigOutput struct {
	// The config version that the serving endpoint is currently serving.
	ConfigVersion int `json:"config_version,omitempty"`
	// The list of served models under the serving endpoint config.
	ServedModels []ServedModelOutput `json:"served_models,omitempty"`
	// The traffic configuration associated with the serving endpoint config.
	TrafficConfig *TrafficConfig `json:"traffic_config,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (EndpointCoreConfigOutput) MarshalJSON added in v0.23.0

func (s EndpointCoreConfigOutput) MarshalJSON() ([]byte, error)

func (*EndpointCoreConfigOutput) UnmarshalJSON added in v0.23.0

func (s *EndpointCoreConfigOutput) UnmarshalJSON(b []byte) error

type EndpointCoreConfigSummary

type EndpointCoreConfigSummary struct {
	// The list of served models under the serving endpoint config.
	ServedModels []ServedModelSpec `json:"served_models,omitempty"`
}

type EndpointPendingConfig

type EndpointPendingConfig struct {
	// The config version that the serving endpoint is currently serving.
	ConfigVersion int `json:"config_version,omitempty"`
	// The list of served models belonging to the last issued update to the
	// serving endpoint.
	ServedModels []ServedModelOutput `json:"served_models,omitempty"`
	// The timestamp when the update to the pending config started.
	StartTime int64 `json:"start_time,omitempty"`
	// The traffic config defining how invocations to the serving endpoint
	// should be routed.
	TrafficConfig *TrafficConfig `json:"traffic_config,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (EndpointPendingConfig) MarshalJSON added in v0.23.0

func (s EndpointPendingConfig) MarshalJSON() ([]byte, error)

func (*EndpointPendingConfig) UnmarshalJSON added in v0.23.0

func (s *EndpointPendingConfig) UnmarshalJSON(b []byte) error

type EndpointState

type EndpointState struct {
	// The state of an endpoint's config update. This informs the user if the
	// pending_config is in progress, if the update failed, or if there is no
	// update in progress. Note that if the endpoint's config_update state value
	// is IN_PROGRESS, another update cannot be made until the update completes
	// or fails.
	ConfigUpdate EndpointStateConfigUpdate `json:"config_update,omitempty"`
	// The state of an endpoint, indicating whether or not the endpoint is
	// queryable. An endpoint is READY if all of the served models in its active
	// configuration are ready. If any of the actively served models are in a
	// non-ready state, the endpoint state will be NOT_READY.
	Ready EndpointStateReady `json:"ready,omitempty"`
}
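
The readiness rule in the comment above (READY only if every actively served model is ready) can be written as a small reduction. This is a sketch of the rule as documented, not the service's implementation: `endpointReady` is a hypothetical helper, and treating an empty model list as READY is an assumption the source does not cover.

```go
package main

import "fmt"

// Deployment state strings as documented for ServedModelStateDeployment.
const (
	deploymentReady    = "DEPLOYMENT_READY"
	deploymentCreating = "DEPLOYMENT_CREATING"
)

// endpointReady returns "READY" only if every served model state is
// DEPLOYMENT_READY, and "NOT_READY" otherwise.
// Assumption: an empty list yields "READY" (vacuously true).
func endpointReady(servedModelStates []string) string {
	for _, s := range servedModelStates {
		if s != deploymentReady {
			return "NOT_READY"
		}
	}
	return "READY"
}

func main() {
	fmt.Println(endpointReady([]string{deploymentReady, deploymentReady}))
	fmt.Println(endpointReady([]string{deploymentReady, deploymentCreating}))
}
```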

type EndpointStateConfigUpdate

type EndpointStateConfigUpdate string

The state of an endpoint's config update. This informs the user if the pending_config is in progress, if the update failed, or if there is no update in progress. Note that if the endpoint's config_update state value is IN_PROGRESS, another update cannot be made until the update completes or fails.

const EndpointStateConfigUpdateInProgress EndpointStateConfigUpdate = `IN_PROGRESS`
const EndpointStateConfigUpdateNotUpdating EndpointStateConfigUpdate = `NOT_UPDATING`
const EndpointStateConfigUpdateUpdateFailed EndpointStateConfigUpdate = `UPDATE_FAILED`

func (*EndpointStateConfigUpdate) Set

func (f *EndpointStateConfigUpdate) Set(v string) error

Set raw string value and validate it against allowed values

func (*EndpointStateConfigUpdate) String

func (f *EndpointStateConfigUpdate) String() string

String representation for fmt.Print

func (*EndpointStateConfigUpdate) Type

func (f *EndpointStateConfigUpdate) Type() string

Type always returns EndpointStateConfigUpdate to satisfy [pflag.Value] interface

type EndpointStateReady

type EndpointStateReady string

The state of an endpoint, indicating whether or not the endpoint is queryable. An endpoint is READY if all of the served models in its active configuration are ready. If any of the actively served models are in a non-ready state, the endpoint state will be NOT_READY.

const EndpointStateReadyNotReady EndpointStateReady = `NOT_READY`
const EndpointStateReadyReady EndpointStateReady = `READY`

func (*EndpointStateReady) Set

func (f *EndpointStateReady) Set(v string) error

Set raw string value and validate it against allowed values

func (*EndpointStateReady) String

func (f *EndpointStateReady) String() string

String representation for fmt.Print

func (*EndpointStateReady) Type

func (f *EndpointStateReady) Type() string

Type always returns EndpointStateReady to satisfy [pflag.Value] interface

type EndpointTag added in v0.20.0

type EndpointTag struct {
	// Key field for a serving endpoint tag.
	Key string `json:"key"`
	// Optional value field for a serving endpoint tag.
	Value string `json:"value,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (EndpointTag) MarshalJSON added in v0.23.0

func (s EndpointTag) MarshalJSON() ([]byte, error)

func (*EndpointTag) UnmarshalJSON added in v0.23.0

func (s *EndpointTag) UnmarshalJSON(b []byte) error

type ExportMetricsRequest

type ExportMetricsRequest struct {
	// The name of the serving endpoint to retrieve metrics for. This field is
	// required.
	Name string `json:"-" url:"-"`
}

Retrieve the metrics associated with a serving endpoint

type GetAppDeploymentStatusRequest added in v0.25.0

type GetAppDeploymentStatusRequest struct {
	// The deployment id for an application. This field is required.
	DeploymentId string `json:"-" url:"-"`
	// Boolean flag to include application logs
	IncludeAppLog string `json:"-" url:"include_app_log,omitempty"`

	ForceSendFields []string `json:"-"`
}

Get deployment status for an application

func (GetAppDeploymentStatusRequest) MarshalJSON added in v0.25.0

func (s GetAppDeploymentStatusRequest) MarshalJSON() ([]byte, error)

func (*GetAppDeploymentStatusRequest) UnmarshalJSON added in v0.25.0

func (s *GetAppDeploymentStatusRequest) UnmarshalJSON(b []byte) error

type GetAppRequest added in v0.24.0

type GetAppRequest struct {
	// The name of an application. This field is required.
	Name string `json:"-" url:"-"`
}

Get definition for an application

type GetAppResponse added in v0.25.0

type GetAppResponse struct {
	CurrentServices []AppServiceStatus `json:"current_services,omitempty"`

	Name string `json:"name,omitempty"`

	PendingServices []AppServiceStatus `json:"pending_services,omitempty"`

	Url string `json:"url,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (GetAppResponse) MarshalJSON added in v0.25.0

func (s GetAppResponse) MarshalJSON() ([]byte, error)

func (*GetAppResponse) UnmarshalJSON added in v0.25.0

func (s *GetAppResponse) UnmarshalJSON(b []byte) error

type GetEventsRequest added in v0.25.0

type GetEventsRequest struct {
	// The name of an application. This field is required.
	Name string `json:"-" url:"-"`
}

Get deployment events for an application

type GetServingEndpointPermissionLevelsRequest added in v0.15.0

type GetServingEndpointPermissionLevelsRequest struct {
	// The serving endpoint for which to get or manage permissions.
	ServingEndpointId string `json:"-" url:"-"`
}

Get serving endpoint permission levels

type GetServingEndpointPermissionLevelsResponse added in v0.15.0

type GetServingEndpointPermissionLevelsResponse struct {
	// Specific permission levels
	PermissionLevels []ServingEndpointPermissionsDescription `json:"permission_levels,omitempty"`
}

type GetServingEndpointPermissionsRequest added in v0.15.0

type GetServingEndpointPermissionsRequest struct {
	// The serving endpoint for which to get or manage permissions.
	ServingEndpointId string `json:"-" url:"-"`
}

Get serving endpoint permissions

type GetServingEndpointRequest

type GetServingEndpointRequest struct {
	// The name of the serving endpoint. This field is required.
	Name string `json:"-" url:"-"`
}

Get a single serving endpoint

type ListAppEventsResponse added in v0.25.0

type ListAppEventsResponse struct {
	// App events
	Events []AppEvents `json:"events,omitempty"`
}

type ListAppsResponse added in v0.25.0

type ListAppsResponse struct {
	// Available apps.
	Apps []any `json:"apps,omitempty"`

	NextPageToken string `json:"next_page_token,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ListAppsResponse) MarshalJSON added in v0.25.0

func (s ListAppsResponse) MarshalJSON() ([]byte, error)

func (*ListAppsResponse) UnmarshalJSON added in v0.25.0

func (s *ListAppsResponse) UnmarshalJSON(b []byte) error

type ListEndpointsResponse

type ListEndpointsResponse struct {
	// The list of endpoints.
	Endpoints []ServingEndpoint `json:"endpoints,omitempty"`
}

type LogsRequest

type LogsRequest struct {
	// The name of the serving endpoint that the served model belongs to. This
	// field is required.
	Name string `json:"-" url:"-"`
	// The name of the served model that logs will be retrieved for. This field
	// is required.
	ServedModelName string `json:"-" url:"-"`
}

Retrieve the most recent log lines associated with a given serving endpoint's served model

type PatchServingEndpointTags added in v0.20.0

type PatchServingEndpointTags struct {
	// List of endpoint tags to add
	AddTags []EndpointTag `json:"add_tags,omitempty"`
	// List of tag keys to delete
	DeleteTags []string `json:"delete_tags,omitempty"`
	// The name of the serving endpoint whose tags to patch. This field is
	// required.
	Name string `json:"-" url:"-"`
}

type QueryEndpointInput added in v0.21.0

type QueryEndpointInput struct {
	// Pandas Dataframe input in the records orientation.
	DataframeRecords []any `json:"dataframe_records,omitempty"`
	// Pandas Dataframe input in the split orientation.
	DataframeSplit *DataframeSplitInput `json:"dataframe_split,omitempty"`
	// Tensor-based input in columnar format.
	Inputs any `json:"inputs,omitempty"`
	// Tensor-based input in row format.
	Instances []any `json:"instances,omitempty"`
	// The name of the serving endpoint. This field is required.
	Name string `json:"-" url:"-"`
}
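
The records and split orientations carry the same rows in different shapes. As an illustration, the sketch below converts a list of records into the split form and encodes it with the JSON tags shown above. The lowercase structs are local mirrors of the documented fields, and `recordsToSplit`, `encodeQuery`, and `demo` are hypothetical helpers; the endpoint name is made up and is excluded from the body by its `json:"-"` tag.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented fields and JSON tags.
type dataframeSplitInput struct {
	Columns []any `json:"columns,omitempty"`
	Data    []any `json:"data,omitempty"`
	Index   []int `json:"index,omitempty"`
}

type queryEndpointInput struct {
	DataframeRecords []any                `json:"dataframe_records,omitempty"`
	DataframeSplit   *dataframeSplitInput `json:"dataframe_split,omitempty"`
	Inputs           any                  `json:"inputs,omitempty"`
	Instances        []any                `json:"instances,omitempty"`
	Name             string               `json:"-"`
}

// recordsToSplit turns row-wise records into the split orientation,
// using the given column order for every row.
func recordsToSplit(columns []string, records []map[string]any) *dataframeSplitInput {
	split := &dataframeSplitInput{}
	for _, c := range columns {
		split.Columns = append(split.Columns, c)
	}
	for _, rec := range records {
		row := make([]any, 0, len(columns))
		for _, c := range columns {
			row = append(row, rec[c]) // a missing key becomes nil (JSON null)
		}
		split.Data = append(split.Data, row)
	}
	return split
}

// encodeQuery returns the JSON body for a split-orientation query.
func encodeQuery(split *dataframeSplitInput) (string, error) {
	b, err := json.Marshal(queryEndpointInput{Name: "my-endpoint", DataframeSplit: split})
	return string(b), err
}

// demo converts two records with columns a and b and encodes them.
func demo() (string, error) {
	records := []map[string]any{{"a": 1, "b": "x"}, {"a": 2, "b": "y"}}
	return encodeQuery(recordsToSplit([]string{"a", "b"}, records))
}

func main() {
	body, err := demo()
	fmt.Println(body, err)
}
```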

type QueryEndpointResponse

type QueryEndpointResponse struct {
	// The predictions returned by the serving endpoint.
	Predictions []any `json:"predictions"`
}

type Route

type Route struct {
	// The name of the served model this route configures traffic for.
	ServedModelName string `json:"served_model_name"`
	// The percentage of endpoint traffic to send to this route. It must be an
	// integer between 0 and 100 inclusive.
	TrafficPercentage int `json:"traffic_percentage"`
}
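
The only constraint stated above is that each traffic percentage is an integer between 0 and 100 inclusive. A minimal sketch of that check follows; `validateRoutes` is a hypothetical helper working on a local copy of Route. It deliberately does not check how the percentages add up across routes, because the source says nothing about that.

```go
package main

import "fmt"

// Route is a local copy of the documented struct.
type Route struct {
	ServedModelName   string `json:"served_model_name"`
	TrafficPercentage int    `json:"traffic_percentage"`
}

// validateRoutes reports the first route whose percentage is outside 0..100.
func validateRoutes(routes []Route) error {
	for _, r := range routes {
		if r.TrafficPercentage < 0 || r.TrafficPercentage > 100 {
			return fmt.Errorf("route %q: traffic percentage %d is outside 0..100",
				r.ServedModelName, r.TrafficPercentage)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateRoutes([]Route{{"model-a", 90}, {"model-b", 10}}))
	fmt.Println(validateRoutes([]Route{{"model-a", 120}}))
}
```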

type ServedModelInput

type ServedModelInput struct {
	// An object containing a set of optional, user-specified environment
	// variable key-value pairs used for serving this model. Note: this is an
	// experimental feature and subject to change. Example model environment
	// variables that refer to Databricks secrets: `{"OPENAI_API_KEY":
	// "{{secrets/my_scope/my_key}}", "DATABRICKS_TOKEN":
	// "{{secrets/my_scope2/my_key2}}"}`
	EnvironmentVars map[string]string `json:"environment_vars,omitempty"`
	// ARN of the instance profile that the served model will use to access AWS
	// resources.
	InstanceProfileArn string `json:"instance_profile_arn,omitempty"`
	// The name of the model in Databricks Model Registry to be served or if the
	// model resides in Unity Catalog, the full name of model, in the form of
	// __catalog_name__.__schema_name__.__model_name__.
	ModelName string `json:"model_name"`
	// The version of the model in Databricks Model Registry or Unity Catalog to
	// be served.
	ModelVersion string `json:"model_version"`
	// The name of a served model. It must be unique across an endpoint. If not
	// specified, this field will default to <model-name>-<model-version>. A
	// served model name can consist of alphanumeric characters, dashes, and
	// underscores.
	Name string `json:"name,omitempty"`
	// Whether the compute resources for the served model should scale down to
	// zero.
	ScaleToZeroEnabled bool `json:"scale_to_zero_enabled"`
	// The workload size of the served model. The workload size corresponds to a
	// range of provisioned concurrency that the compute will autoscale between.
	// A single unit of provisioned concurrency can process one request at a
	// time. Valid workload sizes are "Small" (4 - 4 provisioned concurrency),
	// "Medium" (8 - 16 provisioned concurrency), and "Large" (16 - 64
	// provisioned concurrency). If scale-to-zero is enabled, the lower bound of
	// the provisioned concurrency for each workload size will be 0.
	WorkloadSize string `json:"workload_size"`
	// The workload type of the served model. The workload type selects which
	// type of compute to use in the endpoint. The default value for this
	// parameter is "CPU". For deep learning workloads, GPU acceleration is
	// available by selecting workload types like GPU_SMALL and others. See
	// documentation for all options.
	WorkloadType string `json:"workload_type,omitempty"`

	ForceSendFields []string `json:"-"`
}
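
Two rules from the field comments above lend themselves to a short worked example: the provisioned concurrency range per workload size (with the lower bound dropping to 0 when scale-to-zero is enabled), and the default served model name of the form <model-name>-<model-version>. The helpers below are hypothetical and only restate those documented rules; the service applies them itself.

```go
package main

import "fmt"

// concurrencyRange is the autoscaling range for one workload size.
type concurrencyRange struct{ Min, Max int }

// provisionedConcurrency returns the documented range for a workload size.
// With scale-to-zero enabled, the lower bound becomes 0.
func provisionedConcurrency(size string, scaleToZero bool) (concurrencyRange, error) {
	ranges := map[string]concurrencyRange{
		"Small":  {Min: 4, Max: 4},
		"Medium": {Min: 8, Max: 16},
		"Large":  {Min: 16, Max: 64},
	}
	r, ok := ranges[size]
	if !ok {
		return concurrencyRange{}, fmt.Errorf("unknown workload size %q", size)
	}
	if scaleToZero {
		r.Min = 0
	}
	return r, nil
}

// defaultServedModelName mirrors the documented default for an unset Name.
func defaultServedModelName(modelName, modelVersion string) string {
	return modelName + "-" + modelVersion
}

func main() {
	r, err := provisionedConcurrency("Medium", true)
	fmt.Println(r, err)
	fmt.Println(defaultServedModelName("fraud", "3"))
}
```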

func (ServedModelInput) MarshalJSON added in v0.23.0

func (s ServedModelInput) MarshalJSON() ([]byte, error)

func (*ServedModelInput) UnmarshalJSON added in v0.23.0

func (s *ServedModelInput) UnmarshalJSON(b []byte) error

type ServedModelOutput

type ServedModelOutput struct {
	// The creation timestamp of the served model in Unix time.
	CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
	// The email of the user who created the served model.
	Creator string `json:"creator,omitempty"`
	// An object containing a set of optional, user-specified environment
	// variable key-value pairs used for serving this model. Note: this is an
	// experimental feature and subject to change. Example model environment
	// variables that refer to Databricks secrets: `{"OPENAI_API_KEY":
	// "{{secrets/my_scope/my_key}}", "DATABRICKS_TOKEN":
	// "{{secrets/my_scope2/my_key2}}"}`
	EnvironmentVars map[string]string `json:"environment_vars,omitempty"`
	// ARN of the instance profile that the served model will use to access AWS
	// resources.
	InstanceProfileArn string `json:"instance_profile_arn,omitempty"`
	// The name of the model in Databricks Model Registry or the full name of
	// the model in Unity Catalog.
	ModelName string `json:"model_name,omitempty"`
	// The version of the model in Databricks Model Registry or Unity Catalog to
	// be served.
	ModelVersion string `json:"model_version,omitempty"`
	// The name of the served model.
	Name string `json:"name,omitempty"`
	// Whether the compute resources for the Served Model should scale down to
	// zero.
	ScaleToZeroEnabled bool `json:"scale_to_zero_enabled,omitempty"`
	// Information corresponding to the state of the Served Model.
	State *ServedModelState `json:"state,omitempty"`
	// The workload size of the served model. The workload size corresponds to a
	// range of provisioned concurrency that the compute will autoscale between.
	// A single unit of provisioned concurrency can process one request at a
	// time. Valid workload sizes are "Small" (4 - 4 provisioned concurrency),
	// "Medium" (8 - 16 provisioned concurrency), and "Large" (16 - 64
	// provisioned concurrency). If scale-to-zero is enabled, the lower bound of
	// the provisioned concurrency for each workload size will be 0.
	WorkloadSize string `json:"workload_size,omitempty"`
	// The workload type of the served model. The workload type selects which
	// type of compute to use in the endpoint. The default value for this
	// parameter is "CPU". For deep learning workloads, GPU acceleration is
	// available by selecting workload types like GPU_SMALL and others. See
	// documentation for all options.
	WorkloadType string `json:"workload_type,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServedModelOutput) MarshalJSON added in v0.23.0

func (s ServedModelOutput) MarshalJSON() ([]byte, error)

func (*ServedModelOutput) UnmarshalJSON added in v0.23.0

func (s *ServedModelOutput) UnmarshalJSON(b []byte) error

type ServedModelSpec

type ServedModelSpec struct {
	// The name of the model in Databricks Model Registry or the full name of
	// the model in Unity Catalog.
	ModelName string `json:"model_name,omitempty"`
	// The version of the model in Databricks Model Registry or Unity Catalog to
	// be served.
	ModelVersion string `json:"model_version,omitempty"`
	// The name of the served model.
	Name string `json:"name,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServedModelSpec) MarshalJSON added in v0.23.0

func (s ServedModelSpec) MarshalJSON() ([]byte, error)

func (*ServedModelSpec) UnmarshalJSON added in v0.23.0

func (s *ServedModelSpec) UnmarshalJSON(b []byte) error

type ServedModelState

type ServedModelState struct {
	// The state of the served model deployment. DEPLOYMENT_CREATING indicates
	// that the served model is not ready yet because the deployment is still
	// being created (e.g., the container image is building or the model server
	// is deploying for the first time). DEPLOYMENT_RECOVERING indicates that
	// the served model was previously in a ready state but no longer is and is
	// attempting to recover. DEPLOYMENT_READY indicates that the served model
	// is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was
	// an error trying to bring up the served model (e.g., the container image
	// build failed or the model server failed to start due to a model loading
	// error). DEPLOYMENT_ABORTED indicates that the deployment was terminated,
	// likely due to a failure in bringing up another served model under the
	// same endpoint and config version.
	Deployment ServedModelStateDeployment `json:"deployment,omitempty"`
	// More information about the state of the served model, if available.
	DeploymentStateMessage string `json:"deployment_state_message,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServedModelState) MarshalJSON added in v0.23.0

func (s ServedModelState) MarshalJSON() ([]byte, error)

func (*ServedModelState) UnmarshalJSON added in v0.23.0

func (s *ServedModelState) UnmarshalJSON(b []byte) error

type ServedModelStateDeployment

type ServedModelStateDeployment string

The state of the served model deployment. DEPLOYMENT_CREATING indicates that the served model is not ready yet because the deployment is still being created (e.g., the container image is building or the model server is deploying for the first time). DEPLOYMENT_RECOVERING indicates that the served model was previously in a ready state but no longer is and is attempting to recover. DEPLOYMENT_READY indicates that the served model is ready to receive traffic. DEPLOYMENT_FAILED indicates that there was an error trying to bring up the served model (e.g., the container image build failed or the model server failed to start due to a model loading error). DEPLOYMENT_ABORTED indicates that the deployment was terminated, likely due to a failure in bringing up another served model under the same endpoint and config version.

const ServedModelStateDeploymentDeploymentAborted ServedModelStateDeployment = `DEPLOYMENT_ABORTED`
const ServedModelStateDeploymentDeploymentCreating ServedModelStateDeployment = `DEPLOYMENT_CREATING`
const ServedModelStateDeploymentDeploymentFailed ServedModelStateDeployment = `DEPLOYMENT_FAILED`
const ServedModelStateDeploymentDeploymentReady ServedModelStateDeployment = `DEPLOYMENT_READY`
const ServedModelStateDeploymentDeploymentRecovering ServedModelStateDeployment = `DEPLOYMENT_RECOVERING`

func (*ServedModelStateDeployment) Set

func (f *ServedModelStateDeployment) Set(v string) error

Set raw string value and validate it against allowed values

func (*ServedModelStateDeployment) String

func (f *ServedModelStateDeployment) String() string

String representation for fmt.Print

func (*ServedModelStateDeployment) Type

func (f *ServedModelStateDeployment) Type() string

Type always returns ServedModelStateDeployment to satisfy [pflag.Value] interface

type ServerLogsResponse

type ServerLogsResponse struct {
	// The most recent log lines of the model server processing invocation
	// requests.
	Logs string `json:"logs"`
}

type ServingEndpoint

type ServingEndpoint struct {
	// The config that is currently being served by the endpoint.
	Config *EndpointCoreConfigSummary `json:"config,omitempty"`
	// The timestamp when the endpoint was created in Unix time.
	CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
	// The email of the user who created the serving endpoint.
	Creator string `json:"creator,omitempty"`
	// System-generated ID of the endpoint. This is used to refer to the
	// endpoint in the Permissions API
	Id string `json:"id,omitempty"`
	// The timestamp when the endpoint was last updated by a user in Unix time.
	LastUpdatedTimestamp int64 `json:"last_updated_timestamp,omitempty"`
	// The name of the serving endpoint.
	Name string `json:"name,omitempty"`
	// Information corresponding to the state of the serving endpoint.
	State *EndpointState `json:"state,omitempty"`
	// Tags attached to the serving endpoint.
	Tags []EndpointTag `json:"tags,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpoint) MarshalJSON added in v0.23.0

func (s ServingEndpoint) MarshalJSON() ([]byte, error)

func (*ServingEndpoint) UnmarshalJSON added in v0.23.0

func (s *ServingEndpoint) UnmarshalJSON(b []byte) error

type ServingEndpointAccessControlRequest added in v0.15.0

type ServingEndpointAccessControlRequest struct {
	// name of the group
	GroupName string `json:"group_name,omitempty"`
	// Permission level
	PermissionLevel ServingEndpointPermissionLevel `json:"permission_level,omitempty"`
	// Application ID of an active service principal. Setting this field
	// requires the `servicePrincipal/user` role.
	ServicePrincipalName string `json:"service_principal_name,omitempty"`
	// name of the user
	UserName string `json:"user_name,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointAccessControlRequest) MarshalJSON added in v0.23.0

func (s ServingEndpointAccessControlRequest) MarshalJSON() ([]byte, error)

func (*ServingEndpointAccessControlRequest) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointAccessControlRequest) UnmarshalJSON(b []byte) error

type ServingEndpointAccessControlResponse added in v0.15.0

type ServingEndpointAccessControlResponse struct {
	// All permissions.
	AllPermissions []ServingEndpointPermission `json:"all_permissions,omitempty"`
	// Display name of the user or service principal.
	DisplayName string `json:"display_name,omitempty"`
	// name of the group
	GroupName string `json:"group_name,omitempty"`
	// Name of the service principal.
	ServicePrincipalName string `json:"service_principal_name,omitempty"`
	// name of the user
	UserName string `json:"user_name,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointAccessControlResponse) MarshalJSON added in v0.23.0

func (s ServingEndpointAccessControlResponse) MarshalJSON() ([]byte, error)

func (*ServingEndpointAccessControlResponse) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointAccessControlResponse) UnmarshalJSON(b []byte) error

type ServingEndpointDetailed

type ServingEndpointDetailed struct {
	// The config that is currently being served by the endpoint.
	Config *EndpointCoreConfigOutput `json:"config,omitempty"`
	// The timestamp when the endpoint was created in Unix time.
	CreationTimestamp int64 `json:"creation_timestamp,omitempty"`
	// The email of the user who created the serving endpoint.
	Creator string `json:"creator,omitempty"`
	// System-generated ID of the endpoint. This is used to refer to the
	// endpoint in the Permissions API
	Id string `json:"id,omitempty"`
	// The timestamp when the endpoint was last updated by a user in Unix time.
	LastUpdatedTimestamp int64 `json:"last_updated_timestamp,omitempty"`
	// The name of the serving endpoint.
	Name string `json:"name,omitempty"`
	// The config that the endpoint is attempting to update to.
	PendingConfig *EndpointPendingConfig `json:"pending_config,omitempty"`
	// The permission level of the principal making the request.
	PermissionLevel ServingEndpointDetailedPermissionLevel `json:"permission_level,omitempty"`
	// Information corresponding to the state of the serving endpoint.
	State *EndpointState `json:"state,omitempty"`
	// Tags attached to the serving endpoint.
	Tags []EndpointTag `json:"tags,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointDetailed) MarshalJSON added in v0.23.0

func (s ServingEndpointDetailed) MarshalJSON() ([]byte, error)

func (*ServingEndpointDetailed) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointDetailed) UnmarshalJSON(b []byte) error

type ServingEndpointDetailedPermissionLevel

type ServingEndpointDetailedPermissionLevel string

The permission level of the principal making the request.

const ServingEndpointDetailedPermissionLevelCanManage ServingEndpointDetailedPermissionLevel = `CAN_MANAGE`
const ServingEndpointDetailedPermissionLevelCanQuery ServingEndpointDetailedPermissionLevel = `CAN_QUERY`
const ServingEndpointDetailedPermissionLevelCanView ServingEndpointDetailedPermissionLevel = `CAN_VIEW`

func (*ServingEndpointDetailedPermissionLevel) Set

Set raw string value and validate it against allowed values

func (*ServingEndpointDetailedPermissionLevel) String

String representation for fmt.Print

func (*ServingEndpointDetailedPermissionLevel) Type

Type always returns ServingEndpointDetailedPermissionLevel to satisfy [pflag.Value] interface

type ServingEndpointPermission added in v0.15.0

type ServingEndpointPermission struct {
	Inherited bool `json:"inherited,omitempty"`

	InheritedFromObject []string `json:"inherited_from_object,omitempty"`
	// Permission level
	PermissionLevel ServingEndpointPermissionLevel `json:"permission_level,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointPermission) MarshalJSON added in v0.23.0

func (s ServingEndpointPermission) MarshalJSON() ([]byte, error)

func (*ServingEndpointPermission) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointPermission) UnmarshalJSON(b []byte) error

type ServingEndpointPermissionLevel added in v0.15.0

type ServingEndpointPermissionLevel string

Permission level

const ServingEndpointPermissionLevelCanManage ServingEndpointPermissionLevel = `CAN_MANAGE`
const ServingEndpointPermissionLevelCanQuery ServingEndpointPermissionLevel = `CAN_QUERY`
const ServingEndpointPermissionLevelCanView ServingEndpointPermissionLevel = `CAN_VIEW`

func (*ServingEndpointPermissionLevel) Set added in v0.15.0

Set raw string value and validate it against allowed values

func (*ServingEndpointPermissionLevel) String added in v0.15.0

String representation for fmt.Print

func (*ServingEndpointPermissionLevel) Type added in v0.15.0

Type always returns ServingEndpointPermissionLevel to satisfy [pflag.Value] interface
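
The Set, String, and Type methods let a string enum act as a command-line flag value. A minimal sketch of that pattern follows, using a hypothetical local type `permissionLevel` rather than the SDK type; the method bodies and error text are illustrative assumptions.

```go
package main

import "fmt"

// permissionLevel is a hypothetical stand-in for ServingEndpointPermissionLevel.
type permissionLevel string

const (
	canManage permissionLevel = "CAN_MANAGE"
	canQuery  permissionLevel = "CAN_QUERY"
	canView   permissionLevel = "CAN_VIEW"
)

// String returns the raw value, which is what fmt.Print would show.
func (p *permissionLevel) String() string { return string(*p) }

// Set stores v only if it is one of the allowed values.
func (p *permissionLevel) Set(v string) error {
	switch permissionLevel(v) {
	case canManage, canQuery, canView:
		*p = permissionLevel(v)
		return nil
	default:
		return fmt.Errorf("value must be one of CAN_MANAGE, CAN_QUERY, CAN_VIEW, got %q", v)
	}
}

// Type returns a fixed type name, as a flag-value interface expects.
func (p *permissionLevel) Type() string { return "permissionLevel" }
```

A rejected value leaves the previous one in place, so a caller can report the error without resetting the flag.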

type ServingEndpointPermissions added in v0.15.0

type ServingEndpointPermissions struct {
	AccessControlList []ServingEndpointAccessControlResponse `json:"access_control_list,omitempty"`

	ObjectId string `json:"object_id,omitempty"`

	ObjectType string `json:"object_type,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointPermissions) MarshalJSON added in v0.23.0

func (s ServingEndpointPermissions) MarshalJSON() ([]byte, error)

func (*ServingEndpointPermissions) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointPermissions) UnmarshalJSON(b []byte) error

type ServingEndpointPermissionsDescription added in v0.15.0

type ServingEndpointPermissionsDescription struct {
	Description string `json:"description,omitempty"`
	// Permission level
	PermissionLevel ServingEndpointPermissionLevel `json:"permission_level,omitempty"`

	ForceSendFields []string `json:"-"`
}

func (ServingEndpointPermissionsDescription) MarshalJSON added in v0.23.0

func (s ServingEndpointPermissionsDescription) MarshalJSON() ([]byte, error)

func (*ServingEndpointPermissionsDescription) UnmarshalJSON added in v0.23.0

func (s *ServingEndpointPermissionsDescription) UnmarshalJSON(b []byte) error

type ServingEndpointPermissionsRequest added in v0.15.0

type ServingEndpointPermissionsRequest struct {
	AccessControlList []ServingEndpointAccessControlRequest `json:"access_control_list,omitempty"`
	// The serving endpoint for which to get or manage permissions.
	ServingEndpointId string `json:"-" url:"-"`
}
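
The `json:"-"` tag on ServingEndpointId keeps the endpoint ID out of the request body, and `omitempty` drops unset principals. The sketch below illustrates both with hypothetical local copies of the two request structs; `requestBody` is a helper invented for this example and the user name is a placeholder.

```go
package main

import "encoding/json"

// accessControlRequest mirrors the fields of ServingEndpointAccessControlRequest.
type accessControlRequest struct {
	GroupName            string `json:"group_name,omitempty"`
	PermissionLevel      string `json:"permission_level,omitempty"`
	ServicePrincipalName string `json:"service_principal_name,omitempty"`
	UserName             string `json:"user_name,omitempty"`
}

// permissionsRequest mirrors ServingEndpointPermissionsRequest.
type permissionsRequest struct {
	AccessControlList []accessControlRequest `json:"access_control_list,omitempty"`
	ServingEndpointId string                 `json:"-"`
}

// requestBody returns the JSON that would be sent as the request body.
func requestBody(r permissionsRequest) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}
```

Encoding a request with one user entry yields only `access_control_list`; the endpoint ID never appears in the JSON.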

type ServingEndpointsAPI

type ServingEndpointsAPI struct {
	// contains filtered or unexported fields
}

The Serving Endpoints API allows you to create, update, and delete model serving endpoints.

You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served models. A serving endpoint can have at most ten served models. You can configure traffic settings to define how requests should be routed to your served models behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served model.

func NewServingEndpoints

func NewServingEndpoints(client *client.DatabricksClient) *ServingEndpointsAPI

func (*ServingEndpointsAPI) BuildLogs

Retrieve the logs associated with building the model's environment for a given serving endpoint's served model.

Retrieves the build logs associated with the provided served model.

func (*ServingEndpointsAPI) BuildLogsByNameAndServedModelName

func (a *ServingEndpointsAPI) BuildLogsByNameAndServedModelName(ctx context.Context, name string, servedModelName string) (*BuildLogsResponse, error)

Retrieve the logs associated with building the model's environment for a given serving endpoint's served model.

Retrieves the build logs associated with the provided served model.

func (*ServingEndpointsAPI) Create

Create a new serving endpoint.

func (*ServingEndpointsAPI) CreateAndWait deprecated

func (a *ServingEndpointsAPI) CreateAndWait(ctx context.Context, createServingEndpoint CreateServingEndpoint, options ...retries.Option[ServingEndpointDetailed]) (*ServingEndpointDetailed, error)

Calls ServingEndpointsAPI.Create and waits to reach NOT_UPDATING state

You can override the default timeout of 20 minutes by adding the retries.Timeout[ServingEndpointDetailed](60*time.Minute) functional option.

Deprecated: use ServingEndpointsAPI.Create.Get() or ServingEndpointsAPI.WaitGetServingEndpointNotUpdating

func (*ServingEndpointsAPI) Delete

Delete a serving endpoint.

func (*ServingEndpointsAPI) DeleteByName

func (a *ServingEndpointsAPI) DeleteByName(ctx context.Context, name string) error

Delete a serving endpoint.

func (*ServingEndpointsAPI) ExportMetrics

func (a *ServingEndpointsAPI) ExportMetrics(ctx context.Context, request ExportMetricsRequest) error

Retrieve the metrics associated with a serving endpoint.

Retrieves the metrics associated with the provided serving endpoint in either Prometheus or OpenMetrics exposition format.

func (*ServingEndpointsAPI) ExportMetricsByName

func (a *ServingEndpointsAPI) ExportMetricsByName(ctx context.Context, name string) error

Retrieve the metrics associated with a serving endpoint.

Retrieves the metrics associated with the provided serving endpoint in either Prometheus or OpenMetrics exposition format.

func (*ServingEndpointsAPI) Get

Get a single serving endpoint.

Retrieves the details for a single serving endpoint.

func (*ServingEndpointsAPI) GetByName

Get a single serving endpoint.

Retrieves the details for a single serving endpoint.

func (*ServingEndpointsAPI) GetPermissionLevels added in v0.19.0

Get serving endpoint permission levels.

Gets the permission levels that a user can have on an object.

func (*ServingEndpointsAPI) GetPermissionLevelsByServingEndpointId added in v0.19.0

func (a *ServingEndpointsAPI) GetPermissionLevelsByServingEndpointId(ctx context.Context, servingEndpointId string) (*GetServingEndpointPermissionLevelsResponse, error)

Get serving endpoint permission levels.

Gets the permission levels that a user can have on an object.

func (*ServingEndpointsAPI) GetPermissions added in v0.19.0

Get serving endpoint permissions.

Gets the permissions of a serving endpoint. Serving endpoints can inherit permissions from their root object.

func (*ServingEndpointsAPI) GetPermissionsByServingEndpointId added in v0.19.0

func (a *ServingEndpointsAPI) GetPermissionsByServingEndpointId(ctx context.Context, servingEndpointId string) (*ServingEndpointPermissions, error)

Get serving endpoint permissions.

Gets the permissions of a serving endpoint. Serving endpoints can inherit permissions from their root object.

func (*ServingEndpointsAPI) Impl

Impl returns low-level ServingEndpoints API implementation

func (*ServingEndpointsAPI) List

Retrieve all serving endpoints.

This method is generated by Databricks SDK Code Generator.

func (*ServingEndpointsAPI) ListAll added in v0.10.0

Retrieve all serving endpoints.

This method is generated by Databricks SDK Code Generator.

func (*ServingEndpointsAPI) Logs

Retrieve the most recent log lines associated with a given serving endpoint's served model.

Retrieves the service logs associated with the provided served model.

func (*ServingEndpointsAPI) LogsByNameAndServedModelName

func (a *ServingEndpointsAPI) LogsByNameAndServedModelName(ctx context.Context, name string, servedModelName string) (*ServerLogsResponse, error)

Retrieve the most recent log lines associated with a given serving endpoint's served model.

Retrieves the service logs associated with the provided served model.

func (*ServingEndpointsAPI) Patch added in v0.20.0

Patch the tags of a serving endpoint.

Used to batch add and delete tags from a serving endpoint with a single API call.

func (*ServingEndpointsAPI) Query

Query a serving endpoint with provided model input.

func (*ServingEndpointsAPI) SetPermissions added in v0.19.0

Set serving endpoint permissions.

Sets permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.

func (*ServingEndpointsAPI) UpdateConfig

Update a serving endpoint with a new config.

Updates any combination of the serving endpoint's served models, the compute configuration of those served models, and the endpoint's traffic config. An endpoint that already has an update in progress cannot be updated until the current update completes or fails.

func (*ServingEndpointsAPI) UpdateConfigAndWait deprecated

func (a *ServingEndpointsAPI) UpdateConfigAndWait(ctx context.Context, endpointCoreConfigInput EndpointCoreConfigInput, options ...retries.Option[ServingEndpointDetailed]) (*ServingEndpointDetailed, error)

Calls ServingEndpointsAPI.UpdateConfig and waits to reach NOT_UPDATING state

You can override the default timeout of 20 minutes by adding the retries.Timeout[ServingEndpointDetailed](60*time.Minute) functional option.

Deprecated: use ServingEndpointsAPI.UpdateConfig.Get() or ServingEndpointsAPI.WaitGetServingEndpointNotUpdating

func (*ServingEndpointsAPI) UpdatePermissions added in v0.19.0

Update serving endpoint permissions.

Updates the permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.

func (*ServingEndpointsAPI) WaitGetServingEndpointNotUpdating added in v0.10.0

func (a *ServingEndpointsAPI) WaitGetServingEndpointNotUpdating(ctx context.Context, name string,
	timeout time.Duration, callback func(*ServingEndpointDetailed)) (*ServingEndpointDetailed, error)

WaitGetServingEndpointNotUpdating repeatedly calls ServingEndpointsAPI.Get and waits to reach NOT_UPDATING state

func (*ServingEndpointsAPI) WithImpl

WithImpl could be used to override low-level API implementations for unit testing purposes with github.com/golang/mock or other mocking frameworks.

type ServingEndpointsService

type ServingEndpointsService interface {

	// Retrieve the logs associated with building the model's environment for a
	// given serving endpoint's served model.
	//
	// Retrieves the build logs associated with the provided served model.
	BuildLogs(ctx context.Context, request BuildLogsRequest) (*BuildLogsResponse, error)

	// Create a new serving endpoint.
	Create(ctx context.Context, request CreateServingEndpoint) (*ServingEndpointDetailed, error)

	// Delete a serving endpoint.
	Delete(ctx context.Context, request DeleteServingEndpointRequest) error

	// Retrieve the metrics associated with a serving endpoint.
	//
	// Retrieves the metrics associated with the provided serving endpoint in
	// either Prometheus or OpenMetrics exposition format.
	ExportMetrics(ctx context.Context, request ExportMetricsRequest) error

	// Get a single serving endpoint.
	//
	// Retrieves the details for a single serving endpoint.
	Get(ctx context.Context, request GetServingEndpointRequest) (*ServingEndpointDetailed, error)

	// Get serving endpoint permission levels.
	//
	// Gets the permission levels that a user can have on an object.
	GetPermissionLevels(ctx context.Context, request GetServingEndpointPermissionLevelsRequest) (*GetServingEndpointPermissionLevelsResponse, error)

	// Get serving endpoint permissions.
	//
	// Gets the permissions of a serving endpoint. Serving endpoints can inherit
	// permissions from their root object.
	GetPermissions(ctx context.Context, request GetServingEndpointPermissionsRequest) (*ServingEndpointPermissions, error)

	// Retrieve all serving endpoints.
	//
	// Use ListAll() to get all ServingEndpoint instances
	List(ctx context.Context) (*ListEndpointsResponse, error)

	// Retrieve the most recent log lines associated with a given serving
	// endpoint's served model.
	//
	// Retrieves the service logs associated with the provided served model.
	Logs(ctx context.Context, request LogsRequest) (*ServerLogsResponse, error)

	// Patch the tags of a serving endpoint.
	//
	// Used to batch add and delete tags from a serving endpoint with a single
	// API call.
	Patch(ctx context.Context, request PatchServingEndpointTags) ([]EndpointTag, error)

	// Query a serving endpoint with provided model input.
	Query(ctx context.Context, request QueryEndpointInput) (*QueryEndpointResponse, error)

	// Set serving endpoint permissions.
	//
	// Sets permissions on a serving endpoint. Serving endpoints can inherit
	// permissions from their root object.
	SetPermissions(ctx context.Context, request ServingEndpointPermissionsRequest) (*ServingEndpointPermissions, error)

	// Update a serving endpoint with a new config.
	//
	// Updates any combination of the serving endpoint's served models, the
	// compute configuration of those served models, and the endpoint's traffic
	// config. An endpoint that already has an update in progress cannot be
	// updated until the current update completes or fails.
	UpdateConfig(ctx context.Context, request EndpointCoreConfigInput) (*ServingEndpointDetailed, error)

	// Update serving endpoint permissions.
	//
	// Updates the permissions on a serving endpoint. Serving endpoints can
	// inherit permissions from their root object.
	UpdatePermissions(ctx context.Context, request ServingEndpointPermissionsRequest) (*ServingEndpointPermissions, error)
}

The Serving Endpoints API allows you to create, update, and delete model serving endpoints.

You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served models. A serving endpoint can have at most ten served models. You can configure traffic settings to define how requests should be routed to your served models behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served model.

type TrafficConfig

type TrafficConfig struct {
	// The list of routes that define traffic to each served model.
	Routes []Route `json:"routes,omitempty"`
}
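
Before sending a traffic config, a client may want to sanity-check its routes. The sketch below assumes each route carries a served model name and an integer traffic percentage, and that the percentages must total 100; neither the Route fields nor that rule appear in this section, so treat `route` and `validateRoutes` as hypothetical.

```go
package main

import "fmt"

// route is a hypothetical stand-in for the Route type.
type route struct {
	ServedModelName   string
	TrafficPercentage int
}

// validateRoutes checks that every percentage is within 0..100 and that
// the percentages add up to exactly 100 (an assumed rule).
func validateRoutes(routes []route) error {
	total := 0
	for _, r := range routes {
		if r.TrafficPercentage < 0 || r.TrafficPercentage > 100 {
			return fmt.Errorf("route %q: percentage %d is outside 0..100",
				r.ServedModelName, r.TrafficPercentage)
		}
		total += r.TrafficPercentage
	}
	if total != 100 {
		return fmt.Errorf("traffic percentages total %d, want 100", total)
	}
	return nil
}
```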

type WaitGetServingEndpointNotUpdating added in v0.10.0

type WaitGetServingEndpointNotUpdating[R any] struct {
	Response *R
	Name     string `json:"name"`
	// contains filtered or unexported fields
}

WaitGetServingEndpointNotUpdating is a wrapper that calls ServingEndpointsAPI.WaitGetServingEndpointNotUpdating and waits to reach NOT_UPDATING state.

func (*WaitGetServingEndpointNotUpdating[R]) Get added in v0.10.0

Get the ServingEndpointDetailed with the default timeout of 20 minutes.

func (*WaitGetServingEndpointNotUpdating[R]) GetWithTimeout added in v0.10.0

Get the ServingEndpointDetailed with custom timeout.

func (*WaitGetServingEndpointNotUpdating[R]) OnProgress added in v0.10.0

OnProgress invokes a callback every time it polls for the status update.
