Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var ErrNotAccountClient = errors.New("invalid Databricks Account configuration")
Functions ¶
func Must ¶
Must panics if error is not nil. It's intended to be used with databricks.NewWorkspaceClient and databricks.NewAccountClient.
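The panic-on-error pattern described above can be sketched as follows. This is a minimal illustration and not the SDK's own code: the listing does not show Must's signature, so the generic `must` helper, the `client` type, and the `newClient` constructor below are hypothetical local stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for a constructed SDK client.
type client struct{ host string }

// newClient is a hypothetical constructor that fails on an empty host.
func newClient(host string) (*client, error) {
	if host == "" {
		return nil, errors.New("host is required")
	}
	return &client{host: host}, nil
}

// must is a local stand-in for the documented behavior: it panics if err is
// not nil and otherwise returns the value unchanged.
func must[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}
	return v
}

func main() {
	// The constructor's (value, error) pair is passed straight to must.
	c := must(newClient("https://example.test"))
	fmt.Println(c.host)
}
```

A helper of this shape suits program start-up, where a misconfigured client is unrecoverable. In library code or request handlers, returning the error is usually the safer choice.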
func WithProduct ¶
func WithProduct(name, version string)
WithProduct is expected to be called by developers to differentiate their application from others.
For example:
func init() { databricks.WithProduct("your-product", "0.0.1") }
Types ¶
type AccountClient ¶
type AccountClient struct { Config *config.Config // This API allows you to download billable usage logs for the specified // account and date range. This feature works with all account types. BillableUsage *billing.BillableUsageAPI // These APIs manage budget configuration including notifications for // exceeding a budget for a period. They can also retrieve the status of // each budget. Budgets *billing.BudgetsAPI // These APIs manage credential configurations for this workspace. // Databricks needs access to a cross-account service IAM role in your AWS // account so that Databricks can deploy clusters in the appropriate VPC for // the new workspace. A credential configuration encapsulates this role // information, and its ID is used when creating a new workspace. Credentials *provisioning.CredentialsAPI // These APIs enable administrators to manage custom oauth app integrations, // which is required for adding/using Custom OAuth App Integration like // Tableau Cloud for Databricks in AWS cloud. // // **Note:** You can only add/use the OAuth custom application integrations // when OAuth enrollment status is enabled. For more details see // :method:OAuthEnrollment/create CustomAppIntegration *oauth2.CustomAppIntegrationAPI // These APIs manage encryption key configurations for this workspace // (optional). A key configuration encapsulates the AWS KMS key information // and some information about how the key configuration can be used. There // are two possible uses for key configurations: // // * Managed services: A key configuration can be used to encrypt a // workspace's notebook and secret data in the control plane, as well as // Databricks SQL queries and query history. * Storage: A key configuration // can be used to encrypt a workspace's DBFS and EBS data in the data plane. // // In both of these cases, the key configuration's ID is used when creating // a new workspace. 
This Preview feature is available if your account is on // the E2 version of the platform. Updating a running workspace with // workspace storage encryption requires that the workspace is on the E2 // version of the platform. If you have an older workspace, it might not be // on the E2 version of the platform. If you are not sure, contact your // Databricks representative. EncryptionKeys *provisioning.EncryptionKeysAPI // Groups simplify identity management, making it easier to assign access to // Databricks Account, data, and other securable objects. // // It is best practice to assign access to workspaces and access-control // policies in Unity Catalog to groups, instead of to users individually. // All Databricks Account identities can be assigned as members of groups, // and members inherit permissions that are assigned to their group. Groups *iam.AccountGroupsAPI // The Accounts IP Access List API enables account admins to configure IP // access lists for access to the account console. // // Account IP Access Lists affect web application access and REST API access // to the account console and account APIs. If the feature is disabled for // the account, all access is allowed for this account. There is support for // allow lists (inclusion) and block lists (exclusion). // // When a connection is attempted: 1. **First, all block lists are // checked.** If the connection IP address matches any block list, the // connection is rejected. 2. **If the connection was not rejected by block // lists**, the IP address is compared with the allow lists. // // If there is at least one allow list for the account, the connection is // allowed only if the IP address matches an allow list. If there are no // allow lists for the account, all IP addresses are allowed. // // For all allow lists and block lists combined, the account supports a // maximum of 1000 IP/CIDR values, where one CIDR counts as a single value. 
// // After changes to the account-level IP access lists, it can take a few // minutes for changes to take effect. IpAccessLists *settings.AccountIpAccessListsAPI // These APIs manage log delivery configurations for this account. The two // supported log types for this API are _billable usage logs_ and _audit // logs_. This feature is in Public Preview. This feature works with all // account ID types. // // Log delivery works with all account types. However, if your account is on // the E2 version of the platform or on a select custom plan that allows // multiple workspaces per account, you can optionally configure different // storage destinations for each workspace. Log delivery status is also // provided to know the latest status of log delivery attempts. The // high-level flow of billable usage delivery: // // 1. **Create storage**: In AWS, [create a new AWS S3 bucket] with a // specific bucket policy. Using Databricks APIs, call the Account API to // create a [storage configuration object](#operation/create-storage-config) // that uses the bucket name. 2. **Create credentials**: In AWS, create the // appropriate AWS IAM role. For full details, including the required IAM // role policies and trust relationship, see [Billable usage log delivery]. // Using Databricks APIs, call the Account API to create a [credential // configuration object](#operation/create-credential-config) that uses the // IAM role's ARN. 3. **Create log delivery configuration**: Using // Databricks APIs, call the Account API to [create a log delivery // configuration](#operation/create-log-delivery-config) that uses the // credential and storage configuration objects from previous steps. You can // specify if the logs should include all events of that log type in your // account (_Account level_ delivery) or only events for a specific set of // workspaces (_workspace level_ delivery). 
Account level log delivery // applies to all current and future workspaces plus account level logs, // while workspace level log delivery solely delivers logs related to the // specified workspaces. You can create multiple types of delivery // configurations per account. // // For billable usage delivery: * For more information about billable usage // logs, see [Billable usage log delivery]. For the CSV schema, see the // [Usage page]. * The delivery location is // `<bucket-name>/<prefix>/billable-usage/csv/`, where `<prefix>` is the // name of the optional delivery path prefix you set up during log delivery // configuration. Files are named // `workspaceId=<workspace-id>-usageMonth=<month>.csv`. * All billable usage // logs apply to specific workspaces (_workspace level_ logs). You can // aggregate usage for your entire account by creating an _account level_ // delivery configuration that delivers logs for all current and future // workspaces in your account. * The files are delivered daily by // overwriting the month's CSV file for each workspace. // // For audit log delivery: * For more information about audit log // delivery, see [Audit log delivery], which includes information about the // JSON schema used. * The delivery location is // `<bucket-name>/<delivery-path-prefix>/workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<internal-id>.json`. // Files may get overwritten with the same content multiple times to achieve // exactly-once delivery. * If the audit log delivery configuration included // specific workspace IDs, only _workspace-level_ audit logs for those // workspaces are delivered. If the log delivery configuration applies to // the entire account (_account level_ delivery configuration), the audit // log delivery includes workspace-level audit logs for all workspaces in // the account as well as account-level audit logs. See [Audit log delivery] // for details. * Auditable events are typically available in logs within 15 // minutes. 
// // [Audit log delivery]: https://docs.databricks.com/administration-guide/account-settings/audit-logs.html // [Billable usage log delivery]: https://docs.databricks.com/administration-guide/account-settings/billable-usage-delivery.html // [Usage page]: https://docs.databricks.com/administration-guide/account-settings/usage.html // [create a new AWS S3 bucket]: https://docs.databricks.com/administration-guide/account-api/aws-storage.html LogDelivery *billing.LogDeliveryAPI // These APIs manage metastore assignments to a workspace. MetastoreAssignments *catalog.AccountMetastoreAssignmentsAPI // These APIs manage Unity Catalog metastores for an account. A metastore // contains catalogs that can be associated with workspaces Metastores *catalog.AccountMetastoresAPI // These APIs manage network configurations for customer-managed VPCs // (optional). Its ID is used when creating a new workspace if you use // customer-managed VPCs. Networks *provisioning.NetworksAPI // These APIs enable administrators to enroll OAuth for their accounts, // which is required for adding/using any OAuth published/custom application // integration. // // **Note:** Your account must be on the E2 version to use these APIs, this // is because OAuth is only supported on the E2 version. OAuthEnrollment *oauth2.OAuthEnrollmentAPI // These APIs manage private access settings for this account. PrivateAccess *provisioning.PrivateAccessAPI // These APIs enable administrators to manage published oauth app // integrations, which is required for adding/using Published OAuth App // Integration like Tableau Cloud for Databricks in AWS cloud. // // **Note:** You can only add/use the OAuth published application // integrations when OAuth enrollment status is enabled. For more details // see :method:OAuthEnrollment/create PublishedAppIntegration *oauth2.PublishedAppIntegrationAPI // Identities for use with jobs, automated tools, and systems such as // scripts, apps, and CI/CD platforms. 
Databricks recommends creating // service principals to run production jobs or modify production data. If // all processes that act on production data run with service principals, // interactive users do not need any write, delete, or modify privileges in // production. This eliminates the risk of a user overwriting production // data by accident. ServicePrincipals *iam.AccountServicePrincipalsAPI // These APIs manage storage configurations for this workspace. A root // storage S3 bucket in your account is required to store objects like // cluster logs, notebook revisions, and job results. You can also use the // root storage S3 bucket for storage of non-production DBFS data. A storage // configuration encapsulates this bucket information, and its ID is used // when creating a new workspace. Storage *provisioning.StorageAPI // These APIs manage storage credentials for a particular metastore. StorageCredentials *catalog.AccountStorageCredentialsAPI // User identities recognized by Databricks and represented by email // addresses. // // Databricks recommends using SCIM provisioning to sync users and groups // automatically from your identity provider to your Databricks Account. // SCIM streamlines onboarding a new employee or team by using your identity // provider to create users and groups in Databricks Account and give them // the proper level of access. When a user leaves your organization or no // longer needs access to Databricks Account, admins can terminate the user // in your identity provider and that user’s account will also be removed // from Databricks Account. This ensures a consistent offboarding process // and prevents unauthorized users from accessing sensitive data. Users *iam.AccountUsersAPI // These APIs manage VPC endpoint configurations for this account. VpcEndpoints *provisioning.VpcEndpointsAPI // The Workspace Permission Assignment API allows you to manage workspace // permissions for principals in your account. 
WorkspaceAssignment *iam.WorkspaceAssignmentAPI // These APIs manage workspaces for this account. A Databricks workspace is // an environment for accessing all of your Databricks assets. The workspace // organizes objects (notebooks, libraries, and experiments) into folders, // and provides access to data and computational resources such as clusters // and jobs. // // These endpoints are available if your account is on the E2 version of the // platform or on a select custom plan that allows multiple workspaces per // account. Workspaces *provisioning.WorkspacesAPI }
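The evaluation order described in the IpAccessLists comment (block lists first, then allow lists) can be sketched as below. This is an illustration of the documented rules only, not the service's implementation. The function names are hypothetical, all block lists and all allow lists are flattened into two slices of IP/CIDR strings, and an empty allow slice stands for "no allow lists". The "feature disabled" case and the 1000-value limit are not modeled.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parseEntry accepts either a CIDR range or a single IP address.
// A single address is treated as a prefix covering exactly that address.
func parseEntry(s string) (netip.Prefix, error) {
	if p, err := netip.ParsePrefix(s); err == nil {
		return p, nil
	}
	a, err := netip.ParseAddr(s)
	if err != nil {
		return netip.Prefix{}, err
	}
	return netip.PrefixFrom(a, a.BitLen()), nil
}

// matchesAny reports whether ip falls inside any of the given entries.
func matchesAny(ip netip.Addr, entries []string) (bool, error) {
	for _, e := range entries {
		p, err := parseEntry(e)
		if err != nil {
			return false, fmt.Errorf("invalid entry %q: %w", e, err)
		}
		if p.Contains(ip) {
			return true, nil
		}
	}
	return false, nil
}

// allowed applies the documented order: a block-list match rejects the
// connection; otherwise the address must match an allow list, unless no
// allow lists exist, in which case every address is allowed.
func allowed(ipStr string, blockList, allowList []string) (bool, error) {
	ip, err := netip.ParseAddr(ipStr)
	if err != nil {
		return false, err
	}
	blocked, err := matchesAny(ip, blockList)
	if err != nil {
		return false, err
	}
	if blocked {
		return false, nil
	}
	if len(allowList) == 0 {
		return true, nil
	}
	return matchesAny(ip, allowList)
}

func main() {
	block := []string{"10.1.0.0/16"}
	allow := []string{"10.0.0.0/8"}
	for _, ip := range []string{"10.1.2.3", "10.2.0.1", "192.0.2.1"} {
		ok, err := allowed(ip, block, allow)
		fmt.Println(ip, ok, err)
	}
}
```

Checking block lists first means a block entry always wins over an overlapping allow entry, which matches the order stated in the comment.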
func NewAccountClient ¶
func NewAccountClient(c ...*Config) (*AccountClient, error)
NewAccountClient creates a new Databricks SDK client for Accounts, or returns an error if the configuration is invalid.
type WorkspaceClient ¶
type WorkspaceClient struct { Config *config.Config // The alerts API can be used to perform CRUD operations on alerts. An alert // is a Databricks SQL object that periodically runs a query, evaluates a // condition of its result, and notifies one or more users and/or // notification destinations if the condition was met. Alerts *sql.AlertsAPI // A catalog is the first layer of Unity Catalog’s three-level namespace. // It’s used to organize your data assets. Users can see all catalogs on // which they have been assigned the USE_CATALOG data permission. // // In Unity Catalog, admins and data stewards manage users and their access // to data centrally across all of the workspaces in a Databricks account. // Users in different workspaces can share access to the same data, // depending on privileges granted centrally in Unity Catalog. Catalogs *catalog.CatalogsAPI // Cluster policy limits the ability to configure clusters based on a set of // rules. The policy rules limit the attributes or attribute values // available for cluster creation. Cluster policies have ACLs that limit // their use to specific users and groups. // // Cluster policies let you limit users to create clusters with prescribed // settings, simplify the user interface and enable more users to create // their own clusters (by fixing and hiding some values), control cost by // limiting per cluster maximum cost (by setting limits on attributes whose // values contribute to hourly price). // // Cluster policy permissions limit which policies a user can select in the // Policy drop-down when the user creates a cluster: - A user who has // cluster create permission can select the Unrestricted policy and create // fully-configurable clusters. - A user who has both cluster create // permission and access to cluster policies can select the Unrestricted // policy and policies they have access to. - A user that has access to only // cluster policies, can select the policies they have access to. 
// // If no policies have been created in the workspace, the Policy drop-down // does not display. // // Only admin users can create, edit, and delete policies. Admin users also // have access to all policies. ClusterPolicies *compute.ClusterPoliciesAPI // The Clusters API allows you to create, start, edit, list, terminate, and // delete clusters. // // Databricks maps cluster node instance types to compute units known as // DBUs. See the instance type pricing page for a list of the supported // instance types and their corresponding DBUs. // // A Databricks cluster is a set of computation resources and configurations // on which you run data engineering, data science, and data analytics // workloads, such as production ETL pipelines, streaming analytics, ad-hoc // analytics, and machine learning. // // You run these workloads as a set of commands in a notebook or as an // automated job. Databricks makes a distinction between all-purpose // clusters and job clusters. You use all-purpose clusters to analyze data // collaboratively using interactive notebooks. You use job clusters to run // fast and robust automated jobs. // // You can create an all-purpose cluster using the UI, CLI, or REST API. You // can manually terminate and restart an all-purpose cluster. Multiple users // can share such clusters to do collaborative interactive analysis. // // IMPORTANT: Databricks retains cluster configuration information for up to // 200 all-purpose clusters terminated in the last 30 days and up to 30 job // clusters recently terminated by the job scheduler. To keep an all-purpose // cluster configuration even after it has been terminated for more than 30 // days, an administrator can pin a cluster to the cluster list. Clusters *compute.ClustersAPI // This API allows executing commands on running clusters. CommandExecutor compute.CommandExecutor // This API allows retrieving information about currently authenticated user // or service principal. 
CurrentUser *iam.CurrentUserAPI // In general, there is little need to modify dashboards using the API. // However, it can be useful to use dashboard objects to look up a // collection of related query IDs. The API can also be used to duplicate // multiple dashboards at once since you can get a dashboard definition with // a GET request and then POST it to create a new one. Dashboards *sql.DashboardsAPI // This API is provided to assist you in making new query objects. When // creating a query object, you may optionally specify a `data_source_id` // for the SQL warehouse against which it will run. If you don't already // know the `data_source_id` for your desired SQL warehouse, this API will // help you find it. // // This API does not support searches. It returns the full list of SQL // warehouses in your workspace. We advise you to use any text editor, REST // client, or `grep` to search the response from this API for the name of // your SQL warehouse as it appears in Databricks SQL. DataSources *sql.DataSourcesAPI // DBFS API makes it simple to interact with various data sources without // having to include a user's credentials every time to read a file. Dbfs *files.DbfsAPI // The SQL Permissions API is similar to the endpoints of the // :method:permissions/set. However, this exposes only one endpoint, which // gets the Access Control List for a given object. You cannot modify any // permissions using this API. // // There are three levels of permission: // // - `CAN_VIEW`: Allows read-only access // // - `CAN_RUN`: Allows read access and run access (superset of `CAN_VIEW`) // // - `CAN_MANAGE`: Allows all actions: read, run, edit, delete, modify // permissions (superset of `CAN_RUN`) DbsqlPermissions *sql.DbsqlPermissionsAPI Experiments *ml.ExperimentsAPI // An external location is an object that combines a cloud storage path with // a storage credential that authorizes access to the cloud storage path. 
// Each external location is subject to Unity Catalog access-control // policies that control which users and groups can access the credential. // If a user does not have access to an external location in Unity Catalog, // the request fails and Unity Catalog does not attempt to authenticate to // your cloud tenant on the user’s behalf. // // Databricks recommends using external locations rather than using storage // credentials directly. // // To create external locations, you must be a metastore admin or a user // with the **CREATE_EXTERNAL_LOCATION** privilege. ExternalLocations *catalog.ExternalLocationsAPI // Functions implement User-Defined Functions (UDFs) in Unity Catalog. // // The function implementation can be any SQL expression or Query, and it // can be invoked wherever a table reference is allowed in a query. In Unity // Catalog, a function resides at the same level as a table, so it can be // referenced with the form // __catalog_name__.__schema_name__.__function_name__. Functions *catalog.FunctionsAPI // Registers personal access token for Databricks to do operations on behalf // of the user. // // See [more info]. // // [more info]: https://docs.databricks.com/repos/get-access-tokens-from-git-provider.html GitCredentials *workspace.GitCredentialsAPI // The Global Init Scripts API enables Workspace administrators to configure // global initialization scripts for their workspace. These scripts run on // every node in every cluster in the workspace. // // **Important:** Existing clusters must be restarted to pick up any changes // made to global init scripts. Global init scripts are run in order. If the // init script returns with a bad exit code, the Apache Spark container // fails to launch and init scripts with later position are skipped. If // enough containers fail, the entire cluster fails with a // `GLOBAL_INIT_SCRIPT_FAILURE` error code. GlobalInitScripts *compute.GlobalInitScriptsAPI // In Unity Catalog, data is secure by default. 
Initially, users have no // access to data in a metastore. Access can be granted by either a // metastore admin, the owner of an object, or the owner of the catalog or // schema that contains the object. // // Securable objects in Unity Catalog are hierarchical and privileges are // inherited downward. This means that granting a privilege on the catalog // automatically grants the privilege to all current and future objects // within the catalog. Similarly, privileges granted on a schema are // inherited by all current and future objects within that schema. Grants *catalog.GrantsAPI // Groups simplify identity management, making it easier to assign access to // Databricks Workspace, data, and other securable objects. // // It is best practice to assign access to workspaces and access-control // policies in Unity Catalog to groups, instead of to users individually. // All Databricks Workspace identities can be assigned as members of groups, // and members inherit permissions that are assigned to their group. Groups *iam.GroupsAPI // The Instance Pools API is used to create, edit, delete, and list instance // pools, which use ready-to-use cloud instances to reduce cluster start // and auto-scaling times. // // Databricks pools reduce cluster start and auto-scaling times by // maintaining a set of idle, ready-to-use instances. When a cluster is // attached to a pool, cluster nodes are created using the pool’s idle // instances. If the pool has no idle instances, the pool expands by // allocating a new instance from the instance provider in order to // accommodate the cluster’s request. When a cluster releases an instance, // it returns to the pool and is free for another cluster to use. Only // clusters attached to a pool can use that pool’s idle instances. // // You can specify a different pool for the driver node and worker nodes, or // use the same pool for both. 
// // Databricks does not charge DBUs while instances are idle in the pool. // Instance provider billing does apply. See pricing. InstancePools *compute.InstancePoolsAPI // The Instance Profiles API allows admins to add, list, and remove instance // profiles that users can launch clusters with. Regular users can list the // instance profiles available to them. See [Secure access to S3 buckets] // using instance profiles for more information. // // [Secure access to S3 buckets]: https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html InstanceProfiles *compute.InstanceProfilesAPI // IP Access List enables admins to configure IP access lists. // // IP access lists affect web application access and REST API access to this // workspace only. If the feature is disabled for a workspace, all access is // allowed for this workspace. There is support for allow lists (inclusion) // and block lists (exclusion). // // When a connection is attempted: 1. **First, all block lists are // checked.** If the connection IP address matches any block list, the // connection is rejected. 2. **If the connection was not rejected by block // lists**, the IP address is compared with the allow lists. // // If there is at least one allow list for the workspace, the connection is // allowed only if the IP address matches an allow list. If there are no // allow lists for the workspace, all IP addresses are allowed. // // For all allow lists and block lists combined, the workspace supports a // maximum of 1000 IP/CIDR values, where one CIDR counts as a single value. // // After changes to the IP access list feature, it can take a few minutes // for changes to take effect. IpAccessLists *settings.IpAccessListsAPI // The Jobs API allows you to create, edit, and delete jobs. // // You can use a Databricks job to run a data processing or data analysis // task in a Databricks cluster with scalable resources. 
Your job can // consist of a single task or can be a large, multi-task workflow with // complex dependencies. Databricks manages the task orchestration, cluster // management, monitoring, and error reporting for all of your jobs. You can // run your jobs immediately or periodically through an easy-to-use // scheduling system. You can implement job tasks using notebooks, JARS, // Delta Live Tables pipelines, or Python, Scala, Spark submit, and Java // applications. // // You should never hard code secrets or store them in plain text. Use the // :service:secrets to manage secrets in the [Databricks CLI]. Use the // [Secrets utility] to reference secrets in notebooks and jobs. // // [Databricks CLI]: https://docs.databricks.com/dev-tools/cli/index.html // [Secrets utility]: https://docs.databricks.com/dev-tools/databricks-utils.html#dbutils-secrets Jobs *jobs.JobsAPI // The Libraries API allows you to install and uninstall libraries and get // the status of libraries on a cluster. // // To make third-party or custom code available to notebooks and jobs // running on your clusters, you can install a library. Libraries can be // written in Python, Java, Scala, and R. You can upload Java, Scala, and // Python libraries and point to external packages in PyPI, Maven, and CRAN // repositories. // // Cluster libraries can be used by all notebooks running on a cluster. You // can install a cluster library directly from a public repository such as // PyPI or Maven, using a previously installed workspace library, or using // an init script. // // When you install a library on a cluster, a notebook already attached to // that cluster will not immediately see the new library. You must first // detach and then reattach the notebook to the cluster. // // When you uninstall a library from a cluster, the library is removed only // when you restart the cluster. Until you restart the cluster, the status // of the uninstalled library appears as Uninstall pending restart. 
Libraries *compute.LibrariesAPI // A metastore is the top-level container of objects in Unity Catalog. It // stores data assets (tables and views) and the permissions that govern // access to them. Databricks account admins can create metastores and // assign them to Databricks workspaces to control which workloads use each // metastore. For a workspace to use Unity Catalog, it must have a Unity // Catalog metastore attached. // // Each metastore is configured with a root storage location in a cloud // storage account. This storage location is used for metadata and managed // tables data. // // NOTE: This metastore is distinct from the metastore included in // Databricks workspaces created before Unity Catalog was released. If your // workspace includes a legacy Hive metastore, the data in that metastore is // available in a catalog named hive_metastore. Metastores *catalog.MetastoresAPI ModelRegistry *ml.ModelRegistryAPI // The Permissions API is used to create, read, write, edit, update, and manage // access for various users on different objects and endpoints. Permissions *iam.PermissionsAPI // The Delta Live Tables API allows you to create, edit, delete, start, and // view details about pipelines. // // Delta Live Tables is a framework for building reliable, maintainable, and // testable data processing pipelines. You define the transformations to // perform on your data, and Delta Live Tables manages task orchestration, // cluster management, monitoring, data quality, and error handling. // // Instead of defining your data pipelines using a series of separate Apache // Spark tasks, Delta Live Tables manages how your data is transformed based // on a target schema you define for each processing step. You can also // enforce data quality with Delta Live Tables expectations. Expectations // allow you to define expected data quality and specify how to handle // records that fail those expectations. Pipelines *pipelines.PipelinesAPI // View available policy families. 
A policy family contains a policy // definition providing best practices for configuring clusters for a // particular use case. // // Databricks manages and provides policy families for several common // cluster use cases. You cannot create, edit, or delete policy families. // // Policy families cannot be used directly to create clusters. Instead, you // create cluster policies using a policy family. Cluster policies created // using a policy family inherit the policy family's policy definition. PolicyFamilies *compute.PolicyFamiliesAPI // Databricks Providers REST API Providers *sharing.ProvidersAPI // These endpoints are used for CRUD operations on query definitions. Query // definitions include the target SQL warehouse, query text, name, // description, tags, parameters, and visualizations. Queries *sql.QueriesAPI // Access the history of queries through SQL warehouses. QueryHistory *sql.QueryHistoryAPI // Databricks Recipient Activation REST API RecipientActivation *sharing.RecipientActivationAPI // Databricks Recipients REST API Recipients *sharing.RecipientsAPI // The Repos API allows users to manage their git repos. Users can use the // API to access all repos that they have manage permissions on. // // Databricks Repos is a visual Git client in Databricks. It supports common // Git operations such as cloning a repository, committing and pushing, // pulling, branch management, and visual comparison of diffs when // committing. // // Within Repos you can develop code in notebooks or other files and follow // data science and engineering code development best practices using Git // for version control, collaboration, and CI/CD. Repos *workspace.ReposAPI // A schema (also called a database) is the second layer of Unity // Catalog’s three-level namespace. A schema organizes tables, views and // functions. 
To access (or list) a table or view in a schema, users must // have the USE_SCHEMA data permission on the schema and its parent catalog, // and they must have the SELECT permission on the table or view. Schemas *catalog.SchemasAPI // The Secrets API allows you to manage secrets, secret scopes, and access // permissions. // // Sometimes accessing data requires that you authenticate to external data // sources through JDBC. Instead of directly entering your credentials into // a notebook, use Databricks secrets to store your credentials and // reference them in notebooks and jobs. // // Administrators, secret creators, and users granted permission can read // Databricks secrets. While Databricks makes an effort to redact secret // values that might be displayed in notebooks, it is not possible to // prevent such users from reading secrets. Secrets *workspace.SecretsAPI // Identities for use with jobs, automated tools, and systems such as // scripts, apps, and CI/CD platforms. Databricks recommends creating // service principals to run production jobs or modify production data. If // all processes that act on production data run with service principals, // interactive users do not need any write, delete, or modify privileges in // production. This eliminates the risk of a user overwriting production // data by accident. ServicePrincipals *iam.ServicePrincipalsAPI // The Serving Endpoints API allows you to create, update, and delete model // serving endpoints. // // You can use a serving endpoint to serve models from the Databricks Model // Registry. Endpoints expose the underlying models as scalable REST API // endpoints using serverless compute. This means the endpoints and // associated compute resources are fully managed by Databricks and will not // appear in your cloud account. A serving endpoint can consist of one or // more MLflow models from the Databricks Model Registry, called served // models. A serving endpoint can have at most ten served models. 
You can // configure traffic settings to define how requests should be routed to // your served models behind an endpoint. Additionally, you can configure // the scale of resources that should be applied to each served model. ServingEndpoints *serving.ServingEndpointsAPI Shares *sharing.SharesAPI // The SQL Statement Execution API manages the execution of arbitrary SQL // statements and the fetching of result data. // // **Release status** // // This feature is in [Public Preview]. // // **Getting started** // // We suggest beginning with the [SQL Statement Execution API tutorial]. // // **Overview of statement execution and result fetching** // // Statement execution begins by issuing a // :method:statementexecution/executeStatement request with a valid SQL // statement and warehouse ID, along with optional parameters such as the // data catalog and output format. // // When submitting the statement, the call can behave synchronously or // asynchronously, based on the `wait_timeout` setting. When set between // 5-50 seconds (default: 10) the call behaves synchronously and waits for // results up to the specified timeout; when set to `0s`, the call is // asynchronous and responds immediately with a statement ID that can be // used to poll for status or fetch the results in a separate call. // // **Call mode: synchronous** // // In synchronous mode, when statement execution completes within the // `wait_timeout`, the result data is returned directly in the response. This // response will contain `statement_id`, `status`, `manifest`, and `result` // fields. The `status` field confirms success whereas the `manifest` field // contains the result data column schema and metadata about the result set. // The `result` field contains the first chunk of result data according to // the specified `disposition`, and links to fetch any remaining chunks. 
// // If the execution does not complete before `wait_timeout`, the setting // `on_wait_timeout` determines how the system responds. // // By default, `on_wait_timeout=CONTINUE`, and after reaching // `wait_timeout`, a response is returned and statement execution continues // asynchronously. The response will contain only `statement_id` and // `status` fields, and the caller must now follow the flow described for // asynchronous call mode to poll and fetch the result. // // Alternatively, `on_wait_timeout` can also be set to `CANCEL`; in this // case if the timeout is reached before execution completes, the underlying // statement execution is canceled, and a `CANCELED` status is returned in // the response. // // **Call mode: asynchronous** // // In asynchronous mode, or after a timed-out synchronous request continues, // a `statement_id` and `status` will be returned. In this case polling // :method:statementexecution/getStatement calls are required to fetch the // result and metadata. // // Next, a caller must poll until execution completes (`SUCCEEDED`, // `FAILED`, etc.) by issuing :method:statementexecution/getStatement // requests for the given `statement_id`. // // When execution has succeeded, the response will contain `status`, // `manifest`, and `result` fields. These fields and the structure are // identical to those in the response to a successful synchronous // submission. The `result` field will contain the first chunk of result // data, either `INLINE` or as `EXTERNAL_LINKS` depending on `disposition`. // Additional chunks of result data can be fetched by checking for the // presence of the `next_chunk_internal_link` field, and iteratively `GET` // those paths until that field is unset: `GET // https://$DATABRICKS_HOST/{next_chunk_internal_link}`. // // **Fetching result data: format and disposition** // // Result data from statement execution is available in two formats: JSON, // and [Apache Arrow Columnar]. 
Statements producing a result set smaller // than 16 MiB can be fetched as `format=JSON_ARRAY`, using the // `disposition=INLINE`. When a statement executed in `INLINE` disposition // exceeds this limit, the execution is aborted, and no result can be // fetched. Using `format=ARROW_STREAM` and `disposition=EXTERNAL_LINKS` // supports large result sets with higher throughput. // // The API uses defaults of `format=JSON_ARRAY` and `disposition=INLINE`. // We advise explicitly setting format and disposition in all production // use cases. // // **Statement response: statement_id, status, manifest, and result** // // The base call :method:statementexecution/getStatement returns a single // response combining `statement_id`, `status`, a result `manifest`, and a // `result` data chunk or link, depending on the `disposition`. The // `manifest` contains the result schema definition and the result summary // metadata. When using `disposition=EXTERNAL_LINKS`, it also contains a // full listing of all chunks and their summary metadata. // // **Use case: small result sets with INLINE + JSON_ARRAY** // // For flows that generate small and predictable result sets (<= 16 MiB), // `INLINE` downloads of `JSON_ARRAY` result data are typically the simplest // way to execute and fetch result data. // // When the result set with `disposition=INLINE` is larger, the result can // be transferred in chunks. After receiving the initial chunk with // :method:statementexecution/executeStatement or // :method:statementexecution/getStatement subsequent calls are required to // iteratively fetch each chunk. Each result response contains a link to the // next chunk, when there are additional chunks to fetch; it can be found in // the field `.next_chunk_internal_link`. This link is an absolute `path` to // be joined with your `$DATABRICKS_HOST`, and of the form // `/api/2.0/sql/statements/{statement_id}/result/chunks/{chunk_index}`. 
The // next chunk can be fetched by issuing a // :method:statementexecution/getStatementResultChunkN request. // // When using this mode, each chunk may be fetched once, and in order. A // chunk without a field `next_chunk_internal_link` indicates the last chunk // was reached and all chunks have been fetched from the result set. // // **Use case: large result sets with EXTERNAL_LINKS + ARROW_STREAM** // // Using `EXTERNAL_LINKS` to fetch result data in Arrow format allows you to // fetch large result sets efficiently. The primary difference from using // `INLINE` disposition is that fetched result chunks contain resolved // `external_links` URLs, which can be fetched with standard HTTP. // // **Presigned URLs** // // External links point to data stored within your workspace's internal // DBFS, in the form of a presigned URL. The URLs are valid for only a short // period, <= 15 minutes. Alongside each `external_link` is an expiration // field indicating the time at which the URL is no longer valid. In // `EXTERNAL_LINKS` mode, chunks can be resolved and fetched multiple times // and in parallel. // // ---- // // ### **Warning: We recommend you protect the URLs in the EXTERNAL_LINKS.** // // When using the EXTERNAL_LINKS disposition, a short-lived pre-signed URL // is generated, which the client can use to download the result chunk // directly from cloud storage. As the short-lived credential is embedded in // a pre-signed URL, this URL should be protected. // // Since pre-signed URLs are generated with embedded temporary credentials, // you need to remove the authorization header from the fetch requests. // // ---- // // Similar to `INLINE` mode, callers can iterate through the result set, by // using the `next_chunk_internal_link` field. Each internal link response // will contain an external link to the raw chunk data, and additionally // contain the `next_chunk_internal_link` if there are more chunks. 
// // Unlike `INLINE` mode, when using `EXTERNAL_LINKS`, chunks may be fetched // out of order, and in parallel to achieve higher throughput. // // **Limits and limitations** // // Note: All byte limits are calculated based on internal storage metrics // and will not match byte counts of actual payloads. // // - Statements with `disposition=INLINE` are limited to 16 MiB and will // abort when this limit is exceeded. - Statements with // `disposition=EXTERNAL_LINKS` are limited to 100 GiB. - The maximum query // text size is 16 MiB. - Cancelation may silently fail. A successful // response from a cancel request indicates that the cancel request was // successfully received and sent to the processing engine. However, for // example, an outstanding statement may complete execution during signal // delivery, with the cancel signal arriving too late to be meaningful. // Polling for status until a terminal state is reached is a reliable way to // determine the final state. - Wait timeouts are approximate, occur // server-side, and cannot account for caller delays, network latency from // caller to service, and similar factors. - After a statement has been submitted // and a statement_id is returned, that statement's status and result will // automatically close after either of 2 conditions: - The last result chunk // is fetched (or resolved to an external link). - One hour passes with no // calls to get the status or fetch the result. Best practice: in // asynchronous clients, poll for status regularly (and with backoff) to // keep the statement open and alive. - After fetching the last result chunk // (including chunk_index=0) the statement is automatically closed. 
// // [Apache Arrow Columnar]: https://arrow.apache.org/overview/ // [Public Preview]: https://docs.databricks.com/release-notes/release-types.html // [SQL Statement Execution API tutorial]: https://docs.databricks.com/sql/api/sql-execution-tutorial.html StatementExecution *sql.StatementExecutionAPI // A storage credential represents an authentication and authorization // mechanism for accessing data stored on your cloud tenant. Each storage // credential is subject to Unity Catalog access-control policies that // control which users and groups can access the credential. If a user does // not have access to a storage credential in Unity Catalog, the request // fails and Unity Catalog does not attempt to authenticate to your cloud // tenant on the user’s behalf. // // Databricks recommends using external locations rather than using storage // credentials directly. // // To create storage credentials, you must be a Databricks account admin. // The account admin who creates the storage credential can delegate // ownership to another user or group to manage permissions on it. StorageCredentials *catalog.StorageCredentialsAPI // Primary key and foreign key constraints encode relationships between // fields in tables. // // Primary and foreign keys are informational only and are not enforced. // Foreign keys must reference a primary key in another table. This primary // key is the parent constraint of the foreign key and the table this // primary key is on is the parent table of the foreign key. Similarly, the // foreign key is the child constraint of its referenced primary key; the // table of the foreign key is the child table of the primary key. // // You can declare primary keys and foreign keys as part of the table // specification during table creation. You can also add or drop constraints // on existing tables. TableConstraints *catalog.TableConstraintsAPI // A table resides in the third layer of Unity Catalog’s three-level // namespace. It contains rows of data. 
To create a table, users must have // CREATE_TABLE and USE_SCHEMA permissions on the schema, and they must have // the USE_CATALOG permission on its parent catalog. To query a table, users // must have the SELECT permission on the table, and they must have the // USE_CATALOG permission on its parent catalog and the USE_SCHEMA // permission on its parent schema. // // A table can be managed or external. From an API perspective, a __VIEW__ // is a particular kind of table (rather than a managed or external table). Tables *catalog.TablesAPI // Enables administrators to get all tokens and delete tokens for other // users. Admins can either get every token, get a specific token by ID, or // get all tokens for a particular user. TokenManagement *settings.TokenManagementAPI // The Token API allows you to create, list, and revoke tokens that can be // used to authenticate and access Databricks REST APIs. Tokens *settings.TokensAPI // User identities recognized by Databricks and represented by email // addresses. // // Databricks recommends using SCIM provisioning to sync users and groups // automatically from your identity provider to your Databricks Workspace. // SCIM streamlines onboarding a new employee or team by using your identity // provider to create users and groups in Databricks Workspace and give them // the proper level of access. When a user leaves your organization or no // longer needs access to Databricks Workspace, admins can terminate the // user in your identity provider and that user’s account will also be // removed from Databricks Workspace. This ensures a consistent offboarding // process and prevents unauthorized users from accessing sensitive data. Users *iam.UsersAPI // Volumes are a Unity Catalog (UC) capability for accessing, storing, // governing, organizing and processing files. 
Use cases include running // machine learning on unstructured data such as image, audio, video, or PDF // files, organizing data sets during the data exploration stages in data // science, working with libraries that require access to the local file // system on cluster machines, storing library and config files of arbitrary // formats such as .whl or .txt centrally and providing secure access across // workspaces to it, or transforming and querying non-tabular data files in // ETL. Volumes *catalog.VolumesAPI // A SQL warehouse is a compute resource that lets you run SQL commands on // data objects within Databricks SQL. Compute resources are infrastructure // resources that provide processing capabilities in the cloud. Warehouses *sql.WarehousesAPI // The Workspace API allows you to list, import, export, and delete // notebooks and folders. // // A notebook is a web-based interface to a document that contains runnable // code, visualizations, and explanatory text. Workspace *workspace.WorkspaceAPI // This API allows updating known workspace settings for advanced users. WorkspaceConf *settings.WorkspaceConfAPI }
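The asynchronous flow described in the StatementExecution comment (poll with backoff until a terminal state, then follow `next_chunk_internal_link` until it is unset) can be sketched roughly as follows. This is a minimal illustration, not the SDK's API: the `chunk` and `statement` types, the `waitForStatement`, `collectRows`, and `run` helpers, the `RUNNING` state name, and the `stmt-1` statement ID are all hypothetical, and an in-memory fake stands in for a real workspace.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// chunk is a hypothetical, trimmed-down view of one result chunk.
type chunk struct {
	Rows                  [][]string
	NextChunkInternalLink string // empty on the last chunk
}

// statement is a hypothetical view of a getStatement response.
type statement struct {
	State string
	First chunk // only meaningful once State is SUCCEEDED
}

// waitForStatement polls get until a terminal state is reached,
// doubling the delay between polls up to maxDelay.
func waitForStatement(get func() (statement, error), delay, maxDelay time.Duration) (statement, error) {
	for {
		st, err := get()
		if err != nil {
			return statement{}, err
		}
		switch st.State {
		case "SUCCEEDED":
			return st, nil
		case "FAILED", "CANCELED":
			return st, fmt.Errorf("statement ended in state %s", st.State)
		}
		time.Sleep(delay)
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

// collectRows gathers rows from the first chunk and every following chunk,
// fetching each link once and in order until no next link is present.
func collectRows(first chunk, fetch func(link string) (chunk, error)) ([][]string, error) {
	rows := append([][]string(nil), first.Rows...)
	link := first.NextChunkInternalLink
	for link != "" {
		c, err := fetch(link)
		if err != nil {
			return nil, err
		}
		rows = append(rows, c.Rows...)
		link = c.NextChunkInternalLink
	}
	return rows, nil
}

// run wires both helpers to an in-memory fake instead of a real workspace.
func run() (rows [][]string, polls int, err error) {
	const base = "/api/2.0/sql/statements/stmt-1/result/chunks/"
	chunks := map[string]chunk{
		base + "1": {Rows: [][]string{{"b"}}, NextChunkInternalLink: base + "2"},
		base + "2": {Rows: [][]string{{"c"}}},
	}
	get := func() (statement, error) {
		polls++
		if polls < 3 {
			return statement{State: "RUNNING"}, nil // assumed non-terminal state
		}
		first := chunk{Rows: [][]string{{"a"}}, NextChunkInternalLink: base + "1"}
		return statement{State: "SUCCEEDED", First: first}, nil
	}
	fetch := func(link string) (chunk, error) {
		c, ok := chunks[link]
		if !ok {
			return chunk{}, errors.New("unknown chunk link: " + link)
		}
		return c, nil
	}
	st, err := waitForStatement(get, time.Millisecond, 4*time.Millisecond)
	if err != nil {
		return nil, polls, err
	}
	rows, err = collectRows(st.First, fetch)
	return rows, polls, err
}

func main() {
	rows, polls, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(rows), "rows after", polls, "polls")
}
```

Passing the poll and fetch steps in as functions keeps the control flow separate from transport details, so the same loop shape could wrap whatever client calls are actually used. A real client would likely use much longer delays than the millisecond values chosen here to keep the sketch fast.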
func NewWorkspaceClient ¶
func NewWorkspaceClient(c ...*Config) (*WorkspaceClient, error)
NewWorkspaceClient creates a new Databricks SDK client for workspaces, or returns an error if the configuration is invalid.
Directories ¶
Path | Synopsis |
---|---|
examples | |
internal | |
code | Package holds higher-level abstractions on top of OpenAPI that are used to generate code via text/template for Databricks SDK in different languages. |
gen | Usage: openapi-codegen |
 | Databricks SDK for Go APIs |
billing | These APIs allow you to manage Billable Usage, Budgets, Log Delivery, etc. |
catalog | These APIs allow you to manage Account Metastore Assignments, Account Metastores, Account Storage Credentials, Catalogs, External Locations, Functions, Grants, Metastores, Schemas, Storage Credentials, Table Constraints, Tables, Volumes, etc. |
compute | These APIs allow you to manage Cluster Policies, Clusters, Command Execution, Global Init Scripts, Instance Pools, Instance Profiles, Libraries, Policy Families, etc. |
files | DBFS API makes it simple to interact with various data sources without having to include a user's credentials every time to read a file. |
iam | These APIs allow you to manage Account Groups, Account Service Principals, Account Users, Current User, Groups, Permissions, Service Principals, Users, Workspace Assignment, etc. |
jobs | The Jobs API allows you to create, edit, and delete jobs. |
ml | These APIs allow you to manage Experiments, Model Registry, etc. |
oauth2 | These APIs allow you to manage Custom App Integration, OAuth Enrollment, Published App Integration, etc. |
pipelines | The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. |
provisioning | These APIs allow you to manage Credentials, Encryption Keys, Networks, Private Access, Storage, Vpc Endpoints, Workspaces, etc. |
serving | The Serving Endpoints API allows you to create, update, and delete model serving endpoints. |
settings | These APIs allow you to manage Account Ip Access Lists, Ip Access Lists, Token Management, Tokens, Workspace Conf, etc. |
sharing | These APIs allow you to manage Providers, Recipient Activation, Recipients, Shares, etc. |
sql | These APIs allow you to manage Alerts, Dashboards, Data Sources, Dbsql Permissions, Queries, Query History, Statement Execution, Warehouses, etc. |
workspace | These APIs allow you to manage Git Credentials, Repos, Secrets, Workspace, etc. |