Databricks provides a CLI tool, distributed as a Python library, that allows you to administer most of the core functionality of a Databricks deployment. The CLI library includes an API client and multiple service objects whose methods map to each service's API endpoints.
These interfaces aren't mentioned in the README documentation, but you can create an API client for each service directly from the CLI library:
```python
from databricks_cli.sdk import ApiClient
from databricks_cli.sdk import service

host = "mycompany.cloud.databricks.com"
token = "mytoken"

client = ApiClient(host=host, token=token)

jobs_client = service.JobsService(client)
cluster_client = service.ClusterService(client)
managed_library = service.ManagedLibraryService(client)
# ... etc for dbfs, workspace, secret, groups

clusters = cluster_client.list_clusters()
```
in which each API service is instantiated with the shared `ApiClient`.
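The shape of this design can be illustrated with plain Python. The classes below are simplified stand-ins for the real `databricks_cli` implementations, sketching the idea of one authenticated client shared by every service object:

```python
# Illustrative stand-ins, not the real databricks_cli classes.

class ApiClient:
    """Holds connection details for authenticated HTTPS calls."""

    def __init__(self, host, token):
        self.host = host
        self.token = token

    def perform_query(self, method, path, data=None):
        # The real client issues an HTTPS request here; this stub just
        # echoes what would be sent.
        return {"method": method, "path": path, "data": data}


class JobsService:
    """Wraps the shared client; each method maps to one API endpoint."""

    def __init__(self, client):
        self.client = client

    def list_jobs(self):
        return self.client.perform_query("GET", "/jobs/list")


client = ApiClient("mycompany.cloud.databricks.com", "mytoken")
jobs_client = JobsService(client)
print(jobs_client.list_jobs()["path"])  # → /jobs/list
```

Because every service holds a reference to the same client, authentication and request handling live in one place.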
The databricks-api package exposes and simplifies this entire set of services into a single, autogenerated API client that wraps the underlying `databricks-cli` services:
```python
from databricks_api import DatabricksAPI

host = "mycompany.cloud.databricks.com"
token = "mytoken"

databricks = DatabricksAPI(host=host, token=token)

clusters = databricks.cluster.list_clusters()
```
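A wrapper like this can be built by instantiating each CLI service with a shared client and attaching it as an attribute. Here is a minimal sketch of that pattern, with hypothetical class names standing in for the real ones:

```python
# Hypothetical sketch of a single wrapper object exposing one attribute
# per service; databricks-api generates this automatically from the CLI
# package, so these class names are illustrative only.

class ClusterService:
    def __init__(self, client):
        self.client = client

    def list_clusters(self):
        return []  # stub; the real method calls the clusters endpoint


class JobsService:
    def __init__(self, client):
        self.client = client


class WrapperAPI:
    # Map readable attribute names to service classes.
    SERVICES = {"cluster": ClusterService, "jobs": JobsService}

    def __init__(self, host, token):
        client = (host, token)  # stand-in for the real ApiClient
        for name, svc_cls in self.SERVICES.items():
            setattr(self, name, svc_cls(client))


databricks = WrapperAPI("mycompany.cloud.databricks.com", "mytoken")
clusters = databricks.cluster.list_clusters()
```

The `setattr` loop is what turns a flat collection of services into the `databricks.cluster`, `databricks.jobs`, etc. attribute style shown above.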
The `DatabricksAPI` instance provides an attribute for each service described in the documentation. Each attribute object exposes the underlying CLI service's methods, which correspond to the available API 2.0 endpoints. For example, the `managed_library` attribute provides the methods corresponding to the Libraries API:
```python
DatabricksAPI.managed_library.all_cluster_statuses()
DatabricksAPI.managed_library.cluster_status(cluster_id)
DatabricksAPI.managed_library.install_libraries(
    cluster_id,
    libraries=None
)
DatabricksAPI.managed_library.uninstall_libraries(
    cluster_id,
    libraries=None
)
```
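The `libraries` argument is a list of library specifications in the format described by the Databricks Libraries API. For example (the `cluster_id` value here is made up):

```python
# Example payload for install_libraries. The cluster_id is hypothetical;
# the library spec format follows the Databricks Libraries API 2.0 docs.
cluster_id = "1234-567890-abcde123"

libraries = [
    {"pypi": {"package": "simplejson==3.8.0"}},  # a PyPI package
    {"jar": "dbfs:/mnt/libraries/library.jar"},  # a JAR on DBFS
]

# With a DatabricksAPI instance named `databricks`, the call would be:
# databricks.managed_library.install_libraries(cluster_id, libraries=libraries)
```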
For more details, see the project documentation, which is autogenerated from the underlying `databricks-cli` services.