tomarv2/terraform-databricks-workspace-management

❗️ Important

👉 This module assumes you already have a Databricks workspace deployed on AWS or Azure, along with:

👉 Workspace URL

👉 DAPI Token
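
With the workspace URL and DAPI token, the Databricks provider can be configured. A minimal sketch (the token variable name is illustrative):

provider "databricks" {
  host  = "https://<workspace_name>"   # workspace URL
  token = var.dapi_token               # DAPI token, e.g. supplied as a variable
}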

Versions

  • Module tested for Terraform 1.0.1.
  • databricks/databricks provider version 1.3.1
  • AWS provider version 4.14.
  • main branch: Provider versions are not pinned, to keep up with Terraform releases.
  • tags releases: Provider versions are pinned (use a tagged release).

What does this module do?

  • This is where you would normally start if you have just deployed your Databricks workspace.

Two cluster modes are supported by this module:

  • Single Node mode: to deploy a cluster in Single Node mode, set fixed_value to 0:
fixed_value         = 0
  • Standard mode: to deploy a cluster in Standard mode, two options are available:
fixed_value         = 1   # or any value greater than 0

OR

auto_scaling         = [1,3]
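
For example, a minimal module call deploying an autoscaling cluster (the git source ref is illustrative; pin a tagged release in practice):

module "databricks_workspace_management" {
  source = "git::https://github.com/tomarv2/terraform-databricks-workspace-management.git"

  deploy_cluster = true
  cluster_name   = "demo-cluster"
  auto_scaling   = [1, 3]   # or fixed_value = 0 for Single Node mode

  teamid = var.teamid   # required
  prjid  = var.prjid    # required
}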

A cluster can have one of these permissions: CAN_ATTACH_TO, CAN_RESTART, and CAN_MANAGE.

cluster_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_RESTART"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_RESTART"
  }
]
  • To build a cluster with a new cluster policy, use:
deploy_cluster_policy = true
policy_overrides = {
  "dbus_per_hour" : {
    "type" : "range",
    "maxValue" : 10
  },
  "autotermination_minutes" : {
    "type" : "fixed",
    "value" : 30,
    "hidden" : true
  }
}
  • To use an existing cluster policy, specify its id:
cluster_policy_id = "E0123456789"

To get an existing policy id, use:

curl -X GET --header "Authorization: Bearer $DAPI_TOKEN"  https://<workspace_name>/api/2.0/policies/clusters/list \
--data '{ "sort_order": "DESC", "sort_column": "POLICY_CREATION_TIME" }'

Cluster Policy ACL

policy_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_USE"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_USE"
  }
]

Note: To configure an instance pool, add the configuration below:

deploy_worker_instance_pool           = true
min_idle_instances                    = 1
max_capacity                          = 5
idle_instance_autotermination_minutes = 30

An instance pool can have one of these permissions: CAN_ATTACH_TO and CAN_MANAGE.

instance_pool_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_ATTACH_TO"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_ATTACH_TO"
  },
]

❗️ Important

If deploy_worker_instance_pool is set to true and auto_scaling is enabled, ensure that the instance pool's max_capacity is greater than the cluster's auto_scaling maximum.
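
A consistent pairing of the two settings, for illustration:

deploy_worker_instance_pool = true
max_capacity                = 5        # instance pool capacity
auto_scaling                = [1, 3]   # cluster max (3) stays below max_capacity (5)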

Two options are available for deploying a job:

  • Deploy the job to an existing cluster.
  • Deploy a new cluster and then deploy the job.

Two options are available to attach notebooks to a job:

  • Attach an existing notebook to a job.
  • Create a new notebook and attach it to a job.
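
For example, a job on an existing cluster built from a local notebook might look like the sketch below (the cluster id is illustrative, and the local_notebooks entry shape is assumed to mirror the notebooks variable shown later):

deploy_jobs = true
cluster_id  = "0223-210021-abcd999"   # existing cluster id (illustrative)
local_notebooks = [
  {
    name       = "demo_job_notebook"   # assumed fields; see the examples directory
    language   = "PYTHON"
    local_path = "notebooks/sample1.py"
  }
]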

A job can have one of these permissions: CAN_VIEW, CAN_MANAGE_RUN, IS_OWNER, and CAN_MANAGE.

Admins have the CAN_MANAGE permission by default and can assign it to non-admin users and service principals.

The job creator has the IS_OWNER permission. Destroying the databricks_permissions resource for a job reverts ownership to its creator.

Note:

  • A job must have exactly one owner. If the resource is changed and no owner is specified, the currently authenticated principal becomes the new owner of the job.
  • A job cannot have a group as an owner.
  • Jobs triggered through Run Now assume the permissions of the job owner, not of the user or service principal who issued Run Now.
jobs_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_MANAGE_RUN"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_MANAGE_RUN"
  }
]

AWS only

Add an instance profile at cluster creation time. It can control which data a given cluster can access through cloud-native controls.

add_instance_profile_to_workspace = true   # default: false
aws_attributes = {
    instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/aws-instance-role"
}

Note: Set add_instance_profile_to_workspace to true to register the instance profile with the Databricks workspace. To use an instance profile that is already registered, set it to false.
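
For example, to reuse an instance profile that is already registered (the ARN is illustrative):

add_instance_profile_to_workspace = false   # profile already registered in the workspace
aws_attributes = {
    instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/existing-role"
}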

Put notebooks in the notebooks folder and provide the information below:

  notebooks = [
    {
      name       = "demo_notebook1"
      language   = "PYTHON"
      local_path = "notebooks/sample1.py"
      path       = "/Shared/demo/sample1.py"
    },
    {
      name       = "demo_notebook2"
      local_path = "notebooks/sample2.py"
    }
  ]

Notebook ACL

A notebook can have one of these permissions: CAN_READ, CAN_RUN, CAN_EDIT, and CAN_MANAGE.

notebooks_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_MANAGE"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_MANAGE"
  }
]
  • To test which resources will be deployed, run terraform plan first (see Usage below).

Usage

Option 1:

terraform init
terraform plan -var='teamid=tryme' -var='prjid=project'
terraform apply -var='teamid=tryme' -var='prjid=project'
terraform destroy -var='teamid=tryme' -var='prjid=project'

Note: With this option you are responsible for configuring remote state storage yourself; a minimal backend sketch follows.
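
For example, a minimal S3 backend block (bucket, key, and region are illustrative; any supported backend works):

terraform {
  backend "s3" {
    bucket = "my-terraform-state"                # illustrative bucket name
    key    = "tryme/project/terraform.tfstate"   # e.g. derived from teamid/prjid
    region = "us-east-1"
  }
}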

Option 2:

Recommended method (store remote state in S3 using prjid and teamid to create directory structure):

  • Create a Python 3.8+ virtual environment:
python3 -m venv <venv name>
  • Install package:
pip install tfremote
  • Set the required environment variables for your cloud provider.

  • Update the examples directory with the required values.

NOTE: Please refer to the examples directory for reference configurations.


Troubleshooting

If you see error messages like the ones below, try running the same command again.

Error: Failed to delete token in Scope <scope name>
Error: Scope <scope name> does not exist!

Requirements

Name Version
terraform >= 1.0.1
aws >= 4.14
databricks >= 0.5.7
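
To pin these in a root module, a version-constraints sketch matching the table above:

terraform {
  required_version = ">= 1.0.1"

  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = ">= 0.5.7"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.14"
    }
  }
}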

Providers

Name Version
databricks >= 0.5.7

Modules

No modules.

Resources

Name Type
databricks_cluster.cluster resource
databricks_cluster_policy.this resource
databricks_group.this resource
databricks_group_member.group_members resource
databricks_instance_pool.driver_instance_nodes resource
databricks_instance_pool.worker_instance_nodes resource
databricks_instance_profile.shared resource
databricks_job.existing_cluster_new_job_existing_notebooks resource
databricks_job.existing_cluster_new_job_new_notebooks resource
databricks_job.new_cluster_new_job_existing_notebooks resource
databricks_job.new_cluster_new_job_new_notebooks resource
databricks_library.maven resource
databricks_library.python_wheel resource
databricks_notebook.notebook_file resource
databricks_notebook.notebook_file_deployment resource
databricks_permissions.cluster resource
databricks_permissions.driver_pool resource
databricks_permissions.existing_cluster_new_job_existing_notebooks resource
databricks_permissions.existing_cluster_new_job_new_notebooks resource
databricks_permissions.jobs_notebook resource
databricks_permissions.new_cluster_new_job_existing_notebooks resource
databricks_permissions.new_cluster_new_job_new_notebooks resource
databricks_permissions.notebook resource
databricks_permissions.policy resource
databricks_permissions.worker_pool resource
databricks_secret_acl.spectators resource
databricks_user.users resource
databricks_current_user.me data source
databricks_node_type.cluster_node_type data source
databricks_spark_version.latest data source

Inputs

Name Description Type Default Required
add_instance_profile_to_workspace Whether to register the AWS instance profile with the workspace bool false no
allow_cluster_create This is a field to allow the group to have cluster create privileges. More fine grained permissions could be assigned with databricks_permissions and cluster_id argument. Everyone without allow_cluster_create argument set, but with permission to use Cluster Policy would be able to create clusters, but within boundaries of that specific policy. bool true no
allow_instance_pool_create This is a field to allow the group to have instance pool create privileges. More fine grained permissions could be assigned with databricks_permissions and instance_pool_id argument. bool true no
always_running Whether the job should always be running, like a Spark Streaming application: on every update, restart the current active run or start a new one if nothing is running. False by default. bool false no
auto_scaling Number of min and max workers in auto scale. list(any) null no
aws_attributes Optional configuration block contains attributes related to clusters running on AWS. any null no
azure_attributes Optional configuration block contains attributes related to clusters running on Azure. any null no
category Node category, which can be one of: General purpose, Memory optimized, Storage optimized, Compute optimized, GPU string "General purpose" no
cluster_access_control Cluster access control any null no
cluster_autotermination_minutes cluster auto termination duration number 30 no
cluster_id Existing cluster id string null no
cluster_name Cluster name string null no
cluster_policy_id Existing cluster policy id string null no
create_group Create a new group, if group already exists the deployment will fail. bool false no
create_user Create a new user, if user already exists the deployment will fail. bool false no
custom_tags Extra custom tags any null no
data_security_mode Access mode string "NONE" no
databricks_username User allowed to access the platform. string "" no
deploy_cluster feature flag, true or false bool false no
deploy_cluster_policy feature flag, true or false bool false no
deploy_driver_instance_pool Driver instance pool bool false no
deploy_job_cluster feature flag, true or false bool false no
deploy_jobs feature flag, true or false bool false no
deploy_worker_instance_pool Worker instance pool bool false no
driver_node_type_id The node type of the Spark driver. This field is optional; if unset, API will set the driver node type to the same value as node_type_id. string null no
email_notifications Email notification block. any null no
fixed_value Number of nodes in the cluster. number 0 no
gb_per_core Number of gigabytes per core available on instance. Conflicts with min_memory_gb. Defaults to 0. string 0 no
gcp_attributes Optional configuration block contains attributes related to clusters running on GCP. any null no
gpu GPU required or not. bool false no
idle_instance_autotermination_minutes idle instance auto termination duration number 20 no
instance_pool_access_control Instance pool access control any null no
jobs_access_control Jobs access control any null no
libraries Installs a library on databricks_cluster map(any) {} no
local_disk Pick only nodes with local storage. string true no
local_notebooks Local path to the notebook(s) that will be used by the job any [] no
max_capacity instance pool maximum capacity number 3 no
max_concurrent_runs An optional maximum allowed number of concurrent runs of the job. number null no
max_retries An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it completes with a FAILED result_state or INTERNAL_ERROR life_cycle_state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. number 0 no
min_cores Minimum number of CPU cores available on instance. Defaults to 0. string 0 no
min_gpus Minimum number of GPUs attached to instance. Defaults to 0. string 0 no
min_idle_instances instance pool minimum idle instances number 1 no
min_memory_gb Minimum amount of memory per node in gigabytes. Defaults to 0. string 0 no
min_retry_interval_millis An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried. number null no
ml ML required or not. bool false no
notebooks Local path to the notebook(s) that will be deployed any [] no
notebooks_access_control Notebook access control any null no
policy_access_control Policy access control any null no
policy_overrides Cluster policy overrides any null no
prjid (Required) Name of the project/stack e.g: mystack, nifieks, demoaci. Should not be changed after running 'tf apply' string n/a yes
remote_notebooks Path to notebook(s) in the databricks workspace that will be used by the job any [] no
retry_on_timeout An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout. bool false no
schedule Job schedule configuration. map(any) null no
spark_conf Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration. any null no
spark_env_vars Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers. any null no
spark_version Runtime version of the cluster. Any supported databricks_spark_version id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control. string null no
task_parameters Base parameters to be used for each run of this job. map(any) {} no
teamid (Required) Name of the team/group e.g. devops, dataengineering. Should not be changed after running 'tf apply' string n/a yes
timeout An optional timeout applied to each run of this job. The default behavior is to have no timeout. number null no
worker_node_type_id The node type of the Spark worker. string null no

Outputs

Name Description
cluster_id databricks cluster id
cluster_name databricks cluster name
cluster_policy_id databricks cluster policy id
databricks_group databricks group name
databricks_group_member databricks group members
databricks_secret_acl databricks secret acl
databricks_user databricks user name
databricks_user_id databricks user id
existing_cluster_new_job_existing_notebooks_id databricks new cluster job id
existing_cluster_new_job_existing_notebooks_job databricks new cluster job url
existing_cluster_new_job_new_notebooks_id databricks new cluster job id
existing_cluster_new_job_new_notebooks_job databricks new cluster job url
instance_profile databricks instance profile ARN
new_cluster_new_job_existing_notebooks_id databricks job id
new_cluster_new_job_existing_notebooks_job databricks job url
new_cluster_new_job_new_notebooks_id databricks job id
new_cluster_new_job_new_notebooks_job databricks job url
notebook_url databricks notebook url
notebook_url_standalone databricks notebook url standalone