azure - Terraform Azure Databricks provider error

Tags: azure, databricks, azure-databricks, terraform-provider-azure, terraform-provider-databricks

I need some help understanding the different ways of authenticating to Databricks. I am using Terraform to provision Azure Databricks, and I would like to know the difference between the two pieces of code below. When I use Option 1, I get the error shown further down.

Option 1:

terraform {
  required_providers {
    azuread     = "~> 1.0"
    azurerm     = "~> 2.0"
    azuredevops = { source = "registry.terraform.io/microsoft/azuredevops", version = "~> 0.0" }
    databricks  = { source = "registry.terraform.io/databrickslabs/databricks", version = "~> 0.0" }
  }
}

provider "random" {}
provider "azuread" {
  tenant_id     = var.project.arm.tenant.id
  client_id     = var.project.arm.client.id
  client_secret = var.secret.arm.client.secret
}

provider "databricks" {
  host          = azurerm_databricks_workspace.db-workspace.workspace_url
  azure_use_msi = true
}

resource "azurerm_databricks_workspace" "db-workspace" {
  name                          = module.names-db-workspace.environment.databricks_workspace.name_unique
  resource_group_name           = module.resourcegroup.resource_group.name
  location                      = module.resourcegroup.resource_group.location
  sku                           = "premium"
  public_network_access_enabled = true

  custom_parameters {
    no_public_ip                                         = true
    virtual_network_id                                   = module.virtualnetwork["centralus"].virtual_network.self.id
    public_subnet_name                                   = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-1-public"].name
    private_subnet_name                                  = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-2-private"].name
    public_subnet_network_security_group_association_id  = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-public-nsg-db-sub-1-public"].id
    private_subnet_network_security_group_association_id = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-private-nsg-db-sub-2-private"].id
  }
  tags = local.tags
}

Databricks cluster creation:

resource "databricks_cluster" "dbcselfservice" {
  cluster_name            = format("adb-cluster-%s-%s", var.project.name, var.project.environment.name)
  spark_version           = var.spark_version
  node_type_id            = var.node_type_id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 7
  }
  azure_attributes {
    availability       = "SPOT_AZURE"
    first_on_demand    = 1
    spot_bid_max_price = 100
  }
  depends_on = [
    azurerm_databricks_workspace.db-workspace
  ]
}

Databricks workspace RBAC permissions:

resource "databricks_group" "db-group" {
  display_name               = format("adb-users-%s", var.project.name)
  allow_cluster_create       = true
  allow_instance_pool_create = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

resource "databricks_user" "dbuser" {
  count            = length(local.display_name)
  display_name     = local.display_name[count.index]
  user_name        = local.user_name[count.index]
  workspace_access = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

Adding members to the Databricks admins group:

resource "databricks_group_member" "i-am-admin" {
  for_each  = toset(local.email_address)
  group_id  = data.databricks_group.admins.id
  member_id = databricks_user.dbuser[index(local.email_address, each.key)].id
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

data "databricks_group" "admins" {
  display_name = "admins"
  depends_on = [
    #    resource.databricks_cluster.dbcselfservice,
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

The errors I get on terraform apply are as follows:

Error: User not authorized

with databricks_user.dbuser[1],
on resources.adb.tf line 80, in resource "databricks_user" "dbuser":
80: resource "databricks_user" "dbuser"{


Error: User not authorized

with databricks_user.dbuser[0],
on resources.adb.tf line 80, in resource "databricks_user" "dbuser":
80: resource "databricks_user" "dbuser"{

Error: cannot refresh AAD token: adal:Refresh request failed. Status Code =  '500'. Response body: {"error":"server_error", "error_description":"Internal server error"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F

with databricks_group.db-group,
on resources.adb.tf line 80, in resource "databricks_group" "db-group":
71: resource "databricks_group" "db-group"{

Is the error caused by this block below?

provider "databricks" {
  host          = azurerm_databricks_workspace.db-workspace.workspace_url
  azure_use_msi = true
}

I just want to be logged in automatically when I click the workspace URL in the portal. So what should I use? And why do we need to declare the databricks provider twice, once under required_providers and again in the provider "databricks" block? I have seen that if I don't add the second provider block, I get the error:

"authentication is not configured for provider"

Best answer

As mentioned in the comments, if you are using Azure CLI authentication, i.e. az login with your username and password, then you can use the following code:

terraform {
  required_providers {
    databricks = {
      source = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}
provider "azurerm" {
  features {}
}
provider "databricks" {
    host = azurerm_databricks_workspace.example.workspace_url
}

resource "azurerm_databricks_workspace" "example" {
  name                        = "DBW-ansuman"
  resource_group_name         = azurerm_resource_group.example.name
  location                    = azurerm_resource_group.example.location
  sku                         = "premium"
  managed_resource_group_name = "ansuman-DBW-managed-without-lb"

  public_network_access_enabled = true

  custom_parameters {
    no_public_ip        = true
    public_subnet_name  = azurerm_subnet.public.name
    private_subnet_name = azurerm_subnet.private.name
    virtual_network_id  = azurerm_virtual_network.example.id

    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
  }

  tags = {
    Environment = "Production"
    Pricing     = "Standard"
  }
}
data "databricks_node_type" "smallest" {
  local_disk = true
    depends_on = [
    azurerm_databricks_workspace.example
  ]
}
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
    depends_on = [
    azurerm_databricks_workspace.example
  ]
}
resource "databricks_cluster" "dbcselfservice" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 7
  }
  azure_attributes {
    availability       = "SPOT_AZURE"
    first_on_demand    = 1
    spot_bid_max_price = 100
  }
  depends_on = [
    azurerm_databricks_workspace.example
  ]
}
resource "databricks_group" "db-group" {
  display_name               = "adb-users-admin"
  allow_cluster_create       = true
  allow_instance_pool_create = true
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}

resource "databricks_user" "dbuser" {
  display_name     = "Rahul Sharma"
  user_name        = "<a href="https://stackoverflow.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="debba6bfb3aeb2bb9ebdb1b0aab1adb1f0bdb1b3" rel="noreferrer noopener nofollow">[email protected]</a>"
  workspace_access = true
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}
resource "databricks_group_member" "i-am-admin" {
  group_id  = databricks_group.db-group.id
  member_id = databricks_user.dbuser.id
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}

Output:

(screenshot of the successful terraform apply output)


If you are using a service principal for authentication, then you can use something like the following:

terraform {
  required_providers {
    databricks = {
      source = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}
provider "azurerm" {
  subscription_id = "948d4068-xxxx-xxxx-xxxx-e00a844e059b"
  tenant_id = "72f988bf-xxxx-xxxx-xxxx-2d7cd011db47"
  client_id = "f6a2f33d-xxxx-xxxx-xxxx-d713a1bb37c0"
  client_secret = "inl7Q~Gvdxxxx-xxxx-xxxxyaGPF3uSoL"
  features {}
}
provider "databricks" {
    host = azurerm_databricks_workspace.example.workspace_url
    azure_client_id = "f6a2f33d-xxxx-xxxx-xxxx-d713a1bb37c0"
    azure_client_secret = "inl7Q~xxxx-xxxx-xxxxg6ntiyaGPF3uSoL"
    azure_tenant_id = "72f988bf-xxxx-xxxx-xxxx-2d7cd011db47"
}


resource "azurerm_databricks_workspace" "example" {
  name                        = "DBW-ansuman"
  resource_group_name         = azurerm_resource_group.example.name
  location                    = azurerm_resource_group.example.location
  sku                         = "premium"
  managed_resource_group_name = "ansuman-DBW-managed-without-lb"

  public_network_access_enabled = true

  custom_parameters {
    no_public_ip        = true
    public_subnet_name  = azurerm_subnet.public.name
    private_subnet_name = azurerm_subnet.private.name
    virtual_network_id  = azurerm_virtual_network.example.id

    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
  }

  tags = {
    Environment = "Production"
    Pricing     = "Standard"
  }
}
data "databricks_node_type" "smallest" {
  local_disk = true
    depends_on = [
    azurerm_databricks_workspace.example
  ]
}
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
    depends_on = [
    azurerm_databricks_workspace.example
  ]
}
resource "databricks_cluster" "dbcselfservice" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 7
  }
  azure_attributes {
    availability       = "SPOT_AZURE"
    first_on_demand    = 1
    spot_bid_max_price = 100
  }
  depends_on = [
    azurerm_databricks_workspace.example
  ]
}
resource "databricks_group" "db-group" {
  display_name               = "adb-users-admin"
  allow_cluster_create       = true
  allow_instance_pool_create = true
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}

resource "databricks_user" "dbuser" {
  display_name     = "Rahul Sharma"
  user_name        = "<a href="https://stackoverflow.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="c3a6bba2aeb3afa683a0acadb7acb0aceda0acae" rel="noreferrer noopener nofollow">[email protected]</a>"
  workspace_access = true
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}
resource "databricks_group_member" "i-am-admin" {
  group_id  = databricks_group.db-group.id
  member_id = databricks_user.dbuser.id
  depends_on = [
    resource.azurerm_databricks_workspace.example
  ]
}
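
One note on the service principal variant: the client secret is hardcoded in the provider block above only for illustration. A common pattern (a minimal sketch; the variable names below are my own, not from the original answer) is to pass the credentials in as sensitive input variables instead:

variable "azure_client_id" {
  type = string
}

variable "azure_tenant_id" {
  type = string
}

variable "azure_client_secret" {
  type      = string
  sensitive = true # keeps the secret out of plan/apply output
}

provider "databricks" {
  host                = azurerm_databricks_workspace.example.workspace_url
  azure_client_id     = var.azure_client_id
  azure_client_secret = var.azure_client_secret
  azure_tenant_id     = var.azure_tenant_id
}

The values can then be supplied through a .tfvars file that is kept out of version control, or through TF_VAR_* environment variables.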

And why do we need to declare the databricks provider twice, once under required_providers and again in the provider "databricks" block?

required_providers is used to download and initialize the required providers from their source (i.e. the Terraform Registry). The provider block, on the other hand, configures the downloaded provider further, e.g. setting the client_id, the features block, etc., which is used for authentication and other configuration.
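
To make the distinction concrete, here is a minimal sketch reusing the provider version and workspace resource name from the answer above: the terraform block only tells terraform init which plugin to download, while the provider block tells that plugin which workspace to talk to and how to authenticate.

# Consumed by "terraform init" -- downloads and pins the plugin, no authentication here.
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}

# Configures the downloaded plugin -- this is where the host and credentials go.
provider "databricks" {
  host = azurerm_databricks_workspace.example.workspace_url
}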

This post on azure - Terraform Azure Databricks provider error is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/69982945/
