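azure - aci-connector-linux pod on Azure AKS is stuck in CrashLoopBackOff

Tags: azure kubernetes terraform azure-aks azure-cli

I'm running into a problem while trying to set up virtual nodes for an Azure Kubernetes cluster with Terraform.

When I inspect the aci-connector-linux pod, I see the following events: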

Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   Pulled   41m (x50 over 4h26m)    kubelet  Container image "mcr.microsoft.com/oss/virtual-kubelet/virtual-kubelet:1.4.1" already present on machine
  Warning  BackOff  68s (x1222 over 4h26m)  kubelet  Back-off restarting failed container

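I have also granted the required Contributor role to the system-assigned identity of the Azure Kubernetes cluster, following the example here: https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/examples/kubernetes/aci_connector_linux/main.tf. The pod still ends up in CrashLoopBackOff.

Best answer

I finally fixed it.

The problem was caused by the outdated example for aci-connector-linux here: https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/examples/kubernetes/aci_connector_linux/main.tf. It assigns the role to the managed identity of the Azure Kubernetes cluster, which is not the identity the connector runs as.

Here is how I fixed it:

Azure Kubernetes Service creates a node resource group that is separate from the resource group of the Kubernetes cluster. In the node resource group, AKS creates a managed identity for aci-connector-linux. The node resource group is usually named MC_<KubernetesResourceGroupName>_<KubernetesServiceName>_<KubernetesResourceGroupLocation>. So if KubernetesResourceGroupName is MyResourceGroup, KubernetesServiceName is my-test-cluster, and KubernetesResourceGroupLocation is westeurope, the node resource group will be MC_MyResourceGroup_my-test-cluster_westeurope. You can view its resources in the Azure portal under Resource groups.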
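
The naming rule above can be sketched as a small shell helper. The function name default_node_rg is hypothetical and only illustrates the default pattern; because the node resource group name can be customized when the cluster is created, the az aks show query in the trailing comment is likely the more reliable way to look it up.

```shell
# Build the default AKS node resource group name:
#   MC_<resource group>_<cluster name>_<location>
# default_node_rg is a hypothetical helper, not part of any Azure tooling.
default_node_rg() {
  printf 'MC_%s_%s_%s\n' "$1" "$2" "$3"
}

# Values taken from the example in the answer.
default_node_rg "MyResourceGroup" "my-test-cluster" "westeurope"
# prints: MC_MyResourceGroup_my-test-cluster_westeurope

# To ask Azure for the actual name instead of deriving it:
#   az aks show --resource-group MyResourceGroup --name my-test-cluster \
#     --query nodeResourceGroup --output tsv
```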

Next, you can find the root cause by looking at the logs of the aci-connector-linux pod with this command:

kubectl logs aci-connector-linux-577bf54d75-qm9kl -n kube-system

You will get output like this:

time="2022-06-29T15:23:38Z" level=fatal msg="error initializing provider azure: error setting up network profile: error while looking up subnet: api call to https://management.azure.com/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet?api-version=2018-08-01: got HTTP response status code 403 error code "AuthorizationFailed": The client '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' with object id '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet' or the scope is invalid. If access was recently granted, please refresh your credentials."
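
The two values needed for the fix, the object ID and the scope, are both embedded in that message. As a rough sketch, they can be pulled out with sed. The log_line variable below holds a shortened copy of the message above, and the patterns assume the wording "with object id '...'" and "over scope '...'" stays as shown.

```shell
# Shortened copy of the AuthorizationFailed message from the pod log.
log_line="The client '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' with object id '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet' or the scope is invalid."

# Capture the text between the single quotes that follow each marker phrase.
object_id=$(printf '%s\n' "$log_line" | sed -n "s/.*with object id '\([^']*\)'.*/\1/p")
scope=$(printf '%s\n' "$log_line" | sed -n "s/.*over scope '\([^']*\)'.*/\1/p")

echo "object id: $object_id"
echo "scope:     $scope"
```

The object ID identifies the identity that needs the role, and the scope is the subnet the role must be granted on.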

You can fix this in Terraform with the following code:

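# Look up the subnet used by the ACI virtual node pool.
# data.azurerm_resource_group.main is declared elsewhere in the configuration.
data "azurerm_subnet" "k8s_aci" {
  name                 = "k8s-aci-node-pool-uat-subnet"
  virtual_network_name = "sparkle-uat-vnet"
  resource_group_name  = data.azurerm_resource_group.main.name
}

# Look up the managed identity that AKS creates for the ACI connector add-on.
# It is named aciconnectorlinux-<cluster name>.
data "azuread_service_principal" "aks_aci_identity" {
  display_name = "aciconnectorlinux-${var.kubernetes_cluster_name}"
  # Wait for the cluster module so the identity exists before the lookup.
  depends_on   = [module.kubernetes_service_uat]
}

# Assign the Network Contributor role on the subnet to the ACI connector identity.
# role-assignment is a local module; var.role_definition_name.net-contrib is
# expected to resolve to "Network Contributor".
module "role_assignment_aci_nodepool_subnet" {
  source = "../../../modules/azure/role-assignment"

  role_assignment_scope        = data.azurerm_subnet.k8s_aci.id
  role_definition_name         = var.role_definition_name.net-contrib
  # object_id is the explicit object (principal) ID attribute of the data source.
  role_assignment_principal_id = data.azuread_service_principal.aks_aci_identity.object_id
}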

You can also do this with the Azure CLI command below:

az role assignment create --assignee <Object (principal) ID> --role "Network Contributor" --scope <subnet-id>

Note: the object (principal) ID is the ID shown in the error message.

For example:

az role assignment create --assignee 560df3e9b-9f64-4faf-aa7c-6tdg779f81c7 --role "Network Contributor" --scope /subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet
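
After the assignment, it may help to confirm that the role is in place and to nudge the pod instead of waiting for the next back-off retry. The two helper functions below are hypothetical wrappers around az role assignment list and kubectl delete pod; the sketch assumes the connector pod is managed by a controller that recreates it once deleted.

```shell
# Hypothetical helper: list the role names the identity holds at the given scope.
#   $1 = object (principal) ID, $2 = subnet resource ID
check_subnet_role() {
  az role assignment list --assignee "$1" --scope "$2" \
    --query "[].roleDefinitionName" --output tsv
}

# Hypothetical helper: delete the crashing pod so its replacement starts
# with the new permissions. $1 = pod name from "kubectl get pods -n kube-system".
restart_aci_connector() {
  kubectl delete pod "$1" --namespace kube-system
}

# Example usage with the values from this answer:
#   check_subnet_role 560df3e9b-9f64-4faf-aa7c-6tdg779f81c7 <subnet-id>
#   restart_aci_connector aci-connector-linux-577bf54d75-qm9kl
```

If the first call prints Network Contributor, the assignment exists. Role assignments can take a little while to propagate, so the pod may fail a few more times before it starts cleanly.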

Resources:

Aci connector linux should export the identity associated to its addon

Using Terraform to create an AKS cluster with "SystemAssigned" identity and aci_connector_linux profile enabled does not result in a creation of a virtual node

Azure Kubernetes Service Tutorial: How to Integrate AKS with Azure Container Instances

Fail to configure a load balancer (AKS)

Source: this question about the aci-connector-linux pod in CrashLoopBackOff on Azure AKS is from Stack Overflow: https://stackoverflow.com/questions/72806870/