azure - 设置 AKS 群集静态 IP、负载均衡器和入口 Controller 的正确方法是什么?

标签 azure kubernetes nginx terraform terraform-provider-azure

现在我尝试在 AKS 上配置集群已经好几天了,但我一直在文档的各个部分、SO 上的各种问题、Medium 上的文章之间跳转。所有这些都导致不断失败。

目标是获取带有 DNS 的静态 IP,我可以使用它将我的应用程序连接到部署在 AKS 上的服务器。

我已经通过 terraform 创建了基础设施,其中包含一个资源组,我在其中创建了公共(public) IP 和 AKS 集群,到目前为止一切顺利。

resource groups

fixi-resource-group

MC_fixit-resource-group

尝试使用使用选项 http_application_routing_enabled = true 时安装的入口 Controller 后关于集群创建,文档不鼓励生产https://learn.microsoft.com/en-us/azure/aks/http-application-routing ,我正在尝试推荐的方法并通过 Helm 安装 ingress-nginx Controller https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli .

在 terraform 中,我像这样安装它

资源组和集群

resource "azurerm_resource_group" "resource_group" {
  name     = var.resource_group_name
  location = var.location
    tags = {
    Environment = "Test"
    Team = "DevOps"
  }
}


resource "azurerm_kubernetes_cluster" "server_cluster" {
  name                = "server_cluster"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  dns_prefix          = "fixit"
  kubernetes_version = var.kubernetes_version
  # sku_tier = "Paid"

  default_node_pool {
    name       = "default"
    node_count = 1
    min_count = 1
    max_count = 3
    # vm_size    = "standard_b2s_v5"
    # vm_size    = "standard_e2bs_v5"
    vm_size    = "standard_b4ms"
    type = "VirtualMachineScaleSets"
    enable_auto_scaling = true
    enable_host_encryption = false
    # os_disk_size_gb = 30

    
    # enable_node_public_ip = true

  }

  service_principal {
    client_id = var.sp_client_id
    client_secret = var.sp_client_secret
  }

  tags = {
    Environment = "Production"
  }

  linux_profile {
    admin_username = "azureuser"
    ssh_key {
        key_data = var.ssh_key
    }
  }
  network_profile {
      network_plugin = "kubenet"
      load_balancer_sku = "standard"
      # load_balancer_sku = "basic"
    
  }
  # http_application_routing_enabled = true
  http_application_routing_enabled = false

  
}

公共(public)IP

resource "azurerm_public_ip" "public-ip" {
  name                = "fixit-public-ip"
  location            = var.location
  resource_group_name = var.resource_group_name
  allocation_method   = "Static"
  domain_name_label = "fixit"
  sku = "Standard"
}

负载均衡器

resource "kubernetes_service" "cluster-ingress" {
  metadata {
    name = "cluster-ingress-svc"
    annotations = {
      "service.beta.kubernetes.io/azure-load-balancer-resource-group" = "fixit-resource-group"

      # Warning  SyncLoadBalancerFailed  2m38s (x8 over 12m)  service-controller  Error syncing load balancer: 
      # failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236 
      # in resource group MC_fixit-resource-group_server_cluster_westeurope

      # "service.beta.kubernetes.io/azure-load-balancer-resource-group" = "MC_fixit-resource-group_server_cluster_westeurope"



      # kubernetes.io/ingress.class: addon-http-application-routing
    }
  }
  spec {
    # type = "Ingress"
    type = "LoadBalancer"
    load_balancer_ip = var.public_ip_address
    selector = {
      name = "cluster-ingress-svc"
    }
    port {
      name = "cluster-port"
      protocol = "TCP"
      port = 3000
      target_port = "80"
    }

  }
}

入口 Controller

resource "helm_release" "nginx" {
  name = "ingress-nginx"
  repository = "https://kubernetes.github.io/ingress-nginx"
  chart = "ingress-nginx"
  namespace = "default"

  set {
    name  = "rbac.create"
    value = "false"
  }

  set {
    name  = "controller.service.externalTrafficPolicy"
    value = "Local"
  }

  set {
    name  = "controller.service.loadBalancerIP"
    value = var.public_ip_address
  }

  set {
    name  = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
    value = "true"
  }
  
#   --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz

  set {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
    value = "/healthz"
  }
} 

但是安装失败,并显示来自 terraform 的此消息

Warning: Helm release "ingress-nginx" was created but has a failed status. Use the `helm` command to investigate the error, correct it, then run Terraform again.
│ 
│   with module.ingress_controller.helm_release.nginx,
│   on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│    2: resource "helm_release" "nginx" {
│ 
╵
╷
│ Error: timed out waiting for the condition
│ 
│   with module.ingress_controller.helm_release.nginx,
│   on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│    2: resource "helm_release" "nginx" {

Controller 打印输出

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name:                     ingress-nginx-controller
Namespace:                default
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.5.1
                          helm.sh/chart=ingress-nginx-4.4.2
Annotations:              meta.helm.sh/release-name: ingress-nginx
                          meta.helm.sh/release-namespace: default
                          service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
                          service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.0.173.243
IPs:                      10.0.173.243
IP:                       52.157.90.236
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31709/TCP
Endpoints:                
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  30045/TCP
Endpoints:                
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32500
Events:
  Type     Reason                  Age                 From                Message
  ----     ------                  ----                ----                -------
  Normal   EnsuringLoadBalancer    32s (x5 over 108s)  service-controller  Ensuring load balancer
  Warning  SyncLoadBalancerFailed  31s (x5 over 107s)  service-controller  Error syncing load balancer: failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236 in resource group mc_fixit-resource-group_server_cluster_westeurope


vincenzocalia@vincenzos-MacBook-Air helm_charts % az aks show --resource-group fixit-resource-group --name server_cluster --query nodeResourceGroup -o tsv
MC_fixit-resource-group_server_cluster_westeurope

为什么它在 MC_fixit-resource-group_server_cluster_westeurope 中查找资源组而不是 fixit-resource-group我为集群、公共(public) IP 和负载均衡器创建了?

如果我将 Controller 负载均衡器 IP 更改为 MC_fixit-resource-group_server_cluster_westeurope 中的公共(public) IP然后terraform仍然输出相同的错误,但是 Controller 打印出正确分配给ip和负载均衡器

set {
    name  = "controller.service.loadBalancerIP"
    value = "20.73.192.77" #var.public_ip_address
  }
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get svc
NAME                                 TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
cluster-ingress-svc                  LoadBalancer   10.0.110.114   52.157.90.236   3000:31863/TCP               104m
ingress-nginx-controller             LoadBalancer   10.0.106.201   20.73.192.77    80:30714/TCP,443:32737/TCP   41m
ingress-nginx-controller-admission   ClusterIP      10.0.23.188    <none>          443/TCP                      41m
kubernetes                           ClusterIP      10.0.0.1       <none>          443/TCP                      122m
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name:                     ingress-nginx-controller
Namespace:                default
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.5.1
                          helm.sh/chart=ingress-nginx-4.4.2
Annotations:              meta.helm.sh/release-name: ingress-nginx
                          meta.helm.sh/release-namespace: default
                          service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
                          service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.0.106.201
IPs:                      10.0.106.201
IP:                       20.73.192.77
LoadBalancer Ingress:     20.73.192.77
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30714/TCP
Endpoints:                
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32737/TCP
Endpoints:                
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32538
Events:
  Type    Reason                Age                From                Message
  ----    ------                ----               ----                -------
  Normal  EnsuringLoadBalancer  39m (x2 over 41m)  service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   39m (x2 over 41m)  service-controller  Ensured load balancer
vincenzocalia@vincenzos-MacBook-Air helm_charts % 

在这里阅读https://learn.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks

To enable this architecture, each AKS deployment spans two resource groups: You create the first resource group. This group contains only the Kubernetes service resource. The AKS resource provider automatically creates the second resource group during deployment. An example of the second resource group is MC_myResourceGroup_myAKSCluster_eastus. For information on how to specify the name of this second resource group, see the next section. The second resource group, known as the node resource group, contains all of the infrastructure resources associated with the cluster. These resources include the Kubernetes node VMs, virtual networking, and storage. By default, the node resource group has a name like MC_myResourceGroup_myAKSCluster_eastus. AKS automatically deletes the node resource group whenever the cluster is deleted, so it should only be used for resources that share the cluster's lifecycle.

我应该根据我创建的资源类型传递第一组还是第二组? 例如。 kubernetes_service需要第一个 rg,而 azurerm_public_ip需要第二个 rg?

我在这里错过了什么? 请像我 5 岁一样解释一下,因为我现在的感觉就是..

非常感谢

最佳答案

终于找到问题所在了。

事实上,需要在节点资源组中创建公共(public)IP,因为入口 Controller 将loadBalancerIP分配给 >公共(public)IP地址,将在节点资源组中查找它,因此如果您在资源组中创建它,则会失败并出现我收到的错误.

节点资源组名称是在集群创建时分配的,例如。 MC_myResourceGroup_myAKSCluster_eastus,但您可以使用参数 node_resource_group = var.node_resource_group_name 随意命名。

此外,公共(public) IP sku “标准”(待指定)或 “基本”(默认),以及集群 load_balancer_sku "standard""basic"(没有默认值,需要指定)必须匹配。

我还将公共(public) IP 放入集群模块中,以便它可以依赖它,以避免在它之前创建并由于尚未创建节点资源组而失败,无法设置该依赖关系在 main.tf 文件中正确存在。

所以现在的工作配置是:

主要

terraform {
  required_version = ">=1.1.0"
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
       version = "~> 3.0.2"
    }
  }
}

provider "azurerm" {
  features {
    resource_group {
      prevent_deletion_if_contains_resources = false
    }
  }
  subscription_id   = var.azure_subscription_id
  tenant_id         = var.azure_subscription_tenant_id
  client_id         = var.service_principal_appid
  client_secret     = var.service_principal_password
}


provider "kubernetes" {
  host = "${module.cluster.host}"
  client_certificate = "${base64decode(module.cluster.client_certificate)}"
  client_key = "${base64decode(module.cluster.client_key)}"
  cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}

provider "helm" {
  kubernetes {
    host                   = "${module.cluster.host}"
    client_certificate     = "${base64decode(module.cluster.client_certificate)}"
    client_key             = "${base64decode(module.cluster.client_key)}"
    cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
  }
}



module "cluster" {
  source = "./modules/cluster"
  location = var.location
  vm_size = var.vm_size
  resource_group_name = var.resource_group_name
  node_resource_group_name = var.node_resource_group_name
  kubernetes_version = var.kubernetes_version
  ssh_key = var.ssh_key
  sp_client_id = var.service_principal_appid
  sp_client_secret = var.service_principal_password
}




module "ingress-controller" {
  source = "./modules/ingress-controller"
  public_ip_address = module.cluster.public_ip_address
  depends_on = [
    module.cluster.public_ip_address
  ]
}

集群

resource "azurerm_resource_group" "resource_group" {
  name     = var.resource_group_name
  location = var.location
    tags = {
    Environment = "test"
    Team = "DevOps"
  }
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
  name                = "server_cluster"
  ### choose the resource goup to use for the cluster
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  ### decide the name of the cluster "node" resource group, if unset will be named automatically 
  node_resource_group = var.node_resource_group_name
  dns_prefix          = "fixit"
  kubernetes_version = var.kubernetes_version
  # sku_tier = "Paid"

  default_node_pool {
    name       = "default"
    node_count = 1
    min_count = 1
    max_count = 3
    vm_size = var.vm_size

    type = "VirtualMachineScaleSets"
    enable_auto_scaling = true
    enable_host_encryption = false
    # os_disk_size_gb = 30
  }

  service_principal {
    client_id = var.sp_client_id
    client_secret = var.sp_client_secret
  }

  tags = {
    Environment = "Production"
  }

  linux_profile {
    admin_username = "azureuser"
    ssh_key {
        key_data = var.ssh_key
    }
  }
  network_profile {
      network_plugin = "kubenet"
      load_balancer_sku = "basic" 
    
  }
  http_application_routing_enabled = false
  depends_on = [
    azurerm_resource_group.resource_group
  ]
}

resource "azurerm_public_ip" "public-ip" {
  name                = "fixit-public-ip"
  location            = var.location
  # resource_group_name = var.resource_group_name
  resource_group_name = var.node_resource_group_name
  allocation_method   = "Static"
  domain_name_label = "fixit"
  # sku = "Standard"

depends_on = [
  azurerm_kubernetes_cluster.server_cluster
]
}

入口 Controller

resource "helm_release" "nginx" {
  name      = "ingress-nginx"
  repository = "ingress-nginx"
  chart     = "ingress-nginx/ingress-nginx"
  namespace = "default"

  set {
    name  = "controller.service.externalTrafficPolicy"
    value = "Local"
  }

  set {
    name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
    value = "true"
  }

  set {
    name  = "controller.service.loadBalancerIP"
    value = var.public_ip_address
  }

  set {
    name  = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path"
    value = "/healthz"
  }
} 

入口服务

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-service
  # namespace: default
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
spec:
  ingressClassName: nginx
  rules:
    # - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP


### Node.js server
  - http:
      paths:
      - path: /(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: server-clusterip-service
            port:
              number: 80 

  - http:
      paths:
      - path: /server(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: server-clusterip-service
            port:
              number: 80
...

other services omitted

希望这可以帮助其他在正确设置方面遇到困难的人。 干杯。

关于azure - 设置 AKS 群集静态 IP、负载均衡器和入口 Controller 的正确方法是什么?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/75247192/

相关文章:

jquery - nginx comet 使用 jquery 进行长轮询

linux - Bash 脚本中评估 azure 命令时的 while 循环被破坏

.net - Azure 应用程序网关返回随机 502 错误

Azure 存储 Blob 错误 - AnonymousClientOtherError 和 AnonymousNetworkError

amazon-web-services - Terraform AWS EKS 安全组问题

ubuntu - Docker 已启动,但无法在端口 8000 访问 localhost

azure - 缓存管理 Power BI 服务和 Azure Analysis Services 多维数据集

postgresql - 在 kubernetes pod 被杀死后释放 JDBC 连接

go - kubebuilder的client.List方法如何使用?

php - 无法使用 nginx 实现 php