docker - "Kind" fails to create a cluster while writing the node configuration

Tags: docker kubernetes load-balancing kind

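I am trying to set up a kind cluster for my Kubernetes cluster. Unfortunately, it fails at the "Writing configuration" step, right after the nodes have been prepared. I have attached the output and some information below. Thanks in advance for your help!

Cheers

Error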

$ kind create cluster --config kind-config.yaml 

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.20.2) 🖼 
 ✓ Preparing nodes 📦 📦 📦 📦  
 ✗ Writing configuration 📜 
ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get kubernetes version from node: failed to get file: command "docker exec --privileged kind-worker3 cat /kind/version" failed with error: exit status 1
Command Output: Error response from daemon: Container c41566958be2239a9470ef2ea636c4b21958ee7620086f526954a02e4a605106 is not running

kind config YAML

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker

My nodes

$ kubectl get nodes -o wide

NAME      STATUS   ROLES                  AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION     CONTAINER-RUNTIME
gemini    Ready    control-plane,master   3d18h   v1.20.2   192.168.2.203   <none>        Ubuntu 20.10   5.8.0-1015-raspi   docker://19.3.13
phoenix   Ready    <none>                 3d17h   v1.20.2   192.168.2.129   <none>        Ubuntu 20.10   5.8.0-1015-raspi   docker://19.3.13
taurus    Ready    <none>                 3d17h   v1.20.2   192.168.2.201   <none>        Ubuntu 20.10   5.8.0-1015-raspi   docker://19.3.13
virgo     Ready    <none>                 3d17h   v1.20.2   192.168.2.202   <none>        Ubuntu 20.10   5.8.0-1015-raspi   docker://19.3.13

What is running on my cluster

$ kubectl get all --all-namespaces

NAMESPACE              NAME                                             READY   STATUS    RESTARTS   AGE
default                pod/nginx-6799fc88d8-62cjd                       1/1     Running   1          18h
kube-system            pod/calico-kube-controllers-86bddfcff-ccrhg      1/1     Running   7          3d18h
kube-system            pod/calico-node-jddnl                            1/1     Running   4          3d17h
kube-system            pod/calico-node-nxwlw                            0/1     Running   7          3d18h
kube-system            pod/calico-node-stnzs                            0/1     Running   0          52s
kube-system            pod/calico-node-zrxzl                            1/1     Running   4          3d17h
kube-system            pod/coredns-74ff55c5b-kb2nm                      1/1     Running   7          3d18h
kube-system            pod/coredns-74ff55c5b-wsgs5                      1/1     Running   7          3d18h
kube-system            pod/etcd-gemini                                  1/1     Running   8          3d18h
kube-system            pod/kube-apiserver-gemini                        1/1     Running   8          3d18h
kube-system            pod/kube-controller-manager-gemini               1/1     Running   11         3d18h
kube-system            pod/kube-proxy-7fcjz                             1/1     Running   8          3d18h
kube-system            pod/kube-proxy-84rr7                             1/1     Running   4          3d17h
kube-system            pod/kube-proxy-lc88w                             1/1     Running   4          3d17h
kube-system            pod/kube-proxy-v4qd9                             1/1     Running   4          3d17h
kube-system            pod/kube-scheduler-gemini                        1/1     Running   9          3d18h
kubernetes-dashboard   pod/dashboard-metrics-scraper-79c5968bdc-mlb4s   1/1     Running   7          3d18h
kubernetes-dashboard   pod/kubernetes-dashboard-7448ffc97b-nq5c9        1/1     Running   7          3d18h

NAMESPACE              NAME                                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
default                service/kubernetes                  ClusterIP   10.96.0.1        <none>        443/TCP                  41h
kube-system            service/calico-etcd                 ClusterIP   10.96.232.136    <none>        6666/TCP                 3d18h
kube-system            service/calico-typha                ClusterIP   10.109.108.233   <none>        5473/TCP                 3d18h
kube-system            service/kube-dns                    ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP   3d18h
kubernetes-dashboard   service/dashboard-metrics-scraper   ClusterIP   10.110.70.52     <none>        8000/TCP                 3d18h
kubernetes-dashboard   service/kubernetes-dashboard        NodePort    10.106.194.127   <none>        443:31741/TCP            3d18h

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   4         4         2       4            2           kubernetes.io/os=linux   3d18h
kube-system   daemonset.apps/kube-proxy    4         4         4       4            4           kubernetes.io/os=linux   3d18h

NAMESPACE              NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
default                deployment.apps/nginx                       1/1     1            1           18h
kube-system            deployment.apps/calico-kube-controllers     1/1     1            1           3d18h
kube-system            deployment.apps/coredns                     2/2     2            2           3d18h
kubernetes-dashboard   deployment.apps/dashboard-metrics-scraper   1/1     1            1           3d18h
kubernetes-dashboard   deployment.apps/kubernetes-dashboard        1/1     1            1           3d18h

NAMESPACE              NAME                                                   DESIRED   CURRENT   READY   AGE
default                replicaset.apps/nginx-6799fc88d8                       1         1         1       18h
kube-system            replicaset.apps/calico-kube-controllers-56b44cd6d5     0         0         0       3d18h
kube-system            replicaset.apps/calico-kube-controllers-86bddfcff      1         1         1       3d18h
kube-system            replicaset.apps/coredns-74ff55c5b                      2         2         2       3d18h
kubernetes-dashboard   replicaset.apps/dashboard-metrics-scraper-79c5968bdc   1         1         1       3d18h
kubernetes-dashboard   replicaset.apps/kubernetes-dashboard-7448ffc97b        1         1         1       3d18h

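Best answer

Be prepared from the start: this answer is not specific. There is a huge, closed GitHub issue titled "Cannot create cluster due to docker exec cat /kind/version failing" that was left without a resolution, but...

The problem you hit is very broad, and the root cause can be completely different from one case to the next.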

BenTheElder - kind creator/maintainer :

This part: Command Output: Error response from daemon: Container f2a2d9c8f9c2eca9aeec7f10249eb205b02c8a5f41e5bf1145b5a8e4b63da123 is not running

That tells us that the node container is not running. That either means the entrypoint failed or your host killed it, both either due to some obscure bug we haven't found yet, or more likely an issue with your host environment.

Please file your own issue with much more details. This issue is non-specific and has discussed many different problems, as outlined above.

So please open a new issue on GitHub. That is most likely the best place to get this kind of problem resolved right now.
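
Before filing the issue, it may help to check how the node container actually stopped, since the quote above distinguishes between a failed entrypoint and the host killing the container. The following is only a rough sketch: it assumes the failed node container still exists (kind normally removes nodes after a failed create unless `--retain` is passed), it takes the container name `kind-worker3` from the error message in the question, and the helper `describe_exit` and its wording are hypothetical, not part of kind or Docker.

```shell
#!/usr/bin/env bash
# Sketch: summarize how a kind node container stopped.
# Assumption: the container name defaults to the one in the error message.
node="${1:-kind-worker3}"

# Hypothetical helper: turn Docker's exit code and OOMKilled flag into a hint.
# $1 = exit code, $2 = OOMKilled ("true" or "false")
describe_exit() {
  if [ "$2" = "true" ]; then
    echo "killed by the kernel OOM killer"
  elif [ "$1" = "137" ]; then
    # 137 = 128 + 9, i.e. the process received SIGKILL
    echo "killed with SIGKILL (exit 137), possibly by the host"
  elif [ "$1" = "0" ]; then
    echo "exited cleanly"
  else
    echo "entrypoint exited with code $1"
  fi
}

if command -v docker >/dev/null 2>&1 && docker inspect "$node" >/dev/null 2>&1; then
  status="$(docker inspect --format '{{.State.Status}}' "$node")"
  code="$(docker inspect --format '{{.State.ExitCode}}' "$node")"
  oom="$(docker inspect --format '{{.State.OOMKilled}}' "$node")"
  echo "$node: status=$status, $(describe_exit "$code" "$oom")"
  # The last lines of the container log often show the entrypoint error.
  docker logs --tail 50 "$node"
else
  echo "container $node not found (was the cluster created with --retain?)"
fi
```

The hint printed by `describe_exit` is only a starting point; the container log and the host's own logs are what a maintainer would need to see.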

I also found that your problem may come from Docker having been installed with snap. See https://github.com/kubernetes-sigs/kind/issues/1288#issuecomment-631673479 . Docker from snap has known issues when working with kind, and it is not really supported by the kind team:

snap is in the known-issue document, the snap docker package has a number of issues, e.g. no access to temp directories. I don't recommend snap for docker and we don't really support this.


A small note: we've worked around most of the snap issues for now if you're just managing clusters, but I still don't recommend snap for docker.
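
To see whether this applies to you, one quick check is where the `docker` binary on your `PATH` lives. This is a minimal sketch under the assumption that a snap-installed Docker is exposed under `/snap/bin` (or `/var/lib/snapd/snap` on some distributions); the helper name `is_snap_path` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: guess whether the docker CLI on PATH comes from a snap package.

# Hypothetical helper: succeeds if the path looks like a snap-provided binary.
is_snap_path() {
  case "$1" in
    /snap/*|/var/lib/snapd/snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Do not resolve symlinks here: snap entries are symlinks to the snap launcher.
docker_path="$(command -v docker || true)"

if [ -z "$docker_path" ]; then
  echo "docker not found on PATH"
elif is_snap_path "$docker_path"; then
  echo "docker appears to come from snap: $docker_path"
else
  echo "docker does not appear to come from snap: $docker_path"
fi
```

Running `snap list docker` is another way to confirm whether the snap package is installed at all.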

If you're seeing an issue similar to this, it means the node container exited early for some reason. That usually means the host environment is broken, but occasionally has meant we need to work-around e.g. less common filesystems with device mapper issues.

Please attempt to capture node logs with kind create cluster --retain, kind export logs, and file an issue with the logs uploaded. We'll try to identify the cause based on these.

EDIT: to be extra clear: "Cannot create cluster due to docker exec cat /kind/version failing" is a symptom, please file an issue with the details on your specific failure so we can identify the actual root cause if you encounter this.

This issue is locked because it got off-topic from the original root cause and kept being used for new problems that just happen to have the same symptom. This symptom is common for edge cases with the nodes suddenly terminating very early on because it's one of the first actions we take against the running node.
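
The log collection the maintainer asks for can be sketched roughly as follows. The commands `kind create cluster --retain` and `kind export logs` are the ones named in the quote above; the wrapper function `collect_kind_logs`, the default config file name (taken from the question), and the output directory `./kind-logs` are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch: recreate the cluster with --retain and export the node logs.

# Hypothetical wrapper. $1 = kind config file, $2 = directory for the logs.
collect_kind_logs() {
  config="$1"
  logdir="$2"
  # --retain keeps the node containers around even if creation fails,
  # so their logs can still be collected afterwards.
  kind create cluster --retain --config "$config" \
    || echo "cluster creation failed; node containers were retained" >&2
  # Write the logs of all nodes into the given directory.
  kind export logs "$logdir"
}

if command -v kind >/dev/null 2>&1; then
  collect_kind_logs "${1:-kind-config.yaml}" "${2:-./kind-logs}"
  # When finished, remove the retained nodes with: kind delete cluster
else
  echo "kind not found on PATH" >&2
fi
```

The exported directory is what you would attach to the new GitHub issue.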

The original question is on Stack Overflow: https://stackoverflow.com/questions/66134285/
