deployment - kubelet does not register after initial deployment and requires a restart

Tags: deployment kubernetes kubelet

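I am seeing strange behavior from the kubelet: shortly after the cluster boots, the kubelet does not register with the API server.

Interestingly, if I restart the kubelet daemon, it registers correctly and everything works as expected, which leads me to believe this is a synchronization (timing) issue. (I am using CoreOS with cloud-config, and the kubelet is configured as a systemd unit.)

Shortly after the Kubernetes node is deployed, the kubelet log shows only the following entries and nothing more: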

-- Logs begin at Wed 2017-01-11 10:59:51 UTC, end at Wed 2017-01-11 11:58:35 UTC. --
Jan 11 11:00:47 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:00:47 worker0 kubelet[1712]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793484    1712 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793603    1712 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:00:47 worker0 kubelet[1712]: E0111 11:00:47.793740    1712 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.804434    1712 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"

If I restart the kubelet, I see the expected log output and it registers with the API server as expected. The kubelet log after the restart is as follows:

-- Logs begin at Wed 2017-01-11 10:59:51 UTC, end at Wed 2017-01-11 11:58:35 UTC. --
Jan 11 11:00:47 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:00:47 worker0 kubelet[1712]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793484    1712 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.793603    1712 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:00:47 worker0 kubelet[1712]: E0111 11:00:47.793740    1712 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:00:47 worker0 kubelet[1712]: I0111 11:00:47.804434    1712 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"
Jan 11 11:58:26 worker0 systemd[1]: Stopping Kubernetes Kubelet...
Jan 11 11:58:26 worker0 systemd[1]: Stopped Kubernetes Kubelet.
Jan 11 11:58:26 worker0 systemd[1]: Started Kubernetes Kubelet.
Jan 11 11:58:26 worker0 kubelet[5180]: Flag --api-servers has been deprecated, Use --kubeconfig instead. Will be removed in a future version.
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.501190    5180 docker.go:375] Connecting to docker on unix:///var/run/docker.sock
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.501525    5180 docker.go:395] Start docker client with request timeout=2m0s
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.501775    5180 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.521821    5180 manager.go:140] cAdvisor running in container: "/system.slice/kubelet.service"
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.554844    5180 manager.go:148] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: ge
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.562578    5180 fs.go:116] Filesystem partitions: map[/dev/sda3:{mountpoint:/usr major:8 minor:3 fsType:ext4 blockSize:0} /dev/sda6:{mou
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.567504    5180 manager.go:195] Machine: {NumCores:2 CpuFrequency:2299998 MemoryCapacity:1045340160 MachineID:bed23c2c06d642f1904ebbe67a
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.572042    5180 manager.go:201] Version: {KernelVersion:4.7.3-coreos-r3 ContainerOsVersion:CoreOS 1185.5.0 (MoreOS) DockerVersion:1.11.2
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.574264    5180 kubelet.go:255] Adding manifest file: /opt/kubernetes/manifests
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.574340    5180 kubelet.go:265] Watching apiserver
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.633161    5180 kubelet_network.go:71] Hairpin mode set to "promiscuous-bridge" but configureCBR0 is false, falling back to "hairpin-vet
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.633682    5180 kubelet.go:516] Hairpin mode set to "hairpin-veth"
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.641810    5180 docker_manager.go:242] Setting dockerRoot to /var/lib/docker
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.642560    5180 kubelet_network.go:306] Setting Pod CIDR:  -> 172.20.31.1/24
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.644117    5180 server.go:714] Started kubelet v1.4.0
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.647154    5180 kubelet.go:1094] Image garbage collection failed: unable to find data for container /
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.650196    5180 kubelet_node_status.go:194] Setting node annotation to enable volume controller attach/detach
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.651955    5180 server.go:118] Starting to listen on 0.0.0.0:10250
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.668376    5180 kubelet.go:2127] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable
Jan 11 11:58:26 worker0 kubelet[5180]: E0111 11:58:26.668432    5180 kubelet.go:2135] Failed to check if disk space is available on the root partition: failed to get fs info for "root": una
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674021    5180 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674110    5180 status_manager.go:129] Starting to sync pod status with apiserver
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674141    5180 kubelet.go:2229] Starting kubelet main sync loop.
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.674208    5180 kubelet.go:2240] skipping pod synchronization - [network state unknown container runtime is down]
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.675339    5180 volume_manager.go:234] Starting Kubelet Volume Manager
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.713597    5180 factory.go:295] Registering Docker factory
Jan 11 11:58:26 worker0 kubelet[5180]: W0111 11:58:26.717164    5180 manager.go:244] Registration of the rkt container factory failed: unable to communicate with Rkt api service: rkt: canno
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.717777    5180 factory.go:54] Registering systemd factory
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.719843    5180 factory.go:86] Registering Raw factory
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.723229    5180 manager.go:1082] Started watching for new ooms in manager
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.725579    5180 oomparser.go:185] oomparser using systemd
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.728010    5180 manager.go:285] Starting recovery of all containers
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.837552    5180 kubelet_node_status.go:194] Setting node annotation to enable volume controller attach/detach
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.878400    5180 kubelet_node_status.go:64] Attempting to register node worker0
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.919196    5180 kubelet_node_status.go:67] Successfully registered node worker0
Jan 11 11:58:26 worker0 kubelet[5180]: I0111 11:58:26.924483    5180 kubelet_network.go:306] Setting Pod CIDR: 172.20.31.1/24 ->
Jan 11 11:58:27 worker0 kubelet[5180]: I0111 11:58:27.104781    5180 manager.go:290] Recovery completed

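Any idea how to solve this kind of problem?

Thanks,
Davide

Best answer

It sounds like there is a delay while waiting for Docker to start or for the network interface to initialize properly. I found the following issue, which looks identical to yours: https://github.com/kubernetes/kubernetes/issues/33789#issuecomment-251251196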

The fix could be adding a condition that "if configure-cbr=true AND network-plugin=none or noop", then do not check /etc/default/docker to decide whether to restart docker.
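
If the cause really is a startup race with Docker, one possible workaround (a sketch under that assumption, not a confirmed fix for the linked issue) is a systemd drop-in that orders the kubelet after Docker. The unit name kubelet.service comes from the logs above; docker.service and the drop-in file name are assumptions about this setup.

```ini
# Hypothetical drop-in path: /etc/systemd/system/kubelet.service.d/10-wait-for-docker.conf
# Assumes Docker runs as docker.service on this host.
[Unit]
# Start the kubelet only after Docker's start job has completed.
After=docker.service
# Pull Docker in when the kubelet starts; an explicit stop or restart
# of docker.service is also propagated to the kubelet.
Requires=docker.service
```

After adding the drop-in, run `systemctl daemon-reload` and reboot the node to check whether the kubelet registers on first start. If the kubelet still stalls after the cAdvisor line, the race is probably elsewhere (for example, in network interface setup) and this ordering alone will not help.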

For more on "deployment - kubelet does not register after initial deployment and requires a restart", see the similar question on Stack Overflow: https://stackoverflow.com/questions/41590765/
