kubernetes - Error running kubelet

Tags: kubernetes, lxc

I am trying to start kubelet in a Fedora 24 LXC container, but I get an error that appears to be related to libvirt/iptables.

Docker (installed via dnf/yum):

[root@node2 ~]# docker version
 Client:
  Version:      1.12.0
  API version:  1.24
  Go version:   go1.6.3
  Git commit:   8eab29e
  Built:        
  OS/Arch:      linux/amd64

 Server:
  Version:      1.12.0
  API version:  1.24
  Go version:   go1.6.3
  Git commit:   8eab29e
  Built:        
  OS/Arch:      linux/amd64

Kubernetes (v1.3.3 downloaded and extracted from the tar):

[root@node2 bin]# ./kubectl version
Client Version: version.Info{
 Major:"1", Minor:"3", GitVersion:"v1.3.3", 
 GitCommit:"c6411395e09da356c608896d3d9725acab821418", 
 GitTreeState:"clean", BuildDate:"2016-07-22T20:29:38Z", 
 GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}

Startup, flags, and the error:

[root@node2 bin]# ./kubelet --address=0.0.0.0 --api-servers=http://master1:8080 --container-runtime=docker --hostname-override=node1 --port=10250
I0802 17:43:04.264454    2348 docker.go:327] Start docker client with request timeout=2m0s
W0802 17:43:04.271850    2348 server.go:487] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory. Trying auth path instead.
W0802 17:43:04.271906    2348 server.go:448] Could not load kubernetes auth path /var/lib/kubelet/kubernetes_auth: stat /var/lib/kubelet/kubernetes_auth: no such file or directory. Continuing with defaults.
I0802 17:43:04.272241    2348 manager.go:138] cAdvisor running in container: "/"
W0802 17:43:04.275956    2348 manager.go:146] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: getsockopt: connection refused
I0802 17:43:04.280283    2348 fs.go:139] Filesystem partitions: map[/dev/mapper/fedora_kg--fedora-root:{mountpoint:/ major:253 minor:0 fsType:ext4 blockSize:0}]
I0802 17:43:04.284868    2348 manager.go:192] Machine: {NumCores:4 CpuFrequency:3192789 
 MemoryCapacity:4125679616 MachineID:1e80444278b7442385a762b9545cec7b 
 SystemUUID:5EC24D56-9CA6-B237-EE21-E0899C3C16AB BootID:44212209-ff1d-4340-8433-11a93274d927 
 Filesystems:[{Device:/dev/mapper/fedora_kg--fedora-root 
  Capacity:52710469632 Type:vfs Inodes:3276800}] 
 DiskMap:map[8:0:{Name:sda Major:8 Minor:0 Size:85899345920 Scheduler:cfq} 
  253:0:{Name:dm-0 Major:253 Minor:0 Size:53687091200 Scheduler:none} 
  253:1:{Name:dm-1 Major:253 Minor:1 Size:4160749568 Scheduler:none} 
  253:2:{Name:dm-2 Major:253 Minor:2 Size:27518828544 Scheduler:none} 
  253:3:{Name:dm-3 Major:253 Minor:3 Size:107374182400 Scheduler:none}] 
 NetworkDevices:[
  {Name:eth0 MacAddress:00:16:3e:b9:ce:f3 Speed:10000 Mtu:1500} 
  {Name:flannel.1 MacAddress:fa:ed:34:75:d6:1d Speed:0 Mtu:1450}] 
 Topology:[
  {Id:0 Memory:4125679616 
  Cores:[{Id:0 Threads:[0] 
  Caches:[]} {Id:1 Threads:[1] Caches:[]}] 
  Caches:[{Size:8388608 Type:Unified Level:3}]} 
  {Id:1 Memory:0 Cores:[{Id:0 Threads:[2] 
  Caches:[]} {Id:1 Threads:[3] Caches:[]}] 
  Caches:[{Size:8388608 Type:Unified Level:3}]}] 
 CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I0802 17:43:04.285649    2348 manager.go:198] 
  Version: {KernelVersion:4.6.4-301.fc24.x86_64 ContainerOsVersion:Fedora 24 (Twenty Four) 
  DockerVersion:1.12.0 CadvisorVersion: CadvisorRevision:}
I0802 17:43:04.286366    2348 server.go:768] Watching apiserver
W0802 17:43:04.286477    2348 kubelet.go:561] Hairpin mode set to "promiscuous-bridge" but configureCBR0 is false, falling back to "hairpin-veth"
I0802 17:43:04.286575    2348 kubelet.go:384] Hairpin mode set to "hairpin-veth"
W0802 17:43:04.303188    2348 plugins.go:170] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory
I0802 17:43:04.307700    2348 docker_manager.go:235] Setting dockerRoot to /var/lib/docker
I0802 17:43:04.310175    2348 server.go:730] Started kubelet v1.3.3
E0802 17:43:04.311636    2348 kubelet.go:933] Image garbage collection failed: unable to find data for container /
E0802 17:43:04.312800    2348 kubelet.go:994] Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system]
I0802 17:43:04.312962    2348 status_manager.go:123] Starting to sync pod status with apiserver
I0802 17:43:04.313080    2348 kubelet.go:2468] Starting kubelet main sync loop.
I0802 17:43:04.313187    2348 kubelet.go:2477] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system] network state unknown container runtime is down]
I0802 17:43:04.313525    2348 server.go:117] Starting to listen on 0.0.0.0:10250
I0802 17:43:04.315021    2348 volume_manager.go:216] Starting Kubelet Volume Manager
I0802 17:43:04.325998    2348 factory.go:228] Registering Docker factory
E0802 17:43:04.326049    2348 manager.go:240] Registration of the rkt container factory failed: unable to communicate with Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp 127.0.0.1:15441: getsockopt: connection refused
I0802 17:43:04.326073    2348 factory.go:54] Registering systemd factory
I0802 17:43:04.326545    2348 factory.go:86] Registering Raw factory
I0802 17:43:04.326993    2348 manager.go:1072] Started watching for new ooms in manager
I0802 17:43:04.331164    2348 oomparser.go:185] oomparser using systemd
I0802 17:43:04.331904    2348 manager.go:281] Starting recovery of all containers
I0802 17:43:04.368958    2348 manager.go:286] Recovery completed
I0802 17:43:04.419959    2348 kubelet.go:1185] Node node1 was previously registered
I0802 17:43:09.313871    2348 kubelet.go:2477] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/kernel/panic: read-only file system, open /proc/sys/kernel/panic_on_oops: read-only file system, open /proc/sys/vm/overcommit_memory: read-only file system]]
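
The fatal lines here are the ContainerManager errors: kubelet needs to write to /proc/sys/kernel/panic, /proc/sys/kernel/panic_on_oops and /proc/sys/vm/overcommit_memory, but inside the container those paths sit on a read-only mount. This can be confirmed from inside the container (a diagnostic sketch; the exact mount layout depends on how LXC set the container up):

[root@node2 ~]# # list the proc and sysfs mounts; an "ro" entry covering
[root@node2 ~]# # /proc/sys or /sys explains the errors above
[root@node2 ~]# grep -E ' (proc|sysfs) ' /proc/mounts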

Flannel (installed via dnf/yum):

[root@node2 bin]# systemctl status flanneld
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled;  vendor preset: disabled)
   Active: active (running) since Mon 2016-08-01 22:14:06 UTC; 21h ago
  Process: 1203 ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)
 Main PID: 1195 (flanneld)
    Tasks: 11 (limit: 512)
   Memory: 2.7M
      CPU: 4.012s
   CGroup: /system.slice/flanneld.service
           └─1195 /usr/bin/flanneld -etcd-endpoints=http://master1:2379 -etcd-prefix=/flannel/network

LXC configuration for the container:

[root@kg-fedora node2]# cat config 
# Template used to create this container: /usr/share/lxc/templates/lxc-fedora
# Parameters passed to the template:
# For additional config options, please look at lxc.container.conf(5)
# Uncomment the following line to support nesting containers:
#lxc.include = /usr/share/lxc/config/nesting.conf
# (Be aware this has security implications)
lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.hwaddr = 00:16:3e:b9:ce:f3
lxc.network.flags = up
lxc.network.ipv4 = 192.168.122.23/24
lxc.network.ipv4.gateway = 192.168.80.2
# Include common configuration
lxc.include = /usr/share/lxc/config/fedora.common.conf
lxc.arch = x86_64
# When using LXC with apparmor, uncomment the next line to run unconfined:
#lxc.aa_profile = unconfined
# example simple networking setup, uncomment to enable
#lxc.network.type = veth
#lxc.network.flags = up
#lxc.network.link = lxcbr0
#lxc.network.name = eth0
# Additional example for veth network type
#    static MAC address,
#lxc.network.hwaddr = 00:16:3e:77:52:20
#    persistent veth device name on host side
#        Note: This may potentially collide with other containers of same name!
#lxc.network.veth.pair = v-fedora-template-e0
lxc.cgroup.devices.allow = a
lxc.cap.drop =
lxc.rootfs = /var/lib/lxc/node2/rootfs
lxc.rootfs.backend = dir
lxc.utsname = node2
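
Since Docker runs nested inside this container, the nesting include that the template comments out above may also be relevant; enabling it looks like this (a sketch based on the template's own hint, path as shipped with the lxc package):

lxc.include = /usr/share/lxc/config/nesting.conf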

libvirt-1.3.3.2-1.fc24.x86_64:

[root@kg-fedora node2]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-07-29 16:33:09 EDT; 3 days ago
     Docs: man:libvirtd(8)
           http://libvirt.org
 Main PID: 1191 (libvirtd)
    Tasks: 18 (limit: 512)
   Memory: 7.3M
      CPU: 9.108s
   CGroup: /system.slice/libvirtd.service
           ├─1191 /usr/sbin/libvirtd
           ├─1597 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─1599 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper

Flannel/Docker configuration:

[root@node2 ~]# systemctl stop docker
[root@node2 ~]# ip link delete docker0
[root@node2 ~]# systemctl start docker
[root@node2 ~]# ip -4 a|grep inet
    inet 127.0.0.1/8 scope host lo
    inet 10.100.72.0/16 scope global flannel.1
    inet 172.17.0.1/16 scope global docker0
    inet 192.168.122.23/24 brd 192.168.122.255 scope global dynamic eth0

Note that the docker0 interface is not using the same IP range as the flannel.1 interface.
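
On that mismatch: the flanneld unit above already runs mk-docker-opts.sh, which writes flannel's subnet options (--bip, --mtu, etc.) to /run/flannel/docker. A common way to make the Docker daemon pick those up is a systemd drop-in (a sketch; the daemon path should be checked against the packaged unit with systemctl cat docker):

[root@node2 ~]# mkdir -p /etc/systemd/system/docker.service.d
[root@node2 ~]# cat > /etc/systemd/system/docker.service.d/flannel.conf <<'EOF'
[Service]
# source the options written by mk-docker-opts.sh (-d /run/flannel/docker)
EnvironmentFile=-/run/flannel/docker
# clear the packaged ExecStart, then start the daemon with flannel's options
ExecStart=
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
EOF
[root@node2 ~]# systemctl daemon-reload && systemctl restart docker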

Any pointers would be greatly appreciated!

Accepted answer

For anyone looking for a solution to this problem: since you are running inside LXC, you need to make sure the filesystems in question are mounted rw. That means adding the following options to the container's LXC configuration:

raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop=\nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw"

or simply

lxc.mount.auto: proc:rw sys:rw
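
(The raw.lxc form above is for LXD-managed containers.) For a plain LXC container like node2 in the question, the setting goes straight into the container's config file using the key = value syntax (a sketch, reusing the container name and path from the question):

[root@kg-fedora ~]# # append the rw mounts to the container config
[root@kg-fedora ~]# echo 'lxc.mount.auto = proc:rw sys:rw' >> /var/lib/lxc/node2/config
[root@kg-fedora ~]# # restart the container so the new mount options take effect
[root@kg-fedora ~]# lxc-stop -n node2 && lxc-start -n node2 -d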

References: https://medium.com/@kvaps/run-kubernetes-in-lxc-container-f04aa94b6c9c and https://github.com/corneliusweig/kubernetes-lxd

Source (similar question on Stack Overflow): https://stackoverflow.com/questions/38728885/
