kubernetes - Implementing a Kubernetes Master HA solution on CentOS 7

Tags: kubernetes haproxy kubectl

I am implementing an HA solution for the Kubernetes Master nodes in a CentOS 7 environment.

My environment looks like this:

K8S_Master1 : 172.16.16.5
K8S_Master2 : 172.16.16.51
HAProxy     : 172.16.16.100
K8S_Minion1 : 172.16.16.50


etcd Version: 3.1.7
Kubernetes v1.5.2
CentOS Linux release 7.3.1611 (Core)

My etcd cluster is set up correctly and in working order:
[root@master1 ~]# etcdctl cluster-health
member 282a4a2998aa4eb0 is healthy: got healthy result from http://172.16.16.51:2379
member dd3979c28abe306f is healthy: got healthy result from http://172.16.16.5:2379
member df7b762ad1c40191 is healthy: got healthy result from http://172.16.16.50:2379
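
For completeness, the same health check can be run from any node by pointing etcdctl at the peer endpoints explicitly (an optional check using the v2 etcdctl commands already shown above):

# query all three members rather than the local default endpoint
etcdctl --endpoints=http://172.16.16.5:2379,http://172.16.16.51:2379,http://172.16.16.50:2379 cluster-health
# list the members and see which one is currently the raft leader
etcdctl member list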

My K8S configuration on Master1 is:
[root@master1 ~]# cat /etc/kubernetes/apiserver 
KUBE_API_ADDRESS="--address=0.0.0.0"
KUBE_ETCD_SERVERS="--etcd_servers=http://127.0.0.1:4001"
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.100.0.0/16"
KUBE_ADMISSION_CONTROL="--admission_control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota"

[root@master1 ~]# cat /etc/kubernetes/config 
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow_privileged=false"
KUBE_MASTER="--master=http://127.0.0.1:8080"

[root@master1 ~]# cat /etc/kubernetes/controller-manager 
KUBE_CONTROLLER_MANAGER_ARGS="--leader-elect"

[root@master1 ~]# cat /etc/kubernetes/scheduler 
KUBE_SCHEDULER_ARGS="--leader-elect"
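
With --leader-elect enabled, the scheduler and controller-manager record their lock as an annotation on an Endpoints object in the kube-system namespace. A quick way to see which master currently holds it (a sketch, assuming the control-plane.alpha.kubernetes.io/leader annotation used by this generation of Kubernetes):

# show the holder of the scheduler leader lock
kubectl -n kube-system get endpoints kube-scheduler -o yaml | grep holderIdentity
# and the controller-manager lock
kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep holderIdentity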

As for Master2, I have configured it as:
[root@master2 kubernetes]# cat apiserver 
KUBE_API_ADDRESS="--address=0.0.0.0"
KUBE_ETCD_SERVERS="--etcd_servers=http://127.0.0.1:4001"
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.100.0.0/16"
KUBE_ADMISSION_CONTROL="--admission_control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota"

[root@master2 kubernetes]# cat config 
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow_privileged=false"
KUBE_MASTER="--master=http://127.0.0.1:8080"

[root@master2 kubernetes]# cat scheduler 
KUBE_SCHEDULER_ARGS=""

[root@master2 kubernetes]# cat controller-manager 
KUBE_CONTROLLER_MANAGER_ARGS=""

Note that --leader-elect is configured only on Master1, because I want Master1 to be the leader.

My HAProxy configuration is simple:
frontend K8S-Master
    bind 172.16.16.100:8080
    default_backend K8S-Master-Nodes

backend K8S-Master-Nodes
    mode        http
    balance     roundrobin
    server      master1 172.16.16.5:8080 check
    server      master2 172.16.16.51:8080 check
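
For context, the snippet above omits HAProxy's global/defaults sections. An illustrative, fuller version (the timeout values and the /healthz health check are assumptions, not part of the original setup) could look like:

global
    daemon
    maxconn 256

defaults
    mode    http
    timeout connect 5s
    timeout client  50s
    timeout server  50s

frontend K8S-Master
    bind 172.16.16.100:8080
    default_backend K8S-Master-Nodes

backend K8S-Master-Nodes
    balance     roundrobin
    option      httpchk GET /healthz
    server      master1 172.16.16.5:8080 check
    server      master2 172.16.16.51:8080 check

Checking /healthz instead of relying on a bare TCP check means HAProxy only keeps an apiserver in rotation while it actually reports healthy.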

Now, I have pointed my Minion at the load balancer IP instead of directly at the master IPs.

The configuration on the Minion is:
[root@minion kubernetes]# cat /etc/kubernetes/config 
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow_privileged=false"
KUBE_MASTER="--master=http://172.16.16.100:8080"

On both masters, I see the Minion/node status as Ready:
[root@master1 ~]# kubectl get nodes
NAME           STATUS    AGE
172.16.16.50   Ready     2h

[root@master2 ~]# kubectl get nodes
NAME           STATUS    AGE
172.16.16.50   Ready     2h

I set up an nginx pod using the following example:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

I created the Replication Controller on Master1 with the following command:
[root@master1 ~]# kubectl create -f nginx.yaml

On both masters, I can see the pods that were created:
[root@master1 ~]# kubectl get po
NAME          READY     STATUS    RESTARTS   AGE
nginx-jwpxd   1/1       Running   0          29m
nginx-q613j   1/1       Running   0          29m

[root@master2 ~]# kubectl get po
NAME          READY     STATUS    RESTARTS   AGE
nginx-jwpxd   1/1       Running   0          29m
nginx-q613j   1/1       Running   0          29m

Now, thinking about it logically, if I take the Master1 node down and delete the pods on Master2, then Master2 should recreate the pods. So that is what I do.

On Master1:
[root@master1 ~]# systemctl stop kube-scheduler ; systemctl stop kube-apiserver ; systemctl stop kube-controller-manager
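
At this point a hypothetical sanity check (not part of the original post) would be to confirm that HAProxy has taken master1 out of rotation and that the VIP still answers via master2:

# run on Master2: the locally running control plane components should still report healthy
kubectl get componentstatuses

# the VIP should still answer even though master1's apiserver is stopped
curl http://172.16.16.100:8080/healthz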

On Master2:
[root@slave1 kubernetes]# kubectl delete po --all
pod "nginx-l7mvc" deleted
pod "nginx-r3m58" deleted

Now, since the Replication Controller is still up, Master2 should create the pods. But the new pods are stuck:
[root@master2 kubernetes]# kubectl get po
NAME          READY     STATUS        RESTARTS   AGE
nginx-l7mvc   1/1       Terminating   0          13m
nginx-qv6z9   0/1       Pending       0          13m
nginx-r3m58   1/1       Terminating   0          13m
nginx-rplcz   0/1       Pending       0          13m

I have waited quite a while, but the pods stay stuck in this state.

But when I restart the services on Master1:
[root@master1 ~]# systemctl start kube-scheduler ; systemctl start kube-apiserver ; systemctl start kube-controller-manager

then I see progress on Master1:
NAME          READY     STATUS              RESTARTS   AGE
nginx-qv6z9   0/1       ContainerCreating   0          14m
nginx-rplcz   0/1       ContainerCreating   0          14m

[root@slave1 kubernetes]# kubectl get po
NAME          READY     STATUS    RESTARTS   AGE
nginx-qv6z9   1/1       Running   0          15m
nginx-rplcz   1/1       Running   0          15m

Why doesn't Master2 recreate the pods? That is the puzzle I am trying to solve. I have spent a long time getting this far towards a fully functional HA setup, and it is almost there, if only I can figure this out.

Best Answer

It seems to me the error comes from the fact that Master2 does not have the --leader-elect flag enabled. Only one scheduler and one controller-manager process can be active at a time, which is what --leader-elect is for: the flag makes the scheduler and controller-manager processes "race" to decide which instance is active at any given time. Since you did not set the flag on both masters, you had two scheduler and two controller-manager processes active at once, hence the conflicts you ran into. To fix the issue, I advise you to enable this flag on all of the master nodes.
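
Concretely, that would mean giving Master2 the same settings Master1 already has and restarting the services; a minimal sketch of the two drop-in files from the question (assuming the stock systemd units that read these environment files):

[root@master2 kubernetes]# cat scheduler
KUBE_SCHEDULER_ARGS="--leader-elect=true"

[root@master2 kubernetes]# cat controller-manager
KUBE_CONTROLLER_MANAGER_ARGS="--leader-elect=true"

[root@master2 kubernetes]# systemctl restart kube-scheduler kube-controller-manager

With the flag set on both masters, only the instance holding the lock acts at any time, and when Master1 goes down the Master2 processes acquire the lock and take over scheduling and replication.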

Also, per the k8s documentation https://kubernetes.io/docs/tasks/administer-cluster/highly-available-master/#best-practices-for-replicating-masters-for-ha-clusters:

Do not use a cluster with two master replicas. Consensus on a two replica cluster requires both replicas running when changing persistent state. As a result, both replicas are needed and a failure of any replica turns cluster into majority failure state. A two-replica cluster is thus inferior, in terms of HA, to a single replica cluster.

Regarding kubernetes - Implementing a Kubernetes Master HA solution on CentOS 7, the corresponding question can be found on Stack Overflow: https://stackoverflow.com/questions/44838137/
