kubernetes - k8s Prometheus: pod has unbound PersistentVolumeClaims

Tags: kubernetes prometheus

I installed Kubernetes 1.10.3 on two VirtualBox VMs (CentOS 7.4) running on my Windows 10 machine. I used git clone to fetch the Prometheus YAML files.

git clone https://github.com/kubernetes/kubernetes

Then I went into kubernetes/cluster/addons/prometheus and created the resources from these files, in the following order:
alertmanager-configmap.yaml
alertmanager-pvc.yaml
alertmanager-deployment.yaml
alertmanager-service.yaml

kube-state-metrics-rbac.yaml
kube-state-metrics-deployment.yaml
kube-state-metrics-service.yaml

node-exporter-ds.yml
node-exporter-service.yaml

prometheus-configmap.yaml
prometheus-rbac.yaml
prometheus-statefulset.yaml
prometheus-service.yaml

However, both the Prometheus and the Alertmanager pods are stuck in Pending:
kube-system   alertmanager-6bd9584b85-j4h5m              0/2       Pending   0          9m
kube-system   calico-etcd-pnwtr                          1/1       Running   0          16m
kube-system   calico-kube-controllers-5d74847676-mjq4j   1/1       Running   0          16m
kube-system   calico-node-59xfk                          2/2       Running   1          16m
kube-system   calico-node-rqsh5                          2/2       Running   1          16m
kube-system   coredns-7997f8864c-ckhsq                   1/1       Running   0          16m
kube-system   coredns-7997f8864c-jjtvq                   1/1       Running   0          16m
kube-system   etcd-master16g                             1/1       Running   0          15m
kube-system   heapster-589b7db6c9-mpmks                  1/1       Running   0          16m
kube-system   kube-apiserver-master16g                   1/1       Running   0          15m
kube-system   kube-controller-manager-master16g          1/1       Running   0          15m
kube-system   kube-proxy-hqq49                           1/1       Running   0          16m
kube-system   kube-proxy-l8hmh                           1/1       Running   0          16m
kube-system   kube-scheduler-master16g                   1/1       Running   0          16m
kube-system   kube-state-metrics-8595f97c4-g6x5x         2/2       Running   0          8m
kube-system   kubernetes-dashboard-7d5dcdb6d9-944xl      1/1       Running   0          16m
kube-system   monitoring-grafana-7b767fb8dd-mg6dd        1/1       Running   0          16m
kube-system   monitoring-influxdb-54bd58b4c9-z9tgd       1/1       Running   0          16m
kube-system   node-exporter-f6pmw                        1/1       Running   0          8m
kube-system   node-exporter-zsd9b                        1/1       Running   0          8m
kube-system   prometheus-0                               0/2       Pending   0          7m

I inspected the prometheus pod with the following command:
[root@master16g prometheus]# kubectl describe pod prometheus-0 -n kube-system
Name:           prometheus-0
Namespace:      kube-system
Node:           <none>
Labels:         controller-revision-hash=prometheus-8fc558cb5
                k8s-app=prometheus
                statefulset.kubernetes.io/pod-name=prometheus-0
Annotations:    scheduler.alpha.kubernetes.io/critical-pod=
Status:         Pending
IP:
Controlled By:  StatefulSet/prometheus
Init Containers:
  init-chown-data:
    Image:      busybox:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      chown
      -R
      65534:65534
      /data
    Environment:  <none>
    Mounts:
      /data from prometheus-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
Containers:
  prometheus-server-configmap-reload:
    Image:      jimmidyson/configmap-reload:v0.1
    Port:       <none>
    Host Port:  <none>
    Args:
      --volume-dir=/etc/config
      --webhook-url=http://localhost:9090/-/reload
    Limits:
      cpu:     10m
      memory:  10Mi
    Requests:
      cpu:        10m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /etc/config from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
  prometheus-server:
    Image:      prom/prometheus:v2.2.1
    Port:       9090/TCP
    Host Port:  0/TCP
    Args:
      --config.file=/etc/config/prometheus.yml
      --storage.tsdb.path=/data
      --web.console.libraries=/etc/prometheus/console_libraries
      --web.console.templates=/etc/prometheus/consoles
      --web.enable-lifecycle
    Limits:
      cpu:     200m
      memory:  1000Mi
    Requests:
      cpu:        200m
      memory:     1000Mi
    Liveness:     http-get http://:9090/-/healthy delay=30s timeout=30s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9090/-/ready delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from prometheus-data (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-f6v42 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  prometheus-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  prometheus-data-prometheus-0
    ReadOnly:   false
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-config
    Optional:  false
  prometheus-token-f6v42:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-token-f6v42
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  42s (x22 over 5m)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)

The last line shows the warning: pod has unbound PersistentVolumeClaims (repeated 2 times)

Requesting the Prometheus log gives:
[root@master16g prometheus]# kubectl logs prometheus-0 -n kube-system
Error from server (BadRequest): a container name must be specified for pod prometheus-0, choose one of: [prometheus-server-configmap-reload prometheus-server] or one of the init containers: [init-chown-data]

I also described the alertmanager pod and requested its log:
[root@master16g prometheus]# kubectl describe pod alertmanager-6bd9584b85-j4h5m -n kube-system
Name:           alertmanager-6bd9584b85-j4h5m
Namespace:      kube-system
Node:           <none>
Labels:         k8s-app=alertmanager
                pod-template-hash=2685140641
                version=v0.14.0
Annotations:    scheduler.alpha.kubernetes.io/critical-pod=
Status:         Pending
IP:
Controlled By:  ReplicaSet/alertmanager-6bd9584b85
Containers:
  prometheus-alertmanager:
    Image:      prom/alertmanager:v0.14.0
    Port:       9093/TCP
    Host Port:  0/TCP
    Args:
      --config.file=/etc/config/alertmanager.yml
      --storage.path=/data
      --web.external-url=/
    Limits:
      cpu:     10m
      memory:  50Mi
    Requests:
      cpu:        10m
      memory:     50Mi
    Readiness:    http-get http://:9093/%23/status delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-snfrt (ro)
  prometheus-alertmanager-configmap-reload:
    Image:      jimmidyson/configmap-reload:v0.1
    Port:       <none>
    Host Port:  <none>
    Args:
      --volume-dir=/etc/config
      --webhook-url=http://localhost:9093/-/reload
    Limits:
      cpu:     10m
      memory:  10Mi
    Requests:
      cpu:        10m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /etc/config from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-snfrt (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      alertmanager-config
    Optional:  false
  storage-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  alertmanager
    ReadOnly:   false
  default-token-snfrt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-snfrt
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  3m (x26 over 9m)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)

And its log:
[root@master16g prometheus]# kubectl logs alertmanager-6bd9584b85-j4h5m -n kube-system
Error from server (BadRequest): a container name must be specified for pod alertmanager-6bd9584b85-j4h5m, choose one of: [prometheus-alertmanager prometheus-alertmanager-configmap-reload] 

It shows the same warning as the Prometheus pod:
pod has unbound PersistentVolumeClaims (repeated 2 times)

Then I listed the PVCs with the following command:
[root@master16g prometheus]# kubectl get pvc --all-namespaces
NAMESPACE     NAME                           STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
kube-system   alertmanager                   Pending                                       standard       20m
kube-system   prometheus-data-prometheus-0   Pending                                       standard       19m

My questions are: how do I bind the PersistentVolumeClaims? And why does the log command say that a container name must be specified?

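==================================================================

Second edit

Since the PVC file specifies a storage class, I need to define a StorageClass YAML. How do I do that if I want to use NFS or GlusterFS? That way I can avoid cloud vendors such as Google or AWS.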
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: alertmanager
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: "2Gi"

Best answer

This log entry:

Error from server (BadRequest): a container name must be specified for pod alertmanager-6bd9584b85-j4h5m, choose one of: [prometheus-alertmanager prometheus-alertmanager-configmap-reload] 

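means that the pod alertmanager-6bd9584b85-j4h5m consists of two containers:
  • prometheus-alertmanager
  • prometheus-alertmanager-configmap-reload

When you run kubectl logs against a pod that contains more than one container, you must specify the name of the container whose logs you want to see. Command template:
    kubectl -n <namespace> logs <pod_name> <container_name>

For example, to view the logs of the container prometheus-alertmanager, which is part of the pod alertmanager-6bd9584b85-j4h5m in the namespace kube-system, use this command:
    kubectl -n kube-system logs alertmanager-6bd9584b85-j4h5m prometheus-alertmanager

Note that while the pod is still Pending, its containers have not started, so there are no logs to show until the pod has been scheduled.

A PVC in Pending status may mean that there is no corresponding PV for it to bind to.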
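To make the last point concrete: both claims in the question request storageClassName standard, so on a cluster with no dynamic provisioner behind that class, one option is to create matching PersistentVolumes by hand. The following is a minimal sketch of a statically provisioned NFS volume for the alertmanager claim, not a definitive setup. The PV name, the NFS server address, and the export path are hypothetical placeholders; the class name, access mode, and 2Gi size are taken from the PVC shown in the question.

```yaml
# Sketch only: a hand-created NFS PersistentVolume for the "alertmanager" claim.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-nfs-pv          # hypothetical name
spec:
  storageClassName: standard         # must match the storageClassName of the PVC
  capacity:
    storage: 2Gi                     # at least the size requested by the PVC
  accessModes:
    - ReadWriteOnce                  # must include the access mode the PVC asks for
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.56.10            # assumption: address of your NFS server
    path: /exports/alertmanager      # assumption: a directory exported by that server
```

A second PV with the same class would be needed for prometheus-data-prometheus-0, with a capacity at least as large as that claim requests. Each node that may run these pods also needs an NFS client installed (on CentOS this is typically the nfs-utils package) so that the volume can be mounted.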

Regarding "kubernetes - k8s Prometheus: pod has unbound PersistentVolumeClaims", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51372593/
