kubernetes - 使用 Helm 安装后 Prometheus 服务器处于挂起状态

标签 kubernetes prometheus kubernetes-helm persistent-volumes persistent-volume-claims

我是 k8s 新手,正在尝试为 k8s 设置 prometheus 监控。我用了 “helm install”来设置普罗米修斯。现在:

  1. 两个 Pod 仍处于挂起状态:
    • 普罗米修斯服务器
    • 普罗米修斯警报管理器
  2. 我手动为两者创建了持久卷 谁能帮助我如何将这些 PV 与 Helm Chart 创建的 PVC 进行映射?
[centos@k8smaster1 ~]$ kubectl get pod -n monitoring
NAME                                             READY   STATUS    RESTARTS   AGE
prometheus-alertmanager-7757d759b8-x6bd7         0/2     Pending   0          44m
prometheus-kube-state-metrics-7f85b5d86c-cq9kr   1/1     Running   0          44m
prometheus-node-exporter-5rz2k                   1/1     Running   0          44m
prometheus-pushgateway-5b8465d455-672d2          1/1     Running   0          44m
prometheus-server-7f8b5fc64b-w626v               0/2     Pending   0          44m
[centos@k8smaster1 ~]$ kubectl get pv
prometheus-alertmanager   3Gi        RWX            Retain           Available                                                                       22m
prometheus-server         12Gi       RWX            Retain           Available                                                                       30m
[centos@k8smaster1 ~]$ kubectl get pvc -n monitoring
NAME                      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-alertmanager   Pending                                                     20m
prometheus-server         Pending                                                     20m
[centos@k8smaster1 ~]$ kubectl describe pvc prometheus-alertmanager -n monitoring
Name:          prometheus-alertmanager
Namespace:     monitoring
StorageClass:
Status:        Pending
Volume:
Labels:        app=prometheus
               chart=prometheus-8.15.0
               component=alertmanager
               heritage=Tiller
               release=prometheus
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason         Age                  From                         Message
  ----       ------         ----                 ----                         -------
  Normal     FailedBinding  116s (x83 over 22m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
Mounted By:  prometheus-alertmanager-7757d759b8-x6bd7

我希望 Pod 进入运行状态

!!!更新!!!

NAME                      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
prometheus-alertmanager   Pending                                      local-storage   4m29s
prometheus-server         Pending                                      local-storage   4m29s
[centos@k8smaster1 prometheus_pv_storage]$ kubectl describe pvc prometheus-server -n monitoring
Name:          prometheus-server
Namespace:     monitoring
StorageClass:  local-storage
Status:        Pending
Volume:
Labels:        app=prometheus
               chart=prometheus-8.15.0
               component=server
               heritage=Tiller
               release=prometheus
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason                Age                   From                         Message
  ----       ------                ----                  ----                         -------
  Normal     WaitForFirstConsumer  11s (x22 over 4m59s)  persistentvolume-controller  waiting for first consumer to be created before binding
Mounted By:  prometheus-server-7f8b5fc64b-bqf42

!!更新2!!

[centos@k8smaster1 ~]$ kubectl get pods prometheus-server-7f8b5fc64b-bqf42 -n monitoring  -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-08-18T16:10:54Z"
  generateName: prometheus-server-7f8b5fc64b-
  labels:
    app: prometheus
    chart: prometheus-8.15.0
    component: server
    heritage: Tiller
    pod-template-hash: 7f8b5fc64b
    release: prometheus
  name: prometheus-server-7f8b5fc64b-bqf42
  namespace: monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: prometheus-server-7f8b5fc64b
    uid: c1979bcb-c1d2-11e9-819d-fa163ebb8452
  resourceVersion: "2461054"
  selfLink: /api/v1/namespaces/monitoring/pods/prometheus-server-7f8b5fc64b-bqf42
  uid: c19890d1-c1d2-11e9-819d-fa163ebb8452
spec:
  containers:
  - args:
    - --volume-dir=/etc/config
    - --webhook-url=http://127.0.0.1:9090/-/reload
    image: jimmidyson/configmap-reload:v0.2.2
    imagePullPolicy: IfNotPresent
    name: prometheus-server-configmap-reload
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/config
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: prometheus-server-token-7h2df
      readOnly: true
  - args:
    - --storage.tsdb.retention.time=15d
    - --config.file=/etc/config/prometheus.yml
    - --storage.tsdb.path=/data
    - --web.console.libraries=/etc/prometheus/console_libraries
    - --web.console.templates=/etc/prometheus/consoles
    - --web.enable-lifecycle
    image: prom/prometheus:v2.11.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /-/healthy
        port: 9090
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    name: prometheus-server
    ports:
    - containerPort: 9090
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /-/ready
        port: 9090
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/config
      name: config-volume
    - mountPath: /data
      name: storage-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: prometheus-server-token-7h2df
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 65534
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
  serviceAccount: prometheus-server
  serviceAccountName: prometheus-server
  terminationGracePeriodSeconds: 300
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: prometheus-server
    name: config-volume
  - name: storage-volume
    persistentVolumeClaim:
      claimName: prometheus-server
  - name: prometheus-server-token-7h2df
    secret:
      defaultMode: 420
      secretName: prometheus-server-token-7h2df
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-08-18T16:10:54Z"
    message: '0/2 nodes are available: 1 node(s) didn''t find available persistent
      volumes to bind, 1 node(s) had taints that the pod didn''t tolerate.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort

我还创建了卷并将其分配给本地存储

[centos@k8smaster1 prometheus_pv]$ kubectl get pv -n monitoring

prometheus-alertmanager   3Gi        RWX            Retain           Available                                               local-storage            2d19h
prometheus-server         12Gi       RWX            Retain           Available                                               local-storage            2d19h

最佳答案

如果您在EKS,您的节点需要有下一个权限

arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

Amazon EBS CSI 驱动程序插件

关于kubernetes - 使用 Helm 安装后 Prometheus 服务器处于挂起状态,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57523549/

相关文章:

kubernetes - 使用 nginx 反向代理公开 Kubernetes 中的服务

kubernetes - 更新 helm 图表中已弃用的 apiVersions

kubernetes - 将 nginx 公开为负载均衡器与入口 Controller 有什么区别?

kubernetes - 初始化容器获取 url 并将数据挂载到 pod

azure - 通过名称解析 Pod 的 IP

django - Prometheus-Django:连接被拒绝

elasticsearch - 根据响应值计算Prometheus中的可用性

普罗米修斯 json 指标

kubernetes - 如何告诉 Helm 不要运行作业

amazon-web-services - Kubernetes:外部 secret 运算符(operator)错误:InvalidClientTokenId:请求中包含的安全 token 无效