amazon-web-services - `the node was low on resource imagefs`: pods are periodically evicted

Tags: amazon-web-services docker jenkins kubernetes jenkins-x

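I am using Jenkins-X for a relatively large project made up of about 30 modules, 15 of which are services (so they contain a Dockerfile and a corresponding Helm chart for deployment).

During some of the larger builds I intermittently (roughly every other build) see the build pod being evicted. Investigating with kubectl describe pod <podname>, I noticed that the pod was evicted for the following reason: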

the node was low on resource imagefs

Full output:

Name:         maven-96wmn
Namespace:    jx
Node:         ip-192-168-66-176.eu-west-1.compute.internal/
Start Time:   Tue, 06 Nov 2018 10:22:54 +0000
Labels:       jenkins=slave
              jenkins/jenkins-maven=true
Annotations:  <none>
Status:       Failed
Reason:       Evicted
Message:      The node was low on resource: imagefs.
IP:           
Containers:
  maven:
    Image:      jenkinsxio/builder-maven:0.0.516
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
    Args:
      cat
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     400m
      memory:  512Mi
    Environment:
      JENKINS_SECRET:       131c407141521c0842f62a69004df926be6cb531f9318edf0885aeb96b0662b4
      JENKINS_TUNNEL:       jenkins-agent:50000
      DOCKER_CONFIG:        /home/jenkins/.docker/
      GIT_AUTHOR_EMAIL:     jenkins-x@googlegroups.com
      GIT_COMMITTER_EMAIL:  jenkins-x@googlegroups.com
      GIT_COMMITTER_NAME:   jenkins-x-bot
      _JAVA_OPTIONS:        -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Dsun.zip.disableMemoryMapping=true -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Xms10m -Xmx192m
      GIT_AUTHOR_NAME:      jenkins-x-bot
      JENKINS_NAME:         maven-96wmn
      XDG_CONFIG_HOME:      /home/jenkins
      JENKINS_URL:          http://jenkins:8080
      HOME:                 /home/jenkins
    Mounts:
      /home/jenkins from workspace-volume (rw)
      /home/jenkins/.docker from volume-2 (rw)
      /home/jenkins/.gnupg from volume-3 (rw)
      /root/.m2 from volume-1 (rw)
      /var/run/docker.sock from volume-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from jenkins-token-smvvp (ro)
  jnlp:
    Image:      jenkinsci/jnlp-slave:3.14-1
    Port:       <none>
    Host Port:  <none>
    Args:
      131c407141521c0842f62a69004df926be6cb531f9318edf0885aeb96b0662b4
      maven-96wmn
    Requests:
      cpu:     100m
      memory:  128Mi
    Environment:
      JENKINS_SECRET:       131c407141521c0842f62a69004df926be6cb531f9318edf0885aeb96b0662b4
      JENKINS_TUNNEL:       jenkins-agent:50000
      DOCKER_CONFIG:        /home/jenkins/.docker/
      GIT_AUTHOR_EMAIL:     jenkins-x@googlegroups.com
      GIT_COMMITTER_EMAIL:  jenkins-x@googlegroups.com
      GIT_COMMITTER_NAME:   jenkins-x-bot
      _JAVA_OPTIONS:        -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Dsun.zip.disableMemoryMapping=true -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Xms10m -Xmx192m
      GIT_AUTHOR_NAME:      jenkins-x-bot
      JENKINS_NAME:         maven-96wmn
      XDG_CONFIG_HOME:      /home/jenkins
      JENKINS_URL:          http://jenkins:8080
      HOME:                 /home/jenkins
    Mounts:
      /home/jenkins from workspace-volume (rw)
      /home/jenkins/.docker from volume-2 (rw)
      /home/jenkins/.gnupg from volume-3 (rw)
      /root/.m2 from volume-1 (rw)
      /var/run/docker.sock from volume-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from jenkins-token-smvvp (ro)
Volumes:
  volume-0:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/docker.sock
    HostPathType:  
  volume-2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  jenkins-docker-cfg
    Optional:    false
  volume-1:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  jenkins-maven-settings
    Optional:    false
  workspace-volume:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  volume-3:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  jenkins-release-gpg
    Optional:    false
  jenkins-token-smvvp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  jenkins-token-smvvp
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age   From                                                   Message
  ----     ------                 ----  ----                                                   -------
  Normal   Created                7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Created container
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "workspace-volume"
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "volume-0"
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "volume-1"
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "volume-2"
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "volume-3"
  Normal   SuccessfulMountVolume  7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "jenkins-token-smvvp"
  Normal   Pulled                 7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Container image "jenkinsxio/builder-maven:0.0.516" already present on machine
  Normal   Scheduled              7m    default-scheduler                                      Successfully assigned maven-96wmn to ip-192-168-66-176.eu-west-1.compute.internal
  Normal   Started                7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Started container
  Normal   Pulled                 7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Container image "jenkinsci/jnlp-slave:3.14-1" already present on machine
  Normal   Created                7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Created container
  Normal   Started                7m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Started container
  Warning  Evicted                5m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  The node was low on resource: imagefs.
  Normal   Killing                5m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Killing container with id docker://jnlp:Need to kill Pod
  Normal   Killing                5m    kubelet, ip-192-168-66-176.eu-west-1.compute.internal  Killing container with id docker://maven:Need to kill Pod

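How can I resolve this? I don't fully understand what imagefs is, how to configure or increase it, or how to keep it from filling up.

P.S. Apologies for the passive tone of this post; I had to use the active voice to make the wording verbose enough that I was not just posting a code snippet.

Accepted answer

Solved. The underlying storage was only 20GB; I changed it to 50GB in EBS and restarted the nodes (which increased nodefs), and that resolved the problem because imagefs is no longer saturated.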
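
To make the effect of the resize concrete: imagefs is the filesystem the container runtime uses for images and container writable layers, and the kubelet starts evicting pods once the space available on it falls below an eviction threshold. The build pod above mounts /var/run/docker.sock, so image builds land on the node's own Docker storage. The sketch below is only an illustration, assuming the cluster uses the kubelet's documented default hard threshold of `imagefs.available<15%` (the post does not say whether it was overridden) and that imagefs sits on the resized volume. The function names are hypothetical, not part of any Kubernetes API.

```python
import shutil

GIB = 2**30

# Assumption: kubelet default hard-eviction threshold, imagefs.available<15%.
# A cluster may override this through its kubelet configuration.
DEFAULT_THRESHOLD_PCT = 15


def usable_before_eviction(total_bytes: int, threshold_pct: int = DEFAULT_THRESHOLD_PCT) -> int:
    """Bytes that can be consumed before available space hits the threshold."""
    return total_bytes * (100 - threshold_pct) // 100


def under_disk_pressure(total_bytes: int, free_bytes: int,
                        threshold_pct: int = DEFAULT_THRESHOLD_PCT) -> bool:
    """True when available space is strictly below threshold_pct of the total."""
    return free_bytes * 100 < total_bytes * threshold_pct


if __name__ == "__main__":
    # Compare the two volume sizes mentioned in the answer.
    for size_gib in (20, 50):
        usable = usable_before_eviction(size_gib * GIB)
        print(f"{size_gib} GiB volume: {usable / GIB:.1f} GiB usable before eviction")

    # Hypothetical check of a local filesystem; "/" is a placeholder path,
    # on a real node you would point this at the Docker data directory.
    usage = shutil.disk_usage("/")
    print("under pressure:", under_disk_pressure(usage.total, usage.free))
```

Under these assumptions a 20GB volume leaves about 17GB for the OS, images, and build layers combined before evictions begin, while a 50GB volume leaves about 42.5GB, which is consistent with the larger builds no longer being evicted after the resize.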

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/53172046/
