python - Google 管理的 VM 模块卡在重启循环中

标签 python google-app-engine google-cloud-platform managed-vm google-managed-vm

我一直在尝试添加一个新的 App Engine Module使用托管 VM 而不是默认的 GAE 沙箱。目的是提供一个模块,我可以在其中运行更新版本的 SciPy 和 NumPy,我的面向用户的模块可以调用这些模块。 我已经在本地成功构建并运行了我的 Docker 镜像/容器,但是当我尝试部署到 Google 服务器上的自定义版本时遇到了很多问题。

以下是托管 VM 模块实例的串行控制台输出,由于似乎超出我控制的问题,该实例继续重启。

还有其他人遇到过这些吗?我在配置/部署过程中遗漏了什么吗?

FWIW:我已经使用 GAE 好几年了,甚至在我在 Google 期间也为它做出了贡献。我也有使用模块和 Docker 的经验。托管 VM 的文档和工具目前似乎还很不成熟,我已经筋疲力尽了。我需要帮助。

Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
[   24.288054] docker0: port 1(veth8d67b7c) entered forwarding state
Dec  9 00:52:56 gae-mvm-vmv7-tsia kernel: [   24.288054] docker0: port 1(veth8d67b7c) entered forwarding state
Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
gcm-Heartbeat:1449622390000
Dec 09 00:53:13 ERROR: Timed out while trying to pull appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7 from registry!
===== Unexpected error during VM startup =====
=== Dump of VM runtime system logs follows ===
WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_instances. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_proxy_image_name. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_extra_nginx_confs. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_redirect_appengine_googleapis_com. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_http_loadbalancer_enabled. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_loadbalancer. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_loadbalancer_ip. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_memcache_proxy. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_monitoring_image_name. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_use_cloud_monitoring. Will treat it as an empty string.
vm_runtime_init: Dec 09 00:52:40 Invoking all VM runtime components. /dev/fd/63
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'allow_ssh'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: Done start 'allow_ssh'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'unlocker'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: Done start 'unlocker'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'fluentd_logger'.
vm_runtime_init: Dec 09 00:52:41 vm_runtime_init: Done start 'fluentd_logger'.
vm_runtime_init: Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
vm_runtime_init: Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Dec 09 00:53:13 ERROR: Timed out while trying to pull appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7 from registry!
Dec 09 00:52:40 Invoking all VM runtime components. /dev/fd/63
Dec 09 00:52:40 vm_runtime_init: start 'allow_ssh'.
Dec 09 00:52:40 vm_runtime_init: Done start 'allow_ssh'.
Dec 09 00:52:40 vm_runtime_init: start 'unlocker'.
Dec 09 00:52:40 vm_runtime_init: Done start 'unlocker'.
Dec 09 00:52:40 vm_runtime_init: start 'fluentd_logger'.
8aa8a33b8daa451d5595b951aeecad772a23d65b6592ac07cae6265cc74b6312
Dec 09 00:52:41 vm_runtime_init: Done start 'fluentd_logger'.
Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Pulling metadata
16d49c9e1091: Pulling fs layer
16d49c9e1091: Download complete
54f405e77b26: Pulling metadata
54f405e77b26: Pulling fs layer
54f405e77b26: Download complete
36e2f6c710be: Pulling metadata
36e2f6c710be: Pulling fs layer
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
CONTAINER ID        IMAGE                                    COMMAND                  CREATED             STATUS              PORTS               NAMES
8aa8a33b8daa        gcr.io/google_appengine/fluentd-logger   "/opt/google-fluentd/"   33 seconds ago      Up 32 seconds                           insane_panini
Container: 8aa8a33b8daa
========= rebooting. ========================

INIT: 
INIT: Sending processes the TERM signal


INIT: Sending processes the KILL signal

Dec  9 00:53:14 gae-mvm-vmv7-tsia init: Switching to runlevel: 1
gcm-StatusUpdate:TIME=1449622394000;STATUS=COMMAND_FAILED;INVOCATION_ID=0
[[36minfo[39;49m] Using makefile-style concurrent boot in runlevel 1.
Dec  9 00:53:15 gae-mvm-vmv7-tsia rpc.statd[1758]: Caught signal 15, un-registering and exiting
Dec  9 00:53:15 gae-mvm-vmv7-tsia google: shutdown script found in metadata.
[....] Stopping NFS common utilities: idmapd statd[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
Dec  9 00:53:15 gae-mvm-vmv7-tsia shutdownscript: Running shutdown script /var/run/google.shutdown.script
Dec  9 00:53:15 gae-mvm-vmv7-tsia rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
[....] Stopping rpcbind daemon...[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
Stopping supervisor: supervisord.
udhcpd: Disabled. Edit /etc/default/udhcpd to enable it.
[....] Unmounting iscsi-backed filesystems: Unmounting all devices marked _netdev[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
Dec  9 00:53:16 gae-mvm-vmv7-tsia iscsid: iscsid shutting down.
[....] Unmounting iscsi-backed filesystems: Unmounting all devices marked _netdev[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
[....] Disconnecting iSCSI targets:iscsiadm: No matching sessions found
[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
[....] Stopping iSCSI initiator service:[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
Dec  9 00:53:16 gae-mvm-vmv7-tsia shutdownscript: Finished running shutdown script /var/run/google.shutdown.script
[....] Stopping Docker: docker[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
[....] Stopping The Kubernetes container manager: kubelet[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
[....] Stopping enhanced syslogd: rsyslogd[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0c.
[   54.500923] docker0: port 1(veth8d67b7c) entered disabled state
[   54.512521] docker0: port 1(veth8d67b7c) entered disabled state
[   54.522249] device veth8d67b7c left promiscuous mode
[   54.527554] docker0: port 1(veth8d67b7c) entered disabled state
Terminating on signal number 15
Traceback (most recent call last):
  File "/usr/share/google/google_daemon/manage_accounts.py", line 94, in <module>
    options.daemon, options.force, options.debug)
  File "/usr/share/google/google_daemon/manage_accounts.py", line 65, in Main
    manager_daemon.StartDaemon()
  File "/usr/share/google/google_daemon/accounts_manager_daemon.py", line 73, in StartDaemon
    self.accounts_manager.Main()
  File "/usr/share/google/google_daemon/accounts_manager.py", line 87, in Main
    writer.close()
IOError: [Errno 32] Broken pipe
[....] Asking all remaining processes to terminate...acpid: exiting

[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0cdone.
[....] All processes ended within 1 seconds....[?25l[?1c7[1G[[32m ok [39;49m8[?25h[?0cdone.
[[36minfo[3
INIT: Sending processes the TERM signal


INIT: Sending processes the KILL signal

sulogin: root account is locked, starting shell
root@gae-mvm-vmv7-tsia:~# 

编辑:来自 shutdown.log 的附加信息如下。 docker logs 命令未在我的任何代码或 Dockerfile 中运行——我假设 Google 在其端使用它的方式存在错误。

2015-12-08 17:08:22.194 Sending SIGUSR1 to fluentd to trigger a log flush.
2015-12-08 17:08:22.194 605e9f1ad747e63560fdc28a8c7f3c276d77255edd0a65381f7ad2f9f8eafd2a
2015-12-08 17:08:22.194 ---------------------------------------------------------------------
2015-12-08 17:08:22.194 ---------------App was unhealthy, grabbing debug logs----------------
2015-12-08 17:08:22.194 --------------------------App stdout/stderr--------------------------
2015-12-08 17:08:22.194 /usr/share/vm_runtime/vm_shutdown.sh: line 22: /var/run/app.cid: No such file or directory
2015-12-08 17:08:22.194 docker: "logs" requires 1 argument.
2015-12-08 17:08:22.194 See 'docker logs --help'.
2015-12-08 17:08:22.194 
2015-12-08 17:08:22.194 Usage:  docker logs [OPTIONS] CONTAINER
2015-12-08 17:08:22.194 
2015-12-08 17:08:22.194 Fetch the logs of a container
2015-12-08 17:08:22.194 ---------------------------------------------------------------------
2015-12-08 17:08:22.194 --------------------------Tail of app logs---------------------------
2015-12-08 17:08:22.194 tail: cannot open `/var/log/app_engine/app/app.0.log.json' for reading: No such file or directory
2015-12-08 17:08:22.194 ---------------------------------------------------------------------

最佳答案

编辑

这个答案大部分是正确的,但是根据 curl 推断图像不存在是有缺陷的。 python-compat应该适合你,但是 python也是一个有效的图像,从运行 docker pull gcr.io/google_appengine/python 可以看出:

$ docker pull gcr.io/google_appengine/python
Pulling repository gcr.io/google_appengine/python
ac7db0912786: Download complete 
643a001c5ee0: Download complete 
559718b5f880: Download complete 
8f8068a6a6b4: Download complete 
16d49c9e1091: Download complete 
54f405e77b26: Download complete 
36e2f6c710be: Download complete 
e8aed8091139: Download complete 
8f0415d8e4e9: Download complete 
15ed20635873: Download complete 
6d70c8850a43: Download complete 
93ae290c32a1: Download complete 
7f766358fa71: Download complete 
7f4a74c30dc4: Download complete 
b51802c69e61: Download complete 
Status: Downloaded newer image for gcr.io/google_appengine/python:latest

与贡献者讨论 jonparrott在 github repo 上查看 gcr.io/google_appengine/python docker 图像,各种 python 托管 VM/自定义运行时 docker 图像的关系是 clarified in a comment .

因此,我认为您在此处看到的问题(假设 VM 无法变得健康)可能与您从中获取的图像、您的 Dockerfile、您的应用程序代码或基础架构有关。前三个比最后一个更有可能,但这并非不可想象。似乎“Untar re-exec error: exit status 1: output: unexpected EOF”错误是串行控制台输出期间问题的第一个外在表现。

这可能值得放入 Google Cloud Platform Public Issue Tracker 的问题报告中以及您的 Dockerfile、您的应用程序代码(如果需要)、时间范围(如果它只是暂时发生)等信息。


原始答案

如果您检查容器注册表,您尝试获取的图像 FROM不存在:

curl -v -X HEAD https://gcr.io/google_appengine/python
* Hostname was NOT found in DNS cache
*   Trying 74.125.193.82...
* Connected to gcr.io (74.125.193.82) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server key exchange (12):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*        subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.googlecode.com
*        start date: 2015-12-02 14:40:36 GMT
*        expire date: 2016-03-01 00:00:00 GMT
*        subjectAltName: gcr.io matched
*        issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*        SSL certificate verify ok.
> HEAD /google_appengine/python HTTP/1.1
> User-Agent: curl/7.35.0
> Host: gcr.io
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Date: Thu, 10 Dec 2015 19:14:52 GMT
< Content-Type: text/html; charset=UTF-8
* Server Docker Registry is not blacklisted
< Server: Docker Registry
< Content-Length: 1584
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< Alternate-Protocol: 443:quic,p=0
< Alt-Svc: clear
< 

python 运行时实际存在的容器注册表镜像是:

gcr.io/google_appengine/python-compat

这应该是解决部署失败的关键。

关于python - Google 管理的 VM 模块卡在重启循环中,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/34169058/

相关文章:

javascript - Google Sheet API 最小示例

python - Sublime Text 3 - 清洁粘贴

Python:获取两个列表并通过合并它们来创建一个新列表

python - 使用 Testbed 对 Google App Engine 进行单元测试时的 InvalidModuleError()

google-app-engine - TransientFailureException --> 从拉入队列中租用任务时

java - App Engine 的灵活性有多成熟?

mongodb - 在谷歌云平台上部署mongodb?

python - Tornado 请求处理程序 : How to decode url encoded query before getting the arguments

python - python 中的组合和外观是同一件事吗?

python - GCP 实例为 Ajax 127.0.0.1 路由返回 ERR_CONNECTION_REFUSED