python - Silent timeout when accessing a gunicorn-backed service. How to debug?

I have this file:

cd /opt/webapps/deployed/landing-pages
# 1. Activate the virtualenv
source /home/ec2-user/.virtualenvs/landing-pages/bin/activate
# 2. Start gunicorn process as daemon
gunicorn trescloud_landing.wsgi:application --daemon --bind=127.0.0.1:8888 --pid=/opt/webapps/pid/landing-pages.pid --access-logfile=/opt/webapps/log/landing-pages.access.log --error-logfile=/opt/webapps/log/landing-pages.error.log
# 3. Deactivate the virtualenv
deactivate

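When I run this file, the trescloud_landing/wsgi.py file can be found (that is, I am in the project's base directory: files such as manage.py are in the current working directory).

I have write permission for the .access.log and .error.log files and for the .pid file. When I run it, two processes are created: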

ec2-user 17171 0.3 0.5 214916 11740 ? S 23:28 0:00 /home/ec2-user/.virtualenvs/landing-pages/bin/python2.7 /home/ec2-user/.virtualenvs/landing-pages/bin/gunicorn trescloud_landing.wsgi:application --daemon --bind=127.0.0.1:8888 --pid=/opt/webapps/pid/landing-pages.pid --access-logfile=/opt/webapps/log/landing-pages.access.log --error-logfile=/opt/webapps/log/landing-pages.error.log

ec2-user 17176 4.8 1.0 235144 20556 ? R 23:28 0:00 /home/ec2-user/.virtualenvs/landing-pages/bin/python2.7 /home/ec2-user/.virtualenvs/landing-pages/bin/gunicorn trescloud_landing.wsgi:application --daemon --bind=127.0.0.1:8888 --pid=/opt/webapps/pid/landing-pages.pid --access-logfile=/opt/webapps/log/landing-pages.access.log --error-logfile=/opt/webapps/log/landing-pages.error.log

When I query netstat (sudo netstat -anp | grep 8888), I get this:

tcp 0 0 127.0.0.1:8888 0.0.0.0:* LISTEN 17171/python2.7

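This seems to tell me that the server is up.

But when I run curl http://127.0.0.1:8888/, request processing appears to stall (it never returns, no error is raised, and no partial response is produced; it just hangs forever). Naturally, when I reach the URL through nginx (that is, via the external link), I get a 504 response (because nginx handles timeouts like any decent proxy).

The error log gives me nothing useful (only [CRITICAL] WORKER TIMEOUT if I go through nginx). What I see is this: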

2015-11-04 23:35:07 [17171] [CRITICAL] WORKER TIMEOUT (pid:17319)
2015-11-04 23:35:07 [17171] [INFO] 1 workers
2015-11-04 23:35:08 [17319] [INFO] Worker exiting (pid: 17319)
2015-11-04 23:35:08 [17171] [INFO] 1 workers
2015-11-04 23:35:08 [17326] [INFO] Booting worker with pid: 17326
2015-11-04 23:35:08 [17171] [INFO] 1 workers
2015-11-04 23:35:08 [17171] [INFO] 1 workers

Question:

What could be causing this? How can I debug this server? Where should I look?

pip freeze:

dateutils==0.6.6
Django==1.8.4
django-cors-headers==1.1.0
django-xmail-ritual==0.0.11 (*)
djangorestframework==3.2.3 (*)
future==0.15.0
gunicorn==19.1.0
psycopg2==2.6.1
python-cantrips==0.7.1 (*)
python-dateutil==2.4.2
pytz==2015.4
six==1.9.0
wheel==0.24.0

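(*) These packages work fine: I use them in other production environments without timeouts. This application used to work, and these requirements have never changed.

Thanks :D.

Best answer

The answer I found is as follows: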

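1. Run the app on the server with runserver. If it takes a long time to start, your initialization code is heavy (perhaps services, model meta-instantiation...; it depends on your application, so look at the code). This usually affects your local environment too; if it does not (and you use the same database engine), check the server for local files that are not under version control and analyze their contents.
2. If the runserver command starts fine but takes a long time to handle a single request (or at least the first one), check whether your views or middleware (custom middleware, if any) run one-time initialization code.
3. If runserver shows no problems, check whether the application behaves well through WSGI. You can simulate this by starting an interactive interpreter in the same virtualenv and current directory and running from myproject.wsgi import application. Perhaps you will find the time bottleneck there, as I did. Sometimes Django WSGI applications take a while to bootstrap, and they bootstrap on the first request received (in practice, every time gunicorn needs to create a new worker).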
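
The WSGI import check can be turned into a measurement rather than a judgment by eye. The following is a minimal sketch, not part of the original answer: time_import is a hypothetical helper name, and it assumes you run it from the project's base directory inside the same virtualenv, so that the WSGI module is importable.

```python
import importlib
import time

# perf_counter exists on Python 3; fall back to time.time on Python 2.7.
_timer = getattr(time, "perf_counter", time.time)


def time_import(module_name, attr="application"):
    """Import module_name and return (elapsed seconds, the named attribute).

    Hypothetical helper: for a Django project, module_name would be
    something like "trescloud_landing.wsgi".
    """
    start = _timer()
    module = importlib.import_module(module_name)
    elapsed = _timer() - start
    return elapsed, getattr(module, attr)
```

In an interactive interpreter, time_import("trescloud_landing.wsgi") returns the elapsed seconds together with the WSGI callable. If the first number is close to or above 30, the import alone is probably enough to hit gunicorn's default worker timeout. A module that has already been imported in the same interpreter session returns almost instantly, so measure in a fresh session.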

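In my case, I was in scenario 3. I found that adding --timeout=45 (or perhaps 60) to the gunicorn startup options gives the workers more time to handle a request. Otherwise a worker is created, loading takes more than 30 seconds (gunicorn's default timeout), the worker is killed and restarted, it tries the same request, takes more than 30 seconds again... and you end up in an infinite loop.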
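
If the command line gets unwieldy, the same options can live in a gunicorn configuration file, which is plain Python. This is a sketch under assumptions: the file name gunicorn.conf.py and the value 60 are illustrative choices, and the paths are copied from the startup script in the question.

```python
# gunicorn.conf.py (hypothetical file name; pass it with --config)
# Settings mirror the command-line flags used in the question.
bind = "127.0.0.1:8888"
daemon = True
pidfile = "/opt/webapps/pid/landing-pages.pid"
accesslog = "/opt/webapps/log/landing-pages.access.log"
errorlog = "/opt/webapps/log/landing-pages.error.log"

# Seconds a worker may stay silent before the master kills and restarts it.
# The default is 30; 60 is an assumed value, adjust it to the measured load time.
timeout = 60
```

It would then be started with gunicorn trescloud_landing.wsgi:application --config gunicorn.conf.py. A larger timeout only hides the slow startup, so it is still worth finding out what makes the application take so long to load.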

Original question on Stack Overflow: https://stackoverflow.com/questions/33534213/
