python - How to run long tasks on Google App Engine, which uses gunicorn?

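GAE flex uses gunicorn as the entrypoint by default, which is fine, but I have a function that takes a long time to run (it scrapes websites and stores the data in a database). By default gunicorn times out after 30 seconds, then a new worker starts the task over again, and so on.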

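I could set the gunicorn timeout to 20 minutes or so, but that doesn't seem very elegant. Is there a way to run these backend functions "outside" gunicorn, or perhaps a gunicorn configuration I haven't considered? There is no client waiting on the response, so a long completion time is not a problem.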

My app.yaml file currently looks like this:

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

runtime_config:
  python_version: 2

# This sample incurs costs to run on the App Engine flexible environment. 
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10

Best answer

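You can use an asynchronous worker class, and then you don't need to set the timeout to 20 minutes. The default worker class is sync. See the gunicorn documentation on worker classes for details.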

Use the eventlet async worker (gevent is not recommended if you use the Google client libraries):

pip install eventlet

Then, in your gunicorn invocation, set worker-class = 'eventlet' and set the number of workers to [number of cores] x 2 + 1 (this is just the recommendation in the Google docs). For example:

CMD exec gunicorn --worker-class eventlet --workers 3 -b :$PORT main:app

Gunicorn Worker Configuration
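The same settings can also be kept in a gunicorn configuration file instead of on the command line. The sketch below is only an illustration: the file name gunicorn.conf.py and the fallback port 8080 are assumptions, not part of the original answer, while worker_class, workers and bind are standard gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name, passed to gunicorn with -c)
import multiprocessing
import os

# Async worker class suggested in the answer; needs `pip install eventlet`.
worker_class = "eventlet"

# [number of cores] x 2 + 1, the rule of thumb quoted in the answer.
workers = multiprocessing.cpu_count() * 2 + 1

# App Engine flex provides PORT; 8080 is an assumed fallback for local runs.
bind = ":" + os.environ.get("PORT", "8080")
```

With a file like this, the entrypoint would become something like `gunicorn -c gunicorn.conf.py main:app`. Note that the computed worker count follows the cores visible to the process, which may not match the `cpu` value in app.yaml.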

Alternatively, use an implementation built on Pub/Sub and a separate worker.
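To show the shape of that hand-off, here is a minimal in-process sketch. It is not the Pub/Sub API: the standard library's queue.Queue stands in for the topic and subscription, and long_task, enqueue and worker are hypothetical names. The point is only that the request side publishes a message and returns at once, while a worker consumes messages on its own schedule.

```python
import queue
import threading

# Stand-in for a Pub/Sub topic/subscription (assumption: in-process only).
task_queue = queue.Queue()
results = []


def long_task(url):
    # Hypothetical placeholder for the scraping-and-storing function.
    return "processed " + url


def worker():
    # Pull messages until a None sentinel arrives, like a subscriber loop.
    while True:
        url = task_queue.get()
        if url is None:
            task_queue.task_done()
            break
        results.append(long_task(url))
        task_queue.task_done()


def enqueue(url):
    # What a request handler would do: publish and return immediately.
    task_queue.put(url)


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

enqueue("https://example.com/a")
enqueue("https://example.com/b")
task_queue.put(None)  # sentinel: tell the worker to stop
worker_thread.join()

print(results)
```

In a real deployment the worker would presumably run as its own service and the queue would be replaced by Pub/Sub, so that the long task is not tied to the lifetime of a gunicorn worker process; a thread inside the web process, as here, stops when that process is killed.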

This question, "python - How to run long tasks on Google App Engine, which uses gunicorn?", comes from Stack Overflow: https://stackoverflow.com/questions/48160535/
