python - redis.exceptions.ConnectionError after Celery has been running for about a day

Tags: python django redis celery redis-py

Here is my full traceback:

Traceback (most recent call last):
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 256, in store_result
    request=request, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 490, in _store_result
    self.set(self.get_key_for_task(task_id), self.encode(meta))
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 160, in set
    return self.ensure(self._set, (key, value), **retry_policy)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 149, in ensure
    **retry_policy
  File "/home/server/backend/venv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 243, in retry_over_time
    return fun(*args, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 169, in _set
    pipe.execute()
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2593, in execute
    return execute(conn, stack, raise_on_error)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2447, in _execute_transaction
    connection.send_packed_command(all_cmds)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 532, in send_packed_command
    self.connect()
  File "/home/pserver/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 436, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 0 connecting to localhost:6379. Error.
[2016-09-21 10:47:18,814: WARNING/Worker-747] Data collector is not contactable. This can be because of a network issue or because of the data collector being restarted. In the event that contact cannot be made after a period of time then please report this problem to New Relic support for further investigation. The error raised was ConnectionError(ProtocolError('Connection aborted.', BlockingIOError(11, 'Resource temporarily unavailable')),).

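I did search for ConnectionError, but nothing I found matched my problem.

My platform is Ubuntu 14.04. This is part of my Redis config. (I can share the whole redis.conf file if you need it. By the way, all parameters in the LIMITS section are disabled.)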

# By default Redis listens for connections from all the network interfaces
# available on the server. It is possible to listen to just one or multiple
# interfaces using the "bind" configuration directive, followed by one or
# more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
bind 127.0.0.1

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /var/run/redis/redis.sock
# unixsocketperm 755

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0

# TCP keepalive.
#
# If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence
# of communication. This is useful for two reasons:
#
# 1) Detect dead peers.
# 2) Take the connection alive from the point of view of network
#    equipment in the middle.
#
# On Linux, the specified value (in seconds) is the period used to send ACKs.
# Note that to close the connection the double of the time is needed.
# On other kernels the period depends on the kernel configuration.
#
# A reasonable value for this option is 60 seconds.
tcp-keepalive 60
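
The tcp-keepalive setting above is applied by the Redis server to its client sockets. As a rough illustration of what SO_KEEPALIVE means at the socket level, here is a minimal Python sketch. The function name enable_keepalive and its idle_seconds parameter are hypothetical, and this is not a description of how Redis itself implements the option.

```python
import socket


def enable_keepalive(sock, idle_seconds=60):
    """Turn on TCP keepalive probes for an existing socket (illustrative helper)."""
    # Ask the kernel to send keepalive probes on an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained timing options are not available on every platform,
    # so only set them where the socket module exposes them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, idle_seconds)
    return sock
```

The value 60 mirrors the tcp-keepalive value in the config excerpt. Any other value would work the same way.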

Here is my mini Redis wrapper:

import redis

from django.conf import settings


REDIS_POOL = redis.ConnectionPool(host=settings.REDIS_HOST, port=settings.REDIS_PORT)


def get_redis_server():
    return redis.Redis(connection_pool=REDIS_POOL)

And this is how I use it:

from celery import shared_task

from redis_wrapper import get_redis_server

# The view and the task run in different, independent processes

def sample_view(request):
    rs = get_redis_server()
    # some get-set stuff with redis



@shared_task
def sample_celery_task():
    rs = get_redis_server()
    # some get-set stuff with redis

Package versions:

celery==3.1.18
django-celery==3.1.16
kombu==3.0.26
redis==2.10.3

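So the problem is this: the connection error appears some time after the Celery workers are started. After it first occurs, every task ends with this error until I restart all my Celery workers. (Interestingly, Celery Flower also fails during that problematic period.)

I suspect the way I use the Redis connection pool, or the Redis configuration, or, less likely, a network issue. Any ideas about the cause? What am I doing wrong?

(PS: I will add the redis-cli info output when I see this error again today.)

Update:

I temporarily worked around the problem by adding the --maxtasksperchild parameter to my worker start command, set to 200. Of course this is not the proper way to solve the problem; it only treats the symptom. It periodically recycles the worker processes (closing the old process and creating a new one once the old one has handled 200 tasks), which also refreshes my global Redis pool and its connections. So I think I should focus on how the global Redis connection pool is used, and I am still waiting for new ideas and comments.

Sorry for my bad English, and thanks in advance.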
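
One way to explore the global-pool hypothesis from the update is to build the pool lazily, once per process, so that a forked worker never reuses an object created in its parent. The sketch below is only an illustration of that idea: the PerProcess class is hypothetical, and nothing in this thread confirms that it resolves the error.

```python
import os


class PerProcess:
    """Lazily build one object per process and rebuild it after a fork."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the object
        self._pid = None         # PID of the process that built the current object
        self._obj = None

    def get(self):
        pid = os.getpid()
        if self._obj is None or self._pid != pid:
            # First use in this process (or we are in a forked child):
            # build a fresh object instead of reusing the parent's.
            self._obj = self._factory()
            self._pid = pid
        return self._obj


# Hypothetical usage with the wrapper from the question:
#   REDIS_POOL = PerProcess(lambda: redis.ConnectionPool(
#       host=settings.REDIS_HOST, port=settings.REDIS_PORT))
#
#   def get_redis_server():
#       return redis.Redis(connection_pool=REDIS_POOL.get())
```

The factory is passed in so the helper stays independent of redis-py. Compared with --maxtasksperchild, this refreshes the pool only when the process changes, not after a fixed number of tasks.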

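Best answer

Is RDB background saving enabled in Redis?
If so, check the size of the dump.rdb file in /var/lib/redis.
Sometimes the file grows until it fills the root directory, and the Redis instance can no longer save to it.

You can stop Redis from rejecting writes after a failed background save by issuing the following command in redis-cli:

    config set stop-writes-on-bgsave-error no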
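
The disk check suggested in this answer can be sketched in Python using only the standard library. The function name rdb_disk_report and the 10% free-space threshold are assumptions chosen for illustration. The default path is the one mentioned in the answer and may differ on other installations.

```python
import os
import shutil


def rdb_disk_report(rdb_path="/var/lib/redis/dump.rdb", min_free_ratio=0.10):
    """Report the dump file size and the free space on its filesystem."""
    directory = os.path.dirname(os.path.abspath(rdb_path))
    usage = shutil.disk_usage(directory)  # total, used, free (in bytes)
    rdb_size = os.path.getsize(rdb_path) if os.path.exists(rdb_path) else 0
    free_ratio = usage.free / usage.total if usage.total else 0.0
    return {
        "rdb_bytes": rdb_size,
        "free_bytes": usage.free,
        "free_ratio": free_ratio,
        # True when free space falls below the (assumed) threshold.
        "low_space": free_ratio < min_free_ratio,
    }


# Example call on the Redis host (the directory must exist):
#   print(rdb_disk_report())
```

If low_space is true, freeing disk space is likely a better first step than changing stop-writes-on-bgsave-error, since that setting only affects whether writes are accepted after a failed save.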

This question, "python - redis.exceptions.ConnectionError after Celery has been running for about a day", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/39638839/
