
Celery: Remote workers frequently lose connection

Reposted. Author: 行者123. Last updated: 2023-12-04 15:44:08

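I have a Celery broker running on a cloud server (a Django application), and two workers on a local server in my office, connected from behind a NAT. The local workers frequently lose their connection and have to be restarted to re-establish it with the broker. Usually celeryd restart hangs the first time I try it, so I have to press Ctrl+C and retry once or twice before the workers come back and connect. The two most common errors in the worker logs are: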

[2014-08-03 00:08:45,398: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 278, in start
    blueprint.start(self)
  File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 796, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python2.7/dist-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/usr/local/lib/python2.7/dist-packages/kombu/async/hub.py", line 320, in create_loop
    cb(*cbargs)
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 159, in on_readable
    reader(loop)
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 142, in _read
    raise ConnectionError('Socket was disconnected')
ConnectionError: Socket was disconnected

[2014-03-07 20:15:41,963: CRITICAL/MainProcess] Couldn't ack 11, reason:RecoverableConnectionError(None, 'connection already closed', None, '')
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/kombu/message.py", line 93, in ack_log_error
    self.ack()
  File "/usr/local/lib/python2.7/dist-packages/kombu/message.py", line 88, in ack
    self.channel.basic_ack(self.delivery_tag)
  File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 1583, in basic_ack
    self._send_method((60, 80), args)
  File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 50, in _send_method
    raise RecoverableConnectionError('connection already closed')

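How can I debug this? Is the fact that the workers are behind a NAT a problem? Is there a good tool for monitoring whether the workers have lost their connection? With that, I could at least bring them back online by restarting them manually.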

Best answer

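Unfortunately, Celery+Kombu has a problem with late acknowledgment: the task handler tries to use a connection that has already been closed.
This is how I worked around it: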

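CELERY_CONFIG = {
    # Replace each pool worker process after it has executed one task.
    'CELERYD_MAX_TASKS_PER_CHILD': 1,
    # Reserve only one message per worker process at a time.
    'CELERYD_PREFETCH_MULTIPLIER': 1,
    # Acknowledge a message after the task has run, not before.
    'CELERY_ACKS_LATE': True,
}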

CELERYD_MAX_TASKS_PER_CHILD - set to 1, this guarantees that the worker process is restarted after it finishes a task.
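The setting names in this answer are the uppercase Celery 3.x names. Celery 4 introduced lowercase names for the same options, so a small translation step may be useful if the configuration is carried forward. The sketch below is only an illustration: the helper name to_new_style and the OLD_TO_NEW table are hypothetical and not part of Celery, and only the three settings from the answer are mapped.

```python
# Old-style (Celery 3.x) names used in the answer, mapped to the
# lowercase names introduced in Celery 4. Only these three are covered.
OLD_TO_NEW = {
    "CELERYD_MAX_TASKS_PER_CHILD": "worker_max_tasks_per_child",
    "CELERYD_PREFETCH_MULTIPLIER": "worker_prefetch_multiplier",
    "CELERY_ACKS_LATE": "task_acks_late",
}

# The configuration from the answer, unchanged.
CELERY_CONFIG = {
    "CELERYD_MAX_TASKS_PER_CHILD": 1,
    "CELERYD_PREFETCH_MULTIPLIER": 1,
    "CELERY_ACKS_LATE": True,
}


def to_new_style(config):
    """Return a copy of config with known old-style keys renamed.

    Keys that are not listed in OLD_TO_NEW are passed through unchanged.
    """
    return {OLD_TO_NEW.get(key, key): value for key, value in config.items()}


new_config = to_new_style(CELERY_CONFIG)
print(new_config)
```

Assuming an existing Celery application object (here hypothetically called app), the resulting dictionary could then be applied with app.conf.update(new_config).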

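As for tasks that have already lost their connection, there is nothing you can do about them now. Maybe it will be fixed in version 4. I just make sure that tasks are as idempotent as possible.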
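Idempotency matters here because, with late acknowledgment, a message whose ack never reaches the broker can be delivered again, so the same task may run twice. A minimal sketch of one way to make a task body safe to repeat is shown below. It is plain Python rather than a real Celery task, and every name in it (handle_payment, processed_ids, ledger) is hypothetical; in practice the set would need to be a durable store shared by all workers.

```python
# Stand-in for a durable store (database table, Redis set, ...) that
# records which task ids have already been handled.
processed_ids = set()

# Stand-in for the side effect the task performs.
ledger = []


def handle_payment(task_id, amount):
    """Apply the side effect at most once per task_id.

    Returns True if the work was done, False if this call was a
    duplicate delivery of a task that had already been handled.
    """
    if task_id in processed_ids:
        return False
    ledger.append(amount)
    processed_ids.add(task_id)
    return True


first = handle_payment("task-11", 50)
# Simulated redelivery of the same message after a lost ack.
second = handle_payment("task-11", 50)
print(first, second, ledger)
```

The second call is recognized as a duplicate and leaves the ledger untouched, which is the property a redelivered task needs.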

Regarding "Celery: Remote workers frequently lose connection", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25120696/
