
python - Dask worker gets into a stuck state and won't come back

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:54:56

I have a Dask worker that is "stuck". When I look at the worker thread's call stack, it looks like this:

Worker: tcp://127.0.0.1:59180
Key: _forecast-ee805cbdce4f41ca491bc4dc194c9793
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/site-packages/distributed/threadpoolexecutor.py", line 57, in _worker
    task.run()
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/site-packages/distributed/_concurrent_futures_thread.py", line 64, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/site-packages/distributed/worker.py", line 2811, in apply_function
    result = function(*args, **kwargs)
  File "/Users/wcox/Documents/ovforecast/src/python/ovforecast/forecast.py", line 173, in _forecast
    end_time, persist=False, testset=testset)
  File "/Users/wcox/Documents/ovforecast/src/python/ovforecast/forecast.py", line 236, in _build_features
    log.debug('for:')
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 1162, in debug
    self._log(DEBUG, msg, args, **kwargs)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 1293, in _log
    self.handle(record)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 1303, in handle
    self.callHandlers(record)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 1343, in callHandlers
    hdlr.handle(record)
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 764, in handle
    self.acquire()
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/logging/__init__.py", line 715, in acquire
    self.lock.acquire()
  File "/Users/wcox/miniconda3/envs/ovf/lib/python2.7/threading.py", line 174, in acquire
    rc = self.__block.acquire(blocking)

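The thread is blocked acquiring the logging handler's lock (self.lock.acquire()), so it seems to have somehow gotten into a deadlock related to logging. This is an intermittent problem; it does not happen every time I run the job. My other workers happily process data, but because the last worker is stuck, my job never finishes.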
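The bottom frames of the stack show where the thread waits: `Handler.handle()` acquires the handler's lock before emitting a record, so `log.debug` blocks for as long as that lock is held elsewhere. The following is a minimal sketch of that mechanism only, assuming Python 3; the logger name `handler_lock_demo` and the helper function are hypothetical. It does not reproduce the Dask deadlock itself, and it releases the lock after a short timeout so the demo terminates.

```python
import io
import logging
import threading


def logging_blocks_while_handler_lock_is_held(timeout=0.5):
    """Return (blocked, output) for a log call made while the handler lock is held."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    log = logging.getLogger("handler_lock_demo")  # hypothetical logger name
    log.setLevel(logging.DEBUG)
    log.propagate = False  # keep the demo isolated from the root logger
    log.addHandler(handler)

    finished = threading.Event()

    def worker():
        # Same call as in the stack trace; it must acquire handler.lock first.
        log.debug("for:")
        finished.set()

    handler.acquire()  # the main thread now owns the handler's lock
    try:
        thread = threading.Thread(target=worker)
        thread.daemon = True
        thread.start()
        # If the worker has not finished within the timeout, it is blocked on the lock.
        blocked = not finished.wait(timeout)
    finally:
        handler.release()  # let the worker proceed so the demo can end

    thread.join()
    log.removeHandler(handler)
    return blocked, stream.getvalue()


if __name__ == "__main__":
    print(logging_blocks_while_handler_lock_is_held())
```

Here the main thread holds the lock on purpose. In the stuck worker, the trace alone does not show what holds the lock or why it is never released.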

My application looks like this:

from dask.distributed import Client, LocalCluster, as_completed

# `config`, `groups`, `log` and `_forecast` are defined elsewhere in the application.
cluster = LocalCluster(processes=config.use_dask_local_processes,
                       n_workers=6,
                       threads_per_worker=1,
                       )
client = Client(cluster)
cluster.scale(config.dask_local_worker_instances)

fcast_futures = []

# For each group do work
for group in groups:
    fcast_futures.append(client.submit(_forecast, group))

# Wait till the work is done
for done_work in as_completed(fcast_futures, with_results=False):
    try:
        result = done_work.result()
    except Exception as error:
        log.exception(error)
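Because the loop above waits on every future, a single stuck task keeps it waiting indefinitely. The following is a minimal sketch of bounding that wait so the driver can at least report which tasks never finished. It uses the standard library's `concurrent.futures` as a stand-in for Dask, which is an assumption made to keep the example self-contained; it is not the Dask API, the helper name `collect_with_deadline` is hypothetical, and it only detects a stuck task without fixing the underlying deadlock.

```python
import concurrent.futures as cf


def collect_with_deadline(tasks, deadline):
    """Run callables in threads; return (indices that finished, indices still running)."""
    pool = cf.ThreadPoolExecutor(max_workers=len(tasks))
    # Map each future back to the position of its task.
    futures = {pool.submit(task): index for index, task in enumerate(tasks)}

    done, stuck = [], []
    try:
        # as_completed raises TimeoutError if some futures are still pending
        # once `deadline` seconds have passed.
        for future in cf.as_completed(futures, timeout=deadline):
            done.append(futures[future])
    except cf.TimeoutError:
        stuck = sorted(i for f, i in futures.items() if not f.done())

    # Do not wait for stuck threads here; the caller decides what to do with them.
    pool.shutdown(wait=False)
    return sorted(done), stuck
```

A caller could log the stuck indices and resubmit or abort, instead of waiting forever on the last future.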

My logger is set to DEBUG and has a single StreamHandler. The setup looks like this:

logformat = logformat or default_logformat
log = logging.getLogger()
log.setLevel(logging.DEBUG)
formatter = logging.Formatter(logformat)

# Logging to the console
handler = logging.StreamHandler()
handler.setFormatter(formatter)
log.addHandler(handler)

I am using dask==1.2.0 on Python 2.7.

Best answer

This is a known issue with the logging module in Python 2.

Unfortunately, I don't know of a good workaround.

Regarding "python - Dask worker gets into a stuck state and won't come back", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56081268/
