python - Finding the cause of a BrokenProcessPool in Python's concurrent.futures

Reposted by: 太空狗. Updated: 2023-10-29 20:23:48

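In a nutshell

I get a BrokenProcessPool exception when parallelizing my code with concurrent.futures. No further error is displayed. I want to find the cause of the error and am asking for ideas on how to do that.

Full problem

I am using concurrent.futures to parallelize some code.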

from concurrent.futures import ProcessPoolExecutor

with ProcessPoolExecutor() as pool:
    mapObj = pool.map(myMethod, args)

I end up with (and only with) the following exception:

concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

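Unfortunately, the program is complex and the error only appears after it has run for 30 minutes. Therefore, I cannot provide a nice minimal example.

To find the cause of the issue, I wrapped the method that I run in parallel in a try-except block: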

def myMethod(*args):
    try:
        ...
    except Exception as e:
        print(e)

The problem remained the same and the except block was never entered. I conclude that the exception does not come from my code.
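An except block that is never entered is what one would expect if the worker is killed by the operating system rather than failing with a Python exception. The following is a minimal sketch of that situation, assuming Linux and the "fork" start method. The names guarded_crash and run_demo are hypothetical, and the crash is simulated by sending SIGSEGV to the worker itself, which may differ from what happens in the real program:

```python
import multiprocessing
import os
import signal


def guarded_crash(conn):
    """Hypothetical worker: dies from SIGSEGV inside a try/except."""
    try:
        # Simulated native crash (assumption: stands in for a C-level fault).
        os.kill(os.getpid(), signal.SIGSEGV)
    except BaseException as exc:
        # Not reached: the process is gone before Python can raise anything.
        conn.send(repr(exc))


def run_demo():
    """Run the worker and return (exitcode, message from the except block)."""
    # Assumption: "fork" is available (Linux); the target is not pickled.
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=guarded_crash, args=(child_conn,))
    proc.start()
    child_conn.close()  # the parent keeps only the reading end
    proc.join()
    try:
        message = parent_conn.recv()
    except EOFError:
        # The worker closed the pipe by dying, without sending anything.
        message = None
    return proc.exitcode, message


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the worker never reports an exception, and the parent only learns about the failure through the negative exit code.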

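My next step was to write a custom ProcessPoolExecutor class that is a subclass of the original ProcessPoolExecutor and lets me replace some methods with customized ones. I copied and pasted the original code of the method _process_worker and added some print statements.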

def _process_worker(call_queue, result_queue):
    """Evaluates calls from call_queue and places the results in result_queue.
        ...
    """
    while True:
        call_item = call_queue.get(block=True)
        if call_item is None:
            # Wake up queue management thread
            result_queue.put(os.getpid())
            return
        try:
            r = call_item.fn(*call_item.args, **call_item.kwargs)
        except BaseException as e:
            print("??? Exception ???")                 # newly added
            print(e)                                   # newly added
            exc = _ExceptionWithTraceback(e, e.__traceback__)
            result_queue.put(_ResultItem(call_item.work_id, exception=exc))
        else:
            result_queue.put(_ResultItem(call_item.work_id,
                                         result=r))

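Again, the except block is never entered. This was to be expected, because I had already made sure that my code does not raise an exception (and if everything worked well, the exception should be passed to the main process).

Now I am out of ideas on how to find the error. The exception is raised here: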

def submit(self, fn, *args, **kwargs):
    with self._shutdown_lock:
        if self._broken:
            raise BrokenProcessPool('A child process terminated '
                'abruptly, the process pool is not usable anymore')
        if self._shutdown_thread:
            raise RuntimeError('cannot schedule new futures after shutdown')

        f = _base.Future()
        w = _WorkItem(f, fn, args, kwargs)

        self._pending_work_items[self._queue_count] = w
        self._work_ids.put(self._queue_count)
        self._queue_count += 1
        # Wake up queue management thread
        self._result_queue.put(None)

        self._start_queue_management_thread()
        return f

The process pool is marked as broken here:

def _queue_management_worker(executor_reference,
                             processes,
                             pending_work_items,
                             work_ids_queue,
                             call_queue,
                             result_queue):
    """Manages the communication between this process and the worker processes.
        ...
    """
    executor = None

    def shutting_down():
        return _shutdown or executor is None or executor._shutdown_thread

    def shutdown_worker():
        ...

    reader = result_queue._reader

    while True:
        _add_call_item_to_queue(pending_work_items,
                                work_ids_queue,
                                call_queue)

        sentinels = [p.sentinel for p in processes.values()]
        assert sentinels
        ready = wait([reader] + sentinels)
        if reader in ready:
            result_item = reader.recv()
        else:                               #THIS BLOCK IS ENTERED WHEN THE ERROR OCCURS
            # Mark the process pool broken so that submits fail right now.
            executor = executor_reference()
            if executor is not None:
                executor._broken = True
                executor._shutdown_thread = True
                executor = None
            # All futures in flight must be marked failed
            for work_id, work_item in pending_work_items.items():
                work_item.future.set_exception(
                    BrokenProcessPool(
                        "A process in the process pool was "
                        "terminated abruptly while the future was "
                        "running or pending."
                    ))
                # Delete references to object. See issue16284
                del work_item
            pending_work_items.clear()
            # Terminate remaining workers forcibly: the queues or their
            # locks may be in a dirty state and block forever.
            for p in processes.values():
                p.terminate()
            shutdown_worker()
            return
        ...

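It is (or seems to be) a fact that a process terminates, but I have no clue why. Are my thoughts correct so far? What are possible causes that make a process terminate without a message? (Is this even possible?) Where could I apply further diagnostics? Which questions should I ask myself in order to come closer to a solution?

I am using Python 3.5 on 64-bit Linux.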

Best answer

I think I got as far as I could:

I changed the _queue_management_worker method in my modified ProcessPoolExecutor module so that the exit code of the failed process is printed:

def _queue_management_worker(executor_reference,
                             processes,
                             pending_work_items,
                             work_ids_queue,
                             call_queue,
                             result_queue):
    """Manages the communication between this process and the worker processes.
        ...
    """
    executor = None

    def shutting_down():
        return _shutdown or executor is None or executor._shutdown_thread

    def shutdown_worker():
        ...

    reader = result_queue._reader

    while True:
        _add_call_item_to_queue(pending_work_items,
                                work_ids_queue,
                                call_queue)

        sentinels = [p.sentinel for p in processes.values()]
        assert sentinels
        ready = wait([reader] + sentinels)
        if reader in ready:
            result_item = reader.recv()
        else:                               

            # BLOCK INSERTED FOR DIAGNOSIS ONLY ---------
            vals = list(processes.values())
            for s in ready:
                j = sentinels.index(s)
                print("is_alive()", vals[j].is_alive())
                print("exitcode", vals[j].exitcode)
            # -------------------------------------------


            # Mark the process pool broken so that submits fail right now.
            executor = executor_reference()
            if executor is not None:
                executor._broken = True
                executor._shutdown_thread = True
                executor = None
            # All futures in flight must be marked failed
            for work_id, work_item in pending_work_items.items():
                work_item.future.set_exception(
                    BrokenProcessPool(
                        "A process in the process pool was "
                        "terminated abruptly while the future was "
                        "running or pending."
                    ))
                # Delete references to object. See issue16284
                del work_item
            pending_work_items.clear()
            # Terminate remaining workers forcibly: the queues or their
            # locks may be in a dirty state and block forever.
            for p in processes.values():
                p.terminate()
            shutdown_worker()
            return
        ...
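The lookup in the inserted block (map each ready sentinel back to its process, then read exitcode) can also be tried outside the executor using only public multiprocessing APIs. This is a rough sketch under the assumption of Linux with the "fork" start method. The names crash, exit_codes_of_dead_workers and run_demo are hypothetical, and the crash is simulated with SIGSEGV:

```python
import multiprocessing
import os
import signal
import time
from multiprocessing.connection import wait


def crash():
    """Hypothetical worker that dies from a signal, like a native crash."""
    os.kill(os.getpid(), signal.SIGSEGV)


def exit_codes_of_dead_workers(procs, timeout=None):
    """Wait on the process sentinels and return {pid: exitcode} for ended ones."""
    by_sentinel = {p.sentinel: p for p in procs}
    ready = wait(list(by_sentinel), timeout=timeout)
    dead = {}
    for sentinel in ready:
        proc = by_sentinel[sentinel]
        proc.join()  # the sentinel is ready, so this returns immediately
        dead[proc.pid] = proc.exitcode
    return dead


def run_demo():
    """Start one healthy and one crashing worker; report the crashed one."""
    # Assumption: "fork" is available (Linux); targets are not pickled.
    ctx = multiprocessing.get_context("fork")
    sleeper = ctx.Process(target=time.sleep, args=(60,))
    crasher = ctx.Process(target=crash)
    sleeper.start()
    crasher.start()
    try:
        return crasher.pid, exit_codes_of_dead_workers([sleeper, crasher], timeout=30)
    finally:
        sleeper.terminate()
        sleeper.join()


if __name__ == "__main__":
    print(run_demo())
```

A dictionary keyed by sentinel is used here instead of sentinels.index(s); both approaches identify the same process.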

Afterwards I looked up the meaning of the exit code:

from multiprocessing.process import _exitcode_to_name
print(_exitcode_to_name[my_exit_code])

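Here my_exit_code is the exit code printed by the block I inserted into _queue_management_worker. In my case the code was -11, which means that I ran into a segmentation fault. Finding the reason for this will be a huge task, but it goes beyond the scope of this question.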
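Since _exitcode_to_name is a private name in multiprocessing.process, the same translation can probably be done with the public signal module instead. A minimal sketch, where describe_exitcode is a hypothetical helper and not part of the standard library:

```python
import signal


def describe_exitcode(exitcode):
    """Return a readable description of a multiprocessing.Process exitcode."""
    if exitcode is None:
        return "still running"
    if exitcode < 0:
        # A negative exitcode -N means the process was ended by signal N.
        try:
            return "killed by " + signal.Signals(-exitcode).name
        except ValueError:
            return "killed by unknown signal {}".format(-exitcode)
    return "exited with status {}".format(exitcode)


if __name__ == "__main__":
    print(describe_exitcode(-11))
```

On Linux, signal number 11 is SIGSEGV, which matches the segmentation fault described above.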

Regarding "python - Finding the cause of a BrokenProcessPool in Python's concurrent.futures", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41454049/
