
Python: multiprocessing workers, tracking completed tasks (missing completions)

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:58:33

The default multiprocessing.Pool code includes a counter that tracks how many tasks a worker has completed:

    completed += 1
logging.debug('worker exiting after %d tasks' % completed)

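But when the input to pool.map grows from range(12) to range(20), the counter appears to be wrong (and this looks unrelated to worker creation). I am not sure what causes it.

For example: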

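import multiprocessing as mp

def ret_x(x):
    return x

def inform():
    print('made a worker!')

if __name__ == '__main__':
    # maxtasksperchild=2: each worker exits after two tasks and is replaced
    pool = mp.Pool(2, maxtasksperchild=2, initializer=inform)
    res = pool.map(ret_x, range(8))
    print(res)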

works as expected:

made a worker!
made a worker!
worker exiting after 2 tasks
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
[0, 1, 2, 3, 4, 5, 6, 7]

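But changing the range to 20 does not show any additional workers being created, nor a total of 20 completed tasks, even though the full range is returned as the expected list.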

made a worker!
made a worker!
worker exiting after 2 tasks
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
made a worker!
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
worker exiting after 1 tasks

Best answer

It works this way because you did not explicitly set "chunksize" in pool.map:

map(func, iterable[, chunksize])

This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.

Source: https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
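To make the quoted sentence concrete, here is a minimal sketch of what "chopping the iterable into chunks" amounts to. The helper name split_into_chunks is hypothetical and is not part of multiprocessing; it only illustrates the idea that each chunk, not each item, becomes one task for a worker.

```python
from itertools import islice

def split_into_chunks(iterable, chunksize):
    """Yield successive tuples of at most `chunksize` items.

    Hypothetical helper for illustration only; it is not the pool's own code.
    """
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return  # iterable exhausted
        yield chunk

chunks = list(split_into_chunks(range(20), 3))
print(len(chunks))  # 7 chunks, so the pool would see 7 tasks
print(chunks[-1])   # (18, 19): the final chunk is shorter
```

With a chunk size of 3, the 20 items become 7 tasks, which is the number that maxtasksperchild is counted against.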

For 8 items with len(pool)=2, the chunk size is 1 (divmod(8, 2*4) gives (1, 0)), so you see (8/1)/2 = 4 workers:

workers = (number of items / chunksize) / tasks per process

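For 20 items with len(pool)=2, the chunk size is 3 (divmod(20, 2*4) gives (2, 4), and the nonzero remainder rounds the chunk size up), so you see something like (20/3)/2 = 3.3 workers. In practice that is 7 chunks spread over 4 workers, the last of which handles only one chunk.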

For 40 items, chunksize=5 and workers = (40/5)/2 = 4.
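The arithmetic above can be reproduced with a short sketch. The function names default_chunksize and estimate_workers are hypothetical; the divmod step follows the calculation described in this answer, and the worker count is only a rough estimate of how many processes end up handling at least one chunk, assuming each worker takes maxtasksperchild chunks before it exits. The pool may start further replacement workers that never receive a task.

```python
import math

def default_chunksize(n_items, n_procs):
    # divmod step as described in the answer: aim for n_procs * 4 chunks,
    # rounding the chunk size up when there is a remainder
    chunksize, extra = divmod(n_items, n_procs * 4)
    if extra:
        chunksize += 1
    return chunksize

def estimate_workers(n_items, n_procs, maxtasksperchild, chunksize=None):
    # Rough estimate (an assumption, not a guarantee): one chunk is one task,
    # and every worker handles maxtasksperchild chunks before being replaced
    if n_items == 0:
        return 0
    if chunksize is None:
        chunksize = default_chunksize(n_items, n_procs)
    n_chunks = math.ceil(n_items / chunksize)
    return math.ceil(n_chunks / maxtasksperchild)

for n in (8, 20, 40):
    print(n, default_chunksize(n, 2), estimate_workers(n, 2, 2))
# 8 1 4
# 20 3 4
# 40 5 4
print(estimate_workers(20, 2, 2, chunksize=1))  # 10
```

For 20 items the estimate is 4 workers, matching the question's output where the fourth worker exits after a single task.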

If you want, you can set chunksize=1:

res = pool.map(ret_x, range(20), 1)

and you will see (20/1)/2 = 10 workers:

python mppp.py
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

So a chunk is the unit of work handed to a process: maxtasksperchild counts chunks, not individual items.

How the chunk size is calculated: https://hg.python.org/cpython/file/1c54def5947c/Lib/multiprocessing/pool.py#l305

The original question, "Python: multiprocessing workers, tracking completed tasks (missing completions)", is on Stack Overflow: https://stackoverflow.com/questions/28101232/
