
python - When should you call .join() on a process?

Reposted · Author: IT老高 · Updated: 2023-10-28 22:04:35

I'm reading through various tutorials on Python's multiprocessing module, and I can't figure out why/when to call process.join(). For example, I stumbled across this example:

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put its out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

print(resultdict)

As I understand it, process.join() blocks the calling process until the process whose join method was called has finished executing. I also believe that the child processes started in the code example above finish executing once they complete the target function, that is, after they push their results onto out_q. Finally, I believe that out_q.get() blocks the calling process until there is a result to pull. So, considering the code:

resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

the main process is blocked by the out_q.get() calls until every worker process has finished pushing its results onto the queue. So, by the time the main process exits the for loop, every child process should have completed execution, correct?

If that is the case, is there any reason to call the p.join() methods at this point? Haven't all the worker processes already finished, so how does that make the main process "wait for all worker processes to finish"? I ask mainly because I've seen this pattern in several different examples, and I'm curious whether I've failed to understand something.
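The blocking behavior the question describes can be seen in isolation. A minimal sketch (the name slow_worker is made up for illustration): join() returns only after the child's target function has run to completion, so the parent's wait spans the entire sleep.

```python
import time
import multiprocessing

def slow_worker():
    # Simulate a child that is still busy after being started.
    time.sleep(2)

if __name__ == "__main__":
    p = multiprocessing.Process(target=slow_worker)
    start = time.time()
    p.start()
    p.join()  # blocks the parent until slow_worker returns
    print(time.time() - start >= 2)  # True: join waited out the sleep
```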

Best Answer

Try running this:

import math
import time
from multiprocessing import Queue
import multiprocessing

def factorize_naive(n):
    factors = []
    for div in range(2, int(n**.5)+1):
        while not n % div:
            factors.append(div)
            n //= div
    if n != 1:
        factors.append(n)
    return factors

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put its out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

time.sleep(5)

# Wait for all worker processes to finish
for p in procs:
    p.join()

print(resultdict)

time.sleep(15)

Then open the task manager. You should be able to see the 4 child processes sit in a zombie state for a few seconds before being removed by the OS (thanks to the join calls):

[Screenshot: the four exited child processes shown as zombies in the task manager]

In more complex situations the child processes could stay in a zombie state forever (as in the situation you were asking about in another question), and if you create enough of them you can fill the process table, causing trouble for the OS (which may kill your main process to avoid failure).
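The state transition described above can also be observed programmatically. A small sketch under the assumption of a child that exits immediately (quick_worker is a made-up name): after the child exits but before the parent reaps it, the child lingers in the OS process table on Unix; join() performs the explicit reap.

```python
import time
import multiprocessing

def quick_worker():
    pass  # exits immediately

if __name__ == "__main__":
    p = multiprocessing.Process(target=quick_worker)
    p.start()
    time.sleep(1)        # give the child ample time to exit
    print(p.is_alive())  # False: the child has terminated
    p.join()             # reap it so the OS can release its table entry
    print(p.exitcode)    # 0: clean exit
```

Note that in CPython on Unix, is_alive() itself polls the child's status and may already reap an exited child as a side effect; join() makes the reap explicit and unconditional.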

Regarding "python - When should you call .join() on a process?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14429703/
