
python - Strange behavior of multiprocessing Pool.map

Reposted. Author: 行者123. Updated: 2023-11-28 22:10:32

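I am seeing very strange behavior when calling a method through pool.map. With only one process, the behavior differs from a simple for loop: the if not self.seeded: block is entered several times, which should not happen. Here are the code and the output: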

import os
from multiprocessing import Pool


class MyClass(object):
    def __init__(self):
        self.seeded = False
        print("Constructor of MyClass called")

    def f(self, i):
        print("f called with", i)
        if not self.seeded:
            print("PID : {}, id(self.seeded) : {}, self.seeded : {}".format(os.getpid(), id(self.seeded), self.seeded))
            self.seeded = True

    def multi_call_pool_map(self):
        with Pool(processes=1) as pool:
            print("multi_call_pool_map with {} processes...".format(pool._processes))
            pool.map(self.f, range(10))

    def multi_call_for_loop(self):
        print("multi_call_for_loop ...")
        list_res = []
        for i in range(10):
            list_res.append(self.f(i))


if __name__ == "__main__":
    MyClass().multi_call_pool_map()

Output:

Constructor of MyClass called
multi_call_pool_map with 1 processes...
f called with 0
PID : 18248, id(self.seeded) : 1864747472, self.seeded : False
f called with 1
f called with 2
f called with 3
PID : 18248, id(self.seeded) : 1864747472, self.seeded : False
f called with 4
f called with 5
f called with 6
PID : 18248, id(self.seeded) : 1864747472, self.seeded : False
f called with 7
f called with 8
f called with 9
PID : 18248, id(self.seeded) : 1864747472, self.seeded : False

And with the for loop:

if __name__ == "__main__":
    MyClass().multi_call_for_loop()

Output:

Constructor of MyClass called
multi_call_for_loop ...
f called with 0
PID : 15840, id(self.seeded) : 1864747472, self.seeded : False
f called with 1
f called with 2
f called with 3
f called with 4
f called with 5
f called with 6
f called with 7
f called with 8
f called with 9

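How can the behavior of pool.map (the first case) be explained? I don't understand why the if block is entered several times, since self.seeded is set to False only in the constructor and the constructor is called only once... (I am using Python 3.6.8)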

Best answer

Running the code with self printed inside f shows that the instance actually changes right before each entry into the if clause:

    def f(self, i):
        print("f called with", i, "self is", self)
        if not self.seeded:
            print("PID : {}, id(self.seeded) : {}, self.seeded : {}".format(os.getpid(), id(self.seeded), self.seeded))
            self.seeded = True

This outputs:

Constructor of MyClass called
multi_call_pool_map with 1 processes...
f called with 0 self is <__main__.MyClass object at 0x7f30cd592b38>
PID : 22879, id(self.seeded) : 10744096, self.seeded : False
f called with 1 self is <__main__.MyClass object at 0x7f30cd592b38>
f called with 2 self is <__main__.MyClass object at 0x7f30cd592b38>
f called with 3 self is <__main__.MyClass object at 0x7f30cd592b00>
PID : 22879, id(self.seeded) : 10744096, self.seeded : False
f called with 4 self is <__main__.MyClass object at 0x7f30cd592b00>
f called with 5 self is <__main__.MyClass object at 0x7f30cd592b00>
f called with 6 self is <__main__.MyClass object at 0x7f30cd592ac8>
PID : 22879, id(self.seeded) : 10744096, self.seeded : False
f called with 7 self is <__main__.MyClass object at 0x7f30cd592ac8>
f called with 8 self is <__main__.MyClass object at 0x7f30cd592ac8>
f called with 9 self is <__main__.MyClass object at 0x7f30cd592a90>
PID : 22879, id(self.seeded) : 10744096, self.seeded : False

If you add chunksize=10 to .map(), it behaves just like the for loop:

    def multi_call_pool_map(self):
        with Pool(processes=1) as pool:
            print("multi_call_pool_map with {} processes...".format(pool._processes))
            pool.map(self.f, range(10), chunksize=10)

This outputs:

Constructor of MyClass called
multi_call_pool_map with 1 processes...
f called with 0 self is <__main__.MyClass object at 0x7fd175093b00>
PID : 22972, id(self.seeded) : 10744096, self.seeded : False
f called with 1 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 2 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 3 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 4 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 5 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 6 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 7 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 8 self is <__main__.MyClass object at 0x7fd175093b00>
f called with 9 self is <__main__.MyClass object at 0x7fd175093b00>
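
The grouping seen without chunksize (a new instance at items 0, 3, 6 and 9) is consistent with the default chunk size that Pool.map picks when none is given. The following is a minimal sketch of that heuristic as I understand it from the CPython source (roughly four chunks per worker process, rounded up). The helper names default_chunksize and split_into_chunks are hypothetical and are not part of the multiprocessing API.

```python
import itertools


def default_chunksize(n_items, n_workers):
    """Hypothetical helper: aim for about 4 chunks per worker, rounding up."""
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, chunksize):
    """Cut an iterable into consecutive lists of at most `chunksize` items."""
    iterator = iter(items)
    chunks = []
    while True:
        chunk = list(itertools.islice(iterator, chunksize))
        if not chunk:
            return chunks
        chunks.append(chunk)


size = default_chunksize(10, 1)            # 10 items, Pool(processes=1)
print(size)                                # 3
print(split_into_chunks(range(10), size))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

With 10 items and one process this gives chunks starting at 0, 3, 6 and 9, which lines up with the points where self changes in the first output. Passing chunksize=10 puts all ten items into a single chunk.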

The exact reason this happens is a fairly intricate implementation detail, and it has to do with how multiprocessing shares data between the processes of a pool.

I'm afraid I'm not qualified enough to explain exactly how and why this works internally.
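
One likely explanation, offered as a sketch and not as a description of the exact internals: Pool.map sends the work to the worker in chunks, and each chunk carries its own pickled copy of the bound method self.f, which includes the instance. The worker therefore unpickles a fresh MyClass for every chunk, and setting self.seeded = True only affects that chunk's copy. The code below imitates this without multiprocessing. copy.deepcopy stands in for the pickle round trip, run_in_chunks is a hypothetical helper, and f is simplified to return whether the if block was entered.

```python
import copy


class MyClass(object):
    def __init__(self):
        self.seeded = False

    def f(self, i):
        # Report whether this call entered the "not seeded" branch.
        entered = not self.seeded
        self.seeded = True
        return entered


def run_in_chunks(bound_method, items, chunksize):
    """Hypothetical stand-in for a one-process pool: one copy per chunk."""
    results = []
    for start in range(0, len(items), chunksize):
        # Assumption: deepcopy mimics pickling the task and unpickling it in
        # the worker, which yields a new instance behind the bound method.
        method_copy = copy.deepcopy(bound_method)
        for item in items[start:start + chunksize]:
            results.append(method_copy(item))
    return results


items = list(range(10))

for chunksize in (3, 10):
    flags = run_in_chunks(MyClass().f, items, chunksize)
    entered_at = [i for i, entered in zip(items, flags) if entered]
    print(chunksize, entered_at)
# 3 [0, 3, 6, 9]
# 10 [0]
```

If the goal is a flag that is set once per worker process, state kept on an instance passed through map is not a reliable place for it under this reading, because the instance does not persist across chunks.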

Regarding python - strange behavior of multiprocessing Pool.map, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56581494/
