python - Multiprocessing Pool does not change processing speed?

Reposted · Author: 太空宇宙 · Updated: 2023-11-03 21:40:18

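I created an image-approximating genetic algorithm using Python 3 and OpenCV. It creates a population of individuals, each of which draws circles of random color, size, and opacity on a blank image. After a few hundred generations, the fittest individuals end up saturating the population.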

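I tried to implement multiprocessing because rendering the images takes time, and that time grows with the population size, the circle size, and the target image size (which matters for fine detail).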

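What I did was use multiprocessing with a Pool, pass the array of individual objects as the iterable, and map back only the fitness and ID. In effect, no individual has its own canvas in the main process, while in the worker processes each individual renders its own canvas and computes its fitness/difference.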

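However, using multiprocessing seems to make the whole program slower. The rendering itself appears to run at the same speed as in serial processing, but the run as a whole is slower because of the multiprocessing overhead.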

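import time
import multiprocessing as mp

import cv2
import numpy as np

class PopulationCircle:
    def renderPop(self, individual):
        # Runs in a worker process: render, then send back only index and fitness.
        individual.render()
        return [individual.index, individual.fitness]

class IndividualCircle: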
    def render(self):
        self.genes.sort(key=lambda x: x[-1], reverse=False)
        self.canvas = np.zeros((self.height,self.width, 4), np.uint8)
        for i in range(self.maxCount):
            overlay=self.canvas.copy()
            cv2.circle(overlay, (self.genes[i][0], self.genes[i][1]), self.genes[i][2], (self.genes[i][3],self.genes[i][4],self.genes[i][5]), -1, lineType=cv2.LINE_AA)
            self.canvas = cv2.addWeighted(overlay, self.genes[i][6], self.canvas, 1-self.genes[i][6], 0)

        diff = np.absolute(np.array(self.target)- np.array(self.canvas))

        diffSum = np.sum(diff)

        self.fitness = diffSum

def evolution(mainPop, generationLimit):
    p = mp.Pool()

    for i in range(int(generationLimit)):
        start_time = time.time()
        result =[]
        print(f"""
-----------------------------------------
Current Generation: {mainPop.generation}
Initial Score: {mainPop.score}
-----------------------------------------
        """)

        #Multiprocessing used for rendering out canvas since it takes time.

        result = p.map(mainPop.renderPop, mainPop.population)

        #returns [individual.index, individual.fitness]; results is a list of list
        result.sort(key = lambda x: x[0], reverse=False)

        #Once multiprocessing is done, we only receive fitness value and index. 
        for k in mainPop.population:
            k.fitness = result[k.index][1]
        mainPop.population.sort(key = lambda x: x.fitness, reverse = True)
        if mainPop.generation == 0:
            mainPop.score = mainPop.population[-1].fitness

        """
        Things to note:
            In main process, none of the individuals have a canvas since the rendering
            is done on a different process tree.
            The only thing that changes in this main process is the individual's 
            fitness.

            After calling .renderHD and .renderLD, the fittest member will have a canvas
            drawn in this process. 
        """

        end_time = time.time() - start_time
        print(f"Time taken: {end_time}")
        if i%50==0:
            mainPop.population[0].renderHD()
            cv2.imwrite( f"../output/generationsPoly/generation{i}.jpg", mainPop.population[0].canvasHD)

        if i%10==0:
            mainPop.population[0].renderLD()
            cv2.imwrite( f"../output/allGenPoly/image{i}.jpg", mainPop.population[0].canvas)

        mainPop.toJSON()
        mainPop.breed()



    p.close()
    p.join()

if __name__ == "__main__":
    #Creates Population object
    #init generates self.population array which is an array of IndividualCircle objects that contain DNA and render methods
    pop = PopulationCircle(targetDIR, maxPop, circleAmount, mutationRate, mutationAmount, cutOff)
    #Starts loop
    evolution(pop, generations)

With a population of 600 and 800 circles, serial processing takes 11 s/iteration on average, while multiprocessing takes 18 s/iteration on average.

I am new to multiprocessing, so any help would be appreciated.

Best answer

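This happens because OpenCV spawns many threads internally. When you fork from main and run multiple processes, each of those processes creates its own separate bunch of OpenCV threads, which causes a small avalanche. The problem is that they end up synchronizing and waiting for locks to be released, which you can easily check by profiling your code with cProfile.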
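
As a rough illustration of the profiling step, here is a minimal sketch that uses only the standard library. The functions `render_stand_in`, `evaluate_generation`, and `profile_call` are hypothetical names made up for this example; in the real program you would profile one generation of `evolution` instead. When the parent process is blocked waiting on a pool, that time typically shows up in the report as lock `acquire` calls.

```python
import cProfile
import io
import pstats


def render_stand_in(n_circles):
    """Hypothetical stand-in for IndividualCircle.render(): burns a little CPU."""
    total = 0
    for i in range(n_circles):
        total += (i * i) % 7
    return total


def evaluate_generation(pop_size, n_circles):
    """Hypothetical stand-in for one generation's serial fitness pass."""
    return [render_stand_in(n_circles) for _ in range(pop_size)]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(evaluate_generation, 20, 1000)
    print(report)
```

Comparing the report of a serial run with the report of a pooled run should make it visible whether the extra time goes into rendering or into waiting.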

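The problem is described in the joblib documentation. That may also be your solution: switch to joblib. I ran into a similar problem in the past; you can find it in this SO post.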

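[Edit] Additional evidence and a solution are here. In short, according to that post, this is a known issue, but because OpenCV releases the GIL, you can run multithreading instead of multiprocessing, which reduces the overhead.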
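
A minimal sketch of the multithreading variant, using `concurrent.futures.ThreadPoolExecutor` from the standard library. `StandInIndividual` and `render_population` are hypothetical names for this example, and the placeholder `render()` is pure Python, so it will not get faster under threads by itself; any speedup in the real program depends on the time being spent inside OpenCV calls that release the GIL, as the post referenced above states.

```python
from concurrent.futures import ThreadPoolExecutor


class StandInIndividual:
    """Hypothetical stand-in for IndividualCircle; only index and fitness matter here."""

    def __init__(self, index, genes):
        self.index = index
        self.genes = genes
        self.fitness = None

    def render(self):
        # Placeholder for the cv2 drawing and diff; the real render() would go here.
        self.fitness = sum(self.genes)


def render_population(population, max_workers=4):
    """Render every individual on a thread pool; fitness is set in place."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Threads share memory, so nothing is pickled and no results are mapped back.
        # list() forces the lazy map to finish and re-raises any worker exception.
        list(executor.map(lambda ind: ind.render(), population))
    return population


if __name__ == "__main__":
    pop = [StandInIndividual(i, [i, i + 1, i + 2]) for i in range(5)]
    render_population(pop)
    print([(ind.index, ind.fitness) for ind in pop])
```

Because the threads operate on the same objects as the main thread, the step that copies fitness values back by index is no longer needed. One difference from the process-based version is that each individual also keeps its rendered canvas in the main process.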

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/56016733/
