
python - How to use a python multiprocessing pool in a continuous loop


I am using the Python multiprocessing library to run a Selenium script. My code is as follows:

from multiprocessing import Process

#-- start and join multiple processes ---
thread_list = []
total_threads = 10  #-- no. of parallel processes
for i in range(total_threads):
    t = Process(target=get_browser_and_start, args=[url, nlp, pixel])
    thread_list.append(t)
    print("starting process...")
    t.start()

for t in thread_list:
    print("joining existing process...")
    t.join()

As far as I understand join(), it waits for each process to finish. But what I want is that as soon as one process is free, it should be handed another task so that new work starts right away.

In other words, something like this:

Say 8 processes are started to begin with.

no_of_tasks_to_perform = 100

for i in range(no_of_tasks_to_perform):
    processes start(8)
    if process no. 2 finished executing, start a new process
    maintain 8 processes at any point of time till
    "i" is <= no_of_tasks_to_perform

Best Answer

Instead of starting new processes every now and then, try putting all of your tasks into a multiprocessing.Queue() up front and starting 8 long-running processes; each process keeps pulling new tasks from the queue and doing the work until there are no tasks left.

In your case, it would look more like this:

from multiprocessing import Queue, Process

def worker(queue):
    while not queue.empty():
        task = queue.get()

        # now start to work on your task
        get_browser_and_start(url, nlp, pixel)  # url, nlp, pixel can be unpacked from task

def main():
    queue = Queue()

    # Now put tasks into queue
    no_of_tasks_to_perform = 100

    for i in range(no_of_tasks_to_perform):
        queue.put([url, nlp, pixel, ...])

    # Now start all processes
    process = Process(target=worker, args=(queue, ))
    process.start()
    ...
    process.join()
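
A small caveat about the sketch above, not part of the original answer: queue.empty() is documented as unreliable when several processes share the queue, so a worker may exit too early or keep blocking in get() after the queue is drained. A common variation is to start the 8 workers explicitly and push one sentinel value per worker so each one knows when to stop; a hedged sketch along those lines, again assuming (url, nlp, pixel) tasks:

from multiprocessing import Queue, Process

def worker(queue):
    while True:
        task = queue.get()
        if task is None:  # sentinel: no more tasks
            break
        url, nlp, pixel = task  # assumed task layout
        get_browser_and_start(url, nlp, pixel)

def main():
    queue = Queue()
    no_of_workers = 8
    no_of_tasks_to_perform = 100

    for i in range(no_of_tasks_to_perform):
        queue.put((url, nlp, pixel))  # task parameters come from your own code

    for _ in range(no_of_workers):
        queue.put(None)  # one sentinel per worker

    processes = [Process(target=worker, args=(queue,)) for _ in range(no_of_workers)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()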

On "python - How to use a python multiprocessing pool in a continuous loop", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41827157/
