
python - Avoid waiting for threads to finish in Python

Reposted. Author: 行者123. Updated: 2023-12-01 02:00:28

I wrote the script below to read data from a txt file and process it. But if I give it a large file and a lot of threads, it seems that the further it reads into the list, the slower the script gets.

Is there a way to avoid waiting for all the threads to finish, and instead start a new thread as soon as one completes its work?

Also, the script doesn't seem to exit when it finishes processing.

import threading, Queue, time

class Work(threading.Thread):

    def __init__(self, jobs):
        threading.Thread.__init__(self)
        self.Lock = threading.Lock()
        self.jobs = jobs

    def myFunction(self):
        # simulate work
        self.Lock.acquire()
        print("Firstname: " + self.firstname + " Lastname: " + self.lastname)
        self.Lock.release()
        time.sleep(3)

    def run(self):
        while True:
            self.item = self.jobs.get().rstrip()
            self.firstname = self.item.split(":")[0]
            self.lastname = self.item.split(":")[1]
            self.myFunction()
            self.jobs.task_done()

def main(file):
    jobs = Queue.Queue()
    myList = open(file, "r").readlines()
    MAX_THREADS = 10
    pool = [Work(jobs) for i in range(MAX_THREADS)]
    for thread in pool:
        thread.start()
    for item in myList:
        jobs.put(item)
    for thread in pool:
        thread.join()

if __name__ == '__main__':
    main('list.txt')

Best Answer

The script seems to take longer for larger inputs because there is a 3-second pause between each batch of prints.

The reason the script doesn't finish is that, since you are using a Queue, you need to call join() on the Queue, not on the individual threads. To make sure the script returns when the jobs stop running, you should also set daemon = True.

The Lock also doesn't work in the current code, because threading.Lock() produces a new lock each time. You need all the jobs to share the same lock.

If you want to use this in Python 3 (and you should), the Queue module has been renamed to queue.
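The Queue keeps a count of unfinished tasks: put() increments it, task_done() decrements it, and join() blocks until it reaches zero. A minimal sketch of that mechanism (using the Python 3 queue module; the names here are illustrative, not from the original answer):

```python
import queue
import threading

jobs = queue.Queue()
done = []

def worker():
    while True:
        item = jobs.get()
        done.append(item)
        jobs.task_done()  # tell join() this task is finished

# daemon=True means this thread will not keep the interpreter alive
t = threading.Thread(target=worker, daemon=True)
t.start()

for item in ["a", "b", "c"]:
    jobs.put(item)

jobs.join()  # returns once every put() item has a matching task_done()
print(sorted(done))  # ['a', 'b', 'c']
```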
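A quick way to see the problem: two distinct Lock objects do not exclude each other, so a lock created per thread serializes nothing. A minimal illustration (not part of the original answer):

```python
import threading

# Two separate Lock objects, as you get when each thread calls threading.Lock()
a = threading.Lock()
b = threading.Lock()

a.acquire()
# b can still be acquired while a is held -- there is no mutual exclusion
got_b = b.acquire(False)  # non-blocking acquire attempt
print(got_b)  # True: the per-thread "lock" protected nothing
b.release()
a.release()
```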

import threading, Queue, time

lock = threading.Lock()  # One lock

class Work(threading.Thread):

    def __init__(self, jobs):
        threading.Thread.__init__(self)
        self.daemon = True  # set daemon
        self.jobs = jobs

    def myFunction(self):
        # simulate work
        lock.acquire()  # All jobs share the one lock
        print("Firstname: " + self.firstname + " Lastname: " + self.lastname)
        lock.release()
        time.sleep(3)

    def run(self):
        while True:
            self.item = self.jobs.get().rstrip()
            self.firstname = self.item.split(":")[0]
            self.lastname = self.item.split(":")[1]
            self.myFunction()
            self.jobs.task_done()


def main(file):
    jobs = Queue.Queue()
    with open(file, 'r') as fp:  # Close the file when we're done
        myList = fp.readlines()
    MAX_THREADS = 10
    pool = [Work(jobs) for i in range(MAX_THREADS)]
    for thread in pool:
        thread.start()
    for item in myList:
        jobs.put(item)
    jobs.join()  # Join the Queue


if __name__ == '__main__':
    main('list.txt')

A simpler example (based on an example in the Python docs):

import threading
import time
from Queue import Queue  # Py2
# from queue import Queue  # Py3

lock = threading.Lock()

def worker():
    while True:
        item = jobs.get()
        if item is None:
            break
        firstname, lastname = item.split(':')
        lock.acquire()
        print("Firstname: " + firstname + " Lastname: " + lastname)
        lock.release()
        time.sleep(3)
        jobs.task_done()

jobs = Queue()
pool = []
MAX_THREADS = 10
for i in range(MAX_THREADS):
    thread = threading.Thread(target=worker)
    thread.start()
    pool.append(thread)

with open('list.txt') as fp:
    for line in fp:
        jobs.put(line.rstrip())

# block until all tasks are done
jobs.join()

# stop workers
for i in range(MAX_THREADS):
    jobs.put(None)
for thread in pool:
    thread.join()

Regarding "python - Avoid waiting for threads to finish in Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49713281/
