
python - Readline and threading

Reposted. Author: 行者123. Updated: 2023-12-01 05:59:59

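I am running the code below. When I call queue.qsize() after running it, there are still about 450,000 items in the queue, which means most of the lines in the text file were never read. Any idea what is going on here?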

from Queue import Queue
from threading import Thread

lines = 660918 #int(str.split(os.popen('wc -l HGDP_FinalReport_Forward.txt').read())[0]) -1
queue = Queue()
File = 'HGDP_FinalReport_Forward.txt'
num_threads = 10
short_file = open(File)

class worker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue
    def run(self):
        while True:
            try:
                self.queue.get()
                i = short_file.readline()
                self.queue.task_done() #signal to the queue that the task is done
            except:
                break

## This is where I should make the call to the threads

def main():
    for i in range(num_threads):
        worker(queue).start()
    queue.join()


for i in range(lines): # put the range of the number of lines in the .txt file
    queue.put(i)

main()

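Best answer

It is hard to know exactly what you are trying to do here, but if each line can be processed independently, multiprocessing is a simpler option that handles all the synchronization for you. An added bonus is that you do not need to know the number of lines in advance.

Basically: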

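import multiprocessing

def process(line):
    return len(line)  # or whatever

# Create the pool after process() is defined so the worker processes can find it.
# On platforms that spawn workers (e.g. Windows), run this part under
# an if __name__ == '__main__': guard.
pool = multiprocessing.Pool(10)

with open(path) as lines:
    results = pool.map(process, lines)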
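
To make the snippet above easier to try, here is a minimal self-contained sketch of the same idea, written for Python 3. The helper name process_file, the worker count of 4, and the temporary sample file are assumptions made for illustration; they are not part of the original answer.

```python
import multiprocessing
import os
import tempfile


def process(line):
    """Per-line work; here just the line length, as in the answer above."""
    return len(line)


def process_file(path, workers=4):
    """Map process() over every line of the file using a pool of worker processes.

    process_file and workers are hypothetical names used for this sketch.
    """
    with multiprocessing.Pool(workers) as pool, open(path) as lines:
        # pool.map consumes the whole file and returns results in line order.
        return pool.map(process, lines)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the real input file.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("alpha\nbeta\ngamma\n")
    try:
        print(process_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

The worker function lives at module level so the pool can pickle a reference to it, and the pool is only created under the main guard so the script also behaves on platforms that start workers by re-importing the module.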

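Alternatively, if you only want some kind of aggregate result over the lines, you can use reduce to keep memory usage down. Use pool.imap rather than pool.map for this, since pool.map builds the complete list of results before reduce sees any of them.

import operator
from functools import reduce  # a builtin in Python 2, needs this import in Python 3

with open(path) as lines:
    result = reduce(operator.add, pool.imap(process, lines))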
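
For comparison, the queue-and-threads layout from the question can also produce an aggregate if a single thread reads the file and puts the lines themselves on the queue, instead of putting line numbers on the queue and sharing one file object across workers. The following is only a sketch under that assumption, written for Python 3 (where the module is named queue rather than Queue). The names total_line_length and _DONE, the queue size of 1000, and the in-memory sample input are all made up for illustration.

```python
import io
import queue
import threading

_DONE = object()  # sentinel telling a worker that no more lines are coming


def total_line_length(handle, num_threads=4):
    """Sum len(line) over all lines, with one reader and several worker threads."""
    work = queue.Queue(maxsize=1000)  # bounded so the reader cannot run far ahead
    totals = []
    lock = threading.Lock()

    def worker():
        subtotal = 0
        while True:
            line = work.get()
            if line is _DONE:
                break
            subtotal += len(line)  # placeholder for the real per-line work
        with lock:
            totals.append(subtotal)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for line in handle:  # only this thread touches the file
        work.put(line)
    for _ in threads:
        work.put(_DONE)  # one sentinel per worker so each of them exits
    for t in threads:
        t.join()
    return sum(totals)


print(total_line_length(io.StringIO("alpha\nbeta\ngamma\n")))
```

Because the reader simply iterates over the file, the line count does not have to be known in advance, and the sentinels let every worker exit cleanly without a bare except. For CPU-bound per-line work in CPython, threads are unlikely to run faster than a single loop, which is one reason the answer suggests multiprocessing.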

Regarding python - Readline and threading, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10919327/
