
python - Is there a faster pythonic way to read the first few lines of a file than fid.readline()?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:53:33

I have to open several thousand files, but only read the first 3 lines of each.

Currently, I am doing this:

def test_readline(filename):
    fid = open(filename, 'rb')
    lines = [fid.readline() for i in range(3)]

This yields:

The slowest run took 10.20 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 59.2 µs per loop

Another solution is to convert fid to a list:

def test_list(filename):
    fid = open(filename, 'rb')
    lines = list(fid)

%timeit test_list(MYFILE)

The slowest run took 4.92 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 374 µs per loop

Ouch!! Is there a faster way to read only the first 3 lines of these files, or is readline() the best option? Could you reply with alternatives and timings?

Ultimately, though, I have to open thousands of separate files, and they will not be cached. So does this matter (it looks like it does)?

(Uncached: 603 µs for the readline method vs. 1840 µs for the list method)

Also, here is the readlines() method:

def test_readlines(filename):
    fid = open(filename, 'rb')
    lines = fid.readlines()
    return lines

The slowest run took 7.17 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 334 µs per loop

Best answer

You can use itertools.islice to slice an iterable:

import itertools


def test_list(filename):
    with open(filename, 'r', encoding='utf-8') as f:
        return list(itertools.islice(f, 3))

(I changed the open call slightly, since reading a file line by line in binary mode is a bit unusual, but you can revert that.)
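Since the question asks for alternatives with timings, here is a minimal sketch for comparing the two approaches side by side on your own machine. The names head_islice, head_readline and sample are hypothetical and chosen only for illustration, the temporary file stands in for one of the real files, and binary mode is kept as in the question. No expected timings are given because they depend on the disk and on whether the OS has cached the file.

```python
import itertools
import os
import tempfile
import timeit


def head_islice(filename, n=3):
    # Iterate the file lazily and stop after n lines; the file is closed on exit.
    with open(filename, 'rb') as f:
        return list(itertools.islice(f, n))


def head_readline(filename, n=3):
    # Same idea as the question's approach, but with the file closed on exit.
    with open(filename, 'rb') as f:
        return [f.readline() for _ in range(n)]


# Hypothetical sample file, created only so the sketch runs on its own.
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b''.join(b'line %d\n' % i for i in range(1000)))
    sample = tmp.name

first_islice = head_islice(sample)
first_readline = head_readline(sample)

# Time both approaches; these are warm-cache numbers, not cold-disk numbers.
t_islice = timeit.timeit(lambda: head_islice(sample), number=1000)
t_readline = timeit.timeit(lambda: head_readline(sample), number=1000)
print('islice:   %.1f us per call' % (t_islice / 1000 * 1e6))
print('readline: %.1f us per call' % (t_readline / 1000 * 1e6))

os.remove(sample)
```

One behavioural difference to keep in mind: for a file with fewer than n lines, the islice version returns a shorter list, while the readline version pads the result with empty bytes objects.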

Regarding "python - Is there a faster pythonic way to read the first few lines of a file than fid.readline()?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45757778/
