python - Reading data from a .txt file (excluding header and footer)

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 18:44:27

I have a .txt file that looks like this:

abcd this is the header
more header, nothing here I need
***********
column1 column2
========= =========
12.4 A
34.6 mm
1.3 um
=====================
footer, nothing that I need here
***** more text ******

I am trying to read the data in the columns into one list per column: col1 = [12.4, 34.6, 1.3] and col2 = ['A', 'mm', 'um'].

This is what I have so far, but the only thing returned when I run the code is "None":

def readfile():
    y = sys.argv[1]

    z = open(y)
    for line in z:

        data = False
        if data == True:
            toks = line.split()
            print toks

        if line.startswith('========= ========='):
            data = True
            continue

        if line.startswith('====================='):
            data = False
            break
print readfile()

Any suggestions?

Best answer

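There are many ways to do this.

One approach involves:

  1. Read the file line by line.
  2. Among the lines read, find the indices of the lines that contain the column-header delimiter (the same pattern also matches the footer delimiter).
  3. Keep the data that lies between those two lines.
  4. Parse those lines by splitting them on whitespace and storing the pieces in their respective columns.

Like this: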

with open('data.dat', 'r') as f:
    lines = f.readlines()

# Find the indices of the lines that contain the header/footer delimiters.
# The column-header delimiter doubles as the footer delimiter here, because
# `=====================` also contains `=========`.
# Note: `limits` is supposed to have exactly 2 entries. If any other line
# contains this delimiter, you'll get problems.
limits = [idx for idx, data in enumerate(lines) if '=========' in data]

# `data` now contains all the lines between these limits
data = lines[limits[0]+1:limits[1]]

# Now parse the lines into rows by splitting each line on whitespace
rows = [line.split() for line in data]

# Column 1 has float data, so convert the string data to float
col1 = [float(row[0]) for row in rows]

# Column 2 is string data, so there is nothing further to do
col2 = [row[1] for row in rows]

print(col1)
print(col2)

This outputs (from your example):

[12.4, 34.6, 1.3] #Column 1
['A', 'mm', 'um'] #Column 2
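
The flag-based loop from the question can also be made to work. The question's version prints "None" because `readfile()` has no return statement, and it never prints any tokens because `data = False` is reset at the top of every loop iteration. Below is a minimal sketch of that approach, assuming the file contains exactly one table bounded by two lines that start with `=========`. The names `read_columns` and `in_table` are chosen here for illustration and do not come from the original post.

```python
def read_columns(path):
    """Return (col1, col2) parsed from the rows between the two '=' delimiter lines."""
    col1, col2 = [], []
    in_table = False  # initialized once, outside the loop, so it persists between lines
    with open(path) as f:
        for line in f:
            if line.startswith('========='):
                if in_table:
                    break  # second delimiter: the footer starts here
                in_table = True  # first delimiter: data starts on the next line
                continue
            if in_table:
                toks = line.split()
                if len(toks) >= 2:  # skip blank or malformed rows
                    col1.append(float(toks[0]))  # column 1 is numeric
                    col2.append(toks[1])         # column 2 stays a string
    return col1, col2
```

Calling `col1, col2 = read_columns(sys.argv[1])` (after `import sys`) would mirror how the question reads the file name from the command line. Returning the lists, rather than printing inside the loop, is what makes `print(read_columns(...))` show data instead of None.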

Regarding "python - Reading data from a .txt file (excluding header and footer)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19849156/
