
python - Reading a text file in Python with variable spacing

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:05:32

I have data in a text file in the following form, and I want to load it into Python:

    pclass  survived                                             name
0        1         1                    Allen, Miss. Elisabeth Walton
1        1         1                   Allison, Master. Hudson Trevor
2        1         0                     Allison, Miss. Helen Loraine
3        1         0             Allison, Mr. Hudson Joshua Creighton
4        1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
5        1         1                              Anderson, Mr. Harry
6        1         1                Andrews, Miss. Kornelia Theodosia
7        1         0                           Andrews, Mr. Thomas Jr
8        1         1    Appleton, Mrs. Edward Dale (Charlotte Lamson)
9        1         0                          Artagaveytia, Mr. Ramon
10       1         0                           Astor, Col. John Jacob

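Because the amount of whitespace between columns is not constant, and the last field (name) itself contains single spaces, I cannot parse it. I tried the following: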

pd.read_csv("test.csv",sep = "\s+", header=0, index_col=0)

But it raises an error:

CParserError: Error tokenizing data. C error: Expected 7 fields in line 5, saw 8
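The field counts in this message can be reproduced by splitting each line on any run of whitespace, which is roughly what `sep="\s+"` asks the parser to do. The following is a minimal sketch using the first few rows of the data above; the `sample` string, its exact spacing, and the variable names are illustrative assumptions, not part of the original question.

```python
# First rows of the question's data; the exact column spacing is illustrative.
sample = """\
    pclass  survived                                  name
0        1         1         Allen, Miss. Elisabeth Walton
1        1         1        Allison, Master. Hudson Trevor
2        1         0          Allison, Miss. Helen Loraine
3        1         0  Allison, Mr. Hudson Joshua Creighton
"""

# str.split() with no argument splits on any run of whitespace,
# so every word of the name becomes its own field.
counts = [len(line.split()) for line in sample.splitlines()]
print(counts)  # [3, 7, 7, 7, 8]
```

The first data rows each break into 7 tokens, while the row on line 5 of the file (index 3) breaks into 8, which matches the "Expected 7 fields in line 5, saw 8" in the error.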

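Best answer

'\s+' treats any run of one or more whitespace characters as a separator, so it also splits your last column at every space inside the name. Use a regular expression that requires two or more whitespace characters instead. A regex separator like this is handled by the Python parsing engine, hence engine='python'.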
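The difference between the two patterns can be seen on a single row with the standard `re` module. This is only a sketch of the idea: the `line` value is one row from the question with illustrative spacing, and `loose` and `strict` are names chosen here.

```python
import re

# One row from the question; columns are separated by at least two spaces.
line = "4        1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)"

# One or more whitespace characters: the name is broken into separate words.
loose = re.split(r"\s+", line)

# Two or more whitespace characters: single spaces inside the name survive.
strict = re.split(r"\s{2,}", line)

print(len(loose))  # 11
print(strict)      # ['4', '1', '0', 'Allison, Mrs. Hudson J C (Bessie Waldo Daniels)']
```

This relies on two assumptions about the file: every pair of adjacent columns is separated by at least two spaces, and no value contains two consecutive spaces.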

pd.read_csv("test.csv", sep=r"\s{2,}", header=0, index_col=0, engine='python')

Complete working example

from io import StringIO
import pandas as pd

# Columns are separated by two or more spaces; names contain only single spaces.
txt = """    pclass  survived                                             name
0        1         1                    Allen, Miss. Elisabeth Walton
1        1         1                   Allison, Master. Hudson Trevor
2        1         0                     Allison, Miss. Helen Loraine
3        1         0             Allison, Mr. Hudson Joshua Creighton
4        1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
5        1         1                              Anderson, Mr. Harry
6        1         1                Andrews, Miss. Kornelia Theodosia
7        1         0                           Andrews, Mr. Thomas Jr
8        1         1    Appleton, Mrs. Edward Dale (Charlotte Lamson)
9        1         0                          Artagaveytia, Mr. Ramon
10       1         0                           Astor, Col. John Jacob
"""

# The raw string keeps the backslash intact; a regex separator needs the Python engine.
pd.read_csv(StringIO(txt), sep=r"\s{2,}", header=0, index_col=0, engine='python')

    pclass  survived                                             name
0        1         1                    Allen, Miss. Elisabeth Walton
1        1         1                   Allison, Master. Hudson Trevor
2        1         0                     Allison, Miss. Helen Loraine
3        1         0             Allison, Mr. Hudson Joshua Creighton
4        1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
5        1         1                              Anderson, Mr. Harry
6        1         1                Andrews, Miss. Kornelia Theodosia
7        1         0                           Andrews, Mr. Thomas Jr
8        1         1    Appleton, Mrs. Edward Dale (Charlotte Lamson)
9        1         0                          Artagaveytia, Mr. Ramon
10       1         0                           Astor, Col. John Jacob
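If the file cannot be relied on to have at least two spaces between columns, a different technique (not part of the answer above) is to split each row a fixed number of times and keep the remainder as the name. The sketch below assumes that only the last column may contain spaces and that the first three fields are integers; `txt`, `records`, and the `"index"` key are hypothetical names used for illustration.

```python
# Two rows from the question, deliberately written with single spaces between columns.
txt = """\
   pclass survived name
0 1 1 Allen, Miss. Elisabeth Walton
4 1 0 Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
"""

# Strip each line so trailing whitespace does not end up in the name.
lines = [line.strip() for line in txt.splitlines() if line.strip()]
header = lines[0].split()  # ['pclass', 'survived', 'name']

records = []
for line in lines[1:]:
    # Split at most 3 times: index, pclass, survived, then the rest of the line.
    idx, pclass, survived, name = line.split(maxsplit=3)
    records.append({
        "index": int(idx),
        header[0]: int(pclass),
        header[1]: int(survived),
        header[2]: name,
    })

print(records[1]["name"])  # Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
```

The resulting list of dicts can then be handed to pandas, for example with `pd.DataFrame(records).set_index("index")`.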

Regarding "python - Reading a text file in Python with variable spacing", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43811290/
