
python - Pandas: reading a file whose columns contain special characters

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:00:44

In the data I have, some feature values are ?. How can I replace them with NA?

Edit

The code and its output are as follows:

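import numpy as np
import pandas as pd

df = pd.read_csv("cca-census-income.csv", header=None)

df.replace('?', np.nan, inplace=True)

df.iloc[0]  # first row (originally df.ix[0,]; .ix was removed in pandas 1.0)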

23 Other relative of householder
24 1700.09
25 ?
26 ?
27 ?
28 Not in universe under 1 year old
29 ?
30 0

Best answer

Add the parameter na_values='?' to read_csv.

Example:

import pandas as pd
import io


temp=u"""Date Time,a
2010-01-27 16:00:00,?
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7"""

df = pd.read_csv(io.StringIO(temp), na_values='?')
print(df)
             Date Time    a
0  2010-01-27 16:00:00  NaN
1  2010-01-27 16:10:00  2.2
2  2010-01-27 16:30:00  1.7
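
One way to confirm that na_values did its job is to compare how the column is typed with and without it. The sketch below reuses the sample data from the example above; the variable names (raw, clean, missing_count) are illustrative and not part of the original answer.

```python
import io
import pandas as pd

temp = """Date Time,a
2010-01-27 16:00:00,?
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7"""

# Without na_values, the literal '?' forces column 'a' to be read as text.
raw = pd.read_csv(io.StringIO(temp))

# With na_values='?', the '?' becomes NaN and the column parses as numeric.
clean = pd.read_csv(io.StringIO(temp), na_values='?')

raw_is_numeric = pd.api.types.is_numeric_dtype(raw['a'])
clean_is_numeric = pd.api.types.is_numeric_dtype(clean['a'])
missing_count = int(clean['a'].isna().sum())

print(raw_is_numeric, clean_is_numeric, missing_count)
```

If the column is still non-numeric after passing na_values, the placeholder in the file probably differs from the string given (for example, it carries extra whitespace).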

Edit:

Thanks to 'shivsn' for suggesting adding skipinitialspace=True:

temp=u"""Date Time,a
? , ?
? ,?
2010-01-27 16:30:00,1.7"""

df = pd.read_csv(io.StringIO(temp), na_values=['?', '? '], skipinitialspace=True)
print(df)
             Date Time    a
0                  NaN  NaN
1                  NaN  NaN
2  2010-01-27 16:30:00  1.7

Edit 1, based on the file:

It looks like there are only spaces before each ?:

df = pd.read_csv('census-income.data',
                 header=None,
                 na_values=['?'],
                 skipinitialspace=True)
print(df)
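
A likely reason the df.replace('?', np.nan) call in the question had no effect is that each cell actually holds ' ?' with a leading space, so an exact match on '?' finds nothing. The sketch below illustrates this on a small made-up sample (the two rows are hypothetical and only imitate the file's ", " separators). It also shows a regex-based replace as a possible post-load alternative, which is not part of the original answer.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sample: fields are separated by ", ", so a missing
# value is stored as " ?" rather than "?".
sample = """23, Other relative of householder, ?
45, ?, 1700.09"""

df_raw = pd.read_csv(io.StringIO(sample), header=None)

# Exact matching on '?' changes nothing because the cells hold ' ?'.
unchanged = df_raw.replace('?', np.nan)
n_missing_exact = int(unchanged.isna().sum().sum())

# Post-load alternative: a regex that tolerates surrounding whitespace.
fixed = df_raw.replace(r'^\s*\?\s*$', np.nan, regex=True)
n_missing_regex = int(fixed.isna().sum().sum())

# Load-time fix from the answer: strip the leading spaces while parsing.
df_clean = pd.read_csv(io.StringIO(sample), header=None,
                       na_values=['?'], skipinitialspace=True)
n_missing_load = int(df_clean.isna().sum().sum())

print(n_missing_exact, n_missing_regex, n_missing_load)
```

The load-time variant is usually preferable because it also lets pandas infer numeric dtypes for the affected columns, whereas the regex replace leaves the remaining values as strings with their leading spaces.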

This post, "python - Pandas: reading a file whose columns contain special characters", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/38233046/
