
python - Replace missing values (given as strings) in a pandas DataFrame with np.NaN

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:51:21

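I have a DataFrame, energy, in which some columns have missing values. The missing values are represented in the DataFrame by the string "...". I want to replace all of these values with np.NaN.
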
In [3]: import pandas as pd

In [4]: import numpy as np

In [7]: energy = pd.read_excel('test.xls', skiprows = 17, skip_footer = 38, parse_cols = range(2, 6), index_col = None, names = ['Country', 'ES'
...: , 'ESC', '% Renewable'])

In [8]: energy[(energy['ES'] == "...") | (energy['ESC'] == "...")]
Out[8]:
                          Country   ES  ESC  % Renewable
3                  American Samoa  ...  ...     0.641026
86                           Guam  ...  ...     0.000000
150      Northern Mariana Islands  ...  ...     0.000000
210                        Tuvalu  ...  ...     0.000000
217  United States Virgin Islands  ...  ...     0.000000

To replace these values, I tried:

In [9]: energy[(energy['ES'] == "...")]['ES'] = np.NaN
/usr/local/bin/ipython:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
#!/usr/bin/python3

I don't understand this error, and I don't see any other way to achieve what I want. Any ideas?

Best answer

I think you need:

energy['ES'] = energy.loc[energy['ES'] != "...", 'ES'] 
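
Some context on the warning in the question: energy[mask]['ES'] = ... first builds a temporary filtered copy and then assigns into that copy, so energy itself is left unchanged. The line above avoids this by assigning a whole column, relying on index alignment to fill the dropped rows with NaN. Below is a minimal sketch, using a made-up three-row frame (the data are hypothetical, not the asker's spreadsheet), that compares it with a direct .loc assignment of the kind the warning suggests:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; "..." marks a missing value.
energy = pd.DataFrame({
    "Country": ["Guam", "Iceland", "Tuvalu"],
    "ES": ["...", 210, "..."],
})

# Approach from the answer: the right-hand side keeps only the non-"..." rows,
# and assigning it back aligns on the index, leaving NaN in the dropped rows.
aligned = energy.copy()
aligned["ES"] = aligned.loc[aligned["ES"] != "...", "ES"]

# Direct assignment through a single .loc call, as the warning suggests.
direct = energy.copy()
direct.loc[direct["ES"] == "...", "ES"] = np.nan

print(aligned["ES"].isna().tolist())
print(direct["ES"].isna().tolist())
```

Both variants should mark the same rows as missing. The .loc form states the intent more directly and does not depend on the index being unique.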

Another solution:

energy['ES'] = energy['ES'].mask(energy['ES'] == "...")

Or:

energy['ES'] = energy['ES'].replace({'...': np.nan})
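
Since the question mentions several affected columns, the same replace call can be applied to the whole DataFrame rather than one column at a time. The sketch below uses made-up data (the values are placeholders) and adds an optional pd.to_numeric step, because a column that held "..." strings may still have object dtype after the replacement:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; "..." marks a missing value.
energy = pd.DataFrame({
    "Country": ["American Samoa", "Iceland"],
    "ES": ["...", 210],
    "ESC": ["...", 665],
})

# DataFrame.replace checks every column, so one call covers ES and ESC.
cleaned = energy.replace("...", np.nan)

# Depending on the pandas version the columns may still be object dtype here;
# to_numeric makes sure they end up as floats.
for col in ["ES", "ESC"]:
    cleaned[col] = pd.to_numeric(cleaned[col])

print(cleaned.dtypes)
```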

But the best is ayhan's comment:

you can pass na_values='...' to pd.read_excel
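
To illustrate the effect of na_values without needing the asker's test.xls, the sketch below uses pd.read_csv on an in-memory string as a stand-in (the CSV content is hypothetical). pd.read_excel accepts a parameter of the same name, so the call from the question would take na_values='...' in the same way:

```python
import io

import pandas as pd

# Hypothetical CSV standing in for the asker's Excel sheet.
raw = io.StringIO(
    "Country,ES,ESC\n"
    "American Samoa,...,...\n"
    "Iceland,210,665\n"
)

# na_values adds "..." to the strings the parser treats as missing,
# so the columns are read as numeric with NaN in place of "...".
energy = pd.read_csv(raw, na_values="...")

print(energy)
```

Handling the placeholder at read time means the columns arrive as floats, with no separate replacement or type-conversion step afterwards.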

Regarding "python - Replace missing values (given as strings) in a pandas DataFrame with np.NaN", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41267367/
