Python - Pandas - KeyError during dropna call on a specific subset

Reposted. Author: 行者123. Updated: 2023-11-30 22:25:58

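My goal: I want to drop rows that contain NaN in specific columns. NaN is acceptable in some columns but not in others. For example: if a row has NaN in 'detail_age', I want to drop that row.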

Here is a view of my data:

import pandas as pd
df = pd.read_csv('allDeaths.csv', index_col=0, nrows=3, engine='python')
print(df.shape)
print(list(df))

Which outputs:

(3, 15)
['education_1989_revision', 'education_2003_revision',
'education_reporting_flag', 'sex', 'detail_age', 'marital_status',
'current_data_year', 'injury_at_work', 'manner_of_death', 'activity_code',
'place_of_injury_for_causes_w00_y34_except_y06_and_y07_', '358_cause_recode',
'113_cause_recode', '39_cause_recode', 'race']

When I try to drop the rows where those columns are NaN, like this:

df.dropna(subset=[2,3,4,5,6,7,8,9,11,12,13,14], axis=1, inplace=True, how='any')

I get the following error:

Traceback (most recent call last):
  File "clean.py", line 10, in <module>
    df.dropna(subset=[2,3,4,5,6,7,8,9,11,12,13,14], axis=1, inplace=True, how='any')
  File "/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py", line 3052, in dropna
    raise KeyError(list(np.compress(check, subset)))
KeyError: [3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14]

This is strange, because this works:

df.dropna(subset=[2], axis=1, inplace=True, how='any')

But this does not:

df.dropna(subset=[5], axis=1, inplace=True, how='any')

So something must be wrong with certain columns, or with the values in them. Here is my data from df.head(3):

(Posted as an image because formatting it as text is annoying; the image is not reproduced here.)

Best answer

Demo:

In [360]: df
Out[360]:
      A     B     C   D
0   1.0   2.0   NaN   4
1   5.0   NaN   7.0   8
2   NaN  10.0  11.0  12
3  13.0  14.0  15.0  16

In [362]: df = df.dropna(subset=df.columns[[1,2]], how='any')

In [363]: df
Out[363]:
      A     B     C   D
2   NaN  10.0  11.0  12
3  13.0  14.0  15.0  16
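
Applied to data shaped like the question's, the same idea might look like the sketch below. The frame is made-up stand-in data, not the asker's CSV, and it borrows only four of the fifteen column names. The positions are converted to column labels first, and the default axis=0 is kept so that rows are dropped.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for a few columns of the asker's file (assumption, not real data).
df = pd.DataFrame({
    "sex": ["M", "F", "M", "F"],
    "detail_age": [70.0, np.nan, 45.0, 81.0],
    "marital_status": ["M", "S", None, "W"],
    "race": [1, 1, 2, np.nan],
})

# Turn column positions into column labels: 'detail_age' and 'marital_status'.
required = list(df.columns[[1, 2]])

# Default axis=0: drop every row that has NaN in any of the required columns.
cleaned = df.dropna(subset=required, how="any")

print(required)
print(cleaned)
```

Rows 1 and 2 are dropped because they are missing 'detail_age' and 'marital_status' respectively. Row 3 survives even though 'race' is NaN, since that column is not in the subset.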

PS: of course, you can also specify the column names directly:

In [370]: df.dropna(subset=['B','C'], how='any')
Out[370]:
      A     B     C   D
2   NaN  10.0  11.0  12
3  13.0  14.0  15.0  16
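
As for why the original call failed: with axis=1, dropna drops columns, and subset is then looked up in the row index rather than among the column labels. With nrows=3 the row labels were presumably 0, 1, 2 (an assumption, since the data was only posted as an image), which would explain why subset=[2] was accepted while 3 and above were reported in the KeyError. A minimal sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Made-up three-row frame with the default row labels 0, 1, 2.
df = pd.DataFrame({
    "A": [1.0, 5.0, np.nan],
    "B": [2.0, np.nan, 10.0],
    "C": [3.0, 7.0, 11.0],
})

# axis=1 drops columns; subset=[2] refers to ROW label 2, which exists.
# Column A is NaN in row 2, so column A is dropped.
by_row = df.dropna(axis=1, subset=[2], how="any")
print(list(by_row.columns))

# Row label 5 does not exist, so the label lookup raises KeyError.
try:
    df.dropna(axis=1, subset=[5], how="any")
    raised = False
except KeyError:
    raised = True
print(raised)

# The intended operation: drop rows (axis=0) that have NaN in the chosen columns.
by_col = df.dropna(subset=["A", "B"], how="any")
print(list(by_col.index))
```

Leaving axis at its default and passing column labels (or df.columns[[...]] for positions) matches what the question describes as its goal.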

Regarding "Python - Pandas - KeyError during dropna call on a specific subset", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47361424/
