
python-3.x - Replace specific values in a Pandas DataFrame column, otherwise convert the column to numeric

Reposted. Author: 行者123. Updated: 2023-12-01 12:07:14

Given the following Pandas DataFrame

+----+------------------+-------------------------------------+--------------------------------+
| | AgeAt_X | AgeAt_Y | AgeAt_Z |
|----+------------------+-------------------------------------+--------------------------------+
| 0 | Older than 100 | Older than 100 | 74.13 |
| 1 | nan | nan | 58.46 |
| 2 | nan | 8.4 | 54.15 |
| 3 | nan | nan | 57.04 |
| 4 | nan | 57.04 | nan |
+----+------------------+-------------------------------------+--------------------------------+

how can I replace the values equal to Older than 100 in specific columns with nan?

+----+------------------+-------------------------------------+--------------------------------+
| | AgeAt_X | AgeAt_Y | AgeAt_Z |
|----+------------------+-------------------------------------+--------------------------------+
| 0 | nan | nan | 74.13 |
| 1 | nan | nan | 58.46 |
| 2 | nan | 8.4 | 54.15 |
| 3 | nan | nan | 57.04 |
| 4 | nan | 57.04 | nan |
+----+------------------+-------------------------------------+--------------------------------+

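Notes

  • After removing the Older than 100 string from the required columns, I convert those columns to numeric so that I can perform calculations on them.
  • The DataFrame has other columns (excluded from this example) that must not be converted to numeric, so the conversion has to be done one column at a time.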

What I have tried

Attempt 1

if df.isin('Older than 100'):
    df.loc[df['AgeAt_X']] = ''
else:
    df['AgeAt_X'] = pd.to_numeric(df["AgeAt_X"])

Attempt 2

if df.loc[df['AgeAt_X']] == 'Older than 100r':
    df.loc[df['AgeAt_X']] = ''
elif df.loc[df['AgeAt_X']] == '':
    df['AgeAt_X'] = pd.to_numeric(df["AgeAt_X"])

Attempt 3

df['AgeAt_X'] = ['' if ele == 'Older than 100' else df.loc[df['AgeAt_X']] for ele in df['AgeAt_X']]

Attempts 1, 2, and 3 return the following error:

KeyError: 'None of [0 NaN\n1 NaN\n2 NaN\n3 NaN\n4 NaN\n5 NaN\n6 NaN\n7 NaN\n8 NaN\n9 NaN\n10 NaN\n11 NaN\n12 NaN\n13 NaN\n14 NaN\n15 NaN\n16 NaN\n17 NaN\n18 NaN\n19 NaN\n20 NaN\n21 NaN\n22 NaN\n23 NaN\n24 NaN\n25 NaN\n26 NaN\n27 NaN\n28 NaN\n29 NaN\n ..\n6332 NaN\n6333 NaN\n6334 NaN\n6335 NaN\n6336 NaN\n6337 NaN\n6338 NaN\n6339 NaN\n6340 NaN\n6341 NaN\n6342 NaN\n6343 NaN\n6344 NaN\n6345 NaN\n6346 NaN\n6347 NaN\n6348 NaN\n6349 NaN\n6350 NaN\n6351 NaN\n6352 NaN\n6353 NaN\n6354 NaN\n6355 NaN\n6356 NaN\n6357 NaN\n6358 NaN\n6359 NaN\n6360 NaN\n6361 NaN\nName: AgeAt_X, Length: 6362, dtype: float64] are in the [index]'
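A likely reading of this KeyError is that df.loc[df['AgeAt_X']] hands the column's own values (mostly NaN) to .loc as row labels, and none of them exist in the index. A boolean mask selects the matching rows instead. The following is a minimal sketch, not the asker's code: the sample frame is rebuilt from the table in the question, and the names target and mask are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question (an assumption about the real data).
df = pd.DataFrame({
    "AgeAt_X": ["Older than 100", np.nan, np.nan, np.nan, np.nan],
    "AgeAt_Y": ["Older than 100", np.nan, "8.4", np.nan, "57.04"],
    "AgeAt_Z": [74.13, 58.46, 54.15, 57.04, np.nan],
})

target = "Older than 100"

# A boolean mask marks the rows to change; .loc takes the mask plus a column label.
mask = df["AgeAt_X"] == target
df.loc[mask, "AgeAt_X"] = np.nan

# With the string gone, the column can be converted to a numeric dtype.
df["AgeAt_X"] = pd.to_numeric(df["AgeAt_X"])
```

The same two steps can be repeated for AgeAt_Y, which keeps the conversion to one column at a time as the notes require.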

Attempt 4

df['AgeAt_X'] = df['AgeAt_X'].replace({'Older than 100': ''})

Attempt 4 returns the following error:

TypeError: Cannot compare types 'ndarray(dtype=float64)' and 'str'
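The KeyError output earlier reports AgeAt_X with dtype float64, so this TypeError plausibly comes from asking a float column to compare its values against a string (whether it is raised may depend on the pandas version). One way around it, sketched here under that assumption, is to skip columns that are already numeric and to replace the marker with np.nan rather than '' so the column holds a proper missing value. The helper clean_age_column is hypothetical and the sample data is illustrative.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    "AgeAt_X": [np.nan] * 5,  # already float64: there is no string to replace
    "AgeAt_Y": ["Older than 100", np.nan, "8.4", np.nan, "57.04"],
})


def clean_age_column(series, marker="Older than 100"):
    """Hypothetical helper: drop the marker string, then convert to numbers."""
    if is_numeric_dtype(series):
        # A numeric column cannot contain the marker, so no string comparison is needed.
        return series
    # Replace the marker with a real missing value, then parse the remaining strings.
    return pd.to_numeric(series.replace(marker, np.nan))


for col in ["AgeAt_X", "AgeAt_Y"]:
    df[col] = clean_age_column(df[col])
```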

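I have also looked at a few posts. The two below do not actually replace values; they create a new column derived from other columns instead.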

Replace specific values in Pandas DataFrame

Pandas replace DataFrame values

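Best answer

We can loop over each column and check whether the sentence is present. If there is a hit, we replace the sentence with NaN using Series.str.replace and then immediately convert the column to numeric with Series.astype, in this case float: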

df.dtypes
AgeAt_X object
AgeAt_Y object
AgeAt_Z float64
dtype: object

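sent = 'Older than 100'

for col in df.columns:
    # Only touch columns that actually contain the sentence.
    if sent in df[col].values:
        # Swap the sentence for the string 'NaN', which float() parses as a missing value.
        df[col] = df[col].str.replace(sent, 'NaN')
        df[col] = df[col].astype(float)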

print(df)
   AgeAt_X  AgeAt_Y  AgeAt_Z
0      NaN      NaN    74.13
1      NaN      NaN    58.46
2      NaN     8.40    54.15
3      NaN      NaN    57.04
4      NaN    57.04      NaN

df.dtypes
AgeAt_X float64
AgeAt_Y float64
AgeAt_Z float64
dtype: object
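As an alternative that is not part of the answer above, pd.to_numeric with errors="coerce" performs the replacement and the conversion in one step. The sketch below assumes it is acceptable for every unparseable value in the chosen columns to become NaN, not only 'Older than 100'. The age_cols list and the Label column are illustrative stand-ins; Label represents the other columns that must stay untouched.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "AgeAt_X": ["Older than 100", np.nan, np.nan, np.nan, np.nan],
    "AgeAt_Y": ["Older than 100", np.nan, "8.4", np.nan, "57.04"],
    "AgeAt_Z": [74.13, 58.46, 54.15, 57.04, np.nan],
    "Label": ["a", "b", "c", "d", "e"],  # stand-in for columns that must remain text
})

# Only these columns are converted; everything else is left as it is.
age_cols = ["AgeAt_X", "AgeAt_Y", "AgeAt_Z"]

for col in age_cols:
    # errors="coerce" turns any value that cannot be parsed as a number into NaN.
    df[col] = pd.to_numeric(df[col], errors="coerce")
```

Listing the columns explicitly keeps the conversion scoped, at the cost of silently discarding any other unexpected strings in those columns.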

Regarding "python-3.x - Replace specific values in a Pandas DataFrame column, otherwise convert the column to numeric", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55383805/
