python - DataFrame: take the union of columns and keep the first non-NaN value found

Reposted. Author: 行者123. Updated: 2023-11-28 22:11:27

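The DataFrame df has thousands of columns and rows. For a subset of columns given in a specific order, say columns B, C, E, I want to fill the NaN values in B with the first non-NaN value found in the remaining columns (C, E), searched in that order. Finally, C and E are dropped.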

A sample df can be built as follows:

import numpy as np
import pandas as pd
df = pd.DataFrame(10*(2+np.random.randn(6, 5)), columns=list('ABCDE'))
df.loc[1, 'B'] = np.nan
df.loc[2, 'B'] = np.nan
df.loc[5, 'B'] = np.nan
df.loc[2, 'C'] = np.nan
df.loc[5, 'C'] = np.nan
df.loc[2, 'D'] = np.nan
df.loc[2, 'E'] = np.nan
df.loc[4, 'E'] = np.nan
df
A B C D E
0 18.161033 6.453597 25.253036 18.542586 20.667311
1 27.629402 NaN 40.654821 22.804547 23.633502
2 15.459256 NaN NaN NaN NaN
3 19.115203 4.002131 14.167508 23.796780 29.557706
4 27.180622 NaN 20.763618 15.923794 NaN
5 17.917170 NaN NaN 21.865184 9.867743

The expected result is:

           A         B         D
0 18.161033 6.453597 18.542586
1 27.629402 40.654821 22.804547
2 15.459256 NaN NaN
3 19.115203 4.002131 23.796780
4 27.180622 20.763618 15.923794
5 17.917170 9.867743 21.865184

Best answer

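Here is one way (the output below comes from a different random sample than the one shown in the question):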

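drop = ['C', 'E']  # columns folded into `fill` and then discarded
fill = 'B'         # column whose NaN values get filled
# Map each column to a group label: columns in `drop` share the label of `fill`
d = {col: fill if col in drop else col for col in df.columns}
# first() keeps the first non-NaN value per row within each column group
df.groupby(d, axis=1).first()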
Out[172]:
A B D
0 14.472915 30.598602 24.528571
1 22.010242 22.215140 15.412039
2 5.383674 NaN NaN
3 38.265940 24.746673 35.367622
4 22.730089 20.244289 27.570413
5 31.216037 15.496690 9.746814
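
Two caveats about the groupby approach seem worth noting: it picks the first non-NaN value in the frame's own column order rather than in a caller-specified search order, and, as far as I know, recent pandas releases deprecate `axis=1` in `DataFrame.groupby`. A minimal alternative sketch using `bfill(axis=1)` follows. The helper name `coalesce_columns` and the small `sample` frame are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd


def coalesce_columns(df, fill, drop):
    """Fill NaN in `fill` from the columns in `drop`, searched in list order,
    then return a copy of `df` without the `drop` columns.

    Hypothetical helper for illustration only.
    """
    ordered = [fill] + list(drop)
    # Back-filling along axis=1 pulls the next non-NaN value leftwards,
    # so the first column ends up holding the first non-NaN value per row.
    first_valid = df[ordered].bfill(axis=1).iloc[:, 0]
    # assign() overwrites `fill` in place, so its column position is kept.
    return df.drop(columns=list(drop)).assign(**{fill: first_valid})


# Small deterministic frame (assumed data, not the question's random sample)
sample = pd.DataFrame({
    'A': [1.0, 2.0, 3.0, 4.0],
    'B': [10.0, np.nan, np.nan, np.nan],
    'C': [20.0, 21.0, np.nan, np.nan],
    'D': [4.0, 5.0, 6.0, np.nan],
    'E': [30.0, 31.0, 32.0, np.nan],
})

result = coalesce_columns(sample, fill='B', drop=['C', 'E'])
reversed_order = coalesce_columns(sample, fill='B', drop=['E', 'C'])
print(result)
```

Because the search order comes from the `drop` list, passing `['E', 'C']` instead of `['C', 'E']` makes E take priority over C wherever both hold a value. Rows in which every candidate column is NaN stay NaN, matching row 2 of the expected result in the question.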

Regarding "python - DataFrame: take the union of columns and keep the first non-NaN value found", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55670952/
