
python - Pandas long to wide for a categorical dataframe

Reposted. Author: 行者123. Updated: 2023-12-05 02:38:57

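Usually, when we want to reshape a dataframe from long to wide in Pandas, we use pivot, pivot_table, unstack, or groupby. That works well when the values can be aggregated. How can we reshape a categorical dataframe in the same way?

Example: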

import numpy as np
import pandas as pd

d = {'Fruit': ['Apple', 'Apple', 'Apple', 'Kiwi'],
     'Color1': ['Red', 'Yellow', 'Red', 'Green'],
     'Color2': ['Red', 'Red', 'Green', 'Brown'],
     'Color3': [np.nan, np.nan, 'Red', np.nan]}

df = pd.DataFrame(d)

   Fruit  Color1 Color2 Color3
0  Apple     Red    Red    NaN
1  Apple  Yellow    Red    NaN
2  Apple     Red  Green    Red
3   Kiwi   Green  Brown    NaN

It should become this:

d = {'Fruit':['Apple','Kiwi'], 
'Color1':['Red','Green'],
'Color1_1':['Yellow',np.nan],
'Color1_2':['Red',np.nan],
'Color2':['Red', 'Brown'],
'Color2_1':['Red',np.nan],
'Color2_2':['Green',np.nan],
'Color3':[np.nan,np.nan],
'Color3_1':[np.nan,np.nan],
'Color3_2':['Red',np.nan]
}

pd.DataFrame(d)

   Fruit Color1 Color1_1 Color1_2 Color2 Color2_1 Color2_2 Color3 Color3_1 Color3_2
0  Apple    Red   Yellow      Red    Red      Red    Green    NaN      NaN      Red
1   Kiwi  Green      NaN      NaN  Brown      NaN      NaN    NaN      NaN      NaN

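Best answer

Try groupby with cumcount to number the rows within each Fruit, then pivot with that counter as the columns, and finally flatten the resulting column names: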

df = df.assign(idx=df.groupby('Fruit').cumcount()).pivot(index='Fruit',columns='idx')
print(df.set_axis([f'{x}_{y}' if y != 0 else x for x, y in df.columns], axis=1).reset_index())

Output:

   Fruit Color1 Color1_1 Color1_2 Color2 Color2_1 Color2_2 Color3 Color3_1 Color3_2
0  Apple    Red   Yellow      Red    Red      Red    Green    NaN      NaN      Red
1   Kiwi  Green      NaN      NaN  Brown      NaN      NaN    NaN      NaN      NaN

This matches your expected output exactly.
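
For readers who want to see the intermediate stages, here is a minimal sketch of the same cumcount and pivot idea, split into separate steps. The helper name long_to_wide and its key parameter are hypothetical and not part of the original answer. The sketch assumes a pandas version in which DataFrame.pivot accepts index and columns as keyword arguments.

```python
import numpy as np
import pandas as pd


def long_to_wide(df, key):
    """Spread repeated rows of `key` into numbered columns (hypothetical helper)."""
    # Step 1: number the rows inside each group: 0, 1, 2, ...
    numbered = df.assign(idx=df.groupby(key).cumcount())
    # Step 2: pivot so that every (column, idx) pair becomes its own column.
    # No aggregation is needed because each (key, idx) pair is unique.
    wide = numbered.pivot(index=key, columns="idx")
    # Step 3: flatten the two-level column labels; idx 0 keeps the bare name.
    wide.columns = [f"{col}_{i}" if i != 0 else col for col, i in wide.columns]
    return wide.reset_index()


df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Apple", "Kiwi"],
    "Color1": ["Red", "Yellow", "Red", "Green"],
    "Color2": ["Red", "Red", "Green", "Brown"],
    "Color3": [np.nan, np.nan, "Red", np.nan],
})

wide = long_to_wide(df, "Fruit")
print(wide)
```

The counter from cumcount is what makes the pivot possible: it gives every row a unique (Fruit, idx) position, so the categorical values can be placed side by side without an aggregation function.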

Regarding python - Pandas long to wide for a categorical dataframe, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69340680/
