python - How to aggregate rows after one-hot encoding

Reposted by: 行者123  Updated: 2023-12-01 00:06:33

How do I aggregate the results after applying one-hot encoding? Here is my sample data:

import pandas as pd

df = pd.DataFrame([
    ['apple','sweet'],
    ['apple','affordable'],
    ['apple','fruit'],
    ['orange','fruit'],
    ['orange','soup'],
    ['orange','cheap'],
    ['orange','sweet'],
    ['soda','sweet'],
    ['soda','cheap'],
    ['soda','softdrinks']
    ])

df = df.rename(columns={0: "productName", 1: "itemFeatures"})

This is what I have tried so far:

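# One-hot encode the feature column (one indicator column per feature)
df_ohe = pd.get_dummies(df['itemFeatures'])
# Attach the indicator columns to the original rows
df_ohe_merged = pd.concat([df, df_ohe], axis='columns')
# Drop the original text column; each product still spans several rows
df_final = df_ohe_merged.drop(['itemFeatures'], axis='columns')

How can I get the desired output shown below? Or is there a better way to do this?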

desired_output = pd.DataFrame([
    ['apple',1,0,0,1,0,0,1],
    ['orange',0,1,0,1,0,1,1],
    ['soda',0,0,1,0,1,0,1]
])
desired_output = desired_output.rename(columns={0: "productName",
                                                1: "affordable",
                                                2: "cheap",
                                                3: "famous",
                                                4: "fruit",
                                                5: "softdrinks",
                                                6: "sour",
                                                7: "sweet",
                                               })

Thank you very much.

Best answer

Use pd.crosstab:

new_df = pd.crosstab(df['productName'],df['itemFeatures'],colnames = [None]).reset_index()
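
Note that pd.crosstab counts occurrences, so a (product, feature) pair that appears more than once yields a value above 1. A minimal sketch of capping the counts at 1 to get binary indicators, assuming hypothetical data with a repeated pair (the `dup`, `counts`, and `flags` names are illustrative and not part of the original answer):

```python
import pandas as pd

# Hypothetical data: ('apple', 'sweet') appears twice.
dup = pd.DataFrame({
    "productName": ["apple", "apple", "soda"],
    "itemFeatures": ["sweet", "sweet", "cheap"],
})

# crosstab returns frequency counts per (product, feature) pair.
counts = pd.crosstab(dup["productName"], dup["itemFeatures"])

# Cap every count at 1 so the table holds 0/1 indicators only.
flags = counts.clip(upper=1)
print(flags)
```

With the sample data from the question every pair is unique, so this step changes nothing there.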

Another approach is DataFrame.pivot_table:

new_df = (df.pivot_table(index='productName',
                         columns='itemFeatures',
                         aggfunc='size',
                         fill_value=0)
            .reset_index()
            .rename_axis(columns=None))
print(new_df)

  productName  affordable  cheap  fruit  softdrinks  soup  sweet
0       apple           1      0      1           0     0      1
1      orange           0      1      1           0     1      1
2        soda           0      1      0           1     0      1
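
The pd.get_dummies attempt from the question can also be completed by grouping the indicator rows by product. The following is a minimal sketch rather than part of the original answer. It assumes a feature should be flagged 1 when it appears at least once for a product, and it passes dtype=int (an addition here) so the indicators are integers instead of booleans:

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["apple", "sweet"], ["apple", "affordable"], ["apple", "fruit"],
        ["orange", "fruit"], ["orange", "soup"], ["orange", "cheap"],
        ["orange", "sweet"],
        ["soda", "sweet"], ["soda", "cheap"], ["soda", "softdrinks"],
    ],
    columns=["productName", "itemFeatures"],
)

# One indicator column per distinct feature, one row per original row.
df_ohe = pd.get_dummies(df["itemFeatures"], dtype=int)

# Collapse to one row per product: max() is 1 if any row had the feature.
new_df = df_ohe.groupby(df["productName"]).max().reset_index()
print(new_df)
```

Using sum() instead of max() would give occurrence counts, which matches what crosstab and pivot_table with aggfunc='size' return.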

Regarding python - How to aggregate rows after one-hot encoding, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59937072/
