
pandas - Get dummies across all columns

Reposted. Author: 行者123. Updated: 2023-12-04 10:56:37

The get_dummies method does not seem to work as expected when applied to more than one column.
For example, if I have this DataFrame...

import pandas as pd

# Each inner list is one basket; shorter baskets are padded with None.
shopping_list = [
    ["Apple", "Bread", "Fridge"],
    ["Rice", "Bread", "Milk"],
    ["Apple", "Rice", "Bread", "Milk"],
    ["Rice", "Milk"],
    ["Apple", "Bread", "Milk"],
]

df = pd.DataFrame(shopping_list)

If I use the get_dummies method, the items are repeated across columns, once for each position they appear in, like this:
pd.get_dummies(df)

   0_Apple  0_Rice  1_Bread  1_Milk  1_Rice  2_Bread  2_Fridge  2_Milk  3_Milk
0        1       0        1       0       0        0         1       0       0
1        0       1        1       0       0        0         0       1       0
2        1       0        0       0       1        1         0       0       1
3        0       1        0       1       0        0         0       0       0
4        1       0        1       0       0        0         0       1       0

whereas the expected result is:
   Apple  Bread  Fridge  Milk  Rice
0      1      1       1     0     0
1      0      1       0     1     1
2      1      1       0     1     1
3      0      0       0     1     1
4      1      1       0     1     0

Best answer

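Pass empty prefix and prefix_sep parameters to get_dummies so that the same item gets the same column name regardless of position, then call max to collapse the duplicated column names (duplicates are aggregated by max):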

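# Note: the level argument of DataFrame.max was deprecated in pandas 1.3
# and removed in pandas 2.0, so this line needs an older pandas release.
df = pd.get_dummies(df, prefix='', prefix_sep='').max(axis=1, level=0)
print(df)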

   Apple  Rice  Bread  Milk  Fridge
0      1     0      1     0       1
1      0     1      1     1       0
2      1     1      1     1       0
3      0     1      0     1       0
4      1     0      1     1       0
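
On pandas 2.0 and later, max no longer accepts the level argument, so the snippet above is expected to fail there. Below is a minimal sketch of an equivalent for recent pandas versions, not part of the original answer: transpose the dummies, group the rows that share a label, take the max, and transpose back. The names dummies and result and the dtype=int argument (used to get 0/1 rather than booleans) are choices made for this illustration. Because groupby sorts the labels by default, the columns come out alphabetically, which matches the expected result in the question rather than the order printed above.

```python
import pandas as pd

shopping_list = [
    ["Apple", "Bread", "Fridge"],
    ["Rice", "Bread", "Milk"],
    ["Apple", "Rice", "Bread", "Milk"],
    ["Rice", "Milk"],
    ["Apple", "Bread", "Milk"],
]
df = pd.DataFrame(shopping_list)

# One-hot encode every column with an empty prefix, so the same item in
# different positions produces identically named columns.
dummies = pd.get_dummies(df, prefix="", prefix_sep="", dtype=int)

# Collapse the duplicated column names: after transposing, the duplicates
# are index labels, so groupby(level=0).max() merges them; transpose back.
result = dummies.T.groupby(level=0).max().T

print(result)
```

Taking the max works here because each dummy column holds only 0 or 1, so the merged column is 1 whenever the item appears in any position of the row.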

Regarding "pandas - Get dummies across all columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59133202/
