
python - Pandas: replicate rows based on column values

Repost. Author: 太空宇宙. Updated: 2023-11-04 08:26:31

Given the following DataFrame:

import pandas as pd

# One row per customer: ID, whether an order was made, its type, and one Yes/No flag per category
data = [[1, 'Yes', 'A', 'No', 'Yes', 'No', 'No'],
        [2, 'Yes', 'A', 'No', 'No', 'Yes', 'No'],
        [3, 'Yes', 'B', 'No', 'No', 'Yes', 'No'],
        [4, 'No', '', '', '', '', ''],
        [5, 'No', '', '', '', '', ''],
        [6, 'Yes', 'C', 'No', 'No', 'Yes', 'Yes'],
        [7, 'Yes', 'A', 'No', 'Yes', 'No', 'No'],
        [8, 'Yes', 'A', 'No', 'No', 'Yes', 'No'],
        [9, 'No', '', '', '', '', ''],
        [10, 'Yes', 'B', 'Yes', 'Yes', 'No', 'No']]
df = pd.DataFrame(data, columns=['Cust_ID', 'OrderMade', 'OrderType', 'OrderCategoryA', 'OrderCategoryB', 'OrderCategoryC', 'OrderCategoryD'])


+----+-----------+-------------+-------------+------------------+------------------+------------------+------------------+
| | Cust_ID | OrderMade | OrderType | OrderCategoryA | OrderCategoryB | OrderCategoryC | OrderCategoryD |
|----+-----------+-------------+-------------+------------------+------------------+------------------+------------------|
| 0 | 1 | Yes | A | No | Yes | No | No |
| 1 | 2 | Yes | A | No | No | Yes | No |
| 2 | 3 | Yes | B | No | No | Yes | No |
| 3 | 4 | No | | | | | |
| 4 | 5 | No | | | | | |
| 5 | 6 | Yes | C | No | No | Yes | Yes |
| 6 | 7 | Yes | A | No | Yes | No | No |
| 7 | 8 | Yes | A | No | No | Yes | No |
| 8 | 9 | No | | | | | |
| 9 | 10 | Yes | B | Yes | Yes | No | No |
+----+-----------+-------------+-------------+------------------+------------------+------------------+------------------+

How can I transform this into rows based on OrderCategory?

+--------+-----------+----------+----------------+
|Cust_ID | OrderMade |OrderType | OrderCategory |
|--------+-----------+----------+----------------|
|1 | Yes | A | OrderCategoryB |
|2 | Yes | A | OrderCategoryC |
|3 | Yes | B | OrderCategoryC |
|4 | No | | |
|5 | No | | |
|6 | Yes | C | OrderCategoryC |
|6 | Yes | C | OrderCategoryD |
|7 | Yes | A | OrderCategoryB |
|8 | Yes | A | OrderCategoryC |
|9 | No | | |
|10 | Yes | B | OrderCategoryA |
|10 | Yes | B | OrderCategoryB |
+--------+-----------+----------+----------------+

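I tried using crosstab, starting with a single OrderCategory and planning to repeat it for each category, but this seems inefficient, and I'm not sure how to get from there to the result I want...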

imgCROSS = pd.crosstab(df["Cust_ID"], df["OrderCategoryA"])

This returns...

OrderCategoryA      No  Yes
Cust_ID
1                0   1    0
2                0   1    0
3                0   1    0
4                1   0    0
5                1   0    0
6                0   1    0
7                0   1    0
8                0   1    0
9                1   0    0
10               0   0    1

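I also thought I could add a new empty column named Category and iterate over each row, filling in the appropriate category based on the Yes/No values, but this won't work for rows with multiple categories. Besides, the following implementation of this idea returned an empty column.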

df["Category"] = ""
for index, row in df.iterrows():
    catA = row["OrderCategoryA"]
    catB = row["OrderCategoryB"]
    catC = row["OrderCategoryC"]
    catD = row["OrderCategoryD"]

    if catA == "Yes":
        row["Category"] = "OrderCategoryA"
    elif catB == "Yes":
        row["Category"] = "OrderCategoryB"
    elif catC == "Yes":
        row["Category"] = "OrderCategoryC"
    elif catD == "Yes":
        row["Category"] = "OrderCategoryD"

I know I need to reshape the DataFrame, possibly several times, to get the result I want. I'm just stuck on how to proceed.
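
A note on why the loop above leaves the column empty: iterrows() yields each row as a separate Series, so assigning to row["Category"] changes only that temporary object and never reaches the DataFrame. The sketch below uses a small made-up two-row frame (not the question's data) to show this, then writes through df.at instead. Like the if/elif chain, it still records only the first matching category, so it does not solve the multi-category case.

```python
import pandas as pd

# Hypothetical two-row frame with the same Yes/No layout as the question's data.
df = pd.DataFrame({
    "Cust_ID": [1, 6],
    "OrderCategoryA": ["No", "No"],
    "OrderCategoryB": ["Yes", "No"],
    "OrderCategoryC": ["No", "Yes"],
    "OrderCategoryD": ["No", "Yes"],
})
category_cols = ["OrderCategoryA", "OrderCategoryB", "OrderCategoryC", "OrderCategoryD"]

# Attempt 1: assign to the row object handed out by iterrows().
df["Category"] = ""
for index, row in df.iterrows():
    row["Category"] = "changed"  # modifies the temporary Series only
after_row_assignment = df["Category"].tolist()  # the DataFrame is untouched

# Attempt 2: write into the DataFrame itself via its index label.
for index, row in df.iterrows():
    for col in category_cols:
        if row[col] == "Yes":
            df.at[index, "Category"] = col
            break  # mirrors if/elif: only the first "Yes" is recorded

print(after_row_assignment)
print(df[["Cust_ID", "Category"]])
```

Customer 6 has both OrderCategoryC and OrderCategoryD set to Yes, yet only OrderCategoryC is stored, which is why a reshape is a better fit than a per-row loop here.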

Best answer

Let's do this in four steps with pandas:

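# 1. Index by the identifying columns so only the category flags remain as columns.
df_1 = df.set_index(['Cust_ID', 'OrderMade', 'OrderType'])

# 2. Keep "Yes" and empty cells, then stack the category columns into rows (dropna() guards pandas versions whose stack() keeps NaN).
df_2 = df_1.where((df_1 == "Yes") | (df_1 == "")).rename_axis('OrderCategory', axis=1).stack().dropna().reset_index()

# 3. Blank out the category name for customers who made no order.
df_2['OrderCategory'] = df_2['OrderCategory'].mask(df_2['OrderMade'] == 'No', '')

# 4. Collapse the repeated "No" rows and drop the leftover value column (named 0).
df_2.drop_duplicates().drop(0, axis=1)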

Output:

    Cust_ID OrderMade OrderType   OrderCategory
0         1       Yes         A  OrderCategoryB
1         2       Yes         A  OrderCategoryC
2         3       Yes         B  OrderCategoryC
3         4        No
8         5        No
13        6       Yes         C  OrderCategoryC
14        6       Yes         C  OrderCategoryD
15        7       Yes         A  OrderCategoryB
16        8       Yes         A  OrderCategoryC
17        9        No
22       10       Yes         B  OrderCategoryA
23       10       Yes         B  OrderCategoryB
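
As an alternative to the stack-based steps above, the same reshaping can be sketched with melt. This is not the answer's method, only a minimal sketch under the assumption that each data row carries exactly the seven values shown in the question's table; the names id_cols, long_df, ordered, no_order and result are illustrative.

```python
import pandas as pd

# Assumed data: seven values per row, matching the question's printed table.
data = [[1, 'Yes', 'A', 'No', 'Yes', 'No', 'No'],
        [2, 'Yes', 'A', 'No', 'No', 'Yes', 'No'],
        [3, 'Yes', 'B', 'No', 'No', 'Yes', 'No'],
        [4, 'No', '', '', '', '', ''],
        [5, 'No', '', '', '', '', ''],
        [6, 'Yes', 'C', 'No', 'No', 'Yes', 'Yes'],
        [7, 'Yes', 'A', 'No', 'Yes', 'No', 'No'],
        [8, 'Yes', 'A', 'No', 'No', 'Yes', 'No'],
        [9, 'No', '', '', '', '', ''],
        [10, 'Yes', 'B', 'Yes', 'Yes', 'No', 'No']]
df = pd.DataFrame(data, columns=['Cust_ID', 'OrderMade', 'OrderType',
                                 'OrderCategoryA', 'OrderCategoryB',
                                 'OrderCategoryC', 'OrderCategoryD'])

id_cols = ['Cust_ID', 'OrderMade', 'OrderType']

# Long format: one row per (customer, category column) pair.
long_df = df.melt(id_vars=id_cols, var_name='OrderCategory', value_name='flag')

# Keep the pairs flagged "Yes"; the column name becomes the category value.
ordered = long_df.loc[long_df['flag'] == 'Yes', id_cols + ['OrderCategory']]

# Customers without an order get a single row with an empty category.
no_order = df.loc[df['OrderMade'] == 'No', id_cols].assign(OrderCategory='')

# Combine both parts and restore customer order.
result = (pd.concat([ordered, no_order])
            .sort_values(['Cust_ID', 'OrderCategory'])
            .reset_index(drop=True))
print(result)
```

A customer with OrderMade == 'Yes' but no category flagged Yes would not appear in result; the sample data contains no such row, so that case would need separate handling.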

Regarding python - Pandas: replicate rows based on column values, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56584075/
