
python - Crosstab on multiple columns

Reposted. Author: 行者123. Updated: 2023-12-03 23:31:09

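I have a dataframe with names, days, and locations. For each name-day-location triple, I want to know what proportion of the rows with that name and day have that location.

In code, I start with df and am looking for expected.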

import pandas as pd

df = pd.DataFrame(
[
{"name": "Alice", "day": "friday", "location": "left"},
{"name": "Alice", "day": "friday", "location": "right"},
{"name": "Bob", "day": "monday", "location": "left"},
]
)

print(df)



expected = pd.DataFrame(
[
{"name": "Alice", "day": "friday", "location": "left", "row_percent": 50.0},
{"name": "Alice", "day": "friday", "location": "right", "row_percent": 50.0},
{"name": "Bob", "day": "monday", "location": "left", "row_percent": 100.0},
]
).set_index(['name', 'day', ])
print(expected)

This prints:
In [13]: df
Out[13]:
      day location   name
0  friday     left  Alice
1  friday    right  Alice
2  monday     left    Bob




In [12]: expected
Out[12]:
             location  row_percent
name  day
Alice friday     left         50.0
      friday    right         50.0
Bob   monday     left        100.0

Best answer

Use groupby with value_counts:

df.groupby(['name', 'day']).location.value_counts(normalize=True).mul(100)
name   day     location
Alice  friday  left         50.0
               right        50.0
Bob    monday  left        100.0
Name: location, dtype: float64
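
For readers who want to run this step end to end, here is a minimal self-contained sketch of the groupby plus value_counts call. The variable name pct is an illustrative choice and does not appear in the original answer. The name of the resulting Series may differ between pandas versions (the output above shows location), so the printed Name: line should be treated as version-dependent.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left"},
        {"name": "Alice", "day": "friday", "location": "right"},
        {"name": "Bob", "day": "monday", "location": "left"},
    ]
)

# Within each (name, day) group, compute the share of rows per location.
# normalize=True returns fractions that sum to 1 per group; mul(100)
# converts them to percentages.
pct = (
    df.groupby(["name", "day"])["location"]
    .value_counts(normalize=True)
    .mul(100)
)

print(pct)
```

The result is a Series indexed by name, day, and location, so a single percentage can be looked up with a full key such as pct.loc[("Bob", "monday", "left")].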

A little more cleanup to get your desired output:
out = (df.groupby(['name', 'day']).location.value_counts(normalize=True).mul(100)
.rename('row_percent').reset_index(2))
             location  row_percent
name  day
Alice friday     left         50.0
      friday    right         50.0
Bob   monday     left        100.0
out == expected
              location  row_percent
name  day
Alice friday      True         True
      friday      True         True
Bob   monday      True         True
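
As an alternative sketch that is not part of the original answer, the same percentages can be computed from explicit counts: count the rows for each name-day-location triple, then divide by the total for the corresponding name-day pair. The names counts, totals, and out_alt are illustrative. This variant sets the column name explicitly, so it does not depend on how value_counts names its result.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left"},
        {"name": "Alice", "day": "friday", "location": "right"},
        {"name": "Bob", "day": "monday", "location": "left"},
    ]
)

# Number of rows for each (name, day, location) triple.
counts = df.groupby(["name", "day", "location"]).size()

# Total number of rows for each (name, day) pair, broadcast back onto
# the same three-level index so the two Series align element by element.
totals = counts.groupby(level=["name", "day"]).transform("sum")

# Percentage of each location within its (name, day) group, with
# location moved out of the index into a regular column.
out_alt = (counts / totals * 100).rename("row_percent").reset_index("location")

print(out_alt)
```

The resulting frame is indexed by name and day with location and row_percent as columns, which is the same layout as expected in the question.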

About python - crosstab on multiple columns: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53148069/
