python - Pandas: counting overlap with pivot_table or crosstab

Reposted. Author: 行者123. Updated: 2023-11-28 19:14:21
I'm trying to count overlaps in some data in a DataFrame. Here is a simple example:

df=pd.DataFrame({
'player':['A', 'B', 'C', 'D', 'A', 'C', 'B'],
'game':['gameA', 'gameB', 'gameC', 'gameC', 'gameB', 'gameD', 'gameA']})

df:

    game player
0 gameA A
1 gameB B
2 gameC C
3 gameC D
4 gameB A
5 gameD C
6 gameA B

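What I want to do is count, for each combination of two games, the number of players who played in both.

For example, the result should look like this: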

   game1 game2   overlap
gameA gameB 2 # Because there are 2 players who played in both gameA and gameB
gameA gameC 0
gameA gameD 0
gameB gameA 2
gameB gameC 0
gameB gameD 0
...

I can do this with a dictionary and a for-each loop, but is there a simple way to do it with pivot_table or crosstab?

Thanks a lot.

Best answer

You can use pd.merge to build game_table by joining df to itself on player:

game_table = pd.merge(df, df, how='left', on=['player'])
# game_x player game_y
# 0 gameA A gameA
# 1 gameA A gameB
# 2 gameB B gameB
# 3 gameB B gameA
# 4 gameC C gameC
# 5 gameC C gameD
# 6 gameC D gameC
# 7 gameB A gameA
# 8 gameB A gameB
# 9 gameD C gameC
# 10 gameD C gameD
# 11 gameA B gameB
# 12 gameA B gameA

Then apply pd.crosstab to game_table:

freq = pd.crosstab(game_table['game_x'], game_table['game_y'])
# game_y gameA gameB gameC gameD
# game_x
# gameA 2 2 0 0
# gameB 2 2 0 0
# gameC 0 0 2 1
# gameD 0 0 1 1
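
Since the question also mentions pivot_table, here is a minimal sketch of the same count done that way, assuming the same df as above. The name freq_pivot is only illustrative; counting the player column with fill_value=0 is one way to get the table, not the only one.

```python
import pandas as pd

df = pd.DataFrame({
    'player': ['A', 'B', 'C', 'D', 'A', 'C', 'B'],
    'game': ['gameA', 'gameB', 'gameC', 'gameC', 'gameB', 'gameD', 'gameA']})

# Self-merge on player: one row per (game_x, game_y) pair linked by a player.
game_table = pd.merge(df, df, how='left', on='player')

# Count the rows for each (game_x, game_y) pair.
# fill_value=0 turns pairs with no shared player into 0 instead of NaN.
freq_pivot = game_table.pivot_table(
    index='game_x', columns='game_y', values='player',
    aggfunc='count', fill_value=0)

# The crosstab from the answer, kept here for comparison.
freq = pd.crosstab(game_table['game_x'], game_table['game_y'])

print(freq_pivot)
```

For this data both tables hold the same counts, so the choice between crosstab and pivot_table is mostly a matter of taste.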

stack followed by reset_index reshapes the DataFrame into the desired form:

result = freq.stack().reset_index()

Putting it all together, with the self-pairs (game_x equal to game_y) filtered out:

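import pandas as pd

df = pd.DataFrame(
    {'player': ['A', 'B', 'C', 'D', 'A', 'C', 'B'],
     'game': ['gameA', 'gameB', 'gameC', 'gameC', 'gameB', 'gameD', 'gameA']})

# Pair each game a player appears in with every game of that same player
# (including the game itself).
game_table = pd.merge(df, df, how='left', on=['player'])
# Count how many rows link each (game_x, game_y) pair.
freq = pd.crosstab(game_table['game_x'], game_table['game_y'])
# Reshape the square table into long form with a named count column.
result = freq.stack()
result.name = 'overlap'
result = result.reset_index()
# Drop the diagonal, where a game is paired with itself.
mask = (result['game_x'] != result['game_y'])
result = result.loc[mask]
print(result)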

which yields

   game_x game_y  overlap
1 gameA gameB 2 # Because both A and B played in gameA and gameB
2 gameA gameC 0
3 gameA gameD 0
4 gameB gameA 2
6 gameB gameC 0
7 gameB gameD 0
8 gameC gameA 0
9 gameC gameB 0
11 gameC gameD 1
12 gameD gameA 0
13 gameD gameB 0
14 gameD gameC 1
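
The result above lists each pair twice (gameA/gameB and gameB/gameA). As a hedged alternative, not part of the original answer, the same counts can be sketched as a matrix product of a player-by-game indicator table, keeping each unordered pair once. The names indicator, overlap and pairs are illustrative, game1 and game2 are borrowed from the question's desired output, and clipping to 1 is an assumption that a player listed twice in the same game should count once.

```python
import pandas as pd

df = pd.DataFrame({
    'player': ['A', 'B', 'C', 'D', 'A', 'C', 'B'],
    'game': ['gameA', 'gameB', 'gameC', 'gameC', 'gameB', 'gameD', 'gameA']})

# Players-by-games table of 0/1 flags.
# clip(upper=1) keeps a duplicated (player, game) row from counting twice.
indicator = pd.crosstab(df['player'], df['game']).clip(upper=1)

# (games x players) dot (players x games):
# entry [g1, g2] is the number of players who appear in both games.
overlap = indicator.T.dot(indicator)
overlap = overlap.rename_axis(index='game1', columns='game2')

# Long form, keeping each unordered pair once and dropping the diagonal.
pairs = overlap.stack().rename('overlap').reset_index()
pairs = pairs[pairs['game1'] < pairs['game2']]

print(pairs)
```

The string comparison game1 < game2 is just a convenient way to keep one row per pair; it relies on the game labels being sortable.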

Regarding python - Pandas: counting overlap with pivot_table or crosstab, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/35506313/
