python - How to combine and form a complex dataframe in Pandas

Reposted. Author: 太空宇宙. Updated: 2023-11-04 05:53:30

I have a dataframe called df in the following format:

         match_up  result
0  1985_1116_1234       1
1  1985_1120_1345       1
2  1985_1207_1250       1
3  1985_1229_1425       1

I have another dataframe called df1:

   team  win percentage  sum_of_last_six  seed_frequency
0  1116           0.700                5               7
1  1234           0.667                3              10
2  1120           0.636                4               9
3  1207           0.615                2              11
4  1229           0.345                2               3
5  1345           0.621                5              11
6  1425           0.572                1               2
7  1250           0.968                4              12

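I need to form df2 so that it contains the rows for all the left-hand values in the match_up column of df (the value that comes right after 1985_), i.e. 1116, 1120, 1207, 1229. df3 should contain the rows for the values on the right-hand side of the match_up column.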

   team_df2  win_df2  sum_df2  seed_df2
0      1116    0.700        5         7
1      1120    0.636        4         9
2      1207    0.615        2        11
3      1229    0.345        2         3

   team_df3  win_df3  sum_df3  seed_df3
1      1234    0.667        3        10
5      1345    0.621        5        11
7      1250    0.968        4        12
6      1425    0.572        1         2

Finally, I need a new dataframe that combines the three dataframes (df, df2, df3).

I need to form this new dataframe, called combi, in the following format:

         match_up  result  team_df2  win_df2  sum_df2  seed_df2
0  1985_1116_1234       1      1116    0.700        5         7
1  1985_1120_1345       1      1120    0.636        4         9
2  1985_1207_1250       1      1207    0.615        2        11
3  1985_1229_1425       1      1229    0.345        2         3

team_df3  win_df3  sum_df3  seed_df3
    1234    0.667        3        10
    1345    0.621        5        11
    1250    0.968        4        12
    1425    0.572        1         2

How can I do this in pandas?

Best answer

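You can call the vectorized str methods on the 'match_up' column to split the strings, map the pieces to int, and build lists that we can then use to filter the second df to create df2 and df3: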

In [90]:
left = list(map(int, df['match_up'].str.split('_').str[1]))
right = list(map(int, df['match_up'].str.split('_').str[2]))
print(left)
right

[1116, 1120, 1207, 1229]
Out[90]:
[1234, 1345, 1250, 1425]
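The split step can also be done once for the whole column. The following is a minimal sketch, assuming df looks like the frame in the question; the column names season, left and right are made up here for illustration and do not appear in the original.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "match_up": ["1985_1116_1234", "1985_1120_1345",
                 "1985_1207_1250", "1985_1229_1425"],
    "result": [1, 1, 1, 1],
})

# expand=True returns one column per piece instead of a column of lists.
parts = df["match_up"].str.split("_", expand=True)
parts.columns = ["season", "left", "right"]  # hypothetical names
parts = parts.astype(int)                    # pieces are strings until cast

left = parts["left"].tolist()
right = parts["right"].tolist()
print(left)
print(right)
```

Splitting once avoids running str.split twice on the same column, and the intermediate frame keeps the same row index as df.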
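In [91]:
# In this session df1 was parsed with the team IDs under 'win' and the row
# number under 'team' (see the header of Out[91]); with the question's layout
# the filter would go on df1['team'] instead.
df2 = df1[df1.win.isin(left)]
df2

Out[91]:
   team   win  percentage  sum_of_last_six  seed_frequency
0     0  1116       0.700                5               7
2     2  1120       0.636                4               9
3     3  1207       0.615                2              11
4     4  1229       0.345                2               3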
In [92]:
df3 = df1[df1.win.isin(right)]
df3

Out[92]:
   team   win  percentage  sum_of_last_six  seed_frequency
1     1  1234       0.667                3              10
5     5  1345       0.621                5              11
6     6  1425       0.572                1               2
7     7  1250       0.968                4              12
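Note that isin keeps the rows in df1's own order, which is why Out[92] lists 1425 before 1250 even though the matches have 1250 first. If the rows should follow the match order, one option is to look the teams up by label. This is only a sketch, assuming df1 has the layout shown in the question (team IDs in a 'team' column); the name by_team is introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question (assumed layout: IDs under 'team').
df1 = pd.DataFrame({
    "team": [1116, 1234, 1120, 1207, 1229, 1345, 1425, 1250],
    "win percentage": [0.700, 0.667, 0.636, 0.615, 0.345, 0.621, 0.572, 0.968],
    "sum_of_last_six": [5, 3, 4, 2, 2, 5, 1, 4],
    "seed_frequency": [7, 10, 9, 11, 3, 11, 2, 12],
})
left = [1116, 1120, 1207, 1229]
right = [1234, 1345, 1250, 1425]

# Index by team ID so that .loc returns rows in the order of the given list.
by_team = df1.set_index("team")
df2 = by_team.loc[left].reset_index()   # rows in match order, index 0..3
df3 = by_team.loc[right].reset_index()
print(df3)
```

Selecting with .loc and a list raises a KeyError in recent pandas versions if a team ID is missing from df1, which can be a useful check here. The reset_index call also gives df2 and df3 the same 0..3 index as df.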

If needed, you can rename the columns by calling rename.

To get a combined df with the renamed columns:

In [95]:
df2 = df2.rename(columns={'team':'team_df2', 'win':'win_df2', 'sum_of_last_six':'sum_df2', 'seed_frequency':'seed_df2'})
df3 = df3.rename(columns={'team':'team_df3', 'win':'win_df3', 'sum_of_last_six':'sum_df3', 'seed_frequency':'seed_df3'})
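The two rename calls differ only in their suffix, so the mapping can be generated. The sketch below assumes the column names from the question (including 'win percentage'); SHORT and with_suffix are hypothetical helpers, not part of pandas or of the original answer.

```python
import pandas as pd

# Hypothetical mapping from df1's column names to the short names
# used in the desired output.
SHORT = {
    "team": "team",
    "win percentage": "win",
    "sum_of_last_six": "sum",
    "seed_frequency": "seed",
}

def with_suffix(frame, suffix):
    """Return a copy whose columns are renamed to '<short>_<suffix>'."""
    mapping = {old: f"{short}_{suffix}" for old, short in SHORT.items()}
    return frame.rename(columns=mapping)

sample = pd.DataFrame({
    "team": [1116],
    "win percentage": [0.700],
    "sum_of_last_six": [5],
    "seed_frequency": [7],
})
print(list(with_suffix(sample, "df2").columns))
```

rename ignores mapping keys that are not present in the frame by default, so a misspelled column name passes silently; checking the resulting columns is worthwhile.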
In [101]:
pd.concat([df,df2,df3],axis=1)

Out[101]:
         match_up  result  team_df2  win_df2  percentage  sum_df2  seed_df2  \
0  1985_1116_1234       1         0     1116       0.700        5         7
1  1985_1120_1345       1       NaN      NaN         NaN      NaN       NaN
2  1985_1207_1250       1         2     1120       0.636        4         9
3  1985_1229_1425       1         3     1207       0.615        2        11
4             NaN     NaN         4     1229       0.345        2         3
5             NaN     NaN       NaN      NaN         NaN      NaN       NaN
6             NaN     NaN       NaN      NaN         NaN      NaN       NaN
7             NaN     NaN       NaN      NaN         NaN      NaN       NaN

   team_df3  win_df3  percentage  sum_df3  seed_df3
0       NaN      NaN         NaN      NaN       NaN
1         1     1234       0.667        3        10
2       NaN      NaN         NaN      NaN       NaN
3       NaN      NaN         NaN      NaN       NaN
4       NaN      NaN         NaN      NaN       NaN
5         5     1345       0.621        5        11
6         6     1425       0.572        1         2
7         7     1250       0.968        4        12
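pd.concat with axis=1 aligns rows by index label, and df2 and df3 still carry their original df1 labels, which is where the NaN rows in Out[101] come from. An alternative that does not depend on the index is to join on the team IDs themselves. This is a sketch rather than the answer's method, assuming df and df1 have the layouts shown in the question; the variable names ids and stats are introduced here.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "match_up": ["1985_1116_1234", "1985_1120_1345",
                 "1985_1207_1250", "1985_1229_1425"],
    "result": [1, 1, 1, 1],
})
df1 = pd.DataFrame({
    "team": [1116, 1234, 1120, 1207, 1229, 1345, 1425, 1250],
    "win percentage": [0.700, 0.667, 0.636, 0.615, 0.345, 0.621, 0.572, 0.968],
    "sum_of_last_six": [5, 3, 4, 2, 2, 5, 1, 4],
    "seed_frequency": [7, 10, 9, 11, 3, 11, 2, 12],
})

# Columns 0, 1, 2 hold the season, the left team and the right team.
ids = df["match_up"].str.split("_", expand=True).astype(int)

combi = df.copy()
for piece, suffix in ((1, "df2"), (2, "df3")):
    # Add the team ID as a key column, then pull in that team's stats.
    combi[f"team_{suffix}"] = ids[piece]
    stats = df1.rename(columns={
        "team": f"team_{suffix}",
        "win percentage": f"win_{suffix}",
        "sum_of_last_six": f"sum_{suffix}",
        "seed_frequency": f"seed_{suffix}",
    })
    # how="left" keeps every match row, in the order they appear in df.
    combi = combi.merge(stats, on=f"team_{suffix}", how="left")

print(combi)
```

With how="left", a team ID that is missing from df1 would show up as NaN in that match's stats columns instead of dropping the match.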

Regarding python - How to combine and form a complex dataframe in Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28967654/
