
python - Pandas - Creating a rolling percentage in a DataFrame

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:21:52

I'm new to Pandas. I have a DataFrame of horse race results that looks like this (only much larger):

            Horses RaceDate  Position
    1     RedHorse   1/2/00         2
    2    BlueHorse   1/2/00         6
    3  YellowHorse   1/2/00         7
    4     RedHorse  15/1/00         1
    5     RedHorse   1/1/00         5

For each entry, I would like to work out the horse's win percentage up to the time of that race. Like this:

            Horses RaceDate  Position WinPercentage
    1     RedHorse   1/2/00         2           50%
    2    BlueHorse   1/2/00         6            0%
    3  YellowHorse   1/2/00         7            0%
    4     RedHorse  15/1/00         5          100%
    5     RedHorse   1/1/00         1            0%

How can I do this?

Best answer
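The snippets in this answer operate on a DataFrame named df whose RaceDate column already holds real dates (the printed outputs show 2000-02-01 for 1/2/00). A minimal sketch of building that frame from the question's sample rows follows; the day/month/two-digit-year format string is an assumption inferred from those outputs, not something the question states.

```python
import pandas as pd

# Sample rows copied from the question; the index 1..5 mirrors the printed table.
df = pd.DataFrame(
    {
        "Horses": ["RedHorse", "BlueHorse", "YellowHorse", "RedHorse", "RedHorse"],
        "RaceDate": ["1/2/00", "1/2/00", "1/2/00", "15/1/00", "1/1/00"],
        "Position": [2, 6, 7, 1, 5],
    },
    index=[1, 2, 3, 4, 5],
)

# Assumption: dates are day/month/year, so "1/2/00" means 1 February 2000.
# Parsing to datetime makes sort_values('RaceDate') chronological
# instead of alphabetical.
df["RaceDate"] = pd.to_datetime(df["RaceDate"], format="%d/%m/%y")
```

Without this conversion the dates would sort as strings, which puts 1/2/00 before 15/1/00.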

Wins per horse

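    import numpy as np

    # Assumes df holds the race results with RaceDate already parsed as datetime.
    df2 = df.copy(deep=True)
    df2 = df2.reset_index()            # keep the original row label in an 'index' column
    df2 = df2.sort_values('RaceDate')  # chronological order
    df2['win'] = np.where(df2.Position == 1, 1, 0)  # 1 for a first-place finish, else 0
    df2['win_count'] = df2.groupby(['Horses'])['win'].cumsum()  # running wins per horse
    # Denominator: running sum of win_count per horse (not a count of races run).
    df2['race_count'] = df2.groupby(['Horses'])['win_count'].cumsum()
    df2['WinPercentage'] = df2['win_count'] / df2['race_count'] * 100
    df2 = df2.sort_index()             # restore the original row order
    print(df2)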

Output:

       index       Horses   RaceDate  Position  win  win_count  race_count  WinPercentage
    0      1     RedHorse 2000-02-01         2    0          1           2           50.0
    1      2    BlueHorse 2000-02-01         6    0          0           0            NaN
    2      3  YellowHorse 2000-02-01         7    0          0           0            NaN
    3      4     RedHorse 2000-01-15         1    1          1           1          100.0
    4      5     RedHorse 2000-01-01         5    0          0           0            NaN
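In the code above, race_count is a running sum of win_count, so it only equals the number of races run by coincidence. For comparison, here is a minimal sketch that divides by an actual per-horse race count, using groupby(...).cumcount(). It assumes the question's sample rows with day-first dates. The names prior_races, prior_wins, WinPctBefore and WinPctSoFar are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Horses": ["RedHorse", "BlueHorse", "YellowHorse", "RedHorse", "RedHorse"],
        "RaceDate": ["1/2/00", "1/2/00", "1/2/00", "15/1/00", "1/1/00"],
        "Position": [2, 6, 7, 1, 5],
    },
    index=[1, 2, 3, 4, 5],
)
df["RaceDate"] = pd.to_datetime(df["RaceDate"], format="%d/%m/%y")  # assumed day-first

# Work in chronological order; a stable sort keeps same-day rows in input order.
out = df.sort_values("RaceDate", kind="stable").copy()
out["win"] = (out["Position"] == 1).astype(int)

by_horse = out.groupby("Horses")
out["prior_races"] = by_horse.cumcount()                    # races before this one
out["prior_wins"] = by_horse["win"].cumsum() - out["win"]   # wins before this one

# Win percentage going into the race; NaN when the horse has no earlier race.
out["WinPctBefore"] = out["prior_wins"] / out["prior_races"] * 100
# Win percentage including the current race.
out["WinPctSoFar"] = (out["prior_wins"] + out["win"]) / (out["prior_races"] + 1) * 100

out = out.sort_index()  # back to the original row order
```

With the positions from the question's input table, RedHorse's WinPctBefore is NaN, 0 and 50 in date order, and its WinPctSoFar is 0, 50 and about 33.3. Whether a first-time runner should show NaN or 0 is a separate choice; appending .fillna(0) would give the 0% shown in the question's expected table.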

Wins per row (across all horses)

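    import numpy as np

    # Assumes df holds the race results with RaceDate already parsed as datetime.
    df1 = df.copy(deep=True)
    df1 = df1.reset_index()                       # keep the original row label in 'index'
    df1 = df1.sort_values(['RaceDate', 'index'])  # chronological, ties broken by row label
    df1['win'] = np.where(df1.Position == 1, 1, 0)  # 1 for a first-place finish, else 0
    df1['win'] = df1.win.ffill()                  # no-op here: np.where leaves no missing values
    df1['win_count'] = df1.win.cumsum()           # running wins over all rows, ignoring the horse
    # Denominator: running sum of win_count (not a count of rows seen so far).
    df1['race_count'] = df1.win_count.cumsum()
    df1['WinPercentage'] = df1['win_count'] / df1['race_count'] * 100
    print(df1)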

Output:

       index       Horses   RaceDate  Position  win  win_count  race_count  WinPercentage
    4      5     RedHorse 2000-01-01         5    0          0           0            NaN
    3      4     RedHorse 2000-01-15         1    1          1           1     100.000000
    0      1     RedHorse 2000-02-01         2    0          1           2      50.000000
    1      2    BlueHorse 2000-02-01         6    0          1           3      33.333333
    2      3  YellowHorse 2000-02-01         7    0          1           4      25.000000
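If the goal of this variant is the share of all races so far that ended in a win, regardless of horse, one possible sketch uses an expanding mean of the win flag instead of two cumulative sums. This is an alternative technique, not the answer's code, and its numbers differ from the output above because the denominator is the number of rows seen so far. The sample data and the day-first date format are assumptions, as before.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Horses": ["RedHorse", "BlueHorse", "YellowHorse", "RedHorse", "RedHorse"],
        "RaceDate": ["1/2/00", "1/2/00", "1/2/00", "15/1/00", "1/1/00"],
        "Position": [2, 6, 7, 1, 5],
    },
    index=[1, 2, 3, 4, 5],
)
df["RaceDate"] = pd.to_datetime(df["RaceDate"], format="%d/%m/%y")  # assumed day-first

# Chronological order; the stable sort keeps same-day rows in input order.
ordered = df.sort_values("RaceDate", kind="stable")

# 1.0 for a win, 0.0 otherwise, so the mean is the fraction of wins.
wins = (ordered["Position"] == 1).astype(float)

# Expanding mean: wins so far divided by rows so far, as a percentage.
running = wins.expanding().mean() * 100
```

Here running is indexed by the original row labels, so it can be assigned back with df['OverallWinPct'] = running (a hypothetical column name).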

Joining the two DataFrames

    cols = ['index', 'Horses', 'RaceDate', 'WinPercentage']
    # Outer join: WinPercentage_x comes from df1, WinPercentage_y from df2.
    dfFinal = df1[cols].merge(df2[cols], on=['index', 'Horses', 'RaceDate'], how='outer')
    print(dfFinal)

Output:

       index       Horses   RaceDate  WinPercentage_x  WinPercentage_y
    0      5     RedHorse 2000-01-01              NaN              NaN
    1      4     RedHorse 2000-01-15       100.000000            100.0
    2      1     RedHorse 2000-02-01        50.000000             50.0
    3      2    BlueHorse 2000-02-01        33.333333              NaN
    4      3  YellowHorse 2000-02-01        25.000000              NaN
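The _x and _y suffixes are pandas defaults for overlapping column names and say little about where each column came from. As a small sketch, the suffixes parameter of merge can label them instead. The two frames below are cut-down stand-ins holding the values printed above, and the suffix names _overall and _per_horse are illustrative choices.

```python
import pandas as pd

# Stand-ins for df1 and df2, reduced to the join key and the percentage column.
overall = pd.DataFrame(
    {"index": [5, 4, 1, 2, 3],
     "WinPercentage": [float("nan"), 100.0, 50.0, 100 / 3, 25.0]}
)
per_horse = pd.DataFrame(
    {"index": [1, 2, 3, 4, 5],
     "WinPercentage": [50.0, float("nan"), float("nan"), 100.0, float("nan")]}
)

# suffixes replaces the default ('_x', '_y') on overlapping non-key columns.
merged = overall.merge(
    per_horse,
    on="index",
    how="outer",
    suffixes=("_overall", "_per_horse"),
)
```

The merged frame then carries WinPercentage_overall and WinPercentage_per_horse, which is easier to read once the intermediate frames are gone.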

Regarding python - Pandas - Creating a rolling percentage in a DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54347200/
