
python - Pandas shift - get the previous value if multiple conditions are satisfied

Reposted. Author: 行者123. Updated: 2023-12-01 23:28:09

I actually already asked this question for SQL and got a great answer here: SQL - LAG to get previous value if condition using multiple previous columns satisfied

But now I need it in Pandas. Suppose we have a DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                   'EventName': ['Team A vs Team B',
                                 'Team A vs Team B',
                                 'Team C vs Team D',
                                 'Team Z vs Team A',
                                 'Team A vs Team B',
                                 'Team C vs Team D',
                                 'Team C vs Team D',
                                 'Team E vs Team F'],
                   'HomeTeam': ['Team A', 'Team A', 'Team C', 'Team Z',
                                'Team A', 'Team C', 'Team C', 'Team E'],
                   'Metric': [5, 7, 6, 8, 9, 3, 1, 2]})

This gives:

id  EventName         HomeTeam  Metric
--------------------------------------
1   Team A vs Team B  Team A    5
2   Team A vs Team B  Team A    7
3   Team C vs Team D  Team C    6
4   Team Z vs Team A  Team Z    8
5   Team A vs Team B  Team A    9
6   Team C vs Team D  Team C    3
7   Team C vs Team D  Team C    1
8   Team E vs Team F  Team E    2

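I want to compute a new column PreviousMetricN, where N can be 1, 2, 3, ..., that shows the N-th previous value of Metric, but only counting earlier events in which the HomeTeam took part, on either side. In row 5, for instance, PreviousMetric1 is 8 because Team A played (away) in event 4. For example: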

id  EventName         HomeTeam  Metric  PreviousMetric1  PreviousMetric2
------------------------------------------------------------------------
1   Team A vs Team B  Team A    5       NULL             NULL
2   Team A vs Team B  Team A    7       5                NULL
3   Team C vs Team D  Team C    6       NULL             NULL
4   Team Z vs Team A  Team Z    8       NULL             NULL
5   Team A vs Team B  Team A    9       8                7
6   Team C vs Team D  Team C    3       6                NULL
7   Team C vs Team D  Team C    1       3                6
8   Team E vs Team F  Team E    2       NULL             NULL

I guess this would be easy with a for loop, but I need a vectorized solution, or some combination of shift/groupby/np.where. I'm not even sure where to start.
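For reference, the rule described above can be pinned down with a plain loop. This is only a baseline sketch of the expected semantics, not the vectorized solution being asked for; the helper name `previous_metrics_loop` and its `n_lags` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8],
    'EventName': ['Team A vs Team B', 'Team A vs Team B', 'Team C vs Team D',
                  'Team Z vs Team A', 'Team A vs Team B', 'Team C vs Team D',
                  'Team C vs Team D', 'Team E vs Team F'],
    'HomeTeam': ['Team A', 'Team A', 'Team C', 'Team Z',
                 'Team A', 'Team C', 'Team C', 'Team E'],
    'Metric': [5, 7, 6, 8, 9, 3, 1, 2],
})


def previous_metrics_loop(frame, n_lags=2):
    """Hypothetical helper: loop baseline for PreviousMetric1..n_lags."""
    # team -> Metric values of the events it already took part in, oldest first
    history = defaultdict(list)
    columns = {n: [] for n in range(1, n_lags + 1)}
    rows = zip(frame['EventName'], frame['HomeTeam'], frame['Metric'])
    for event, home, metric in rows:
        past = history[home]
        for n in columns:
            # n-th most recent earlier Metric of the home team, if it exists
            columns[n].append(past[-n] if len(past) >= n else np.nan)
        # Record the current event for both participants only after the lookup
        for team in event.split(' vs '):
            history[team].append(metric)
    out = frame.copy()
    for n, values in columns.items():
        out[f'PreviousMetric{n}'] = values
    return out


baseline = previous_metrics_loop(df, n_lags=2)
print(baseline)
```

On the sample data this reproduces the table above, with NaN in place of NULL. It assumes rows are already in chronological order and that team names are separated by ' vs '.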

Best answer

Using @Alollz's structure:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                   'EventName': ['Team A vs Team B',
                                 'Team A vs Team B',
                                 'Team C vs Team D',
                                 'Team Z vs Team A',
                                 'Team A vs Team B',
                                 'Team C vs Team D',
                                 'Team C vs Team D',
                                 'Team E vs Team F'],
                   'HomeTeam': ['Team A', 'Team A', 'Team C', 'Team Z',
                                'Team A', 'Team C', 'Team C', 'Team E'],
                   'Metric': [5, 7, 6, 8, 9, 3, 1, 2]})

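# Long format: one row per (event, participating team)
dfe = df.assign(teams=df['EventName'].str.split(' vs ')).explode('teams')

shifts = [1, 2, 3]
for i in shifts:
    # i-th previous Metric within each team's events, kept only on the home
    # team's row. drop_duplicates() collapses the repeated NaNs so map() sees
    # each row label once (it also drops repeated non-NaN values).
    mapper = (dfe.groupby('teams')['Metric'].shift(i)
                 .mask(dfe['teams'] != dfe['HomeTeam'])
                 .drop_duplicates())
    df[f'PreviousMetrics{i}'] = df.index.map(mapper)
print(df)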

Output:

   id         EventName HomeTeam  Metric  PreviousMetrics1  PreviousMetrics2  PreviousMetrics3
0   1  Team A vs Team B   Team A       5               NaN               NaN               NaN
1   2  Team A vs Team B   Team A       7               5.0               NaN               NaN
2   3  Team C vs Team D   Team C       6               NaN               NaN               NaN
3   4  Team Z vs Team A   Team Z       8               NaN               NaN               NaN
4   5  Team A vs Team B   Team A       9               8.0               7.0               5.0
5   6  Team C vs Team D   Team C       3               6.0               NaN               NaN
6   7  Team C vs Team D   Team C       1               3.0               6.0               NaN
7   8  Team E vs Team F   Team E       2               NaN               NaN               NaN
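One caveat about the loop in this answer: `drop_duplicates()` removes repeated values, not repeated index labels, so if two rows happened to share the same previous Metric value, the later one would come back as NaN. The following is a minimal sketch of a variant that picks the home-team rows with a boolean filter instead. It assumes each HomeTeam appears exactly once in its EventName; the names `is_home` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8],
    'EventName': ['Team A vs Team B', 'Team A vs Team B', 'Team C vs Team D',
                  'Team Z vs Team A', 'Team A vs Team B', 'Team C vs Team D',
                  'Team C vs Team D', 'Team E vs Team F'],
    'HomeTeam': ['Team A', 'Team A', 'Team C', 'Team Z',
                 'Team A', 'Team C', 'Team C', 'Team E'],
    'Metric': [5, 7, 6, 8, 9, 3, 1, 2],
})

# Long format: one row per (event, participating team); each original row
# label now appears twice, once per team.
dfe = df.assign(teams=df['EventName'].str.split(' vs ')).explode('teams')

# True on the single long-format row of each event that belongs to the home team
is_home = (dfe['teams'] == dfe['HomeTeam']).to_numpy()

result = df.copy()
for i in [1, 2, 3]:
    # i-th previous Metric among all events each team took part in
    shifted = dfe.groupby('teams')['Metric'].shift(i)
    # Home-team rows carry unique labels, so plain column assignment
    # aligns them back onto the original rows.
    result[f'PreviousMetrics{i}'] = shifted[is_home]

print(result)
```

On the sample data this gives the same three columns as the output above. The difference only matters when Metric values repeat, since no value-based deduplication is involved.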

Regarding "python - Pandas shift - get the previous value if multiple conditions are satisfied", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66858486/
