
python - Pandas Groupby calculating ewm not working as expected

Reposted. Author: 行者123. Updated: 2023-12-01 01:35:08

Suppose I have a data frame like the one below:

import pandas as pd

data = {'team': ['team1','team1','team1','team1','team1','team1','team1','team1','team1','team1','team1','team1','team1','team1',
                 'team2','team2','team2','team2','team2','team2','team2','team2','team2','team2','team2','team2','team2','team2',],
        'score': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,1,2,3,4,5,6,7,8,9,10,11,12,13,14],
        'yards': [10,20,30,40,50,60,70,80,90,100,110,120,130,140,10,20,30,40,50,60,70,80,90,100,110,120,130,140]}

df = pd.DataFrame.from_dict(data)

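I am trying to compute the ewm of the 'score' and 'yards' columns using the manual method from this post (Does Pandas calculate ewm wrong?), but I noticed that my span is not applied as expected to each grouped team. This is the code I have so far: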

ema_features = df[['team']].copy()

for feature_name in df[['score','yards']]:
    span = 10
    feature_ema = (df.groupby('team')[feature_name].rolling(window=span, min_periods=span).mean()[:span])
    rest = df[feature_name][span:]
    x = pd.concat([feature_ema, rest]).ewm(span=span, adjust=False).mean()

    ema_features[feature_name] = x

The output is as follows:

ema_features

team score yards
0 team1 NaN NaN
1 team1 NaN NaN
2 team1 NaN NaN
3 team1 NaN NaN
4 team1 NaN NaN
5 team1 NaN NaN
6 team1 NaN NaN
7 team1 NaN NaN
8 team1 NaN NaN
9 team1 NaN NaN
10 team1 6.500000 65.000000
11 team1 7.500000 75.000000
12 team1 8.500000 85.000000
13 team1 9.500000 95.000000
14 team2 7.954545 79.545455
15 team2 6.871901 68.719008
16 team2 6.167919 61.679189
17 team2 5.773752 57.737518
18 team2 5.633070 56.330696
19 team2 5.699784 56.997843
20 team2 5.936187 59.361871
21 team2 6.311426 63.114258
22 team2 6.800257 68.002575
23 team2 7.382029 73.820289
24 team2 8.039842 80.398418
25 team2 8.759871 87.598706
26 team2 9.530803 95.308032
27 team2 10.343384 103.433844

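My question is: how do I get the span to apply to team 2 as well? In the output above, team 2's ewm is calculated together with team 1's. I want each team's ewm to be calculated separately, which means applying the correct span to each team before computing, as in the output I expect below.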

   ema_features

team score yards
0 team1 NaN NaN
1 team1 NaN NaN
2 team1 NaN NaN
3 team1 NaN NaN
4 team1 NaN NaN
5 team1 NaN NaN
6 team1 NaN NaN
7 team1 NaN NaN
8 team1 NaN NaN
9 team1 NaN NaN
10 team1 6.500000 65.000000
11 team1 7.500000 75.000000
12 team1 8.500000 85.000000
13 team1 9.500000 95.000000
14 team2 NaN NaN
15 team2 NaN NaN
16 team2 NaN NaN
17 team2 NaN NaN
18 team2 NaN NaN
19 team2 NaN NaN
20 team2 NaN NaN
21 team2 NaN NaN
22 team2 NaN NaN
23 team2 6.500000 65.000000
24 team2 7.500000 75.000000
25 team2 8.500000 85.000000
26 team2 9.500000 95.000000
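
As a sanity check on the expected numbers, here is a minimal plain-Python sketch of the arithmetic behind them. It assumes the manual method means: seed with the simple mean of a team's first `span` values, then apply the `adjust=False` recursion y_t = (1 - alpha) * y_{t-1} + alpha * x_t with alpha = 2 / (span + 1). The variable names are illustrative only.

```python
# Seed with the simple mean of the first `span` scores, then apply the
# adjust=False recursion y_t = (1 - alpha) * y_{t-1} + alpha * x_t.
span = 10
alpha = 2 / (span + 1)           # alpha that pandas derives from a span

scores = list(range(1, 15))      # one team's 'score' column: 1..14
ema = sum(scores[:span]) / span  # seed: mean of 1..10 = 5.5

expected = []
for x in scores[span:]:          # the values after the seed window: 11..14
    ema = (1 - alpha) * ema + alpha * x
    expected.append(round(ema, 6))

print(expected)  # [6.5, 7.5, 8.5, 9.5]
```

These are the values that should repeat for every team if each team is processed on its own.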

Best answer

You can try using GroupBy.apply with a custom function. Adapting your for loop, try something like this:

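def team_ema(team, span=10):
    # Rolling mean over this team's first `span` rows (NaN until the window is full)
    feature_ema = team.rolling(window=span, min_periods=span).mean()[:span]
    # Remaining raw rows of the same team
    rest = team[span:]
    return pd.concat([feature_ema, rest]).ewm(span=span, adjust=False).mean()

# Select only the numeric columns so the string 'team' column is not passed to rolling();
# group_keys=False keeps the original row index on the result.
df.groupby('team', group_keys=False)[['score', 'yards']].apply(team_ema)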
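
For reference, here is a self-contained sketch of the same per-team idea that also writes the result back next to the team column. A few things here are my assumptions and not part of the answer: selecting the numeric columns before `apply` (recent pandas versions refuse to take a rolling mean of the string `team` column), passing `group_keys=False` so the result keeps the original row index, and using `.iloc` for the positional slices. The names `head` and `ema` are illustrative.

```python
import pandas as pd

# Same data as in the question, built more compactly.
df = pd.DataFrame({
    'team': ['team1'] * 14 + ['team2'] * 14,
    'score': list(range(1, 15)) * 2,
    'yards': list(range(10, 150, 10)) * 2,
})

def team_ema(group, span=10):
    # First `span` rows of this team: rolling mean, NaN until the window is full.
    head = group.rolling(window=span, min_periods=span).mean().iloc[:span]
    # Remaining rows of this team only (not of the whole frame).
    rest = group.iloc[span:]
    # Seeded series: NaNs, the simple mean, then raw values, smoothed recursively.
    return pd.concat([head, rest]).ewm(span=span, adjust=False).mean()

# Assumption: restrict to numeric columns and keep the original index.
ema = df.groupby('team', group_keys=False)[['score', 'yards']].apply(team_ema)

# Align the per-team results with the team labels by row index.
ema_features = df[['team']].join(ema)
print(ema_features)
```

Because the slicing happens inside each group, team2 starts again from its own NaN warm-up and its own seed. Note that in this sketch the tenth row of each team holds the seed itself (the simple mean, 5.5 for score), so the first non-NaN row appears one row earlier than in the tables in the question.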

Regarding "python - Pandas Groupby calculating ewm not working as expected", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52459397/
