pandas - Looping to multiply by the previous value in Pandas

Reposted. Author: 行者123. Updated: 2023-12-01 11:21:08

I have a DataFrame in pandas that looks like this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'origin_dte': ['2009-08-01', '2009-08-01', '2009-08-01', '2009-08-01', '2009-09-01', '2009-09-01', '2009-09-01'],
                   'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04', '2009-09-01', '2009-09-02', '2009-09-03'],
                   'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
                   'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45]})

   bal_pred        date  dbal_pred  origin_dte
0        10  2009-08-01        NaN  2009-08-01
1        11  2009-08-02       0.25  2009-08-01
2        12  2009-08-03       0.30  2009-08-01
3        13  2009-08-04       0.50  2009-08-01
4        21  2009-09-01        NaN  2009-09-01
5        22  2009-09-02       0.40  2009-09-01
6        23  2009-09-03       0.45  2009-09-01

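I want to iterate through the rows and replace every observation of bal_pred where dbal_pred != NaN with dbal_pred[i] * bal_pred[i-1]. For example, the second value of bal_pred becomes 0.25*10=2.5, and the third becomes 0.30*2.5=0.75, because each step uses the already-updated previous value. When origin_dte changes, which means dbal_pred is NaN again, the calculation skips the NaN observation and resumes with the next bal_pred. So df should end up looking like the table below. I have a while loop that does this, but looping over a large DataFrame takes a very long time. An easier/faster way to do this would be much appreciated!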

   bal_pred        date  dbal_pred  origin_dte
0    10.000  2009-08-01        NaN  2009-08-01
1     2.500  2009-08-02       0.25  2009-08-01
2     0.750  2009-08-03       0.30  2009-08-01
3     0.375  2009-08-04       0.50  2009-08-01
4    21.000  2009-09-01        NaN  2009-09-01
5     8.400  2009-09-02       0.40  2009-09-01
6     3.780  2009-09-03       0.45  2009-09-01
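
The while loop mentioned in the question is not shown, so the following is only a minimal sketch of what such a row-by-row baseline might look like. The function name `multiply_by_previous_loop` is hypothetical, and the sketch assumes the rows are already in the order in which the recurrence should run. It is handy as a slow reference for checking faster approaches against.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
    'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04',
             '2009-09-01', '2009-09-02', '2009-09-03'],
    'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
    'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45],
})

def multiply_by_previous_loop(frame):
    """Hypothetical baseline: bal_pred[i] = dbal_pred[i] * bal_pred[i-1] wherever dbal_pred[i] is not NaN."""
    out = frame.copy()
    bal = out['bal_pred'].to_numpy(dtype=float).copy()
    rate = out['dbal_pred'].to_numpy(dtype=float)
    # Start at 1: the first row has no previous value to multiply by.
    for i in range(1, len(bal)):
        if not np.isnan(rate[i]):
            # Uses the already-updated previous balance, so the products chain.
            bal[i] = rate[i] * bal[i - 1]
        # Rows with NaN keep their original bal_pred and restart the chain.
    out['bal_pred'] = bal
    return out

looped = multiply_by_previous_loop(df)
print(looped[['bal_pred', 'dbal_pred']])
```

The Python-level loop touches every row once, which is what makes it slow on large frames compared with a vectorized alternative.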

Best answer

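Another approach is to label each group of rows (a new group starts at every NaN in dbal_pred) and then take the cumulative product within each group: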

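# Each NaN in dbal_pred starts a new group: the running count of NaNs is the group label.
group = df['dbal_pred'].isnull().cumsum()
# Seed each group with its starting balance; a separate Series leaves dbal_pred untouched.
factors = df['dbal_pred'].fillna(df['bal_pred'])
# The cumulative product within each group chains the multiplications.
df['bal_pred'] = factors.groupby(group).cumprod()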

Output

   bal_pred        date  dbal_pred  origin_dte
0    10.000  2009-08-01        NaN  2009-08-01
1     2.500  2009-08-02       0.25  2009-08-01
2     0.750  2009-08-03       0.30  2009-08-01
3     0.375  2009-08-04       0.50  2009-08-01
4    21.000  2009-09-01        NaN  2009-09-01
5     8.400  2009-09-02       0.40  2009-09-01
6     3.780  2009-09-03       0.45  2009-09-01
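
To make the grouping trick easier to follow, here is a small self-contained sketch that prints the intermediate values on the sample data. The names `factors`, `by_nan`, and `by_origin` are illustrative only. The last step rests on an assumption taken from the question's remark that dbal_pred is NaN exactly when origin_dte changes: if that holds for your data, grouping by origin_dte directly should give the same result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
    'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
    'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45],
})

# Running count of NaNs: every NaN row opens a new group.
group = df['dbal_pred'].isnull().cumsum()
print(group.tolist())    # [1, 1, 1, 1, 2, 2, 2]

# NaN rows are replaced by the starting balance, other rows keep their multiplier.
factors = df['dbal_pred'].fillna(df['bal_pred'])
print(factors.tolist())  # [10.0, 0.25, 0.3, 0.5, 21.0, 0.4, 0.45]

# Cumulative product per NaN-delimited group.
by_nan = factors.groupby(group).cumprod()

# Assumption: NaN appears only on the first row of each origin_dte.
by_origin = factors.groupby(df['origin_dte']).cumprod()

print(np.allclose(by_nan, by_origin))
```

Grouping on the NaN positions is the more general choice, since it does not depend on origin_dte lining up with the NaN rows.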

Regarding "pandas - Looping to multiply by the previous value in Pandas", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42913881/
