
python - How can I speed up a very slow pandas apply function?

Reposted. Author: 太空狗. Last updated: 2023-10-30 00:19:16

I have a very large pandas dataset, and at some point I need to use the following function:

def proc_trader(data):
    data['_seq'] = np.nan
    # mark every ending of a roundtrip with its index
    # (.loc replaces .ix, which was removed in pandas 1.0)
    data.loc[data.cumq == 0, 'tag'] = np.arange(1, (data.cumq == 0).sum() + 1)
    # backfill the roundtrip index until previous roundtrip;
    # then fill the rest with 0s (roundtrip incomplete for most recent trades)
    data['_seq'] = data['tag'].bfill().fillna(0)
    return data['_seq']
    # btw, why on earth this function returns a dataframe instead of the series `data['_seq']`??

I call it with apply:

reshaped['_spell']=reshaped.groupby(['trader','stock'])[['cumq']].apply(proc_trader)

Obviously I cannot share the data here, but do you see a bottleneck in my code? Could it be the arange call? There are many name-productid combinations in the data.

Minimal working example:

import pandas as pd
import numpy as np

reshaped = pd.DataFrame({
    'trader': ['a', 'a', 'a', 'a', 'a', 'a', 'a'],
    'stock': ['a', 'a', 'a', 'a', 'a', 'a', 'b'],
    'day': [0, 1, 2, 4, 5, 10, 1],
    'delta': [10, -10, 15, -10, -5, 5, 0],
    'out': [1, 1, 2, 2, 2, 0, 1],
})

reshaped.sort_values(by=['trader', 'stock', 'day'], inplace=True)
reshaped['cumq'] = reshaped.groupby(['trader', 'stock']).delta.transform('cumsum')
reshaped['_spell'] = reshaped.groupby(['trader', 'stock'])[['cumq']].apply(proc_trader).reset_index()['_seq']

Best answer

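Nothing fancy here, just tweaks in a few places. There is really no need for a separate function, so I did not write one. On this tiny sample data it runs about twice as fast as the original.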

reshaped.sort_values(by=['trader', 'stock','day'], inplace=True)
reshaped['cumq']=reshaped.groupby(['trader', 'stock']).delta.cumsum()
reshaped.loc[ reshaped.cumq == 0, '_spell' ] = 1
reshaped['_spell'] = reshaped.groupby(['trader','stock'])['_spell'].cumsum()
reshaped['_spell'] = reshaped.groupby(['trader','stock'])['_spell'].bfill().fillna(0)

Result:

   day  delta  out stock trader  cumq  _spell
0    0     10    1     a      a    10     1.0
1    1    -10    1     a      a     0     1.0
2    2     15    2     a      a    15     2.0
3    4    -10    2     a      a     5     2.0
4    5     -5    2     a      a     0     2.0
5   10      5    0     a      a     5     0.0
6    1      0    1     b      a     0     1.0
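
To check that the groupby-only version really matches a per-group computation on more than seven rows, something like the sketch below may help. It is not part of the original answer: `spell_with_loop`, `spell_vectorized`, and `make_trades` are hypothetical helper names, the random data is made up purely for illustration, and the per-group baseline uses a plain loop over groups as a stand-in for `groupby(...).apply(proc_trader)`.

```python
import numpy as np
import pandas as pd

KEYS = ['trader', 'stock']


def spell_with_loop(df):
    """Per-group baseline (hypothetical stand-in for the apply version)."""
    parts = []
    for _, cumq in df.groupby(KEYS)['cumq']:
        closes = cumq.eq(0)
        # 1, 2, 3, ... on rows where the position returns to zero, NaN elsewhere
        tag = closes.cumsum().where(closes)
        # backfill to earlier rows of the same roundtrip; open roundtrips get 0
        parts.append(tag.bfill().fillna(0))
    return pd.concat(parts).reindex(df.index)


def spell_vectorized(df):
    """Same steps as the answer, wrapped in a function and run on a copy."""
    out = df.copy()
    out.loc[out['cumq'] == 0, '_spell'] = 1
    out['_spell'] = out.groupby(KEYS)['_spell'].cumsum()
    out['_spell'] = out.groupby(KEYS)['_spell'].bfill().fillna(0)
    return out['_spell']


def make_trades(n_rows=5000, seed=0):
    """Made-up trades (assumption: small integer deltas so cumq often hits 0)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        'trader': rng.choice(list('abcde'), size=n_rows),
        'stock': rng.choice(list('xyz'), size=n_rows),
        'day': np.arange(n_rows),
        'delta': rng.integers(-2, 3, size=n_rows),
    })
    df = df.sort_values(KEYS + ['day']).reset_index(drop=True)
    df['cumq'] = df.groupby(KEYS)['delta'].cumsum()
    return df


# The sample frame from the question
sample = pd.DataFrame({
    'trader': ['a'] * 7,
    'stock': ['a'] * 6 + ['b'],
    'day': [0, 1, 2, 4, 5, 10, 1],
    'delta': [10, -10, 15, -10, -5, 5, 0],
})
sample = sample.sort_values(KEYS + ['day'])
sample['cumq'] = sample.groupby(KEYS)['delta'].cumsum()

big = make_trades()
same = np.array_equal(spell_with_loop(big).to_numpy(),
                      spell_vectorized(big).to_numpy())
print('results match on synthetic data:', same)
```

To compare speed on data closer to your own size, both functions can be passed to `timeit.timeit` with a larger `n_rows`. The ratio will depend on how many trader/stock groups there are, so the "about twice as fast" figure above should be read as specific to the tiny sample.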

Regarding "python - How can I speed up a very slow pandas apply function?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36044890/
