
performance - How to speed up a Pandas multilevel DataFrame shift by group?

Repost. Author: 行者123. Updated: 2023-12-04 02:54:43

I am trying to shift the column data of a Pandas DataFrame within each group of the first index level. Here is the demo code:

In [8]: df = mul_df(5,4,3)

In [9]: df
Out[9]:
                 COL000  COL001  COL002
STK_ID RPT_Date
A0000  B000     -0.5505  0.7445 -0.3645
       B001      0.9129 -1.0473 -0.5478
       B002      0.8016  0.0292  0.9002
       B003      2.0744 -0.2942 -0.7117
A0001  B000      0.7064  0.9636  0.2805
       B001      0.4763  0.2741 -1.2437
       B002      1.1563  0.0525 -0.7603
       B003     -0.4334  0.2510 -0.0105
A0002  B000     -0.6443  0.1723  0.2657
       B001      1.0719  0.0538 -0.0641
       B002      0.6787 -0.3386  0.6757
       B003     -0.3940 -1.2927  0.3892
A0003  B000     -0.5862 -0.6320  0.6196
       B001     -0.1129 -0.9774  0.7112
       B002      0.6303 -1.2849 -0.4777
       B003      0.5046 -0.4717 -0.2133
A0004  B000      1.6420 -0.9441  1.7167
       B001      0.1487  0.1239  0.6848
       B002      0.6139 -1.9085 -1.9508
       B003      0.3408 -1.3891  0.6739

In [10]: grp = df.groupby(level=df.index.names[0])

In [11]: grp.shift(1)
Out[11]:
                 COL000  COL001  COL002
STK_ID RPT_Date
A0000  B000         NaN     NaN     NaN
       B001     -0.5505  0.7445 -0.3645
       B002      0.9129 -1.0473 -0.5478
       B003      0.8016  0.0292  0.9002
A0001  B000         NaN     NaN     NaN
       B001      0.7064  0.9636  0.2805
       B002      0.4763  0.2741 -1.2437
       B003      1.1563  0.0525 -0.7603
A0002  B000         NaN     NaN     NaN
       B001     -0.6443  0.1723  0.2657
       B002      1.0719  0.0538 -0.0641
       B003      0.6787 -0.3386  0.6757
A0003  B000         NaN     NaN     NaN
       B001     -0.5862 -0.6320  0.6196
       B002     -0.1129 -0.9774  0.7112
       B003      0.6303 -1.2849 -0.4777
A0004  B000         NaN     NaN     NaN
       B001      1.6420 -0.9441  1.7167
       B002      0.1487  0.1239  0.6848
       B003      0.6139 -1.9085 -1.9508

The code for mul_df() is attached here: How to speed up Pandas multilevel dataframe sum?
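
The linked mul_df() is not reproduced on this page. For anyone who wants to run the snippets, a minimal stand-in could look like the sketch below. It assumes, judging only from the output shown above, that the helper builds a two-level (STK_ID, RPT_Date) index with random normal columns named COL000, COL001, and so on. The name make_multilevel_df and its parameters are hypothetical; this is not the original helper.

```python
import numpy as np
import pandas as pd

def make_multilevel_df(n_stocks, n_dates, n_cols, seed=0):
    """Hypothetical stand-in for mul_df(): random data on a (STK_ID, RPT_Date) MultiIndex."""
    stocks = ["A%04d" % i for i in range(n_stocks)]
    dates = ["B%03d" % i for i in range(n_dates)]
    # Cartesian product keeps each stock's rows contiguous and sorted.
    index = pd.MultiIndex.from_product([stocks, dates], names=["STK_ID", "RPT_Date"])
    columns = ["COL%03d" % i for i in range(n_cols)]
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((len(index), n_cols))
    return pd.DataFrame(data, index=index, columns=columns)

# Same shape as the demo frame: 5 stocks, 4 report dates, 3 columns.
df = make_multilevel_df(5, 4, 3)
```

The values will differ from the transcript above, since the original random data cannot be recovered.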

Now I want to run grp.shift(1) on a large DataFrame:

In [1]: df = mul_df(5000,30,400)
In [2]: grp = df.groupby(level=df.index.names[0])
In [3]: timeit grp.shift(1)
1 loops, best of 3: 5.23 s per loop

5.23 s is too slow. How can I speed it up?

(My machine: Pentium Dual-Core T4200 @ 2.00 GHz, 3.00 GB RAM, Windows XP, Python 2.7.4, NumPy 1.7.1, Pandas 0.11.0, numexpr 2.0.1, Anaconda 1.5.0 (32-bit))

Best answer

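How about shifting the whole DataFrame object and then setting the first row of each group to NaN? This relies on each group's rows being contiguous, as they are in the demo frame.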

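import numpy as np
dfs = df.shift(1)  # shift every row down by one, ignoring group boundaries
# cumulative group sizes give the position of the first row of each later group
dfs.iloc[df.groupby(level=0).size().cumsum()[:-1].values] = np.nan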
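
One way to check that this trick matches grp.shift(1) is to compare the two on a small frame. The sketch below assumes the index is sorted so that each group's rows are contiguous; the frame is made-up test data with groups of unequal size, not the output of mul_df().

```python
import numpy as np
import pandas as pd

# Made-up test data: 3 groups of sizes 3, 2 and 4, each group's rows contiguous.
index = pd.MultiIndex.from_tuples(
    [("A0000", "B000"), ("A0000", "B001"), ("A0000", "B002"),
     ("A0001", "B000"), ("A0001", "B001"),
     ("A0002", "B000"), ("A0002", "B001"), ("A0002", "B002"), ("A0002", "B003")],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame(
    np.arange(18, dtype=float).reshape(9, 2),
    index=index,
    columns=["COL000", "COL001"],
)

# Reference result: the per-group shift from the question.
expected = df.groupby(level=0).shift(1)

# Fast path: shift everything once, then blank the first row of every group.
sizes = df.groupby(level=0).size().to_numpy()
first_rows = np.concatenate(([0], sizes.cumsum()[:-1]))  # start position of each group
fast = df.shift(1)
fast.iloc[first_rows] = np.nan

# DataFrame.equals treats NaNs in matching positions as equal.
same = fast.equals(expected)
```

If the frame is not already sorted by its first index level, the cumulative sizes will not point at the group starts; sorting first with df.sort_index(level=0) is one way to restore that assumption.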

Regarding "performance - How to speed up a Pandas multilevel DataFrame shift by group?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/17401197/
