python - How to speed up the rank function on a Pandas Series?

Reposted. Author: 行者123. Updated: 2023-12-03 18:57:17

I want to compute the rank of a series over a rolling window.
Suppose I have a Pandas Series:

In [18]: s = pd.Series(np.random.rand(10))

In [19]: s
Out[19]:
0 0.340396
1 0.664459
2 0.647212
3 0.529363
4 0.535349
5 0.781628
6 0.313549
7 0.933539
8 0.618337
9 0.013442
dtype: float64
I can use the built-in pandas rank function like this, taking the rank of the last value in each window:
In [20]: s.rolling(4).apply(lambda x: pd.Series(x).rank().iloc[-1])
<ipython-input-20-41df4deb36f8>:1: FutureWarning: Currently, 'apply' passes the values as ndarrays to the applied function. In the future, this will change to passing it as Series objects. You need to specify 'raw=True' to keep the current behaviour, and you can pass 'raw=False' to silence this warning
s.rolling(4).apply(lambda x: pd.Series(x).rank().iloc[-1])
Out[20]:
0 NaN
1 NaN
2 NaN
3 2.0
4 2.0
5 4.0
6 1.0
7 4.0
8 2.0
9 1.0
dtype: float64
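
The FutureWarning above appears because the raw argument is not set. As a minimal sketch (assuming scipy is installed; the helper name last_rank is hypothetical and not from the original post), passing raw=True hands each window to the function as a plain ndarray, so no Series has to be built per window:

```python
import numpy as np
import pandas as pd
from scipy.stats import rankdata


def last_rank(window: np.ndarray) -> float:
    """Rank of the last value within its window (ties get the average rank)."""
    return rankdata(window)[-1]


# Same values as the series shown above.
s = pd.Series([0.340396, 0.664459, 0.647212, 0.529363, 0.535349,
               0.781628, 0.313549, 0.933539, 0.618337, 0.013442])

# raw=True passes each window as an ndarray, which skips the per-window
# Series construction and does not trigger the FutureWarning about `raw`.
rolling_rank = s.rolling(4).apply(last_rank, raw=True)
print(rolling_rank)
```

This should give the same ranks as Out[20]. It still calls a Python function once per window, so it is likely to help only moderately on long series.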
This works, but it is slow. Here is a test:
In [24]: %timeit pd.Series(np.random.rand(100000)).rolling(100).apply(lambda x: pd.Series(x).rank().iloc[-1])
<magic-timeit>:1: FutureWarning: Currently, 'apply' passes the values as ndarrays to the applied function. In the future, this will change to passing it as Series objects. You need to specify 'raw=True' to keep the current behaviour, and you can pass 'raw=False' to silence this warning
22.5 s ± 292 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Is there a good way to speed this up? I suspect the per-window loop inside rolling is what could be improved. Thanks.

Best answer

Using scipy/numpy is faster (this requires a recent version of numpy: sliding_window_view was added in NumPy 1.20):

import pandas as pd
import numpy as np
from time import time
from scipy.stats import rankdata
from numpy.lib.stride_tricks import sliding_window_view

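np.random.seed()
array = np.random.rand(100000)

# Baseline: pandas rolling apply, building a Series for every window.
t0 = time()
ranks = pd.Series(array).rolling(100).apply(lambda x: x.rank().iloc[-1])
t1 = time()
print(f'With pandas: {t1-t0} sec.')

# scipy/numpy: rank each window of a strided view and keep the last rank.
# Note: this yields one value per full window (no leading NaN padding).
t0 = time()
ranks = [rankdata(x)[-1] for x in sliding_window_view(array, window_shape=100)]
t1 = time()
print(f'With numpy: {t1-t0} sec.')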
Output:
With pandas: 11.682222127914429 sec.
With numpy: 3.9317219257354736 sec.
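
The list comprehension above still ranks every window in a Python loop. Since only the rank of the last element is needed, one possible variant (a sketch, not part of the original answer) counts how many values in each window are smaller than or equal to the last one. With the default average method, the rank of the last value is less + (equal + 1) / 2, where equal includes the value itself. The function name rolling_last_rank is hypothetical, and the sketch assumes the input contains no NaN values.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_last_rank(values, window: int) -> np.ndarray:
    """Average-method rank of the last element of each full window.

    Returns an array as long as `values`, with NaN where the window is
    incomplete. Assumes `values` contains no NaN.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape[0], np.nan)
    if values.shape[0] < window:
        return out
    # View of shape (n - window + 1, window); no data is copied here.
    windows = sliding_window_view(values, window_shape=window)
    last = windows[:, -1:]                  # last value of each window, kept 2-D
    less = (windows < last).sum(axis=1)     # values strictly below the last one
    equal = (windows == last).sum(axis=1)   # ties, including the last value itself
    out[window - 1:] = less + (equal + 1) / 2
    return out


# Small check against the pandas approach from the question.
rng = np.random.default_rng(0)
array = rng.random(1000)
fast = pd.Series(rolling_last_rank(array, 100))
slow = pd.Series(array).rolling(100).apply(lambda x: x.rank().iloc[-1], raw=False)
print(np.allclose(fast, slow, equal_nan=True))
```

The comparisons allocate temporary boolean arrays of shape (n - window + 1, window), so memory use grows with both the series length and the window size. Newer pandas releases (1.4 and later) also provide Rolling.rank, for example s.rolling(100).rank(), which may be worth timing against this sketch.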

Regarding "python - How to speed up the rank function on a Pandas Series?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65624626/
