python - NumPy version of "Exponential weighted moving average", equivalent to pandas.ewm().mean()

Reposted. Author: IT老高. Updated: 2023-10-28 22:23:47

How do I get the exponential weighted moving average in NumPy, just like the following in pandas?

import pandas as pd
import pandas_datareader as pdr
from datetime import datetime

# Declare variables
ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1)).reset_index(drop=True)['Adj Close']
windowSize = 20

# Get PANDAS exponential weighted moving average
# .as_matrix() was removed in pandas 1.0; .to_numpy() returns the same array
ewm_pd = pd.DataFrame(ibm).ewm(span=windowSize, min_periods=windowSize).mean().to_numpy()

print(ewm_pd)

I tried the following with NumPy:

import numpy as np
import pandas_datareader as pdr
from datetime import datetime

# From this post: http://stackoverflow.com/a/40085052/3293881 by @Divakar
def strided_app(a, L, S): # Window len = L, Stride len/stepsize = S
    nrows = ((a.size - L) // S) + 1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows, L), strides=(S * n, n))

def numpyEWMA(price, windowSize):
    weights = np.exp(np.linspace(-1., 0., windowSize))
    weights /= weights.sum()

    a2D = strided_app(price, windowSize, 1)

    returnArray = np.empty((price.shape[0]))
    returnArray.fill(np.nan)
    for index in (range(a2D.shape[0])):
        returnArray[index + windowSize-1] = np.convolve(weights, a2D[index])[windowSize - 1:-windowSize + 1]
    return np.reshape(returnArray, (-1, 1))

# Declare variables
ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1)).reset_index(drop=True)['Adj Close']
windowSize = 20

# Get NumPy exponential weighted moving average
ewma_np = numpyEWMA(ibm, windowSize)

print(ewma_np)

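But the results are not the same as the ones from pandas.

Is there a better approach to calculate the exponential weighted moving average directly in NumPy and get exactly the same result as pandas.ewm().mean()?

With 60,000 requests to the pandas solution, I get about 230 seconds. I am sure this can be reduced significantly with pure NumPy.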

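Best answer

I think I have finally cracked it!

Here is a vectorized version of the numpy_ewma function from @RaduS's post, which is claimed to produce the correct results: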

def numpy_ewma_vectorized(data, window):

    alpha = 2 /(window + 1.0)
    alpha_rev = 1-alpha

    scale = 1/alpha_rev
    n = data.shape[0]

    r = np.arange(n)
    scale_arr = scale**r
    offset = data[0]*alpha_rev**(r+1)
    pw0 = alpha*alpha_rev**(n-1)

    mult = data*pw0*scale_arr
    cumsums = mult.cumsum()
    out = offset + cumsums*scale_arr[::-1]
    return out
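
The loopy numpy_ewma function that this vectorizes is not reproduced in this thread. Working backwards from the code above, the vectorized output satisfies the recurrence y[t] = alpha*x[t] + (1 - alpha)*y[t-1] with y[0] = x[0], so a loop version can be sketched as below. Treat the body as an assumption reconstructed from that recurrence, not as a verbatim copy of @RaduS's post; the name numpy_ewma is reused only so the timing session further down can be followed.

```python
import numpy as np

def numpy_ewma(data, window):
    """Loop reference (assumed form): y[t] = alpha*x[t] + (1-alpha)*y[t-1], y[0] = x[0]."""
    data = np.asarray(data, dtype=float)
    alpha = 2.0 / (window + 1.0)
    out = np.empty_like(data)
    e = data[0]  # seed with the first sample, so out[0] == data[0]
    for i in range(data.shape[0]):
        # move the running average a fraction alpha towards the new sample
        e = e + alpha * (data[i] - e)
        out[i] = e
    return out

print(numpy_ewma(np.array([1.0, 2.0, 3.0]), 3))  # alpha = 0.5
```

This recurrence is the one pandas documents for ewm(..., adjust=False), so that is the pandas setting these functions should be compared against.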

Further boost

We can boost it further with some code re-use, like so:

def numpy_ewma_vectorized_v2(data, window):

    alpha = 2 /(window + 1.0)
    alpha_rev = 1-alpha
    n = data.shape[0]

    pows = alpha_rev**(np.arange(n+1))

    scale_arr = 1/pows[:-1]
    offset = data[0]*pows[1:]
    pw0 = alpha*alpha_rev**(n-1)

    mult = data*pw0*scale_arr
    cumsums = mult.cumsum()
    out = offset + cumsums*scale_arr[::-1]
    return out
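
One thing to watch with both vectorized versions: scale_arr grows like (1/alpha_rev)**n while pw0 shrinks like alpha_rev**n, so for long inputs the float64 range runs out and the result turns into inf or nan. With window = 20 that happens somewhere past about 7,000 samples. A possible workaround, sketched here under my own assumptions (the function name ewma_chunked and the chunk-size rule are hypothetical, not from the original answer), is to apply the same formula block by block and carry the last output into the next block as its starting value.

```python
import numpy as np

def ewma_chunked(data, window, chunk=None):
    """Same recurrence as numpy_ewma_vectorized_v2, applied in blocks.

    Assumes a non-empty 1-D input and window > 1.
    """
    data = np.asarray(data, dtype=float)
    alpha = 2.0 / (window + 1.0)
    alpha_rev = 1.0 - alpha
    if chunk is None:
        # hypothetical rule: keep alpha_rev**chunk above roughly 1e-100
        chunk = max(1, int(-230.0 / np.log(alpha_rev)))
    out = np.empty_like(data)
    prev = data[0]  # plays the role of data[0] in the original offset term
    for start in range(0, data.size, chunk):
        block = data[start:start + chunk]
        m = block.size
        pows = alpha_rev ** np.arange(m + 1)
        scale_arr = 1.0 / pows[:-1]
        offset = prev * pows[1:]
        pw0 = alpha * alpha_rev ** (m - 1)
        cumsums = (block * pw0 * scale_arr).cumsum()
        out[start:start + m] = offset + cumsums * scale_arr[::-1]
        prev = out[start + m - 1]  # carry the running average forward
    return out
```

The loop now runs once per block instead of once per sample, so most of the vectorization benefit should be kept; how much depends on the block size and is worth timing on real data.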

Runtime test

Let's time these two against the same loopy function on a big dataset.

In [97]: data = np.random.randint(2,9,(5000))
...: window = 20
...:

In [98]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized(data, window))
Out[98]: True

In [99]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized_v2(data, window))
Out[99]: True

In [100]: %timeit numpy_ewma(data, window)
100 loops, best of 3: 6.03 ms per loop

In [101]: %timeit numpy_ewma_vectorized(data, window)
1000 loops, best of 3: 665 µs per loop

In [102]: %timeit numpy_ewma_vectorized_v2(data, window)
1000 loops, best of 3: 357 µs per loop

In [103]: 6030/357.0
Out[103]: 16.89075630252101

That is around a 17x speedup!
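
A note on matching the pandas call from the question: the functions above follow the recursive form, which corresponds to adjust=False in pandas. The question uses ewm(span=windowSize, min_periods=windowSize) with the default adjust=True, where each output is a weighted sum divided by the sum of the weights (1 - alpha)**i. The sketch below is a plain loop meant for checking numbers rather than for speed; the name ewma_adjusted is hypothetical, and it assumes the input contains no NaN values.

```python
import numpy as np

def ewma_adjusted(data, span, min_periods=0):
    """Weighted-average form documented for pandas ewm(adjust=True).

    y[t] = sum_i (1-alpha)**i * x[t-i] / sum_i (1-alpha)**i, for i = 0..t
    Assumes no NaN values in data.
    """
    data = np.asarray(data, dtype=float)
    alpha = 2.0 / (span + 1.0)
    alpha_rev = 1.0 - alpha
    out = np.empty_like(data)
    num = 0.0  # running weighted sum of samples
    den = 0.0  # running sum of weights
    for t in range(data.shape[0]):
        num = data[t] + alpha_rev * num
        den = 1.0 + alpha_rev * den
        out[t] = num / den
    # outputs backed by fewer than min_periods samples are reported as NaN
    out[:max(min_periods - 1, 0)] = np.nan
    return out
```

As t grows the weight sum approaches 1/alpha, so the two forms converge and differ mainly over the first few multiples of the span.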

Regarding python - NumPy version of "Exponential weighted moving average", equivalent to pandas.ewm().mean(), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42869495/
