
python - How can I make pandas' rolling_std consider two columns in its observations?

Reposted. Author: 行者123. Updated: 2023-11-28 21:53:03

Data:

{'Open': {0: 159.18000000000001, 1: 157.99000000000001, 2: 157.66, 3: 157.53999999999999, 4: 155.03999999999999, 5: 155.47999999999999, 6: 155.44999999999999, 7: 155.93000000000001, 8: 155.0, 9: 157.72999999999999},  
'Close': {0: 157.97999999999999, 1: 157.66, 2: 157.53999999999999, 3: 155.03999999999999, 4: 155.47999999999999, 5: 155.44999999999999, 6: 155.87, 7: 155.0, 8: 157.72999999999999, 9: 157.31}}

Code:

import pandas as pd

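d = ...  # placeholder: the dictionary shown under "Data" above
df = pd.DataFrame.from_dict(d)[['Close', 'Open']]  # column order used in the output below
# The question was written against pd.rolling_std(df[['Close']], window=5);
# that function is gone from current pandas, where the equivalent is:
df['Close_Stdev'] = df['Close'].rolling(window=5).std()

print(df)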

#     Close    Open  Close_Stdev
# 0  157.98  159.18          NaN
# 1  157.66  157.99          NaN
# 2  157.54  157.66          NaN
# 3  155.04  157.54          NaN
# 4  155.48  155.04     1.369452
# 5  155.45  155.48     1.259754
# 6  155.87  155.45     0.975464
# 7  155.00  155.93     0.358567
# 8  157.73  155.00     1.065190
# 9  157.31  157.73     1.189378

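Question:

The code above works fine. However, is it possible for rolling_std to take, in its observation window, the first four values from Close and the fifth value from Open? Basically, I want rolling_std to compute the following for its first Stdev: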

157.98 # From Close
157.66 # From Close
157.54 # From Close
155.04 # From Close
155.04 # Bzzt, from Open.

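Technically, this means the last value in the observation list is always the last Close value.

Logic/Reason:

Obviously, this is stock data. I am trying to check whether it is better to factor the current trading day's Open price into the standard deviation calculation, rather than looking only at the previous Closes.

Desired result: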

#     Close    Open  Close_Stdev  Desired_Stdev
# 0  157.98  159.18          NaN            NaN
# 1  157.66  157.99          NaN            NaN
# 2  157.54  157.66          NaN            NaN
# 3  155.04  157.54          NaN            NaN
# 4  155.48  155.04     1.369452       1.480311
# 5  155.45  155.48     1.259754       1.255149
# 6  155.87  155.45     0.975464       0.994017
# 7  155.00  155.93     0.358567       0.361151
# 8  157.73  155.00     1.065190       0.368035
# 9  157.31  157.73     1.189378       1.291464

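Extra details:

This is easy to do in Excel by using the STDEV.S formula and selecting the numbers, as in the screenshot below. However, for personal reasons, I want to do this in Python and pandas (I did highlight F6; it just isn't visible because of Snagit's effects).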

[Image: Excel screenshot, not preserved in this copy]

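Best answer

You can use Welford's method to compute the standard deviation. Its advantage is that it can be expressed as vectorized arithmetic over whole columns, in just 5 iterations. That should be faster than computing row by row and having to build a window for each row.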
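To make the update rule concrete, here is a minimal scalar sketch of Welford's method applied to the first window the question asks for (four Closes plus one Open). The function name `welford_std` and the variable names are illustrative only and not part of the original answer. `statistics.stdev` from the standard library serves as an independent reference.

```python
import math
import statistics


def welford_std(values, ddof=1):
    """Sample standard deviation via Welford's one-pass update (illustrative helper)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean          # distance from the old mean
        mean += delta / n         # running mean after including x
        m2 += delta * (x - mean)  # running sum of squared deviations
    # Assumes len(values) > ddof; otherwise this divides by zero.
    return math.sqrt(m2 / (n - ddof))


# First desired window from the question: four Closes, then the Open of row 4.
window = [157.98, 157.66, 157.54, 155.04, 155.04]
result = welford_std(window)
reference = statistics.stdev(window)
print(round(result, 6), round(reference, 6))
```

Both numbers should agree with the first non-NaN entry of the Desired_Stdev column in the question. The class in the answer applies the same three update lines, but to whole columns at a time instead of single numbers.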

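First, here is a sanity check showing that Welford's method reproduces the same result as the rolling call from the question (written as df['Close'].rolling(window=5).std() in current pandas):

df['Close_Stdev'] = pd.rolling_std(df[['Close']],window=5)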

import numpy as np
import pandas as pd


class OnlineVariance(object):
    """
    Welford's algorithm computes the sample variance incrementally.
    """

    def __init__(self, iterable=None, ddof=1):
        self.ddof, self.n, self.mean, self.M2 = ddof, 0, 0.0, 0.0
        if iterable is not None:
            for datum in iterable:
                self.include(datum)

    def include(self, datum):
        # In this answer datum is a whole column (a pandas Series), so each
        # update below is element-wise across all rows at once.
        self.n += 1
        self.delta = datum - self.mean
        self.mean += self.delta / self.n
        self.M2 += self.delta * (datum - self.mean)
        self.variance = self.M2 / (self.n - self.ddof)

    @property
    def std(self):
        return np.sqrt(self.variance)


d = {'Open': {0: 159.18000000000001, 1: 157.99000000000001, 2: 157.66,
              3: 157.53999999999999, 4: 155.03999999999999,
              5: 155.47999999999999, 6: 155.44999999999999,
              7: 155.93000000000001, 8: 155.0, 9: 157.72999999999999},
     'Close': {0: 157.97999999999999, 1: 157.66, 2: 157.53999999999999,
               3: 155.03999999999999, 4: 155.47999999999999,
               5: 155.44999999999999, 6: 155.87, 7: 155.0,
               8: 157.72999999999999, 9: 157.31}}

# Select the columns explicitly to match the column order shown in the output.
df = pd.DataFrame.from_dict(d)[['Close', 'Open']]

# The original answer called pd.rolling_std(df[['Close']], window=5); that
# function is gone from current pandas, where the equivalent is:
df['Close_Stdev'] = df['Close'].rolling(window=5).std()

ov = OnlineVariance()
for n in range(5):
    # shift(n) lines up the Close from n rows earlier with the current row
    ov.include(df['Close'].shift(n))

df['std'] = ov.std
print(df)
assert np.isclose(df['Close_Stdev'], df['std'], equal_nan=True).all()

yields

    Close    Open  Close_Stdev       std
0  157.98  159.18          NaN       NaN
1  157.66  157.99          NaN       NaN
2  157.54  157.66          NaN       NaN
3  155.04  157.54          NaN       NaN
4  155.48  155.04     1.369452  1.369452
5  155.45  155.48     1.259754  1.259754
6  155.87  155.45     0.975464  0.975464
7  155.00  155.93     0.358567  0.358567
8  157.73  155.00     1.065190  1.065190
9  157.31  157.73     1.189378  1.189378

So, to include the Open values in the calculation,

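ov = OnlineVariance()
ov.include(df['Open'])  # the current row's Open is one observation
for n in range(1, 5):
    # the other four are the Closes from the previous four rows
    ov.include(df['Close'].shift(n))
df['std'] = ov.std
print(df)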

yields

    Close    Open       std
0  157.98  159.18       NaN
1  157.66  157.99       NaN
2  157.54  157.66       NaN
3  155.04  157.54       NaN
4  155.48  155.04  1.480311
5  155.45  155.48  1.255149
6  155.87  155.45  0.994017
7  155.00  155.93  0.361151
8  157.73  155.00  0.368035
9  157.31  157.73  1.291464
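As a cross-check on these numbers, the same windows can be built explicitly and reduced row by row. The sketch below is an alternative to Welford's method, not part of the original answer: it places the current Open and the four previous Closes side by side in a NumPy array and takes the row-wise sample standard deviation. The variable name `stacked` is an illustrative choice, and the prices are retyped as plain lists.

```python
import numpy as np
import pandas as pd

# Same prices as in the question, written as plain lists.
df = pd.DataFrame({
    'Close': [157.98, 157.66, 157.54, 155.04, 155.48,
              155.45, 155.87, 155.00, 157.73, 157.31],
    'Open': [159.18, 157.99, 157.66, 157.54, 155.04,
             155.48, 155.45, 155.93, 155.00, 157.73],
})

# One column per observation in the window: the current Open,
# then the Close from 1, 2, 3 and 4 rows earlier.
columns = [df['Open'].to_numpy()]
columns += [df['Close'].shift(n).to_numpy() for n in range(1, 5)]
stacked = np.column_stack(columns)  # shape (10, 5)

# Row-wise sample standard deviation (ddof=1). A NaN anywhere in a row,
# i.e. an incomplete window, propagates, so the first four rows are NaN.
df['Desired_Stdev'] = np.std(stacked, axis=1, ddof=1)
print(df)
```

The Desired_Stdev column should match the std column printed above. This version materializes a 10 x 5 array, which is fine for small windows; the Welford loop avoids holding all shifted columns at once.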

Regarding "python - How can I make pandas' rolling_std consider two columns in its observations?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27307838/
