
python - Filling NaN in candlestick OHLCV data

Reposted. Author: 行者123. Updated: 2023-11-28 18:48:35

I have a DataFrame like this:

                         OPEN      HIGH       LOW     CLOSE          VOL
2012-01-01 19:00:00    449000    449000    449000    449000   1336303000
2012-01-01 20:00:00       NaN       NaN       NaN       NaN          NaN
2012-01-01 21:00:00       NaN       NaN       NaN       NaN          NaN
2012-01-01 22:00:00       NaN       NaN       NaN       NaN          NaN
2012-01-01 23:00:00       NaN       NaN       NaN       NaN          NaN
...
                         OPEN      HIGH       LOW     CLOSE          VOL
2013-04-24 14:00:00  11700000  12000000  11600000  12000000  20647095439
2013-04-24 15:00:00  12000000  12399000  11979000  12399000  23997107870
2013-04-24 16:00:00  12399000  12400000  11865000  12100000   9379191474
2013-04-24 17:00:00  12300000  12397995  11850000  11850000   4281521826
2013-04-24 18:00:00  11850000  11850000  10903000  11800000  15546034128

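I need to fill the NaN values according to this rule:

When OPEN, HIGH, LOW and CLOSE are all NaN,

  • set VOL to 0
  • set OPEN, HIGH, LOW and CLOSE to the CLOSE value of the previous candle

Otherwise, leave the NaN as it is.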
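The rule above can be sketched in pandas as follows. This is a minimal illustration, not the answer's code: it assumes the column names from the question (OPEN, HIGH, LOW, CLOSE, VOL), and the function name `fill_empty_candles` and the sample values are made up for the example.

```python
import numpy as np
import pandas as pd

PRICE_COLS = ["OPEN", "HIGH", "LOW", "CLOSE"]


def fill_empty_candles(df):
    """Return a copy of df with empty candles filled (hypothetical helper)."""
    out = df.copy()
    # A candle counts as empty only when all four price columns are NaN.
    empty = out[PRICE_COLS].isna().all(axis=1)
    # Carry the last known CLOSE forward; on an empty row this is the
    # CLOSE of the most recent non-empty candle.
    prev_close = out["CLOSE"].ffill()
    for col in PRICE_COLS:
        out.loc[empty, col] = prev_close[empty]
    out.loc[empty, "VOL"] = 0
    return out


# Made-up sample: two empty candles, then a candle with prices but no VOL.
idx = pd.to_datetime([
    "2012-01-01 19:00:00",
    "2012-01-01 20:00:00",
    "2012-01-01 21:00:00",
    "2012-01-01 22:00:00",
])
sample = pd.DataFrame(
    {
        "OPEN": [449000.0, np.nan, np.nan, 450000.0],
        "HIGH": [449000.0, np.nan, np.nan, 451000.0],
        "LOW": [449000.0, np.nan, np.nan, 449500.0],
        "CLOSE": [449000.0, np.nan, np.nan, 450500.0],
        "VOL": [1336303000.0, np.nan, np.nan, np.nan],
    },
    index=idx,
)
filled = fill_empty_candles(sample)
print(filled)
```

The last sample row has prices but a missing VOL, so it falls under "otherwise" and its NaN is left untouched. If the very first row is empty there is no previous CLOSE to copy, and its prices stay NaN.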

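Best answer

Since neither of the other two answers works, here is a complete answer.

I test two methods here. The first is based on working4coin's comment on hd1's answer; the second is a pure Python implementation. The Python implementation is obviously going to be slower, but I decided to time both methods to make sure and to quantify the difference.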

    def nans_to_prev_close_method1(data_frame):
        # volume should always be 0 (if there were no trades in this interval)
        data_frame['volume'] = data_frame['volume'].fillna(0.0)
        # i.e. pull the last close into this close (forward fill of the close column only)
        data_frame['close'] = data_frame['close'].ffill()
        # now copy the close that was pulled down from the last timestep into this row, across into o/h/l
        data_frame['open'] = data_frame['open'].fillna(data_frame['close'])
        data_frame['low'] = data_frame['low'].fillna(data_frame['close'])
        data_frame['high'] = data_frame['high'].fillna(data_frame['close'])

Method 1 does most of the heavy lifting in C (inside the pandas code), so it should be quite fast.

The slow Python approach (method 2) looks like this:

    import numpy as np

    def nans_to_prev_close_method2(data_frame):
        prev_close = None
        for index, row in data_frame.iterrows():
            if np.isnan(row['open']):  # row.isnull().any():
                # assumes first row has no nulls!!
                # iterrows() may yield copies, so write back through .loc
                data_frame.loc[index, ['open', 'high', 'low', 'close']] = prev_close
                data_frame.loc[index, 'volume'] = 0.0
            else:
                prev_close = row['close']

Timing both of them:

    import timeit

    # trades_to_ohlcv, PATH_TO_RAW_TRADES_CSV and wrapper are the author's own helpers and are not shown here
    df = trades_to_ohlcv(PATH_TO_RAW_TRADES_CSV, '1s')  # splits raw trades into one-second candles
    df2 = df.copy()

    wrapped1 = wrapper(nans_to_prev_close_method1, df)
    wrapped2 = wrapper(nans_to_prev_close_method2, df2)

    print("method 1: %.2f sec" % timeit.timeit(wrapped1, number=1))
    print("method 2: %.2f sec" % timeit.timeit(wrapped2, number=1))
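The `wrapper` helper is not shown in the answer. A common way to time a function that takes arguments with `timeit` is to bind the arguments in a zero-argument closure, and the sketch below assumes that is what `wrapper` does. The `make_sparse_candles` generator, the `fill_vectorized` stand-in and the row count are all hypothetical replacements for the author's `trades_to_ohlcv` data, so the time printed will not match the figures reported in this answer.

```python
import timeit

import numpy as np
import pandas as pd


def wrapper(func, *args, **kwargs):
    """Bind arguments so timeit can call func with none (assumed definition)."""
    def wrapped():
        return func(*args, **kwargs)
    return wrapped


def make_sparse_candles(n_rows, seed=0):
    """Build synthetic one-second candles, roughly half of them empty (hypothetical)."""
    rng = np.random.default_rng(seed)
    index = pd.Timestamp("2013-04-24") + pd.to_timedelta(np.arange(n_rows), unit="s")
    price = rng.integers(100, 200, size=n_rows).astype(float)
    volume = rng.integers(1, 1000, size=n_rows).astype(float)
    frame = pd.DataFrame(
        {"open": price, "high": price, "low": price, "close": price, "volume": volume},
        index=index,
    )
    empty = rng.random(n_rows) < 0.5
    empty[0] = False  # keep the first row populated so a previous close always exists
    frame.loc[empty, :] = np.nan
    return frame


def fill_vectorized(frame):
    """Stand-in for the function being timed: forward-fill close, copy into o/h/l."""
    frame["volume"] = frame["volume"].fillna(0.0)
    frame["close"] = frame["close"].ffill()
    for col in ("open", "high", "low"):
        frame[col] = frame[col].fillna(frame["close"])
    return frame


df = make_sparse_candles(10_000)
timed = wrapper(fill_vectorized, df)
elapsed = timeit.timeit(timed, number=1)
print("vectorized fill: %.4f sec" % elapsed)
```

Using `number=1` matters here because the function modifies the frame in place: a second run would time an already-filled frame rather than the original data.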

The results were:

method 1:   0.46 sec
method 2: 151.82 sec

Method 1 is clearly much faster (about 330 times faster).

Regarding python - Filling NaN in candlestick OHLCV data, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16466670/
