
python - Resampling with OHLC in Pandas

Reposted. Author: 行者123. Updated: 2023-12-01 08:25:21

I'm new to Pandas, so please tell me if I'm doing something silly.

Input file (only the head is shown below; the file has 10K+ rows):

$ head /var/tmp/ticks_data.csv 
2019-01-18 14:55:00,296
2019-01-18 14:55:01,296
2019-01-18 14:55:02,296
2019-01-18 14:55:03,296.05
2019-01-18 14:55:04,296.05
2019-01-18 14:55:05,296
2019-01-18 14:55:06,296
2019-01-18 14:55:08,296
2019-01-18 14:55:09,296
2019-01-18 14:55:10,296.05

Code:

$ cat create_candles.py 

import pandas as pd

filename = '/var/tmp/ticks_data.csv'
df = pd.read_csv(filename, names=['timestamp', 'ltp'], index_col=1, parse_dates=['timestamp'])
# print(df.head())
data = df['ltp'].resample('1min').ohlc()
print(data)

Error:

$ python3 create_candles.py 
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexes/base.py", line 3078, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'ltp'

I suspected the file contained stray characters, so I ran dos2unix on /var/tmp/ticks_data.csv, but the problem persists.

If I remove index_col=1 from the read_csv call:

df = pd.read_csv(filename, names=['timestamp', 'ltp'], parse_dates=['timestamp'])

then I get the following error:

Traceback (most recent call last):
  File "/Users/dheeraj.kabra/Desktop/Ticks/create_candles.py", line 6, in <module>
    data = df['ltp'].resample('1min').ohlc()
  File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 7110, in resample
    base=base, key=on, level=level)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/resample.py", line 1148, in resample
    return tg._get_resampler(obj, kind=kind)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/resample.py", line 1276, in _get_resampler
    "but got an instance of %r" % type(ax).__name__)
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
[Finished in 0.5s with exit code 1]

Any pointers toward solving this would be very helpful.

Best answer

Change index_col to 0 or ['timestamp'] so that the first column becomes a DatetimeIndex:

import pandas as pd
from io import StringIO  # pd.compat.StringIO is not available in recent pandas

temp = u"""2019-01-18 14:55:00,296
2019-01-18 14:55:01,296
2019-01-18 14:55:02,296
2019-01-18 14:55:03,296.05
2019-01-18 14:55:04,296.05
2019-01-18 14:55:05,296
2019-01-18 14:55:06,296
2019-01-18 14:55:08,296
2019-01-18 14:55:09,296
2019-01-18 14:55:10,296.05"""
# after testing, replace 'StringIO(temp)' with the path to your CSV file
df = pd.read_csv(StringIO(temp),
                 names=['timestamp', 'ltp'],
                 index_col=0,                 # first column becomes the index
                 parse_dates=['timestamp'])   # parse it as datetimes

Alternative solution:

df = pd.read_csv(StringIO(temp),
                 names=['timestamp', 'ltp'],
                 index_col=['timestamp'],     # select the index column by name
                 parse_dates=['timestamp'])
print(df)
                        ltp
timestamp
2019-01-18 14:55:00  296.00
2019-01-18 14:55:01  296.00
2019-01-18 14:55:02  296.00
2019-01-18 14:55:03  296.05
2019-01-18 14:55:04  296.05
2019-01-18 14:55:05  296.00
2019-01-18 14:55:06  296.00
2019-01-18 14:55:08  296.00
2019-01-18 14:55:09  296.00
2019-01-18 14:55:10  296.05

data = df.resample('1min')['ltp'].ohlc()
print(data)
                      open    high    low   close
timestamp
2019-01-18 14:55:00  296.0  296.05  296.0  296.05
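The sample above falls entirely within one minute, so it yields a single candle. As a minimal sketch of how the same fix behaves across a minute boundary, the snippet below uses a handful of made-up ticks (they are not from the original file) and `io.StringIO` in place of a real CSV path. The variable names `csv_text` and `candles` are illustrative only.

```python
from io import StringIO

import pandas as pd

# Made-up ticks spanning two minutes; these values are NOT from the original file.
csv_text = """2019-01-18 14:55:58,296
2019-01-18 14:55:59,296.05
2019-01-18 14:56:00,295.9
2019-01-18 14:56:30,296.1
2019-01-18 14:56:59,296
"""

# Use the timestamp column as the index so resample() sees a DatetimeIndex.
df = pd.read_csv(
    StringIO(csv_text),
    names=["timestamp", "ltp"],
    index_col="timestamp",
    parse_dates=["timestamp"],
)

# One open/high/low/close row per 1-minute bucket.
candles = df["ltp"].resample("1min").ohlc()
print(candles)
```

With this input there should be two rows, one for 14:55 and one for 14:56, each summarizing only the ticks that fall inside that minute.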

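Details on the original attempt: index_col=1 turns the second column, here ltp, into the index, so ltp is no longer available as a column: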

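from io import StringIO  # pd.compat.StringIO is not available in recent pandas

df = pd.read_csv(StringIO(temp),
                 names=['timestamp', 'ltp'],
                 index_col=1,                 # second column (ltp) becomes the index
                 parse_dates=['timestamp'])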


print(df)
                 timestamp
ltp
296.00 2019-01-18 14:55:00
296.00 2019-01-18 14:55:01
296.00 2019-01-18 14:55:02
296.05 2019-01-18 14:55:03
296.05 2019-01-18 14:55:04
296.00 2019-01-18 14:55:05
296.00 2019-01-18 14:55:06
296.00 2019-01-18 14:55:08
296.00 2019-01-18 14:55:09
296.05 2019-01-18 14:55:10
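Both errors from the question can be reproduced and inspected directly. The sketch below is a hedged illustration, again reading from an in-memory string rather than the real file, with names such as `by_price` and `plain` chosen only for this example. It also shows one possible alternative that is not part of the answer above: leaving the default index in place and passing the datetime column to `resample` through its `on` parameter.

```python
from io import StringIO

import pandas as pd

# Three rows taken from the sample data in the question.
csv_text = """2019-01-18 14:55:00,296
2019-01-18 14:55:03,296.05
2019-01-18 14:55:10,296.05
"""

# Case 1: index_col=1 moves 'ltp' into the index, so it is no longer a column.
by_price = pd.read_csv(StringIO(csv_text), names=["timestamp", "ltp"],
                       index_col=1, parse_dates=["timestamp"])
print(by_price.index.name)        # the index is now named 'ltp'
print("ltp" in by_price.columns)  # False, which is why df['ltp'] raises KeyError

# Case 2: without index_col the frame keeps a default RangeIndex,
# which resample() cannot use on its own.
plain = pd.read_csv(StringIO(csv_text), names=["timestamp", "ltp"],
                    parse_dates=["timestamp"])
print(type(plain.index).__name__)

# Alternative (not from the answer): keep the RangeIndex and tell
# resample() which datetime column to bucket on.
candles = plain.resample("1min", on="timestamp")["ltp"].ohlc()
print(candles)
```

Setting the index at read time, as the answer does, is usually the simpler choice when every later operation is time-based; the `on` variant may be handy when the frame needs to keep its original index.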

Regarding "python - Resampling with OHLC in Pandas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54291559/
