
python - Out of bounds nanosecond timestamp: 1-01-01 00:00:00

Reposted. Author: 行者123. Updated: 2023-12-01 00:43:34

I imported data from GitHub using the code below:

series = read_csv('shampoo-sales.csv', header=0, index_col=0, squeeze=True)

I want to turn its index into a DatetimeIndex. I used

series.index = pd.to_datetime(series.index)

but Python gives me the following error:

Out of bounds nanosecond timestamp: 1-01-01 00:00:00

I don't know how to fix this error.

series = read_csv('shampoo-sales.csv',header=0,index_col=0,squeeze=True)

series.index = pd.to_datetime(series.index)


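Update: Thanks to EdChum for pointing out a way to convert the index to a DatetimeIndex. However, I have now run into another problem. Consider the following code.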

X = series.rename("actual").to_frame() 
X = X.loc[~X.index.duplicated(keep='last')].asfreq('d', 'ffill')

With X built from series like this, it now raises an error saying the index must be monotonic increasing or decreasing.

Best answer

You need to pass a format string as an argument to to_datetime:

In[20]:
series.index = pd.to_datetime(series.index, format='%d-%m')
series.index

Out[20]:
DatetimeIndex(['1900-01-01', '1900-02-01', '1900-03-01', '1900-04-01',
'1900-05-01', '1900-06-01', '1900-07-01', '1900-08-01',
'1900-09-01', '1900-10-01', '1900-11-01', '1900-12-01',
'1900-01-02', '1900-02-02', '1900-03-02', '1900-04-02',
'1900-05-02', '1900-06-02', '1900-07-02', '1900-08-02',
'1900-09-02', '1900-10-02', '1900-11-02', '1900-12-02',
'1900-01-03', '1900-02-03', '1900-03-03', '1900-04-03',
'1900-05-03', '1900-06-03', '1900-07-03', '1900-08-03',
'1900-09-03', '1900-10-03', '1900-11-03', '1900-12-03'],
dtype='datetime64[ns]', name='Month', freq=None)

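By default, to_datetime tries to infer the format and assumes a year-first layout such as YYYY-MM-DD, so the string 01-01 is parsed as January of year 1. That date lies outside the range a nanosecond-resolution timestamp can represent (roughly 1677-09-21 to 2262-04-11), hence the error.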
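To make the range concrete, here is a minimal sketch that reconstructs the approximate bounds using only the standard library. It assumes timestamps are stored as signed 64-bit counts of nanoseconds since 1970-01-01; the helper name `fits_in_ns_range` is made up for illustration and is not part of pandas.

```python
from datetime import datetime, timedelta

# Assumption: a timestamp is a signed 64-bit count of nanoseconds since 1970-01-01.
EPOCH = datetime(1970, 1, 1)
MAX_NS = 2**63 - 1

# datetime only has microsecond precision, so the bounds are truncated slightly.
span = timedelta(microseconds=MAX_NS // 1000)
lower, upper = EPOCH - span, EPOCH + span


def fits_in_ns_range(moment):
    """Hypothetical helper: True if `moment` fits in the nanosecond range."""
    return lower <= moment <= upper


print(lower.date(), upper.date())              # 1677-09-21 2262-04-11
print(fits_in_ns_range(datetime(1, 1, 1)))     # False: year 1 is far too early
print(fits_in_ns_range(datetime(2001, 1, 1)))  # True
```

Any parse that lands on year 1 therefore cannot be stored at nanosecond resolution, whatever the month and day are.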

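If you want a monotonically increasing index (which is what your data actually looks like already), you can simply prepend the string '20' to the index and then convert: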

In[24]:
series.index = '20' + series.index
series.index

Out[24]:
Index(['2001-01', '2001-02', '2001-03', '2001-04', '2001-05', '2001-06',
'2001-07', '2001-08', '2001-09', '2001-10', '2001-11', '2001-12',
'2002-01', '2002-02', '2002-03', '2002-04', '2002-05', '2002-06',
'2002-07', '2002-08', '2002-09', '2002-10', '2002-11', '2002-12',
'2003-01', '2003-02', '2003-03', '2003-04', '2003-05', '2003-06',
'2003-07', '2003-08', '2003-09', '2003-10', '2003-11', '2003-12'],
dtype='object')

In[25]:
series.index = pd.to_datetime(series.index, format='%Y-%m')
series

Out[25]:
2001-01-01 266.0
2001-02-01 145.9
2001-03-01 183.1
2001-04-01 119.3
2001-05-01 180.3
2001-06-01 168.5
2001-07-01 231.8
2001-08-01 224.5
2001-09-01 192.8
2001-10-01 122.9
2001-11-01 336.5
2001-12-01 185.9
2002-01-01 194.3
2002-02-01 149.5
2002-03-01 210.1
2002-04-01 273.3
2002-05-01 191.4
2002-06-01 287.0
2002-07-01 226.0
2002-08-01 303.6
2002-09-01 289.9
2002-10-01 421.6
2002-11-01 264.5
2002-12-01 342.3
2003-01-01 339.7
2003-02-01 440.4
2003-03-01 315.9
2003-04-01 439.3
2003-05-01 401.3
2003-06-01 437.4
2003-07-01 575.5
2003-08-01 407.6
2003-09-01 682.0
2003-10-01 475.3
2003-11-01 581.3
2003-12-01 646.9
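The difference between the two format strings can be checked on a small sample. This is only a sketch: the four labels and values below are a hand-typed excerpt assumed to have the same "YY-MM" shape as the index shown above, not the real CSV file.

```python
import pandas as pd

# Hypothetical excerpt in the same "YY-MM" shape as the index shown above.
labels = pd.Index(["01-11", "01-12", "02-01", "02-02"], name="Month")
sales = pd.Series([336.5, 185.9, 194.3, 149.5], index=labels)

# Reading the labels as day-month puts every row in 1900 and scrambles the order.
as_day_month = pd.to_datetime(labels, format="%d-%m")

# Prepending the century and reading year-month keeps the chronological order.
as_year_month = pd.to_datetime("20" + labels, format="%Y-%m")

print(as_day_month.is_monotonic_increasing)   # False
print(as_year_month.is_monotonic_increasing)  # True

sales.index = as_year_month
print(sales)
```

The first parse avoids the out-of-bounds error but produces dates that jump back and forth within 1900, which is why the year-month form is the one to keep when the labels encode a year and a month.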

Then your code works:

In[28]:
X = series.rename("actual").to_frame()
X = X.loc[~X.index.duplicated(keep='last')].asfreq('d', 'ffill')
X

Out[28]:
actual
2001-01-01 266.0
2001-01-02 266.0
2001-01-03 266.0
2001-01-04 266.0
2001-01-05 266.0
2001-01-06 266.0
2001-01-07 266.0
2001-01-08 266.0
2001-01-09 266.0
2001-01-10 266.0
2001-01-11 266.0
2001-01-12 266.0
2001-01-13 266.0
2001-01-14 266.0
2001-01-15 266.0
2001-01-16 266.0
2001-01-17 266.0
2001-01-18 266.0
2001-01-19 266.0
2001-01-20 266.0
2001-01-21 266.0
2001-01-22 266.0
2001-01-23 266.0
2001-01-24 266.0
2001-01-25 266.0
2001-01-26 266.0
2001-01-27 266.0
2001-01-28 266.0
2001-01-29 266.0
2001-01-30 266.0
...
2003-11-02 581.3
2003-11-03 581.3
2003-11-04 581.3
2003-11-05 581.3
2003-11-06 581.3
2003-11-07 581.3
2003-11-08 581.3
2003-11-09 581.3
2003-11-10 581.3
2003-11-11 581.3
2003-11-12 581.3
2003-11-13 581.3
2003-11-14 581.3
2003-11-15 581.3
2003-11-16 581.3
2003-11-17 581.3
2003-11-18 581.3
2003-11-19 581.3
2003-11-20 581.3
2003-11-21 581.3
2003-11-22 581.3
2003-11-23 581.3
2003-11-24 581.3
2003-11-25 581.3
2003-11-26 581.3
2003-11-27 581.3
2003-11-28 581.3
2003-11-29 581.3
2003-11-30 581.3
2003-12-01 646.9

[1065 rows x 1 columns]
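The same upsampling step can be tried on a tiny frame to see what it does. The sketch below uses three made-up monthly rows (values copied from the first rows of the output above) and spells the arguments as `'D'` and `method='ffill'`; the monotonic check is an added guard, not part of the original answer.

```python
import pandas as pd

# Three monthly observations on a proper DatetimeIndex (sample values only).
idx = pd.to_datetime(["2001-01", "2001-02", "2001-03"], format="%Y-%m")
monthly = pd.Series([266.0, 145.9, 183.1], index=idx, name="actual").to_frame()

# Forward filling needs a sorted index, so check before upsampling.
assert monthly.index.is_monotonic_increasing

# Upsample to daily frequency; each new day repeats the latest monthly value.
daily = monthly.asfreq("D", method="ffill")

print(len(daily))                                          # 60
print(daily.loc[pd.Timestamp("2001-01-31"), "actual"])     # 266.0
print(daily.loc[pd.Timestamp("2001-02-01"), "actual"])     # 145.9
```

The row count is 31 days of January plus 28 of February plus 1 March, the same arithmetic that yields 1065 rows for the full 2001-01-01 to 2003-12-01 range.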

Regarding python - Out of bounds nanosecond timestamp: 1-01-01 00:00:00, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57166570/
