
r - Creating an xts object causes timestamps to change

Reposted. Author: 行者123. Updated: 2023-12-01 15:41:45

Suppose I have:

R> str(data)
'data.frame': 4 obs. of 2 variables:
$ datetime: Factor w/ 4 levels "2011-01-05 09:30:00.001",..: 1 2 3 4
$ price : num 18.3 18.3 18.3 18.3

R> data
datetime price
1 2011-01-05 09:30:00.001 18.31
2 2011-01-05 09:30:00.321 18.33
3 2011-01-05 09:30:01.511 18.33
4 2011-01-05 09:30:02.192 18.34

When I try to load this into an xts object, the timestamps are subtly altered:
R> x <- xts(data[-1], as.POSIXct(strptime(data$datetime, '%Y-%m-%d %H:%M:%OS')))
R> str(x)
An ‘xts’ object from 2011-01-05 09:30:00.000 to 2011-01-05 09:30:02.191 containing:
Data: num [1:4, 1] 18.3 18.3 18.3 18.3
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "price"
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
NULL

R> x
price
2011-01-05 09:30:00.000 18.31
2011-01-05 09:30:00.321 18.33
2011-01-05 09:30:01.510 18.33
2011-01-05 09:30:02.191 18.34

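Notice that the timestamps have changed. The first entry now appears at 09:30:00.000 instead of 09:30:00.001 as in the original data. The third and fourth rows are also off by one millisecond.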

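What is causing this? Am I doing something fundamentally wrong? I have tried various ways of getting the data into an xts object, and they all seem to show this behavior.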

EDIT: added sessionInfo()
R> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] xts_0.8-2 zoo_1.7-4

loaded via a namespace (and not attached):
[1] grid_2.13.1 lattice_0.19-30 tools_2.13.1

EDIT 2: If I modify the source data to have microsecond precision, like this:
datetime,price
2011-01-05 09:30:00.001000,18.31
2011-01-05 09:30:00.321000,18.33
2011-01-05 09:30:01.511000,18.33
2011-01-05 09:30:02.192000,18.34

and then load it, so that I have:
R> test
datetime price
1 2011-01-05 09:30:00.001000 18.31
2 2011-01-05 09:30:00.321000 18.33
3 2011-01-05 09:30:01.511000 18.33
4 2011-01-05 09:30:02.192000 18.34

and finally convert it to an xts object and set the index format:
R> x <- xts(test[,-1], as.POSIXct(strptime(test$datetime, '%Y-%m-%d %H:%M:%OS')))
R> indexFormat(x) <- '%Y-%m-%d %H:%M:%OS6'
R> x
[,1]
2011-01-05 09:30:00.000999 18.31
2011-01-05 09:30:00.321000 18.33
2011-01-05 09:30:01.510999 18.33
2011-01-05 09:30:02.191999 18.34

The effect is visible here too. I had hoped that the extra precision would help, but unfortunately it does not.

EDIT 3: See @DWin's answer for an end-to-end test case that reproduces this behavior.

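EDIT 4: The behavior does not appear to be specific to milliseconds. The following shows the same altered result with microsecond-resolution timestamps. If I change the input data to: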
R> data
datetime price
1 2011-01-05 09:30:00.001001 18.31
2 2011-01-05 09:30:00.321001 18.33
3 2011-01-05 09:30:01.511001 18.33
4 2011-01-05 09:30:02.192005 18.34

and then create an xts object:
R> x <- xts(data[-1], 
as.POSIXct(strptime(as.character(data$datetime), '%Y-%m-%d %H:%M:%OS')))
R> indexFormat(x) <- '%Y-%m-%d %H:%M:%OS6'
R> x
price
2011-01-05 09:30:00.001000 18.31
2011-01-05 09:30:00.321001 18.33
2011-01-05 09:30:01.511001 18.33
2011-01-05 09:30:02.192004 18.34

EDIT 5: This looks like a floating-point precision issue. Observe:
R> t <- as.POSIXct("2011-01-05 09:30:00.001001")
R> t
[1] "2011-01-05 09:30:00.001 CST"
R> as.numeric(t)
[1] 1294241400.0010008812

This exhibits the erroneous behavior and is consistent with the example in EDIT 4. However, taking an example that did not show the error:
R> t <- as.POSIXct("2011-01-05 09:30:01.511001")
R> t
[1] "2011-01-05 09:30:01.511001 CST"
R> as.numeric(t)
[1] 1294241401.5110011101

It looks as though xts, or some underlying component, is truncating rather than rounding to the nearest value?
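The two cases in EDIT 5 can be compared directly. The sketch below uses only base R and assumes the epoch seconds printed above; it extracts the fractional part that the double actually stores and checks whether it falls below the intended decimal value. The variable names (`secs`, `intended`, `stored`, `cmp`) are illustrative only.

```r
# Epoch seconds taken from the as.numeric() output in EDIT 5.
secs     <- c(1294241400.001001, 1294241401.511001)
intended <- c(0.001001, 0.511001)

# Fractional seconds as actually stored in the double.
stored <- secs - floor(secs)

# A stored value just below the intended one loses its last digit
# when the fractional seconds are truncated for display.
cmp <- data.frame(
  intended       = sprintf("%.6f", intended),
  stored         = sprintf("%.10f", stored),
  below_intended = stored < intended
)
print(cmp)
```

The first value is stored slightly below .001001 and the second slightly above .511001, which matches which timestamps display "wrong" when the fractional part is cut off rather than rounded.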

Best answer

It looks like the problem is only in the printing. Using the OP's original data:

ind <- as.POSIXct(strptime(data$datetime, '%Y-%m-%d %H:%M:%OS'))
as.numeric(ind)*1e6 # as expected
# [1] 1294241400001000 1294241400321000 1294241401511000 1294241402192000
ind # wrong
# [1] "2011-01-05 09:30:00.000 CST" "2011-01-05 09:30:00.321 CST"
# [3] "2011-01-05 09:30:01.510 CST" "2011-01-05 09:30:02.191 CST"
x <- xts(data[-1], ind)
x # wrong
# price
# 2011-01-05 09:30:00.000 18.31
# 2011-01-05 09:30:00.321 18.33
# 2011-01-05 09:30:01.510 18.33
# 2011-01-05 09:30:02.191 18.34
as.numeric(index(x))*1e6 # but the underlying index values are as expected
# [1] 1294241400001000 1294241400321000 1294241401511000 1294241402192000
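If the displayed milliseconds matter, one option is to round numerically before building the string, so the result does not depend on how the fractional part happens to be stored. This is a minimal base-R sketch: `format_ms` is a hypothetical helper (not part of xts or base R), the epoch values are the ones shown above, and UTC is assumed only to keep the output reproducible, so the hour prints as 15 rather than 09.

```r
# Hypothetical helper: format POSIXct values to the nearest millisecond.
format_ms <- function(x, tz = "UTC") {
  ms_total <- round(as.numeric(x) * 1000)  # nearest whole millisecond
  whole    <- ms_total %/% 1000            # whole seconds since the epoch
  ms       <- ms_total %% 1000             # millisecond remainder
  base <- format(as.POSIXct(whole, origin = "1970-01-01", tz = tz),
                 "%Y-%m-%d %H:%M:%S", tz = tz)
  sprintf("%s.%03d", base, as.integer(ms))
}

# Epoch seconds corresponding to the OP's four timestamps.
ind <- as.POSIXct(c(1294241400.001, 1294241400.321,
                    1294241401.511, 1294241402.192),
                  origin = "1970-01-01", tz = "UTC")
out <- format_ms(ind)
print(out)
```

This only changes how the values are shown; the index stored in the object is left untouched.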

Regarding "r - Creating an xts object causes timestamps to change", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/7341857/
