
r - Non-equi join on dates using data.table

Reposted. Author: 行者123. Updated: 2023-12-04 10:57:48

I have a data table of edits:

library(data.table)

edits <- data.table(proposal=c('A','A','A'),
                    editField=c('probability','probability','probability'),
                    startDate=as.POSIXct(c('2017-04-14 00:00:00','2019-09-06 12:12:00','2018-10-10 15:47:00')),
                    endDate=as.POSIXct(c('2019-09-06 12:12:00','2018-10-10 15:47:00','9999-12-31 05:00:00')),
                    value=c(.1,.3,.1))

   proposal   editField           startDate             endDate value
1:        A probability 2017-04-14 00:00:00 2019-09-06 12:12:00   0.1
2:        A probability 2019-09-06 12:12:00 2018-10-10 15:47:00   0.3
3:        A probability 2018-10-10 15:47:00 9999-12-31 05:00:00   0.1

I want to join it to a data table of events:

events <- data.table(proposal='A',
                     editDate=as.POSIXct(c('2017-04-14 00:00:00','2019-09-06 12:12:00','2019-09-06 12:12:00',
                                           '2019-09-06 12:12:00','2018-07-04 15:33:59','2018-07-27 08:01:00',
                                           '2018-10-10 15:47:00','2018-10-10 15:47:00','2018-10-10 15:47:00',
                                           '2018-11-26 11:10:00','2019-02-05 13:06:59')),
                     editField=c('Created','stage','probability','estOrder','estOrder','estOrder',
                                 'stage','probability','estOrder','estOrder','estOrder'))

    proposal            editDate   editField
 1:        A 2017-04-14 00:00:00     Created
 2:        A 2019-09-06 12:12:00       stage
 3:        A 2019-09-06 12:12:00 probability
 4:        A 2019-09-06 12:12:00    estOrder
 5:        A 2018-07-04 15:33:59    estOrder
 6:        A 2018-07-27 08:01:00    estOrder
 7:        A 2018-10-10 15:47:00       stage
 8:        A 2018-10-10 15:47:00 probability
 9:        A 2018-10-10 15:47:00    estOrder
10:        A 2018-11-26 11:10:00    estOrder
11:        A 2019-02-05 13:06:59    estOrder

The goal is the output below, where value is the probability that was in effect at the time each edit happened:

desired.join <- cbind(events, value=c(.1,.3,.3,.3,.3,.3,.3,.1,.1,.1,.1))

    proposal            editDate   editField value
 1:        A 2017-04-14 00:00:00     Created   0.1
 2:        A 2019-09-06 12:12:00       stage   0.3
 3:        A 2019-09-06 12:12:00 probability   0.3
 4:        A 2019-09-06 12:12:00    estOrder   0.3
 5:        A 2018-07-04 15:33:59    estOrder   0.3
 6:        A 2018-07-27 08:01:00    estOrder   0.3
 7:        A 2018-10-10 15:47:00       stage   0.3
 8:        A 2018-10-10 15:47:00 probability   0.1
 9:        A 2018-10-10 15:47:00    estOrder   0.1
10:        A 2018-11-26 11:10:00    estOrder   0.1
11:        A 2019-02-05 13:06:59    estOrder   0.1

This is how I have tried to join the two so far:
edits[editField=='probability'][events, on=.(proposal, startDate<=editDate, endDate>editDate)]

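However, when I try this, I get the error message: "vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
Join results in 16 rows; more than 14 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice."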

Best answer

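It looks like you are trying to join edits and events so that each probability value in the edits table is matched to the correct observation in the events table.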

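The error appears to occur because the time intervals used to build the edits table are not mutually exclusive, so some events fall into more than one interval. When I changed the intervals to what I believe you intended, your code ran and gave the kind of result you are looking for, with one probability value per event.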
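As a quick way to see which rows break that assumption, here is a minimal sketch that flags problem intervals in the original edits table. The helper columns prevEnd, inverted and overlapsPrev are illustrative names, not part of the question, and the check assumes intervals are half-open (start inclusive, end exclusive), as in the join condition.

```r
library(data.table)

# The edits table exactly as posted in the question
edits <- data.table(
  proposal  = c('A', 'A', 'A'),
  editField = c('probability', 'probability', 'probability'),
  startDate = as.POSIXct(c('2017-04-14 00:00:00', '2019-09-06 12:12:00', '2018-10-10 15:47:00')),
  endDate   = as.POSIXct(c('2019-09-06 12:12:00', '2018-10-10 15:47:00', '9999-12-31 05:00:00')),
  value     = c(.1, .3, .1)
)

# Sort chronologically within each proposal/field
setorder(edits, proposal, editField, startDate)

# End of the previous interval in the same group (NA for the first row)
edits[, prevEnd := shift(endDate), by = .(proposal, editField)]

# An interval is inverted if it ends at or before its own start
edits[, inverted := endDate <= startDate]

# An interval overlaps its predecessor if it starts before the predecessor ends
edits[, overlapsPrev := !is.na(prevEnd) & startDate < prevEnd]

print(edits[, .(startDate, endDate, value, inverted, overlapsPrev)])
```

Any row with inverted or overlapsPrev set to TRUE is a candidate for the duplicate matches that push the join past nrow(x)+nrow(i).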

library(data.table)

# Corrected edits: the intervals now run in chronological order and do not overlap
edits <- data.table(proposal=c('A','A','A'),
                    editField=c('probability','probability','probability'),
                    startDate=as.POSIXct(c('2017-04-14 00:00:00','2018-10-10 15:47:00','2019-09-06 12:12:00')),
                    endDate=as.POSIXct(c('2018-10-10 15:47:00','2019-09-06 12:12:00','9999-12-31 05:00:00')),
                    value=c(.1,.3,.1))

events <- data.table(proposal='A',
                     editDate=as.POSIXct(c('2017-04-14 00:00:00','2019-09-06 12:12:00','2019-09-06 12:12:00',
                                           '2019-09-06 12:12:00','2018-07-04 15:33:59','2018-07-27 08:01:00',
                                           '2018-10-10 15:47:00','2018-10-10 15:47:00','2018-10-10 15:47:00',
                                           '2018-11-26 11:10:00','2019-02-05 13:06:59')),
                     editField=c('Created','stage','probability','estOrder','estOrder','estOrder',
                                 'stage','probability','estOrder','estOrder','estOrder'))

edits[editField=='probability'][events, on=.(proposal, startDate<=editDate, endDate>editDate)]

Or you can do the join without chaining, since every row of edits is already a probability edit:

edits[events, on=.(proposal, startDate<=editDate, endDate>editDate)]
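One thing to be aware of when reading the result: in a data.table non-equi join the join columns keep the names from the left table (startDate, endDate) but hold the values from the right table (editDate). The sketch below, which uses a shortened events table for brevity, selects columns explicitly with the x. and i. prefixes so the output looks like the events table plus value. The choice of output columns is an assumption about what is wanted, not something stated in the answer.

```r
library(data.table)

# Corrected, non-overlapping intervals from the answer
edits <- data.table(
  proposal  = 'A',
  editField = 'probability',
  startDate = as.POSIXct(c('2017-04-14 00:00:00', '2018-10-10 15:47:00', '2019-09-06 12:12:00')),
  endDate   = as.POSIXct(c('2018-10-10 15:47:00', '2019-09-06 12:12:00', '9999-12-31 05:00:00')),
  value     = c(.1, .3, .1)
)

# A shortened events table (four of the eleven events, for illustration)
events <- data.table(
  proposal  = 'A',
  editDate  = as.POSIXct(c('2017-04-14 00:00:00', '2018-10-10 15:47:00',
                           '2019-02-05 13:06:59', '2019-09-06 12:12:00')),
  editField = c('Created', 'probability', 'estOrder', 'stage')
)

# i.* refers to columns of events, x.* to columns of edits
res <- edits[events,
             on = .(proposal, startDate <= editDate, endDate > editDate),
             .(proposal,
               editDate  = i.editDate,
               editField = i.editField,
               value     = x.value)]

print(res)
```

Because the intervals are half-open, an event stamped exactly at a boundary such as 2018-10-10 15:47:00 is assigned to the interval that starts at that moment.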

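Or you can follow Jonny Phelps's suggestion and use foverlaps, but this also requires mutually exclusive time intervals in the edits table:

# foverlaps needs a start and an end column on both sides,
# so each event is treated as a zero-length interval
events[, startDate := editDate]
setkey(events, startDate, editDate)
setkey(edits, startDate, endDate)
foverlaps(events, edits, type="any", mult="first")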
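A caveat when comparing the two approaches: foverlaps treats intervals as closed on both ends, so an event stamped exactly at a boundary overlaps two adjacent intervals, and mult="first" then keeps the earlier one. The non-equi join above (startDate<=editDate, endDate>editDate) would keep the later one. The following sketch shows one possible way to make foverlaps behave like the half-open join, assuming timestamps have one-second resolution. The columns evStart, evEnd and endClosed are illustrative names introduced here.

```r
library(data.table)

edits <- data.table(
  proposal  = 'A',
  startDate = as.POSIXct(c('2017-04-14 00:00:00', '2018-10-10 15:47:00', '2019-09-06 12:12:00')),
  endDate   = as.POSIXct(c('2018-10-10 15:47:00', '2019-09-06 12:12:00', '9999-12-31 05:00:00')),
  value     = c(.1, .3, .1)
)

events <- data.table(
  proposal = 'A',
  editDate = as.POSIXct(c('2018-07-04 15:33:59', '2018-10-10 15:47:00', '2019-02-05 13:06:59'))
)

# Each event becomes a zero-length interval [editDate, editDate]
events[, evStart := editDate][, evEnd := editDate]

# Assumption: one-second resolution, so subtracting one second turns the
# half-open interval [startDate, endDate) into the closed [startDate, endClosed]
edits[, endClosed := endDate - 1]

# The keyed table needs its interval columns as the last two key columns
setkey(edits, proposal, startDate, endClosed)

res <- foverlaps(events, edits,
                 by.x = c('proposal', 'evStart', 'evEnd'),
                 type = 'any', mult = 'first')

print(res[, .(proposal, editDate, value)])
```

With the shifted end column, the event at 2018-10-10 15:47:00 falls only in the interval that starts at that moment, which is the same assignment the non-equi join makes.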

For more on r - non-equi join on dates using data.table, see the similar question on Stack Overflow: https://stackoverflow.com/questions/59074525/
