
r - data.table conditional non-equi join

Reposted. Author: 行者123. Updated: 2023-12-04 09:34:25

I have two sample datasets:

> aDT
   col1 col2 ExtractDate
1:    1    A  2017-01-01
2:    1    A  2016-01-01
3:    2    B  2015-01-01
4:    2    B  2014-01-01
> bDT
   col1 col2   date_pol Value
1:    1    A 2017-05-20     1
2:    1    A 2016-05-20     2
3:    1    A 2015-05-20     3
4:    2    B 2014-05-20     4

I need:
> cDT
   col1 col2 ExtractDate   date_pol Value
1:    1    A  2017-01-01 2016-05-20     2
2:    1    A  2016-01-01 2015-05-20     3
3:    2    B  2015-01-01 2014-05-20     4
4:    2    B  2014-01-01         NA    NA

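Basically, aDT should be left-joined to bDT on col1, col2, and ExtractDate >= date_pol, keeping only the first match (i.e. the highest date_pol). A Cartesian join is not an option because of memory constraints.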

Notes:
Generating the sample datasets
library(data.table)
library(lubridate)  # provides ymd()

aDT <- data.table(col1 = c(1,1,2,2), col2 = c("A","A","B","B"), ExtractDate = c("2017-01-01","2016-01-01","2015-01-01","2014-01-01"))
bDT <- data.table(col1 = c(1,1,1,2), col2 = c("A","A","A","B"), date_pol = c("2017-05-20","2016-05-20","2015-05-20","2014-05-20"), Value = c(1,2,3,4))
cDT <- data.table(col1 = c(1,1,2,2), col2 = c("A","A","B","B"), ExtractDate = c("2017-01-01","2016-01-01","2015-01-01","2014-01-01")
,date_pol = c("2016-05-20","2015-05-20","2014-05-20",NA), Value = c(2,3,4,NA))

# Convert the date strings to Date
aDT[,ExtractDate := ymd(ExtractDate)]
bDT[,date_pol := ymd(date_pol)]
# Sort newest first, by reference; a bare aDT[order(-ExtractDate)] only prints a sorted copy
setorder(aDT, -ExtractDate)
setorder(bDT, -date_pol)

I tried:
aDT[, c("date_pol", "Value") :=
bDT[aDT,
.(date_pol, Value)
,on = .(date_pol <= ExtractDate
,col1 = col1
,col2 = col2)
,mult = "first"]]

But the result is a bit odd:
> aDT
   col1 col2 ExtractDate   date_pol Value  ## date_pol values not right
1:    1    A  2017-01-01 2017-01-01     2
2:    1    A  2016-01-01 2016-01-01     3
3:    2    B  2015-01-01 2015-01-01     4
4:    2    B  2014-01-01 2014-01-01    NA

Best answer

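When i is a data.table, the columns of i can be referred to in j with the prefix i., e.g. X[Y, .(val, i.val)]. Here val refers to X's column and i.val to Y's. Columns of x can likewise be referred to with the prefix x., which is particularly useful during a join for reaching x's join columns, since they are otherwise masked by i's. For example, X[Y, .(x.a-i.a, b), on="a"]. That masking is why date_pol in the attempt above shows the ExtractDate values.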
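The prefix behaviour described above can be seen on a small example. This is only a sketch: the tables X and Y and their columns a and val are hypothetical toy data, not part of the question.

```r
library(data.table)

# Hypothetical toy tables (not from the question)
X <- data.table(a = c(1, 2, 3), val = c(10, 20, 30))
Y <- data.table(a = c(2, 3, 4), val = c(200, 300, 400))

# Equi-join: val is X's column, i.val is Y's column; one row per row of Y
res <- X[Y, .(a, val, i.val), on = "a"]

# Non-equi join: the unprefixed join column a carries Y's values,
# while x.a recovers X's own values
neq <- X[Y, .(a, x.a, i.a), on = .(a <= a)]
print(res)
print(neq)
```

In the non-equi result, the unprefixed a and i.a should be identical, while x.a holds the matched values from X.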

bDT[aDT, .(col1, col2, i.ExtractDate, x.date_pol, Value),
on = .(date_pol <= ExtractDate, col1 = col1, col2 = col2),
mult = "first"]

Output
   col1 col2 i.ExtractDate x.date_pol Value
1:    1    A    2017-01-01 2016-05-20     2
2:    1    A    2016-01-01 2015-05-20     3
3:    2    B    2015-01-01 2014-05-20     4
4:    2    B    2014-01-01       <NA>    NA
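If the goal is to add the two columns to aDT by reference, as in the original attempt, the same join can be wrapped in an update. The following is a minimal sketch rather than part of the answer. It rebuilds the question's sample data, uses base as.Date in place of ymd so that no extra package is needed, and assumes bDT is sorted by descending date_pol so that mult = "first" picks the latest qualifying date.

```r
library(data.table)

# Sample data from the question, with dates parsed by as.Date (an assumption;
# the question uses lubridate::ymd)
aDT <- data.table(col1 = c(1, 1, 2, 2), col2 = c("A", "A", "B", "B"),
                  ExtractDate = as.Date(c("2017-01-01", "2016-01-01",
                                          "2015-01-01", "2014-01-01")))
bDT <- data.table(col1 = c(1, 1, 1, 2), col2 = c("A", "A", "A", "B"),
                  date_pol = as.Date(c("2017-05-20", "2016-05-20",
                                       "2015-05-20", "2014-05-20")),
                  Value = c(1, 2, 3, 4))

# mult = "first" takes the first matching row of bDT, so sort newest first
setorder(bDT, -date_pol)

# The join returns one row per row of aDT; x.date_pol keeps bDT's own dates
aDT[, c("date_pol", "Value") :=
      bDT[aDT, .(x.date_pol, x.Value),
          on = .(col1, col2, date_pol <= ExtractDate),
          mult = "first"]]
print(aDT)
```

Because mult = "first" yields at most one match per row of aDT, the intermediate result never grows beyond nrow(aDT), which is what the memory constraint in the question requires.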

For more on r - data.table conditional non-equi join, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47524918/
