
r - Non-joins with data.tables

Reposted. Author: 行者123. Updated: 2023-12-03 21:24:26

I have a question about the data.table "non-join" idiom, inspired by Iterator's question. Here is an example:

library(data.table)

dt1 <- data.table(A1=letters[1:10], B1=sample(1:5,10, replace=TRUE))
dt2 <- data.table(A2=letters[c(1:5, 11:15)], B2=sample(1:5,10, replace=TRUE))

setkey(dt1, A1)
setkey(dt2, A2)
The data.tables look like this:
> dt1              > dt2
      A1 B1              A2 B2
 [1,]  a  1         [1,]  a  2
 [2,]  b  4         [2,]  b  5
 [3,]  c  2         [3,]  c  2
 [4,]  d  5         [4,]  d  1
 [5,]  e  1         [5,]  e  1
 [6,]  f  2         [6,]  k  5
 [7,]  g  3         [7,]  l  2
 [8,]  h  3         [8,]  m  4
 [9,]  i  2         [9,]  n  1
[10,]  j  4        [10,]  o  1

To find which rows of dt2 have the same key in dt1, set the which option to TRUE:
> dt1[dt2, which=TRUE]
[1] 1 2 3 4 5 NA NA NA NA NA
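Because the question's tables use sample(), the B columns differ from run to run. The following is a minimal reproducible sketch of the same lookup, assuming a current data.table release and using fixed values (1:10) in place of the random columns; only the key columns matter for the result.

```r
library(data.table)

# Same key columns as in the question; B1/B2 are fixed here so the run is deterministic.
dt1 <- data.table(A1 = letters[1:10], B1 = 1:10)
dt2 <- data.table(A2 = letters[c(1:5, 11:15)], B2 = 1:10)
setkey(dt1, A1)
setkey(dt2, A2)

# For each row of dt2, the matching row number in dt1, or NA when there is no match.
idx <- dt1[dt2, which = TRUE]
print(idx)
```

The five NA entries correspond to the keys k through o, which exist only in dt2.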

In this answer, Matthew suggests a "non-join" idiom
dt1[-dt1[dt2, which=TRUE]]

to subset dt1 to those rows whose keys do not appear in dt2. On my machine, with data.table v1.7.1, I get an error:
Error in `[.default`(x[[s]], irows): only 0's may be mixed with negative subscripts
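The error message mentions `[.default`, which suggests the negated index reaches ordinary R subsetting with the NA entries still in it. The base R sketch below reproduces that situation without data.table; the vector names are illustrative, and the exact wording of the error differs between R versions, so only the fact that an error occurs is checked.

```r
x <- letters[1:10]

# A which=TRUE style result: positions for matches, NA for "no match".
idx <- c(1:5, NA, NA)

# Negating keeps the NAs, and base R refuses to mix NA with negative subscripts.
res <- tryCatch(x[-idx], error = function(e) e)
print(inherits(res, "error"))

# Dropping the NAs first gives a valid negative index.
clean <- idx[!is.na(idx)]
print(x[-clean])
```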

With the option nomatch=0 instead, the "non-join" works:
> dt1[-dt1[dt2, which=TRUE, nomatch=0]]
A1 B1
[1,] f 2
[2,] g 3
[3,] h 3
[4,] i 2
[5,] j 4
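The working version can be checked step by step. This is a sketch with fixed B columns (an assumption, replacing the question's sample() calls) and a current data.table release: nomatch=0 drops the unmatched rows of dt2 from the result, so the index contains no NA and can be negated safely.

```r
library(data.table)

dt1 <- data.table(A1 = letters[1:10], B1 = 1:10, key = "A1")
dt2 <- data.table(A2 = letters[c(1:5, 11:15)], B2 = 1:10, key = "A2")

# Row numbers of dt1 that have a partner in dt2; unmatched dt2 rows are dropped.
matched <- dt1[dt2, which = TRUE, nomatch = 0]
print(matched)

# Negative indexing then removes exactly those rows from dt1.
not_in_dt2 <- dt1[-matched]
print(not_in_dt2)
```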

Is this the intended behaviour?

Best answer

New in v1.8.3:

A new "!" prefix on i signals 'not-join' (a.k.a. 'not-where'), #1384.
DT[-DT["a", which=TRUE, nomatch=0]] # old not-join idiom, still works
DT[!"a"] # same result, now preferred.
DT[!J(6),...] # !J == not-join
DT[!2:3,...] # ! on all types of i
DT[colA!=6L | colB!=23L,...] # multiple vector scanning approach
DT[!J(6L,23L)] # same result, faster binary search
'!' has been used rather than '-' :
* to match the 'not-join' and 'not-where' nomenclature
* with '-', DT[-0] would return DT rather than DT[0] and not be backwards
compatible. With '!', DT[!0] returns DT both before (since !0 is TRUE in
base R) and after this new feature.
* to leave DT[+...] and DT[-...] available for future use
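Applied to the tables from the question, the "!" prefix might be used as sketched below. The B columns are fixed values rather than the question's random ones (an assumption made for reproducibility), and the second form uses the on= argument of later data.table releases, which is not part of the v1.8.3 note quoted above.

```r
library(data.table)

dt1 <- data.table(A1 = letters[1:10], B1 = 1:10, key = "A1")
dt2 <- data.table(A2 = letters[c(1:5, 11:15)], B2 = 1:10, key = "A2")

# Not-join on dt1's key: rows of dt1 whose key has no match in dt2.
res_key <- dt1[!dt2]
print(res_key)

# The same not-join with the join columns named explicitly (later releases).
res_on <- dt1[!dt2, on = c(A1 = "A2")]
print(res_on)
```

Both forms avoid computing row numbers by hand, so there is no NA index to handle.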

Regarding r - non-joins with data.tables, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/7920688/
