
r - data.table inner/outer join with NA in a join column of type double: bug?

Reposted. Author: 行者123. Updated: 2023-12-04 02:05:35

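Following this Wikipedia article on SQL join, I would like to get a clear picture of how joins are done with data.table.
Along the way, we may have found a bug when joining on NA.
Taking the wiki example: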

R) X = data.table(name=c("Raf","Jon","Ste","Rob","Smi","Joh"),depID=c(31,33,33,34,34,NA),key="depID")
R) Y = data.table(depID=c(31,33,34,35),depName=c("Sal","Eng","Cle","Mar"),key="depID")
R) X
   name depID
1:  Joh    NA
2:  Raf    31
3:  Jon    33
4:  Ste    33
5:  Rob    34
6:  Smi    34
R) Y
   depID depName
1:    31     Sal
2:    33     Eng
3:    34     Cle
4:    35     Mar

Left outer join
R) merge.data.frame(X,Y,all.x=TRUE)
  depID name depName
1    31  Raf     Sal
2    33  Jon     Eng
3    33  Ste     Eng
4    34  Rob     Cle
5    34  Smi     Cle
6    NA  Joh    <NA>
merge.data.table does not output the same result, and shows what I believe is a bug on line 2.
R) merge(X,Y,all.x=TRUE)
   depID name depName
1:    NA  Joh     Eng
2:    31  Raf      NA
3:    33  Jon     Eng
4:    33  Ste     Eng
5:    34  Rob     Cle
6:    34  Smi     Cle
R) Y[X] #same -> :(
   depID depName name
1:    NA     Eng  Joh
2:    31      NA  Raf
3:    33     Eng  Jon
4:    33     Eng  Ste
5:    34     Cle  Rob
6:    34     Cle  Smi

Right outer join
The same thing seems to happen here:
R) merge.data.frame(X,Y,all.y=TRUE)
  depID name depName
1    31  Raf     Sal
2    33  Jon     Eng
3    33  Ste     Eng
4    34  Rob     Cle
5    34  Smi     Cle
6    35 <NA>     Mar

R) merge(X,Y,all.y=TRUE)
   depID name depName
1:    NA  Joh     Eng
2:    31   NA     Sal
3:    33  Jon     Eng
4:    33  Ste     Eng
5:    34  Rob     Cle
6:    34  Smi     Cle
7:    35   NA     Mar

Inner (natural) join
R) merge.data.frame(X,Y)
  depID name depName
1    31  Raf     Sal
2    33  Jon     Eng
3    33  Ste     Eng
4    34  Rob     Cle
5    34  Smi     Cle
R) merge(X,Y)
   depID name depName
1:    NA  Joh     Eng
2:    33  Jon     Eng
3:    33  Ste     Eng
4:    34  Rob     Cle
5:    34  Smi     Cle
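
For reference, the three joins above can also be written with data.table's bracket syntax. The following is a minimal sketch, assuming a recent data.table release in which NA in a double key no longer disturbs the join (the outputs shown in the question come from an affected version). The variable names `left_join`, `right_join` and `inner_join` are only illustrative, and `nomatch = NULL` assumes a reasonably new version (older releases used `nomatch = 0`).

```r
library(data.table)

# Same example tables as in the question; depID is a double column containing NA
X <- data.table(name  = c("Raf", "Jon", "Ste", "Rob", "Smi", "Joh"),
                depID = c(31, 33, 33, 34, 34, NA),
                key   = "depID")
Y <- data.table(depID   = c(31, 33, 34, 35),
                depName = c("Sal", "Eng", "Cle", "Mar"),
                key     = "depID")

# Left outer join: keep every row of X, look each depID up in Y
left_join <- Y[X]

# Right outer join: keep every row of Y, look each depID up in X
right_join <- X[Y]

# Inner join: drop rows of Y that have no match in X
inner_join <- X[Y, nomatch = NULL]

# Sanity checks against what merge.data.frame reports in the question
stopifnot(nrow(left_join) == 6L)
stopifnot(is.na(left_join[name == "Joh", depName]))   # NA depID matches nothing
stopifnot(left_join[name == "Raf", depName] == "Sal")
stopifnot(nrow(right_join) == 6L)
stopifnot(is.na(right_join[depName == "Mar", name]))  # depID 35 has no employee
stopifnot(nrow(inner_join) == 5L)
```

If any of the `stopifnot()` calls fails, the installed data.table version may still be affected by the behaviour described in this question.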

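Best answer

Yes, it looks like a new (and embarrassing) bug related to NA in the key. There have been other discussions about NA in a key not being possible, but I hadn't realised it could mess things up in this way. Will investigate. Thanks ...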

#2453 NA in double key column messes up joins (NA in integer and character ok)

Now fixed in 1.8.7 (commit 780), from NEWS:

NA in a join column of type double could cause both X[Y] and merge(X,Y) to return incorrect results, #2453. Due to an errant x==NA_REAL in the C source which should have been ISNA(x). Support for double in keyed joins is a relatively recent addition to data.table, but embarassing all the same. Fixed and tests added. Many thanks to statquant for the thorough and reproducible report.
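
The NEWS entry traces the bug to an equality test against NA where a dedicated missingness check was needed. The snippet below is only an R-level analogy, not the data.table C source: in C, `NA_REAL` is a NaN value, so `x == NA_REAL` can never be true, while in R the same comparison returns NA rather than TRUE. In both cases a dedicated test (`ISNA(x)` in C, `is.na(x)` in R) is the way to detect a missing value. The vector `x` is a made-up example.

```r
# Hypothetical double vector with one missing value
x <- c(31, NA, 33)

# Equality against NA never yields TRUE; every element of the result is NA
eq_result <- x == NA_real_

# is.na() is the reliable test for missingness
na_result <- is.na(x)

print(eq_result)
print(na_result)
```

This is why a join routine that compares key values with `==` cannot recognise NA entries and needs an explicit missingness check instead.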

Regarding "r - data.table inner/outer join with NA in a join column of type double: bug?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14076065/
