
r - left outer join with data.table with different names for key variables

Reposted. Author: 行者123. Updated: 2023-12-04 10:57:52

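Let me start by saying that this is the first question I have posted on Stack Overflow. Please let me know if I need to change the style, format, etc. of the question.

I would like to perform a left outer join on two data tables, with the condition that I can use different names for the key variables in the two tables. Example: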

DT1 = data.table(x1=c("b","c", "a", "b", "a", "b"),   x2a=1:6,m1=seq(10,60,by=10))
setkey(DT1,x1,x2a)
> DT1
   x1 x2a m1
1:  a   3 30
2:  a   5 50
3:  b   1 10
4:  b   4 40
5:  b   6 60
6:  c   2 20
DT2 = data.table(x1=c("b","d", "c", "b","a","a"),x2b=c(1,4,7,6," "," "),m2=5:10)
setkey(DT2,x1,x2b)
> DT2
   x1 x2b m2
1:  a      9
2:  a     10
3:  b   1  5
4:  b   6  8
5:  c   7  7
6:  d   4  6
############# first, I use the merge operation on the data frames to do a left outer join
dfL<-merge.data.frame(DT1,DT2,by.x=c('x1','x2a'),by.y=c('x1','x2b'),all.x=TRUE)
> dfL
  x1 x2a m1 m2
1  a   3 30 NA
2  a   5 50 NA
3  b   1 10  5
4  b   4 40 NA
5  b   6 60  8
6  c   2 20 NA
################# attempt with data table left outer join
> dtL<-DT2[DT1,on=c("x1","x2a")]
Error in forderv(x, by = rightcols) :
'by' value -2147483648 out of range [1,3]

#################### code that works with data table
DT1 = data.table(x1=c("b","c", "a", "b", "a", "b"), x2=as.character(1:6),m1=seq(10,60,by=10))
setkey(DT1,x1,x2)
DT1
DT2 = data.table(x1=c("b","d", "c", "b","a","a"),x2=c(1,4,7,6," "," ") ,m2=5:10)
setkey(DT2,x1,x2)
DT2
dtL<-DT2[DT1]
######################## this requires identical names for the key variables in the two data tables
################### it also does not allow an ad hoc selection of the key variables with the "on" argument

I would like to know whether the flexibility of the data frame merge command can be kept when working with data tables.

Best answer

From ?data.table::merge:

This merge method for data.table behaves very similarly to that of data.frames with one major exception: By default, the columns used to merge the data.tables are the shared key columns rather than the shared columns with the same names. Set the by, or by.x, by.y arguments explicitly to override this default.
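To make the quoted default concrete, here is a minimal sketch. The tables A and B and their columns are hypothetical and are not part of the question. Both tables are keyed on id and also share a non-key column grp, so the default merge should use only id, while an explicit by overrides that.

```r
library(data.table)

# Hypothetical tables: both keyed on "id", both also carrying a "grp" column
A <- data.table(id = 1:3, grp = c("p", "q", "r"), a = c(10, 20, 30), key = "id")
B <- data.table(id = c(1L, 2L, 4L), grp = c("p", "z", "r"), b = c(0.1, 0.2, 0.4), key = "id")

# Default: merge on the shared key column only, so "grp" is kept from both sides
by_key <- merge(A, B)

# Explicit by: merge on both shared column names instead of the key
by_names <- merge(A, B, by = c("id", "grp"))

print(names(by_key))
print(by_names)
```

With the default, grp is not a join column and so appears twice with the usual .x and .y suffixes. With the explicit by, it becomes part of the join condition.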



So we can use the by arguments to override the keys:
library(data.table)

DT1 = data.table(x1=c("b","c", "a", "b", "a", "b"), x2a=1:6,m1=seq(10,60,by=10))
DT2 = data.table(x1=c("b","d", "c", "b","a","a"),x2b=c(1,4,7,6," "," "),m2=5:10)

## joining a character column to an integer column raises an error,
## so convert x2b to integer first (the blank entries become NA):
DT2$x2b <- as.integer(DT2$x2b)
## Alternative:
## DT2 = data.table(x1=c("b","d", "c", "b","a","a"),x2b=c(1,4,7,6,NA,NA),m2=5:10)

merge(DT1, DT2, by.x=c('x1','x2a'), by.y=c('x1','x2b'), all.x=TRUE)

   x1 x2a m1 m2
1:  a   3 30 NA
2:  a   5 50 NA
3:  b   1 10  5
4:  b   4 40 NA
5:  b   6 60  8
6:  c   2 20 NA
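As a possible alternative to merge, the on argument that the question tried also accepts a named character vector, which pairs differently named columns. This is a sketch rather than part of the original answer. It assumes a data.table version that supports on (1.9.6 or later) and builds x2b as an integer column from the start. In DT2[DT1, on = ...] the names refer to columns of DT2 and the values to columns of DT1.

```r
library(data.table)

DT1 <- data.table(x1 = c("b", "c", "a", "b", "a", "b"), x2a = 1:6, m1 = seq(10, 60, by = 10))
# x2b is created as integer so that it has the same type as x2a
DT2 <- data.table(x1 = c("b", "d", "c", "b", "a", "a"), x2b = c(1L, 4L, 7L, 6L, NA, NA), m2 = 5:10)

# X[Y, on = ...] keeps every row of Y (here DT1), i.e. a left outer join onto DT1.
# Names are columns of DT2, values are the matching columns of DT1.
dtL <- DT2[DT1, on = c(x1 = "x1", x2b = "x2a")]

# The join column carries DT2's name in the result; rename and reorder
# so the layout matches the merge() output
setnames(dtL, "x2b", "x2a")
setcolorder(dtL, c("x1", "x2a", "m1", "m2"))

print(dtL)
```

Unlike merge, this form returns rows in the order of DT1 rather than sorted by the join columns, and no keys need to be set beforehand.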

Regarding "r - left outer join with data.table with different names for key variables", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34644707/
