
r - Overlapping unique data frames in R

Reposted. Author: 行者123. Updated: 2023-12-04 00:55:59

My two data frames are:

df1<-structure(list(header1 = structure(1:4, .Label = c("a", "b", 
"c", "d"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L))

df2<-structure(list(sample_x = structure(c(1L, 1L, 2L, 3L), .Label = c("0", 
"a", "c"), class = "factor"), sample_y = structure(c(1L, 3L,
2L, 4L), .Label = c("0", "a", "m", "t"), class = "factor"), sample_z = structure(c(3L,
2L, 1L, 1L), .Label = c("0", "a", "c"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L))

A 0 in df2 means there is no value.

Now I want to overlap df1 and df2 to produce the output data frame (df3):

df3<-structure(list(sample_x = c(2L, 2L, 0L), sample_y = c(1L, 3L, 
2L), sample_z = c(2L, 2L, 0L)), class = "data.frame", row.names = c("overlap_df1_df2",
"unique_df1", "unique_df2"))

I tried the data.table function foverlaps:

setkeyv(df1, names(df1))
setkeyv(df2, names(df2))
df3<-foverlaps(df1,df2)

But it seems both data frames need to share some column names, which is clearly not the case here. Thanks!

Best answer

Loop over the columns of df2 and use set operations:

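sapply(df2, function(i) {
  # keep real values only: drop NA and the "0" placeholder that marks "no value"
  x <- as.character(i[!is.na(i) & i != "0"])
  ref <- as.character(df1$header1)
  o <- intersect(ref, x)      # values present in both df1 and this column
  u_df1 <- setdiff(ref, o)    # values found only in df1
  u_df2 <- setdiff(x, o)      # values found only in this df2 column
  c(o = length(o),
    u_df1 = length(u_df1),
    u_df2 = length(u_df2))
})
#       sample_x sample_y sample_z
# o            2        1        2
# u_df1        2        3        2
# u_df2        0        2        0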
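
The result above is a matrix with short row names, while the question asks for a data frame with rows overlap_df1_df2, unique_df1 and unique_df2. A minimal sketch of one way to get there follows. It assumes that "0" is the only placeholder for a missing value and rebuilds the example data as plain character vectors; count_overlap() is a hypothetical helper name introduced here for illustration, not part of base R or any package.

```r
# Example data from the question, rebuilt as character columns for brevity.
df1 <- data.frame(header1 = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
df2 <- data.frame(
  sample_x = c("0", "0", "a", "c"),
  sample_y = c("0", "m", "a", "t"),
  sample_z = c("c", "a", "0", "0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count shared and exclusive values for one df2 column.
# `missing` is the placeholder that marks "no value" (assumed to be "0").
count_overlap <- function(col, ref, missing = "0") {
  col <- as.character(col)
  x <- unique(col[!is.na(col) & col != missing])
  ref <- unique(as.character(ref))
  c(overlap_df1_df2 = length(intersect(ref, x)),
    unique_df1      = length(setdiff(ref, x)),
    unique_df2      = length(setdiff(x, ref)))
}

# vapply() checks that every column yields exactly three integer counts.
counts <- vapply(df2, count_overlap, integer(3), ref = df1$header1)

# Convert the 3 x 3 matrix to a data frame; row and column names carry over.
df3 <- as.data.frame(counts)
df3
```

For the example data this should give the same counts as the df3 shown in the question. Using vapply() instead of sapply() is a design choice: it fails loudly if a column ever returns something other than three integers, rather than silently changing the shape of the result.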

For more on "r - Overlapping unique data frames in R", see the similar question on Stack Overflow: https://stackoverflow.com/questions/62528136/
