
r - How to check whether any element in a row matches the row above

Reposted. Author: 行者123. Updated: 2023-12-04 21:55:55

Is there a way to check whether any value in a row matches any value in the row above, regardless of order? Below is a fairly arbitrary input data table.

library(data.table)

DT <- data.table(A=c("a","a","b","d","e","f","h","i","j"),
                 B=c("a","b","c","c","f","g",NA,"j",NA),
                 C=c("a","b","c","b","g","h",NA,NA,NA))

> DT
A B C
1: a a a
2: a b b
3: b c c
4: d c b
5: e f g
6: f g h
7: h NA NA
8: i j NA
9: j NA NA

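I want to add a column D that compares each row with the row above and checks whether the two rows share any value (regardless of order). The desired output is: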
 > DT
A B C D
1: a a a 0 #No row above to compare; could be either NA or 0
2: a b b 1 #row 2 has "a", which is in row 1; returns 1
3: b c c 1 #row 3 has "b", which is in row 2; returns 1
4: d c b 1 #row 4 has "b" and "c", which are in row 3; returns 1
5: e f g 0 #row 5 has nothing that is in row 4; returns 0
6: f g h 1 #row 6 has "f" and "g", which are in row 5; returns 1
7: h NA NA 1 #row 7 has "h", which is in row 6; returns 1
8: i j NA 0 #row 8 has nothing that is in row 7 (NA doesn't count)
9: j NA NA 1 #row 9 has "j", which is in row 8; returns 1 (NA doesn't count)

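The main idea is that I want to compare one row (vector) with another row (vector) and treat the two rows as the same if any element appears in both, without explicitly comparing every pair of elements.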
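The comparison described here can be sketched for a single pair of rows as follows. This is only an illustration of the idea, not part of the original question; the helper name share_any is hypothetical.

```r
# Hypothetical helper: TRUE if two vectors have at least one non-NA value in common
share_any <- function(x, y) {
  x <- x[!is.na(x)]                  # NA does not count as a match
  y <- y[!is.na(y)]
  length(intersect(x, y)) > 0
}

row4_vs_row3 <- share_any(c("d", "c", "b"), c("b", "c", "c"))  # rows 4 and 3 of DT
row8_vs_row7 <- share_any(c("i", "j", NA), c("h", NA, NA))     # rows 8 and 7 of DT
```

With the sample data, rows 4 and 3 share "b" and "c", while rows 8 and 7 share nothing once NA is ignored.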

Best answer

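We can do this by taking the lead (next) rows of the dataset, pasting each row into a string, checking with grepl and Map whether any value of the pasted lead row occurs in the pasted row of the original dataset, and then calling unlist and converting the result to integer.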

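DT[, D := {
  v1 <- do.call(paste, .SD)                                    # each row pasted into one string
  v2 <- do.call(paste, c(shift(.SD, type = "lead"), sep="|"))  # next row joined with "|" (regex alternation)
  v2N <- gsub("NA\\|*|\\|*NA", "", v2)                         # strip NA entries from the pattern
  v3 <- unlist(Map(grepl, v2N, v1), use.names = FALSE)         # does row i contain any value of row i+1?
  as.integer(head(c(FALSE, v3), -1))                           # shift down one row; the first row gets 0
}]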

DT
# A B C D
#1: a a a 0
#2: a b b 1
#3: b c c 1
#4: d c b 1
#5: e f g 0
#6: f g h 1
#7: h NA NA 1
#8: i j NA 0
#9: j NA NA 1
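Because this approach matches regular-expression patterns against pasted strings, it works for the single-letter sample values but can misfire on other data. The following sketch, which uses made-up values that are not in the original answer, shows three cases worth checking before reusing it: substring matches, values that contain the letters "NA", and an all-NA next row.

```r
# Substring match: "a" is found inside "ab" although no cell equals "a"
substring_hit <- grepl("a|x", "ab cd ef")

# The NA-stripping gsub also eats the "NA" in a value such as "NAME"
stripped <- gsub("NA\\|*|\\|*NA", "", "NAME|b|c")

# An all-NA next row leaves an empty pattern, which matches any string
empty_hit <- grepl("", "j NA NA")

# Comparing whole values with %in% avoids the substring problem
whole_value_hit <- any(c("a", "x") %in% c("ab", "cd", "ef"))
```

Here substring_hit and empty_hit are TRUE, stripped is "ME|b|c", and whole_value_hit is FALSE.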

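Alternatively, we can split the table into rows and compare neighbouring rows with Map. Run this on the original three-column DT; if D has already been added, its values would take part in the comparison.
as.integer(c(FALSE, unlist(Map(function(x, y) {
    x1 <- na.omit(unlist(x))   # values of the row above, NA dropped
    y1 <- na.omit(unlist(y))   # values of the current row, NA dropped
    any(x1 %in% y1)            # TRUE if the two rows share any value
  },
  split(DT[-nrow(DT)], 1:(nrow(DT)-1)), split(DT[-1], 2:nrow(DT))), use.names = FALSE)))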
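The same whole-value comparison can also be written as a small function that takes the columns to compare explicitly, so an existing D column cannot leak into the result. This is a minimal sketch rather than the answer's own method: the name shares_with_previous and its cols argument are hypothetical, and the example uses a plain data.frame so that it runs without data.table (a data.table should also work, since it is converted with as.data.frame first).

```r
# Hypothetical helper: flag rows that share at least one non-NA value
# with the row directly above, looking only at the columns in `cols`.
shares_with_previous <- function(df, cols = names(df)) {
  # as.data.frame() makes column subsetting behave the same for
  # data.frame and data.table inputs
  m <- as.matrix(as.data.frame(df)[cols])
  n <- nrow(m)
  if (n == 0L) return(integer(0))
  out <- integer(n)                  # first row has nothing above it, stays 0
  for (i in seq_len(n)[-1]) {
    prev <- m[i - 1L, ]
    curr <- m[i, ]
    out[i] <- as.integer(any(curr[!is.na(curr)] %in% prev[!is.na(prev)]))
  }
  out
}

df <- data.frame(
  A = c("a", "a", "b", "d", "e", "f", "h", "i", "j"),
  B = c("a", "b", "c", "c", "f", "g", NA, "j", NA),
  C = c("a", "b", "c", "b", "g", "h", NA, NA, NA),
  stringsAsFactors = FALSE
)
df$D <- shares_with_previous(df, cols = c("A", "B", "C"))
```

For the sample data this yields the same D column as the desired output in the question.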

For more on this topic, see the similar question on Stack Overflow: https://stackoverflow.com/questions/43117461/
