
R: Row-wise comparison of strings in multiple columns against a single column

Reposted. Author: 行者123. Updated: 2023-12-01 00:11:46

I am trying to create a logical variable in a data.table by comparing one string column against two or more other string columns, and I need to ignore NA values.

Sample data for D2:

structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1", 
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1")), row.names = c(NA, -3L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x0000015eb1261ef0>)

I tried the following suggested solution:
D2[, Match := apply(sapply(.SD, `==`, D2[, "var1"]), 1, any), .SDcols = 
c("var2", "var3")]

The result for a003 is TRUE, but it should be FALSE because var1 and var3 do not match:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1", 
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, TRUE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)

Expected result:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1", 
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, FALSE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)

Best answer

How about the following?

setDT(D1)
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, any), .SDcols = c("var2", "var3")]
D1
#      ID  var1  var2  var3 Match
# 1: a001 char1 char1 char1  TRUE
# 2: a002 char1  <NA> char1  TRUE
# 3: a003 char2 char1 char1 FALSE

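Explanation: we compare the entries of the sub-data.table defined by .SDcols with the entries in D1[, "var1"]. For each row, if any of the comparisons is a match, the result is TRUE; otherwise it is FALSE.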
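To see why `any` marks a003 as TRUE on the question's data, it may help to look at the intermediate comparison matrix. The following is a minimal sketch, assuming the data.table package is installed. It rebuilds the question's D2 with `data.table()` instead of the `dput` output, and the names `cmp` and `any_match` are introduced here only for illustration.

```r
library(data.table)

# Sample data rebuilt from the question (the .internal.selfref pointer is dropped).
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)

# One logical column per compared column; the entry is NA where var2 or var3 is NA.
cmp <- sapply(D2[, .(var2, var3)], function(x) x == D2$var1)
print(cmp)

# any() is TRUE as soon as a single column matches var1,
# so a003 (var2 matches, var3 does not) still comes out TRUE.
any_match <- apply(cmp, 1, any)
print(any_match)
```

With this data, a single matching column is enough for `any`, which is why the row-wise result does not distinguish a003 from the other rows.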

Update

In response to your comment, you can do
setDT(D1)
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, all, na.rm = TRUE), .SDcols = c("var2", "var3")]
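As a sketch of how `all` with `na.rm = TRUE` behaves on the question's sample data, the snippet below applies the same idea to a rebuilt D2, assuming the data.table package is installed. The second form, using `Reduce`, is a vectorized alternative that is not part of the original answer. The names `cols`, `cmp`, and `Match2` are hypothetical.

```r
library(data.table)

# Sample data rebuilt from the question.
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)
cols <- c("var2", "var3")

# Row-wise form, as in the update: NA comparisons are dropped before all().
cmp <- sapply(D2[, ..cols], function(x) x == D2$var1)
D2[, Match := apply(cmp, 1, all, na.rm = TRUE)]

# Vectorized alternative (an assumption, not from the answer):
# a comparison passes when the value is NA or equals var1.
D2[, Match2 := Reduce(`&`, lapply(.SD, function(x) is.na(x) | x == var1)),
   .SDcols = cols]

print(D2)
```

One caveat to keep in mind: a row in which every compared column is NA yields TRUE in both forms, because `all()` over an empty set of values is TRUE.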

Regarding "R: Row-wise comparison of strings in multiple columns against a single column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57982605/
