
r - Find all records that have multiple values in a column in R

Reposted. Author: 行者123. Updated: 2023-12-04 13:46:03

For the example data frame:

df <- structure(list(code = c("a1", "a1", "b2", "v4", "f5", "f5", "h7", 
"a1"), name = c("katie", "katie", "sally", "tom", "amy", "amy",
"ash", "james"), number = c(3.5, 3.5, 2, 6, 4, 4, 7, 3)), .Names = c("code",
"name", "number"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-8L), spec = structure(list(cols = structure(list(code = structure(list(), class = c("collector_character",
"collector")), name = structure(list(), class = c("collector_character",
"collector")), number = structure(list(), class = c("collector_double",
"collector"))), .Names = c("code", "name", "number")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

I want to highlight all records whose "code" value occurs two or more times. I know I can use:
df[duplicated(df$name), ]

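But this only highlights the repeats (the second and later occurrences), whereas I want every row whose code value is repeated (i.e. the 3 a1s and the 2 f5s).

Any ideas?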

Best answer

df[duplicated(df$code) | duplicated(df$code, fromLast=TRUE), ]
  code  name number
1   a1 katie    3.5
2   a1 katie    3.5
5   f5   amy    4.0
6   f5   amy    4.0
8   a1 james    3.0
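To see why both calls are needed, here is a minimal sketch that breaks the condition into its two halves. It uses only the `code` column from the example; the names `fwd`, `bwd` and `keep` are illustrative and not part of the original answer.

```r
# Code column from the example data frame
code <- c("a1", "a1", "b2", "v4", "f5", "f5", "h7", "a1")

# TRUE for every occurrence after the first one
fwd <- duplicated(code)

# TRUE for every occurrence before the last one
bwd <- duplicated(code, fromLast = TRUE)

# A row is kept if its code is repeated in either direction
keep <- fwd | bwd

print(data.frame(code, fwd, bwd, keep))
which(keep)
```

Neither half alone flags the first and the last occurrence of a repeated code, so the two are combined with `|`.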

Another solution, inspired by Alok VS:
ta <- table(df$code)
df[df$code %in% names(ta)[ta > 1], ]
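The same counting idea can also be written with base R's `ave()`, which attaches the group size to each row directly. This is a swapped-in variant, not part of the original answer; the plain `data.frame` below stands in for the example tibble, and `n_per_code` and `res` are illustrative names.

```r
# Plain data.frame with the same values as the example
df <- data.frame(
  code   = c("a1", "a1", "b2", "v4", "f5", "f5", "h7", "a1"),
  name   = c("katie", "katie", "sally", "tom", "amy", "amy", "ash", "james"),
  number = c(3.5, 3.5, 2, 6, 4, 4, 7, 3),
  stringsAsFactors = FALSE
)

# For each row, the number of rows that share its code
n_per_code <- ave(seq_along(df$code), df$code, FUN = length)

# Keep the rows whose code occurs more than once
res <- df[n_per_code > 1, ]
print(res)
```

Changing the threshold (for example `n_per_code > 2`) selects only the codes that occur at least three times, which the `duplicated()` approach cannot express as directly.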

Edit: If you are willing to go beyond base R, gdata::duplicated2() allows a more concise solution.
library(gdata)
df[duplicated2(df$code), ]

Regarding "r - Find all records that have multiple values in a column in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50963733/
