
r - Identifying repeated subjects in R

Reposted. Author: 行者123. Updated: 2023-12-02 07:07:16

I have the following data:

subject <- c("A-B10", "A101", "A-B10", "C101", "A101", "C01", "A101", "AB101", "A.B10")
idn <- c(101, 102, 104, 100, 98, 102, 90, 102, 78)
sn <- 1:9
mydata <- data.frame (sn, subject, idn)

sn subject idn
1 1 A-B10 101
2 2 A101 102
3 3 A-B10 104
4 4 C101 100
5 5 A101 98
6 6 C01 102
7 7 A101 90
8 8 AB101 102
9 9 A.B10 78

I want to identify the repeated subjects in a large dataset. The expected result looks like this:

repeat [1]
sn subject idn
1 1 A-B10 101
3 3 A-B10 104

repeat [2]
sn subject idn
2 2 A101 102
5 5 A101 98
7 7 A101 90

Edit:

dup <- mydata$subject[duplicated(mydata$subject)]
mydata[mydata$subject %in% dup, ]
sn subject idn
1 1 A-B10 101
2 2 A101 102
3 3 A-B10 104
5 5 A101 98
7 7 A101 90
lapply(dup, function(x) mydata[mydata$subject == x,])
[[1]]
sn subject idn
1 1 A-B10 101
3 3 A-B10 104

[[2]]
sn subject idn
2 2 A101 102
5 5 A101 98
7 7 A101 90

[[3]]
sn subject idn
2 2 A101 102
5 5 A101 98
7 7 A101 90
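
The third list element repeats the second because `duplicated()` flags every occurrence after the first, so a subject that appears three times is listed twice in `dup`. A minimal sketch of that behavior on the same `subject` vector (the names `flags`, `dup_raw`, and `dup_once` are illustrative, not from the original post):

```r
subject <- c("A-B10", "A101", "A-B10", "C101", "A101", "C01", "A101", "AB101", "A.B10")

# duplicated() is TRUE for every occurrence after the first one
flags <- duplicated(subject)

# A101 occurs three times, so it is picked up twice here
dup_raw <- subject[flags]

# unique() collapses the list to one entry per repeated subject
dup_once <- unique(dup_raw)

print(dup_raw)
print(dup_once)
```

Passing `dup_once` instead of `dup_raw` to `lapply()` would give one list element per repeated subject.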

Best answer

For example:

> ## dup <- mydata$subject[duplicated(mydata$subject)]
> dup <- unique(mydata$subject[duplicated(mydata$subject)]) ## edited: unique() keeps each repeated subject once
> mydata[mydata$subject %in% dup, ]
sn subject idn
1 1 A-B10 101
2 2 A101 102
3 3 A-B10 104
5 5 A101 98
7 7 A101 90
> lapply(dup, function(x) mydata[mydata$subject == x,])
[[1]]
sn subject idn
1 1 A-B10 101
3 3 A-B10 104

[[2]]
sn subject idn
2 2 A101 102
5 5 A101 98
7 7 A101 90
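
An alternative sketch, not part of the original answer: count the subjects with `table()`, keep those seen more than once, and `split()` the matching rows into one data frame per subject. The names `counts`, `repeated`, `hits`, and `groups` are illustrative, and `stringsAsFactors = FALSE` is set on the assumption that `subject` should stay a character column regardless of R version.

```r
subject <- c("A-B10", "A101", "A-B10", "C101", "A101", "C01", "A101", "AB101", "A.B10")
idn <- c(101, 102, 104, 100, 98, 102, 90, 102, 78)
sn <- 1:9
mydata <- data.frame(sn, subject, idn, stringsAsFactors = FALSE)

# Count how often each subject occurs and keep those seen more than once
counts <- table(mydata$subject)
repeated <- names(counts)[counts > 1]

# Keep only rows whose subject is repeated, then split into
# a named list with one data frame per repeated subject
hits <- mydata[mydata$subject %in% repeated, ]
groups <- split(hits, hits$subject)

print(groups)
```

The list is named by subject, so a single group can be pulled out with `groups[["A101"]]`. The order of the list elements follows the sort order of the subject names, which may depend on the locale.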

Regarding "r - Identifying repeated subjects in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/9720163/
