r - Finding the most frequent combinations

Reposted. Author: 行者123. Updated: 2023-12-04 23:57:26

I have a data frame with 2 columns, an ID number and a brand:

X1     X2
1234 A89
1234 A87
1234 A87
1234 A32
1234 A27
1234 A27
1235 A12
1235 A14
1235 A14
1236 A32
1236 A32
1236 A27
1236 A12
1236 A12
1236 A14
1236 A89
1236 A87
1237 A99
1237 A98

I want to find the top 3 brand combinations that occur most often across ID numbers:
A89, A87
A32, A27
A12, A14

I tried:
library(dplyr)
df %>%
  group_by(X1, X2) %>%
  mutate(n = n()) %>%
  group_by(X1) %>%
  slice(which.max(n)) %>%
  select(-n)

But it doesn't work properly. I would appreciate any ideas or suggestions!

Best answer

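Here is a way to do it in base R. We split X2 by X1, then build every combination of two distinct values within each subgroup. Finally, we count how often each combination occurs and keep the three most frequent ones.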

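# Split brands by ID, paste every pair of distinct brands per ID, then tabulate.
with(data.frame(table(unlist(lapply(split(df$X2, df$X1), function(x)
  # min() guards IDs with fewer than two distinct brands, where combn(., 2) would fail
  combn(unique(x), min(2, length(unique(x))), paste, collapse = "-"))))),
  as.character(Var1[head(order(Freq, decreasing = TRUE), 3)]))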
#[1] "A12-A14" "A32-A27" "A89-A87"
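
The one-liner above is dense, so here is the same idea unrolled into separate steps. This is a sketch rather than the answer's exact code: the helper name `pairs_in_order` is hypothetical, the sample data is rebuilt inline so the snippet runs on its own, and IDs with fewer than two distinct brands are assumed to contribute no pair at all.

```r
# Sample data from the question, rebuilt inline so this runs standalone
df <- data.frame(
  X1 = rep(c(1234L, 1235L, 1236L, 1237L), times = c(6, 3, 8, 2)),
  X2 = c("A89", "A87", "A87", "A32", "A27", "A27", "A12", "A14", "A14",
         "A32", "A32", "A27", "A12", "A12", "A14", "A89", "A87", "A99", "A98"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every pair of distinct brands for one ID, pasted in
# the order the brands first appear (same convention as the one-liner).
pairs_in_order <- function(x) {
  u <- unique(x)
  if (length(u) < 2) return(character(0))  # assumption: no pair, no entry
  combn(u, 2, paste, collapse = "-")
}

# Step 1: one character vector of brands per ID
brands_by_id <- split(df$X2, df$X1)

# Step 2: pairs per ID, flattened into a single vector
all_pairs <- unlist(lapply(brands_by_id, pairs_in_order), use.names = FALSE)

# Step 3: count in how many IDs each pair string occurs, largest first
pair_counts <- sort(table(all_pairs), decreasing = TRUE)

# Step 4: keep the three most frequent pair strings
top3 <- head(pair_counts, 3)
print(names(top3))
```

Because `unique()` removes repeated brands within an ID before pairing, each count is the number of IDs that contain the pair, not the number of rows.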

Data
df = structure(list(X1 = c(1234L, 1234L, 1234L, 1234L, 1234L, 1234L, 
1235L, 1235L, 1235L, 1236L, 1236L, 1236L, 1236L, 1236L, 1236L,
1236L, 1236L, 1237L, 1237L), X2 = c("A89", "A87", "A87", "A32",
"A27", "A27", "A12", "A14", "A14", "A32", "A32", "A27", "A12",
"A12", "A14", "A89", "A87", "A99", "A98")), .Names = c("X1",
"X2"), class = "data.frame", row.names = c(NA, -19L))

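The base R answer pastes brands in the order they first appear within each ID, so "A89-A32" and "A32-A89" are counted as different combinations. If brand order should not matter, one possible variant (an assumption about the intent, not something the question states) sorts the brands before pairing. The helper name `unordered_pairs` is hypothetical, and the data is rebuilt inline so the snippet is self-contained.

```r
# Sample data from the question, rebuilt inline so this runs standalone
df <- data.frame(
  X1 = rep(c(1234L, 1235L, 1236L, 1237L), times = c(6, 3, 8, 2)),
  X2 = c("A89", "A87", "A87", "A32", "A27", "A27", "A12", "A14", "A14",
         "A32", "A32", "A27", "A12", "A12", "A14", "A89", "A87", "A99", "A98"),
  stringsAsFactors = FALSE
)

# Assumption: order within a pair is irrelevant, so each ID's brands are
# sorted first and "A32-A89" / "A89-A32" collapse into a single key.
unordered_pairs <- function(x) {
  u <- sort(unique(x))
  if (length(u) < 2) return(character(0))  # no pair can be formed
  combn(u, 2, paste, collapse = "-")
}

# Count, for each unordered pair, how many IDs contain both brands
pair_counts <- table(unlist(lapply(split(df$X2, df$X1), unordered_pairs)))

# All pairs that reach the maximum count (can be more than three on ties)
most_frequent <- names(pair_counts)[pair_counts == max(pair_counts)]
print(most_frequent)
```

On this sample the unordered reading yields seven pairs that each occur in two IDs, so under that reading a "top 3" is not uniquely defined and the tie-breaking rule has to be chosen explicitly.
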
Regarding "r - Finding the most frequent combinations", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45491421/
