
r - Find the common elements that appear in at least a given percentage of multiple vectors

Reposted. Author: 行者123. Updated: 2023-12-04 11:23:27

Suppose I have 4 vectors:

a <- c("Mark","Kate","Greg", "Mathew")
b <- c("Mark","Tobias","Mary", "Mathew", "Greg")
c <- c("Mary","Chuck","Igor", "Mathew", "Robin", "Tobias")
d <- c("Kate","Mark","Igor", "Greg", "Robin", "Mathew")

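I want to select the names that overlap across these vectors, with the condition that a name must appear in at least 3 of the 4 vectors. I would also like it to be easy to change the percentage of vectors in which a name must appear.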

Can I adapt intersect to do this?

Best answer

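I think this will work. The table function does most of the heavy lifting: it counts how often each name occurs across all the vectors, and dividing those counts by the number of vectors gives the fraction of vectors each name appears in.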

find_perc <- function(..., perc = .75) {
  list_len <- length(list(...))            # how many vectors were passed
  tab_it <- table(c(...))                  # tabulate all the names
  tab_it_perc <- tab_it / list_len         # fraction of vectors per name
  names(tab_it_perc[tab_it_perc >= perc])  # return those with freq >= perc
}


> find_perc(a, b, c, d)
[1] "Greg" "Mark" "Mathew"
> find_perc(a, b, c, d, perc = .5)
[1] "Greg" "Igor" "Kate" "Mark" "Mary" "Mathew" "Robin" "Tobias"
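
One caveat with the approach above: table counts every occurrence, so a name repeated within a single vector is counted more than once and can pass the threshold even though it appears in too few vectors. Below is a minimal sketch of a variant that counts each name at most once per vector. The name find_perc_unique and the example vectors x, y, z and w are hypothetical and not part of the original answer, and it assumes that duplicates within one vector should count only once.

```r
# Hypothetical variant of find_perc (find_perc_unique is an assumed name,
# not from the original answer): each name counts at most once per vector.
find_perc_unique <- function(..., perc = .75) {
  vecs <- list(...)                             # collect the input vectors
  per_vec <- lapply(vecs, unique)               # drop duplicates within each vector
  tab_it <- table(unlist(per_vec))              # number of vectors containing each name
  names(tab_it[tab_it / length(vecs) >= perc])  # keep names meeting the threshold
}

# Illustrative data (assumed): "Mark" is repeated inside a single vector.
x <- c("Mark", "Mark", "Mark", "Kate")
y <- c("Kate", "Greg")
z <- c("Greg", "Kate")
w <- c("Igor")

# "Mark" occurs in only 1 of the 4 vectors, so it is excluded here,
# whereas counting raw occurrences would give it 3 / 4.
find_perc_unique(x, y, z, w)
```

For vectors without internal duplicates, such as a, b, c and d above, this variant should return the same names as find_perc.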

This question, "r - Find the common elements that appear in at least a given percentage of multiple vectors", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/41487507/
