
r - Mixed combinations / permutations from different sets

Reposted. Author: 行者123. Updated: 2023-12-01 20:22:52

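This Q&A is motivated by "How to build permutation with some conditions in R".

There are already some good R packages, such as RcppAlgos and arrangements, that generate combinations / permutations on a single set efficiently. For example, if we want to choose 3 items from letters[1:6], the following gives all combinations: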

library(RcppAlgos)
comboGeneral(letters[1:6], 3)
#       [,1] [,2] [,3]
#  [1,] "a"  "b"  "c"
#  [2,] "a"  "b"  "d"
#  [3,] "a"  "b"  "e"
#  [4,] "a"  "b"  "f"
#  [5,] "a"  "c"  "d"
#  [6,] "a"  "c"  "e"
#  [7,] "a"  "c"  "f"
#  [8,] "a"  "d"  "e"
#  [9,] "a"  "d"  "f"
# [10,] "a"  "e"  "f"
# [11,] "b"  "c"  "d"
# [12,] "b"  "c"  "e"
# [13,] "b"  "c"  "f"
# [14,] "b"  "d"  "e"
# [15,] "b"  "d"  "f"
# [16,] "b"  "e"  "f"
# [17,] "c"  "d"  "e"
# [18,] "c"  "d"  "f"
# [19,] "c"  "e"  "f"
# [20,] "d"  "e"  "f"
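
For comparison, the same single-set result can be sketched in base R with combn, which returns one combination per column; transposing gives the row layout shown above. This is only a cross-check, not a replacement for the faster packages, and the name combos is just illustrative.

```r
## base R: combn() puts each combination in a column, so transpose
## to get one combination per row, as comboGeneral() does
combos <- t(combn(letters[1:6], 3))
dim(combos)       # number of combinations x items chosen
combos[1, ]       # first combination in lexicographic order
combos[20, ]      # last combination
```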

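But what if we want something more complicated, such as

  • choose 1 item from LETTERS[1:2]
  • choose 3 items from letters[1:6]
  • choose 2 items from as.character(1:3)

How can we generate all combinations, and optionally all permutations?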

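Best answer

Suppose we have a list of sets set_list, where k[i] items are chosen from set_list[[i]]. Mathematically, we would then tackle the problem as follows:

  1. generate all combinations for each set;
  2. merge the combinations from all sets;
  3. create all permutations for each merged combination.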
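
The three steps also determine how large the result will be, which is worth checking before generating anything. The sketch below uses only base R and the sets from the question; the variable names n_perm_each and n_total are illustrative. For this example it works out to 2 * 20 * 3 = 120 merged combinations and 6! = 720 permutations of each.

```r
## sets and selection sizes from the question
set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
k <- c(1, 3, 2)

n <- lengths(set_list)                   # size of each set
n_combinations <- prod(choose(n, k))     # steps 1 and 2: merged combinations
n_perm_each <- factorial(sum(k))         # step 3: permutations per combination
n_total <- n_combinations * n_perm_each  # total number of permutations

c(n_combinations, n_perm_each, n_total)
```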

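The function MixedCombnPerm below is my implementation, using RcppAlgos for step 1 and step 3. Step 2 does not currently use an optimal algorithm: it is a brute-force approach that relies on a faster implementation of expand.grid followed by an rbind. I know of a faster recursive method (like the one used to form tensor product model matrices in mgcv) that could be coded in Rcpp, but for lack of time I will not do that now.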

library(RcppAlgos)

MixedCombnPerm <- function (set_list, k, perm = FALSE) {

  ###################
  ## mode checking ##
  ###################

  if (!all(vapply(set_list, is.vector, TRUE)))
    stop("All sets must be 'vectors'!")

  if (length(unique(vapply(set_list, mode, ""))) > 1L)
    stop("Please ensure that all sets have the same mode!")

  ################
  ## basic math ##
  ################

  ## size of each set
  n <- lengths(set_list, FALSE)
  ## input validation
  if (length(n) != length(k)) stop("length of 'k' different from number of sets!")
  if (any(k > n)) stop("can't choose more items than set size!")
  ## number of sets
  n_sets <- length(n)
  ## total number of items
  n_items <- sum(k)
  ## number of combinations
  n_combinations_by_set <- choose(n, k)
  n_combinations <- prod(n_combinations_by_set)

  #################################
  ## step 1: combinations by set ##
  #################################

  ## generate `n_combinations_by_set[i]` combinations on set i
  combinations_by_set <- vector("list", n_sets)
  for (i in seq_len(n_sets)) {
    ## each column of combinations_by_set[[i]] is a record
    combinations_by_set[[i]] <- t.default(comboGeneral(set_list[[i]], k[i]))
  }

  ################################
  ## step 2: merge combinations ##
  ################################

  ## merge combinations from all sets
  ## slow_expand_grid <- function (m) expand.grid(lapply(m, seq_len))
  fast_expand_grid <- function (m) {
    n_sets <- length(m)           ## number of sets
    mm <- c(1L, cumprod(m))       ## cumulative leading dimension
    grid_size <- mm[n_sets + 1L]  ## size of the grid
    grid_ind <- vector("list", n_sets)
    for (i in seq_len(n_sets)) {
      ## grid_ind[[i]] <- rep_len(rep(seq_len(m[i]), each = mm[i]), grid_size)
      grid_ind[[i]] <- rep_len(rep.int(seq_len(m[i]), rep.int(mm[i], m[i])), grid_size)
    }
    grid_ind
  }
  grid_ind <- fast_expand_grid(n_combinations_by_set)

  ## each column is a record
  combinations_grid <- mapply(function (x, j) x[, j, drop = FALSE],
                              combinations_by_set, grid_ind,
                              SIMPLIFY = FALSE, USE.NAMES = FALSE)
  all_combinations <- do.call("rbind", combinations_grid)

  ########################################################
  ## step 3: generate permutations for each combination ##
  ########################################################

  if (!perm) return(all_combinations)
  else {
    ## generate `factorial(n_items)` permutations for each combination
    all_permutations <- vector("list", n_combinations)
    for (i in seq_len(n_combinations)) {
      all_permutations[[i]] <- permuteGeneral(all_combinations[, i], n_items)
    }
    return(all_permutations)
  }

}
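
The index grid built in step 2 can be checked in isolation. The sketch below copies the fast_expand_grid helper out of the function body and compares it with expand.grid on an arbitrary test input m (the choice of m and the name same_as_base are assumptions made for illustration). Both vary the first index fastest, so the columns should agree element by element.

```r
## standalone copy of the helper defined inside MixedCombnPerm
fast_expand_grid <- function (m) {
  n_sets <- length(m)           ## number of sets
  mm <- c(1L, cumprod(m))       ## cumulative leading dimension
  grid_size <- mm[n_sets + 1L]  ## size of the grid
  grid_ind <- vector("list", n_sets)
  for (i in seq_len(n_sets)) {
    ## repeat each index mm[i] times, then recycle to the full grid size
    grid_ind[[i]] <- rep_len(rep.int(seq_len(m[i]), rep.int(mm[i], m[i])), grid_size)
  }
  grid_ind
}

m <- c(2, 3, 2)                               ## arbitrary test input
fast <- fast_expand_grid(m)
slow <- expand.grid(lapply(m, seq_len))       ## base R reference

## compare column by column
same_as_base <- all(mapply(function (a, b) all(a == b), fast, slow))
same_as_base
```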

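The function performs strict input checking. The user should make sure that all sets are given as "vectors" and that they have the same mode. So, for the example in the question, we should provide: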

## note the "as.character(1:3)"
set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
k <- c(1, 3, 2)
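
To see why as.character(1:3) is needed, the mode check can be run on its own in base R. In this small sketch the names bad and good are illustrative; with a plain 1:3 the third set has mode "numeric" while the others are "character", so the function would stop with its same-mode error.

```r
bad  <- list(LETTERS[1:2], letters[1:6], 1:3)                ## mixed modes
good <- list(LETTERS[1:2], letters[1:6], as.character(1:3))  ## all character

modes_bad  <- vapply(bad, mode, "")
modes_good <- vapply(good, mode, "")

modes_bad                      ## the third entry differs from the first two
length(unique(modes_bad))      ## more than one mode: rejected by the check
length(unique(modes_good))     ## a single mode: accepted
```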

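If the argument perm = FALSE (the default), the function returns the combinations in a matrix (each column is a record). Otherwise, it returns a list of matrices, each giving the permutations of one particular combination (each row is a record).

Try the example: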

combinations <- MixedCombnPerm(set_list, k)
permutations <- MixedCombnPerm(set_list, k, TRUE)

Inspect the results:

combinations[, 1:6]
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] "A"  "B"  "A"  "B"  "A"  "B"
# [2,] "a"  "a"  "a"  "a"  "a"  "a"
# [3,] "b"  "b"  "b"  "b"  "b"  "b"
# [4,] "c"  "c"  "d"  "d"  "e"  "e"
# [5,] "1"  "1"  "1"  "1"  "1"  "1"
# [6,] "2"  "2"  "2"  "2"  "2"  "2"

permutations[[1]][1:6, ]
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] "A"  "a"  "b"  "c"  "1"  "2"
# [2,] "A"  "a"  "b"  "c"  "2"  "1"
# [3,] "A"  "a"  "b"  "1"  "c"  "2"
# [4,] "A"  "a"  "b"  "1"  "2"  "c"
# [5,] "A"  "a"  "b"  "2"  "c"  "1"
# [6,] "A"  "a"  "b"  "2"  "1"  "c"
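
As a sanity check that does not depend on RcppAlgos, steps 1 and 2 can be reproduced in base R with combn and expand.grid. This is a slower sketch for verification only, not the answer's implementation; by_set, grid and merged are illustrative names. It should yield a 6 x 120 matrix whose leading columns match the combinations shown above.

```r
set_list <- list(LETTERS[1:2], letters[1:6], as.character(1:3))
k <- c(1, 3, 2)

## step 1: combn() returns one combination per column
by_set <- Map(function (s, ki) combn(s, ki), set_list, k)

## step 2: index grid over the per-set combinations (first set varies fastest)
grid <- expand.grid(lapply(by_set, function (m) seq_len(ncol(m))))

## pick the indexed columns from each set and stack them
merged <- do.call(rbind, Map(function (m, j) m[, j, drop = FALSE], by_set, grid))

dim(merged)
merged[, 1:3]
```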

Regarding "r - Mixed combinations / permutations from different sets", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52682271/
