Fast way to permute rows of matrices in Rcpp or RcppArmadillo?

Reposted by: bug小助手 | Updated: 2023-10-24 21:59:15

I'm running a stationary bootstrap algorithm on an N x M matrix, X, where both N and M are on the order of 1500 to 3000.

The bootstrap matrix of index permutations, Y, is N x B, where B is, say, 10,000.

In R syntax, the goal is to compute:

sapply(1:B, function(b) colSums(X[Y[,b],]))

That is, we rearrange the rows of X (with possible duplications) and take the column sums--10,000 times.

The above code takes about 3 minutes with N = 1500, M = 2000, B = 10000.

Converting the code to Rcpp reduces it to about 25 seconds:

#include <Rcpp.h>
using namespace Rcpp;

// For each bootstrap replicate b (a column of theta), resample the rows of x
// and return the column means as a B x M matrix.
// [[Rcpp::export]]
NumericMatrix get_bootstrap_statistic_mean(NumericMatrix x, IntegerMatrix theta) {
    int nr = x.nrow();
    int nc = x.ncol();
    int nb = theta.ncol();
    NumericMatrix res(nb, nc);

    for (int j = 0; j < nc; j++) {
        NumericMatrix::Column x_col_j = x.column(j);
        NumericMatrix::Column res_col_j = res.column(j);

        for (int b = 0; b < nb; b++) {
            IntegerMatrix::Column theta_col_b = theta.column(b);

            double sum = 0.0;
            for (int i = 0; i < nr; i++) {
                // Subtract one to map R's 1-based indices to C++'s 0-based indices.
                sum += x_col_j[theta_col_b[i] - 1];
            }
            res_col_j[b] = sum / nr;
        }
    }
    return res;
}

But this post shows a faster way to get column sums than the nested loops above.

Thus, is there a way to create the permuted matrices in C++ (Rcpp, RcppArmadillo) that is faster than doing:

sapply(1:B, function(b) Arma_colSums(X[Y[,b],]))

That takes about 20 seconds for N = 1500, M = 2000, B = 10000, where Arma_colSums was (more or less) the fastest method from the linked post. The idea would be to build the permuted matrices quickly in C++ and then apply Arma_colSums to them.

I've looked at the RcppArmadillo subsetting documentation, and most of it seems to be about fetching row or column vectors rather than rearranging the entire matrix. The "sub2ind" functionality doesn't seem applicable, or if it is, putting the permutation matrix into the required form would take longer than using one of the faster approaches above.

I've also looked at the bootstrap example in the RcppArmadillo introduction, but it uses IID bootstrap on a single column of X (whereas X here has thousands of columns).

I tried ChatGPT, but it wasn't able to provide any code that compiles.

Comments

Welcome to StackOverflow, and (Rcpp)Armadillo! The fundamental issue here is that these vectors are stored as columns by R (and then 'zero-copy' provided to Armadillo). That means that row-wise access "cuts across the grain". So no "obvious" shortcut I can offer.
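
To make the "across the grain" point concrete, here is a minimal sketch in plain C++ (no Rcpp) of column-major indexing. The helper names cm_offset, column_sum and row_sum are made up for illustration. Element (i, j) of a matrix with nrow rows sits at offset i + j * nrow, so walking down a column touches consecutive memory, while walking along a row jumps nrow elements at every step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (i, j) in a column-major buffer with nrow rows.
// (Hypothetical helper; this is the layout R uses for matrices.)
std::size_t cm_offset(std::size_t i, std::size_t j, std::size_t nrow) {
    return i + j * nrow;
}

// Sum of column j: the loop visits consecutive addresses (stride 1).
double column_sum(const std::vector<double>& x, std::size_t nrow, std::size_t j) {
    assert((j + 1) * nrow <= x.size());
    double s = 0.0;
    for (std::size_t i = 0; i < nrow; ++i) {
        s += x[cm_offset(i, j, nrow)];
    }
    return s;
}

// Sum of row i: each step jumps nrow elements ahead (stride nrow).
double row_sum(const std::vector<double>& x, std::size_t nrow, std::size_t i) {
    assert(nrow > 0 && x.size() % nrow == 0 && i < nrow);
    const std::size_t ncol = x.size() / nrow;
    double s = 0.0;
    for (std::size_t j = 0; j < ncol; ++j) {
        s += x[cm_offset(i, j, nrow)];
    }
    return s;
}
```

Gathering whole rows of X in a resampled order means doing the strided access in row_sum for every row, which is why there is no cheap way to build X[Y[,b],] itself.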

I should have gleaned as much! Ugh, well, you're the authority @DirkEddelbuettel, so at least I know not to waste any more time on it and be grateful for the existing improvements. Thank you!

Also, can always parallelize it for perhaps another nontrivial increase in speed.

That was my thought too, "devils is as always in the details".

Recommended answer

Use matrix multiplication instead:

library(Rfast) # for `colTabulate` and `Crossprod`

system.time(Z1 <- get_bootstrap_statistic_mean(X, Y))
#>    user  system elapsed 
#>   28.17    0.01   28.19
system.time(Z2 <- Crossprod(X, colTabulate(Y, N)))
#>    user  system elapsed 
#>   26.30    1.36    3.81

all.equal(Z1, t(Z2)/N)
#> [1] TRUE

Data:

N <- 15e2L
M <- 2e3L
B <- 1e4L

X <- matrix(runif(N*M), N, M)
Y <- matrix(sample(N, N*B, 1), N, B)
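
Why the multiplication gives the same numbers: summing the rows of X[Y[,b],] counts each row of X once per appearance in Y[,b], so the column sums equal t(X) %*% counts_b, where counts_b[i] is how often row i was drawn. Stacking the B count vectors as columns turns all replicates into a single matrix product. The sketch below checks this on a toy example in plain C++ (no Rcpp). The Matrix struct and all function names are made up for illustration, and tabulate_columns only mimics what colTabulate appears to return in the code above (an N x B matrix of counts).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major matrix, the same layout R uses: (i, j) lives at data[i + j * nrow].
struct Matrix {
    std::size_t nrow, ncol;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i + j * nrow]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i + j * nrow]; }
};

// Y[b][i] is the 1-based row index drawn at position i of replicate b.
using IndexColumns = std::vector<std::vector<int>>;

// Direct approach: gather the resampled rows and sum each column. Result is M x B.
Matrix col_sums_direct(const Matrix& X, const IndexColumns& Y) {
    Matrix out(X.ncol, Y.size());
    for (std::size_t b = 0; b < Y.size(); ++b) {
        for (std::size_t j = 0; j < X.ncol; ++j) {
            double s = 0.0;
            for (int idx : Y[b]) {
                assert(idx >= 1 && static_cast<std::size_t>(idx) <= X.nrow);
                s += X(static_cast<std::size_t>(idx) - 1, j);
            }
            out(j, b) = s;
        }
    }
    return out;
}

// counts(i, b) = number of times row i + 1 appears in replicate b. Result is n x B.
Matrix tabulate_columns(const IndexColumns& Y, std::size_t n) {
    Matrix counts(n, Y.size());
    for (std::size_t b = 0; b < Y.size(); ++b) {
        for (int idx : Y[b]) {
            assert(idx >= 1 && static_cast<std::size_t>(idx) <= n);
            counts(static_cast<std::size_t>(idx) - 1, b) += 1.0;
        }
    }
    return counts;
}

// t(A) %*% B, written as a naive triple loop to show the arithmetic only.
Matrix crossprod(const Matrix& A, const Matrix& B) {
    assert(A.nrow == B.nrow);
    Matrix out(A.ncol, B.ncol);
    for (std::size_t j = 0; j < B.ncol; ++j) {
        for (std::size_t i = 0; i < A.ncol; ++i) {
            double s = 0.0;
            for (std::size_t k = 0; k < A.nrow; ++k) {
                s += A(k, i) * B(k, j);
            }
            out(i, j) = s;
        }
    }
    return out;
}
```

Calling crossprod(X, tabulate_columns(Y, X.nrow)) should reproduce col_sums_direct(X, Y); dividing by N then gives the means, as in the all.equal check. The speedup in practice would come from handing the product to an optimized routine rather than this triple loop. In the timings above, user time is far larger than elapsed time, which suggests Crossprod ran on several threads.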

Comments

That is brilliant, @jblood94! Thank you.
