gpt4 book ai didi

r - 在矩阵中查找子矩阵

转载 作者:行者123 更新时间:2023-12-04 11:34:15 25 4
gpt4 key购买 nike

我有一个矩阵,如:

df<-data.frame(a=c(1,2,5,4,5,4), b=c(3,4,8,6,7,4))

我想知道以下矩阵是否包含在前一个矩阵中以及在哪里:
df1<-data.frame(a=c(5,4), b=c(7,4))

我知道如何寻找一个元素:
which( df ==df1[1,1], arr.ind=T )

但不是完全矩阵。我需要获取大矩阵中子矩阵的坐标。在这种情况下将是
(5,1;6,2)

有没有办法不用循环就解决这个问题?

最佳答案

Rcpp 是解决此类问题的好工具。

我在这里有点过火,写了一个非常复杂的函数,它可以找到任何维度的较小数组中较小数组的所有匹配项的最低索引(矩阵的左上角)角的坐标。如果你想在一个 11 维数组中找到一个 9 维数组的所有位置,这个函数可以为你做。

这里是:

library('Rcpp');
cppFunction('
IntegerMatrix findarray(IntegerVector big, IntegerVector small, bool nacmp=true ) {

// debugging macros
#define QUOTEID(...) #__VA_ARGS__
#define QUOTE(...) QUOTEID(__VA_ARGS__)
#define PRINT_VEC(vec,...) Rprintf(QUOTE(vec)"={"); if (vec.size() > 0) { Rprintf("%ld",vec[0]); for (size_t i = 1; i < vec.size(); ++i) Rprintf(",%ld",vec[i]); } Rprintf("}"__VA_ARGS__);

typedef std::vector<size_t> Dims;

// get big dimensions, treating a plain vector as a 1D array
Dims bigdims;
SEXP bigdimsSE = big.attr("dim");
if (Rf_isNull(bigdimsSE)) {
bigdims.push_back(big.size());
} else {
bigdims = as<Dims>(bigdimsSE);
}
//PRINT_VEC(bigdims,"\\n");
// now we can use this macro to easily return a result matrix with no matches
#define RES_NOMATCH IntegerMatrix(0,bigdims.size())

// get small dimensions, treating a plain vector as a 1D array
Dims smalldims;
SEXP smalldimsSE = small.attr("dim");
if (Rf_isNull(smalldimsSE)) {
smalldims.push_back(small.size());
} else {
smalldims = as<Dims>(smalldimsSE);
}
//PRINT_VEC(smalldims,"\\n");

// trivial case: if small has greater dimensionality than big, just return no matches
// note: we could theoretically support this case, at least when all extra small dimensions have only one index, but whatever
if (smalldims.size() > bigdims.size())
return RES_NOMATCH;

// derive a "bounds" Dims object, which will represent the maximum index plus one in big against which we must compare the first index in small for the corresponding dimension
// if small is greater than big in any dimension, then we can return no matches immediately
Dims bounds(smalldims.size());
for (size_t i = 0; i < smalldims.size(); ++i) {
if (smalldims[i] > bigdims[i])
return RES_NOMATCH;
bounds[i] = bigdims[i]-smalldims[i]+1;
}

// trivial case: if either big or small has any zero-length dimension, then just return no matches, because in that case the offending argument cannot have any actual data in it
// theoretically you can consider such degenerate arrays to match everywhere, sort of like the empty string matching at every position in any given string, but whatever
for (size_t i = 0; i < bigdims.size(); ++i) if (bigdims[i] == 0) return RES_NOMATCH;
for (size_t i = 0; i < smalldims.size(); ++i) if (smalldims[i] == 0) return RES_NOMATCH;

// prepare to build up the result data
// it would not make sense to build up the result data directly in a matrix, because we have to add one row at a time, which does not commute with the internal storage arrangement of matrices
// I then tried to use a data.frame, but the Rcpp DataFrame type is surprisingly light in functionality, seemingly without any provision for adding a row, and requires named columns, so best to avoid that
// instead, we\'ll just build up the data on a vector of vectors, going all-STL
typedef std::vector<std::vector<int> > ResBuilder;
ResBuilder resBuilder(bigdims.size());

// retrieve raw vector pointers for best performance
int* bigp = INTEGER(big);
int* smallp = INTEGER(small);

// now, iterate through each index of each (big) dimension from zero through the bound for that dimension (which is automatically the big dimension\'s length if small\'s dimensionality does not extend to that dimension), and see if small\'s first element matches
Dims bdis(bigdims.size()); // conveniently, initializes to all zeroes
size_t bvi = 0; // big vector index
while (true) { // big element loop, restricted to bounds
if (bigp[bvi] == smallp[0] && (nacmp || bigp[bvi] != NA_INTEGER)) {
//PRINT_VEC(bdis," ") Rprintf("found first element match at bvi=%ld big=small=%d\\n",bvi,bigp[bvi]);
size_t bvi2 = bvi; // don\'t screw up the original bvi; matches can overlap
// now we need to iterate through each index of each (small) dimension and test if all remaining elements match
Dims sdis(smalldims.size()); // conveniently, initializes to all zeroes
size_t svi = 0;
bool match = true; // assumption
while (true) { // small element loop
// note: once inside this inner loop, we don\'t have to worry about bounds anymore, because we already enforced that the outer loop will only iterate over indexes within bounds
// increment small and big indexes
++svi; // always increment svi by exactly one; the small array governs this matching loop
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("incremented svi=%ld\\n",svi);
size_t bm = 1;
size_t d;
for (d = 0; d < sdis.size(); ++d) {
++sdis[d];
++bvi2;
if (sdis[d] == smalldims[d]) {
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("reached small end=%ld of dimension d=%ld; bvi2=%ld bm=%ld\\n",smalldims[d],d,bvi2,bm);
sdis[d] = 0;
bvi2 += (bigdims[d]-smalldims[d])*bm-1;
bm *= bigdims[d];
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("after jumping to next index we have bvi2=%ld bm=%ld\\n",bvi2,bm);
} else {
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("valid dimension index increment at dimension d=%ld; bvi2=%ld bm=%ld\\n",d,bvi2,bm);
break;
}
}
// test if we reached the end of small; then break the inner while loop, and we have a match
if (d == sdis.size())
break;
// at this point, we have a new element to test; if unequal, we have no match
if (bigp[bvi2] != smallp[svi] || !nacmp && bigp[bvi] == NA_INTEGER) {
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("match overturned by big=%d != small=%d\\n",bigp[bvi2],smallp[svi]);
match = false;
break;
} else {
//PRINT_VEC(bdis," ") PRINT_VEC(sdis," ") Rprintf("match respected by big=small=%d\\n",bigp[bvi2]);
}
}
// if we have a match, add it to the result data
if (match) {
//PRINT_VEC(bdis," ") Rprintf("found complete match!\\n");
for (size_t bd = 0; bd < bigdims.size(); ++bd)
resBuilder[bd].push_back(bdis[bd]+1); // also add one to convert from C++ zero-based to R one-based indexes
//PRINT_VEC(bdis," ") Rprintf("resBuilder dims = {%ld,%ld}\\n",resBuilder[0].size(),resBuilder.size());
}
} else {
//PRINT_VEC(bdis," ") Rprintf("first element mismatch: big=%d != small=%d\\n",bigp[bvi],smallp[0]);
}
// increment big index
size_t bm = 1;
size_t d;
for (d = 0; d < bdis.size(); ++d) {
++bdis[d];
++bvi;
size_t bound = bounds.size() > d ? bounds[d] : bigdims[d];
if (bdis[d] >= bound) {
//PRINT_VEC(bdis," ") Rprintf("big index hit bound=%ld of dimension d=%ld; bvi=%ld bm=%ld\\n",bound,d,bvi,bm);
bdis[d] = 0;
bvi += (bigdims[d]-bound)*bm-1;
bm *= bigdims[d];
//PRINT_VEC(bdis," ") Rprintf("after advancing big index we have bvi=%ld bm=%ld\\n",bvi,bm);
} else {
//PRINT_VEC(bdis," ") Rprintf("valid dimension index increment at dimension d=%ld; bvi=%ld bm=%ld\\n",d,bvi,bm);
break;
}
}
// test if we reached the end of big; then break the outer while loop, and we\'re done
if (d == bdis.size() || bvi >= big.size())
break;
}

// copy to a matrix
IntegerMatrix res(resBuilder[0].size(),resBuilder.size());
int* resp = INTEGER(res);
for (size_t c = 0; c < res.ncol(); ++c)
std::copy(resBuilder[c].begin(),resBuilder[c].end(),resp+c*res.nrow());

// return the matrix
return res;

}
');

这是我所做的一些相当随意的测试,最多只到立方体中的立方体(每个测试打印 big 数组,然后是 small 数组,然后是结果,最后是一个逻辑向量,测试大小为 small 的切片是否从每个连续的 big 中的匹配实际上与 small 相同):
## testing
slice <- function(arr,is,ls,...) { length(ls) <- length(is); ls[is.na(ls)] <- 1; do.call(`[`,c(list(arr),Map(function(i,l) seq(i,len=l),is,ls),...)); };
printAndTest <- function(big,small) { print(big); print(small); findarray(big,small); };
printAndTestAndSliceIdentical <- function(big,small) { big <- structure(as.integer(big),dim=dim(big)); small <- structure(as.integer(small),dim=dim(small)); res <- printAndTest(big,small); print(res); if (nrow(res) > 0) sapply(1:nrow(res),function(r) identical(structure(slice(big,res[r,],if (is.null(dim(small))) length(small) else dim(small),drop=F),dim=dim(small)),small)) else logical(); };

## one-element match
printAndTestAndSliceIdentical(1,1);
## [1] 1
## [1] 1
## [,1]
## [1,] 1
## [1] TRUE

## vector in vector
printAndTestAndSliceIdentical(1:3,2:3);
## [1] 1 2 3
## [1] 2 3
## [,1]
## [1,] 2
## [1] TRUE
printAndTestAndSliceIdentical(1:3,1:3);
## [1] 1 2 3
## [1] 1 2 3
## [,1]
## [1,] 1
## [1] TRUE
printAndTestAndSliceIdentical(1:3,1:4);
## [1] 1 2 3
## [1] 1 2 3 4
## [,1]
## logical(0)

## vector in matrix
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),1:2);
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [1] 1 2
## [,1] [,2]
## [1,] 1 1
## [2,] 1 4
## [1] TRUE TRUE
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),12);
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [1] 12
## [,1] [,2]
## [1,] 4 3
## [2,] 4 6
## [1] TRUE TRUE
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),5:8);
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [1] 5 6 7 8
## [,1] [,2]
## [1,] 1 2
## [2,] 1 5
## [1] TRUE TRUE
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),5:9);
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [1] 5 6 7 8 9
## [,1] [,2]
## logical(0)

## matrix in matrix
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),matrix(1:4,2));
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
## [,1] [,2]
## logical(0)
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),matrix(c(2,3,6,7),2));
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [,1] [,2]
## [1,] 2 6
## [2,] 3 7
## [,1] [,2]
## [1,] 2 1
## [2,] 2 4
## [1] TRUE TRUE
printAndTestAndSliceIdentical(matrix(rep(1:12,2),4),matrix(c(7,8,11,12),2));
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 1 5 9
## [2,] 2 6 10 2 6 10
## [3,] 3 7 11 3 7 11
## [4,] 4 8 12 4 8 12
## [,1] [,2]
## [1,] 7 11
## [2,] 8 12
## [,1] [,2]
## [1,] 3 2
## [2,] 3 5
## [1] TRUE TRUE

## vector in cube
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),1);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 1
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 1 1 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),8);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 8
## [,1] [,2] [,3]
## [1,] 4 2 1
## [2,] 4 2 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),9);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 9
## [,1] [,2] [,3]
## [1,] 1 3 1
## [2,] 1 3 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),12);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 12
## [,1] [,2] [,3]
## [1,] 4 3 1
## [2,] 4 3 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),1:4);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 1 2 3 4
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 1 1 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),1:5);
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [1] 1 2 3 4 5
## [,1] [,2] [,3]
## logical(0)

## matrix in cube
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),matrix(c(7,8,11,12),2));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [,1] [,2]
## [1,] 7 11
## [2,] 8 12
## [,1] [,2] [,3]
## [1,] 3 2 1
## [2,] 3 2 2
## [1] TRUE TRUE
printAndTestAndSliceIdentical(array(1:12,c(4,3,2)),matrix(c(7,8,11,11),2));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## [,1] [,2]
## [1,] 7 11
## [2,] 8 11
## [,1] [,2] [,3]
## logical(0)

## cube in cube
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(c(1,13,25),c(1,1,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1]
## [1,] 1
##
## , , 2
##
## [,1]
## [1,] 13
##
## , , 3
##
## [,1]
## [1,] 25
##
## [,1] [,2] [,3]
## [1,] 1 1 1
## [1] TRUE
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(c(6,18,30),c(1,1,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1]
## [1,] 6
##
## , , 2
##
## [,1]
## [1,] 18
##
## , , 3
##
## [,1]
## [1,] 30
##
## [,1] [,2] [,3]
## [1,] 2 2 1
## [1] TRUE
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(c(18,30),c(1,1,2)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1]
## [1,] 18
##
## , , 2
##
## [,1]
## [1,] 30
##
## [,1] [,2] [,3]
## [1,] 2 2 2
## [1] TRUE
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(1:36,c(4,3,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## [,1] [,2] [,3]
## [1,] 1 1 1
## [1] TRUE
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(c(7,8,11,12,19,20,23,24,31,32,35,36),c(2,2,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1] [,2]
## [1,] 7 11
## [2,] 8 12
##
## , , 2
##
## [,1] [,2]
## [1,] 19 23
## [2,] 20 24
##
## , , 3
##
## [,1] [,2]
## [1,] 31 35
## [2,] 32 36
##
## [,1] [,2] [,3]
## [1,] 3 2 1
## [1] TRUE
printAndTestAndSliceIdentical(array(1:36,c(4,3,3)),array(c(7,8,11,12,19,20,23,24,31,32,35,37),c(2,2,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1] [,2]
## [1,] 7 11
## [2,] 8 12
##
## , , 2
##
## [,1] [,2]
## [1,] 19 23
## [2,] 20 24
##
## , , 3
##
## [,1] [,2]
## [1,] 31 35
## [2,] 32 37
##
## [,1] [,2] [,3]
## logical(0)
printAndTestAndSliceIdentical(array(1:36,c(4,3,6)),array(c(7,8,11,12,19,20,23,24,31,32,35,36),c(2,2,3)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 4
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 5
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24
##
## , , 6
##
## [,1] [,2] [,3]
## [1,] 25 29 33
## [2,] 26 30 34
## [3,] 27 31 35
## [4,] 28 32 36
##
## , , 1
##
## [,1] [,2]
## [1,] 7 11
## [2,] 8 12
##
## , , 2
##
## [,1] [,2]
## [1,] 19 23
## [2,] 20 24
##
## , , 3
##
## [,1] [,2]
## [1,] 31 35
## [2,] 32 36
##
## [,1] [,2] [,3]
## [1,] 3 2 1
## [2,] 3 2 4
## [1] TRUE TRUE

这是关于您的数据的演示:
df <- data.frame(a=c(1,2,5,4,5,4),b=c(3,4,8,6,7,4));
df1 <- data.frame(a=c(5,4),b=c(7,4));
findarray(as.matrix(df),as.matrix(df1));
## [,1] [,2]
## [1,] 5 1

我的函数只返回最低索引坐标,因为您可以通过简单地添加 small 的大小来推导出最高索引坐标,如下所示:
t(t(findarray(as.matrix(df),as.matrix(df1)))+dim(df1))-1;
## [,1] [,2]
## [1,] 6 2

请注意,转置是必要的,因为 R 将短向量循环到更大的矩阵(即跨行,然后跨列)。这对于您的特定数据显然不是必需的,因为只有一个匹配项,而且 df1 的两个维度具有相同的长度,因此无论如何都无关紧要,但在一般情况下很重要。

好的,我可以说我做到了,这是在 11D 数组中匹配 9D 的简单测试:
set.seed(12);
big <- array(sample(1:4,factorial(11),replace=T),11:1);
small <- array(sample(1:4,12,replace=T),c(2,3,2,rep(1,9-3)));
res <- findarray(big,small);
res;
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## [1,] 6 6 5 3 1 3 4 3 2 1 1
## [2,] 7 7 6 3 5 6 5 4 3 2 1
sapply(1:nrow(res),function(r) identical(structure(slice(big,res[r,],dim(small),drop=F),dim=dim(small)),small));
## [1] TRUE TRUE

想到了另一个测试这个的好方法:我们可以从大数组中取出切片,看看 findarray() 是否可以找到它们。
set.seed(96);
d <- 11;
big <- array(sample(1:4,factorial(d),replace=T),d:1);
for (i in 1:5) {
is <- sapply(d:1,sample,1);
ls <- mapply(function(i,dl) sample(dl-i+1,1),is,d:1);
small <- slice(big,is,ls,drop=F);
res <- findarray(big,small);
print(rbind(is,ls,res));
};
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## is 7 6 1 4 7 2 2 3 3 1 1
## ls 3 1 2 1 1 1 1 2 1 1 1
## 5 3 6 8 4 4 4 2 1 1 1
## 7 6 1 4 7 2 2 3 3 1 1
## 8 10 7 5 1 2 2 3 1 2 1
## 9 6 3 4 4 1 4 3 3 2 1
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## is 10 10 2 4 5 6 3 1 3 2 1
## ls 2 1 3 4 1 1 3 1 1 1 1
## 10 10 2 4 5 6 3 1 3 2 1
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## is 8 5 5 8 2 1 5 4 1 1 1
## ls 2 1 1 1 2 3 1 1 1 1 1
## 8 5 5 8 2 1 5 4 1 1 1
## 1 4 3 1 5 1 2 1 3 1 1
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## is 7 10 7 7 6 3 5 4 3 2 1
## ls 2 1 1 2 2 2 1 1 1 1 1
## 7 10 7 7 6 3 5 4 3 2 1
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## is 3 8 5 1 6 3 1 3 3 2 1
## ls 9 1 2 7 2 3 4 1 1 1 1
## 3 8 5 1 6 3 1 3 3 2 1

关于r - 在矩阵中查找子矩阵,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31330196/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com