gpt4 book ai didi

arrays - 查找 Array 中所有重复元素的所有索引

转载 作者:行者123 更新时间:2023-12-03 22:53:10 25 4
gpt4 key购买 nike

Given an Integer array, find all the indices of all the duplicate elements in it.

For example, consider A = [4, 12, 9, 8, 9, 12, 7, 1]. Since 12 and 9 have duplicates, all their indices will be returned i.e. d = [2, 3, 5, 6]. The length of A is below 200 and the integer elements are between 1 and 5000.



目前我正在使用以下功能。然而,为了满足我的要求,这个功能让我放慢了速度。是否有任何可能的性能改进?

function d = fincDuplicates(A)
U = unique(A);
[co,ce] = hist(A,U);
an = ce(co>1);
d=[];
for i=1:numel(an)
d=[d,find(A==an(i))];
end
end

最佳答案

编辑:

1:针对注释中突出显示的边缘情况更正了代码,更新了基准测试。

2:为基准测试添加了“扩展”解决方案(必须将最大 N 元素减少到 20000)。

3:添加 accumarray基准测试方法(高 N 的优胜者)和 sparse方法。

这是获得结果的另一种方法,无需使用函数 uniquehist .
它确实依赖于函数 sort .

以展开形式(如果您想查看中间步骤的结果):

A = [4, 12, 9, 8, 9, 12, 7, 1] ;

[B,I] = sort(A) ; % this will put duplicate elements side by side
df = diff(B) ; % the duplicates will return '0' when substracted
dx = find(df==0) ; % find their indices

% Since each duplicate concerns 2 elemts of the array, we add the next
% index for each "flagged" index, taking care not to duplicate the indices
% of sucessive duplicates.
if ~isempty(dx)
dd = [diff(dx)~=1 , true] ;
dx = [dx dx(dd)+1] ;
d = I(dx) % get the original position of the duplicates in original array
else
d=[] ;
end

你可以压缩到:
[B,I] = sort(A) ;
dx = find(diff(B)==0) ;
if ~isempty(dx)
d = I([dx dx([diff(dx)~=1,true])+1]) ;
else
d = [] ;
end

给出:
d =
3 2 5 6

我个人也会 sort返回的索引,但如果没有必要并且您担心性能,则可以接受未排序的结果。

这是另一个基准测试(测试从 10 到 20000 的元素数量):

在 MATLAB R2016a 上运行

bench

以及它的代码:
function ExecTimes = benchmark_findDuplicates

nOrder = (1:9).' * 10.^(1:3) ; nOrder = [nOrder(:) ; 10000 ; 20000 ] ;
npt = numel(nOrder) ;

ExecTimes = zeros(npt,6) ;

for k = 1:npt
% Sample data
N = nOrder(k) ;
A = randi(5000,[1,N]) ;

% Benchmark
f1 = @() findDuplicates_histMethod(A) ;
f2 = @() findDuplicates_histcountMethod(A) ;
f3 = @() findDuplicates_sortMethod(A) ;
f4 = @() findDuplicates_expansionMethod(A) ;
f5 = @() findDuplicates_accumarrayMethod(A) ;
f6 = @() findDuplicates_sparseMethod(A) ;
ExecTimes(k,1) = timeit( f1 ) ;
ExecTimes(k,2) = timeit( f2 ) ;
ExecTimes(k,3) = timeit( f3 ) ;
ExecTimes(k,4) = timeit( f4 ) ;
ExecTimes(k,5) = timeit( f5 ) ;
ExecTimes(k,6) = timeit( f6 ) ;

clear A
disp(N)
end

function d = findDuplicates_histMethod(A)
U = unique(A);
[co,ce] = hist(A,U);
an = ce(co>1);
d=[];
for i=1:numel(an)
d=[d,find(A==an(i))];
end
end

function d = findDuplicates_histcountMethod(A)
[~,idxu,idxc] = unique(A);
[count, ~, idxcount] = histcounts(idxc,numel(idxu));
idxkeep = count(idxcount)>1;
idx_A = 1:length(A);
d = idx_A(idxkeep);
end

function d = findDuplicates_sortMethod(A)
[B,I] = sort(A) ;
dx = find(diff(B)==0) ;
if ~isempty(dx)
d = I([dx dx([diff(dx)~=1,true])+1]) ;
else
d=[];
end
end

function d = findDuplicates_expansionMethod(A)
Ae = ones(numel(A),1) * A ;
d = find(sum(Ae==Ae.')>1) ;
end

function d = findDuplicates_accumarrayMethod(A)
d = find(ismember(A, find(accumarray(A(:), 1)>1))) ;
end

function d = findDuplicates_sparseMethod(A)
d = find(ismember(A, find(sparse(A, 1, 1)>1)));
end

end

关于arrays - 查找 Array 中所有重复元素的所有索引,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62168783/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com