
python - A fast way to find the lengths and start indices of repeated elements in an array

Reposted. Author: 行者123. Updated: 2023-12-03 15:08:56

I have an array A:

import numpy as np
A = np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0 ,0, 1, 0] )
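The lengths of the runs of consecutive 1s are:
output: [3, 2, 1]
with the corresponding start indices:
idx = [2, 6, 10]
The original array is huge, so I would prefer a solution that uses few for loops.
Edit (runtimes):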
import numpy as np
import time

A = np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0 ,0, 1, 0] )

def LoopVersion(A):
    l_A = len(A)
    size = []
    idx = []
    temp_idx = []
    temp_size = []
    for i in range(l_A):
        if A[i] == 1:
            temp_size.append(1)
            if not temp_idx:
                temp_idx = i
                idx.append(temp_idx)
        else:
            size.append( len(temp_size) )
            size = [i for i in size if i != 0]
            temp_size = []
            temp_idx = []
    return size, idx
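The loop above gives the expected result on the example array, but it appears to miss a run that reaches the end of the array, and it records extra start indices when a run begins at index 0 (because `not temp_idx` is also true for `0`). Below is a minimal sketch of a loop baseline without those edge cases. The name `loop_runs` is hypothetical and not part of the original post.

```python
def loop_runs(A):
    """Return (starts, lengths) of the runs of 1s, using a plain Python loop."""
    starts, lengths = [], []
    run_start = None              # None means "currently not inside a run"
    for i, value in enumerate(A):
        if value == 1:
            if run_start is None:
                run_start = i     # a new run begins here
        elif run_start is not None:
            starts.append(run_start)
            lengths.append(i - run_start)
            run_start = None
    if run_start is not None:     # close a run that reaches the end of A
        starts.append(run_start)
        lengths.append(len(A) - run_start)
    return starts, lengths

print(loop_runs([1, 1, 0, 1]))    # ([0, 3], [2, 1])
```

It is still a loop, so it is only meant as a correctness baseline, not as a fast solution.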
Quang's solution:
def UniqueVersion(A):
    _, idx, counts = np.unique(np.cumsum(1-A)*A, return_index=True, return_counts=True)
    return idx, counts
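A note on how the labeling works: the cumulative sum of `1-A` only increases on zeros, so it is constant within each run of 1s, and multiplying by `A` sets every zero back to label 0. `np.unique` therefore also reports a bucket for label 0, and a run that starts at index 0 would share that label. The sketch below shows the labels for the question's array and one possible way to handle both points. `unique_runs` and its leading-zero padding are my assumptions, not part of Quang's answer.

```python
import numpy as np

A = np.array([0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0])

labels = np.cumsum(1 - A) * A
print(labels.tolist())  # [0, 0, 2, 2, 2, 0, 3, 3, 0, 0, 5, 0]

def unique_runs(A):
    """Hypothetical variant: pad with a leading 0 so no run of 1s gets label 0."""
    P = np.concatenate(([0], A))
    labels = np.cumsum(1 - P) * P
    values, idx, counts = np.unique(labels, return_index=True, return_counts=True)
    keep = values != 0                   # drop the bucket that collects the zeros
    return idx[keep] - 1, counts[keep]   # subtract 1 to undo the padding offset

starts, lengths = unique_runs(A)
print(starts.tolist(), lengths.tolist())  # [2, 6, 10] [3, 2, 1]
```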
Jacco's solution:
def ConcatVersion(A):
    A = np.concatenate(([0], A, [0]))  # get rid of some edge cases
    starts = np.argwhere((A[:-1] + A[1:]) == 1).ravel()[::2]
    ends = np.argwhere((A[:-1] + A[1:]) == 1).ravel()[1::2]
    len_of_repeats = ends - starts
    return starts, len_of_repeats
Dan's solution (also handles the special cases):
def structure(A):
    ZA = np.concatenate(([0], A, [0]))
    indices = np.flatnonzero( ZA[1:] != ZA[:-1] )
    counts = indices[1:] - indices[:-1]
    return indices[::2], counts[::2]
Runtime analysis for 10000 elements:
np.random.seed(1234)
B = np.random.randint(2, size=10000)


start = time.time()
size, idx = LoopVersion(B)
end = time.time()
print ( (end - start) )
# 0.32489800453186035 seconds

start = time.time()
idx, counts = UniqueVersion(B)
end = time.time()
print ( (end - start) )
# 0.008305072784423828 seconds

start = time.time()
idx, counts = ConcatVersion(B)
end = time.time()
print ( (end - start) )
# 0.0009801387786865234 seconds

start = time.time()
idx, counts = structure(B)
end = time.time()
print ( (end - start) )
# 0.000347137451171875 seconds
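Single `time.time()` measurements like the ones above can be noisy at sub-millisecond scale. The sketch below repeats the call with the standard-library `timeit` module. The helper `best_time` and its repeat counts are arbitrary assumptions, and only `structure` is redefined so that the block runs on its own. No timings are quoted, since they depend on the machine.

```python
import timeit
import numpy as np

def structure(A):
    ZA = np.concatenate(([0], A, [0]))
    indices = np.flatnonzero(ZA[1:] != ZA[:-1])
    counts = indices[1:] - indices[:-1]
    return indices[::2], counts[::2]

np.random.seed(1234)
B = np.random.randint(2, size=10000)

def best_time(func, arg, number=100, repeat=5):
    """Best per-call time in seconds over `repeat` batches of `number` calls."""
    batches = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(batches) / number

print(f"structure: {best_time(structure, B):.6f} s per call")
```

Taking the minimum over several batches reduces the influence of unrelated system activity on the result.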

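Best answer

Here is a pedestrian attempt that solves the problem by simply programming it.
We pad A with a zero at the front and at the back to get a vector ZA, then detect the islands of 1s and the islands of 0s, which alternate in ZA, by comparing the shifted versions ZA[1:] and ZA[:-1]. (In the constructed arrays we take the even positions, which correspond to the runs of 1s in A.)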

import numpy as np

def structure(A):
    ZA = np.concatenate(([0], A, [0]))
    indices = np.flatnonzero( ZA[1:] != ZA[:-1] )
    counts = indices[1:] - indices[:-1]
    return indices[::2], counts[::2]
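The intermediate arrays make the even-position argument concrete. Because ZA starts and ends with 0, the value changes alternate between "0 to 1" and "1 to 0", so entries 0, 2, 4, ... of `indices` are run starts, and the distance to the next entry is the run length. Here is a small trace on the question's array:

```python
import numpy as np

A = np.array([0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0])
ZA = np.concatenate(([0], A, [0]))

# Positions in A where the value differs from its predecessor in ZA.
indices = np.flatnonzero(ZA[1:] != ZA[:-1])
print(indices.tolist())        # [2, 5, 6, 8, 10, 11]

# Distances between consecutive changes: lengths of 1-runs and 0-runs, alternating.
counts = indices[1:] - indices[:-1]
print(counts.tolist())         # [3, 1, 2, 2, 1]

print(indices[::2].tolist())   # [2, 6, 10]  starts of the runs of 1s
print(counts[::2].tolist())    # [3, 2, 1]   lengths of the runs of 1s
```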
Some sample runs:
In [71]: structure(np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0] ))
Out[71]: (array([ 2, 6, 10]), array([3, 2, 1]))

In [72]: structure(np.array( [1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1] ))
Out[72]: (array([ 0, 5, 9, 13, 15]), array([3, 3, 2, 1, 1]))

In [73]: structure(np.array( [1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0] ))
Out[73]: (array([0, 5, 9]), array([3, 3, 2]))

In [74]: structure(np.array( [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1] ))
Out[74]: (array([ 0, 2, 5, 7, 11, 14]), array([1, 2, 1, 3, 2, 3]))
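The sample runs include arrays that start or end with 1. As a further check, the sketch below compares `structure` with a slow reference built on `itertools.groupby`, using random inputs plus a few edge cases (empty, all zeros, all ones). `reference_runs` is a hypothetical helper, not part of the original answer.

```python
from itertools import groupby
import numpy as np

def structure(A):
    ZA = np.concatenate(([0], A, [0]))
    indices = np.flatnonzero(ZA[1:] != ZA[:-1])
    counts = indices[1:] - indices[:-1]
    return indices[::2], counts[::2]

def reference_runs(A):
    """Slow reference: walk groups of equal values and keep the runs of 1s."""
    starts, lengths, pos = [], [], 0
    for value, group in groupby(A.tolist()):
        n = len(list(group))
        if value == 1:
            starts.append(pos)
            lengths.append(n)
        pos += n
    return starts, lengths

rng = np.random.default_rng(0)
cases = [np.array([], dtype=int), np.zeros(5, dtype=int), np.ones(5, dtype=int)]
cases += [rng.integers(0, 2, size=n) for n in (1, 2, 17, 1000)]

for A in cases:
    starts, lengths = structure(A)
    assert (starts.tolist(), lengths.tolist()) == reference_runs(A)
print("all cases agree")
```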

About python - A fast way to find the lengths and start indices of repeated elements in an array: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64266229/
