
python - Efficient numpy subarrays extraction from a mask

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:06:08

I am looking for a pythonic way to extract multiple subarrays from a given array using a mask, as shown in this example:

a = np.array([10, 5, 3, 2, 1])
m = np.array([True, True, False, True, True])

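The output should be a collection of arrays like the following, where each subarray is taken from the indices of one contiguous "region" of True values in the mask m (True values adjacent to each other).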

L[0] = np.array([10, 5])
L[1] = np.array([2, 1])

Best answer

Here's one approach:

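import numpy as np

def separate_regions(a, m):
    # Pad the mask with False so islands touching either end still produce edges
    m0 = np.concatenate(([False], m, [False]))
    # Indices where the padded mask flips: even entries are starts, odd are stops
    idx = np.flatnonzero(m0[1:] != m0[:-1])
    return [a[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]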

Sample run:

In [41]: a = np.array([10, 5, 3, 2, 1])
...: m = np.array([True, True, False, True, True])
...:

In [42]: separate_regions(a, m)
Out[42]: [array([10, 5]), array([2, 1])]
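
To make the edge-detection step easier to follow, here is a minimal sketch that walks through the intermediate values for the sample input above. The names `starts`, `stops` and `regions` are introduced only for illustration and are not part of the original answer; the slicing with `idx[::2]` and `idx[1::2]` is an equivalent way of pairing up the edges, not the answer's exact code.

```python
import numpy as np

a = np.array([10, 5, 3, 2, 1])
m = np.array([True, True, False, True, True])

# Pad with False on both sides so islands touching either end get an edge.
m0 = np.concatenate(([False], m, [False]))
# m0 is [F, T, T, F, T, T, F]

# Positions where neighbouring values of the padded mask differ.
idx = np.flatnonzero(m0[1:] != m0[:-1])
# idx is [0, 2, 3, 5]

# Edges alternate: rising edge (island start), falling edge (island stop).
starts = idx[::2]   # [0, 3]
stops = idx[1::2]   # [2, 5]

# Slice a once per island (illustrative names, same result as separate_regions).
regions = [a[s:e] for s, e in zip(starts, stops)]
print(regions)  # [array([10,  5]), array([2, 1])]
```

Because the padded mask starts and ends with False, the number of edges is always even, which is why stepping through `idx` two at a time is safe.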

Runtime test

Other approaches:

# @kazemakase's soln
def zip_split(a, m):
    d = np.diff(m)
    cuts = np.flatnonzero(d) + 1

    asplit = np.split(a, cuts)
    msplit = np.split(m, cuts)

    L = [aseg for aseg, mseg in zip(asplit, msplit) if np.all(mseg)]
    return L

Timings:

In [49]: a = np.random.randint(0,9,(100000))

In [50]: m = np.random.rand(100000)>0.2

# @kazemakase's solution
In [51]: %timeit zip_split(a,m)
10 loops, best of 3: 114 ms per loop

# @Daniel Forsman's solution
In [52]: %timeit splitByBool(a,m)
10 loops, best of 3: 25.1 ms per loop

# Proposed in this post
In [53]: %timeit separate_regions(a, m)
100 loops, best of 3: 5.01 ms per loop

Increasing the average length of the islands:

In [58]: a = np.random.randint(0,9,(100000))

In [59]: m = np.random.rand(100000)>0.1

In [60]: %timeit zip_split(a,m)
10 loops, best of 3: 64.3 ms per loop

In [61]: %timeit splitByBool(a,m)
100 loops, best of 3: 14 ms per loop

In [62]: %timeit separate_regions(a, m)
100 loops, best of 3: 2.85 ms per loop
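
For readers who want to rerun a comparison outside IPython, the following is a rough sketch using the standard `timeit` module. The helper `compare_timings` and its parameters (`n`, `p_true`, `number`, `seed`) are hypothetical names introduced here, and `splitByBool` is left out because its definition does not appear in this post. Absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def separate_regions(a, m):
    m0 = np.concatenate(([False], m, [False]))
    idx = np.flatnonzero(m0[1:] != m0[:-1])
    return [a[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

def zip_split(a, m):
    cuts = np.flatnonzero(np.diff(m)) + 1
    return [aseg for aseg, mseg in zip(np.split(a, cuts), np.split(m, cuts))
            if np.all(mseg)]

def compare_timings(n=100_000, p_true=0.8, number=10, seed=0):
    """Hypothetical helper: average seconds per call for each function."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 9, n)
    m = rng.random(n) < p_true  # larger p_true gives longer islands on average

    # Sanity check: both approaches must return the same regions.
    expected = separate_regions(a, m)
    got = zip_split(a, m)
    assert len(expected) == len(got)
    assert all(np.array_equal(x, y) for x, y in zip(expected, got))

    return {
        f.__name__: timeit.timeit(lambda: f(a, m), number=number) / number
        for f in (zip_split, separate_regions)
    }

if __name__ == "__main__":
    for p in (0.8, 0.9):
        print(p, compare_timings(p_true=p))
```

Here `p_true=0.8` roughly corresponds to the `np.random.rand(100000)>0.2` mask in the first timing run and `p_true=0.9` to the `>0.1` mask in the second.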

Regarding python - Efficient numpy subarrays extraction from a mask, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43385877/
