
python - Finding constant subarrays in a large numpy array

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 03:05:25

I have a numpy float array like this

v = np.array([1.0,1.0,2.0,2.0,2.0,2.0,...])

I need to identify all constant segments in the array, like this

[{value:1.0,location:0,duration:2},..]

Efficiency is the main criterion.

Best answer

Here is one approach -

import numpy as np

def island_props(v):
    # Compare each element with its predecessor to get a mask that marks
    # the start and stop positions of each island, then take the indices
    # of the True entries.
    mask = np.concatenate(([True], v[1:] != v[:-1], [True]))
    loc0 = np.flatnonzero(mask)

    # Get the start locations
    loc = loc0[:-1]

    # The values are the input array indexed by the start locations.
    # The lengths are the differences between consecutive start/stop indices.
    return v[loc], loc, np.diff(loc0)

Sample run -

In [143]: v
Out[143]: array([ 1., 1., 2., 2., 2., 2., 5., 2.])

In [144]: value, location, lengths = island_props(v)

In [145]: value
Out[145]: array([ 1., 2., 5., 2.])

In [146]: location
Out[146]: array([0, 2, 6, 7])

In [147]: lengths
Out[147]: array([2, 4, 1, 1])
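
The question asked for a list of dicts rather than three arrays. A minimal sketch of that conversion follows; `to_records` is a hypothetical helper name that is not part of the original answer, and `island_props` is repeated only so the block runs on its own. Building Python dicts costs time proportional to the number of segments, so it may be better to stay with the arrays when efficiency matters.

```python
import numpy as np

def island_props(v):
    # Same approach as the answer's function, repeated for self-containment.
    mask = np.concatenate(([True], v[1:] != v[:-1], [True]))
    loc0 = np.flatnonzero(mask)
    loc = loc0[:-1]
    return v[loc], loc, np.diff(loc0)

def to_records(v):
    # Hypothetical helper: packs the three arrays into the question's format.
    value, location, lengths = island_props(v)
    return [
        {"value": val, "location": loc, "duration": dur}
        for val, loc, dur in zip(value.tolist(), location.tolist(), lengths.tolist())
    ]

v = np.array([1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 5.0, 2.0])
records = to_records(v)
print(records[0])  # {'value': 1.0, 'location': 0, 'duration': 2}
```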

Runtime test

Other approaches -

import itertools

def MSeifert(a):
    return [{'value': k, 'duration': len(list(v))} for k, v in
            itertools.groupby(a.tolist())]

def Kasramvd(a):
    # Use the argument a, not the global v
    return np.split(a, np.where(np.diff(a) != 0)[0] + 1)

Timings -

In [156]: v0 = np.array([1.0,1.0,2.0,2.0,2.0,2.0,5.0,2.0])

In [157]: v = np.tile(v0,10000)

In [158]: %timeit MSeifert(v)
...: %timeit Kasramvd(v)
...: %timeit island_props(v)
...:
10 loops, best of 3: 44.7 ms per loop
10 loops, best of 3: 36.1 ms per loop
10000 loops, best of 3: 140 µs per loop
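
The `%timeit` calls above need IPython. As a rough sketch, a similar comparison could be run from a plain script with the standard `timeit` module. `groupby_runs` is a hypothetical stand-in for the groupby approach that returns tuples instead of dicts, and the absolute numbers will depend on the machine, so none are shown here.

```python
import itertools
import timeit

import numpy as np

def island_props(v):
    # Same approach as the answer's function, repeated for self-containment.
    mask = np.concatenate(([True], v[1:] != v[:-1], [True]))
    loc0 = np.flatnonzero(mask)
    return v[loc0[:-1]], loc0[:-1], np.diff(loc0)

def groupby_runs(a):
    # Hypothetical stand-in for the groupby approach: (value, length) pairs.
    return [(k, len(list(g))) for k, g in itertools.groupby(a.tolist())]

v = np.tile(np.array([1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 5.0, 2.0]), 10000)

# Check that the two approaches agree before timing them.
value, location, lengths = island_props(v)
assert groupby_runs(v) == list(zip(value.tolist(), lengths.tolist()))

# Best of 3 repeats, 10 calls each, reported per call.
for fn in (groupby_runs, island_props):
    best = min(timeit.repeat(lambda: fn(v), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.3f} ms per call")
```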

Regarding python - finding constant subarrays in a large numpy array, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46502265/
