
python - Fastest way to arrange data in Python numpy based on a list

Reposted. Author: 太空狗. Updated: 2023-10-30 01:17:20

I have a problem arranging data in numpy. For example, I have a list of row lengths:

numpy.array([1,3,5,4,6])

I have the data:

numpy.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19])

I need to arrange the data like this, padding each row with 9999:

numpy.array([
    [1, 9999, 9999, 9999, 9999, 9999],
    [2, 3, 4, 9999, 9999, 9999],
    [5, 6, 7, 8, 9, 9999],
    [10, 11, 12, 13, 9999, 9999],
    [14, 15, 16, 17, 18, 19]
])

I think it is somewhat similar to the diag/diagonal/trace functions.

I usually do this with basic iteration. Does numpy have a function for this that would run faster?

Best answer

Here are some ways to arrange the data:

from numpy import arange, array, ones, r_, zeros
from numpy.random import randint

def gen_tst(m, n):
    # a: m random row lengths in [1, n); b: flat data; c: output filled with 999
    a = randint(1, n, m)
    b, c = arange(a.sum()), ones((m, n), dtype=int) * 999
    return a, b, c

def basic_1(a, b, c):
    # some assumed basic iteration based
    n = 0
    for k in range(len(a)):  # was xrange (Python 2)
        m = a[k]
        c[k, :m], n = b[n: n + m], n + m

def advanced_1(a, b, c):
    # based on Sven's answer: build explicit (row, column) index pairs
    cum_a = r_[0, a.cumsum()]
    i = arange(len(a)).repeat(a)
    j = arange(cum_a[-1]) - cum_a[:-1].repeat(a)
    c[i, j] = b

def advanced_2(a, b, c):
    # other loopless version: boolean mask of the cells to fill
    c[arange(c.shape[1]) + zeros((len(a), 1), dtype=int) < a[:, None]] = b
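To make the index arithmetic in advanced_1 easier to follow, here is a minimal sketch that applies the same idea to the exact arrays from the question. The names lengths, data, FILL, starts, rows and cols are illustrative and do not appear in the original answer, and np.full is used here in place of the ones(...) * 999 idiom.

```python
import numpy as np

lengths = np.array([1, 3, 5, 4, 6])  # row lengths from the question
data = np.arange(1, 20)              # the values 1..19 from the question
FILL = 9999                          # padding value used in the question

# Offset in `data` at which each row starts, plus the total length at the end.
starts = np.r_[0, lengths.cumsum()]

# Row index of every element: row k is repeated lengths[k] times.
rows = np.arange(len(lengths)).repeat(lengths)

# Column index of every element: its flat position minus its row's start offset.
cols = np.arange(starts[-1]) - starts[:-1].repeat(lengths)

out = np.full((len(lengths), lengths.max()), FILL, dtype=data.dtype)
out[rows, cols] = data  # fancy-index assignment places each value in its cell

print(rows)  # [0 1 1 1 2 2 2 2 2 3 3 3 3 4 4 4 4 4 4]
print(cols)  # [0 0 1 2 0 1 2 3 4 0 1 2 3 0 1 2 3 4 5]
print(out)
```

The printed out array has 5 rows and 6 columns, with the first lengths[k] cells of row k taken from data in order and the rest left at 9999.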

And some timings:

In []: m, n= 10, 100
In []: a, b, c= gen_tst(m, n)
In []: 1.* a.sum()/ (m* n)
Out[]: 0.531
In []: %timeit advanced_1(a, b, c)
10000 loops, best of 3: 99.2 us per loop
In []: %timeit advanced_2(a, b, c)
10000 loops, best of 3: 68 us per loop
In []: %timeit basic_1(a, b, c)
10000 loops, best of 3: 47.1 us per loop

In []: m, n= 50, 500
In []: a, b, c= gen_tst(m, n)
In []: 1.* a.sum()/ (m* n)
Out[]: 0.455
In []: %timeit advanced_1(a, b, c)
1000 loops, best of 3: 1.03 ms per loop
In []: %timeit advanced_2(a, b, c)
1000 loops, best of 3: 1.06 ms per loop
In []: %timeit basic_1(a, b, c)
1000 loops, best of 3: 227 us per loop

In []: m, n= 250, 2500
In []: a, b, c= gen_tst(m, n)
In []: 1.* a.sum()/ (m* n)
Out[]: 0.486
In []: %timeit advanced_1(a, b, c)
10 loops, best of 3: 30.4 ms per loop
In []: %timeit advanced_2(a, b, c)
10 loops, best of 3: 32.4 ms per loop
In []: %timeit basic_1(a, b, c)
1000 loops, best of 3: 2 ms per loop

So the basic iteration seems to be quite efficient.
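The timings above come from an IPython session. Outside IPython, a comparison along the same lines can be sketched with the standard timeit module. This is only a rough harness under stated assumptions: make_case, loop_fill and mask_fill are hypothetical names, the random lengths come from np.random.default_rng rather than the randint call used above, and the numbers it prints will depend on the machine and the numpy version.

```python
import timeit
import numpy as np

def make_case(m, n, fill=999, seed=0):
    """Build m random row lengths in [1, n), flat data, and a filled output."""
    rng = np.random.default_rng(seed)
    a = rng.integers(1, n, size=m)
    b = np.arange(a.sum())
    c = np.full((m, n), fill, dtype=b.dtype)
    return a, b, c

def loop_fill(a, b, c):
    # Plain Python loop: copy one slice of b into the start of each row.
    n = 0
    for k, m in enumerate(a):
        c[k, :m] = b[n:n + m]
        n += m

def mask_fill(a, b, c):
    # Loopless variant: boolean mask marking the first a[k] cells of row k.
    c[np.arange(c.shape[1]) < a[:, None]] = b

a, b, c = make_case(50, 500)
repeats = 20
for fn in (loop_fill, mask_fill):
    total = timeit.timeit(lambda: fn(a, b, c), number=repeats)
    print(f"{fn.__name__}: {total / repeats * 1e6:.1f} us per call")
```

Both functions write into the same preallocated c, as in the session above, so the measurement covers only the fill step and not the allocation.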

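Update:
Of course, the performance of the basic iteration-based implementation can still be improved further. As a starting point, consider for example this (basic iteration with fewer additions):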

def basic_2(a, b, c):
    n = 0
    for k, m in enumerate(a):
        nm = n + m
        c[k, :m], n = b[n: nm], nm

Regarding python - Fastest way to arrange data in Python numpy based on a list, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5456194/
