
python - numpy apply_along_axis with differing result sizes

Reposted. Author: 行者123. Updated: 2023-11-28 22:28:20

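I have a function that returns a subset of a column, and I want to apply it efficiently to every column. The result is therefore no longer a matrix but a list of columns of different lengths. Because of the size mismatch, I could not get numpy apply_along_axis to do this. Is there an efficient way to do it other than iterating over the columns myself?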

col_pred = lambda x: [v for v in x if v > 0.5]
filteredData = np.apply_along_axis(col_pred, 0, data)
# ValueError: could not broadcast input array from shape (3) into shape (4)

For example, given the input

data = [[0, 1, 1, 0], [1, 1, 1, 1]]
# my real data is more like a matrix with a lot of rows in [0-1]
# that can be simulated with
# data = [[random.uniform(0, 1) for i in range(10)] for j in range(100000)]

I would like to get

[[1, 1], [1, 1, 1, 1]]

Best answer

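Looking at your code, it seems you are trying to output, for each column, all elements greater than the threshold 0.5. Here is one way to do that, which also generalizes to work along either rows or columns: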

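import numpy as np

def threshold_along_an_axis(a, thresh=0.5, axis=0):
    # Always work row-wise; transpose so columns become rows when axis == 0
    if axis == 0:
        A = a.T
    else:
        A = a
    mask = A > thresh           # elements to keep
    s = mask.sum(1)             # number of kept elements per row of A
    s0 = np.r_[0, s.cumsum()]   # start/stop offsets into the flat selection
    arr = A[mask].tolist()  # Skip .tolist() if list of arrays is needed as o/p
    return [arr[s0[i]:s0[i+1]] for i in range(len(s0)-1)]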

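The intent here is to do as little work as possible inside the list comprehension: the comparison, the counting, and the element selection are all vectorized, so the comprehension only slices the flat result.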
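To make the intermediate steps concrete, here is a small walkthrough of the same mask, count, and offset computation on the two-row input from the question. It processes rows (the axis=1 case of the function above); the variable names `counts`, `offsets`, `flat`, and `pieces` are chosen here for illustration and are not part of the original answer.

```python
import numpy as np

# Input from the question; each row is filtered independently.
A = np.array([[0, 1, 1, 0], [1, 1, 1, 1]])

mask = A > 0.5                        # boolean matrix marking kept elements
counts = mask.sum(1)                  # kept elements per row
offsets = np.r_[0, counts.cumsum()]   # start/stop positions in the flat result
flat = A[mask].tolist()               # all kept elements, in row-major order

# The comprehension only slices the already-filtered flat list.
pieces = [flat[offsets[i]:offsets[i + 1]] for i in range(len(counts))]

print(counts.tolist())   # [2, 4]
print(offsets.tolist())  # [0, 2, 6]
print(pieces)            # [[1, 1], [1, 1, 1, 1]]
```

The last line reproduces the output requested in the question.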

Sample run:

In [1]: a = np.random.rand(4,5)

In [2]: a
Out[2]:
array([[ 0.45973245, 0.3671334 , 0.12000436, 0.04205402, 0.74729737],
       [ 0.55217308, 0.4018889 , 0.55695863, 0.55824384, 0.33435153],
       [ 0.32450124, 0.07713855, 0.09126221, 0.13150986, 0.27961361],
       [ 0.0876053 , 0.42685005, 0.53034652, 0.15084453, 0.51518185]])

In [3]: threshold_along_an_axis(a, thresh=0.5, axis=0) # per column
Out[3]:
[[0.5521730819881912],
 [],
 [0.5569586261866918, 0.5303465159370833],
 [0.5582438446718111],
 [0.7472973699509776, 0.5151818458812673]]

In [4]: threshold_along_an_axis(a, thresh=0.5, axis=1) # per row
Out[4]:
[[0.7472973699509776],
 [0.5521730819881912, 0.5569586261866918, 0.5582438446718111],
 [],
 [0.5303465159370833, 0.5151818458812673]]
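If a list of arrays is acceptable as output, the slicing step could probably also be handed to `np.split`, which cuts an array at given positions. The sketch below is an assumption-laden variant, not the answer's code: the function name `split_above_threshold` is hypothetical, and the check against a plain per-column loop is only there to show that both approaches agree on a random input.

```python
import numpy as np

def split_above_threshold(a, thresh=0.5, axis=0):
    """Hypothetical variant: one 1-D array per column (axis=0) or row (axis=1)."""
    A = np.asarray(a)
    A = A.T if axis == 0 else A
    mask = A > thresh
    # np.split expects only the interior cut points, so the final
    # cumulative count (the total length) is dropped.
    cuts = mask.sum(1).cumsum()[:-1]
    return np.split(A[mask], cuts)

rng = np.random.default_rng(0)
a = rng.random((4, 5))

# Reference result from an explicit loop over columns.
expected = [[v for v in col if v > 0.5] for col in a.T]
result = [piece.tolist() for piece in split_above_threshold(a, axis=0)]
assert result == expected
```

Columns with no element above the threshold come back as empty arrays, matching the empty lists in the sample run.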

Regarding python - numpy apply_along_axis with differing result sizes, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43618825/
