
python - Updating node values with numpy without a for loop

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:02:27

I am trying to update the node values on a mesh from element values.

In the array faces I define the node IDs of each element (assume I have only two elements):

faces = np.array([[0, 1, 2], [1, 3, 2]])

Say the array force_el holds, for each element, the force vector acting on each of its nodes:

force_el = np.array([[0.7, 1.1], [1.2, 0.3]])

Now I want to update the nodal forces force_node:

force_node = np.zeros((4, force_el.shape[1]))
for face, fel in zip(faces, force_el):
    force_node[face.ravel(), :] += fel

So the result is:

>>> force_node
array([[0.7, 1.1],
       [1.9, 1.4],
       [1.9, 1.4],
       [1.2, 0.3]])

Since this update has to be performed many times (roughly 100k-1m times), I am trying to optimize it, but I can't find a good solution.

Best answer

You can use matrix multiplication for this:

out_nrows = 4 # number of nodes
mask = np.zeros((len(faces),out_nrows),dtype=bool)
np.put_along_axis(mask,faces,True,axis=1)
force_node_out = mask.T.dot(force_el)
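
The reason this works: mask[e, n] is True when element e touches node n, so each row of mask.T.dot(force_el) sums the force_el rows of every element containing that node. Below is a minimal sketch that checks this against the original loop on the two-element example from the question. The function names scatter_loop and scatter_matmul are illustrative and do not come from the original post.

```python
import numpy as np

def scatter_loop(faces, force_el, n_nodes):
    # Reference implementation: the loop from the question.
    force_node = np.zeros((n_nodes, force_el.shape[1]))
    for face, fel in zip(faces, force_el):
        force_node[face, :] += fel
    return force_node

def scatter_matmul(faces, force_el, n_nodes):
    # mask[e, n] is True when element e contains node n.
    mask = np.zeros((len(faces), n_nodes), dtype=bool)
    np.put_along_axis(mask, faces, True, axis=1)
    # (n_nodes, n_el) @ (n_el, n_comp) -> (n_nodes, n_comp)
    return mask.T.dot(force_el)

faces = np.array([[0, 1, 2], [1, 3, 2]])
force_el = np.array([[0.7, 1.1], [1.2, 0.3]])

result_loop = scatter_loop(faces, force_el, 4)
result_matmul = scatter_matmul(faces, force_el, 4)
print(np.allclose(result_loop, result_matmul))
```

Note that the mask is dense, with one entry per (element, node) pair, so its size grows with the product of the two counts.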

Since force_el has only a few columns, we can also use np.bincount for better performance:

out_nrows = 4 # number of nodes
out = np.zeros((out_nrows, force_el.shape[1]))
n = faces.shape[1]
l = force_el.shape[1]
for i in range(n):
    for j in range(l):
        out[:,j] += np.bincount(faces[:,i],force_el[:,j],minlength=out_nrows)
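
As a variation on the np.bincount approach (not part of the original answer), the double loop can in principle be folded into a single call by encoding each (node, component) pair as one flat index. The sketch below assumes non-negative integer node IDs smaller than n_nodes. The name scatter_bincount_flat is hypothetical, and whether this is faster than the looped version has not been measured here.

```python
import numpy as np

def scatter_bincount_flat(faces, force_el, n_nodes):
    n_comp = force_el.shape[1]
    # Flat index into an (n_nodes, n_comp) output: node_id * n_comp + component.
    # Resulting shape: (n_elements, nodes_per_element, n_comp).
    flat_idx = faces[:, :, None] * n_comp + np.arange(n_comp)
    # Every node of an element receives that element's force vector.
    weights = np.broadcast_to(force_el[:, None, :], flat_idx.shape)
    out = np.bincount(flat_idx.ravel(), weights.ravel(),
                      minlength=n_nodes * n_comp)
    return out.reshape(n_nodes, n_comp)

faces = np.array([[0, 1, 2], [1, 3, 2]])
force_el = np.array([[0.7, 1.1], [1.2, 0.3]])
force_node_flat = scatter_bincount_flat(faces, force_el, 4)
print(force_node_flat)
```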

Timings:

In [35]: # Setup data (from OP's comments)
...: np.random.seed(0)
...: faces=np.array([np.random.choice(1800,3,replace=0) for i in range(3500)])
...: force_el = np.random.rand(len(faces),3)

In [36]: %%timeit # Original loopy soln
...: out_nrows = 1800
...: force_node = np.zeros((out_nrows, force_el.shape[1]))
...: for face, fel in zip(faces, force_el):
...:     force_node[face.ravel(), :] += fel
100 loops, best of 3: 16.1 ms per loop

In [37]: %%timeit # @RafaelC's soln with np.add.at
...: force_node = np.zeros((1800, force_el.shape[1]))
...: np.add.at(force_node, faces, force_el[:,None])
100 loops, best of 3: 2.45 ms per loop

In [38]: %%timeit # Posted in this post that uses matrix-multiplication
...: out_nrows = 1800
...: mask = np.zeros((len(faces),out_nrows),dtype=bool)
...: np.put_along_axis(mask,faces,True,axis=1)
...: force_node_out = mask.T.dot(force_el)
10 loops, best of 3: 38.4 ms per loop

In [39]: %%timeit # Posted in this post that uses bincount
...: out_nrows = 1800
...: out = np.zeros((out_nrows, force_el.shape[1]))
...: n = faces.shape[1]
...: l = force_el.shape[1]
...: for i in range(n):
...:     for j in range(l):
...:         out[:,j]+=np.bincount(faces[:,i],force_el[:,j],minlength=out_nrows)
10000 loops, best of 3: 149 µs per loop
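
The np.add.at call in the timings above relies on broadcasting: indexing force_node with faces yields shape (n_elements, 3, n_components), and force_el[:, None] has shape (n_elements, 1, n_components), so each element's force is applied to all three of its nodes. A plain fancy-indexed += does not accumulate over repeated indices, which is why np.add.at is needed. The sketch below illustrates the difference on the question's two-element example; the variable names buffered and unbuffered are illustrative.

```python
import numpy as np

faces = np.array([[0, 1, 2], [1, 3, 2]])
force_el = np.array([[0.7, 1.1], [1.2, 0.3]])

# Buffered in-place add: nodes 1 and 2 appear in both elements,
# but each receives only a single contribution.
buffered = np.zeros((4, 2))
buffered[faces] += force_el[:, None]

# Unbuffered add: every occurrence of a node index is accumulated.
unbuffered = np.zeros((4, 2))
np.add.at(unbuffered, faces, force_el[:, None])

print(unbuffered)
print(np.allclose(buffered, unbuffered))
```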

Regarding python - updating node values with numpy without a for loop, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55459372/
