gpt4 book ai didi

python - 数组中点之间的快速加权欧氏距离

转载 作者:太空狗 更新时间:2023-10-30 02:04:37 24 4
gpt4 key购买 nike

我需要有效地计算给定数组中每个 x,y 点到其他每个 x,y 点的欧几里得加权距离在另一个数组中。这是我的代码,可以按预期工作:

import numpy as np
import random

def rand_data(integ):
'''
Function that generates 'integ' random values between [0.,1.)
'''
rand_dat = [random.random() for _ in range(integ)]

return rand_dat

def weighted_dist(indx, x_coo, y_coo):
'''
Function that calculates *weighted* euclidean distances.
'''
dist_point_list = []
# Iterate through every point in array_2.
for indx2, x_coo2 in enumerate(array_2[0]):
y_coo2 = array_2[1][indx2]
# Weighted distance in x.
x_dist_weight = (x_coo-x_coo2)/w_data[0][indx]
# Weighted distance in y.
y_dist_weight = (y_coo-y_coo2)/w_data[1][indx]
# Weighted distance between point from array_1 passed and this point
# from array_2.
dist = np.sqrt(x_dist_weight**2 + y_dist_weight**2)
# Append weighted distance value to list.
dist_point_list.append(round(dist, 8))

return dist_point_list


# Generate random x,y data points.
array_1 = np.array([rand_data(10), rand_data(10)], dtype=float)

# Generate weights for each x,y coord for points in array_1.
w_data = np.array([rand_data(10), rand_data(10)], dtype=float)

# Generate second larger array.
array_2 = np.array([rand_data(100), rand_data(100)], dtype=float)


# Obtain *weighted* distances for every point in array_1 to every point in array_2.
dist = []
# Iterate through every point in array_1.
for indx, x_coo in enumerate(array_1[0]):
y_coo = array_1[1][indx]
# Call function to get weighted distances for this point to every point in
# array_2.
dist.append(weighted_dist(indx, x_coo, y_coo))

最终列表 dist 包含与第一个数组中的点一样多的子列表,每个子列表中的元素与第二个数组中的点一样多(加权距离)。

我想知道是否有办法让这段代码更有效率,或许可以使用 cdist函数,因为当数组有很多元素(在我的例子中它们有)并且当我必须检查很多数组的距离(我也有)时,这个过程变得非常昂贵

最佳答案

@Evan 和@Martinis Group 走在正确的轨道上 - 扩展 Evan 的答案,这里有一个函数使用广播快速计算 n 维加权欧氏距离,无需 Python 循环:

import numpy as np

def fast_wdist(A, B, W):
"""
Compute the weighted euclidean distance between two arrays of points:

D{i,j} =
sqrt( ((A{0,i}-B{0,j})/W{0,i})^2 + ... + ((A{k,i}-B{k,j})/W{k,i})^2 )

inputs:
A is an (k, m) array of coordinates
B is an (k, n) array of coordinates
W is an (k, m) array of weights

returns:
D is an (m, n) array of weighted euclidean distances
"""

# compute the differences and apply the weights in one go using
# broadcasting jujitsu. the result is (n, k, m)
wdiff = (A[np.newaxis,...] - B[np.newaxis,...].T) / W[np.newaxis,...]

# square and sum over the second axis, take the sqrt and transpose. the
# result is an (m, n) array of weighted euclidean distances
D = np.sqrt((wdiff*wdiff).sum(1)).T

return D

为了检查它是否正常工作,我们将它与使用嵌套 Python 循环的较慢版本进行比较:

def slow_wdist(A, B, W):

k,m = A.shape
_,n = B.shape
D = np.zeros((m, n))

for ii in xrange(m):
for jj in xrange(n):
wdiff = (A[:,ii] - B[:,jj]) / W[:,ii]
D[ii,jj] = np.sqrt((wdiff**2).sum())
return D

首先,让我们确保这两个函数给出相同的答案:

# make some random points and weights
def setup(k=2, m=100, n=300):
return np.random.randn(k,m), np.random.randn(k,n),np.random.randn(k,m)

a, b, w = setup()
d0 = slow_wdist(a, b, w)
d1 = fast_wdist(a, b, w)

print np.allclose(d0, d1)
# True

不用说,使用广播而不是 Python 循环的版本要快几个数量级:

%%timeit a, b, w = setup()
slow_wdist(a, b, w)
# 1 loops, best of 3: 647 ms per loop

%%timeit a, b, w = setup()
fast_wdist(a, b, w)
# 1000 loops, best of 3: 620 us per loop

关于python - 数组中点之间的快速加权欧氏距离,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19277244/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com