python - Efficient cosine distance calculation

Reposted. Author: 行者123. Updated: 2023-12-01 05:09:26

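I want to find the nearest cosine neighbors of a vector among the rows of a matrix, and have been testing the performance of a few Python functions for doing this.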

import numpy as np
import pandas as pd
import scipy.spatial.distance

def cos_loop_spatial(matrix, vector):
    """
    Calculating pairwise cosine distance using a common for loop with the scipy cosine function.
    """
    neighbors = []
    for row in range(matrix.shape[0]):
        neighbors.append(scipy.spatial.distance.cosine(vector, matrix[row,:]))
    return neighbors

def cos_loop(matrix, vector):
    """
    Calculating pairwise cosine similarity (1 - cosine distance) using a common for loop with manually calculated cosine value.
    """
    neighbors = []
    for row in range(matrix.shape[0]):
        vector_norm = np.linalg.norm(vector)
        row_norm = np.linalg.norm(matrix[row,:])
        cos_val = vector.dot(matrix[row,:]) / (vector_norm * row_norm)
        neighbors.append(cos_val)
    return neighbors

def cos_matrix_multiplication(matrix, vector):
    """
    Calculating pairwise cosine similarity (1 - cosine distance) using matrix vector multiplication.
    """
    dotted = matrix.dot(vector)
    matrix_norms = np.linalg.norm(matrix, axis=1)
    vector_norm = np.linalg.norm(vector)
    matrix_vector_norms = np.multiply(matrix_norms, vector_norm)
    neighbors = np.divide(dotted, matrix_vector_norms)
    return neighbors

cos_functions = [cos_loop_spatial, cos_loop, cos_matrix_multiplication]

# Test performance and plot the best results of each function
# (%timeit is an IPython magic, so this part needs IPython or Jupyter)
mat = np.random.randn(1000,1000)
vec = np.random.randn(1000)
cos_performance = {}
for func in cos_functions:
    func_performance = %timeit -o func(mat, vec)
    cos_performance[func.__name__] = func_performance.best

pd.Series(cos_performance).plot(kind='bar')
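
The `%timeit` line in the benchmark only runs under IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. The helper `best_time` and its `repeat`/`number` defaults are illustrative assumptions, not part of the original post, and only one of the three functions is repeated here to keep the sketch self-contained.

```python
import timeit

import numpy as np


def cos_matrix_multiplication(matrix, vector):
    """Cosine similarity between `vector` and every row of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector)
    return matrix.dot(vector) / norms


def best_time(func, *args, repeat=5, number=10):
    """Hypothetical helper: best per-call time in seconds.

    Runs `number` calls per measurement, takes `repeat` measurements,
    and keeps the smallest one.
    """
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


rng = np.random.default_rng(0)
mat = rng.standard_normal((1000, 1000))
vec = rng.standard_normal(1000)

# Same dictionary layout as in the benchmark: function name -> best time.
cos_performance = {
    cos_matrix_multiplication.__name__: best_time(cos_matrix_multiplication, mat, vec)
}
print(cos_performance)
```

Taking the minimum over several measurements mirrors the "best" figure that `%timeit -o` exposes, since the smallest timing is the one least disturbed by other activity on the machine.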

Result: [figure: bar chart of the best run time of each function]

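The cos_matrix_multiplication function is clearly the fastest of these, but I would like to know whether you have any suggestions for making the matrix-vector cosine distance calculation even more efficient.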

Best answer

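Use scipy.spatial.distance.cdist(mat, vec[np.newaxis,:], metric='cosine'), which essentially computes the pairwise distance between every pair drawn from two collections of vectors, given as the rows of the two input matrices.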
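
A minimal sketch of the `cdist` call, assuming random test data like that in the question. It also cross-checks the result against the matrix-vector formula, using the fact that cosine distance is 1 minus cosine similarity, and then picks the closest rows. The seed and the value of `k` are illustrative choices, not from the original post.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
mat = rng.standard_normal((1000, 1000))
vec = rng.standard_normal(1000)

# cdist expects two 2-D arrays, so the query vector becomes a 1 x d matrix.
distances = cdist(mat, vec[np.newaxis, :], metric="cosine")  # shape (1000, 1)
distances = distances.ravel()  # flatten to one distance per matrix row

# Cross-check against the matrix-vector formula: distance = 1 - similarity.
similarity = mat.dot(vec) / (np.linalg.norm(mat, axis=1) * np.linalg.norm(vec))
assert np.allclose(distances, 1.0 - similarity)

# Indices of the k nearest rows (smallest cosine distance).
k = 5  # illustrative choice
nearest = np.argsort(distances)[:k]
print(nearest, distances[nearest])
```

Because `cdist` returns distances while the manual formulas in the question return similarities, the nearest neighbors are the smallest values here but the largest values there.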

Regarding python - efficient cosine distance calculation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24495406/
