python - How can this code be adapted to also return the second and third "Nearest Neighbors"?

Reposted. Author: 行者123. Updated: 2023-12-01 06:41:58

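Based on the code in "calculating average distance of nearest neighbours in pandas dataframe", how can it be adapted so that the second and third nearest neighbors are also returned in new columns?

(Or, alternatively, so that an adjustable parameter defines how many neighbors to return.)

Example code: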

import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def nn(x):
    # n_neighbors=2 returns each point itself plus its single nearest neighbor
    nbrs = NearestNeighbors(
        n_neighbors=2,
        algorithm='auto',
        metric='euclidean'
    ).fit(x)
    distances, indices = nbrs.kneighbors(x)
    return distances, indices


time = [0, 0, 0, 1, 1, 2, 2]
x = [216, 218, 217, 280, 290, 130, 132]
y = [13, 12, 12, 110, 109, 3, 56]
car = [1, 2, 3, 1, 3, 4, 5]
df = pd.DataFrame({'time': time, 'x': x, 'y': y, 'car': car})

# For each time group: the index of the nearest neighbor and its distance.
# drop(columns=...) and to_numpy() replace the positional axis argument and
# as_matrix(), which are no longer available in current pandas.
nns = df.drop(columns='car').groupby('time').apply(lambda g: nn(g.to_numpy()))

groups = df.groupby('time')
nn_rows = []
for t, (distances, indices) in nns.items():
    group = groups.get_group(t)
    for j, (dist, idx) in enumerate(zip(distances, indices)):
        # Position 0 is the point itself, position 1 its nearest neighbor
        nn_rows.append({'time': t,
                        'car': group.iloc[j]['car'],
                        'nearest_neighbour': group.iloc[idx[1]]['car'],
                        'euclidean_distance': dist[1]})

nn_df = pd.DataFrame(nn_rows).set_index('time')

Resulting DataFrame:

>>> nn_df
      car  euclidean_distance  nearest_neighbour
time
0       1            1.414214                  3
0       2            1.000000                  3
0       3            1.000000                  2
1       1           10.049876                  3
1       3           10.049876                  1
2       4           53.037722                  5
2       5           53.037722                  4

How can I get the output for nearest neighbors 2, 3, and N and insert them into new columns?

Best answer

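See the documentation for NearestNeighbors.

I think your problem can be solved with the n_neighbors parameter. It specifies how many nearest neighbors to return indices and distances for.

A value of 2 is typically used when the goal is to find the single nearest neighbor other than the point itself. Because a point's distance to itself is 0, the first neighbor returned is always the point itself.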
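To make the "point itself comes first" behavior concrete, here is a minimal sketch using the three cars observed at time 0 in the question. The variable names `points`, `self_distances` and `nearest_other` are illustrative only, and the sketch assumes there are no duplicate coordinates (with duplicates, the zero-distance match is not guaranteed to be the query point).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# (x, y) positions of cars 1, 2 and 3 at time 0, taken from the question.
points = np.array([[216, 13], [218, 12], [217, 12]])

nbrs = NearestNeighbors(n_neighbors=2, metric='euclidean').fit(points)
distances, indices = nbrs.kneighbors(points)

# Column 0 pairs each point with itself (distance 0);
# column 1 holds the nearest *other* point.
self_distances = distances[:, 0]
nearest_other = indices[:, 1]
```

The values in `nearest_other` are row positions within `points`, not car IDs, so they still have to be mapped back to the `car` column, as the question's code does with `group.iloc[...]`.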

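To find the second and third nearest neighbors as well, set n_neighbors to 4. This returns the point itself followed by the next N-1 nearest neighbors (here N = 4, so three neighbors). Note that n_neighbors cannot exceed the number of points that were fitted, so a time group needs at least four rows for this setting to work.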

# Argument
n_neighbors = 4

# Indices
[point_itself, neighbor_1, neighbor_2, neighbor_3]

# Distances
[ 0, distance_1, distance_2, distance_3]
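Putting this together, one possible sketch of the adjustable version the question asks for is shown below. The helper `k_nearest_table`, its parameter `k`, and the column names `neighbour_i` / `distance_i` are hypothetical names chosen for illustration, not part of scikit-learn or of the original answer. Because the example groups contain only two or three cars, the sketch caps the number of neighbors at the group size minus one and fills the missing slots with NaN; that padding is an assumption about the desired output.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def k_nearest_table(df, k=3):
    """Hypothetical helper: one row per car with its k nearest neighbors."""
    rows = []
    for t, group in df.groupby('time'):
        xy = group[['x', 'y']].to_numpy()
        cars = group['car'].to_numpy()
        # A group of n cars offers at most n - 1 neighbors per car, and
        # n_neighbors may not exceed the number of fitted points.
        k_avail = min(k, len(group) - 1)
        nbrs = NearestNeighbors(n_neighbors=k_avail + 1,
                                metric='euclidean').fit(xy)
        distances, indices = nbrs.kneighbors(xy)
        for j in range(len(group)):
            row = {'time': t, 'car': cars[j]}
            # Position 0 is the car itself, so neighbors start at position 1.
            for i in range(1, k + 1):
                found = i <= k_avail
                row[f'neighbour_{i}'] = cars[indices[j, i]] if found else np.nan
                row[f'distance_{i}'] = distances[j, i] if found else np.nan
            rows.append(row)
    return pd.DataFrame(rows).set_index('time')


df = pd.DataFrame({'time': [0, 0, 0, 1, 1, 2, 2],
                   'x': [216, 218, 217, 280, 290, 130, 132],
                   'y': [13, 12, 12, 110, 109, 3, 56],
                   'car': [1, 2, 3, 1, 3, 4, 5]})

result = k_nearest_table(df, k=3)
```

With the question's data, only the time 0 group has a second neighbor and no group has a third, so `neighbour_3` and `distance_3` are NaN throughout. Raising an error for groups that are too small would be an equally reasonable design choice.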

Regarding python - How can this code be adapted to also return the second and third "Nearest Neighbors"?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59416726/
