
Closest pair location on MongoDB in only one query

Reposted. Author: bug小助手. Updated: 2023-10-25 13:54:47

I have a nodes database extracted from OpenStreetMap and the data is structured like this:


{_id: 0, location: {type: 'Point', coordinates: [long, lat]}},
{_id: 1, location: {type: 'Point', coordinates: [long, lat]}, ${key}: ${value}},
{_id: 2, location: {type: 'Point', coordinates: [long, lat]}, ${key}: ${value}},
{_id: 3, location: {type: 'Point', coordinates: [long, lat]}},
{_id: 4, location: {type: 'Point', coordinates: [long, lat]}, ${key}: ${value}},

As we know, OpenStreetMap has tons of tags, each made of a key and a value.
I'm trying to find the closest pair of nodes that carry a specific tag, using only one query. I couldn't get much further than a simple aggregation matching against the nodes collection. In the example below I use the tag power: tower, but it does not have to be that one.


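const result = await client.nodes_collection
  .aggregate([
    { $match: { power: 'tower' } },
    {
      $lookup: {
        from: 'nodes',
        as: 'closestNode',
        pipeline: [{ $match: { power: 'tower' } }],
      },
    },
    { $unwind: '$closestNode' },
    // $expr is needed to compare two fields; a plain '$_id' would be a literal string
    { $match: { $expr: { $ne: ['$closestNode._id', '$_id'] } } },
  ])
  .toArray();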

P.S.: It must be done in a single query.





For MongoDB version 5.3 or higher (with the help of @Juliana Aragão):

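db.nodes.aggregate([
  {$match: {power: "tower"}},
  {$lookup: {
      from: "nodes",
      as: "closestNode",
      // Pass this node's coordinates into the sub-pipeline
      let: {coords: "$location.coordinates"},
      pipeline: [
        // $geoNear needs a 2dsphere index on "location"
        {$geoNear: {
            near: {type: "Point", coordinates: "$$coords"},
            distanceField: "distFromMe",
            spherical: true,
            query: {power: "tower"}
        }},
        // Results come back nearest first; skip the node itself (distance 0)
        {$skip: 1},
        {$limit: 1},
        {$project: {distFromMe: 1}}
      ]
  }},
  // Unwrap the single-element array, then keep the pair with the smallest distance
  {$set: {closestNode: {$first: "$closestNode"}}},
  {$sort: {"closestNode.distFromMe": 1}},
  {$limit: 1}
])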

For older versions of MongoDB, $geoNear does not accept coordinates passed in as a variable, so one option is to compute the distance ourselves. This solution calculates the distance between all pairs, which is O(n^2):


db.nodes.aggregate([
  {$match: {power: "tower"}},
  {$lookup: {
      from: "nodes",
      as: "closestNode",
      let: {
        // Degrees to radians; 0.017452778 approximates pi/180
        long: {$multiply: [{$first: "$location.coordinates"}, 0.017452778]},
        lat: {$multiply: [{$last: "$location.coordinates"}, 0.017452778]}
      },
      pipeline: [
        {$match: {power: "tower"}},
        {$set: {
            lat: {$multiply: [{$last: "$location.coordinates"}, 0.017452778]},
            long: {$multiply: [{$first: "$location.coordinates"}, 0.017452778]}
        }},
        {$set: {distFromMe: {
            $let: {
              vars: {
                dlon: {$subtract: ["$long", "$$long"]},
                dlat: {$subtract: ["$lat", "$$lat"]},
                // East-west radius at the mean latitude (WGS-84 constants)
                rlon: {$divide: [
                    {$multiply: [6378137, {$cos: {$avg: ["$lat", "$$lat"]}}]},
                    {$sqrt: {$subtract: [
                        1,
                        {$multiply: [
                            0.00669437,
                            {$pow: [{$sin: {$avg: ["$lat", "$$lat"]}}, 2]}
                        ]}
                    ]}}
                ]},
                // North-south (meridional) radius at the mean latitude
                rlat: {$divide: [
                    {$multiply: [6378137, {$subtract: [1, 0.00669437]}]},
                    {$pow: [
                        {$subtract: [
                            1,
                            {$multiply: [
                                0.00669437,
                                {$pow: [
                                    {$sin: {$avg: ["$lat", "$$lat"]}},
                                    2
                                ]}
                            ]}
                        ]},
                        1.5
                    ]}
                ]}
              },
              in: {$max: [
                  {$sqrt: [{$add: [
                      {$pow: [{$multiply: ["$$dlon", "$$rlon"]}, 2]},
                      {$pow: [{$multiply: ["$$dlat", "$$rlat"]}, 2]}
                  ]}]}
              ]}
            }
        }}},
        // Nearest first; skip the node itself (distance 0) and keep one match
        {$sort: {distFromMe: 1}},
        {$skip: 1},
        {$limit: 1},
        {$project: {distFromMe: 1}}
      ]
  }},
  {$set: {closestNode: {$first: "$closestNode"}}},
  {$sort: {"closestNode.distFromMe": 1}},
  {$limit: 1}
])

See how it works on the playground example.
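To make the stages easier to follow, here is a rough sketch of the same logic in plain JavaScript, outside MongoDB. The sample documents, the planar `pos` field and the `closestPair` helper are assumptions made up for illustration. It uses a simple Euclidean distance instead of the geodesic one, and assumes at least two nodes carry the tag.

```javascript
// Hypothetical sample: nodes with planar [x, y] positions in meters.
const nodes = [
  { _id: 0, power: 'tower', pos: [0, 0] },
  { _id: 1, power: 'tower', pos: [100, 0] },
  { _id: 2, power: 'pole', pos: [1, 0] },
  { _id: 3, power: 'tower', pos: [130, 40] },
];

// Euclidean distance, standing in for the geodesic formula of the pipeline.
const dist = (a, b) => Math.hypot(a[0] - b[0], a[1] - b[1]);

function closestPair(docs, tag) {
  // Outer $match: keep only the tagged nodes.
  const tagged = docs.filter((d) => d.power === tag);

  // $lookup: for every tagged node, rank all tagged nodes by distance.
  const withNearest = tagged.map((doc) => {
    const ranked = tagged
      .map((o) => ({ _id: o._id, distFromMe: dist(doc.pos, o.pos) }))
      .sort((a, b) => a.distFromMe - b.distFromMe);
    // Index 0 is the node itself (distance 0), so index 1 plays $skip 1 + $limit 1.
    return { _id: doc._id, closestNode: ranked[1] };
  });

  // Final $sort + $limit: the node whose nearest neighbour is closest overall.
  withNearest.sort((a, b) => a.closestNode.distFromMe - b.closestNode.distFromMe);
  return withNearest[0];
}

console.log(closestPair(nodes, 'tower'));
```

Every tagged node is compared with every other tagged node, which is where the O(n^2) cost comes from.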


The distance calculation follows this reference by @jimirwin:
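For readers who find the nested aggregation expression hard to parse, this is roughly the same formula as a plain JavaScript function. It is a sketch, not the referenced author's code: the name `approxDistance` is made up here, and it converts degrees with `Math.PI / 180` instead of the constant 0.017452778 used in the pipeline. The constants 6378137 and 0.00669437 are taken from the pipeline.

```javascript
// Hypothetical helper mirroring the aggregation expression.
const A = 6378137; // equatorial radius in meters, as used in the pipeline
const E2 = 0.00669437; // eccentricity squared, as used in the pipeline
const toRad = (deg) => (deg * Math.PI) / 180;

// Approximate distance in meters between two [long, lat] pairs given in degrees.
function approxDistance([long1, lat1], [long2, lat2]) {
  const dlon = toRad(long2 - long1);
  const dlat = toRad(lat2 - lat1);
  const meanLat = toRad((lat1 + lat2) / 2);
  const sin2 = Math.sin(meanLat) ** 2;

  // East-west radius at the mean latitude (rlon in the pipeline).
  const rlon = (A * Math.cos(meanLat)) / Math.sqrt(1 - E2 * sin2);
  // North-south radius at the mean latitude (rlat in the pipeline).
  const rlat = (A * (1 - E2)) / Math.pow(1 - E2 * sin2, 1.5);

  return Math.sqrt((dlon * rlon) ** 2 + (dlat * rlat) ** 2);
}

// One thousandth of a degree of longitude along the equator.
console.log(approxDistance([0, 0], [0.001, 0]));
```

The two radii scale the longitude and latitude differences into meters before they are combined with Pythagoras, so the approximation is meant for points that are close together.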



Thank you very much. I managed to build the query based on your solution, but using $geoNear so that it relies more on MongoDB's native functions; here it is. It won't work on the playground because the playground doesn't support 2dsphere indexes.


@Juliana Aragão, using $geoNear inside the pipeline is much better. I was not aware that newer MongoDB versions can take the coordinates as a parameter. I updated the answer according to your suggestion, so other people will be able to see it easily.

Hey @nimrod-serok. I hope this message finds you, because I can't tag your username properly. I had to extend the query to get the k closest pairs, and I had some issues with that. The expected behavior is that each pair shows up twice, once in each direction, one result right under the other, but somehow the sequence breaks in the middle. Here I wrote up what I got, along with the data to reproduce the issue. Any thoughts on that?


I don't see any problem. The code is working as expected, as you can see here, with all the requirements inside the query. If your problem is that the results do not always come in pairs, that is a valid case. Consider a line with A-10m-B--20m--C. In this case, the closest point to C will be B, but the closest point to B will be A. Is this what you mean by breaking? What are your expected results in such a case? Only the A-B pair? Should B-C be dropped?
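The A, B, C case is easy to check with a few lines of plain JavaScript. This is only an illustrative sketch: the positions (0, 10 and 30 meters along a line) and the `nearest` helper are assumptions, not part of the query.

```javascript
// Hypothetical one-dimensional positions in meters: A-10m-B--20m--C.
const points = { A: 0, B: 10, C: 30 };

// Return the name of the point closest to `name`, excluding the point itself.
function nearest(name) {
  let best = null;
  for (const other of Object.keys(points)) {
    if (other === name) continue;
    const d = Math.abs(points[other] - points[name]);
    if (best === null || d < best.d) best = { name: other, d };
  }
  return best.name;
}

for (const name of Object.keys(points)) {
  console.log(`${name} -> ${nearest(name)}`);
}
```

C points to B, while B points to A, so a "nearest neighbour" result is not guaranteed to be mirrored by a result in the opposite direction.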


My bad, the code really is running as expected. I'm running the same query in other databases as well, and they always return both nodes of each pair, because there I run a CROSS JOIN and then filter everything. Now I see that you optimized this case.

