node.js - MongoDB: aggregate search results by a unique sub-collection property, with a count?

Reposted. Author: 可可西里. Updated: 2023-11-01 09:47:32

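I have been trying to work out how to do this for hours. I have a collection called "Jobs", and each job has a sub-collection "Site", i.e. Jobs.site. This site sub-collection has a property "UNID".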

I am retrieving documents from the database based on a text search, and that part works well.

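But I want to retrieve only documents that are UNIQUE by that Job.Site.UNID, and possibly add a count as an extra property. The result should look like this:

Job: { site: { field1: 'EXAMPLE', UNID: 'SITEID', count: 5 } }

This would mean that 5 jobs in the Jobs collection have that site.UNID.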

This is what I have so far:

[
  // GETTING DOCS BASED ON TEXT SEARCH RESULTS
  {
    $match: {
      // clientId: req.user.client_id,
      $text: { $search: body.searchTerms }
    }
  },
  // SORTING THEM BASED ON TEXTSCORE
  { $sort: { score: { $meta: 'textScore' } } },
  // THE PROBLEMATIC GROUPING PART
  { $group: { site: { UPRN: '$UPRN', myCount: { $sum: 1 } } } },
  // I ONLY WANT TO GET 20 DOCS AT A TIME
  { $limit: 20 },
  // THE DATA THAT I WANT IN MY DOCUMENTS, MAYBE COUNT WOULD COME HERE?
  {
    $project: {
      site: true,
      score: { $meta: 'textScore' }
    }
  },
  // GETTING RID OF POOR MATCHES BASED ON A SCORE CALCULATED IN ANOTHER
  // FUNCTION BASED ON THE NUMBER OF WORDS IN THE TEXT SEARCH
  {
    $match: {
      score: { $gt: matchScore }
    }
  }
]

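The error I get here is: The field 'site' must be an accumulator object

So I cannot work out the correct syntax for handling that sub-collection property.

EDIT: Thanks to @Anthony, V2 works perfectly and I have tested it thoroughly, except that it does not seem to count the total number of jobs: the count is always 1, or whatever value I put in $sum, even though there are more than 200 results. I am still working on it.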

{ $match: { $text: { $search: body.searchTerms } } },
{ $sort: { $score: { $meta: 'textScore' } } },
// { $match: { score: { $gt: 0.1 } } },
{
  $group: {
    _id: '$UNID',
    counter: { $sum: 1 },
    score: { $first: { $meta: 'textScore' } },
    title: { $first: '$title' },
    postcode: { $first: '$postcode' },
    addressLine1: { $first: '$addressLine1' },
    city: { $first: '$city' },
    projectName: { $first: '$projectName' },
    jobsCount: { $sum: '$counter' }
  }
},
{ $limit: 20 },
{
  $project: {
    UNID: '$_id',
    title: '$title',
    postcode: '$postcode',
    addressLine1: '$addressLine1',
    projectName: '$projectName',
    city: '$city',
    score: 1,
    jobsCount: true
  }
}

Sample data:


{
"_id": "randomString0",
"title": "Quality",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString1",
"title": "Some2123",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString2",
"title": "Random title",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString3",
"title": "Another unique job",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString4",
"title": "Other thing",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString5",
"title": "Something else",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
}

As you can see, the site data is identical across all of these documents, and the counter should count how many documents share the same site UNID.
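To make the expected result concrete, here is a minimal in-memory sketch of the grouping described above. It does not touch MongoDB at all; the `jobs` array is a hypothetical stand-in for the sample documents, and `countJobsBySite` is an illustrative helper name, not something from the original post.

```javascript
// Hypothetical in-memory stand-in for the sample documents above:
// six jobs that all embed the same site (UNID '001').
const titles = ['Quality', 'Some2123', 'Random title', 'Another unique job', 'Other thing', 'Something else'];
const jobs = titles.map((title, i) => ({
  _id: `randomString${i}`,
  title,
  site: {
    _id: 'rKFRbvH8CEbJYdzDs',
    title: 'Title 1',
    addressLine1: 'address1',
    UNID: '001',
    city: 'cityName',
    createdAt: null,
  },
}));

// Collapse jobs into one entry per site.UNID and count the jobs behind each entry.
function countJobsBySite(docs) {
  const bySite = new Map();
  for (const doc of docs) {
    const entry = bySite.get(doc.site.UNID) || { site: doc.site, jobsCount: 0 };
    entry.jobsCount += 1;
    bySite.set(doc.site.UNID, entry);
  }
  return Array.from(bySite.values());
}

const grouped = countJobsBySite(jobs);
console.log(grouped.length, grouped[0].jobsCount); // prints: 1 6
```

With the sample data this yields a single entry for site '001' whose count equals the number of jobs, which is the shape the aggregation is meant to return.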

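Best answer

In the $group stage, the _id expression (what you want to group by) is required. Every other field must be computed with one of the few accumulators that the $group aggregation stage supports.

So your aggregation should look like this: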
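As a small illustration of that rule, the sketch below contrasts a well-formed $group stage with the one from the question. It assumes the group key is `site.UNID`, as in the sample data; `looksLikeGroupStage` is a hypothetical client-side helper and only a rough heuristic, not MongoDB's own validation.

```javascript
// Valid shape: _id is the group key, every other field is { <accumulator>: <expression> }.
// Assumption: the key lives at site.UNID, as in the sample data.
const groupBySite = {
  $group: {
    _id: '$site.UNID',
    jobsCount: { $sum: 1 },
    site: { $first: '$site' },
  },
};

// Hypothetical sanity check: _id must be present, and each remaining field
// must be an object with exactly one key that starts with '$'.
function looksLikeGroupStage(stage) {
  const spec = stage && stage.$group;
  if (!spec || !('_id' in spec)) return false;
  return Object.entries(spec)
    .filter(([name]) => name !== '_id')
    .every(([, value]) => {
      if (value === null || typeof value !== 'object' || Array.isArray(value)) return false;
      const keys = Object.keys(value);
      return keys.length === 1 && keys[0].startsWith('$');
    });
}

// The stage from the question: no _id, and 'site' is a plain nested object.
const questionStage = { $group: { site: { UPRN: '$UPRN', myCount: { $sum: 1 } } } };

console.log(looksLikeGroupStage(groupBySite));   // true
console.log(looksLikeGroupStage(questionStage)); // false
```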

[
  { "$match": { "$text": { "$search": body.searchTerms }}},
  { "$sort": { "score": { "$meta": "textScore" } } },
  { "$match": { "score": { "$gt": matchScore }}},
  { "$group": {
    "_id": "$UPRN",
    "myCount": { "$sum": 1 },
    "score": { "$first": "$score" }
  }},
  { "$limit": 20 },
  { "$project": {
    "site": "$_id",
    "score": 1,
    "myCount": 1
  }}
]
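Adapted to the sample data from the question, the same idea could be sketched as a small pipeline builder. This is a variation under stated assumptions, not the answerer's exact code: it groups on `site.UNID` instead of `UPRN`, it copies the text score into a real `score` field with `$addFields` (MongoDB 3.4+) so that the later `$match` has something to compare, and it sorts after `$group` because `$group` does not guarantee output order. The function name `buildSiteSearchPipeline` and the `jobsCount` field are illustrative.

```javascript
// Builds the aggregation pipeline as a plain array so it can be inspected
// or passed to collection.aggregate().
// Assumptions: a text index exists on the collection, and the group key is
// site.UNID as in the sample data.
function buildSiteSearchPipeline(searchTerms, matchScore, limit = 20) {
  return [
    // Text search must be the first stage of the pipeline.
    { $match: { $text: { $search: searchTerms } } },
    // Store the text score as a regular field so later stages can read it.
    { $addFields: { score: { $meta: 'textScore' } } },
    // Drop poor matches before grouping.
    { $match: { score: { $gt: matchScore } } },
    // One output document per site, with the number of matching jobs.
    {
      $group: {
        _id: '$site.UNID',
        jobsCount: { $sum: 1 },
        score: { $max: '$score' },
        site: { $first: '$site' },
      },
    },
    // Best-scoring sites first, then cap the page size.
    { $sort: { score: -1 } },
    { $limit: limit },
    { $project: { _id: 0, site: 1, score: 1, jobsCount: 1 } },
  ];
}

const pipeline = buildSiteSearchPipeline('quality', 0.5);
console.log(pipeline.length); // prints: 7
```

With the official Node.js driver this would typically be run as `await db.collection('jobs').aggregate(buildSiteSearchPipeline(body.searchTerms, matchScore)).toArray()`, where `db` is an already connected database handle and the collection name 'jobs' is an assumption. Note that `jobsCount` counts only the jobs that matched the text search and passed the score filter, not every job attached to the site.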

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/55636225/
