
Counting nested objects with MongoDB aggregation

Reposted. Author: IT老高. Updated: 2023-10-28 13:24:27

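I have a highly nested set of MongoDB objects, and I want to count the number of subdocuments that match a given criterion. EDIT: (within each document). For example: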

{"_id":{"chr":"20","pos":"14371","ref":"A","alt":"G"},
"studies":[
{
"study_id":"Study1",
"samples":[
{
"sample_id":"NA00001",
"formatdata":[
{"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]}
]
},
{
"sample_id":"NA00002",
"formatdata":[
{"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]}
]
}
]
}
]
}
{"_id":{"chr":"20","pos":"14372","ref":"T","alt":"AA"},
"studies":[
{
"study_id":"Study3",
"samples":[
{
"sample_id":"SAMPLE1",
"formatdata":[
{"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]}
]
},
{
"sample_id":"SAMPLE2",
"formatdata":[
{"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]}
]
}
]
}
]
}
{"_id":{"chr":"20","pos":"14373","ref":"C","alt":"A"},
"studies":[
{
"study_id":"Study3",
"samples":[
{
"sample_id":"SAMPLE3",
"formatdata":[
{"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]}
]
},
{
"sample_id":"SAMPLE7",
"formatdata":[
{"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]}
]
}
]
}
]
}

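I want to know how many subdocuments contain GT:"1|0", which in this case is 1 in the first document, 2 in the second, and 0 in the third. I have tried the unwind and aggregate functions, but I am clearly not doing something right. When I try to count subdocuments by the "GT" field, mongo complains about this: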

db.collection.aggregate([{$group: {"$studies.samples.formatdata.GT":1,_id:0}}])

because my group names cannot contain ".", but if I leave the path prefix out:

db.collection.aggregate([{$group: {"$GT":1,_id:0}}])

it complains because "$GT cannot be an operator name".

Any ideas?

Best answer

You need to process arrays with $unwind, and because the data is nested three arrays deep you need to do this three times:

 db.collection.aggregate([

    // Unwind each nested array so its elements can be filtered and grouped
    { "$unwind": "$studies" },
    { "$unwind": "$studies.samples" },
    { "$unwind": "$studies.samples.formatdata" },

    // Group the results to obtain the count per GT value
    { "$group": {
        "_id": "$studies.samples.formatdata.GT",
        "count": { "$sum": 1 }
    }}
 ])
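To make the effect of the three $unwind stages concrete, here is a minimal plain-JavaScript sketch that emulates them in memory, with no MongoDB involved. It assumes the array field is named formatdata, as in the sample documents, and it uses trimmed copies of those documents whose _id is shortened to the pos value. The names docs, rows and countsByGT are illustrative only.

```javascript
// Plain-JavaScript emulation of the three $unwind stages plus $group.
// The documents are trimmed copies of the sample data in the question;
// _id is shortened to the "pos" value for readability (an assumption).
const docs = [
  { _id: "14371", studies: [{ samples: [
    { formatdata: [{ GT: "1|0" }] },
    { formatdata: [{ GT: "0|0" }] } ] }] },
  { _id: "14372", studies: [{ samples: [
    { formatdata: [{ GT: "1|0" }] },
    { formatdata: [{ GT: "1|0" }] } ] }] },
  { _id: "14373", studies: [{ samples: [
    { formatdata: [{ GT: "0|0" }] },
    { formatdata: [{ GT: "0|0" }] } ] }] },
];

// Each nesting level mirrors one $unwind: one output row per array element.
const rows = docs.flatMap(doc =>
  doc.studies.flatMap(study =>
    study.samples.flatMap(sample =>
      sample.formatdata.map(fd => ({ _id: doc._id, GT: fd.GT })))));

// Mirrors { $group: { _id: "$studies.samples.formatdata.GT", count: { $sum: 1 } } }.
const countsByGT = {};
for (const row of rows) {
  countsByGT[row.GT] = (countsByGT[row.GT] || 0) + 1;
}

console.log(countsByGT); // { '1|0': 3, '0|0': 3 }
```

Because the grouping key is only the GT value, the counts are totals across the whole collection rather than per document.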

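Ideally you want to filter your input. You can do this with $match both before and after the $unwind stages, using a $regex to match data that begins with "1". The first $match discards whole documents in which no array member matches; the second discards the individual unwound elements that do not match.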

 db.collection.aggregate([

    // Match first to exclude documents where this is not present in any array member
    { "$match": { "studies.samples.formatdata.GT": /^1/ } },

    // Unwind each nested array so its elements can be filtered
    { "$unwind": "$studies" },
    { "$unwind": "$studies.samples" },
    { "$unwind": "$studies.samples.formatdata" },

    // Match again to keep only the unwound elements that qualify
    { "$match": { "studies.samples.formatdata.GT": /^1/ } },

    // Group the results to obtain the matched count per document and GT value
    { "$group": {
        "_id": {
            "_id": "$_id",
            "key": "$studies.samples.formatdata.GT"
        },
        "count": { "$sum": 1 }
    }}
 ])
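As a rough illustration of what this filtered pipeline yields on the sample data, the sketch below emulates $match, the three $unwind stages, the second $match and $group in plain JavaScript. It is not MongoDB code: the documents are trimmed copies of the question's data with _id shortened to the pos value, and startsWithOne and perDocCounts are names made up for this example.

```javascript
// Plain-JavaScript emulation of $match -> $unwind x3 -> $match -> $group.
// Trimmed sample data; _id shortened to the "pos" value (an assumption).
const docs = [
  { _id: "14371", studies: [{ samples: [
    { formatdata: [{ GT: "1|0" }] },
    { formatdata: [{ GT: "0|0" }] } ] }] },
  { _id: "14372", studies: [{ samples: [
    { formatdata: [{ GT: "1|0" }] },
    { formatdata: [{ GT: "1|0" }] } ] }] },
  { _id: "14373", studies: [{ samples: [
    { formatdata: [{ GT: "0|0" }] },
    { formatdata: [{ GT: "0|0" }] } ] }] },
];
const startsWithOne = /^1/;

// First $match: keep documents where at least one nested GT matches.
const matchedDocs = docs.filter(doc =>
  doc.studies.some(study =>
    study.samples.some(sample =>
      sample.formatdata.some(fd => startsWithOne.test(fd.GT)))));

// Three $unwind stages, then the second $match on each unwound row.
const rows = matchedDocs
  .flatMap(doc =>
    doc.studies.flatMap(study =>
      study.samples.flatMap(sample =>
        sample.formatdata.map(fd => ({ _id: doc._id, GT: fd.GT })))))
  .filter(row => startsWithOne.test(row.GT));

// $group on the compound key { _id, key }, summing 1 per row.
const counts = new Map();
for (const row of rows) {
  const groupKey = JSON.stringify({ _id: row._id, key: row.GT });
  counts.set(groupKey, (counts.get(groupKey) || 0) + 1);
}
const perDocCounts = [...counts].map(
  ([groupKey, count]) => ({ _id: JSON.parse(groupKey), count }));

console.log(perDocCounts);
```

On this data the sketch produces a count of 1 for document 14371 and 2 for document 14372. Document 14373 has no matching element, so it never reaches the grouping step and yields no row at all rather than a count of 0.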

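Note that in all cases, entries prefixed with the dollar sign "$" are "variables" that refer to properties of the document. They are used as "values", on the right-hand side. The "keys" on the left-hand side must be specified as plain string keys. No variable can be used to name a key.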
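The key-versus-value rule can be seen by placing the stage from the question next to a corrected one. The sketch below only builds the stage objects and checks their key names in plain JavaScript; hasPlainGroupKeys is a hypothetical helper written for this illustration, not a MongoDB API, and it checks nothing beyond the two naming rules quoted in the error messages above.

```javascript
// The asker's stage: the field path sits on the left as an output key,
// which $group rejects (a key may not contain "." or start with "$").
const rejectedStage = {
  $group: { "$studies.samples.formatdata.GT": 1, _id: 0 }
};

// Corrected stage: plain keys on the left, "$"-prefixed field path on the right.
const acceptedStage = {
  $group: { _id: "$studies.samples.formatdata.GT", count: { $sum: 1 } }
};

// Hypothetical helper (not part of MongoDB): tests only the two key rules.
function hasPlainGroupKeys(stage) {
  return Object.keys(stage.$group)
    .every(key => !key.startsWith("$") && !key.includes("."));
}

console.log(hasPlainGroupKeys(rejectedStage)); // false
console.log(hasPlainGroupKeys(acceptedStage)); // true
```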

For more on counting nested objects with MongoDB aggregation, see the similar question on Stack Overflow: https://stackoverflow.com/questions/27914953/
