
arrays - Hierarchically flatten a collection of MongoDB documents with arrays into documents

Reposted. Author: 可可西里. Updated: 2023-11-01 10:48:29

Block model (block 0 -> block 1 -> block 2 -> block 3 -> […]):

https://yuml.me/cb86229d.png

Sample input document [one of 700+ in the modulestore.structures collection]:

{
  _id: ObjectId('5932d50ff8f46c0a8098ab79'),
  blocks: [
    {
      definition: ObjectId('5923556ef8f46c0a787e9c0f'),
      block_type: 'chapter',
      block_id: '5b053a7f10ba41df85a3221c3ef3956e',
      fields: {
        format: 'Foo exam',
        children: [
          [
            'sequential',
            '9f1e58553ad448818ec8e7915d3d94d3'
          ],
          [
            'sequential',
            'f052c7aa44274769a4631e95405834e0'
          ]
        ]
      }
    },
    {
      definition: ObjectId('59235569f8f46c0a7be1debc'),
      block_type: 'sequential',
      block_id: '9f1e58553ad448818ec8e7915d3d94d3',
      fields: {
        display_name: 'FooBar'
      }
    },
    {
      definition: ObjectId('59317406f8f46c0a8098aaf5'),
      block_type: 'sequential',
      block_id: 'f052c7aa44274769a4631e95405834e0',
      fields: {
        display_name: 'CanHaz'
      }
    }
  ]
}

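My goals are:

  1. Flatten the blocks so that every block sits at the collection level;
  2. Cursor through the children arrays;
  3. Traverse and modify the "tree" so that every child/grandchild/great-grandchild/*-child gets a new property, top_ancestor_fields, containing the fields property of its topmost ancestor.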
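
To make the three goals concrete, here is a minimal in-memory sketch in plain JavaScript. It is not the aggregation pipeline itself, only an illustration of the intended result. The helper name flattenWithTopAncestor and the shortened block IDs are made up for this example, and the sketch assumes each block_id appears once and has at most one parent.

```javascript
// Hypothetical helper (not from the original post): illustrates goals 1-3
// on in-memory data. Assumes unique block_ids and a single parent per block.
function flattenWithTopAncestor(structures) {
  // Goal 1: pull every block out of every structure into one flat list.
  const blocks = structures.flatMap((s) => s.blocks);
  const byId = new Map(blocks.map((b) => [b.block_id, b]));

  // Goal 2: walk each children array of [block_type, block_id] pairs
  // and remember which block is the parent of each child.
  const parentOf = new Map();
  for (const block of blocks) {
    for (const [, childId] of block.fields.children || []) {
      parentOf.set(childId, block.block_id);
    }
  }

  // Goal 3: climb from each block to its topmost ancestor and copy that
  // ancestor's fields (without children) into top_ancestor_fields.
  return blocks.map((block) => {
    let topId = block.block_id;
    const seen = new Set([topId]); // guards against accidental cycles
    while (parentOf.has(topId) && !seen.has(parentOf.get(topId))) {
      topId = parentOf.get(topId);
      seen.add(topId);
    }
    const { children, ...topFields } = byId.get(topId).fields;
    return { ...block, top_ancestor_fields: topFields };
  });
}

// Illustrative data with abbreviated IDs (c1, s1, s2 are placeholders).
const sample = [{
  blocks: [
    { block_type: 'chapter', block_id: 'c1',
      fields: { format: 'Foo exam', children: [['sequential', 's1'], ['sequential', 's2']] } },
    { block_type: 'sequential', block_id: 's1', fields: { display_name: 'FooBar' } },
    { block_type: 'sequential', block_id: 's2', fields: { display_name: 'CanHaz' } },
  ],
}];

const flat = flattenWithTopAncestor(sample);
console.log(flat.map((b) => [b.block_id, b.top_ancestor_fields]));
```

A root block is treated as its own top ancestor, which matches the sample output where the chapter carries its own format.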

Sample output:

[
  {
    _id: ObjectId('5a00f611f995363c2b63c9a6'),
    block_type: 'chapter',
    block_id: '5b053a7f10ba41df85a3221c3ef3956e',
    fields: {
      format: 'Foo exam',
      children: [
        [
          'sequential',
          '9f1e58553ad448818ec8e7915d3d94d3'
        ],
        [
          'sequential',
          'f052c7aa44274769a4631e95405834e0'
        ]
      ]
    },
    top_ancestor_fields: {
      format: 'Foo exam'
    }
  },
  {
    _id: ObjectId('5a00f611f995363c2b63c9a7'),
    block_id: '9f1e58553ad448818ec8e7915d3d94d3',
    block_type: 'sequential',
    fields: {
      display_name: 'FooBar'
    },
    top_ancestor_fields: {
      format: 'Foo exam'
    }
  },
  {
    _id: ObjectId('5a00f611f995363c2b63c9a8'),
    block_id: 'f052c7aa44274769a4631e95405834e0',
    block_type: 'sequential',
    fields: {
      display_name: 'CanHaz'
    },
    top_ancestor_fields: {
      format: 'Foo exam'
    }
  }
]

Following the suggestion from @neil-lunn, this almost works:

db.modulestore.structures.aggregate([
  { $unwind: '$blocks' },
  { $project: {
    _id: 0,
    block_id: '$blocks.block_id',
    children: '$blocks.fields.children',
    display_name: '$blocks.fields.display_name',
    block_type: '$blocks.block_type',
    exam: '$blocks.fields.format',
    fields: '$blocks.fields'
  } },
  { $out: 'modulestore.mapped0' }
])

db.modulestore.mapped0.aggregate([
  { $graphLookup: {
    from: 'modulestore.mapped0',
    startWith: '$block_id',
    connectToField: 'children',
    connectFromField: 'block_id',
    as: 'block_ids',
    maxDepth: 0
  } },
  { $unwind: '$block_ids' },
  { $project: {
    name: 1,
    _id: 0,
    ancestor: '$block_ids.block_id'
  } },
  { $out: 'modulestore.mapped1' }
]);

But this just hangs. I have tried configuring the maxDepth option of $graphLookup. FYI: db.modulestore.mapped0.count() returns 80772 for me.

Each document may contain a children array with up to 180 elements.

I am not sure how to approach this larger pipeline to map out the children hierarchy...

Best answer

The following should get you started:

db.modulestore.structures.aggregate([{
  $unwind: '$blocks' // flatten "blocks" array
}, {
  $replaceRoot: { // move "blocks" field to top level
    newRoot: "$blocks"
  }
}, {
  $unwind: { // flatten "fields.children" array
    path: "$fields.children",
    preserveNullAndEmptyArrays: true
  }
}, {
  // this step is technically not needed but it might speed things up - try running with and without it
  // we only keep the second (last, really) entry of each pair since it is the only valid join key for the graphLookup
  $addFields: {
    "fields.children": {
      $slice: [ "$fields.children", -1 ]
    }
  }
}, {
  $unwind: { // flatten "fields.children" array one more time because it was nested before
    path: "$fields.children",
    preserveNullAndEmptyArrays: true
  }
}, {
  $group: { // reduce the number of lookups required later by eliminating duplicate parent-child paths
    "_id": "$block_id",
    "block_type": { $first: "$block_type" },
    "definition": { $first: "$definition" },
    "fieldsFormat": { $first: "$fields.format" },
    "fieldsChildren": { $addToSet: "$fields.children" }
  }
}, {
  $project: { // restore original structure
    "block_id": "$_id",
    "block_type": "$block_type",
    "definition": "$definition",
    "fields": {
      "format": "$fieldsFormat",
      "children": "$fieldsChildren"
    }
  }
}, { // write the result into the "modulestore.mapped0" collection, overwriting all existing content
  $out: 'modulestore.mapped0'
}])
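
As a rough illustration of what the $unwind / $slice / $unwind / $group / $project steps above do to a single block, the same reshaping can be sketched in plain JavaScript. The helper name toJoinShape and the abbreviated IDs are hypothetical. The sketch also simplifies two things: it handles one block at a time instead of grouping across documents, and it gives childless blocks an empty children list, which may differ from what the real pipeline writes.

```javascript
// Hypothetical helper: approximates the pipeline's effect on one block.
// It turns [block_type, block_id] pairs into a flat, de-duplicated list of
// child IDs, since only the ID part is usable as a join key.
function toJoinShape(block) {
  const pairs = (block.fields && block.fields.children) || [];
  // Keep the last entry of each pair ($slice: -1) and drop duplicates ($addToSet).
  const childIds = [...new Set(pairs.map((pair) => pair[pair.length - 1]))];
  return {
    _id: block.block_id, // the $group stage keys documents by block_id
    block_id: block.block_id,
    block_type: block.block_type,
    definition: block.definition,
    fields: { format: block.fields.format, children: childIds },
  };
}

// Illustrative input with placeholder IDs.
const shaped = toJoinShape({
  block_type: 'chapter',
  block_id: 'c1',
  fields: {
    format: 'Foo exam',
    children: [['sequential', 's1'], ['sequential', 's2'], ['sequential', 's1']],
  },
});

console.log(shaped.fields.children);
```

With children stored as plain IDs, a block_id can be compared directly against the elements of fields.children in the next step.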

Then:

db.modulestore.mapped0.aggregate([{
  $graphLookup: {
    from: 'modulestore.mapped0',
    startWith: '$block_id',
    connectToField: 'fields.children',
    connectFromField: 'block_id',
    as: 'block_ids',
    maxDepth: 0
  }
}, {
  $lookup: {
    from: 'modulestore.mapped0',
    localField: 'block_ids.fields.children',
    foreignField: '_id',
    as: 'block_ids.fields.children'
  }
}])
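
To show the direction in which the $graphLookup above walks, and what maxDepth: 0 limits it to, here is a small in-memory stand-in written in plain JavaScript. The function lookupAncestors and the sample IDs are hypothetical, and this is only an approximation of the stage's matching behavior, not MongoDB's implementation. It assumes documents shaped like those in modulestore.mapped0, where fields.children is a flat list of block IDs.

```javascript
// Hypothetical in-memory stand-in for the $graphLookup stage above.
// startWith: the block's own block_id; connectToField: 'fields.children';
// connectFromField: 'block_id'. Each round therefore moves one level UP.
function lookupAncestors(docs, startId, maxDepth = Infinity) {
  const found = [];
  const seen = new Set();
  let frontier = [startId];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth += 1) {
    const next = [];
    for (const doc of docs) {
      // A document matches when one of its children is in the current frontier.
      const isParent = (doc.fields.children || []).some((id) => frontier.includes(id));
      if (isParent && !seen.has(doc.block_id)) {
        seen.add(doc.block_id);
        found.push({ ...doc, depth }); // similar in spirit to depthField
        next.push(doc.block_id);       // continue the walk from the parent's block_id
      }
    }
    frontier = next;
  }
  return found;
}

// Illustrative three-level hierarchy with placeholder IDs.
const docs = [
  { block_id: 'course', fields: { format: 'Foo exam', children: ['chapter'] } },
  { block_id: 'chapter', fields: { children: ['seq'] } },
  { block_id: 'seq', fields: { children: [] } },
];

const directOnly = lookupAncestors(docs, 'seq', 0).map((d) => d.block_id);
const allLevels = lookupAncestors(docs, 'seq').map((d) => d.block_id);
console.log(directOnly); // [ 'chapter' ]
console.log(allLevels);  // [ 'chapter', 'course' ]
```

In this sketch, a depth limit of 0 returns only the direct parent, so reaching the topmost ancestor would need a larger limit or none at all. The entry with the greatest depth is then the top ancestor.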

Regarding "arrays - Hierarchically flatten a collection of MongoDB documents with arrays into documents", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47107733/
