arrays - Hierarchically flattening a MongoDB collection of documents with arrays into documents

Reposted by: 可可西里 · Updated: 2023-11-01 10:48:29

Block model (block 0 -> block 1 -> block 2 -> block 3 -> […]):

Sample input document [one of 700+ in the modulestore.structures collection]:

{
  _id: ObjectId('5932d50ff8f46c0a8098ab79'),
  blocks: [
    {
      definition: ObjectId('5923556ef8f46c0a787e9c0f'),
      block_type: 'chapter',
      block_id: '5b053a7f10ba41df85a3221c3ef3956e',
      fields: {
        format: 'Foo exam',
        children: [ 
          [ 
            'sequential', 
            '9f1e58553ad448818ec8e7915d3d94d3'
          ], 
          [ 
            'sequential', 
            'f052c7aa44274769a4631e95405834e0'
          ]
        ]
      }
    },
    {
      definition: ObjectId('59235569f8f46c0a7be1debc'),
      block_type: 'sequential',
      block_id: '9f1e58553ad448818ec8e7915d3d94d3',
      fields: {
        display_name: 'FooBar'
      }
    },
    {
      definition: ObjectId('59317406f8f46c0a8098aaf5'),
      block_type: 'sequential',
      block_id: 'f052c7aa44274769a4631e95405834e0',
      fields: {
        display_name: 'CanHaz'
      }
    }
  ]
}

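My goal is to:

flatten the blocks so that all of them sit at the collection level;
iterate over the children array with a cursor;
traverse and modify the "tree" so that every child/grandchild/great-grandchild/*-child gets a new attribute, top_ancestor_fields, containing the fields attribute of its topmost ancestor.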
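The three goals above can be sketched in plain JavaScript, outside MongoDB, to make the intended result concrete. This is only an illustration under assumptions: flattenWithTopAncestor is a hypothetical helper name, each block is assumed to have at most one parent, and the children list is left out of top_ancestor_fields because the sample output omits it.

```javascript
// Hypothetical helper (not a MongoDB API): flattens one structure document
// and attaches top_ancestor_fields to every block.
function flattenWithTopAncestor(structure) {
  const byId = new Map(structure.blocks.map((b) => [b.block_id, b]));

  // Map each child block_id to its parent's block_id.
  // Every entry of fields.children is a [block_type, block_id] pair.
  const parentOf = new Map();
  for (const block of structure.blocks) {
    for (const [, childId] of (block.fields || {}).children || []) {
      parentOf.set(childId, block.block_id); // assumption: one parent per block
    }
  }

  return structure.blocks.map((block) => {
    // Walk upwards until a block without a parent is reached.
    let topId = block.block_id;
    const seen = new Set([topId]); // guards against accidental cycles
    while (parentOf.has(topId) && !seen.has(parentOf.get(topId))) {
      topId = parentOf.get(topId);
      seen.add(topId);
    }
    // Copy the top ancestor's fields without its children list.
    const { children, ...topFields } = byId.get(topId).fields || {};
    return {
      block_id: block.block_id,
      block_type: block.block_type,
      fields: block.fields,
      top_ancestor_fields: topFields,
    };
  });
}

// Sample data taken from the input document in the question.
const sample = {
  blocks: [
    {
      block_type: 'chapter',
      block_id: '5b053a7f10ba41df85a3221c3ef3956e',
      fields: {
        format: 'Foo exam',
        children: [
          ['sequential', '9f1e58553ad448818ec8e7915d3d94d3'],
          ['sequential', 'f052c7aa44274769a4631e95405834e0'],
        ],
      },
    },
    {
      block_type: 'sequential',
      block_id: '9f1e58553ad448818ec8e7915d3d94d3',
      fields: { display_name: 'FooBar' },
    },
    {
      block_type: 'sequential',
      block_id: 'f052c7aa44274769a4631e95405834e0',
      fields: { display_name: 'CanHaz' },
    },
  ],
};

const flat = flattenWithTopAncestor(sample);
console.log(JSON.stringify(flat, null, 2));
```

With 700+ structure documents the same function would be called once per document; it does not generate new _id values, which MongoDB would assign on insert.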

Sample output:

[
  {
    _id: ObjectId('5a00f611f995363c2b63c9a6'),
    block_type: 'chapter',
    block_id: '5b053a7f10ba41df85a3221c3ef3956e',
    fields: {
      format: 'Foo exam',
      children: [ 
        [ 
          'sequential',
          '9f1e58553ad448818ec8e7915d3d94d3'
        ], 
        [
          'sequential',
          'f052c7aa44274769a4631e95405834e0'
        ]
      ]
    },
    top_ancestor_fields: {
      format: 'Foo exam'
    }
  },
  {
     _id: ObjectId('5a00f611f995363c2b63c9a7'),
     block_id: '9f1e58553ad448818ec8e7915d3d94d3',
     block_type: 'sequential',
     fields: {
       display_name: 'FooBar'
     },
     top_ancestor_fields: {
       format: 'Foo exam'
     }
  },
  {
     _id: ObjectId('5a00f611f995363c2b63c9a8'),
     block_id: 'f052c7aa44274769a4631e95405834e0',
     block_type: 'sequential',
     fields: {
       display_name: 'CanHaz'
     },
     top_ancestor_fields: {
       format: 'Foo exam'
     }
  },
]

Following @neil-lunn's suggestion, this almost works:

db.modulestore.structures.aggregate([
  { $unwind: '$blocks' },
  { $project: { _id: 0,
                block_id: '$blocks.block_id',
                children: '$blocks.fields.children',
                display_name: '$blocks.fields.display_name',
                block_type: '$blocks.block_type',
                exam: '$blocks.fields.format',
                fields: '$blocks.fields'
               }},
  { $out: 'modulestore.mapped0' }
])

db.modulestore.mapped0.aggregate([
    { $graphLookup: {
        from: 'modulestore.mapped0',
        startWith: '$block_id',
        connectToField: 'children',
        connectFromField: 'block_id',
        as: 'block_ids',
        maxDepth: 0
    } },
    { $unwind: '$block_ids' },
    { $project: {
        name: 1,
        _id: 0,
        ancestor: '$block_ids.block_id'
    } },
    { $out: 'modulestore.mapped1' }
]);

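But this just hangs. I have tried configuring the maxDepth option of $graphLookup. FYI: db.modulestore.mapped0.count() returns 80772 for me.

Each document may contain a children array with up to 180 elements.

I'm not sure how to approach this larger pipeline for mapping the children hierarchy...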

Best answer

The following should get you started:

db.modulestore.structures.aggregate([{
    $unwind: '$blocks' // flatten "blocks" array
}, {
    $replaceRoot: { // move "blocks" field to top level
        newRoot: "$blocks"
    }
}, {
    $unwind: { // flatten "fields.children" array
        path: "$fields.children",
        preserveNullAndEmptyArrays: true
    }
}, {
    // this step is technically not needed but it might speed up things - try running with and without that
    $addFields: { // we only keep the second (last, really) entry of all your arrays since this is the only valid join key for the graphLookup
        "fields.children": {
            $slice: [ "$fields.children", -1 ]
        }
    }
}, {
    $unwind: { // flatten "fields.children" array one more time because it was nested before
        path: "$fields.children",
        preserveNullAndEmptyArrays: true
    }
}, {
    $group: { // reduce the number of lookups required later by eliminating duplicate parent-child paths
        "_id": "$block_id",
        "block_type": { $first: "$block_type" },
        "definition": { $first: "$definition" },
        "fieldsFormat": { $first: "$fields.format" },
        "fieldsChildren": { $addToSet: "$fields.children" }
    }
}, {
    $project: { // restore original structure
        "block_id": "$_id",
        "block_type": "$block_type",
        "definition": "$definition",
        "fields": {
            "format": "$fieldsFormat",
            "children": "$fieldsChildren"
        }
    }
}, { // spit out the result into "modulestore.mapped0" collection, overwriting all existing content
    $out: 'modulestore.mapped0'
}])
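To see what shape this pipeline is aiming for, the same reshaping can be approximated on an in-memory array in plain JavaScript. This is a rough sketch, not the pipeline itself: reshapeBlocks is a hypothetical name, and childless blocks get an empty children array here, which may differ from what the aggregation stages actually emit for them.

```javascript
// Hypothetical in-memory approximation of the reshaping pipeline above
// (not a MongoDB API). Stage names in the comments point at the stage
// each step loosely corresponds to.
function reshapeBlocks(structures) {
  const grouped = new Map();
  for (const structure of structures) {
    for (const block of structure.blocks) { // $unwind + $replaceRoot
      const fields = block.fields || {};
      if (!grouped.has(block.block_id)) { // $group with $first
        grouped.set(block.block_id, {
          _id: block.block_id,
          block_id: block.block_id,
          block_type: block.block_type,
          definition: block.definition,
          format: fields.format,
          childIds: new Set(),
        });
      }
      const childIds = grouped.get(block.block_id).childIds;
      for (const pair of fields.children || []) {
        // Keep only the last entry of each [block_type, block_id] pair
        // ($slice with -1), de-duplicated like $addToSet.
        childIds.add(pair[pair.length - 1]);
      }
    }
  }
  // $project: restore the nested "fields" structure.
  return [...grouped.values()].map((doc) => ({
    _id: doc._id,
    block_id: doc.block_id,
    block_type: doc.block_type,
    definition: doc.definition,
    fields: { format: doc.format, children: [...doc.childIds] },
  }));
}

// The same chapter appears in two structure documents, as it might across
// the 700+ documents in the question; it should be emitted only once.
const chapter = {
  block_type: 'chapter',
  block_id: '5b053a7f10ba41df85a3221c3ef3956e',
  fields: {
    format: 'Foo exam',
    children: [
      ['sequential', '9f1e58553ad448818ec8e7915d3d94d3'],
      ['sequential', 'f052c7aa44274769a4631e95405834e0'],
    ],
  },
};
const sequential = {
  block_type: 'sequential',
  block_id: '9f1e58553ad448818ec8e7915d3d94d3',
  fields: { display_name: 'FooBar' },
};

const mapped = reshapeBlocks([{ blocks: [chapter, sequential] }, { blocks: [chapter] }]);
console.log(JSON.stringify(mapped, null, 2));
```

As in the pipeline, only fields.format and the child ids survive the grouping step; other fields such as display_name would need their own accumulator if they are required later.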

Then:

db.modulestore.mapped0.aggregate([{
    $graphLookup: {
        from: 'modulestore.mapped0',
        startWith: '$block_id',
        connectToField: 'fields.children',
        connectFromField: 'block_id',
        as: 'block_ids',
        maxDepth: 0
    }
}, { 
    $lookup: { 
        from: 'modulestore.mapped0', 
        localField: 'block_ids.fields.children', 
        foreignField: '_id', 
        as: 'block_ids.fields.children' 
    } 
}])
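Note that maxDepth: 0 makes $graphLookup behave like a single, non-recursive lookup, so the stage above only finds direct parents. The effect of that option can be illustrated with a small in-memory simulation; lookupAncestors and the three-level sample ids below are made up for this sketch and are not part of MongoDB or of the original data.

```javascript
// Hypothetical simulation of the $graphLookup stage above, run against an
// in-memory array shaped like modulestore.mapped0.
function lookupAncestors(docs, startId, maxDepth = Infinity) {
  const found = [];
  const visited = new Set();
  let frontier = [startId]; // startWith: '$block_id'
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of docs) {
      const children = (doc.fields || {}).children || [];
      // connectToField: 'fields.children' matches every document that
      // lists one of the frontier ids among its children.
      if (!visited.has(doc.block_id) && frontier.some((id) => children.includes(id))) {
        visited.add(doc.block_id);
        found.push({ block_id: doc.block_id, depth });
        next.push(doc.block_id); // connectFromField: 'block_id'
      }
    }
    frontier = next;
  }
  return found;
}

// Made-up three-level hierarchy: course-root -> chapter-1 -> seq-1.
const docs = [
  { block_id: 'course-root', fields: { format: 'Foo exam', children: ['chapter-1'] } },
  { block_id: 'chapter-1', fields: { children: ['seq-1'] } },
  { block_id: 'seq-1', fields: { children: [] } },
];

const directOnly = lookupAncestors(docs, 'seq-1', 0); // like maxDepth: 0
const allAncestors = lookupAncestors(docs, 'seq-1'); // like omitting maxDepth
console.log(directOnly, allAncestors);
```

In this simulation the topmost ancestor is the entry with the greatest depth, which suggests that reaching top_ancestor_fields for deeper hierarchies would require dropping or raising maxDepth in the real pipeline.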

Regarding "arrays - Hierarchically flattening a MongoDB collection of documents with arrays into documents", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47107733/
