mongodb - Slow query behaviour when using $exists on an indexed field

Reposted by: 可可西里. Updated: 2023-11-01 09:08:11

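I have been using a mongo 3.2.9 installation to do some live data investigation. The crux of it is finding details about records that are missing data. However, the query I am running times out in robomongo and compass.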

I have a collection (foo) with over 3 million records. I am searching for all records that have a barId, and this is the query I fire at mongo:

db.foo.find({barId:{$exists:true}}).explain(true)

This is the execution plan from the mongo shell (the query times out in robomongo and compass):

MongoDB Enterprise > db.foo.find({barId:{$exists:true}}).explain(true)
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "myDatabase01.foo",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "barId" : {
        "$exists" : true
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "filter" : {
        "barId" : {
          "$exists" : true
        }
      },
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "[MinKey, MaxKey]"
          ]
        }
      }
    },
    "rejectedPlans" : [ ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 2,
    "executionTimeMillis" : 154716,
    "totalKeysExamined" : 3361040,
    "totalDocsExamined" : 3361040,
    "executionStages" : {
      "stage" : "FETCH",
      "filter" : {
        "barId" : {
          "$exists" : true
        }
      },
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 152060,
      "works" : 3361041,
      "advanced" : 2,
      "needTime" : 3361038,
      "needYield" : 0,
      "saveState" : 27619,
      "restoreState" : 27619,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 3361040,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 3361040,
        "executionTimeMillisEstimate" : 1260,
        "works" : 3361041,
        "advanced" : 3361040,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 27619,
        "restoreState" : 27619,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "[MinKey, MaxKey]"
          ]
        },
        "keysExamined" : 3361040,
        "dupsTested" : 0,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    },
    "allPlansExecution" : [ ]
  },
  "serverInfo" : {
    "host" : "myLinuxMachine",
    "port" : 8080,
    "version" : "3.2.9",
    "gitVersion" : "22ec9e93b40c85fc7cae7d56e7d6a02fd811088c"
  },
  "ok" : 1
}

It appears to use my barId_1 index, yet it examines all 3 million records to return just 2.
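
To put a number on that, here is a minimal sketch of a helper that reads the `nReturned`, `totalKeysExamined` and `totalDocsExamined` fields from an explain result. The function name `scanEfficiency` is made up for illustration and is not part of MongoDB; the figures are copied from the output above.

```javascript
// Hypothetical helper (not a MongoDB API): work done per returned document,
// computed from the executionStats fields of an explain result.
function scanEfficiency(executionStats) {
  const { nReturned, totalKeysExamined, totalDocsExamined } = executionStats;
  return {
    keysPerResult: totalKeysExamined / nReturned,
    docsPerResult: totalDocsExamined / nReturned,
  };
}

// Figures copied from the $exists explain output above.
const existsStats = {
  nReturned: 2,
  totalKeysExamined: 3361040,
  totalDocsExamined: 3361040,
};

const efficiency = scanEfficiency(existsStats);
console.log(efficiency);
```

A ratio close to 1 suggests the index bounds are doing the filtering; a ratio in the millions, as here, suggests the filter is applied only after each document has been fetched. The helper returns `Infinity` when nothing is returned, so treat it as a rough diagnostic only.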

I ran a similar query, but instead of checking whether the field exists I looked for barId values greater than "0" (which all of them are):

MongoDB Enterprise > db.foo.find({barId:{$gt:"0"}}).explain(true)
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "myDatabase01.foo",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "barId" : {
        "$gt" : "0"
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "(\"0\", {})"
          ]
        }
      }
    },
    "rejectedPlans" : [ ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 2,
    "executionTimeMillis" : 54,
    "totalKeysExamined" : 2,
    "totalDocsExamined" : 2,
    "executionStages" : {
      "stage" : "FETCH",
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 10,
      "works" : 3,
      "advanced" : 2,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 2,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 2,
        "executionTimeMillisEstimate" : 10,
        "works" : 3,
        "advanced" : 2,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "(\"0\", {})"
          ]
        },
        "keysExamined" : 2,
        "dupsTested" : 0,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    },
    "allPlansExecution" : [ ]
  },
  "serverInfo" : {
    "host" : "myLinuxMachine",
    "port" : 8080,
    "version" : "3.2.9",
    "gitVersion" : "22ec9e93b40c85fc7cae7d56e7d6a02fd811088c"
  },
  "ok" : 1
}

This again did an index scan on barId_1. It examined 2 records and returned 2.

For completeness, here are the 2 records; the other 3 million are very similar in size and composition.

MongoDB Enterprise > db.foo.find({barId:{$gt:"0"}})
{ 
  "_id" : "00002f5d-ee4a-4996-bb27-b54ea84df777", "createdDate" : ISODate("2016-11-16T02:26:48.500Z"), "createdBy" : "Exporter", "lastModifiedDate" : ISODate("2016-11-16T02:26:48.500Z"), "lastModifiedBy" : "Exporter", "rolePlayed" : "LA", "roleType" : "T", "oId" : [ "d7316944-62ed-48dc-8ee4-e3bad8c58b10" ], "barId" : "e45b3160-bbb4-24e5-82b3-ad0c28329555", "cId" : "dcc29053-7a1f-439e-9536-fb4e44ff8a51", "timestamp" : "2017-02-20T16:23:15.795Z" 
}
{ 
  "_id" : "00002f5d-ee4a-4996-bb27-b54ea84df888", "createdDate" : ISODate("2016-11-16T02:26:48.500Z"), "createdBy" : "Exporter", "lastModifiedDate" : ISODate("2016-11-16T02:26:48.500Z"), "lastModifiedBy" : "Exporter", "rolePlayed" : "LA", "roleType" : "T", "oId" : [ "d7316944-62ed-48dc-8ee4-e3bad8c58b10" ], "barId" : "e45b3160-bbb4-24e5-82b3-ad0c28329555", "cId" : "dcc29053-7a1f-439e-9536-fb4e44ff8a51", "timestamp" : "2017-02-20T16:23:15.795Z" 
}

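Naturally I have done some googling and found that there used to be problems with combining indexes and the $exists clause, but many threads say this has been fixed. Has it? I also found the following hack, which you can use instead of $exists to force the index to be used "properly" when checking whether a field exists.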

MongoDB Enterprise > db.foo.find({barId:{$ne:null}}).explain(true)
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "myDatabase01.foo",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "$not" : {
        "barId" : {
          "$eq" : null
        }
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "filter" : {
        "$not" : {
          "barId" : {
            "$eq" : null
          }
        }
      },
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "[MinKey, null)",
            "(null, MaxKey]"
          ]
        }
      }
    },
    "rejectedPlans" : [ ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 2,
    "executionTimeMillis" : 57,
    "totalKeysExamined" : 3,
    "totalDocsExamined" : 2,
    "executionStages" : {
      "stage" : "FETCH",
      "filter" : {
        "$not" : {
          "barId" : {
            "$eq" : null
          }
        }
      },
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 10,
      "works" : 4,
      "advanced" : 2,
      "needTime" : 1,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 2,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 2,
        "executionTimeMillisEstimate" : 10,
        "works" : 4,
        "advanced" : 2,
        "needTime" : 1,
        "needYield" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "barId" : 1
        },
        "indexName" : "barId_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
          "barId" : [
            "[MinKey, null)",
            "(null, MaxKey]"
          ]
        },
        "keysExamined" : 3,
        "dupsTested" : 0,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    },
    "allPlansExecution" : [ ]
  },
  "serverInfo" : {
    "host" : "myLinuxMachine",
    "port" : 8080,
    "version" : "3.2.9",
    "gitVersion" : "22ec9e93b40c85fc7cae7d56e7d6a02fd811088c"
  },
  "ok" : 1
}

This works: it examined only 3 keys and 2 documents, and returned only 2 documents.

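My question is this. Should I be using $exists in queries at all? Is it fit for use in a live production application? If the answer is no, why does the $exists clause exist in the first place?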

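There is always the possibility that the mongo installation is at fault, or that the index is badly designed. Any light shed on this would be very welcome, but for now I am sticking with the $ne:null hack.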
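
One caveat about the `$ne: null` workaround: it is not an exact substitute for `$exists: true`. A document whose field is present but explicitly `null` matches `$exists: true` and does not match `$ne: null`. The sketch below is a toy re-implementation of the two predicates in plain JavaScript, not MongoDB code, and the sample documents are invented for illustration.

```javascript
// Toy model of the two predicates for a single top-level field.
// This is not MongoDB code, only an illustration of the matching rules.
const matchesExistsTrue = (doc, field) =>
  Object.prototype.hasOwnProperty.call(doc, field);
const matchesNeNull = (doc, field) =>
  doc[field] !== null && doc[field] !== undefined;

// Hypothetical sample documents.
const docs = [
  { _id: 1, barId: "e45b3160" }, // field present with a value
  { _id: 2, barId: null },       // field present but explicitly null
  { _id: 3 },                    // field missing
];

const existsIds = docs.filter((d) => matchesExistsTrue(d, "barId")).map((d) => d._id);
const neNullIds = docs.filter((d) => matchesNeNull(d, "barId")).map((d) => d._id);

console.log(existsIds); // [ 1, 2 ]
console.log(neNullIds); // [ 1 ]
```

If barId is never stored as an explicit null, as appears to be the case here, the two queries return the same documents.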

Best answer

You should use a partial index (preferred) or a sparse index on the barId field:

db.foo.createIndex(
   { barId: 1 },
   { partialFilterExpression: { barId: { $exists: true } } }
)
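
As background on why this helps: a regular (non-sparse) index holds an entry for every document, with a null key where the field is missing, so `$exists: true` over the full `[MinKey, MaxKey]` range has to fetch each document to check it. A partial or sparse index only holds entries for documents that have the field. The sketch below is a toy model in plain JavaScript, not MongoDB internals; `buildIndex`, `findExists` and the sample data are made up for illustration.

```javascript
// Toy model, not MongoDB internals: an index is a list of { key, doc } entries.
function buildIndex(docs, field, { onlyIfPresent = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    if (onlyIfPresent && !present) continue; // partial/sparse: skip missing fields
    // Regular index: a missing field is stored under a null key.
    entries.push({ key: present ? doc[field] : null, doc });
  }
  return entries;
}

// $exists: true over the whole key range: every entry is visited and its
// document fetched to check whether the field is really present.
function findExists(index, field) {
  let docsExamined = 0;
  const results = [];
  for (const { doc } of index) {
    docsExamined += 1;
    if (Object.prototype.hasOwnProperty.call(doc, field)) results.push(doc);
  }
  return { nReturned: results.length, docsExamined };
}

// Hypothetical data: 2 documents with barId among 1000 without it.
const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i }));
docs.push({ _id: "a", barId: "x" }, { _id: "b", barId: "y" });

const regular = findExists(buildIndex(docs, "barId"), "barId");
const partial = findExists(buildIndex(docs, "barId", { onlyIfPresent: true }), "barId");

console.log(regular); // { nReturned: 2, docsExamined: 1002 }
console.log(partial); // { nReturned: 2, docsExamined: 2 }
```

The sparse variant mentioned above would be `db.foo.createIndex({ barId: 1 }, { sparse: true })`. Since the collection already has a `barId_1` index on the same key, I would expect it to need dropping first (`db.foo.dropIndex("barId_1")`), but that is worth verifying on your version before relying on it.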

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/42378355/
