
MongoDB low-cardinality indexes

Reposted. Author: 行者123. Updated: 2023-12-03 15:59:40

Coming from a SQL background, I know the following:

The cardinality of an index is the number of unique values within it. Your database table may have a billion rows in it, but if it only has 8 unique values among those rows, your cardinality is very low.

A low cardinality index is not a major efficiency gain. Most SQL indexes are binary search trees (B-Trees). Versus a serial scan of every row in a table to find matching constraints, a B-Tree logarithmically reduces the number of comparisons that have to be made. The gains from executing a search against a B-Tree are very low when the size of the tree is small.

So putting an index on a Boolean field? Or an enumerated value field? A cardinality of a very small number of distinct values among a very large number of rows will not yield noticeable efficiency gains. Save your database indexes for fields with very high cardinality to ensure the gains from scanning a B-Tree are largest versus sequential scans.

What about MongoDB? Should we create an index on a low-cardinality field that we filter on frequently, for example an enum field with 4 states?

Best answer

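Yes, MongoDB has the same problem, because it also uses B-Trees for its indexes. An index on a low-cardinality field therefore runs into the same performance issue: each indexed value still matches a large share of the documents.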
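
To put rough numbers on this, here is a minimal Python sketch. It assumes the values are spread uniformly across the collection (real data is often skewed), and the function names are made up for illustration. It only estimates how many documents an equality filter still has to fetch after the index lookup.

```python
def expected_matches(total_docs: int, distinct_values: int) -> int:
    """Documents matched by an equality filter on one value.

    Assumption: values are uniformly distributed over the collection.
    """
    return total_docs // distinct_values


def fraction_fetched(total_docs: int, distinct_values: int) -> float:
    """Share of the collection that must still be fetched via the index."""
    return expected_matches(total_docs, distinct_values) / total_docs


if __name__ == "__main__":
    total = 1_000_000_000  # the "billion rows" from the question
    for cardinality in (2, 4, 8, 1_000_000):
        matched = expected_matches(total, cardinality)
        share = fraction_fetched(total, cardinality)
        print(f"{cardinality:>9} distinct values: "
              f"{matched:>11} docs matched ({share:.4%} of the collection)")
```

With 4 states, a quarter of the collection still has to be read for each query, so the index saves little compared with a full scan.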

Here is a good article about it:

https://www.percona.com/blog/2018/12/19/using-partial-and-sparse-indexes-in-mongodb/

There is no simple or supported solution, but the article offers some options for specific cases:

  • you run queries on a boolean field with an uneven distribution, and you look mostly for the less frequent value
  • you have a low cardinality field and the majority of the queries look for a subset of the values
  • the majority of the queries look for a limited subset of the values in a field
  • you don’t have enough memory to store very large indexes – for example, you have a lot of page evictions from the WiredTiger cache
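
The first case in the list could be handled with a partial index that covers only the rare value, which is the approach the linked article's title points to. Below is a hedged sketch, not the article's own code. The helper names, the `status` field, and the `"failed"` value are hypothetical. `create_index` with the `partialFilterExpression` option is the PyMongo call; the function accepts any collection-like object, so the sketch can be tried without a running server.

```python
from typing import Any, Dict, List, Tuple


def partial_index_spec(field: str, rare_value: Any) -> Tuple[List[Tuple[str, int]], Dict[str, Any]]:
    """Build the key list and options for an index that covers only
    documents where `field` equals `rare_value`."""
    keys = [(field, 1)]  # 1 means ascending order
    options = {
        # Hypothetical naming scheme, chosen only for readability.
        "name": f"{field}_partial_{rare_value}",
        # Only documents matching this filter get an index entry.
        "partialFilterExpression": {field: rare_value},
    }
    return keys, options


def create_partial_index(collection: Any, field: str, rare_value: Any) -> str:
    """Create the partial index on a PyMongo-style collection object."""
    keys, options = partial_index_spec(field, rare_value)
    return collection.create_index(keys, **options)


# Usage with a real deployment (assumes pymongo is installed and a server is reachable):
#   from pymongo import MongoClient
#   orders = MongoClient()["shop"]["orders"]   # hypothetical database/collection
#   create_partial_index(orders, "status", "failed")
```

Note that a query has to include the filter condition, for example `{"status": "failed"}`, for MongoDB to consider the partial index. Queries for the other values will not use it.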

A similar question about MongoDB low-cardinality indexes can be found on Stack Overflow: https://stackoverflow.com/questions/51579528/
