gpt4 book ai didi

json - Bitmap Index Scan always Followed by Bitmap Heap Scan for JSON field query

转载 作者:行者123 更新时间:2023-11-29 12:40:47 26 4
gpt4 key购买 nike

我有以下索引:

CREATE INDEX index_c_profiles_on_city_state_name_domain ON 
c_profiles ((data->>'state'), (data->>'city'), name, domain);

我正在使用以下查询:

SELECT mm.name, mm.domain, mm.data ->> 'city' as city, mm.data ->> 
'state' as state
FROM c_profiles as mm
WHERE ((mm.data ->> 'state') = 'AZ')

但是当我使用 EXPLAIN ANALYZE 对此进行测试时,它总是执行位图索引扫描(良好且快速),然后是非常非常慢的位图堆扫描(通常比单独的索引扫描慢 100 倍)。

我也试过只索引 WHERE 条件,结果是一样的,它在使用索引后仍然在做非常慢的位图堆扫描。

为什么 Postgres 这样做?我怎样才能让它只进行索引扫描来加快此查询?

这是一个 EXPLAIN ANALYZE 结果示例:

[
{
"Execution Time": 53.655,
"Planning Time": 0.081,
"Plan": {
"Exact Heap Blocks": 1338,
"Node Type": "Bitmap Heap Scan",
"Actual Total Time": 53.031,
"Shared Hit Blocks": 727,
"Schema": "public",
"Plans": [
{
"Node Type": "Bitmap Index Scan",
"Actual Total Time": 0.455,
"Shared Hit Blocks": 2,
"Shared Read Blocks": 13,
"Temp Written Blocks": 0,
"Local Dirtied Blocks": 0,
"Local Hit Blocks": 0,
"Plan Width": 0,
"Actual Loops": 1,
"Actual Startup Time": 0.455,
"Temp Read Blocks": 0,
"Local Read Blocks": 0,
"Index Name": "index_mattermark_profiles_on_city_state_name_domain",
"Startup Cost": 0,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Written Blocks": 0,
"Plan Rows": 788,
"Index Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
"Actual Rows": 1417,
"Parent Relationship": "Outer",
"Total Cost": 34.33
}
],
"Shared Read Blocks": 650,
"Relation Name": "mattermark_profiles",
"Local Hit Blocks": 0,
"Local Dirtied Blocks": 0,
"Temp Written Blocks": 0,
"Plan Width": 1010,
"Actual Loops": 1,
"Rows Removed by Index Recheck": 0,
"Lossy Heap Blocks": 0,
"Alias": "mm",
"Recheck Cond": "((mm.data ->> 'state'::text) = 'AZ'::text)",
"Temp Read Blocks": 0,
"Output": [
"name",
"domain",
"(data ->> 'city'::text)",
"(data ->> 'state'::text)"
],
"Actual Startup Time": 0.703,
"Local Read Blocks": 0,
"Startup Cost": 34.53,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Written Blocks": 0,
"Plan Rows": 788,
"Actual Rows": 1417,
"Total Cost": 2894.17
},
"Triggers": []
}
]

最佳答案

PostgreSQL 选择位图索引扫描而不是普通的索引扫描,因为它认为这样会更快。

当估计的结果行数很高时通常会出现这种情况。

正常的索引扫描必须为找到的每个索引条目访问表,这会导致表上出现大量随机 I/O,并且可能需要多次处理同一个 block 。

位图索引扫描的工作原理是首先找到所有索引条目,按照它们在表中的物理位置的顺序对它们进行排序,然后从表中扫描所需的 block 。这样效率更高,因为它将按顺序扫描表 block 。

第二步,位图堆扫描,作为它自己的节点出现在 EXPLAIN 输出中,通常是更昂贵的步骤。

所以一切看起来都井井有条。

您可以尝试将 enable_bitmapscan 设置为 off 并查看 PostgreSQL 是否正确以及由此产生的计划是否会更昂贵。

关于json - Bitmap Index Scan always Followed by Bitmap Heap Scan for JSON field query,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51409461/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com