sql - 使用 SQL 逻辑 'OR' 和运算符 'LIKE' 时，如何调整 PostgreSQL 查询优化器？-6ren

sql - 使用 SQL 逻辑 'OR' 和运算符 'LIKE' 时，如何调整 PostgreSQL 查询优化器？

转载作者：行者123 更新时间：2023-11-29 12:46:44

当查询包含 SQL OR 时，我如何调整 PostgreSQL 查询计划(或 SQL 查询本身)以更优化地使用可用索引？使用 LIKE 的条件运算符而不是 = ？

例如，考虑以下仅需5 毫秒 即可执行的查询:

explain analyze select *
from report_workflow.request_attribute
where domain = 'externalId'
and scoped_id_value[2] = 'G130324135454100';

"Bitmap Heap Scan on request_attribute  (cost=44.94..6617.85 rows=2081 width=139) (actual time=4.619..4.619 rows=2 loops=1)"
"  Recheck Cond: (((scoped_id_value[2])::text = 'G130324135454100'::text) AND ((domain)::text = 'externalId'::text))"
"  ->  Bitmap Index Scan on request_attribute_accession_number  (cost=0.00..44.42 rows=2081 width=0) (actual time=3.777..3.777 rows=2 loops=1)"
"        Index Cond: ((scoped_id_value[2])::text = 'G130324135454100'::text)"
"Total runtime: 5.059 ms"

如查询计划所示，此查询利用了部分索引 request_attribute_accession_number及索引条件scoped_id_value[2] = 'G130324135454100' .首页 request_attribute_accession_number具有以下定义:

CREATE INDEX request_attribute_accession_number
ON report_workflow.request_attribute((scoped_id_value[2]))
WHERE domain = 'externalId';

(请注意，表 scoped_id_value 中的列 request_attribute 的类型为 character varying[] 。)

但是，当我向同一个查询添加一个额外的 OR 时使用相同数组列元素的条件 scoped_id_value[2] , 但 LIKE运算符而不是 = ，尽管在相同的第一个条件下产生相同的结果，但查询现在需要 7553 毫秒:

explain analyze select *
from report_workflow.request_attribute
where domain = 'externalId'
and (scoped_id_value[2] = 'G130324135454100'
or scoped_id_value[2] like '%G130324135454100%');

"Bitmap Heap Scan on request_attribute  (cost=7664.77..46768.27 rows=2122 width=139) (actual time=142.164..7552.650 rows=2 loops=1)"
"  Recheck Cond: ((domain)::text = 'externalId'::text)"
"  Rows Removed by Index Recheck: 1728712"
"  Filter: (((scoped_id_value[2])::text = 'G130324135454100'::text) OR ((scoped_id_value[2])::text ~~ '%G130324135454100%'::text))"
"  Rows Removed by Filter: 415884"
"  ->  Bitmap Index Scan on request_attribute_accession_number  (cost=0.00..7664.24 rows=416143 width=0) (actual time=136.249..136.249 rows=415886 loops=1)"
"Total runtime: 7553.154 ms"

注意这次查询优化器忽略索引条件 scoped_id_value[2] = 'G130324135454100'当它使用索引 request_attribute_accession_number 执行内部位图索引扫描时并因此生成 415,886 行，而不是像第一个查询那样只生成两行。

在介绍 OR 时条件与 LIKE运算符进入第二个查询，为什么优化器生成的查询计划不如第一个查询计划优化？我如何调整查询优化器或查询以更像第一个查询？

最佳答案

在第二个计划中，你有:

scoped_id_value[2] like '%G130324135454100%'

Postgres(或任何其他数据库)不能使用索引来解决这个问题。它在索引中的什么位置？它甚至不知道从哪里开始，所以它必须进行全表扫描。

对于这种情况，您可以通过在表达式上构建索引来处理这个问题(参见 here )。但是，这将非常特定于字符串 'G130324135454100'。

我应该补充一点，问题不在于like。 Postgres 将在以下位置使用索引:

scoped_id_value[2] like 'G130324135454100%'

关于sql - 使用 SQL 逻辑 'OR' 和运算符 'LIKE' 时，如何调整 PostgreSQL 查询优化器？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/16776154/

文章推荐： mysql - 将表中日期时间的所有列转换为不同的日期格式

文章推荐： ios7 presentviewcontroller 无法设置框架

文章推荐： java - Hibernate xml 列定义

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

sql - 使用 SQL 逻辑 'OR' 和运算符 'LIKE' 时，如何调整 PostgreSQL 查询优化器？