gpt4 book ai didi

sql - Oracle 加入时查询速度慢,拆分成两个查询时速度快

转载 作者:行者123 更新时间:2023-12-04 14:38:49 26 4
gpt4 key购买 nike

我有一个 sql 查询,它在连接两个表时很慢,但是当我第一次查询一个表并用它的输出查询另一个表时很快。

环境/前提条件

甲骨文版本:
Oracle 数据库 11g 企业版 11.2.0.3.0 版 - 64 位生产

涉及对象:
cfc_materialized(物化 View ,21,5 Mio 行)
联系人(表,12,6 Mio 行)

涉及的索引:
CREATE INDEX CONTACT_CLIENT ON CONTACT(CLIENT);
在 CFC_MATERIALIZED (ASSOCIATION_TYPE, SOURCEID, TARGETID) 上创建索引 CFC_MATERIALIZED_A_S_T;

我已经用级联 => true 重新计算了 2 个表的统计信息:

BEGIN
SYS.DBMS_STATS.GATHER_TABLE_STATS (
OwnName => '...'
,TabName => '...'
,Estimate_Percent => 0
,Degree => 4
,Cascade => TRUE
,No_Invalidate => FALSE);
END;
/

问题

我有以下查询:
SELECT c.*
FROM contact c
WHERE c.client in (
SELECT cfc.targetid
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
AND cfc.association_type = 'ContactDataSharing'
)
AND c.deleted = 0;

解释计划:
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11M| 2214M| 38976 (1)| 00:07:48 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | | | | | Q1,01 | P->S | QC (RAND) |
| 3 | NESTED LOOPS | | | | | | Q1,01 | PCWP | |
| 4 | NESTED LOOPS | | 11M| 2214M| 38976 (1)| 00:07:48 | Q1,01 | PCWP | |
| 5 | SORT UNIQUE | | 2164 | 196K| 10672 (1)| 00:02:09 | Q1,01 | PCWP | |
| 6 | PX RECEIVE | | 2164 | 196K| 10672 (1)| 00:02:09 | Q1,01 | PCWP | |
| 7 | PX SEND HASH | :TQ10000 | 2164 | 196K| 10672 (1)| 00:02:09 | Q1,00 | P->P | HASH |
| 8 | PX BLOCK ITERATOR | | 2164 | 196K| 10672 (1)| 00:02:09 | Q1,00 | PCWC | |
|* 9 | MAT_VIEW ACCESS FULL | CFC_MATERIALIZED | 2164 | 196K| 10672 (1)| 00:02:09 | Q1,00 | PCWP | |
|* 10 | INDEX RANGE SCAN | CONTACT_CLIENT | 5500 | | 37 (0)| 00:00:01 | Q1,01 | PCWP | |
|* 11 | TABLE ACCESS BY INDEX ROWID| CONTACT | 5474 | 550K| 973 (0)| 00:00:12 | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - filter("CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc' AND "CFC"."ASSOCIATION_TYPE"='ContactDataSharing')
10 - access("C"."CLIENT"="CFC"."TARGETID")
11 - filter("C"."DELETED"=0)enter code here

这需要很长时间,我想知道为什么对 cfc_materialized 有完全访问权限。
当我将查询拆分为 2 个查询时,速度要快得多。
第一个查询:
SELECT cfc.targetid
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
AND cfc.association_type = 'ContactDataSharing';

此查询返回 2 个目标 ID。
解释计划:
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2164 | 196K| 36 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| CFC_MATERIALIZED_A_S_T | 2164 | 196K| 36 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND
"CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')

这里是第二个查询。我使用第一个查询的输出作为 IN 参数的输入:
SELECT *
FROM contact c
WHERE c.client in (
'e95027f0-a83e-11e3-a0ae-005056aebabc',
'eb37d3b0-a83e-11e3-a0ae-005056aebabc'
)
AND c.deleted = 0;

解释计划:
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2814 | 283K| 505 (0)| 00:00:07 |
| 1 | INLIST ITERATOR | | | | | |
|* 2 | TABLE ACCESS BY INDEX ROWID| CONTACT | 2814 | 283K| 505 (0)| 00:00:07 |
|* 3 | INDEX RANGE SCAN | CONTACT_CLIENT | 2827 | | 23 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("C"."DELETED"=0)
3 - access("C"."CLIENT"='e95027f0-a83e-11e3-a0ae-005056aebabc' OR
"C"."CLIENT"='eb37d3b0-a83e-11e3-a0ae-005056aebabc')

问题

所以我的问题是,为什么 CBO 没有像手动执行 2 个查询那样以类似的方式执行查询?我还尝试了带有提示的查询,以使用与 2 个查询中相同的索引:
SELECT /*+ index(c CONTACT_CLIENT) */ c.*
FROM contact c
WHERE c.client in (
SELECT /*+ index(cfc CFC_MATERIALIZED_A_S_T) */ cfc.targetid
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
AND cfc.association_type = 'ContactDataSharing'
)
AND c.deleted = 0;

但后来我得到了一个更糟糕的解释计划:
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11M| 2214M| 305K (1)| 01:01:09 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 11M| 2214M| 305K (1)| 01:01:09 |
| 3 | SORT UNIQUE | | 2164 | 196K| 36 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | CFC_MATERIALIZED_A_S_T | 2164 | 196K| 36 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | CONTACT_CLIENT | 5500 | | 37 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID| CONTACT | 5474 | 550K| 973 (0)| 00:00:12 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND
"CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')
5 - access("C"."CLIENT"="CFC"."TARGETID")
6 - filter("C"."DELETED"=0)

我还尝试使用连接而不是 Alexander 和 StanislavL 建议的 IN 子句:
SELECT c.*
FROM contact c
JOIN cfc_materialized cfc ON c.client = cfc.targetid
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
AND cfc.association_type = 'ContactDataSharing'
AND c.deleted = 0;

并得到了以下解释计划,这同样很慢(比两个单独的查询慢):
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11M| 2214M| 51745 (1)| 00:10:21 |
|* 1 | HASH JOIN | | 11M| 2214M| 51745 (1)| 00:10:21 |
|* 2 | INDEX RANGE SCAN | CFC_MATERIALIZED_A_S_T | 2164 | 196K| 36 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| CONTACT | 12M| 1237M| 51649 (1)| 00:10:20 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("C"."CLIENT"="CFC"."TARGETID")
2 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND
"CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')
3 - filter("C"."DELETED"=0)

最佳答案

SELECT c.*
FROM contact c
join (
SELECT cfc.targetid as client
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
AND cfc.association_type = 'ContactDataSharing'
) sub ON c.client=sub.client
AND c.deleted = 0;

使用 join 而不是 IN避免为 WHERE 中的每一行计算相同的查询

关于sql - Oracle 加入时查询速度慢,拆分成两个查询时速度快,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/22629630/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com