gpt4 book ai didi

sql - 窗口函数排序成本很高,我们能克服它吗?

转载 作者:行者123 更新时间:2023-12-02 11:27:47 24 4
gpt4 key购买 nike

我的要求:识别 DEPT_NUM 的前 10 个帐户,按帐号升序排列。

查询:

SELECT * FROM
(
select acctnum,dept_num,row_number() OVER (PARTITION BY DEPT_NUM ORDER BY ACCTNUM) as row_identifier
FROM MYTABLE
)
WHERE row_identifier between 1 and 10;

追踪:

    7532 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1480074522

--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 577K| 15M| | 3855 (1)| 00:00:47 |
|* 1 | VIEW | | 577K| 15M| | 3855 (1)| 00:00:47 |
|* 2 | WINDOW SORT PUSHED RANK| | 577K| 7890K| 13M| 3855 (1)| 00:00:47 |
| 3 | INDEX FAST FULL SCAN | IMTAB05 | 577K| 7890K| | 987 (1)| 00:00:12 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=5)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY "ACCTNUM")<=5)


Statistics
----------------------------------------------------------
0 recursive calls
2 db block gets
4298 consistent gets
0 physical reads
0 redo size
144367 bytes sent via SQL*Net to client
486 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
7532 rows processed

索引:

索引扫描显示,DEPT_NUM列上的INDEX STORAGE

强制全表扫描使成本从 3855 降低到 11092

表中总行数为632667;

<小时/>

以上均为测试区结果。产量实际上是两倍。

我的数据库是Exadata,Quarter RAC。运行 Oracle 11g R2。数据库足够强大,可以立即执行,但是DBA不愿意使用13M的tempSpc。据商业报道,该报告的频率为每小时 4 次。主要的是,这个表进行了大量的实时插入/更新

我们可以即兴创作这个过程吗
1)增加 session 的PGA?(不确定是否真的可能?)
2)额外的索引会有帮助吗?

只是希望用不同的眼光来看待这个问题,因为我们的团队完全只关注 DBA 参数。

感谢您的任何建议!

最佳答案

分析函数的性能可能取决于索引列的顺序。将索引从 (ACCTNUM,DEPT_NUM) 更改为 (DEPT_NUM,ACCTNUM) 可能会降低成本并消除对临时表空间的需求。

partition by COL_2 order by COL_1 => INDEX FAST FULL SCAN|WINDOW SORT PUSHED RANK
partition by COL_1 order by COL_2 => INDEX FULL SCAN|WINDOW NOSORT

INDEX FAST FULL SCAN 使用更快的多 block IO,但它还需要对数据进行排序,并且可能需要为排序区域提供临时表空间。

INDEX FULL SCAN 使用较慢的单 block IO,但它按顺序返回数据并避免排序。

示例架构和数据

--drop table mytable;
create table mytable(dept_num number not null, acctnum number not null
,a number, b number, c number, d number, e number);
insert into mytable
select 1 dept_num, 1 acctnum, 0,0,0,0,0 from dual union all
select 1 dept_num, 2 acctnum, 0,0,0,0,0 from dual union all
select 1 dept_num, 3 acctnum, 0,0,0,0,0 from dual union all
select 2 dept_num, 1 acctnum, 0,0,0,0,0 from dual union all
select 2 dept_num, 2 acctnum, 0,0,0,0,0 from dual union all
select 3 dept_num, 1 acctnum, 0,0,0,0,0 from dual;
--Create 600K similar rows.
insert into mytable
select dept_num + rownumber*3, acctnum, a,b,c,d,e
from mytable
cross join (select level rownumber from dual connect by level <= 100000);
begin
dbms_stats.gather_table_stats(user, 'mytable');
end;
/

(ACCTNUM,DEPT_NUM) = 窗口排序推送排名

create index mytable_idx on mytable(acctnum, dept_num);

explain plan for
select dept_num, acctnum from
(
select dept_num, acctnum
,row_number() over (partition by dept_num order by acctnum) as row_identifier
from mytable
)
where row_identifier between 1 and 10;

select * from table(dbms_xplan.display);

Plan hash value: 952182109

------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 600K| 22M| | 1625 (3)| 00:00:23 |
|* 1 | VIEW | | 600K| 22M| | 1625 (3)| 00:00:23 |
|* 2 | WINDOW SORT PUSHED RANK| | 600K| 4687K| 9424K| 1625 (3)| 00:00:23 |
| 3 | INDEX FAST FULL SCAN | MYTABLE_IDX | 600K| 4687K| | 239 (3)| 00:00:04 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=10)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY "ACCTNUM")<=10)

(DEPT_NUM,ACCTNUM) = 窗口不排序

drop index mytable_idx;
create index mytable_idx on mytable(dept_num, acctnum);

explain plan for
select dept_num, acctnum from
(
select dept_num, acctnum
,row_number() over (partition by dept_num order by acctnum) as row_identifier
from mytable
)
where row_identifier between 1 and 10;

select * from table(dbms_xplan.display);

Plan hash value: 1773829932

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 600K| 22M| 792 (2)| 00:00:12 |
|* 1 | VIEW | | 600K| 22M| 792 (2)| 00:00:12 |
|* 2 | WINDOW NOSORT | | 600K| 4687K| 792 (2)| 00:00:12 |
| 3 | INDEX FULL SCAN| MYTABLE_IDX | 600K| 4687K| 792 (2)| 00:00:12 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=10)
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY
"ACCTNUM")<=10)

关于sql - 窗口函数排序成本很高,我们能克服它吗?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24620405/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com