gpt4 book ai didi

performance - 尽管进行了 STOPKEY 优化,但前 N 个查询还是做了太多工作

转载 作者:行者123 更新时间:2023-12-03 22:46:00 26 4
gpt4 key购买 nike

这会很长,所以这里有一个简短的总结来吸引你:我的
使用 COUNT STOPKEY 的前 N ​​个查询和 ORDER BY STOPKEY在它的计划中是
仍然无缘无故地缓慢。

现在,细节。它从一个缓慢的函数开始。在现实生活中它
涉及使用正则表达式的字符串操作。出于演示目的,
这是一个故意愚蠢的递归斐波那契算法。我找到它了
对于高达大约 25 的输入,速度非常快,大约 30,并且
35岁可笑。

-- I repeat: Please no advice on how to do Fibonacci correctly.
-- This is slow on purpose!
CREATE OR REPLACE FUNCTION tmp_fib (
n INTEGER
)
RETURN INTEGER
AS
BEGIN
IF n = 0 OR n = 1 THEN
RETURN 1;
END IF;
RETURN tmp_fib(n-2) + tmp_fib(n-1);
END;
/

现在一些输入:名称和数字的列表。
CREATE TABLE tmp_table (
name VARCHAR2(20) UNIQUE NOT NULL,
num NUMBER(2,0)
);
INSERT INTO tmp_table (name,num)
SELECT 'Alpha', 10 FROM dual UNION ALL
SELECT 'Bravo', 11 FROM dual UNION ALL
SELECT 'Charlie', 33 FROM dual;

下面是一个慢查询的例子:使用慢 Fibonacci 函数来
选择 num 生成具有双位数的斐波那契数的行。
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name;

这对于 11 和 33 是​​正确的,所以 BravoCharlie在输出中。
运行大概需要5秒,几乎都是慢的 tmp_fib(33)的计算.所以我想做一个更快的版本
通过将其转换为 top-N 查询来降低查询速度。当 N=1 时,看起来像
这个:
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;

现在它返回最高结果, Bravo .但是还是需要5秒
运行!唯一的解释是它仍在计算 tmp_fib(33) ,即使该计算的结果无关紧要
结果。应该可以决定 Bravo当时正在
要输出,因此无需测试 WHERE 条件
table 的其余部分。

我想也许优化器只需要被告知 tmp_fib价格昂贵。所以我试着告诉它,像这样:
ASSOCIATE STATISTICS WITH FUNCTIONS tmp_fib DEFAULT COST (999999999,0,0);

这改变了计划中的一些成本数字,但并没有
查询运行得更快。
SELECT * FROM v$version 的输出如果这是依赖于版本的:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

这是 top-1 查询的自动跟踪。它似乎声称
查询花了 1 秒,但事实并非如此。它运行了大约 5
秒。
NAME                        NUM
-------------------- ----------
Bravo 11


Execution Plan
----------------------------------------------------------
Plan hash value: 548796432

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 4 (25)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 55 | 4 (25)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 55 | 4 (25)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | TMP_TABLE | 1 | 55 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - filter(ROWNUM<=1)
4 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))

Note
-----
- dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
593 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

更新 :正如我在评论中提到的,一个 INDEX提示对这个查询有很大帮助。被接受为正确答案就足够了,即使它不能很好地转化为我的真实世界场景。具有讽刺意味的是,甲骨文似乎从经验中吸取了教训,现在选择了 INDEX。默认计划;我必须告诉它 NO_INDEX重现原来的缓慢行为。

在实际场景中,我应用了更复杂的解决方案,将查询重写为 PL/SQL 函数。这是我的技术的外观,应用于 fib问题:
CREATE OR REPLACE PACKAGE tmp_package IS
TYPE t_namenum IS TABLE OF tmp_table%ROWTYPE;
FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY tmp_package IS
FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED IS
CURSOR c IS SELECT name, num FROM tmp_table ORDER BY name;
rec c%ROWTYPE;
outcount INTEGER;
BEGIN
OPEN c;
outcount := 0;
WHILE outcount < howmany LOOP
FETCH c INTO rec;
EXIT WHEN c%NOTFOUND;
IF REGEXP_LIKE(tmp_fib(rec.num), '(.)\1') THEN
PIPE ROW(rec);
outcount := outcount + 1;
END IF;
END LOOP;
END;
END;
/

SELECT * FROM TABLE(tmp_package.get_interesting_names(1));

感谢阅读问题并运行测试并帮助我理解执行计划的响应者,我将按照他们的建议处理这个问题。

最佳答案

后续评论,因为这太大了。在 11.2.0.3 (OEL) 下运行,您的查询:

SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;

NAME NUM
-------------------- ----------
Bravo 11

Elapsed: 00:00:00.094
Plan hash value: 1058933870

----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 4 (25)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | VIEW | | 3 | 75 | 4 (25)| 00:00:01 |
| 3 | SORT ORDER BY | | 3 | 75 | 4 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TMP_TABLE | 3 | 75 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
2 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("NUM")),'(.)\1'))

Note
-----
- dynamic sampling used for this statement (level=2)

请注意 SORT ORDER BY 中的变化从你看到的,以及相应的 rows值。将 order-by 移动到子选择中看起来更像你的:
SELECT * FROM (
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
ORDER BY p.name
)
WHERE REGEXP_LIKE(tmp_fib(num), '(.)\1')
)
WHERE ROWNUM <= 1;

NAME NUM
-------------------- ----------
Bravo 11

Elapsed: 00:00:07.894
Plan hash value: 548796432

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 171 (99)| 00:00:03 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 25 | 171 (99)| 00:00:03 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 25 | 171 (99)| 00:00:03 |
|* 4 | TABLE ACCESS FULL | TMP_TABLE | 1 | 25 | 170 (99)| 00:00:03 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - filter(ROWNUM<=1)
4 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))

Note
-----
- dynamic sampling used for this statement (level=2)

不知道这在您的实际场景中会有多大帮助或实用性,但在这种情况下(无论如何,在我的环境中),在所有获取的列中添加索引 - 以获得完整索引扫描而不是完整表扫描 - 似乎改变行为:
CREATE INDEX tmp_index ON tmp_table(name, num);

index TMP_INDEX created.

SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;

NAME NUM
-------------------- ----------
Bravo 11

Elapsed: 00:00:00.093
Plan hash value: 1841475998

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 1 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | VIEW | | 3 | 75 | 1 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN| TMP_INDEX | 3 | 75 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
2 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("NUM")),'(.)\1'))

Note
-----
- dynamic sampling used for this statement (level=2)

SELECT * FROM (
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
ORDER BY p.name
)
WHERE REGEXP_LIKE(tmp_fib(num), '(.)\1')
)
WHERE ROWNUM <= 1;

NAME NUM
-------------------- ----------
Bravo 11

Elapsed: 00:00:00.093
Plan hash value: 1841475998

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 1 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 25 | 1 (0)| 00:00:01 |
|* 3 | INDEX FULL SCAN| TMP_INDEX | 1 | 25 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))

Note
-----
- dynamic sampling used for this statement (level=2)

顺便说一句,在我用任何一个 rownum 跑了几次之后我最终开始得到的变体 ORA-01000: maximum open cursors exceeded错误。我在每次运行结束时放下对象但保持连接。我认为这表明某处存在另一个错误,尽管可能与您看到的内容无关,因为即使使用索引扫描也会发生。

关于performance - 尽管进行了 STOPKEY 优化,但前 N 个查询还是做了太多工作,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16679808/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com