gpt4 book ai didi

Oracle - 优化查询、大型数据库表、CLOB 字段

转载 作者:行者123 更新时间:2023-12-04 02:49:21 25 4
gpt4 key购买 nike

因此,我一直在为此绞尽脑汁,但不可否认,我对 Oracle 的了解不是很好。我们有一个表,其中包含大约 6000 万条记录,其中存储了建筑物的值。在我认为合适的地方添加了适当的索引,但性能仍然很差。这是应该有帮助的查询:

  SELECT count(*)
FROM viewBuildings
INNER JOIN tblValues
ON viewBuildings.bldg_id = tblValues.bldg_id
WHERE bldg_deleted = 0
AND (bldg_summary = 1
OR (bldg_root = 0 AND bldg_def = 0)
OR bldg_parent = 1)
AND field_id IN (207)
AND UPPER(dbms_lob.substr(v_value, 2000, 1)) = UPPER('2320')

因此,以上只是可以构造的查询的一个示例。它在 v_value CLOB 字段的 tblValues 中查找“2320”的匹配项。它是大写的,因为它可以搜索数字和文本值。 tblValues 有 6000 万条记录。它由建筑物 ID 和字段 ID 索引。

我可能需要提供更多信息,但就统计数据而言,跳到我面前的数字是“一致获取”。 Consistent gets = 74069。这是一个很大的数字吗?

任何建议都很好,主要是在处理大型数据库表上的 CLOB 字段时。无法使用上下文类型索引,因为我需要精确匹配,并且要查找的数据可以是数字或字符串。

编辑(更多信息):tblBuildings 是 viewBuildings(一个 View )的一部分,有 80,000 条记录tblValues 有每个建筑物的值(value),有 68,000,000 条记录tblValues 每个建筑有大约 550 个字段 (field_id)

期望的结果:查询在 < 5 秒内返回结果。这不合理吗?有时它会无限期地运行,有时可能是 80 秒。

解释计划结果

Plan hash value: 1480138519
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------------|
| 0 | SELECT STATEMENT | | 1 | 192 | 32 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 192 | | |
| 2 | NESTED LOOPS | | 1 | 192 | 15 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 183 | 12 (0)| 00:00:01 |
|* 4 | FILTER | | | | | |
| 5 | NESTED LOOPS OUTER | | 1 | 64 | 10 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID | TBLBUILDINGS | 1 | 60 | 9 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | SAA_4 | 17 | | 3 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | 1 | 21 | 3 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID| TBLBUILDINGSTATUSES | 1 | 15 | 2 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | IDX_BUILDINGSTATUS_EXCLUDEQUERY | 1 | | 1 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | IDX_BUILDING_STATUS_ASID_DELETED | 1 | 6 | 1 (0)| 00:00:01 |
| 12 | TABLE ACCESS BY INDEX ROWID | TBLBUILDINGSTATUSES | 1 | 4 | 1 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | PK_TBLBUILDINGSTATUS | 1 | | 0 (0)| 00:00:01 |
|* 14 | TABLE ACCESS BY INDEX ROWID | TBLVALUES | 1 | 119 | 2 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_SAA_6 | 1 | | 1 (0)| 00:00:01 |
| 16 | INLIST ITERATOR | | | | | |
|* 17 | INDEX RANGE SCAN | SAA_7 | 1 | 9 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

4 - filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS" IS NULL OR
"TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS"=0)
6 - filter("TBLBUILDINGS"."BLDG_SUMMARY"=1 OR "TBLBUILDINGS"."BLDG_SUB_BUILDING_PARENT"=1 OR
"TBLBUILDINGS"."BLDG_BUILDING_DEF"=0 AND "TBLBUILDINGS"."BLDG_ROOT"=0)
7 - access("TBLBUILDINGS"."BLDG_DELETED"=0)
filter( NOT EXISTS (SELECT 0 FROM "TBLBUILDINGSTATUSES" "TBLBUILDINGSTATUSES","TBLBUILDINGS" "TBLBUILDINGS" WHERE
"TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID" AND
"TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1))
10 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1)
11 - access("TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
13 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"(+)="TBLBUILDINGS"."BUILDING_STATUS_ID")
14 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLVALUES"."V_VALUE",2000,1))=U'2320')
15 - access("TBLVALUES"."FE_ID"=207 AND "TBLBUILDINGS"."BLDG_ID"="TBLVALUES"."BLDG_ID")
17 - access("TBLINSPECTORBUILDINGMAP"."IN_ID"=1 AND ("TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=0 OR
"TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=1) AND "TBLBUILDINGS"."BLDG_ID"="TBLINSPECTORBUILDINGMAP"."BLDG_ID")

44 rows selected

Plan hash value: 2137789089

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| DISPLAY | 8168 | 16336 | 29 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

好的,我按照您的建议收集了统计数据,然后这是 plan_table_output。看起来 IDX_CURVAL_FE_ID 是这里的问题?这是字段 ID 值表上的索引。

SQL_ID  d4aq8nsr1p6uw, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ count(*) FROM
viewAssetsForUser1 INNER JOIN tblCurrentValues ON
viewAssetsForUser1.as_id = tblCurrentValues.as_id WHERE as_deleted =
:"SYS_B_0" AND (as_summary = :"SYS_B_1" OR (as_root =
:"SYS_B_2" AND as_asset_def = :"SYS_B_3") OR
as_sub_asset_parent = :"SYS_B_4") AND fe_id IN (:"SYS_B_5")
AND UPPER(dbms_lob.substr(cv_value, :"SYS_B_6", :"SYS_B_7")) =
UPPER(:"SYS_B_8")

Plan hash value: 4033422776

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:08:43.19 | 56589 | 56084 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:08:43.19 | 56589 | 56084 | | | |
|* 2 | FILTER | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
| 3 | NESTED LOOPS | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
| 4 | NESTED LOOPS | | 1 | 115 | 0 |00:08:43.19 | 56589 | 56084 | | | |
|* 5 | FILTER | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
|* 6 | HASH JOIN RIGHT OUTER | | 1 | 82 | 0 |00:08:43.19 | 56589 | 56084 | 1348K| 1348K| 742K (0)|
| 7 | TABLE ACCESS FULL | TBLASSETSTATUSES | 1 | 4 | 4 |00:00:00.01 | 3 | 0 | | | |
| 8 | NESTED LOOPS | | 1 | | 0 |00:08:43.19 | 56586 | 56084 | | | |
| 9 | NESTED LOOPS | | 1 | 163 | 0 |00:08:43.19 | 56586 | 56084 | | | |
|* 10 | TABLE ACCESS BY INDEX ROWID | TBLCURRENTVALUES | 1 | 163 | 0 |00:08:43.19 | 56586 | 56084 | | | |
|* 11 | INDEX RANGE SCAN | IDX_CURVAL_FE_ID | 1 | 16283 | 61357 |00:00:05.98 | 132 | 132 | | | |
|* 12 | INDEX RANGE SCAN | SAA_1 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 13 | TABLE ACCESS BY INDEX ROWID | TBLASSETS | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 14 | INDEX UNIQUE SCAN | PK_TBLINSPECTORBRIDGEMAP2 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 15 | TABLE ACCESS BY GLOBAL INDEX ROWID| TBLINSPECTORASSETMAP | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(:SYS_B_0=0)
5 - filter(("TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS" IS NULL OR "TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS"=0))
6 - access("TBLASSETSTATUSES"."ASSET_STATUS_ID"="TBLASSETS"."ASSET_STATUS_ID")
10 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLCURRENTVALUES"."CV_VALUE",:SYS_B_6,:SYS_B_7))=SYS_OP_C2C(UPPER(:SYS_B_8)))
11 - access("TBLCURRENTVALUES"."FE_ID"=:SYS_B_5)
12 - access("TBLASSETS"."AS_DELETED"=:SYS_B_0 AND "TBLASSETS"."AS_ID"="TBLCURRENTVALUES"."AS_ID")
13 - filter((("TBLASSETS"."AS_ROOT"=:SYS_B_2 AND "TBLASSETS"."AS_ASSET_DEF"=:SYS_B_3) OR "TBLASSETS"."AS_SUMMARY"=:SYS_B_1 OR
"TBLASSETS"."AS_SUB_ASSET_PARENT"=:SYS_B_4))
14 - access("TBLASSETS"."AS_ID"="TBLINSPECTORASSETMAP"."AS_ID" AND "TBLINSPECTORASSETMAP"."IN_ID"=1)
15 - filter(("TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=0 OR "TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=1))

最佳答案

糟糕的索引成本 如果统计数据是最新的,并且优化器有相对较好的基数估计,为什么它会选择一个糟糕的计划?也许有一个参数使索引看起来人为便宜。看看:select * from v$parameter where name in ('optimizer_index_cost_adj', 'optimizer_index_caching'); 它们与默认值 100 和 0 有明显不同吗?

此外,请查看 select * from sys.aux_stats$; 也许您的系统统计信息使全表扫描看起来过于昂贵。某些版本的 Oracle 存在工作负载统计信息错误,其中的数字有几个数量级的错误。

或者您的表可能非常大,16K 索引读取是最佳访问路径。查看 DBA_SEGMENTS.BYTES 以找到您的表和 LOB 段的大小。

即使表是中型的,并且计划更改为全表扫描,也可能不会将运行时间控制在 5 秒以内。但是结合你的分区想法,可能就够了。

LOB 存储 根据您的示例,我假设大多数 CLOB 都相对较小?也许您有一个不寻常的 LOB 设置浪费了大量空间,例如 DISABLE STORAGE IN ROW。您可能想检查您的表 DDL,或将其全部张贴在这里。或者,如果您可以将 CLOB 替换为 VARCHAR2,那就更好了。

FBI CLOB 上基于函数的索引可能会显着加快处理速度。但它可能是一个非常大的索引:create index TBLCURRENTVALUES_FBI on TBLCURRENTVALUES(UPPER(dbms_lob.substr(v_value, 2000, 1)));

CURSOR_SHARING 查询发生了一些变化,这使得调整变得困难。看起来这个最新版本有 CURSOR_SHARING=FORCE,这很不寻常。对于昂贵的查询,使用文字可能是一件好事 - 构建查询计划所花费的额外时间可能是值得的。如果系统参数无法更改,请查看提示 /*+ cursor_sharing_exact */

关于Oracle - 优化查询、大型数据库表、CLOB 字段,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/18116398/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com