gpt4 book ai didi

sql - 如何优化此 LIKE JOIN 查询?

转载 作者:行者123 更新时间:2023-12-03 16:21:00 25 4
gpt4 key购买 nike

此查询查找域的后缀:

        SELECT
DISTINCT ON ("companyDomain".id)
"companyDomain".domain,
"publicSuffix".suffix
FROM
"companyDomain"
INNER JOIN
"publicSuffix"
ON
REVERSE("companyDomain".domain) LIKE REVERSE("publicSuffix".suffix) || '%'
ORDER BY
"companyDomain".id, LENGTH("publicSuffix".suffix) DESC
编辑:注意这也适用于子域。
你可以摆弄这个例子 here并使用 pev 可视化计划.我已经尝试向表中添加覆盖索引,但它们最终没有被查询规划器使用。也许还有另一个查询可能更有效?

最佳答案

您是否考虑过使用 gin指数 ?
我对您的示例 DML 进行了以下修改:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
...
CREATE INDEX companyDomain_domain_reverse ON "companyDomain" USING gin (REVERSE(domain) gin_trgm_ops);
...
CREATE INDEX publicSuffix_suffix_reverse ON "publicSuffix" USING gin (REVERSE(suffix) gin_trgm_ops);
这是查询计划:
+--------------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------+
|Unique (cost=40802.07..41004.44 rows=908 width=31) (actual time=98.229..98.356 rows=908 loops=1) |
| -> Sort (cost=40802.07..40903.26 rows=40474 width=31) (actual time=98.228..98.264 rows=1006 loops=1) |
| Sort Key: "companyDomain".id, (length(("publicSuffix".suffix)::text)) DESC |
| Sort Method: quicksort Memory: 103kB |
| -> Nested Loop (cost=0.05..37704.86 rows=40474 width=31) (actual time=1.655..97.976 rows=1006 loops=1) |
| -> Seq Scan on "publicSuffix" (cost=0.00..151.15 rows=8915 width=12) (actual time=0.011..0.728 rows=8915 loops=1) |
| -> Bitmap Heap Scan on "companyDomain" (cost=0.05..4.15 rows=5 width=15) (actual time=0.010..0.010 rows=0 loops=8915) |
| Recheck Cond: (reverse((domain)::text) ~~ (reverse(("publicSuffix".suffix)::text) || '%'::text)) |
| Rows Removed by Index Recheck: 0 |
| Heap Blocks: exact=301 |
| -> Bitmap Index Scan on companydomain_domain_reverse (cost=0.00..0.05 rows=5 width=0) (actual time=0.010..0.010 rows=0 loops=8915)|
| Index Cond: (reverse((domain)::text) ~~ (reverse(("publicSuffix".suffix)::text) || '%'::text)) |
|Planning Time: 0.150 ms |
|Execution Time: 98.439 ms |
+--------------------------------------------------------------------------------------------------------------------------------------------------------+

作为奖励 - 您甚至不需要 REVERSE()索引和查询中的文本:
create index companydomain_domain
on "companyDomain" using gin(domain gin_trgm_ops);



SELECT DISTINCT ON ("companyDomain".id) "companyDomain".domain, "publicSuffix".suffix
FROM "companyDomain"
INNER JOIN "publicSuffix" ON "companyDomain".domain LIKE '%' || "publicSuffix".suffix
ORDER BY "companyDomain".id, LENGTH("publicSuffix".suffix) DESC
查询花费相同的时间并且仍然使用 gin 索引:
+------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------------+
|Unique (cost=40556.91..40759.28 rows=908 width=31) (actual time=96.170..96.315 rows=908 loops=1) |
| -> Sort (cost=40556.91..40658.10 rows=40474 width=31) (actual time=96.169..96.209 rows=1006 loops=1) |
| Sort Key: "companyDomain".id, (length(("publicSuffix".suffix)::text)) DESC |
| Sort Method: quicksort Memory: 103kB |
| -> Nested Loop (cost=0.05..37459.70 rows=40474 width=31) (actual time=1.764..95.919 rows=1006 loops=1) |
| -> Seq Scan on "publicSuffix" (cost=0.00..151.15 rows=8915 width=12) (actual time=0.009..0.711 rows=8915 loops=1) |
| -> Bitmap Heap Scan on "companyDomain" (cost=0.05..4.12 rows=5 width=15) (actual time=0.010..0.010 rows=0 loops=8915) |
| Recheck Cond: ((domain)::text ~~ ('%'::text || ("publicSuffix".suffix)::text)) |
| Rows Removed by Index Recheck: 0 |
| Heap Blocks: exact=301 |
| -> Bitmap Index Scan on companydomain_domain (cost=0.00..0.05 rows=5 width=0) (actual time=0.010..0.010 rows=0 loops=8915)|
| Index Cond: ((domain)::text ~~ ('%'::text || ("publicSuffix".suffix)::text)) |
|Planning Time: 0.132 ms |
|Execution Time: 96.393 ms |
+------------------------------------------------------------------------------------------------------------------------------------------------+

PS:我猜你只需要一个索引 - 在这种情况下: companyDomain_domain_reverse

关于sql - 如何优化此 LIKE JOIN 查询?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62485496/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com