
PostgreSQL: quickly check whether all elements of LTREE[] <@ LTREE[]

Reposted. Author: 行者123. Updated: 2023-11-29 13:45:14

I have the following tables (simplified):

CREATE TABLE groups
( id PRIMARY KEY,
path ltree,
...
);

CREATE TABLE items
( id bigserial,
path ltree,
...
PRIMARY KEY (id, path)
);
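
The plan shown later in the question refers to an index named idx_groups__path__gist, so a GiST index on groups.path presumably exists. A minimal sketch of that setup, assuming the ltree extension is available; the index name on items is hypothetical and does not appear in the question:

```sql
-- ltree ships as a contrib extension and must be enabled once per database.
CREATE EXTENSION IF NOT EXISTS ltree;

-- GiST indexes support the ltree operators <@, @> and ~ (lquery match).
CREATE INDEX idx_groups__path__gist ON groups USING gist (path);
CREATE INDEX idx_items__path__gist  ON items  USING gist (path);  -- hypothetical name
```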

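For each item there is also a list of the groups the item belongs to. A group is represented by its full path. There may be up to 10 million items, each belonging to about 20 groups.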

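I need to design the following query. Given (a) a "parent" group and (b) a list of up to 10 additional groups, find the direct descendants of the "parent" group whose subtrees contain at least one item that belongs to every group in the search criteria.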

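For example, given the parent group "NorthAmerica.USA" and the additional groups ["CandyLovers.ChocolateLovers", "Athletes.Footballers"], "NorthAmerica.USA.CA" would be a valid result if there were an item such as "George" belonging to ["NorthAmerica.USA.CA.LosAngeles", "Athletes.Footballers", "CandyLovers.ChocolateLovers.ChocolateDonutLovers"].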
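
The example can be expressed with ltree operators roughly as follows. This is only an illustrative sketch, assuming the ltree extension is installed: `<@` tests "is a descendant of, or equal to", and combining `subpath` with `nlevel` is one way to derive the direct child of the parent under which an item sits.

```sql
SELECT
    -- George's location lies in the parent's subtree.
    'NorthAmerica.USA.CA.LosAngeles'::ltree <@ 'NorthAmerica.USA'::ltree
        AS in_parent_subtree,                -- true
    -- A more specific group still satisfies the broader criterion.
    'CandyLovers.ChocolateLovers.ChocolateDonutLovers'::ltree
        <@ 'CandyLovers.ChocolateLovers'::ltree
        AS matches_candy_criterion,          -- true
    -- Keep the parent's labels plus one more to get the direct child.
    subpath('NorthAmerica.USA.CA.LosAngeles'::ltree, 0,
            nlevel('NorthAmerica.USA'::ltree) + 1)
        AS direct_child;                     -- NorthAmerica.USA.CA
```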

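I have tried writing the query in several different ways, but they all scale poorly: with 1 million items and 3-4 paths in the search criteria, it takes minutes to return a result. For example: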

EXPLAIN ANALYZE
SELECT *
FROM groups
WHERE path ~ CAST('1.2.22' || '.*{1}' AS lquery)
  AND EXISTS (
    SELECT 1
    FROM (
      SELECT array_agg(DISTINCT leaf_paths_sans_result_path.path) AS paths_of_a_match,
             max(path_count) AS path_count
      FROM items,
           (SELECT path,
                   count(*) OVER () AS path_count
            FROM (VALUES (groups.path), ('1.3'), ('1.4')) t (path)
           ) leaf_paths_sans_result_path
      WHERE 1 = 1
        AND items.path <@ leaf_paths_sans_result_path.path
      GROUP BY id
    ) items_by_id
    WHERE cardinality(paths_of_a_match) = path_count
  );

The result is as follows:

Index Scan using idx_groups__path__gist on groups  (cost=0.28..37013.74 rows=38 width=469) (actual time=11.735..322285.421 rows=950 loops=1)
  Index Cond: (path ~ '1.2.22.*{1}'::lquery)
  Filter: (SubPlan 1)
  Rows Removed by Filter: 3
  SubPlan 1
    ->  Subquery Scan on items_by_id (cost=0.55..1809359.86 rows=3752 width=0) (actual time=338.162..338.162 rows=1 loops=953)
          ->  GroupAggregate (cost=0.55..1809322.34 rows=3752 width=65) (actual time=338.159..338.159 rows=1 loops=953)
                Group Key: ibt.id
                Filter: (cardinality(array_agg(DISTINCT "*VALUES*".column1)) >= max(3))
                Rows Removed by Filter: 7845
                ->  Nested Loop (cost=0.55..1809228.54 rows=3752 width=65) (actual time=0.044..307.087 rows=20423 loops=953)
                      Join Filter: (ibt.path <@ "*VALUES*".column1)
                      Rows Removed by Join Filter: 651228
                      ->  Index Scan using idx_items__id on items (cost=0.55..1752954.06 rows=1250543 width=193) (actual time=0.007..110.517 rows=223884 loops=953)
                      ->  Materialize (cost=0.00..0.05 rows=3 width=32) (actual time=0.000..0.000 rows=3 loops=213361141)
                            ->  Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=32) (actual time=0.002..0.003 rows=3 loops=953)
Planning time: 3.151 ms
Execution time: 322286.404 ms
(18 rows)
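
One way to read this plan: SubPlan 1 is executed once per candidate group (loops=953, that is, the 950 returned rows plus the 3 removed by the filter), and each execution takes about 338 ms because it scans and aggregates the items rows again. A quick arithmetic check, using only figures reported in the plan, suggests the subplan accounts for nearly all of the runtime:

```sql
-- Per-loop subplan time (ms) multiplied by the number of loops.
SELECT 338.162 * 953 AS subplan_total_ms;
-- 322268.386, against a reported execution time of 322286.404 ms
```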

I can change the data model as needed to optimize this query. I am running PostgreSQL v9.5.

Many thanks, and apologies for the messy question.

Best answer

It looks like you are using the ltree module? The following query avoids the intermediate array_agg array:

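select  g.*     -- g.id is the primary key, so g.* is allowed with "group by g.id"
from    items i
join    groups g
on      i.groups = g.id     -- assumes items has a groups column referencing groups.id
where   g.path ~ '1.2.22.*'
  and   (
            i.path ~ '*.1.3.*' or
            i.path ~ '*.1.4.*'
        )
group by
        g.id
having  count(distinct
            case
            when i.path ~ '*.1.3.*' then 1
            when i.path ~ '*.1.4.*' then 2
            end) = 2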

The count(distinct ...) construct asserts that both conditions are met, rather than merely that two rows match the same pattern.
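
For comparison, here is a sketch that stays within the schema shown in the question, where items holds one row per (id, path) membership. It is not part of the answer above and has not been benchmarked; the parent '1.2.22' and the additional groups '1.3' and '1.4' are taken from the question's example query. Each additional group becomes its own EXISTS probe on the same item id, so no per-item array needs to be aggregated.

```sql
SELECT g.*
FROM groups g
WHERE g.path ~ '1.2.22.*{1}'::lquery          -- direct children of the parent only
  AND EXISTS (
        SELECT 1
        FROM items i
        WHERE i.path <@ g.path                -- item sits somewhere in g's subtree
          -- the same item must also belong to every additional group
          AND EXISTS (SELECT 1 FROM items a
                      WHERE a.id = i.id AND a.path <@ '1.3'::ltree)
          AND EXISTS (SELECT 1 FROM items b
                      WHERE b.id = i.id AND b.path <@ '1.4'::ltree)
      );
```

With up to 10 additional groups the query text would need one EXISTS clause per group, which is easiest to generate in application code.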

Regarding "PostgreSQL: quickly check whether all elements of LTREE[] <@ LTREE[]", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49485196/
