
algorithm - Postgres directed graph traversal up and down

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:14:53

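I have a problem that I can solve for small datasets but not for large datasets with (possibly) unclean data.

The database is an implementation of an acyclic (hopefully) graph in PostgreSQL, using three tables: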

vertex_elements: id
edges: id, parent_id, child_id
element_associations: id, user_id, object_id (both are vertex elements, but in unconnected graphs)
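
For reference, the three tables above could be declared roughly as follows. This is only a sketch: the column types, the foreign keys, and the index names are assumptions, since the question lists nothing beyond the column names.

```sql
-- Sketch only: types, keys and index names are assumed, not taken from the question.
CREATE TABLE vertex_elements (
    id bigint PRIMARY KEY
);

CREATE TABLE edges (
    id        bigint PRIMARY KEY,
    parent_id bigint NOT NULL REFERENCES vertex_elements (id),
    child_id  bigint NOT NULL REFERENCES vertex_elements (id)
);

CREATE TABLE element_associations (
    id        bigint PRIMARY KEY,
    user_id   bigint NOT NULL REFERENCES vertex_elements (id),
    object_id bigint NOT NULL REFERENCES vertex_elements (id)
);

-- Downward traversal looks edges up by parent_id, upward traversal by child_id,
-- so an index on each side is likely to help.
CREATE INDEX edges_parent_id_idx ON edges (parent_id);
CREATE INDEX edges_child_id_idx ON edges (child_id);
CREATE INDEX element_associations_object_id_idx ON element_associations (object_id);
```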

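I have a set of user_ids, from which I derive element_associations, and a starting vertex_element in the graph. I want to find all child nodes that are accessible through an element_association with one of those user_ids. A node is considered accessible if it, or one of its ancestors, is one of the candidate object_ids of those element_associations.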
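
Deriving the candidate element_associations from the user_ids might look like the sketch below. The `$1` parameter (an array of user ids) is an assumption made for illustration; the queries in the question pass the resulting ids through `IN (?)` placeholders instead.

```sql
-- $1 is a hypothetical parameter: an array of user ids, e.g. '{1,2,3}'::bigint[].
SELECT ea.id, ea.object_id
FROM element_associations ea
WHERE ea.user_id = ANY ($1);
```

The returned `id` values are the candidate list used in the traversal, and the `object_id` values are the vertices that grant access to themselves and to everything below them.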

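The graph is roughly triangular in shape (few root nodes, many leaf nodes), and starting from a vertex element my strategy is as follows:

  1. Check the current vertex_element against the list of candidate element_associations; if it matches, all descendants are accessible, otherwise go to...
  2. Check whether an ancestor of the current vertex_element is in the list of candidate element_associations. As in (1), if there is a hit, all descendants are accessible, otherwise go to...
  3. Traverse each child vertex_element (breadth-first search) and apply steps 1 and 2 to it.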

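The problem arises when I want to avoid re-checking the same ancestor vertex_elements. The main query is the downward traversal, which checks the accessibility of each descendant against a set of candidate element_associations:
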
    WITH RECURSIVE edges_recursive(child_id, parent_id, matching_element_association_id) AS (
        (
            -- Base case: direct children of the starting vertex.
            SELECT e1.child_id, e1.parent_id, ea.id
            FROM edges e1
            LEFT OUTER JOIN element_associations ea
                ON e1.child_id = ea.object_id
                AND ea.id IN (?)
            WHERE e1.parent_id = ?
        )
        UNION
        (
            -- Recursive case: descend only below rows that have no match yet.
            -- (The CTE was referenced as assignments_recursive here, a name that
            -- does not exist; the association join now tests the new child, e2.)
            SELECT e2.child_id, e2.parent_id, ea.id
            FROM edges e2
            INNER JOIN edges_recursive
                ON edges_recursive.child_id = e2.parent_id
            LEFT OUTER JOIN element_associations ea
                ON e2.child_id = ea.object_id
                AND ea.id IN (?)
            WHERE edges_recursive.matching_element_association_id IS NULL
        )
    )
    SELECT edges_recursive.child_id
    FROM edges_recursive
    WHERE edges_recursive.matching_element_association_id IS NOT NULL

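However, there is an additional recursive subquery inside the LEFT OUTER JOIN element_associations condition that checks each vertex_element by walking up its ancestors. It looks like: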

    ea.id IN (
        WITH RECURSIVE parent_edges_recursive(child_id, parent_id, matching_element_association_id) AS (
            (
                -- Base case: edges leading up from the parent of the current edge (e1).
                SELECT edges.child_id, edges.parent_id, ea.id
                FROM edges
                LEFT OUTER JOIN element_associations ea
                    ON ea.id IN (?) AND edges.parent_id = ea.object_id
                WHERE edges.child_id = e1.parent_id AND edges.parent_id != e1.parent_id
            )
            UNION
            (
                -- Recursive case: keep climbing while no ancestor has matched.
                SELECT edges.child_id, edges.parent_id, ea.id
                FROM edges
                JOIN parent_edges_recursive
                    ON parent_edges_recursive.parent_id = edges.child_id
                LEFT OUTER JOIN element_associations ea
                    ON ea.id IN (?) AND edges.parent_id = ea.object_id
                WHERE parent_edges_recursive.matching_element_association_id IS NULL
            )
        )
        SELECT parent_edges_recursive.matching_element_association_id
        FROM parent_edges_recursive
        WHERE parent_edges_recursive.matching_element_association_id IS NOT NULL
        LIMIT 1
    )

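The problem is that the subquery tends to avoid traversing the same parent vertex twice; however, there is no guarantee that, as we traverse down through the graph's descendants, we will not re-read ancestors that were already evaluated. For small datasets this is fine and the performance is acceptable; however, it scales laughably badly and is not at all resilient to cycles.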
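
On the cycle problem specifically, PostgreSQL 14 and later offer a `CYCLE` clause for recursive queries. The sketch below applies it to a plain downward traversal; the CTE name `descendants` and the `$1` parameter (the starting vertex id) are assumptions made for illustration, and the accessibility check is left out to keep the example short.

```sql
-- Requires PostgreSQL 14 or later.
-- $1 is a hypothetical parameter: the id of the starting vertex.
WITH RECURSIVE descendants(child_id, parent_id) AS (
    SELECT e.child_id, e.parent_id
    FROM edges e
    WHERE e.parent_id = $1
  UNION ALL
    SELECT e.child_id, e.parent_id
    FROM edges e
    JOIN descendants d ON d.child_id = e.parent_id
) CYCLE child_id SET is_cycle USING path
-- A row is flagged once its child_id already appears on its own path,
-- and flagged rows are not expanded any further.
SELECT DISTINCT child_id
FROM descendants
WHERE NOT is_cycle;
```

This only guarantees termination on dirty data. The tracking is per path, so a vertex that is reachable along several different paths is still visited once per path.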

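What I need to do is preserve the information about which parent vertex_elements I have already traversed between subqueries, so that I avoid retreading steps; however, I am stuck on how to do this in a single query.

Best answer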

What I need to do is preserve the information about what parent vertex_elements I have already traversed between subqueries so that I avoid retreading steps;

Without studying your query in detail: you can do that by collecting IDs in an array.
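
A minimal sketch of the array idea, applied to the upward (ancestor) walk, might look like the following. It is not the answer's original code; the CTE name `ancestors`, the `path` column, and the parameters `$1` (starting vertex id) and `$2` (array of candidate element_association ids) are all assumptions made for illustration.

```sql
-- $1: hypothetical start vertex id
-- $2: hypothetical array of candidate element_association ids
WITH RECURSIVE ancestors(vertex_id, path, matching_element_association_id) AS (
    -- Base case: direct parents of the start vertex; path records visited ids.
    SELECT e.parent_id,
           ARRAY[e.child_id, e.parent_id],
           ea.id
    FROM edges e
    LEFT JOIN element_associations ea
           ON ea.object_id = e.parent_id
          AND ea.id = ANY ($2)
    WHERE e.child_id = $1
      AND e.parent_id <> e.child_id
  UNION ALL
    -- Recursive case: climb one level, skipping any parent already on this path.
    SELECT e.parent_id,
           a.path || e.parent_id,
           ea.id
    FROM ancestors a
    JOIN edges e ON e.child_id = a.vertex_id
    LEFT JOIN element_associations ea
           ON ea.object_id = e.parent_id
          AND ea.id = ANY ($2)
    WHERE a.matching_element_association_id IS NULL
      AND e.parent_id <> ALL (a.path)
)
SELECT matching_element_association_id
FROM ancestors
WHERE matching_element_association_id IS NOT NULL
LIMIT 1;
```

Each row carries the ids seen on its own branch, so a vertex is never revisited along the same path and a cycle cannot loop forever. The array is not shared between branches, however, so an ancestor reachable through two different parents can still be evaluated twice.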

Regarding algorithm - Postgres directed graph traversal up and down, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48697763/
