
sql - Rewriting ActiveRecord queries as recursive SQL

Reposted. Author: 行者123. Updated: 2023-12-04 02:37:47

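I have a tree-like ActiveRecord structure with a self-referencing object: an object can be the parent or the child of another object of the same class. I need a way to map this structure efficiently in code. So far I have been doing it in Ruby through the ActiveRecord ORM, and it is very inefficient.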

Here is what the pod.rb model looks like:

    has_many :pod_parents, class_name: "PodPod", dependent: :delete_all
    has_many :parents, through: :pod_parents, foreign_key: "parent_id", source: "parent"
    has_many :pod_children, class_name: "PodPod", foreign_key: "parent_id"
    has_many :children, through: :pod_children, source: "pod"

    scope :active, -> {
      where(pod_state: "active").where(pod_type: ["standard", "readonly"])
    }

Here is the relevant database schema:

    table "pods"
      t.string "intention"
      t.integer "user_id"
      t.string "slug"
      t.string "url_handle"
      t.index ["slug"], name: "index_pods_on_slug"
      t.index ["url_handle"], name: "index_pods_on_url_handle"

    table "pod_pods"
      t.integer "parent_id"
      t.integer "pod_id"
      t.index ["parent_id", "pod_id"], name: "index_pod_pods_on_parent_id_and_pod_id", unique: true
      t.index ["parent_id"], name: "index_pod_pods_on_parent_id"
      t.index ["pod_id"], name: "index_pod_pods_on_pod_id"

Here are the specific functions I am trying to optimize:

    def get_all_parents
      parents = []
      self.parents.active.each do |parent|
        parents << parent
        parents.concat(parent.get_all_parents)
      end
      return parents
    end

    def get_all_children
      children = []
      self.children.each do |child|
        children << child
        children.concat(child.get_all_children)
      end
      return children
    end

    def get_all_parents_and_children
      pod_array = self.get_all_parents
      pod_array.concat(self.get_all_children)
      return pod_array
    end

    def get_all_relations(inclusive = false)
      circles_array = self.get_all_parents
      circles_array.each do |parent|
        circles_array = circles_array.concat(parent.get_all_children)
      end
      circles_array = circles_array.concat(self.get_all_children)
      unique_ids = circles_array.compact.map(&:id).uniq - [self.id]
      circles = Pod.where(id: unique_ids)
    end

From what I have researched, Postgres supports recursive SQL queries. I have been using these articles for guidance: 1, 2.

Here is what I have so far:

    def get_all_parents2
      sql =
        <<-SQL
          WITH RECURSIVE pod_tree(id, path) AS (
            SELECT id, ARRAY[id]
            FROM pods
            WHERE id = #{self.id}
            UNION ALL
            SELECT pods.id, path
            FROM pod_tree
            JOIN pods ON pods.id=pod_tree.id
            JOIN pod_pods ON pod_pods.parent_id = pods.id
            WHERE NOT pods.id = ANY(path)
          )
          SELECT * FROM pod_tree
          ORDER BY path;
        SQL
      sql.chomp
      Pod.find_by_sql(sql)
    end

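My SQL is not particularly good, and I don't know how to navigate up and down the tree structure in order to rewrite the functions above as recursive SQL. I would appreciate any help with this. Thank you.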

Best answer

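What you are trying to accomplish is definitely possible with recursive CTEs. I will cover the first two scenarios you have, since the other two are just extensions of the first two.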

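In all SQL examples, I use id 1 to stand for the value you would substitute at the model level. Since you wrote that query, I will assume you have some familiarity with recursive CTEs and have already attempted a solution.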
get_all_children
Let's take the get_all_children method first. It involves walking down the tree, covering the nodes we encounter level by level.

Since pod_pods contains all the information about the hierarchy, and no scope is involved when fetching children, we can recurse over pod_pods alone to get the children.

    -- Snippet #1
    WITH RECURSIVE pod_tree AS (
      SELECT pod_id -- Get the pod_id of the children of the base case node
      FROM pod_pods
      WHERE parent_id = 1 -- Base case
      UNION ALL -- Recurse on this and do a union with the previous step
      SELECT p.pod_id
      FROM pod_pods p
      INNER JOIN pod_tree ptree
        ON ptree.pod_id = p.parent_id -- Get the children nodes for nodes found at the previous recursion step.
    )

    SELECT * FROM pods
    WHERE id IN (SELECT DISTINCT(pod_id) FROM pod_tree);
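
To make the evaluation order of Snippet #1 concrete, here is a minimal Ruby sketch that mimics how a recursive CTE with UNION ALL is evaluated: the base case seeds a working set, and each step joins only the previous step's rows back onto the edge list. The `EDGES` data and the `descendant_ids` helper are hypothetical stand-ins for the pod_pods table, and the sketch assumes the graph has no cycles.

```ruby
# Hypothetical stand-in for pod_pods: each pair is [parent_id, pod_id].
EDGES = [[1, 2], [1, 3], [2, 4], [4, 5], [6, 7]].freeze

# Mimics Snippet #1 on an in-memory edge list (assumes no cycles).
def descendant_ids(edges, root_id)
  # Base case: direct children of the root.
  working = edges.select { |parent_id, _| parent_id == root_id }.map(&:last)
  result = []
  until working.empty?
    result.concat(working) # UNION ALL appends this step's rows
    # Recursive step: children of the rows found in the previous step only.
    working = working.flat_map do |id|
      edges.select { |parent_id, _| parent_id == id }.map(&:last)
    end
  end
  result.uniq # plays the role of DISTINCT in the outer query
end

p descendant_ids(EDGES, 1)
```

The loop ends when a step produces no new rows, which is also when the database stops recursing.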

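Your Ruby code does not cover the possibility of an infinite loop caused by a cycle. If one can occur, the way to handle it is to keep track of the ids you have already seen.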

    -- Snippet #2
    WITH RECURSIVE pod_tree(pod_id, rtree) AS ( -- Extra rtree parameter to keep track of visited nodes
      SELECT pod_id, ARRAY[pod_id] -- Make the base case array with pod_id
      FROM pod_pods
      WHERE parent_id = 1 -- Base case
      UNION ALL
      SELECT p.pod_id, rtree || p.pod_id -- Add the current pod_id to array
      FROM pod_pods p
      INNER JOIN pod_tree ptree
        ON ptree.pod_id = p.parent_id
      WHERE NOT (p.pod_id = ANY(rtree)) -- Exclude nodes which have already been seen
    )

    SELECT * FROM pods
    WHERE id IN (SELECT DISTINCT(pod_id) FROM pod_tree);
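
The cycle guard in Snippet #2 can be sketched the same way in plain Ruby. Each row carries the path that led to it (the rtree column), and a child is skipped when it already appears on that path. `CYCLIC_EDGES` and `descendant_rows` are hypothetical names used only for this illustration.

```ruby
# Hypothetical pod_pods rows as [parent_id, pod_id]; 2 -> 3 -> 2 forms a cycle.
CYCLIC_EDGES = [[1, 2], [2, 3], [3, 2]].freeze

# Mimics Snippet #2: every row is [pod_id, path], where path plays the role of rtree.
def descendant_rows(edges, root_id)
  # Base case: direct children, each with a one-element path.
  working = edges.select { |parent_id, _| parent_id == root_id }
                 .map { |_, pod_id| [pod_id, [pod_id]] }
  rows = []
  until working.empty?
    rows.concat(working)
    working = working.flat_map do |id, path|
      # Skip children already on this row's path (the NOT ... = ANY(rtree) check).
      edges.select { |parent_id, pod_id| parent_id == id && !path.include?(pod_id) }
           .map { |_, pod_id| [pod_id, path + [pod_id]] }
    end
  end
  rows
end

descendant_rows(CYCLIC_EDGES, 1).each { |id, path| puts "#{id} via #{path.inspect}" }
```

Without the path check, the 2 -> 3 -> 2 loop would keep producing rows and the loop would never finish.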

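If pod_pods can contain orphaned relations (rows that point to pods that no longer exist) and you want to ignore them, a join with pods is needed.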

    -- Snippet #3
    WITH RECURSIVE pod_tree(id, rtree) AS (
      SELECT p1.id, ARRAY[p1.id]
      FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.pod_id
      WHERE parent_id = 1
      UNION ALL
      SELECT p1.id, rtree || p1.id
      FROM pods p1
      INNER JOIN pod_pods p2 ON p1.id = p2.pod_id
      INNER JOIN pod_tree ptree ON p2.parent_id = ptree.id
      WHERE NOT (p1.id = ANY(ptree.rtree))
    )

    SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);

If you don't have orphaned links, my suggestion is to use Snippet #1 or #2, since they will be faster than #3, which involves an extra join.
get_all_parents
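To start with, for simplicity, let's apply the fields from the active scope afterwards. First we walk up the tree through the pod_pods table to collect all parent ids, then we apply the scope.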

    -- Snippet #4
    WITH RECURSIVE pod_tree AS (
      SELECT parent_id -- Get the parent_id of the parents of the base case node
      FROM pod_pods
      WHERE pod_id = 1 -- Base case
      UNION ALL -- Recurse on this and do a union with the previous step
      SELECT p.parent_id
      FROM pod_pods p
      INNER JOIN pod_tree ptree
        ON ptree.parent_id = p.pod_id -- Get the parent nodes for nodes found at the previous recursion step.
    )

    SELECT * FROM pods
    WHERE
      id IN (SELECT DISTINCT(parent_id) FROM pod_tree)
      AND pod_state = 'active'
      AND pod_type IN ('standard', 'readonly')
    ;

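However, this applies the active filter only after all nodes have been fetched. That may not be ideal, since it can walk more of the tree than needed and can even return parents of inactive nodes. To make it behave like the method in your Ruby code, we need to join with pods during the recursion. I have also added the infinite-recursion guard here, since you are now familiar with it.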

    -- Snippet #5
    WITH RECURSIVE pod_tree(id, rtree) AS (
      SELECT p1.id, ARRAY[p1.id]
      FROM pods p1
      INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      WHERE pod_id = 1
        AND p1.pod_state = 'active'
        AND p1.pod_type IN ('standard', 'readonly')
      UNION ALL
      SELECT p1.id, rtree || p1.id
      FROM pods p1
      INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
      WHERE p1.pod_state = 'active'
        AND p1.pod_type IN ('standard', 'readonly')
        AND NOT (p1.id = ANY(ptree.rtree))
    )

    SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);
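
The difference between Snippet #4 (filter after the walk) and Snippet #5 (filter during the walk) shows up as soon as an inactive pod sits between two active ones. The following plain-Ruby sketch uses hypothetical data (`PARENT_EDGES`, `ACTIVE_IDS`) and a hypothetical helper `ancestor_ids`. Its cycle guard is simplified to "skip ids already collected", which is not identical to the rtree check in the SQL.

```ruby
# Hypothetical data: pod 1's parent is 2 (inactive), whose parent is 3 (active).
PARENT_EDGES = [[2, 1], [3, 2]].freeze # [parent_id, pod_id]
ACTIVE_IDS = [3].freeze

# Walks upward. When a block is given, a parent is only followed if the block accepts it.
def ancestor_ids(edges, start_id, &keep)
  keep ||= ->(_id) { true }
  working = edges.select { |_, pod_id| pod_id == start_id }.map(&:first).select(&keep)
  result = []
  until working.empty?
    result.concat(working)
    parents = working.flat_map { |id| edges.select { |_, pod_id| pod_id == id }.map(&:first) }
    # Skip ids already collected (simplified cycle guard), then apply the filter.
    working = parents.reject { |id| result.include?(id) }.select(&keep)
  end
  result.uniq
end

# Snippet #4 style: walk everything, filter at the end.
filter_after = ancestor_ids(PARENT_EDGES, 1).select { |id| ACTIVE_IDS.include?(id) }
# Snippet #5 style: filter at every step, so the walk stops at inactive parents.
filter_during = ancestor_ids(PARENT_EDGES, 1) { |id| ACTIVE_IDS.include?(id) }

p filter_after
p filter_during
```

Filtering at the end still reports pod 3, even though it is only reachable through the inactive pod 2. Filtering during the walk reports nothing, which matches what the original Ruby get_all_parents would do.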

In Rails, based on your stub method, the code for Snippet #5 would look like this:

    def get_all_parents
      sql = <<-SQL
        WITH RECURSIVE pod_tree(id, rtree) AS (
          SELECT p1.id, ARRAY[p1.id]
          FROM pods p1
          INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
          WHERE pod_id = :pod_id
            AND p1.pod_state = 'active'
            AND p1.pod_type IN ('standard', 'readonly')
          UNION ALL
          SELECT p1.id, rtree || p1.id
          FROM pods p1
          INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
          INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
          WHERE p1.pod_state = 'active'
            AND p1.pod_type IN ('standard', 'readonly')
            AND NOT (p1.id = ANY(ptree.rtree))
        )

        SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);
      SQL
      # IMP! Pass the id as a named bind variable so ActiveRecord sanitizes it,
      # instead of interpolating it into the SQL string.
      Pod.find_by_sql([sql, { pod_id: id }])
    end

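This should cover your first two use cases. As mentioned earlier, the other two are extensions of these, so you can build them from these queries.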
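
As one possible extension, get_all_parents_and_children could be sketched by putting the upward walk (Snippet #5) and the downward walk (Snippet #2) into a single statement with two CTEs. The helper `parents_and_children_query` and the CTE names `ancestors` and `descendants` are hypothetical, and the SQL has not been run against a real database, so treat it as a starting point rather than a finished query.

```ruby
# Hypothetical query text combining Snippet #5 (ancestors) and Snippet #2 (descendants).
PARENTS_AND_CHILDREN_SQL = <<~SQL
  WITH RECURSIVE ancestors(id, rtree) AS (
    SELECT p1.id, ARRAY[p1.id]
    FROM pods p1
    INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
    WHERE p2.pod_id = :pod_id
      AND p1.pod_state = 'active'
      AND p1.pod_type IN ('standard', 'readonly')
    UNION ALL
    SELECT p1.id, a.rtree || p1.id
    FROM pods p1
    INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
    INNER JOIN ancestors a ON p2.pod_id = a.id
    WHERE p1.pod_state = 'active'
      AND p1.pod_type IN ('standard', 'readonly')
      AND NOT (p1.id = ANY(a.rtree))
  ),
  descendants(pod_id, rtree) AS (
    SELECT pod_id, ARRAY[pod_id]
    FROM pod_pods
    WHERE parent_id = :pod_id
    UNION ALL
    SELECT p.pod_id, d.rtree || p.pod_id
    FROM pod_pods p
    INNER JOIN descendants d ON d.pod_id = p.parent_id
    WHERE NOT (p.pod_id = ANY(d.rtree))
  )
  SELECT * FROM pods
  WHERE id IN (SELECT id FROM ancestors UNION SELECT pod_id FROM descendants);
SQL

# Returns the [sql, binds] array form that find_by_sql accepts.
def parents_and_children_query(pod_id)
  [PARENTS_AND_CHILDREN_SQL, { pod_id: pod_id }]
end

sql, binds = parents_and_children_query(1)
puts sql
p binds
```

Inside the model this would be called as `Pod.find_by_sql(parents_and_children_query(id))`. Unlike the original Ruby method, it returns each pod once and in no particular order.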

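Notes:
  • If you don't have cycles, you can drop the column that guards against infinite recursion, since it is extra bookkeeping.
  • If you don't have orphaned links, prefer recursing over pod_pods alone for children, since that avoids an unnecessary join.
  • rtree in the SQL queries above contains the hierarchy (the path that led to each node). You can pass it back if you need that information. I skipped it because you end up flattening the result anyway.
  • I am fetching unique nodes. Your Rails code currently returns a node multiple times if it is visited multiple times. If you want that behavior, along with the order of the tree, you can get it like this: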

    -- Example for getting all parents
    WITH RECURSIVE pod_tree(id, slug, pod_type, parent_id, rtree) AS (
      SELECT p1.id, p1.slug, p1.pod_type, p2.parent_id, ARRAY[p1.id] -- Select the fields you need
      FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      WHERE pod_id = 1
        AND p1.pod_state = 'active' AND p1.pod_type IN ('standard', 'readonly')
      UNION ALL
      SELECT p1.id, p1.slug, p1.pod_type, p2.parent_id, rtree || p1.id
      FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
      WHERE p1.pod_state = 'active' AND p1.pod_type IN ('standard', 'readonly')
        AND NOT (p1.id = ANY(ptree.rtree))
    )

    SELECT * FROM pod_tree;
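
To illustrate the last note, here is a small plain-Ruby sketch of a diamond-shaped hierarchy, where one ancestor is reachable through two different parents. Selecting straight from the CTE yields one row per path, while the DISTINCT variants collapse them. `DIAMOND_EDGES` and `ancestor_rows` are hypothetical names used only for this example.

```ruby
# Hypothetical diamond: pod 4 has parents 2 and 3, and both have parent 1.
DIAMOND_EDGES = [[1, 2], [1, 3], [2, 4], [3, 4]].freeze # [parent_id, pod_id]

# Walks upward and returns rows of [id, path], one row per distinct path.
def ancestor_rows(edges, start_id)
  working = edges.select { |_, pod_id| pod_id == start_id }
                 .map { |parent_id, _| [parent_id, [parent_id]] }
  rows = []
  until working.empty?
    rows.concat(working)
    working = working.flat_map do |id, path|
      # Follow each parent unless it is already on this row's path.
      edges.select { |parent_id, pod_id| pod_id == id && !path.include?(parent_id) }
           .map { |parent_id, _| [parent_id, path + [parent_id]] }
    end
  end
  rows
end

rows = ancestor_rows(DIAMOND_EDGES, 4)
rows.each { |id, path| puts "#{id} via #{path.inspect}" }
p rows.map(&:first).uniq # what the DISTINCT variants would keep
```

Pod 1 appears once per path, and the path column tells the occurrences apart. In SQL the row order is not guaranteed without an ORDER BY, so sort on rtree if the tree order matters.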

A similar question about "sql - Rewriting ActiveRecord queries as recursive SQL" can be found on Stack Overflow: https://stackoverflow.com/questions/60841892/
