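I have a tree-like ActiveRecord structure with a self-referencing object: a pod can be the parent or the child of another object of the same class. I need an efficient way to traverse this structure in code. So far I have been doing it in Ruby through the ActiveRecord ORM, which is very inefficient.
Here is what the pod.rb model looks like: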
has_many :pod_parents, class_name: "PodPod", dependent: :delete_all
has_many :parents, through: :pod_parents, :foreign_key => 'parent_id', :source => 'parent'
has_many :pod_children, class_name: "PodPod", :foreign_key => 'parent_id'
has_many :children, through: :pod_children, :source => 'pod'
scope :active, -> {
  where(pod_state: "active").where(pod_type: ["standard","readonly"])
}
Here is the relevant database schema:
table "pods"
  t.string "intention"
  t.integer "user_id"
  t.string "slug"
  t.string "url_handle"
  t.index ["slug"], name: "index_pods_on_slug"
  t.index ["url_handle"], name: "index_pods_on_url_handle"
table "pod_pods"
  t.integer "parent_id"
  t.integer "pod_id"
  t.index ["parent_id", "pod_id"], name: "index_pod_pods_on_parent_id_and_pod_id", unique: true
  t.index ["parent_id"], name: "index_pod_pods_on_parent_id"
  t.index ["pod_id"], name: "index_pod_pods_on_pod_id"
These are the specific functions I am trying to optimize:
def get_all_parents
  parents = []
  self.parents.active.each do |parent|
    parents << parent
    parents.concat(parent.get_all_parents)
  end
  return parents
end
def get_all_children
  children = []
  self.children.each do |child|
    children << child # collect the child itself, mirroring get_all_parents
    children.concat(child.get_all_children)
  end
  return children
end
def get_all_parents_and_children
  pod_array = self.get_all_parents
  pod_array.concat(self.get_all_children)
  return pod_array
end
def get_all_relations(inclusive = false)
  circles_array = self.get_all_parents
  circles_array.each do |parent|
    circles_array = circles_array.concat(parent.get_all_children)
  end
  circles_array = circles_array.concat(self.get_all_children)
  unique_ids = circles_array.compact.map(&:id).uniq - [self.id]
  circles = Pod.where(id: unique_ids)
end
From what I have researched, Postgres supports recursive SQL queries. I have been using these articles as a guide: 1, 2.
This is what I have so far:
def get_all_parents2
  sql =
    <<-SQL
      WITH RECURSIVE pod_tree(id, path) AS (
        SELECT id, ARRAY[id]
        FROM pods
        WHERE id = #{self.id}
        UNION ALL
        SELECT pods.id, path
        FROM pod_tree
        JOIN pods ON pods.id=pod_tree.id
        JOIN pod_pods ON pod_pods.parent_id = pods.id
        WHERE NOT pods.id = ANY(path)
      )
      SELECT * FROM pod_tree
      ORDER BY path;
    SQL
  sql.chomp
  Pod.find_by_sql(sql)
end
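My SQL is not particularly good, and I cannot work out how to navigate both up and down the tree so that I can rewrite the functions above as recursive SQL. I would appreciate any help with this. Thank you.
Best answer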
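What you are trying to accomplish is definitely possible with recursive CTEs. I will cover the first two scenarios you have, since the other two are just extensions of those.
In all of the SQL examples I use id 1 to stand for the value you would substitute at the model level. Since you wrote that query, I will assume you have some understanding of recursive CTEs and are looking for a solution.
get_all_children
Let's take the get_all_children method first. It involves walking down the tree and collecting the nodes we encounter, level by level.
Since pod_pods contains all of the hierarchy information and no scope is involved when fetching children, we can recurse over pod_pods alone to get the children.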
-- Snippet #1
WITH RECURSIVE pod_tree AS (
  SELECT pod_id -- Get the pod_id of the children of the base case node
  FROM pod_pods
  WHERE parent_id = 1 -- Base case
  UNION ALL -- Recurse on this and do a union with the previous step
  SELECT p.pod_id
  FROM pod_pods p
  INNER JOIN pod_tree ptree
    ON ptree.pod_id = p.parent_id -- Get the children nodes for nodes found at the previous recursion step.
)
SELECT * FROM pods
WHERE id IN (SELECT DISTINCT(pod_id) FROM pod_tree);
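To make the evaluation order of Snippet #1 concrete, here is a minimal plain-Ruby sketch of the same level-by-level walk over an in-memory edge list. The EDGES constant and the all_children_ids method are hypothetical names made up for illustration; they are not part of your model, and the real work would still be done by Postgres.

```ruby
# Hypothetical stand-in for the pod_pods table: [parent_id, pod_id] rows.
EDGES = [[1, 2], [1, 3], [2, 4], [3, 4], [4, 5]].freeze

# Mirrors Snippet #1: start from the direct children of root_id (base case),
# then repeatedly join the previous level back onto the edge list.
# Like Snippet #1, this has no cycle guard and would loop forever on a cycle.
def all_children_ids(edges, root_id)
  level = edges.select { |parent_id, _| parent_id == root_id }.map(&:last)
  found = []
  until level.empty?
    found.concat(level) # UNION ALL keeps duplicates
    level = level.flat_map do |id|
      edges.select { |parent_id, _| parent_id == id }.map(&:last)
    end
  end
  found.uniq # plays the role of SELECT DISTINCT(pod_id) in the outer query
end

p all_children_ids(EDGES, 1) # => [2, 3, 4, 5]
```

Pod 4 is reached through both 2 and 3, so it appears twice in the intermediate result. That is why the outer query in Snippet #1 uses DISTINCT.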
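Your Ruby code does not guard against infinite loops caused by cycles. If cycles can occur, the way to handle them is to keep track of the ids you have already seen.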
-- Snippet #2
WITH RECURSIVE pod_tree(pod_id, rtree) AS ( -- Extra rtree parameter to keep track of visited nodes
  SELECT pod_id, ARRAY[pod_id] -- Make the base case array with pod_id
  FROM pod_pods
  WHERE parent_id = 1 -- Base case
  UNION ALL
  SELECT p.pod_id, rtree || p.pod_id -- Add the current pod_id to array
  FROM pod_pods p
  INNER JOIN pod_tree ptree
    ON ptree.pod_id = p.parent_id
  WHERE NOT (p.pod_id = ANY(rtree)) -- Exclude nodes which have already been seen
)
SELECT * FROM pods
WHERE id IN (SELECT DISTINCT(pod_id) FROM pod_tree);
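The path-tracking idea in Snippet #2 can be sketched in plain Ruby as follows. This is only an illustration under assumed data: CYCLIC_EDGES and all_children_ids_guarded are hypothetical names, and the array carried with each row plays the role of rtree.

```ruby
# Hypothetical edge list with a cycle: 1 -> 2 -> 3 -> 2.
CYCLIC_EDGES = [[1, 2], [2, 3], [3, 2]].freeze

# Mirrors Snippet #2: every row carries the path (rtree) that led to it,
# and a child is only followed if it is not already on that path.
def all_children_ids_guarded(edges, root_id)
  rows = edges.select { |parent_id, _| parent_id == root_id }
              .map { |_, pod_id| [pod_id, [pod_id]] }
  found = []
  until rows.empty?
    found.concat(rows.map(&:first))
    rows = rows.flat_map do |id, rtree|
      edges.select { |parent_id, pod_id| parent_id == id && !rtree.include?(pod_id) }
           .map { |_, pod_id| [pod_id, rtree + [pod_id]] }
    end
  end
  found.uniq
end

p all_children_ids_guarded(CYCLIC_EDGES, 1) # => [2, 3]
```

One thing this sketch makes visible: the starting id itself is not on the path, so when a cycle leads back to the start node (for example starting from 2 here), the start node shows up in its own result. If that matters, you would exclude it afterwards, as get_all_relations already does with its own id.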
If you can have orphaned relations in pod_pods and want to ignore them, a join with pods is needed.
-- Snippet #3
WITH RECURSIVE pod_tree(id, rtree) AS (
  SELECT p1.id, ARRAY[p1.id]
  FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.pod_id
  WHERE parent_id = 1
  UNION ALL
  SELECT p1.id, rtree || p1.id
  FROM pods p1
  INNER JOIN pod_pods p2 ON p1.id = p2.pod_id
  INNER JOIN pod_tree ptree ON p2.parent_id = ptree.id
  WHERE NOT (p1.id = ANY(ptree.rtree))
)
SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);
If you do not have orphaned links, my recommendation is to use Snippet #1 or #2, since they are faster than #3, which involves an extra join.
get_all_parents
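To keep things simple, let's first leave out the conditions added by the active scope and apply them at the end. We walk up the tree through the pod_pods table to collect all the parent ids, and then apply the scope.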
-- Snippet #4
WITH RECURSIVE pod_tree AS (
  SELECT parent_id -- Get the parent_id of the parents of the base case node
  FROM pod_pods
  WHERE pod_id = 1 -- Base case
  UNION ALL -- Recurse on this and do a union with the previous step
  SELECT p.parent_id
  FROM pod_pods p
  INNER JOIN pod_tree ptree
    ON ptree.parent_id = p.pod_id -- Get the parent nodes for nodes found at the previous recursion step.
)
SELECT * FROM pods
WHERE
  id IN (SELECT DISTINCT(parent_id) FROM pod_tree)
  AND pod_state = 'active'
  AND pod_type IN ('standard', 'readonly')
;
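However, this applies the active filter only after all the nodes have been fetched. That may not be desirable, because it can walk more of the tree than needed and can even return parents of inactive nodes. To make it behave like the method in your Ruby code, we need to join with pods inside the recursion. I have also added the infinite-recursion guard here, which you are now familiar with.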
-- Snippet #5
WITH RECURSIVE pod_tree(id, rtree) AS (
  SELECT p1.id, ARRAY[p1.id]
  FROM pods p1
  INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
  WHERE pod_id = 1
    AND p1.pod_state = 'active'
    AND p1.pod_type IN ('standard', 'readonly')
  UNION ALL
  SELECT p1.id, rtree || p1.id
  FROM pods p1
  INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
  INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
  WHERE p1.pod_state = 'active'
    AND p1.pod_type IN ('standard', 'readonly')
    AND NOT (p1.id = ANY(ptree.rtree))
)
SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);
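The difference between filtering at the end (Snippet #4) and filtering inside the recursion (Snippet #5) is easiest to see on a tiny example. The sketch below uses made-up data and hypothetical names (PARENTS_OF, ACTIVE_IDS, parents_filter_last, parents_filter_during); ACTIVE_IDS stands in for whatever the active scope would match.

```ruby
# Hypothetical data: pod id => ids of its direct parents (chain 1 -> 2 -> 3 -> 4).
PARENTS_OF = { 1 => [2], 2 => [3], 3 => [4], 4 => [] }.freeze
ACTIVE_IDS = [2, 4].freeze # pod 3 does not pass the active scope

# Snippet #4 style: walk every ancestor, apply the scope once at the end.
def parents_filter_last(parents_of, active_ids, id)
  all = []
  level = parents_of.fetch(id, [])
  until level.empty?
    all.concat(level)
    level = level.flat_map { |pid| parents_of.fetch(pid, []) }
  end
  all.uniq.select { |pid| active_ids.include?(pid) }
end

# Snippet #5 style: only active parents are followed, with a path guard.
def parents_filter_during(parents_of, active_ids, id)
  rows = parents_of.fetch(id, [])
                   .select { |pid| active_ids.include?(pid) }
                   .map { |pid| [pid, [pid]] }
  found = []
  until rows.empty?
    found.concat(rows.map(&:first))
    rows = rows.flat_map do |pid, rtree|
      parents_of.fetch(pid, [])
                .select { |gp| active_ids.include?(gp) && !rtree.include?(gp) }
                .map { |gp| [gp, rtree + [gp]] }
    end
  end
  found.uniq
end

p parents_filter_last(PARENTS_OF, ACTIVE_IDS, 1)   # => [2, 4]
p parents_filter_during(PARENTS_OF, ACTIVE_IDS, 1) # => [2]
```

Pod 4 is only reachable through the inactive pod 3, so the second version stops before reaching it. That matches what self.parents.active.each does in the original Ruby method.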
In Rails, based on your stub method, the code for Snippet #5 would look like this:
def get_all_parents
  sql = <<~SQL
    WITH RECURSIVE pod_tree(id, rtree) AS (
      SELECT p1.id, ARRAY[p1.id]
      FROM pods p1
      INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      WHERE p2.pod_id = :pod_id
        AND p1.pod_state = 'active'
        AND p1.pod_type IN ('standard', 'readonly')
      UNION ALL
      SELECT p1.id, rtree || p1.id
      FROM pods p1
      INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
      INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
      WHERE p1.pod_state = 'active'
        AND p1.pod_type IN ('standard', 'readonly')
        AND NOT (p1.id = ANY(ptree.rtree))
    )
    SELECT * FROM pods WHERE id IN (SELECT DISTINCT(id) FROM pod_tree);
  SQL
  # IMPORTANT: do not interpolate values straight into the SQL string.
  # Passing an array lets ActiveRecord sanitize the :pod_id value for us.
  Pod.find_by_sql([sql, { pod_id: id }])
end
That should cover your first two use cases. As mentioned, the other two are extensions of these, so you can build them from these two.
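As one possible reading of "extensions", here is a plain-Ruby sketch of how the remaining two methods could be composed from an upward walk and a downward walk, following the logic of the Ruby methods in the question. Everything here is hypothetical: TREE stands in for pod_pods, ancestor_ids and descendant_ids stand in for the results of the two recursive queries, and the active scope is ignored for brevity.

```ruby
# Hypothetical stand-in for pod_pods: [parent_id, pod_id] rows.
TREE = [[1, 2], [1, 3], [2, 4], [3, 5]].freeze

# Downward walk; seen is the path so far and guards against cycles.
def descendant_ids(edges, id, seen = [])
  edges.select { |parent_id, pod_id| parent_id == id && !seen.include?(pod_id) }
       .flat_map { |_, pod_id| [pod_id] + descendant_ids(edges, pod_id, seen + [pod_id]) }
end

# Upward walk with the same guard.
def ancestor_ids(edges, id, seen = [])
  edges.select { |parent_id, pod_id| pod_id == id && !seen.include?(parent_id) }
       .flat_map { |parent_id, _| [parent_id] + ancestor_ids(edges, parent_id, seen + [parent_id]) }
end

# get_all_parents_and_children: the union of both walks.
def parents_and_children_ids(edges, id)
  (ancestor_ids(edges, id) + descendant_ids(edges, id)).uniq
end

# get_all_relations: parents, every parent's children, own children, minus self.
def relation_ids(edges, id)
  ups = ancestor_ids(edges, id)
  (ups + ups.flat_map { |pid| descendant_ids(edges, pid) } + descendant_ids(edges, id)).uniq - [id]
end

p parents_and_children_ids(TREE, 2) # => [1, 4]
p relation_ids(TREE, 4).sort        # => [1, 2, 3, 5]
```

In the real model the final list of ids would go into a single Pod.where(id: ...) call, as your get_all_relations already does, so the whole thing costs a small fixed number of queries instead of one per node.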
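Notes:
- I iterate over pod_pods for the children because it avoids unnecessary joins.
- rtree in the SQL queries above contains the hierarchy. You can choose to pass it back if you need that information. I skipped it because you end up flattening the results anyway.
-- Example for getting all parents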
WITH RECURSIVE pod_tree(id, slug, pod_type, parent_id, rtree) AS (
  SELECT p1.id, p1.slug, p1.pod_type, p2.parent_id, ARRAY[p1.id] -- Select the fields you need
  FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
  WHERE pod_id = 1
    AND p1.pod_state = 'active' AND p1.pod_type IN ('standard', 'readonly')
  UNION ALL
  SELECT p1.id, p1.slug, p1.pod_type, p2.parent_id, rtree || p1.id
  FROM pods p1 INNER JOIN pod_pods p2 ON p1.id = p2.parent_id
  INNER JOIN pod_tree ptree ON p2.pod_id = ptree.id
  WHERE p1.pod_state = 'active' AND p1.pod_type IN ('standard', 'readonly')
    AND NOT (p1.id = ANY(ptree.rtree))
)
SELECT * FROM pod_tree;
Source: "sql - Rewriting ActiveRecord query into recursive SQL", a similar question found on Stack Overflow: https://stackoverflow.com/questions/60841892/