问题/示例/期望值
我需要确定一个 Strahler number或 Strahler stream order 用于表示流网络的有向图。我可以导出信息forwards and backwards using WITH RECURSIVE
queries ,但似乎我需要做一些不同的事情来确定 Strahler 数。
例如,这里有一个 19 段流网络,有 10 个支流和一个出水口。每个段的上游部分由一个节点 ID 表示。
和表结构中的相同数据,其中段通过to_node
连接,对于流域导出为null。
CREATE TABLE streams (
node integer PRIMARY KEY,
to_node integer REFERENCES streams(node),
expected_order integer
);
INSERT INTO streams(node, to_node, expected_order) VALUES
(1, NULL, 4),
(2, 1, 4),
(3, 2, 3),
(4, 2, 3),
(5, 4, 3),
(6, 3, 2),
(7, 3, 2),
(8, 5, 2),
(9, 5, 2),
(10, 6, 1),
(11, 6, 1),
(12, 7, 1),
(13, 7, 1),
(14, 8, 1),
(15, 8, 1),
(16, 9, 1),
(17, 9, 1),
(18, 4, 1),
(19, 1, 1);
此处显示了 Strahler 数的预期结果 (expected_order
):
共有三个规则(来自 GRASS 7.0 Manual ):
- 如果该节点没有子节点,则其 Strahler 阶数为 1。
- 如果该节点有一个且只有一个具有 Strahler 最大阶 i 的支流,并且所有其他支流的阶都小于 i,则该阶保持 i。
- 如果该节点有两条或多条最大阶i的支流,则该节点的Strahler阶为i + 1。
我发现/尝试过的
从我挖掘解决这个问题的过程中发现,这个计算can be done with SQL (除了我认为他们的“SQL 脚本”是为 MS SQL Server 编写的)。但是,我还没有找到可以用 PostgreSQL 9.1 完成的事情。
我最好的尝试之一是计算每个节点的上游节点的数量,它正确地识别支流(一阶),而不是其他:
WITH RECURSIVE search_graph AS (
SELECT node AS start_node, node
FROM streams
-- Connect downstream towards outlet(s)
UNION ALL
SELECT sg.start_node, n.node
FROM streams n
JOIN search_graph sg ON n.to_node = sg.node
)
SELECT start_node, count(sg.node) as upstream_nodes, expected_order
FROM search_graph sg
JOIN streams s ON sg.start_node = s.node
GROUP BY start_node, expected_order
ORDER BY upstream_nodes DESC, start_node;
start_node | upstream_nodes | expected_order
------------+----------------+----------------
1 | 19 | 4
2 | 17 | 4
4 | 9 | 3
3 | 7 | 3
5 | 7 | 3
6 | 3 | 2
7 | 3 | 2
8 | 3 | 2
9 | 3 | 2
10 | 1 | 1
11 | 1 | 1
12 | 1 | 1
13 | 1 | 1
14 | 1 | 1
15 | 1 | 1
16 | 1 | 1
17 | 1 | 1
18 | 1 | 1
19 | 1 | 1
(19 rows)
一个想法是使用nth_value(value any, nth integer)
window function用 appropriately set window frame range .但是,我不确定如何设置它,或者是否可以设置它来识别 Strahler 号码。另一个 [不太令人兴奋] 的想法是为每个 Strahler 数手动运行迭代,我希望我的真实世界数据在五到八个订单(迭代)之间。这可以通过 DO
statement 来完成.但我们欢迎任何更好的想法。
最佳答案
我遇到了 CTE 的限制。递归 CTE 无法对自身执行 LEFT JOIN。只是在功能上完成了它。
现场测试:https://www.db-fiddle.com/f/8z58LCVhD62YvkeJjriW8d/5
create or replace function strahler(_parent int) returns table(
node int, strahler_order int
)
as
$$
select
s.node,
case
-- If the node is a leaf (has no children), its Strahler number is one.
when count(st.*) = 0 then
1
when count(st.*) >= 2 then
case
-- If the node has one child with Strahler number i,
-- and all other children have Strahler numbers less than i,
-- then the Strahler number of the node is i again.
when min(st.strahler_order) < max(st.strahler_order) then
max(st.strahler_order)
-- If the node has two or more children with Strahler number i,
-- and no children with greater number,
-- then the Strahler number of the node is i + 1.
when min(st.strahler_order) = max(st.strahler_order) then
max(st.strahler_order) + 1
end
end
from streams s
left join lateral strahler(s.node) st on true
where _parent = 0 or s.to_node = _parent
group by s.node
$$ language 'sql';
select st.node, s.expected_order, st.strahler_order
from strahler(0) st
join streams s on st.node = s.node
order by st.node;
测试:
select st.node, s.expected_order, st.strahler_order
from strahler(0) st
join streams s on st.node = s.node
order by st.node;
输出:
| node | expected_order | strahler_order |
| ---- | -------------- | -------------- |
| 1 | 4 | 4 |
| 2 | 4 | 4 |
| 3 | 3 | 3 |
| 4 | 3 | 3 |
| 5 | 3 | 3 |
| 6 | 2 | 2 |
| 7 | 2 | 2 |
| 8 | 2 | 2 |
| 9 | 2 | 2 |
| 10 | 1 | 1 |
| 11 | 1 | 1 |
| 12 | 1 | 1 |
| 13 | 1 | 1 |
| 14 | 1 | 1 |
| 15 | 1 | 1 |
| 16 | 1 | 1 |
| 17 | 1 | 1 |
| 18 | 1 | 1 |
| 19 | 1 | 1 |
这是最初的计划
现场测试:https://www.db-fiddle.com/f/8z58LCVhD62YvkeJjriW8d/1
with recursive search_graph as (
select node as start_node, node
from streams
union all
select sg.start_node, n.node
from streams n
join search_graph sg on n.to_node = sg.node
)
, get_kids as
(
select
s.node as kid,
count(sg.*) - 1 as kid_kids,
s.expected_order
from streams s
join search_graph sg on s.node = sg.start_node
group by s.node, s.expected_order
order by kid_kids
)
, order_safe as
(
select
row_number() over(s) eo,
gk.kid,
gk.kid_kids,
gk_kid.to_node as parent,
gk_p.kid_kids as siblings
from get_kids gk
left join streams gk_kid on gk.kid = gk_kid.node
left join get_kids gk_p on gk_kid.to_node = gk_p.kid
window s as (order by gk_p.kid_kids /* siblings */, gk_kid.to_node /* parent */)
)
select * from order_safe;
输出:
| eo | kid | kid_kids | parent | siblings |
| --- | --- | -------- | ------ | -------- |
| 1 | 11 | 0 | 6 | 2 |
| 2 | 10 | 0 | 6 | 2 |
| 3 | 12 | 0 | 7 | 2 |
| 4 | 13 | 0 | 7 | 2 |
| 5 | 15 | 0 | 8 | 2 |
| 6 | 14 | 0 | 8 | 2 |
| 7 | 17 | 0 | 9 | 2 |
| 8 | 16 | 0 | 9 | 2 |
| 9 | 6 | 2 | 3 | 6 |
| 10 | 7 | 2 | 3 | 6 |
| 11 | 9 | 2 | 5 | 6 |
| 12 | 8 | 2 | 5 | 6 |
| 13 | 5 | 6 | 4 | 8 |
| 14 | 18 | 0 | 4 | 8 |
| 15 | 3 | 6 | 2 | 16 |
| 16 | 4 | 8 | 2 | 16 |
| 17 | 19 | 0 | 1 | 18 |
| 18 | 2 | 16 | 1 | 18 |
| 19 | 1 | 18 | | |
最初的计划是以安全顺序评估每个节点(将由 eo 字段促进),从具有较少兄弟节点的节点开始,直到具有许多兄弟节点的节点。然后在每个将被评估的节点上,还将检查其直接子节点(递归 CTE 将对其自身执行 LEFT JOIN),然后执行必要的 Strahler 的三个条件。但是,CTE 有一个局限性,递归 CTE 不能对自身进行 LEFT JOIN。
关于postgresql - 如何确定流网络有向图上的 Strahler 数,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29202822/