sql - PostgreSQL GROUP BY without skipping rows?

Suppose I have this data in a table:

 id | thing | operation | timestamp
----+-------+-----------+-----------
  0 | foo   |       add |         0
  0 | bar   |       add |         1
  1 | baz   |    remove |         2
  1 | dim   |       add |         3
  0 | foo   |    remove |         4
  0 | dim   |       add |         5
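
To reproduce the example, the sample rows can be loaded roughly as follows. This is only a sketch: the table name T is the one used in the answer's queries, and the column types are assumptions, since the question does not state them.

```sql
-- Hypothetical schema: the column types are assumed, not given in the question.
CREATE TABLE T (
    id        integer NOT NULL,
    thing     text    NOT NULL,
    operation text    NOT NULL,
    timestamp integer NOT NULL   -- a plain integer here, matching the sample values 0..5
);

-- The six sample rows from the table above.
INSERT INTO T (id, thing, operation, timestamp) VALUES
    (0, 'foo', 'add',    0),
    (0, 'bar', 'add',    1),
    (1, 'baz', 'remove', 2),
    (1, 'dim', 'add',    3),
    (0, 'foo', 'remove', 4),
    (0, 'dim', 'add',    5);
```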

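Is there a way to build a Postgres SQL query that groups by id and operation, but never groups a row with rows that are separated from it in timestamp order by a row belonging to a different group? I want to get this from the query: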

 id |  things  | operation
----+----------+-----------
  0 | foo, bar |       add
  1 |      baz |    remove
  1 |      dim |       add
  0 |      foo |    remove
  0 |      dim |       add

Basically a GROUP BY, but only over rows that are adjacent when sorted by timestamp.

Best answer

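This is a gaps and islands problem (the article describing it is written for SQL Server, but it explains the problem well and still applies to PostgreSQL), and it can be solved with ranking functions: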

SELECT  id,
        thing,
        operation,
        timestamp,
        -- overall position minus position within (id, operation):
        -- the difference stays constant across a run of adjacent rows
        ROW_NUMBER() OVER (ORDER BY timestamp)
          - ROW_NUMBER() OVER (PARTITION BY id, operation ORDER BY timestamp) AS groupingSet,
        ROW_NUMBER() OVER (ORDER BY timestamp) AS PositionInSet,
        ROW_NUMBER() OVER (PARTITION BY id, operation ORDER BY timestamp) AS PositionInGroup
FROM    T
ORDER BY timestamp;

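As you can see, by taking each row's overall position in the set and subtracting its position within its (id, operation) group, you can identify the islands: each unique combination of (id, operation, groupingSet) represents one island: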

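 id | thing | operation | timestamp | groupingSet | PositionInSet | PositionInGroup
----+-------+-----------+-----------+-------------+---------------+-----------------
  0 | foo   |       add |         0 |           0 |             1 |               1
  0 | bar   |       add |         1 |           0 |             2 |               2
  1 | baz   |    remove |         2 |           2 |             3 |               1
  1 | dim   |       add |         3 |           3 |             4 |               1
  0 | foo   |    remove |         4 |           4 |             5 |               1
  0 | dim   |       add |         5 |           3 |             6 |               3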

Then you just need to put this in a subquery, group by the relevant columns, and use string_agg to concatenate the things:

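SELECT  id,
        -- PostgreSQL's string_agg requires a delimiter argument
        STRING_AGG(thing, ', ' ORDER BY timestamp) AS things,
        operation
FROM    (   SELECT  id,
                    thing,
                    operation,
                    timestamp,
                    ROW_NUMBER() OVER (ORDER BY timestamp)
                      - ROW_NUMBER() OVER (PARTITION BY id, operation ORDER BY timestamp) AS groupingSet
            FROM    T
        ) AS islands
GROUP BY id, operation, groupingSet
ORDER BY MIN(timestamp);  -- return the islands in timestamp order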
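
The same islands can also be marked with a different technique from the one in the answer: flag every row whose (id, operation) differs from the previous row with lag(), then take a running sum of the flags. The sketch below assumes the same table T, unique timestamps, and non-null id and operation; the names flagged, marked, is_new_island and island_no are made up for illustration.

```sql
-- Alternative sketch (not the answer's method): lag() flags plus a running sum.
SELECT  id,
        string_agg(thing, ', ' ORDER BY timestamp) AS things,
        operation
FROM    (   SELECT  *,
                    -- running count of island starts = island number
                    SUM(is_new_island) OVER (ORDER BY timestamp) AS island_no
            FROM    (   SELECT  *,
                                -- 0 when the previous row has the same (id, operation);
                                -- 1 otherwise, including the first row, where lag() is NULL
                                CASE WHEN id = LAG(id) OVER w
                                      AND operation = LAG(operation) OVER w
                                     THEN 0 ELSE 1 END AS is_new_island
                        FROM    T
                        WINDOW  w AS (ORDER BY timestamp)
                    ) AS flagged
        ) AS marked
GROUP BY id, operation, island_no
ORDER BY island_no;
```

On the sample data this should yield the same five rows as the desired output in the question. Whether it reads better than the row-number difference is largely a matter of taste; its island number increases in timestamp order, which makes the final ORDER BY straightforward.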

Source: this question on Stack Overflow: https://stackoverflow.com/questions/28560389/
