Actually, I got a great answer to a similar question in the thread below, but I need another solution for a different dataset.

How to get the latest 2 rows ( PostgreSQL )

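The dataset contains historical data, and I only want sum(value) for each group at that group's latest gather_time. The final result should look like this: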

 name  | col1 |     gather_time     | sum
-------+------+---------------------+-----
 first | 100  | 2016-01-01 23:12:49 |   6
 first | 200  | 2016-01-01 23:11:13 |   4

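However, the query below only returns data for one group (first-100); there is no row for the second group (first-200). The problem is that I need exactly one row per group, and the number of groups can vary.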

select name,col1,gather_time,sum(value) 
from testtable
group by name,col1,gather_time
order by gather_time desc
limit 2;

 name  | col1 |     gather_time     | sum
-------+------+---------------------+-----
 first | 100  | 2016-01-01 23:12:49 |   6
 first | 100  | 2016-01-01 23:11:19 |   6
(2 rows)

Can you suggest how to meet this requirement?

Dataset

create table testtable
(
name varchar(30),
col1 varchar(30),
col2 varchar(30),
gather_time timestamp,
value integer
);


insert into testtable values('first','100','q1','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q2','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q3','2016-01-01 23:11:19',2);
insert into testtable values('first','200','t1','2016-01-01 23:11:13',2);
insert into testtable values('first','200','t2','2016-01-01 23:11:13',2);
insert into testtable values('first','100','q1','2016-01-01 23:11:11',2);
insert into testtable values('first','100','q1','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q2','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q3','2016-01-01 23:12:49',2);

select * 
from testtable 
order by name,col1,gather_time;

 name  | col1 | col2 |     gather_time     | value
-------+------+------+---------------------+-------
 first | 100  | q1   | 2016-01-01 23:11:11 |     2
 first | 100  | q2   | 2016-01-01 23:11:19 |     2
 first | 100  | q3   | 2016-01-01 23:11:19 |     2
 first | 100  | q1   | 2016-01-01 23:11:19 |     2
 first | 100  | q3   | 2016-01-01 23:12:49 |     2
 first | 100  | q1   | 2016-01-01 23:12:49 |     2
 first | 100  | q2   | 2016-01-01 23:12:49 |     2
 first | 200  | t2   | 2016-01-01 23:11:13 |     2
 first | 200  | t1   | 2016-01-01 23:11:13 |     2

Best answer

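One option is to join your original table to a derived table that holds only the latest gather_time for each name, col1 group. The join keeps just the rows taken at that latest time, and you can then sum the value column per group to get the result set you want.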

SELECT t1.name, t1.col1, MAX(t1.gather_time) AS gather_time, SUM(t1.value) AS sum
FROM testtable t1 INNER JOIN
(
    -- latest gather_time per (name, col1) group; col2 is deliberately left out
    -- so that rows from older gather times cannot leak into the sum
    SELECT name, col1, MAX(gather_time) AS maxTime
    FROM testtable
    GROUP BY name, col1
) t2
ON t1.name = t2.name AND t1.col1 = t2.col1 AND
    t1.gather_time = t2.maxTime
GROUP BY t1.name, t1.col1;
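
To see what the join is filtering against, it can help to run the derived table on its own. This is only an illustrative sketch against the testtable from the question; the ORDER BY is added here just to make the output order predictable, and the expected rows in the comments are worked out by hand from the sample inserts.

```sql
-- Latest gather_time for each (name, col1) group in the sample data.
SELECT name, col1, MAX(gather_time) AS maxTime
FROM testtable
GROUP BY name, col1
ORDER BY name, col1;

-- Expected rows for the sample data:
--  first | 100 | 2016-01-01 23:12:49
--  first | 200 | 2016-01-01 23:11:13
```

Each row of testtable whose gather_time matches one of these per-group maxima survives the join, which is why the final sums come out as 6 and 4.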

If you would rather use a subquery in the WHERE clause, as you tried in the OP, to restrict the rows to those with the latest gather_time, you can try the following:

-- gather_time must be aggregated (or grouped) because it is not in GROUP BY
SELECT name, col1, MAX(gather_time) AS gather_time, SUM(value) AS sum
FROM testtable t1
WHERE gather_time =
(
    SELECT MAX(gather_time) 
    FROM testtable t2
    WHERE t1.name = t2.name AND t1.col1 = t2.col1
)
GROUP BY name, col1
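
As a PostgreSQL-specific alternative that is not part of the answer above, the query from the question can probably be kept almost as is by swapping LIMIT for DISTINCT ON. The sketch below assumes the same testtable; DISTINCT ON is applied after grouping and keeps the first row of each (name, col1) group according to the ORDER BY, which here is the one with the newest gather_time.

```sql
-- Aggregate per (name, col1, gather_time), then keep only the newest
-- gather_time row for each (name, col1) group.
SELECT DISTINCT ON (name, col1)
       name, col1, gather_time, SUM(value) AS sum
FROM testtable
GROUP BY name, col1, gather_time
ORDER BY name, col1, gather_time DESC;

-- Expected rows for the sample data:
--  first | 100 | 2016-01-01 23:12:49 | 6
--  first | 200 | 2016-01-01 23:11:13 | 4
```

DISTINCT ON is a PostgreSQL extension, so this form is not portable to other databases; the join and subquery versions are.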

Regarding postgresql - How to get sum(value) at the latest gather_time for each group (name, col1) in PostgreSQL?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34586614/
