sql - "distinct on" with group by in Postgres

I have the following records:

id  run_hour               performance_hour      value
2  "2017-06-25 09:00:00"  "2017-06-25 07:00:00"    6
2  "2017-06-25 09:00:00"  "2017-06-25 08:00:00"    5
1  "2017-06-25 09:00:00"  "2017-06-25 08:00:00"    5
2  "2017-06-25 08:00:00"  "2017-06-25 07:00:00"    5
1  "2017-06-25 08:00:00"  "2017-06-25 07:00:00"    5
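To reproduce the problem, the records above could be loaded as follows. This is only a sketch: the table name `metrics` is taken from the query further down, while the column types and the `not null` constraints are assumptions, since the original post does not show the table definition.

```sql
-- Assumed schema: the column types are a guess, not from the original post.
create table metrics (
  id               integer   not null,
  run_hour         timestamp not null,
  performance_hour timestamp not null,
  value            integer   not null
);

-- The five sample records, in the order listed above.
insert into metrics (id, run_hour, performance_hour, value) values
  (2, '2017-06-25 09:00:00', '2017-06-25 07:00:00', 6),
  (2, '2017-06-25 09:00:00', '2017-06-25 08:00:00', 5),
  (1, '2017-06-25 09:00:00', '2017-06-25 08:00:00', 5),
  (2, '2017-06-25 08:00:00', '2017-06-25 07:00:00', 5),
  (1, '2017-06-25 08:00:00', '2017-06-25 07:00:00', 5);
```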

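We run every hour and look at the results of each id for the current hour and for previous hours.

We insert a new record only when the value has changed compared with the previous hour's run (we don't want to overwrite the value, because we want to be able to measure it as seen 1 hour later, 2 hours later, and so on).

I want to sum the values per id, taking for each performance hour the latest available value (ordered by run_hour).

In the example above, there is no record for ad id 1 for the 9:00 run and the 7:00 performance hour, because the value is the same as in the 8:00 run for the 7:00 performance hour.

In the example above, if I ask for the sum of values for run 2017-06-25 09:00:00, I expect to get: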

id, value
1   10
2   11

For id 1 the result is 10: (5 from run_hour<2017-06-25 08:00:00> + 5 from run_hour<2017-06-25 09:00:00>).

For id 2 the result is 11: (6 from run_hour<2017-06-25 09:00:00> + 5 from run_hour<2017-06-25 09:00:00>).

I wrote the following query:

select distinct on (id, run_hour) id, sum(value)
from metrics
where run_hour <= '2017-06-25 09:00'
  and performance_hour >= '2017-06-25 07:00'
  and performance_hour < '2017-06-25 09:00'
group by id
order by id, run_hour

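But I get an error saying that run_hour must also appear in the GROUP BY clause. If I add it, I get incorrect data: the result also includes data from earlier hours that I don't need. I only need the latest hour that has data.

How can I use "distinct on" together with group by?

Best answer

The task is complicated. Suppose you want the performance hours from 7:00 to 9:00 from the following data: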

id  run_hour               performance_hour      value
2   "2017-06-25 09:00:00"  "2017-06-25 06:00:00"    6
2   "2017-06-25 09:00:00"  "2017-06-25 10:00:00"    5

The expected result would be 18 (6 for 7:00 + 6 for 8:00 + 6 for 9:00), all based on the 6:00 record, which itself lies outside the desired time range.

We need a recursive CTE that starts, per id, at the first wanted performance hour and runs until the last wanted one. This way we build the records that don't exist, so that we can sum them up later.

with recursive cte(id, run_hour, performance_hour, value) as
(
  -- Anchor: per id, the newest record of the 09:00 run at or before 07:00.
  -- greatest() moves an older performance hour up to the range start.
  select *
  from
  (
    select distinct on (id)
      id,
      run_hour,
      greatest(performance_hour, timestamp '2017-06-25 07:00') as performance_hour,
      value
    from metrics
    where run_hour = timestamp '2017-06-25 09:00'
      and performance_hour <= timestamp '2017-06-25 07:00'
    order by id, metrics.performance_hour desc
  ) start_by_id
  union all
  -- Recursive step: advance one hour; use the stored value for that hour
  -- if there is one, otherwise carry the previous value forward.
  select
    cte.id,
    cte.run_hour,
    cte.performance_hour + interval '1 hour' as performance_hour,
    coalesce(m.value, cte.value) as value
  from cte
  left join metrics m on m.id = cte.id
                      and m.run_hour = cte.run_hour
                      and m.performance_hour = cte.performance_hour + interval '1 hour'
  -- Stop once the 09:00 row has been generated.
  where cte.performance_hour < timestamp '2017-06-25 09:00'
)
-- Sum the generated hourly rows per id.
select id, sum(value)
from cte
group by id;

Rextester link: http://rextester.com/PHC88770
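If no gaps need to be filled and only the stored rows matter, as in the question's five sample records, a simpler approach may be enough. The sketch below is not the query above. It is an alternative that assumes every wanted performance hour has at least one stored row per id. It first picks the latest run per (id, performance_hour) with DISTINCT ON in a subquery and aggregates only in the outer query. This also avoids the error from the question, because DISTINCT ON and GROUP BY are no longer applied at the same query level. The alias `latest` is a name chosen here for illustration.

```sql
-- Inner query: one row per (id, performance_hour), namely the one
-- with the latest run_hour not after the requested run.
-- Outer query: sum those rows per id.
select id, sum(value) as value
from
(
  select distinct on (id, performance_hour)
    id,
    performance_hour,
    value
  from metrics
  where run_hour <= timestamp '2017-06-25 09:00'
    and performance_hour >= timestamp '2017-06-25 07:00'
    and performance_hour <  timestamp '2017-06-25 09:00'
  -- DISTINCT ON keeps the first row of each group, so sort runs newest first.
  order by id, performance_hour, run_hour desc
) latest
group by id
order by id;
```

With the question's sample records this picks, for id 1, the 07:00 value from the 08:00 run and the 08:00 value from the 09:00 run (5 + 5 = 10), and for id 2 both values from the 09:00 run (6 + 5 = 11).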

Regarding sql - "distinct on" with group by in Postgres, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44805001/
