sql - How to insert non-grouped data

Tags: sql postgresql group-by

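Inspired by this great answer, I wrote the following query, which returns the AVG computed over 5-minute intervals for the last year.

What I want is every 5-minute interval, with the value set to null when no rows fall within that time span.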

with intervals as (select
                     (select min("timestamp") from public.hst_energy_d) + n AS start_timestamp,
                     (select min("timestamp") from public.hst_energy_d) + n + 299 AS end_timestamp
                   from generate_series(extract(epoch from now())::BIGINT - 10596096000, extract(epoch from now())::BIGINT, 300) n)
(SELECT AVG(meas."Al1") as "avg", islots.start_timestamp AS "timestamp"
FROM public.hst_energy_d meas
  RIGHT OUTER JOIN intervals islots
    on meas.timestamp >= islots.start_timestamp and meas.timestamp <= islots.end_timestamp
WHERE
  meas.idinstrum = 4
  AND
  meas.id_device = 122
  AND
  meas.timestamp > extract(epoch from now()) - 10596096000
GROUP BY islots.start_timestamp, islots.end_timestamp
ORDER BY timestamp);

Best answer

I think I see what you are trying to do, and I wonder whether liberal use of interval '5 minutes' might be a better, easier-to-follow approach:

with times as (  -- find the first date in the dataset, up to today
  select
    date_trunc ('minutes', min("timestamp")) - 
    mod (extract ('minutes' from min("timestamp"))::int, 5) * interval '1 minute' as bt,
    date_trunc ('minutes', current_timestamp) - 
    mod (extract ('minutes' from current_timestamp)::int, 5) * interval '1 minute' as et
  from hst_energy_d
  where
    idinstrum = 4 and
    id_device = 122
), -- generate every possible range between these dates
ranges as (
  select
    generate_series(bt, et, interval '5 minutes') as range_start
  from times
), -- normalize your data to which 5-minute interval it belongs to
rounded_hst as (
  select
    date_trunc ('minutes', "timestamp") - 
    mod (extract ('minutes' from "timestamp")::int, 5) * interval '1 minute' as round_time,
    *
  from hst_energy_d
  where
    idinstrum = 4 and
    id_device = 122  
)
select
  r.range_start, r.range_start + interval '5 minutes' as range_end,
  avg (hd."Al1")
from
  ranges r
  left join rounded_hst hd on
    r.range_start = hd.round_time
group by
  r.range_start
order by
  r.range_start

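As an aside, the sharp-eyed may wonder why I bother with the rounded_hst CTE instead of simply using "between" in the join. From everything I have tested and observed, the database builds every possible combination and then tests the between condition in what amounts to a where clause: a filtered Cartesian product. With this many intervals, that is bound to be a killer.

Truncating each data point down to its five-minute boundary allows a standard SQL equality join. I encourage you to test both approaches; I think you will see what I mean.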
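
The idea can also be sketched outside the database. The Python below only illustrates the same logic and is not the query above: it assumes epoch-second timestamps, and the names `floor_to_slot` and `average_per_slot` are made up for this example. Each reading is floored to the start of its five-minute slot, readings are grouped by that key, and every slot in the range is then emitted, with None where no reading landed (the analogue of the left join returning null).

```python
from collections import defaultdict

SLOT_SECONDS = 300  # five minutes


def floor_to_slot(ts):
    """Round an epoch-second timestamp down to the start of its 5-minute slot."""
    return ts - ts % SLOT_SECONDS


def average_per_slot(readings, start, end):
    """Return [(slot_start, average or None)] for every slot from start to end.

    readings is an iterable of (epoch_seconds, value) pairs.
    """
    # Group values by the slot they fall into (the "rounded_hst" step).
    grouped = defaultdict(list)
    for ts, value in readings:
        grouped[floor_to_slot(ts)].append(value)

    # Walk every slot in the range (the "ranges" step) and look up its values,
    # emitting None for empty slots just as a left join would yield null.
    result = []
    for slot in range(floor_to_slot(start), floor_to_slot(end) + 1, SLOT_SECONDS):
        values = grouped.get(slot)
        result.append((slot, sum(values) / len(values) if values else None))
    return result


if __name__ == "__main__":
    sample = [(10, 1.0), (200, 3.0), (650, 5.0)]
    for slot, avg in average_per_slot(sample, 0, 900):
        print(slot, avg)
```

The lookup per slot is a single key match, which is the point of the rounding step: no reading is ever compared against a slot it cannot belong to.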

-- Edit, November 17, 2016 --

A solution that accounts for the OP's times being numbers rather than dates:

with times as (  -- find the first date in the dataset, up to today
    select
      date_trunc('minutes', to_timestamp(min("timestamp"))::timestamp) -
      mod(extract ('minutes' from to_timestamp(min("timestamp"))::timestamp)::int, 5) * interval '1 minute' as bt,
      date_trunc('minutes', current_timestamp::timestamp) -
      mod(extract ('minutes' from (current_timestamp)::timestamp)::int, 5) * interval '1 minute' as et
    from hst_energy_d
    where
      idinstrum = 4 and
      id_device = 122
), -- generate every possible range between these dates
    ranges as (
      select
        generate_series(bt, et, interval '5 minutes') as range_start
      from times
  ), -- normalize your data to which 5-minute interval it belongs to
    rounded_hst as (
      select
        date_trunc ('minutes', to_timestamp("timestamp")::timestamp)::timestamp -
        mod (extract ('minutes' from (to_timestamp("timestamp")::timestamp))::int, 5) * interval '1 minute' as round_time,
        *
      from hst_energy_d
      where
        idinstrum = 4 and
        id_device = 122
  )
select
  extract('epoch' from r.range_start)::bigint, extract('epoch' from r.range_start + interval '5 minutes')::bigint as range_end,
  avg (hd."Al1")
from
  ranges r
  left join rounded_hst hd on
                             r.range_start = hd.round_time
group by
  r.range_start
order by
  r.range_start;
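
Since the timestamps here are epoch seconds, the slot start can also be found with plain integer arithmetic. The Python sketch below is an illustration under assumptions, not part of the answer: it assumes UTC (or any time zone whose offset is a whole multiple of five minutes), and both function names are hypothetical. It checks that flooring the epoch value to a multiple of 300 gives the same slot as truncating to the minute and subtracting (minute mod 5) minutes, which is what the query does.

```python
from datetime import datetime, timedelta, timezone


def slot_via_datetime(epoch_seconds):
    """Mirror the query: truncate to the minute, then subtract (minute mod 5) minutes."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    truncated = moment.replace(second=0, microsecond=0)
    slot = truncated - timedelta(minutes=truncated.minute % 5)
    return int(slot.timestamp())


def slot_via_integer(epoch_seconds):
    """Same slot start using only integer arithmetic on the epoch value."""
    return epoch_seconds - epoch_seconds % 300


if __name__ == "__main__":
    # Both approaches should agree on every sample timestamp.
    for ts in (0, 299, 300, 1479340799, 1479340800):
        assert slot_via_datetime(ts) == slot_via_integer(ts)
    print(slot_via_integer(1479340799))
```

In PostgreSQL the analogous expression on a bigint column would be "timestamp" - "timestamp" % 300, which would keep the join on integers. Whether that is actually faster than the to_timestamp version is something to test against your own data.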

Regarding sql - How to insert non-grouped data, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40636984/
