I'm using PostgreSQL 9 and I'm struggling with counting and grouping when there are no rows to count.
Let's assume the following schema:
create table views (
    date_event timestamp with time zone,
    event_id integer
);
Let's imagine the following content:
2012-01-01 00:00:05 2
2012-01-01 01:00:05 5
2012-01-01 03:00:05 8
2012-01-01 03:00:15 20
I want to group by hour and count the rows. I'd like to retrieve the following:
2012-01-01 00:00:00 1
2012-01-01 01:00:00 1
2012-01-01 02:00:00 0
2012-01-01 03:00:00 2
2012-01-01 04:00:00 0
2012-01-01 05:00:00 0
.
.
2012-01-07 23:00:00 0
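I mean that for each time slot I count the rows in the table whose date falls in that slot; if there are none, I still return a row with a count of zero.
The following definitely won't work (it only yields rows with a count > 0).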
SELECT extract ( hour from date_event ),count(*)
FROM views
where date_event > '2012-01-01' and date_event <'2012-01-07'
GROUP BY extract ( hour from date_event );
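Note that I may also need to group by minute, hour, day, month or year (several queries are fine, of course).
I can only use plain old SQL, and since my views table can be very large (>100M records), I'm trying to keep performance in mind.
How can this be achieved?
Thanks!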
Best answer
Given that the table has no rows for the empty slots, you need a way to generate those dates. You can use the generate_series function:
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts;
This will produce a result like this:
ts
---------------------
2012-01-01 00:00:00
2012-01-01 01:00:00
2012-01-01 02:00:00
2012-01-01 03:00:00
...
2012-01-07 21:00:00
2012-01-07 22:00:00
2012-01-07 23:00:00
(168 rows)
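Since the question also mentions other granularities, here is a minimal sketch of the same call with a different step. The date ranges are only illustrative, and the explicit casts are added for clarity rather than taken from the original answer.

```sql
-- One slot per day for the first week of January 2012 (7 rows).
SELECT ts
FROM generate_series('2012-01-01'::timestamp,
                     '2012-01-07'::timestamp,
                     '1 day'::interval) AS ts;

-- One slot per month for the year 2012 (12 rows).
SELECT ts
FROM generate_series('2012-01-01'::timestamp,
                     '2012-12-01'::timestamp,
                     '1 month'::interval) AS ts;
```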
The remaining task is to join the two selects with an outer join, like this:
select extract(day from ts) as day, extract(hour from ts) as hour, coalesce(count, 0) as count
from
(
    -- count the rows actually present, per day and hour
    -- (half-open range that covers every generated slot)
    select extract(day from date_event) as day, extract(hour from date_event) as hr, count(*)
    from views
    where date_event >= '2012-01-01' and date_event < '2012-01-08'
    group by extract(day from date_event), extract(hour from date_event)
) as cnt
right outer join
(
    -- one row per hourly slot, including the empty ones
    select * from generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') as ts
) as dtetable
on extract(hour from ts) = cnt.hr and extract(day from ts) = cnt.day
order by day, hour asc;
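Joining on the extracted day and hour only works while the range stays within a single month. As a possible variant (a sketch, not part of the original answer), the table can be aggregated with date_trunc and joined on the truncated timestamp itself. The aliases slots, cnt, n and hour_start are made up for illustration; views and date_event come from the question's schema. The series is generated as timestamp with time zone on the assumption that it should match the column type.

```sql
SELECT slots.ts           AS hour_start,
       coalesce(cnt.n, 0) AS count          -- empty slots have no match, so n is NULL
FROM generate_series('2012-01-01'::timestamptz,
                     '2012-01-07 23:00'::timestamptz,
                     '1 hour'::interval) AS slots(ts)
LEFT JOIN (
    -- Aggregate the big table once, restricted to the reporting window.
    SELECT date_trunc('hour', date_event) AS ts,
           count(*)                       AS n
    FROM views
    WHERE date_event >= '2012-01-01'
      AND date_event <  '2012-01-08'
    GROUP BY 1
) AS cnt ON cnt.ts = slots.ts
ORDER BY slots.ts;
```

To group by another unit, change both the date_trunc field and the series step together, for example 'day' with '1 day', or 'month' with '1 month'. An index on date_event should help the range filter on a large table, although whether the planner uses it depends on the data.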
Regarding "postgresql - group by date with 0 when count() yields no rows", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/9428287/