sql - Binning a time series in PostgreSQL 9.5 using generate_series?

Tags: sql postgresql time-series binning generate-series

I am trying to extract data from the table below by distributing each duration (the difference between ended_at and started_at) into 1-hour bins.

 id |       started_at        |        ended_at         |        friendly_name        |    ssid
----+-------------------------+-------------------------+-----------------------------+-------------
  1 | 2016-03-16 09:40:34.796 | 2016-03-16 11:32:22.536 | Wireless Network Connection | LINCS
  2 | 2016-03-24 09:44:28.266 | 2016-03-24 10:14:56.598 | Wireless Network Connection | LINCS
  3 | 2016-03-21 15:12:51.63  | 2016-03-21 15:13:39.815 | Wireless Network Connection | router
  4 | 2016-03-21 15:13:43.609 | 2016-03-21 17:14:32.686 | Wireless Network Connection | LINCS
  5 | 2016-03-21 12:56:14.014 | 2016-03-21 13:12:23.778 | Wireless Network Connection | router
  6 | 2016-03-21 12:56:10.158 | 2016-03-21 12:56:13.576 | Wireless Network Connection | router
  7 | 2016-03-21 13:12:28.104 | 2016-03-21 15:12:18.715 | Wireless Network Connection | LINCS
  8 | 2016-03-15 15:09:36.917 | 2016-03-15 16:03:00.881 | Wireless Network Connection | LINCS
  9 | 2016-03-15 15:17:36.318 | 2016-03-15 16:03:00.881 | Local Area Connection 2     |
 10 | 2016-03-18 09:57:06.436 | 2016-03-18 09:59:12.266 | Wireless Network Connection | LINCS
 11 | 2016-03-18 10:00:10.774 | 2016-03-18 10:00:11.665 | Wireless Network Connection | LINCS-guest
 12 | 2016-03-18 10:00:56.452 | 2016-03-18 10:01:58.145 | Wireless Network Connection | LINCS
 13 | 2016-03-18 10:00:11.961 | 2016-03-18 10:00:55.64  | Wireless Network Connection | LINCS-guest
 14 | 2016-03-18 10:02:45.959 | 2016-03-18 10:03:31.015 | Wireless Network Connection | LINCS
 15 | 2016-03-18 10:03:36.617 | 2016-03-18 10:03:38.879 | Wireless Network Connection | router
 16 | 2016-03-18 10:03:39.83  | 2016-03-18 12:31:31.554 | Wireless Network Connection | router
 17 | 2016-03-22 11:39:37.575 | 2016-03-22 12:33:28.507 | Wireless Network Connection | LINCS
 18 | 2016-03-22 11:26:14.581 | 2016-03-22 11:39:31.441 | Wireless Network Connection | LINCS
 19 | 2016-03-22 09:52:28.27  | 2016-03-22 11:25:47.034 | Wireless Network Connection | LINCS
 20 | 2016-03-17 13:11:09.64  | 2016-03-17 17:54:02.132 | Local Area Connection 2     |
 22 | 2016-03-23 09:45:08.519 | 2016-03-23 12:36:53.584 | Wireless Network Connection | LINCS
 31 | 2016-03-17 11:58:19.477 | 2016-03-17 11:59:19.555 | Local Area Connection 2     |
 36 | 2016-03-21 09:34:28.488 | 2016-03-21 12:30:43.361 | Wireless Network Connection | LINCS
 37 | 2016-03-24 11:13:28.319 | 2016-03-24 12:36:27.777 | Wireless Network Connection | LINCS
 41 | 2016-03-22 12:57:14.685 | 2016-03-22 17:51:06.866 | Wireless Network Connection | LINCS
 21 | 2016-03-17 13:09:59.749 | 2016-03-17 17:54:02.132 | Wireless Network Connection | LINCS
 23 | 2016-03-16 16:46:32.688 | 2016-03-16 17:36:59.534 | Local Area Connection 2     |
 24 | 2016-03-16 15:40:47.063 | 2016-03-16 17:36:59.534 | Wireless Network Connection | LINCS
 27 | 2016-03-16 11:57:24.468 | 2016-03-16 12:28:37.461 | Wireless Network Connection | LINCS
 28 | 2016-03-16 12:54:03.419 | 2016-03-16 14:34:52.477 | Wireless Network Connection | LINCS
 30 | 2016-03-17 11:02:26.223 | 2016-03-17 11:58:18.497 | Local Area Connection 2     |
 38 | 2016-03-25 13:05:04.641 | 2016-03-25 14:39:41.54  | Wireless Network Connection | LINCS
 39 | 2016-03-18 12:55:56.748 | 2016-03-18 18:04:34.032 | Wireless Network Connection | LINCS
 40 | 2016-03-22 12:56:33.444 | 2016-03-22 12:57:07.444 | Wireless Network Connection | LINCS
 42 | 2016-03-24 13:05:05.484 | 2016-03-24 18:35:11.764 | Wireless Network Connection | LINCS
 43 | 2016-03-25 09:43:46.038 | 2016-03-25 09:53:21.127 | Wireless Network Connection | LINCS
 44 | 2016-03-25 09:53:27.911 | 2016-03-25 12:33:28.556 | Wireless Network Connection | LINCS
 25 | 2016-03-15 17:14:43.024 | 2016-03-15 18:06:59.491 | Wireless Network Connection | LINCS
 26 | 2016-03-23 13:02:42.408 | 2016-03-23 15:45:55.124 | Wireless Network Connection | LINCS
 32 | 2016-03-17 11:59:23.153 | 2016-03-17 12:07:27.004 | Local Area Connection 2     |
 29 | 2016-03-16 15:16:11.326 | 2016-03-16 15:31:57.623 | Wireless Network Connection | LINCS
 33 | 2016-03-17 09:57:04.246 | 2016-03-17 11:58:59.182 | Wireless Network Connection | LINCS
 34 | 2016-03-17 12:07:27.095 | 2016-03-17 12:37:40.311 | Local Area Connection 2     |
 35 | 2016-03-17 11:59:20.515 | 2016-03-17 12:37:40.311 | Wireless Network Connection | LINCS
 45 | 2016-03-25 14:40:44.5   | 2016-03-25 18:24:08.555 | Wireless Network Connection | LINCS
 46 | 2016-03-31 10:16:31.8   | 2016-03-31 12:33:59.123 | Wireless Network Connection | LINCS
(46 rows)

I would like output like the following (this is only a mock-up):

 bin | duration |        time
-----+----------+---------------------
   1 |   3203.9 | 2016-03-15 15:00:00
   2 |  3136.46 | 2016-03-15 17:00:00
   3 |  2548.52 | 2016-03-16 09:00:00
   4 |  3004.00 | 2016-03-16 10:00:00
   5 |  1800.08 | 2016-03-16 11:00:00

What I have tried so far:

select width_bucket(c.started_at::timestamp, array[d]::timestamp[]) as Bin, EXTRACT(EPOCH FROM SUM(c.ended_at-c.started_at)) as value,
 date_trunc('hour', c.started_at) as time
from connections c, (
            select date_trunc('hour', generate_series(c.started_at, c.ended_at, '1 hour'))
            from connections c
            --where c.friendly_name LIKE 'Wireless Network%'
            --AND c.ssid LIKE 'LINCS'
            group by 1
            order by 1
        ) d
where c.ssid!=''
AND c.started_at BETWEEN '2016-03-15 15:00:00' AND '2016-03-18 19:00:00'
AND c.friendly_name LIKE 'Wireless Network%'
group by 1,3
order by 1

I get the following error:

ERROR:  cannot cast type record to timestamp without time zone
LINE 1: ...elect width_bucket(c.started_at::timestamp, array[d]::timest...
                                                         ^
********** Error **********

ERROR: cannot cast type record to timestamp without time zone
SQL state: 42846
Character: 52

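I am fairly new to SQL/PostgreSQL, but I understand that the error is related to type casting. What I am not sure about is whether generate_series() can be used in the array[] argument of width_bucket() at all.

Please help!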

What works so far (but is impractical, because the array has to be filled in by hand):

select width_bucket(c.started_at::timestamp, array['2016-03-15', '2016-03-16', '2016-03-17', '2016-03-18']::timestamp[]) as Bin, EXTRACT(EPOCH FROM SUM(c.ended_at-c.started_at)) as value,
date_trunc('hour', c.started_at) as time
from connections c
where c.ssid!=''
AND c.started_at BETWEEN '2016-03-15 15:00:00' AND '2016-03-18 19:00:00'
AND c.friendly_name LIKE 'Wireless Network%'
group by 1,3
order by 1

Best answer

The ARRAY[] constructor takes a fixed list of values, e.g. ARRAY[1,2,3].
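
A minimal sketch of why the attempt in the question fails, assuming a PostgreSQL 9.5 or later session. In array[d], d is the alias of the derived table, so it stands for a whole row rather than a timestamp value, which is consistent with the "cannot cast type record" error. The column aliases thresholds, array_type and ts are hypothetical and only used for illustration.

```sql
-- ARRAY[] with a fixed list: each listed element is cast to timestamp.
SELECT ARRAY['2016-03-15', '2016-03-16']::timestamp[] AS thresholds;

-- d is a derived-table alias, so ARRAY[d] holds a whole row (a record),
-- not a timestamp; pg_typeof reports the resulting array type.
SELECT pg_typeof(ARRAY[d]) AS array_type
FROM (SELECT generate_series(1, 3) AS ts) AS d;
```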

The ARRAY() constructor takes a subquery and turns each returned row into an array element, e.g.

ARRAY(SELECT generate_series(
  '2016-03-15 15:00:00'::timestamp,
  '2016-03-18 19:00:00'::timestamp,
  '1 hour'
))
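
Plugged into the query from the question, this could look roughly as follows. This is a sketch rather than a tested solution: it assumes the connections table shown in the question and reuses the time range from its WHERE clause as the bin boundaries. The ORDER BY inside the subquery is an added precaution, since width_bucket() expects its thresholds in ascending order.

```sql
-- Assumes connections(started_at, ended_at, friendly_name, ssid) as in the question.
SELECT width_bucket(
         c.started_at::timestamp,
         -- Hourly thresholds built by a subquery instead of a hand-written list.
         ARRAY(SELECT generate_series(
                        '2016-03-15 15:00:00'::timestamp,
                        '2016-03-18 19:00:00'::timestamp,
                        '1 hour')
               ORDER BY 1)
       ) AS bin,
       EXTRACT(EPOCH FROM SUM(c.ended_at - c.started_at)) AS value,
       date_trunc('hour', c.started_at) AS time
FROM connections c
WHERE c.ssid <> ''
  AND c.started_at BETWEEN '2016-03-15 15:00:00' AND '2016-03-18 19:00:00'
  AND c.friendly_name LIKE 'Wireless Network%'
GROUP BY 1, 3
ORDER BY 1;
```

Note that, like the original attempt, this assigns the whole duration of a connection to the bin of its start hour; a connection spanning several hours is not split across bins.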

For "sql - Binning a time series in PostgreSQL 9.5 using generate_series?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37856000/
