sql - Synthesizing SQL rows within ranges

Tags: sql sql-server

Using SQL Server, I have a table similar to the following:

id | time                | measurement
---+---------------------+-------------
1  | 2014-01-01T05:00:00 | 1.0
1  | 2014-01-01T06:45:00 | 2.0
1  | 2014-01-01T09:30:00 | 3.0
1  | 2014-01-01T11:00:00 | NULL
1  | 2014-02-05T03:00:00 | 1.0
1  | 2014-02-05T05:00:00 | NULL

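A stored measurement is assumed to remain accurate until a new value is supplied for the same id; the last measurement for a given id marks the end of the series.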

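I am interested in creating a query or view that synthesizes a new data point at every whole hour inside the spans these rows define, wherever such a point does not already exist (and the previous point is neither 0 nor NULL), so that the result looks like this:
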
id | time                | measurement
---+---------------------+-------------
1  | 2014-01-01T05:00:00 | 1.0
1  | 2014-01-01T06:00:00 | 1.0
1  | 2014-01-01T06:45:00 | 2.0
1  | 2014-01-01T07:00:00 | 2.0
1  | 2014-01-01T08:00:00 | 2.0
1  | 2014-01-01T09:00:00 | 2.0
1  | 2014-01-01T09:30:00 | 3.0
1  | 2014-01-01T10:00:00 | 3.0
1  | 2014-02-05T03:00:00 | 1.0
1  | 2014-02-05T04:00:00 | 1.0

Is this feasible?

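Would it be more feasible if each input row carried a "duration" specifying how long its measurement is valid? (In that case we would effectively be unpacking a run-length encoding in SQL.) [I am targeting SQL Server 2012, which has the LEAD and LAG functions, so the duration is easy to construct.]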

To provide that data in a form SQL Server can consume:
select id, cast(stime as datetime) as [time], measurement 
from 
(values
    (1, '2014-01-01T05:00:00', 1.0), 
    (1, '2014-01-01T06:45:00', 2.0), 
    (1, '2014-01-01T09:30:00', 3.0), 
    (1, '2014-01-01T11:00:00', NULL), 
    (1, '2014-02-05T03:00:00', 1.0), 
    (1, '2014-02-05T05:00:00', NULL)
) t(id, stime, measurement) 
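
The "duration" idea mentioned above can be derived from the timestamps themselves. The following is a minimal sketch, assuming SQL Server 2012 or later; the CTE names `src` and `spans` and the columns `valid_until` and `duration_minutes` are made up for illustration and are not part of the original table.

```sql
-- Sketch: derive the end of each row's span, and its length, with LEAD.
-- valid_until and duration_minutes are illustrative names only.
WITH src AS (
    SELECT id, CAST(stime AS datetime) AS [time], measurement
    FROM (VALUES
        (1, '2014-01-01T05:00:00', 1.0),
        (1, '2014-01-01T06:45:00', 2.0),
        (1, '2014-01-01T09:30:00', 3.0),
        (1, '2014-01-01T11:00:00', NULL),
        (1, '2014-02-05T03:00:00', 1.0),
        (1, '2014-02-05T05:00:00', NULL)
    ) t(id, stime, measurement)
),
spans AS (
    SELECT id, [time], measurement,
           -- The next timestamp for the same id closes the current span.
           LEAD([time]) OVER (PARTITION BY id ORDER BY [time]) AS valid_until
    FROM src
)
SELECT id, [time], measurement, valid_until,
       DATEDIFF(minute, [time], valid_until) AS duration_minutes
FROM spans
ORDER BY id, [time];
```

For the first row this gives a `valid_until` of 06:45 and a duration of 105 minutes. The last row of each id has no successor, so both derived columns are NULL there, which matches the rule that the last measurement ends the series.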

Best answer

It is complicated, but it works (for the dataset you provided):

;WITH cte AS (
SELECT *
FROM (VALUES
(1, '2014-01-01T05:00:00', '1.0'),(1, '2014-01-01T06:45:00', '2.0'),
(1, '2014-01-01T09:30:00', '3.0'),(1, '2014-01-01T11:00:00', NULL),
(1, '2014-02-05T03:00:00', '1.0'),(1, '2014-02-05T05:00:00', NULL)
) as t (id, [time], measurement)
)
--Get intervals for every date
, dates AS (
SELECT MIN([time]) [min], DATEADD(hour,-1,MAX([time])) [max]
FROM cte
GROUP BY CAST([time] as date)
)
--Create table with gaps datetimes
, add_dates AS (
SELECT CAST([min] as datetime) as date_
FROM dates
UNION ALL
SELECT DATEADD(hour,1,a.date_)
FROM add_dates a
INNER JOIN dates d 
    ON a.date_ between d.[min] and d.[max]
WHERE a.date_ < d.[max]
)
--Get intervals of datetimes with ids and measurements
, res AS (
SELECT  id,
        [time],
        LEAD([time],1,NULL) OVER (ORDER BY [time])as [time1],
        measurement
FROM cte
)
--Final select
SELECT DISTINCT *
FROM (
    SELECT  r.id,
            a.date_,
            r.measurement
    FROM add_dates a
    LEFT JOIN res r
        ON a.date_ between r.time and r.time1
    WHERE measurement IS NOT NULL
    UNION ALL
    SELECT * 
    FROM cte
    WHERE measurement IS NOT NULL
) as t
ORDER BY t.date_

Output:
id  date_                   measurement
1   2014-01-01 05:00:00.000 1.0
1   2014-01-01 06:00:00.000 1.0
1   2014-01-01 06:45:00.000 2.0
1   2014-01-01 07:00:00.000 2.0
1   2014-01-01 08:00:00.000 2.0
1   2014-01-01 09:00:00.000 2.0
1   2014-01-01 09:30:00.000 3.0
1   2014-01-01 10:00:00.000 3.0
1   2014-02-05 03:00:00.000 1.0
1   2014-02-05 04:00:00.000 1.0
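
The query above groups by calendar date and relies on a recursive CTE plus DISTINCT. As a hedged alternative, not part of the original answer, the sketch below expands each span directly: LEAD finds where the span ends, and a small numbers list supplies the whole hours inside it. It assumes every span with a usable measurement is shorter than about 1000 hours (the size of the illustrative `numbers` CTE) and reads "neither 0 nor NULL" as a condition on synthesized points only. The names `src`, `spans`, `next_time`, `hour_floor`, `digits` and `numbers` are invented for this example.

```sql
-- Sketch: expand each span into its whole hours without grouping by date.
WITH src AS (
    SELECT id, CAST(stime AS datetime) AS [time], measurement
    FROM (VALUES
        (1, '2014-01-01T05:00:00', 1.0),
        (1, '2014-01-01T06:45:00', 2.0),
        (1, '2014-01-01T09:30:00', 3.0),
        (1, '2014-01-01T11:00:00', NULL),
        (1, '2014-02-05T03:00:00', 1.0),
        (1, '2014-02-05T05:00:00', NULL)
    ) t(id, stime, measurement)
),
spans AS (
    SELECT id, [time], measurement,
           LEAD([time]) OVER (PARTITION BY id ORDER BY [time]) AS next_time,
           -- Start of the hour containing [time].
           DATEADD(hour, DATEDIFF(hour, 0, [time]), 0) AS hour_floor
    FROM src
),
digits AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(d)
),
numbers AS (
    -- 0..999; assumed to be enough hours for any single span.
    SELECT a.d + 10 * b.d + 100 * c.d AS n
    FROM digits a CROSS JOIN digits b CROSS JOIN digits c
)
-- Stored rows that carry a value.
SELECT id, [time], measurement
FROM spans
WHERE measurement IS NOT NULL
UNION ALL
-- Whole hours strictly after the stored row and strictly before the next one.
SELECT s.id, DATEADD(hour, n.n, s.hour_floor), s.measurement
FROM spans s
JOIN numbers n
  ON n.n >= 1
 AND DATEADD(hour, n.n, s.hour_floor) < s.next_time
WHERE s.measurement IS NOT NULL
  AND s.measurement <> 0
ORDER BY id, [time];
```

On the sample data this should yield the same ten rows as the output shown above. Because each span is expanded on its own, a span that crosses midnight would be filled as well, subject to the assumed 1000-hour cap.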

Edit

Part one

Change the dates CTE to the following:
, dates AS (
SELECT DATEADD(hour,DATEPART(hour,MIN([time])),CAST(CAST(MIN([time]) as date) as datetime)) [min], DATEADD(hour,-1,MAX([time])) [max]
FROM cte
GROUP BY CAST([time] as date)
)

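This truncates the minutes and seconds from the start time of each date, so the synthesized points fall on whole hours even when the first measurement of a day does not.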
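
To see the truncation in isolation, here is a small sketch that applies the same expression to a single value and compares it with the common DATEADD/DATEDIFF idiom. The variable `@t` and the column aliases are illustrative only.

```sql
-- Sketch: two ways to truncate a datetime to the start of its hour.
DECLARE @t datetime = '2014-01-01T06:45:30';

SELECT
    -- The form used in the dates CTE: midnight of the date plus the hour part.
    DATEADD(hour, DATEPART(hour, @t), CAST(CAST(@t AS date) AS datetime)) AS via_date_cast,
    -- Shorter idiom: count whole hours since the zero date, then add them back.
    DATEADD(hour, DATEDIFF(hour, 0, @t), 0) AS via_datediff;
```

Both columns should come back as 2014-01-01 06:00:00.000. Which form to use is a matter of taste; the second avoids the double cast.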



Part two

Adding PARTITION BY id to the LEAD call keeps rows that belong to different ids from being mixed together:


, res AS (
SELECT  id,
        [time],
        LEAD([time],1,NULL) OVER (PARTITION BY id ORDER BY [time])as [time1],
        measurement
FROM cte
)

For the original dataset the output will be the same.
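
The effect of the partition only shows up once a second id is present. The sketch below uses a hypothetical id 2 that is not in the original dataset, and the aliases `next_any_id` and `next_same_id` are invented for the comparison.

```sql
-- Sketch: LEAD with and without PARTITION BY id on interleaved ids.
WITH src AS (
    SELECT id, CAST(stime AS datetime) AS [time], measurement
    FROM (VALUES
        (1, '2014-01-01T05:00:00', 1.0),
        (2, '2014-01-01T05:30:00', 7.0),   -- hypothetical second id
        (1, '2014-01-01T06:45:00', 2.0),
        (2, '2014-01-01T08:00:00', NULL)   -- hypothetical second id
    ) t(id, stime, measurement)
)
SELECT id, [time], measurement,
       -- Without a partition, the "next" row may belong to another id.
       LEAD([time]) OVER (ORDER BY [time]) AS next_any_id,
       -- With the partition, the span ends at the next row of the same id.
       LEAD([time]) OVER (PARTITION BY id ORDER BY [time]) AS next_same_id
FROM src
ORDER BY id, [time];
```

For the 05:00 row of id 1, `next_any_id` is 05:30 (a row of id 2) while `next_same_id` is 06:45. Other parts of the full query, such as the dates CTE that groups only by calendar date, may also need to take id into account for multi-id data; that has not been verified here.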

Regarding sql - Synthesizing SQL rows within ranges, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37377790/
