I have a table with records in this format:
SPECIAL_ID | OTHER_ID | NAME | TIMESTAMP
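I need to create a new table from it, but keep only records that are at least 25 minutes after the previously kept record. The intervals are not regular. So if I start with record 1, I need the next record whose timestamp is at least 25 minutes after record 1's timestamp. That becomes record 2 in the new table. Then I need the next record that is at least 25 minutes after the one just extracted, and so on. Here is some sample data: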
AAA | 1 | WHATEVER2 | 2016-11-20 00:00:00
BCD | 2 | WHATEVER00 | 2016-11-20 00:02:00
AAA | 3 | WHATEVER01 | 2016-11-20 00:09:00
AAA | 4 | WHATEVER55 | 2016-11-20 00:20:00
XYZ | 5 | WHATEVER | 2016-11-20 00:24:00
AAA | 6 | WHATEVER11 | 2016-11-20 00:45:00
QRS | 7 | WHATEVER | 2016-11-20 00:46:00
QRS | 8 | WHATEVER12 | 2016-11-20 00:59:00
AAA | 9 | WHATEVER12 | 2016-11-20 01:02:00
AAA |10 | WHATEVER12 | 2016-11-20 01:17:00
What I want to end up with is:
AAA | 1 | WHATEVER2 | 2016-11-20 00:00:00
AAA | 6 | WHATEVER11 | 2016-11-20 00:45:00
AAA |10 | WHATEVER12 | 2016-11-20 01:17:00
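I managed to do this with a cursor and tested it on a small set of records. It works... but I have millions of records that need to be analyzed this way. A cursor, it seems, is just asking for trouble.
Is there a better way?
I am stuck on SQL Server 2008, so lead() and lag() are not an option.
Any help is greatly appreciated.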
Best answer
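For SQL Server 2008, this is one possible (computationally expensive) solution:
Step 1 determines the first record for each SPECIAL_ID. Step 2 finds, for every existing record, the nearest record of the same SPECIAL_ID that is at least 25 minutes later. Step 3 then reduces those pairs to the one with the lowest OTHER_ID for each target record. Of course, this only works if OTHER_ID is unique and increases in step with the timestamp. With millions of records, the columns used in the query should be indexed and the search range should be limited.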
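The indexing advice could be made concrete with something like the following. This is only a sketch: dbo.Records and the index name are hypothetical, and which index actually helps depends on the real table and the query plan.

```sql
-- Hypothetical table and index names; adjust to the real schema.
-- SPECIAL_ID and [TIMESTAMP] are the join and range-filter columns of the query;
-- OTHER_ID is included so the self-join can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Records_SpecialId_Timestamp
    ON dbo.Records (SPECIAL_ID, [TIMESTAMP])
    INCLUDE (OTHER_ID);
```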
-- test script
SET DATEFORMAT ymd;

WITH testdata AS (
    SELECT 'AAA' AS SPECIAL_ID,
           1 AS OTHER_ID,
           'WHATEVER2' AS NAME,
           CONVERT(DATETIME, '2016-11-20 00:00:00') AS [TIMESTAMP]
    UNION SELECT 'BCD',  2, 'WHATEVER00', CONVERT(DATETIME, '2016-11-20 00:02:00')
    UNION SELECT 'AAA',  3, 'WHATEVER01', CONVERT(DATETIME, '2016-11-20 00:02:01')
    UNION SELECT 'AAA',  4, 'WHATEVER55', CONVERT(DATETIME, '2016-11-20 00:20:00')
    UNION SELECT 'XYZ',  5, 'WHATEVER',   CONVERT(DATETIME, '2016-11-20 00:24:00')
    UNION SELECT 'AAA',  6, 'WHATEVER11', CONVERT(DATETIME, '2016-11-20 00:45:00')
    UNION SELECT 'QRS',  7, 'WHATEVER',   CONVERT(DATETIME, '2016-11-20 00:46:00')
    UNION SELECT 'QRS',  8, 'WHATEVER12', CONVERT(DATETIME, '2016-11-20 00:59:00')
    UNION SELECT 'AAA',  9, 'WHATEVER12', CONVERT(DATETIME, '2016-11-20 01:02:00')
    UNION SELECT 'AAA', 10, 'WHATEVER12', CONVERT(DATETIME, '2016-11-20 01:17:00')
    UNION SELECT 'QRS', 11, 'WHATEVER13', CONVERT(DATETIME, '2016-11-20 01:30:00')
), firstRecord AS (
    -- Step 1: the first record of each SPECIAL_ID
    SELECT SPECIAL_ID, MIN(OTHER_ID) AS OTHER_ID
    FROM testdata
    GROUP BY SPECIAL_ID
), nextRecord1 AS (
    -- Step 2: for every record, the nearest later record (same SPECIAL_ID)
    -- that is at least 25 minutes after it
    SELECT I1.SPECIAL_ID, I1.OTHER_ID AS OTHER_ID, MIN(I2.OTHER_ID) AS next_OTHER_ID
    FROM testdata I1
    INNER JOIN testdata I2
        ON I1.SPECIAL_ID = I2.SPECIAL_ID
       AND I1.OTHER_ID < I2.OTHER_ID
       AND I2.[TIMESTAMP] >= DATEADD(minute, 25, I1.[TIMESTAMP])
    GROUP BY I1.SPECIAL_ID, I1.OTHER_ID
), nextRecord2 AS (
    -- Step 3: one row per target record, keeping the lowest source OTHER_ID
    SELECT SPECIAL_ID, MIN(OTHER_ID) AS OTHER_ID, next_OTHER_ID
    FROM nextRecord1
    GROUP BY SPECIAL_ID, next_OTHER_ID
)
-- Result: the first records plus every record reached in step 3
SELECT T2.*
FROM firstRecord T1
INNER JOIN testdata T2
    ON T1.OTHER_ID = T2.OTHER_ID
UNION
SELECT T2.*
FROM nextRecord2 T1
INNER JOIN testdata T2
    ON T1.next_OTHER_ID = T2.OTHER_ID
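One caveat about the query above: it treats every record as a possible starting point, so it can keep a record that is less than 25 minutes after the previously kept one. With records at minutes 0, 10, 30, 40 and 56, for example, minute 40 is kept because it is the nearest match for minute 10, even though the chain starting at 0 goes 0, 30, 56. If the chain has to be followed strictly, a recursive CTE (available in SQL Server 2008) is one alternative. The following is a sketch under assumptions, not a tuned solution: the table variable @records is a hypothetical stand-in for the real table, loaded with the sample rows from the question, and the chain is tracked separately per SPECIAL_ID.

```sql
-- Sketch only: @records is a hypothetical stand-in for the real table.
DECLARE @records TABLE (
    SPECIAL_ID  VARCHAR(10),
    OTHER_ID    INT PRIMARY KEY,
    NAME        VARCHAR(20),
    [TIMESTAMP] DATETIME
);

INSERT INTO @records (SPECIAL_ID, OTHER_ID, NAME, [TIMESTAMP]) VALUES
    ('AAA',  1, 'WHATEVER2',  '2016-11-20T00:00:00'),
    ('BCD',  2, 'WHATEVER00', '2016-11-20T00:02:00'),
    ('AAA',  3, 'WHATEVER01', '2016-11-20T00:09:00'),
    ('AAA',  4, 'WHATEVER55', '2016-11-20T00:20:00'),
    ('XYZ',  5, 'WHATEVER',   '2016-11-20T00:24:00'),
    ('AAA',  6, 'WHATEVER11', '2016-11-20T00:45:00'),
    ('QRS',  7, 'WHATEVER',   '2016-11-20T00:46:00'),
    ('QRS',  8, 'WHATEVER12', '2016-11-20T00:59:00'),
    ('AAA',  9, 'WHATEVER12', '2016-11-20T01:02:00'),
    ('AAA', 10, 'WHATEVER12', '2016-11-20T01:17:00');

WITH chain AS (
    -- Anchor: the earliest row of each SPECIAL_ID starts its chain.
    SELECT f.SPECIAL_ID, f.OTHER_ID, f.NAME, f.[TIMESTAMP]
    FROM (
        SELECT r.SPECIAL_ID, r.OTHER_ID, r.NAME, r.[TIMESTAMP],
               ROW_NUMBER() OVER (PARTITION BY r.SPECIAL_ID
                                  ORDER BY r.[TIMESTAMP], r.OTHER_ID) AS rn
        FROM @records r
    ) f
    WHERE f.rn = 1

    UNION ALL

    -- Recursive step: from the last kept row, keep the earliest row of the
    -- same SPECIAL_ID that is at least 25 minutes later.
    SELECT n.SPECIAL_ID, n.OTHER_ID, n.NAME, n.[TIMESTAMP]
    FROM (
        SELECT r.SPECIAL_ID, r.OTHER_ID, r.NAME, r.[TIMESTAMP],
               ROW_NUMBER() OVER (PARTITION BY r.SPECIAL_ID
                                  ORDER BY r.[TIMESTAMP], r.OTHER_ID) AS rn
        FROM chain c
        INNER JOIN @records r
            ON r.SPECIAL_ID = c.SPECIAL_ID
           AND r.[TIMESTAMP] >= DATEADD(minute, 25, c.[TIMESTAMP])
    ) n
    WHERE n.rn = 1
)
SELECT SPECIAL_ID, OTHER_ID, NAME, [TIMESTAMP]
FROM chain
ORDER BY SPECIAL_ID, [TIMESTAMP]
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

For the AAA rows of the sample data this should walk from OTHER_ID 1 to 6 to 10, matching the result asked for in the question. To write the result into a new table, an INTO clause (for example INTO dbo.FilteredRecords, a hypothetical name) can be added to the final SELECT. How this performs on millions of rows is untested here; each step is a range lookup on SPECIAL_ID and TIMESTAMP, so an index on those columns is likely to matter.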
Regarding "sql - Cursor alternative for this process?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40707886/