sql - 在 SQL 中,是否可以比我使用游标的算法(参见代码)更有效地完成顺序范围选择?

标签 sql tsql sql-server-2008 performance algorithm

我需要将多个序列号范围(1 个或多个)折叠成它们的最小值和最大值集。我有唯一的整数(没有重复)存储在表列中。

(对我而言)解决此问题的明显方法是使用游标(请参阅下面我的算法)并遍历每个整数。但是,这对我来说似乎效率低下,所以我想知道是否有更有效的算法。也许有一种方法可以通过递归使用公用表表达式。不过我有超过 32767 个整数,所以任何解决方案都需要使用 option (MAXRECURSION 0) 来设置无限递归。

以下是我现有算法使用光标的简化测试用例。它将输出每个序列号范围(例如 1-3、9-11、13-13、15-16)的最小值和最大值。

我使用的是 MS SQL Server 2008。请注意注释以两个破折号 (--) 开头。

declare @minInt int, @maxInt int
declare @nextInt int, @prevInt int
--need a temporary table to store the ranges that were found
declare @rangeTable table (minInt int, maxInt int)
declare mycursor cursor for
select * from
(
    select 1 as id  union
    select 2 as id  union
    select 3 as id  union
    select 9 as id  union
    select 10 as id union
    select 11 as id union
    select 13 as id union
    select 15 as id union
    select 16 as id
) tblRanges
order by id--order is needed for this algorithm if used with generic data
open mycursor
--initialise new sequence
fetch next from mycursor into @minInt
select @maxInt = @minInt--set the min and max to the smallest value
select @prevInt = @minInt--store the last int
declare @sequenceFound int
while @@FETCH_STATUS=0
begin

    select @sequenceFound=1--set the default flag value to true
    --loop while sequence found
    while @@FETCH_STATUS=0 and @sequenceFound = 1
    begin

        fetch next from mycursor into @nextInt
        if @nextInt = (@prevInt + 1)
        begin
            select @sequenceFound = 1
        end
        else
        begin
            select @sequenceFound = 0
        end
        select @prevInt = @nextInt--store the current value as the previous value for the next comparison
        if @sequenceFound = 1 --if the nextInt is part of a sequence, then store the new maxInt
            and @maxInt < @nextInt--should always be true for ordered output containing no duplicates
        begin
            select @maxInt = @nextInt
        end

    end--while sequenceFound
    --store the sequence range and then check for more sequences
    insert into @rangeTable (minInt,maxInt) values (@minInt,@maxInt)
    --store the current value as the new minInt and maxInt for the next sequence iteration
    select @minInt = @nextInt
    select @maxInt = @nextInt
end--while more table rows found
select * from @rangeTable

close mycursor
deallocate mycursor

最佳答案

由 Itzik Ben-Gan 提供:

WITH tblRanges AS
( 
    SELECT 1 AS ID  UNION 
    SELECT 2 AS ID  UNION 
    SELECT 3 AS ID  UNION 
    SELECT 9 AS ID  UNION 
    SELECT 10 AS ID UNION 
    SELECT 11 AS ID UNION 
    SELECT 13 AS ID UNION 
    SELECT 15 AS ID UNION 
    SELECT 16 AS ID 
),
StartingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID - 1)
),
EndingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID + 1)
)
SELECT S.ID AS start_range, E.ID AS end_range
FROM StartingPoints AS S
JOIN EndingPoints AS E
ON E.rownum = S.rownum;

您可以阅读他在 SQL Sever MVP Deep Dives 中名为 Gaps and Islands 的章节中的完整解释。 .他解释了各种技术(包括游标)并比较了它们的性能。

关于sql - 在 SQL 中,是否可以比我使用游标的算法(参见代码)更有效地完成顺序范围选择?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/3411192/

相关文章:

sql - 如何将值分配到桶中并找到包含值的桶?

sql - 检查图像列是否为空

简单联合的情况下的SQL查询性能

mysql - COUNT() 是否需要 GROUP BY?

SQL 查询和插入

sql - 交叉表查询按年和月排序

sql - 在 View 中动态更新

sql-server - SQL 2005 在数据库之间复制单列

sql - 计算sql server中的平均评分

sql - 2 个 "same"WHERE 语句返回不同的记录数