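I have a table similar to the one shown below. It contains a list of user IDs, an hour value for each hour of the day, and an Avail flag that indicates whether the user is available during that hour.
I need to list all user IDs that are available for a given number of consecutive hours, defined as @n.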
######################
# UID # Avail # Hour #
######################
# 123 #   1   #  0   #
# 123 #   1   #  1   #
# 123 #   0   #  2   #
# 123 #   0   #  3   #
# 123 #   0   #  4   #
# 123 #   1   #  5   #
# 123 #   1   #  6   #
# 123 #   1   #  7   #
# 123 #   1   #  8   #
# 341 #   1   #  0   #
# 341 #   1   #  1   #
# 341 #   0   #  2   #
# 341 #   1   #  3   #
# 341 #   1   #  4   #
# 341 #   0   #  5   #
# 341 #   1   #  6   #
# 341 #   1   #  7   #
# 341 #   0   #  8   #
######################
For @n = 3, this should produce the following output:
#######
# UID #
#######
# 123 #
#######
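For anyone who wants to reproduce this, a setup script along the following lines should work. The table name Table1 (borrowed from the queries in the answer) and the column types are assumptions; the question does not state them.

```sql
-- Assumed schema: Avail is int rather than bit so that SUM(Avail) is allowed.
-- [Hour] is bracketed only as a precaution, since HOUR is also a datepart name.
create table Table1 (
    UID    int not null,
    Avail  int not null,   -- 1 = available, 0 = not available
    [Hour] int not null,
    primary key (UID, [Hour])
);

insert into Table1 (UID, Avail, [Hour]) values
(123, 1, 0), (123, 1, 1), (123, 0, 2), (123, 0, 3), (123, 0, 4),
(123, 1, 5), (123, 1, 6), (123, 1, 7), (123, 1, 8),
(341, 1, 0), (341, 1, 1), (341, 0, 2), (341, 1, 3), (341, 1, 4),
(341, 0, 5), (341, 1, 6), (341, 1, 7), (341, 0, 8);
```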
I tried using
ROW_NUMBER() over (partition by UID,Avail ORDER BY UID,Hour)
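to assign a number to each row, partitioned by UID and by whether the row is flagged as available. However, this does not work, because availability can change several times during a day, and ROW_NUMBER() keeps only two counters per user, one for each value of the Avail flag.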
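The problem is easy to see by running the expression for a single user. This is only an illustration; it assumes the sample rows above live in a table named Table1, and the alias rn is made up.

```sql
select UID, Avail, [Hour],
       row_number() over (partition by UID, Avail order by [Hour]) as rn
from Table1
where UID = 123
order by [Hour];
-- Avail = 1 rows: hours 0, 1 get rn 1, 2 and hours 5..8 get rn 3..6.
-- Avail = 0 rows: hours 2..4 get rn 1..3.
-- The numbering carries on across the gap at hours 2..4,
-- so rn alone cannot tell the two available periods apart.
```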
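Best answer
If you are using SQL Server 2012+, you can use a windowed SUM, but you have to specify the number of rows in the window frame in advance because the frame does not accept a variable, so it is not that flexible: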
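-- For each row, sum Avail over the current row and the next two rows of the same user.
;with cte as
(
    select distinct
        UID,
        SUM(avail) over (partition by uid
                         order by hour
                         rows between current row and 2 following
                        ) cnt
    from table1
)
-- A sum of 3 means all three rows in the frame have Avail = 1.
select uid from cte where cnt = 3;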
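To see why the filter works, it can help to look at the per-row sums before DISTINCT collapses them. This sketch assumes the sample data from the question is stored in table1 with exactly one row per user and hour and no missing hours; the alias window_sum is made up for illustration.

```sql
select uid, [hour], avail,
       sum(avail) over (partition by uid
                        order by [hour]
                        rows between current row and 2 following) as window_sum
from table1
order by uid, [hour];
-- For UID 123 the sums at hours 0..8 are 2, 1, 0, 1, 2, 3, 3, 2, 1,
-- so hours 5 and 6 each start a run of three available hours.
-- For UID 341 the sum never exceeds 2.
```

Because the frame counts rows rather than hour values, a missing hour row would make two non-adjacent hours look consecutive, so the one-row-per-hour assumption matters here.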
If you want flexibility, you can make it a stored procedure and use dynamic SQL to build and execute the statement, like this:
create procedure testproc (@n int) as
begin
    declare @sql nvarchar(max);
    -- The frame size cannot be a variable, so @n is spliced into the statement text.
    set @sql = concat('
    ;with cte as
    (
        select distinct
            UID,
            SUM(avail) over (partition by uid
                             order by hour
                             rows between current row and ', @n - 1, ' following
                            ) cnt
        from table1
    )
    select uid from cte where cnt = ', @n, ';');
    exec sp_executesql @sql;
end
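And execute it using:
execute testproc 3
A less flexible solution is to use correlated subqueries, but then you have to add another subquery for every additional hour in the count: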
select distinct uid
from Table1 t1
where Avail = 1
and exists (select 1 from Table1 where Avail = 1 and UID = t1.UID and Hour = t1.Hour + 1)
and exists (select 1 from Table1 where Avail = 1 and UID = t1.UID and Hour = t1.Hour + 2);
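If the number of hours has to come from a variable, one possible variation replaces the chain of EXISTS checks with a single counting subquery. This is a sketch rather than part of the original answer; it assumes at most one row per UID and Hour, and the alias t2 and the local variable declaration are illustrative.

```sql
declare @n int = 3;

select distinct t1.UID
from Table1 t1
where t1.Avail = 1
  -- Count the available rows in the @n-hour span starting at t1.Hour;
  -- the span is fully available only if the count equals @n.
  and (select count(*)
       from Table1 t2
       where t2.UID = t1.UID
         and t2.Avail = 1
         and t2.[Hour] between t1.[Hour] and t1.[Hour] + @n - 1) = @n;
```

Unlike the window-frame version, this compares actual hour values, so a missing hour row simply lowers the count instead of being skipped over.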
There is another approach, which uses row_number to find islands and then filters by the sum of each island:
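;with c as (
    select
        uid, avail,
        -- The difference between the two row numbers stays constant within a run of equal Avail values.
        row_number() over (partition by uid order by hour)
        - row_number() over (partition by uid, avail order by hour) grp
    from table1
)
-- distinct keeps a user with several qualifying islands from being listed more than once.
select distinct uid from c
group by uid, grp
having sum(avail) >= 3;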
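The island approach does not depend on a window frame size, so the threshold can be an ordinary variable with no dynamic SQL. A minimal sketch, assuming the same table1 with one row per user and hour; the alias run_id and the variable declaration are illustrative.

```sql
declare @n int = 3;

;with c as (
    select uid, avail,
           row_number() over (partition by uid order by [hour])
         - row_number() over (partition by uid, avail order by [hour]) as run_id
    from table1
)
select distinct uid
from c
where avail = 1              -- keep only the islands of available hours
group by uid, run_id
having count(*) >= @n;       -- island must span at least @n rows
```

With the sample data and @n = 3 this returns only UID 123, whose hours 5 to 8 form an island of four rows.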
For more on finding consecutive numbers within a group in SQL, see the similar question on Stack Overflow: https://stackoverflow.com/questions/31837628/