I have a table in BigQuery that looks like this:
Caller_Number | month | day | call_time
--------------|-------|-----|----------
1             | 5     | 15  | 12:56:17
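I would like to write a SQL query for BigQuery that lets me count the number of consecutive hours in which at least one call was made (by caller_number), as well as the number of consecutive days on which calls occurred in at least 10 consecutive hours (by caller_number). I have been looking at existing resources on gaps and islands, but I can't figure out how to apply the technique to consecutive dates and times.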
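Best answer
Below is a working example for consecutive hours (BigQuery Legacy SQL).
The steps are:
1. "Extract" the hour from call_time
HOUR(TIMESTAMP(CURRENT_DATE() + ' ' + call_time))
2. Find the previous hour
LAG([hour]) OVER(PARTITION BY Caller_Number, [month], [day] ORDER BY [hour])
3. Flag the start of each group of consecutive hours: 1 means a group starts, 0 means the group continues
IFNULL(INTEGER([hour] - prev_hour > 1), 1)
4. Assign a group number to each group
SUM(seq) OVER(PARTITION BY Caller_Number, [month], [day] ORDER BY [hour])
5. Finally, group by the group number and count the calls and the distinct hours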
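To make the five steps easier to follow, here is a minimal Python sketch that mirrors them on a few in-memory rows. It is only an illustration of the same gaps-and-islands idea, not the BigQuery query itself; the sample rows and the function name `hour_islands` are made up for this example.

```python
from itertools import groupby

# Hypothetical sample rows: (caller_number, month, day, call_time)
calls = [
    (1, 5, 15, "09:05:00"),
    (1, 5, 15, "09:40:10"),
    (1, 5, 15, "10:15:30"),
    (1, 5, 15, "12:56:17"),
    (1, 5, 15, "13:01:00"),
    (1, 5, 16, "08:00:00"),
]

def hour_islands(rows):
    """Return (caller, month, day, seq_group, hours_count, calls_count) tuples."""
    # Step 1: extract the hour from call_time, then sort so that each
    # partition is contiguous and ordered by hour.
    keyed = sorted((c, m, d, int(t.split(":")[0])) for c, m, d, t in rows)
    result = []
    # Partition by caller, month, day (the PARTITION BY of the window functions).
    for (c, m, d), part in groupby(keyed, key=lambda r: r[:3]):
        prev_hour = None  # Step 2: previous hour (plays the role of LAG)
        seq_group = 0     # Step 4: running sum of the start flags
        groups = {}
        for *_, hour in part:
            # Step 3: 1 starts a new run, 0 continues the current one.
            seq = 1 if prev_hour is None or hour - prev_hour > 1 else 0
            seq_group += seq
            groups.setdefault(seq_group, []).append(hour)
            prev_hour = hour
        # Step 5: aggregate per group (distinct hours and number of calls).
        for g, hours in groups.items():
            result.append((c, m, d, g, len(set(hours)), len(hours)))
    return result
```

With the sample rows, caller 1 on 5/15 has two runs (hours 9 to 10 and hours 12 to 13), because the missing hour 11 makes the difference to the previous hour greater than 1.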
I hope this gives you a good start for implementing similar logic for consecutive days on top of the consecutive-hours result.
-- BigQuery Legacy SQL
-- Step 5: one row per run of consecutive hours
SELECT Caller_Number, [month], [day], seq_group,
  EXACT_COUNT_DISTINCT([hour]) AS hours_count, COUNT(1) AS calls_count
FROM (
  -- Step 4: running sum of the start flags gives the group number
  SELECT Caller_Number, [month], [day], [hour],
    SUM(seq) OVER(PARTITION BY Caller_Number, [month], [day]
                  ORDER BY [hour]) AS seq_group
  FROM (
    -- Step 3: 1 = a new group starts, 0 = the group continues
    SELECT Caller_Number, [month], [day], [hour],
      IFNULL(INTEGER([hour] - prev_hour > 1), 1) AS seq
    FROM (
      -- Step 2: previous hour within the same caller and day
      SELECT Caller_Number, [month], [day], [hour],
        LAG([hour]) OVER(PARTITION BY Caller_Number, [month], [day]
                         ORDER BY [hour]) AS prev_hour
      FROM (
        -- Step 1: extract the hour from call_time
        SELECT Caller_Number, [month], [day],
          HOUR(TIMESTAMP(CURRENT_DATE() + ' ' + call_time)) AS [hour]
        FROM YourTable
      )
    )
  )
)
GROUP BY Caller_Number, [month], [day], seq_group
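The answer leaves the consecutive-days part as an exercise. As a rough Python sketch of how the same idea could be applied one level up: first keep only the days that contain a run of at least 10 consecutive hours with calls, then look for runs of consecutive qualifying days. This is an illustration under stated assumptions, not the BigQuery solution: the table has no year column, so `ASSUMED_YEAR` is a made-up constant used only to turn (month, day) into a real date, the names `runs_of_consecutive` and `consecutive_day_runs` are hypothetical, and hour runs that cross midnight are not joined (the query above also partitions by day).

```python
from datetime import date

ASSUMED_YEAR = 2016  # assumption: the table has no year column
MIN_HOURS = 10       # a day qualifies with at least this many consecutive hours

def runs_of_consecutive(values):
    """Split integers into sorted runs of consecutive distinct values."""
    runs, prev = [], None
    for v in sorted(set(values)):
        if prev is None or v - prev > 1:
            runs.append([v])      # gap found: start a new island
        else:
            runs[-1].append(v)    # no gap: extend the current island
        prev = v
    return runs

def consecutive_day_runs(rows, min_hours=MIN_HOURS, year=ASSUMED_YEAR):
    """Return {caller: [(first_date, last_date, days_count), ...]}."""
    # Collect the call hours per caller and calendar day.
    hours_by_day = {}
    for caller, month, day, call_time in rows:
        key = (caller, date(year, month, day).toordinal())
        hours_by_day.setdefault(key, []).append(int(call_time.split(":")[0]))
    # Keep only days that contain a long enough run of consecutive hours.
    qualifying = {}
    for (caller, ordinal), hours in hours_by_day.items():
        if any(len(run) >= min_hours for run in runs_of_consecutive(hours)):
            qualifying.setdefault(caller, []).append(ordinal)
    # Apply the same island detection to the qualifying day numbers.
    return {
        caller: [(date.fromordinal(r[0]), date.fromordinal(r[-1]), len(r))
                 for r in runs_of_consecutive(ordinals)]
        for caller, ordinals in qualifying.items()
    }

# Hypothetical data: four days with calls in hours 8..17 and one day with 8..16.
rows = [(1, m, d, f"{h:02d}:00:00")
        for m, d in [(5, 30), (5, 31), (6, 1), (6, 3)]
        for h in range(8, 18)]
rows += [(1, 6, 4, f"{h:02d}:00:00") for h in range(8, 17)]
```

Using date ordinals rather than the raw day number keeps runs intact across month boundaries (5/31 followed by 6/1). In SQL, the analogous move would be to build a date from month and day before applying LAG over the qualifying days.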
Regarding "sql - Finding consecutive hours/dates using gaps and islands - SQL/BigQuery", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36141237/