问题
我无法弄清楚如何创建一个查询,该查询可以判断任何用户条目之前是否有 7 天没有任何事件 (secondsPlayed == 0),如果是,则用值 1 表示它,否则用 0 表示。
这也意味着,如果用户的条目少于 7 个,则所有条目的值将为 0。
Input table: +------------------------------+-------------------------+---------------+ | userid | estimationDate | secondsPlayed | +------------------------------+-------------------------+---------------+ | a | 2016-07-14 00:00:00 UTC | 192.5 | | a | 2016-07-15 00:00:00 UTC | 357.3 | | a | 2016-07-16 00:00:00 UTC | 0 | | a | 2016-07-17 00:00:00 UTC | 0 | | a | 2016-07-18 00:00:00 UTC | 0 | | a | 2016-07-19 00:00:00 UTC | 0 | | a | 2016-07-20 00:00:00 UTC | 0 | | a | 2016-07-21 00:00:00 UTC | 0 | | a | 2016-07-22 00:00:00 UTC | 0 | | a | 2016-07-23 00:00:00 UTC | 0 | | a | 2016-07-24 00:00:00 UTC | 0 | | ---------------------------- | ---------------------- | ---- | | b | 2016-07-02 00:00:00 UTC | 31.2 | | b | 2016-07-03 00:00:00 UTC | 42.1 | | b | 2016-07-04 00:00:00 UTC | 41.9 | | b | 2016-07-05 00:00:00 UTC | 43.2 | | b | 2016-07-06 00:00:00 UTC | 91.5 | | b | 2016-07-07 00:00:00 UTC | 0 | | b | 2016-07-08 00:00:00 UTC | 0 | | b | 2016-07-09 00:00:00 UTC | 239.1 | | b | 2016-07-10 00:00:00 UTC | 0 | +------------------------------+-------------------------+---------------+
预期的输出表应如下所示:
Output table: +------------------------------+-------------------------+---------------+----------+ | userid | estimationDate | secondsPlayed | inactive | +------------------------------+-------------------------+---------------+----------+ | a | 2016-07-14 00:00:00 UTC | 192.5 | 0 | | a | 2016-07-15 00:00:00 UTC | 357.3 | 0 | | a | 2016-07-16 00:00:00 UTC | 0 | 0 | | a | 2016-07-17 00:00:00 UTC | 0 | 0 | | a | 2016-07-18 00:00:00 UTC | 0 | 0 | | a | 2016-07-19 00:00:00 UTC | 0 | 0 | | a | 2016-07-20 00:00:00 UTC | 0 | 0 | | a | 2016-07-21 00:00:00 UTC | 0 | 0 | | a | 2016-07-22 00:00:00 UTC | 0 | 1 | | a | 2016-07-23 00:00:00 UTC | 0 | 1 | | a | 2016-07-24 00:00:00 UTC | 0 | 1 | | ---------------------------- | ----------------------- | ----- | ----- | | b | 2016-07-02 00:00:00 UTC | 31.2 | 0 | | b | 2016-07-03 00:00:00 UTC | 42.1 | 0 | | b | 2016-07-04 00:00:00 UTC | 41.9 | 0 | | b | 2016-07-05 00:00:00 UTC | 43.2 | 0 | | b | 2016-07-06 00:00:00 UTC | 91.5 | 0 | | b | 2016-07-07 00:00:00 UTC | 0 | 0 | | b | 2016-07-08 00:00:00 UTC | 0 | 0 | | b | 2016-07-09 00:00:00 UTC | 239.1 | 0 | | b | 2016-07-10 00:00:00 UTC | 0 | 0 | +------------------------------+-------------------------+---------------+----------+
想法
起初我考虑使用偏移量为 7 的 Lag 函数,但这显然与中间的任何主题都无关。
我还在考虑创建一个 7 天的滚动窗口/平均值,并评估它是否高于 0。但这可能有点超出我的技能水平。
任何人都有机会解决这个问题。
最佳答案
假设您每天都有数据(这似乎是一个合理的假设),您可以对窗口函数求和:
select t.*,
(case when sum(secondsplayed) over (partition by userid
order by estimationdate
rows between 6 preceding and current row
) = 0 and
row_number() over (partition by userid order by estimationdate) >= 7
then 1
else 0
end) as inactive
from t;
除了日期中没有漏洞之外,还假设 secondsplayed
永远不会为负数。 (负值可以很容易地合并到逻辑中,但这似乎没有必要。)
关于基于最后七个条目设置列的 SQL 查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52277324/