sql - How to determine the number of distinct events in a sliding window with Oracle

Tags: sql oracle window

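My task is to detect when three events on different accounts occur within a 1-hour window.

A solution could look like this (the examples use a window of 20 time_key units):
count(distinct account_id) over (order by time_key range between 20 PRECEDING and CURRENT ROW)
and then check count() >= 3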

But Oracle does not allow DISTINCT in an analytic function that has an ORDER BY clause:

ORA-30487: ORDER BY not allowed here
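
For context, as far as I understand the Oracle documentation, DISTINCT in an analytic function is accepted only together with PARTITION BY, not with ORDER BY or a windowing clause. The sketch below illustrates this with fixed buckets of 50 time_key units; the bucket size and the column names bucket_no and distinct_in_bucket are assumptions made for illustration. It runs without ORA-30487, but fixed buckets are not a sliding window, so this is not a substitute for the query above.

```sql
-- Sample data from the question (union all, since no duplicates need removing)
with t_data as (
  select 1 as account_id, 1000 as time_key from dual union all
  select 1, 1010 from dual union all
  select 1, 1020 from dual union all
  select 1, 1030 from dual union all
  select 2, 1040 from dual union all
  select 3, 1050 from dual union all
  select 3, 1060 from dual union all
  select 3, 1070 from dual union all
  select 3, 1080 from dual union all
  select 3, 1090 from dual
)
select account_id,
       time_key,
       trunc(time_key / 50) as bucket_no,  -- hypothetical fixed bucket, not a sliding window
       -- DISTINCT is accepted here because the OVER clause has only PARTITION BY
       count(distinct account_id) over (partition by trunc(time_key / 50)) as distinct_in_bucket
from t_data
order by time_key;
```

With this data the bucket covering 1000 to 1049 holds accounts 1 and 2, and the bucket covering 1050 to 1099 holds only account 3, so neither bucket reports three accounts even though the range 1030 to 1050 contains all three.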

I have the solution below, but it seems cumbersome:

with t_data as (
select 1 as account_id, 1000 as time_key from dual union
select 1 as account_id, 1010 as time_key from dual union
select 1 as account_id, 1020 as time_key from dual union
select 1 as account_id, 1030 as time_key from dual union
select 2 as account_id, 1040 as time_key from dual union
select 3 as account_id, 1050 as time_key from dual union
select 3 as account_id, 1060 as time_key from dual union
select 3 as account_id, 1070 as time_key from dual union
select 3 as account_id, 1080 as time_key from dual union
select 3 as account_id, 1090 as time_key from dual
order by time_key
)

select *
from (
  select  account_id,
          time_key,
          max(
              case 
               when account_id = 1 then 1
               else 0
              end
          ) over (order by time_key range between 20 PRECEDING and CURRENT ROW) as m1,
          max(
              case 
               when account_id = 2 then 1
               else 0
              end
          ) over (order by time_key range between 20 PRECEDING and CURRENT ROW) as m2,
          max(
              case 
               when account_id = 3 then 1
               else 0
              end
          ) over (order by time_key range between 20 PRECEDING and CURRENT ROW) as m3
  from t_data
)
where m1 = 1 and m2 = 1 and m3 = 1

What is a simpler way to determine the number of distinct events in a sliding window?

Best answer

It is not obvious to me how to do this with window functions. You can use a correlated subquery:

select t.*,
       (select count(distinct t2.account_id)
        from t_data t2
        where t2.time_key >= t.time_key - 20 and t2.time_key <= t.time_key
       )
from t_data t;
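
As a minimal sketch of how that subquery could be used for the original task, the version below runs it against the question's sample data and keeps only rows whose window contains at least three distinct accounts. The alias num_distinct and the threshold of 3 are assumptions for illustration.

```sql
with t_data as (
  select 1 as account_id, 1000 as time_key from dual union all
  select 1, 1010 from dual union all
  select 1, 1020 from dual union all
  select 1, 1030 from dual union all
  select 2, 1040 from dual union all
  select 3, 1050 from dual union all
  select 3, 1060 from dual union all
  select 3, 1070 from dual union all
  select 3, 1080 from dual union all
  select 3, 1090 from dual
)
select *
from (
  select t.account_id,
         t.time_key,
         -- distinct accounts with an event in [time_key - 20, time_key]
         (select count(distinct t2.account_id)
          from t_data t2
          where t2.time_key between t.time_key - 20 and t.time_key
         ) as num_distinct
  from t_data t
)
where num_distinct >= 3;
-- For the sample data this returns one row: account_id = 3, time_key = 1050, num_distinct = 3
```

That is the same row the question's three max(case ...) columns identify, since the window 1030 to 1050 is the only one that contains accounts 1, 2 and 3.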

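Another approach, which may perform better, is to treat this as a gaps-and-islands problem. The following version returns, for each time key, the number of distinct accounts that are active at that time: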
with t as (
      -- one row per account and island: the span during which the account counts as active
      select account_id, min(time_key) as min_time_key, max(time_key) + 20 as max_time_key
      from (select t.*,
                   -- start a new island when the gap to the account's previous event exceeds 20
                   sum(case when time_key - prev_time_key <= 20 then 0 else 1 end)
                       over (partition by account_id order by time_key) as grp
            from (select t.*, lag(time_key) over (partition by account_id order by time_key) as prev_time_key
                  from t_data t
                 ) t
           ) t
      group by account_id, grp
     )
select td.account_id, td.time_key, count(distinct t.account_id) as num_distinct
from t_data td join
     t
     on td.time_key between t.min_time_key and t.max_time_key
group by td.account_id, td.time_key;
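
To make the intermediate step easier to follow, the sketch below lists only the islands for the question's sample data. It assumes islands are tracked per account, so it partitions the running sum by account_id and groups by account_id and grp; the CTE names flagged and grouped are made up for illustration.

```sql
with t_data as (
  select 1 as account_id, 1000 as time_key from dual union all
  select 1, 1010 from dual union all
  select 1, 1020 from dual union all
  select 1, 1030 from dual union all
  select 2, 1040 from dual union all
  select 3, 1050 from dual union all
  select 3, 1060 from dual union all
  select 3, 1070 from dual union all
  select 3, 1080 from dual union all
  select 3, 1090 from dual
),
flagged as (
  -- previous event time of the same account (null for the account's first event)
  select account_id, time_key,
         lag(time_key) over (partition by account_id order by time_key) as prev_time_key
  from t_data
),
grouped as (
  -- running count of island starts; a null or large gap starts a new island
  select account_id, time_key,
         sum(case when time_key - prev_time_key <= 20 then 0 else 1 end)
           over (partition by account_id order by time_key) as grp
  from flagged
)
select account_id,
       min(time_key)      as min_time_key,
       max(time_key) + 20 as max_time_key  -- the last event stays visible for 20 more units
from grouped
group by account_id, grp
order by account_id, min_time_key;
-- For the sample data:
--   account 1: 1000 .. 1050
--   account 2: 1040 .. 1060
--   account 3: 1050 .. 1110
```

Only time_key 1050 falls inside all three intervals, which agrees with the correlated subquery: joining each event to the intervals that contain it gives the same distinct count as looking back 20 units from that event.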

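Finally, if you only need to find 3 (or 2) distinct account ids, and you only care about getting some examples where that number is reached, you can do the following: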
select t.*
from (select t.*,
             min(account_id) over (order by time_key range between 20 preceding and 1 preceding) as min_account_id,
             max(account_id) over (order by time_key range between 20 preceding and 1 preceding) as max_account_id
      from t_data t
     ) t
where min_account_id <> max_account_id and
      account_id <> min_account_id and
      account_id <> max_account_id;

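This takes the largest and smallest account ids from the rows whose time_key lies within the preceding 20 units, excluding the current row (and any row with the same time_key). If these two values differ from each other and from the current account id, then the window holds three distinct values.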
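
As a worked example, the sketch below prints the two helper columns for every row of the question's sample data, so the effect of the filter can be checked by eye. Nothing here goes beyond the query above except the sample data and the final ordering.

```sql
with t_data as (
  select 1 as account_id, 1000 as time_key from dual union all
  select 1, 1010 from dual union all
  select 1, 1020 from dual union all
  select 1, 1030 from dual union all
  select 2, 1040 from dual union all
  select 3, 1050 from dual union all
  select 3, 1060 from dual union all
  select 3, 1070 from dual union all
  select 3, 1080 from dual union all
  select 3, 1090 from dual
)
select t.account_id,
       t.time_key,
       -- smallest and largest account id seen in [time_key - 20, time_key - 1]
       min(account_id) over (order by time_key range between 20 preceding and 1 preceding) as min_account_id,
       max(account_id) over (order by time_key range between 20 preceding and 1 preceding) as max_account_id
from t_data t
order by t.time_key;
-- Selected rows for the sample data:
--   time_key 1040: min = 1, max = 1, current = 2  -> only two distinct accounts
--   time_key 1050: min = 1, max = 2, current = 3  -> three distinct accounts
--   time_key 1060: min = 2, max = 3, current = 3  -> only two distinct accounts
```

Only the row at 1050 passes all three inequality checks. Note that this trick can confirm that at least three distinct accounts are present, but it cannot report an exact count above three.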

Regarding sql - How to determine the number of distinct events in a sliding window with Oracle, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51886674/
