sql - Postgres - Cohort analysis across consecutive months, not presence in any later month

Tags: sql postgresql cluster-analysis analytics

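I am running a cohort analysis that takes a group of users and checks whether they transacted in the following months. What I actually want is this:

Of the group that transacted in December, who transacted in January; of that January group (carried over from December), who transacted in February. Basically, I am tracking the decay of the customer base.

What I don't want is everyone who came back in any month after December, which is what this query gives me: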

-- Users who transacted during the starting month (November 2016)
WITH start_sample AS (
  SELECT
    user_fk,
    created_at AS start_sample_date
  FROM transactions
  WHERE created_at >= '2016-11-01' AND created_at < '2016-12-01'
  GROUP BY user_fk, start_sample_date
),

-- One row per cohort user, with their first transaction in that month
start_sample_min AS (
  SELECT
    user_fk,
    MIN(start_sample_date) AS first_transaction
  FROM start_sample
  GROUP BY user_fk
)

-- Cohort users active in each later month, whether or not they skipped months
SELECT
  DATE_TRUNC('month', created_at) AS transacting_month,
  COUNT(DISTINCT user_fk) AS distinct_users
FROM transactions
WHERE created_at >= '2016-11-01'
  AND user_fk IN (SELECT user_fk FROM start_sample_min)
GROUP BY transacting_month
ORDER BY transacting_month;

I then built a churn model to see whether it would give me what I need, but it does not:

-- One row per user per month in which they transacted
WITH monthly_users AS (
  SELECT
    user_fk AS monthly_user_fk,
    DATE_TRUNC('month', created_at) AS month
  FROM transactions
  WHERE created_at >= '2016-11-01' AND created_at < '2017-12-01'
  GROUP BY monthly_user_fk, month
),

-- Each user's previous and next active month
lag_lead AS (
  SELECT
    monthly_user_fk,
    month,
    LAG(month, 1) OVER (PARTITION BY monthly_user_fk ORDER BY month) AS lag,
    LEAD(month, 1) OVER (PARTITION BY monthly_user_fk ORDER BY month) AS lead
  FROM monthly_users
),

-- Gaps to the previous and next active month, in days
lag_lead_with_diffs AS (
  SELECT
    monthly_user_fk,
    month,
    lag AS previous_month,
    lead AS next_month,
    EXTRACT(EPOCH FROM (month - lag) / 86400)::INT AS lag_size,
    EXTRACT(EPOCH FROM (lead - month) / 86400)::INT AS lead_size
  FROM lag_lead
),

-- Label each user-month relative to that user's own previous month
calculated AS (
  SELECT
    month,
    CASE WHEN previous_month IS NULL THEN 'ACTIVATION'
         WHEN lag_size <= 31 THEN 'ACTIVE'
         WHEN lag_size > 31 THEN 'RETURN' END AS this_month_values,
    CASE WHEN (lead_size > 31 OR lead_size IS NULL) THEN 'CHURN' ELSE NULL END AS next_month_churn,
    COUNT(DISTINCT monthly_user_fk) AS c_d_users
  FROM lag_lead_with_diffs
  GROUP BY month, 2, 3
)

SELECT
  month,
  this_month_values,
  SUM(c_d_users) AS distinct_users
FROM calculated
GROUP BY month, this_month_values
UNION
SELECT
  month + INTERVAL '1 month',
  'CHURN',
  SUM(c_d_users)
FROM calculated
WHERE next_month_churn IS NOT NULL
GROUP BY month + INTERVAL '1 month', 2
-- 1512086400 is 2017-12-01 00:00:00 UTC
HAVING (EXTRACT(EPOCH FROM (month + INTERVAL '1 month'))) < 1512086400
ORDER BY month, this_month_values;

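However, this is not anchored to the initial group. The ACTIVE group rolls forward from month to month.

I know the above is probably more complicated than what I am asking for, but I can't seem to get my head around it.

Thanks in advance.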
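To make the rolling behaviour concrete, here is a minimal Python sketch of the lag-based ACTIVE label on made-up data. The user ids, the month offsets, and the helper name `rolling_active_counts` are all hypothetical and exist only for illustration; months are written as offsets from November 2016.

```python
from collections import Counter

# Hypothetical toy data: user -> month offsets with a transaction (0 = Nov 2016).
toy_activity = {
    "a": [0, 1, 2],  # in the starting cohort, never misses a month
    "b": [0, 2],     # in the starting cohort, skips month 1
    "c": [1, 2],     # first transacts in month 1, so not in the starting cohort
}

def rolling_active_counts(activity):
    """Count users per month whose previous active month was the month before."""
    counts = Counter()
    for months in activity.values():
        ordered = sorted(set(months))
        for prev, cur in zip(ordered, ordered[1:]):
            if cur - prev == 1:  # plays the role of lag_size <= 31
                counts[cur] += 1
    return dict(counts)

print(rolling_active_counts(toy_activity))  # {1: 1, 2: 2}
```

Month 2 reports two active users, but one of them is user c, who was never part of the month 0 group. A fixed cohort should report only user a there.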

Best answer

Maybe this is what you are looking for:

with Monthly_Users as (
  -- One row per user per active month, plus that month's offset from 2016-11
  select user_fk
       , date_trunc('month', created_at) as month
       , (date_part('year', created_at) - 2016) * 12
         + date_part('month', created_at) - 11 as Months_Between
    from transactions
   where created_at >= date '2016-11-01'
     and created_at <  date '2017-12-01'
   group by user_fk, month, Months_Between
), t2 as (
  -- Number of earlier active months for the same user
  select Monthly_Users.*
       , count(*) over (partition by user_fk
                            order by month
                             rows between unbounded preceding
                                      and 1 preceding) as prev_rec_cnt
    from Monthly_Users
)
-- Keep a user-month only if every month since 2016-11 was active
select month
     , count(*) as distinct_users
  from t2
 where Months_Between = prev_rec_cnt
 group by month
 order by month;

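In this query, the Monthly_Users CTE is the same as yours, with one added column: Months_Between, the number of months between the created_at date and your initial start date (0 for November 2016, 1 for December 2016, and so on). In the second common table expression, I count how many rows each user_fk has before the current month's row. Finally, the output query keeps only the rows where Months_Between equals Prev_Rec_Cnt. Any missed month leaves Prev_Rec_Cnt behind Months_Between, and it can never catch up, so a user drops out from the first gap onward. Users whose first transaction falls after the start month never match either, because their first row has a Prev_Rec_Cnt of 0 but a Months_Between of at least 1. The result is the user_fk count declining month by month.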
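The filter can be checked by hand on a few made-up users. The following is a minimal Python sketch of the same comparison, not the query itself: the user ids, the month offsets, and the helper name `consecutive_cohort_counts` are assumptions for illustration, and months are written as offsets from the start month, as Months_Between is.

```python
from collections import Counter

# Hypothetical toy data: user -> month offsets with a transaction (0 = start month).
toy_months = {
    "a": [0, 1, 2],  # never misses a month
    "b": [0, 2],     # skips month 1
    "c": [1, 2],     # first appears after the start month
    "d": [0, 1, 3],  # skips month 2
}

def consecutive_cohort_counts(activity):
    """Count users per month who were active in every month since month 0."""
    counts = Counter()
    for months in activity.values():
        # enumerate() yields the number of earlier rows, like prev_rec_cnt
        for prev_rec_cnt, months_between in enumerate(sorted(set(months))):
            if months_between == prev_rec_cnt:
                counts[months_between] += 1
    return dict(sorted(counts.items()))

print(consecutive_cohort_counts(toy_months))  # {0: 3, 1: 2, 2: 1}
```

User b stops counting at the first gap even though they return in month 2, and user c never counts at all, which is the fixed-cohort decay the question asks for.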

Regarding "sql - Postgres - Cohort analysis across consecutive months, not presence in any later month", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47658511/
