MySQL - Finding the time difference between rows - LAG and LEAD not working

Tags: mysql

I have a table (screenshot of my table) that looks like this:

customer_id   purchase_date   difference between purchases in days (what I am trying to get)
1             23/04/2017      0 (first row)
1             24/04/2017      1
1             01/01/2018      252
2             03/05/2017      0 (this is a new customer)
2             10/05/2017      7

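I want to calculate the time difference between consecutive purchases by the same customer. I tried the LAG and LEAD functions, but I get a syntax error that I don't understand.

This is what I have so far: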

SELECT customer_id, purchase_date
case 
when lag(purchase_date,1,0) over (partition by customer_id order by 
purchase_date) = 0 then 0
ELSE purchase_date -lag(purchase_date,1,0) over(partition by customer_id order 
by purchase_date) 
end 
FROM
Table1

Right after the first "over" it gives me a syntax error that I don't understand.

Accepted answer

Edited 19-10-2019
MySQL 8.0 was released on 19-04-2018 and supports the LEAD()/LAG() window functions natively, which is much better than emulating them with MySQL's user variables as this answer does.
So if possible, I advise upgrading your current MySQL version to MySQL 8.
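
For readers who can use MySQL 8.0 or later, the requested result can be sketched directly with LAG(). This is a minimal sketch, not part of the original answer: it assumes the Table1 definition created further down (purchase_date stored as a 'dd/mm/yyyy' string), and the window name w is only illustrative.

```sql
-- Assumes MySQL 8.0+ and Table1 as defined later in this answer.
SELECT
   customer_id
 , purchase_date
 , COALESCE(
     DATEDIFF(
         STR_TO_DATE(purchase_date, '%d/%m/%Y')
         -- previous purchase date of the same customer, NULL on the first row
       , LAG(STR_TO_DATE(purchase_date, '%d/%m/%Y')) OVER w
     )
   , 0
   ) AS diff
FROM
 Table1
WINDOW w AS (
  PARTITION BY customer_id
  ORDER BY STR_TO_DATE(purchase_date, '%d/%m/%Y')
)
ORDER BY
   customer_id
 , STR_TO_DATE(purchase_date, '%d/%m/%Y');
```

LAG() returns NULL for each customer's first purchase, so DATEDIFF also returns NULL there, and COALESCE turns that into 0.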

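Original answer: at the time of writing, MySQL did not support window functions such as LAG and LEAD.
MySQL 8.0 was then a release candidate that would add window functions, but it was not yet usable in production.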

In MySQL versions before 8.0, you can emulate LAG with MySQL's user variables or with a correlated subquery.
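
The rest of this answer works through the user-variable approach. For comparison, the correlated-subquery alternative might look roughly like the following. It is a sketch under the same assumptions as the rest of the answer (Table1 as created below, dates stored as 'dd/mm/yyyy' strings); the aliases t and prev are illustrative.

```sql
-- Works without window functions; assumes Table1 as defined below.
SELECT
   t.customer_id
 , t.purchase_date
 , COALESCE(
     DATEDIFF(
         STR_TO_DATE(t.purchase_date, '%d/%m/%Y')
       , (
           -- latest earlier purchase of the same customer, NULL if none
           SELECT MAX(STR_TO_DATE(prev.purchase_date, '%d/%m/%Y'))
           FROM Table1 AS prev
           WHERE prev.customer_id = t.customer_id
             AND STR_TO_DATE(prev.purchase_date, '%d/%m/%Y')
                 < STR_TO_DATE(t.purchase_date, '%d/%m/%Y')
         )
     )
   , 0
   ) AS diff
FROM
 Table1 AS t
ORDER BY
   t.customer_id
 , STR_TO_DATE(t.purchase_date, '%d/%m/%Y');
```

This does not depend on the evaluation order of user variables, but the subquery runs once per row, so it may be slow on large tables. If a customer has two purchases on the same date, both rows are compared against the last strictly earlier date rather than against each other.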

Create table / insert data

CREATE TABLE Table1
    (`customer_id` int, `purchase_date` varchar(10))
;

INSERT INTO Table1
    (`customer_id`, `purchase_date`)
VALUES
    (1, '23/04/2017'),
    (1, '24/04/2017'),
    (1, '01/01/2018'),
    (2, '03/05/2017'),
    (2, '10/05/2017')
;

The trick with MySQL user variables is to initialize them correctly.

Query

SELECT 
 *
 , (@customer_id := Table1.customer_id) AS init_customer_id_param
 , (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM 
 Table1
CROSS JOIN (
 SELECT
      @customer_id := NULL
   ,  @purchase_date := NULL 
)
 AS init_user_params

Result

| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|------------------------|--------------------------|
|           1 |    23/04/2017 |               (null) |                 (null) |                      1 |               23/04/2017 |
|           1 |    24/04/2017 |               (null) |                 (null) |                      1 |               24/04/2017 |
|           1 |    01/01/2018 |               (null) |                 (null) |                      1 |               01/01/2018 |
|           2 |    03/05/2017 |               (null) |                 (null) |                      2 |               03/05/2017 |
|           2 |    10/05/2017 |               (null) |                 (null) |                      2 |               10/05/2017 |

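Now you can add the calculation.
Keep in mind that the order matters: the calculation must come before the columns that assign the MySQL user variables.
That way the user variables still hold the values of the previous row when the difference is calculated.

Query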

SELECT 
 *
 , (
     CASE
       WHEN (@customer_id = Table1.customer_id)
       THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
     END
   ) AS diff
 , (@customer_id := Table1.customer_id) AS init_customer_id_param
 , (@purchase_date := Table1.purchase_date) AS init_purchase_date_param   
FROM 
 Table1
CROSS JOIN (
 SELECT
      @customer_id := NULL
   ,  @purchase_date := NULL 
)
 AS init_user_params
ORDER BY 
   -- keep each customer's rows together, oldest purchase first
   Table1.customer_id ASC
 , STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC

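Note

I used the STR_TO_DATE function to convert the varchar-based dates into real date values.
If your date column already has a date data type, you can drop that function and use THEN DATEDIFF(purchase_date, @purchase_date) instead.

Result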
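
If you can change the schema, storing the dates in a DATE column avoids the repeated STR_TO_DATE calls. The following is only a sketch: the table name Table1_dates is hypothetical and not part of the original question.

```sql
-- Hypothetical copy of Table1 with a real DATE column.
CREATE TABLE Table1_dates
    (`customer_id` int, `purchase_date` date)
;

-- Convert the 'dd/mm/yyyy' strings once, while copying the rows.
INSERT INTO Table1_dates
    (`customer_id`, `purchase_date`)
SELECT
    customer_id
  , STR_TO_DATE(purchase_date, '%d/%m/%Y')
FROM
 Table1
;

-- DATEDIFF works directly on date values: this returns 252,
-- the gap between customer 1's second and third purchase.
SELECT DATEDIFF('2018-01-01', '2017-04-24') AS diff;
```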

| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL |   diff | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|--------|------------------------|--------------------------|
|           1 |    23/04/2017 |               (null) |                 (null) | (null) |                      1 |               23/04/2017 |
|           1 |    24/04/2017 |               (null) |                 (null) |      1 |                      1 |               24/04/2017 |
|           1 |    01/01/2018 |               (null) |                 (null) |    252 |                      1 |               01/01/2018 |
|           2 |    03/05/2017 |               (null) |                 (null) | (null) |                      2 |               03/05/2017 |
|           2 |    10/05/2017 |               (null) |                 (null) |      7 |                      2 |               10/05/2017 |

Now it is just a matter of selecting the right columns.

Query

SELECT 
   Table1_user_params.customer_id 
 , Table1_user_params.purchase_date
 , (
     CASE
       WHEN  Table1_user_params.diff IS NULL
       THEN 0
       ELSE Table1_user_params.diff 
     END 
   ) 
    AS diff
FROM ( 
  SELECT 
   *
   , (
       CASE
         WHEN (@customer_id = Table1.customer_id)
         THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
       END
     ) AS diff
   , (@customer_id := Table1.customer_id) AS init_customer_id_param
   , (@purchase_date := Table1.purchase_date) AS init_purchase_date_param   
  FROM 
   Table1
  CROSS JOIN (
    SELECT
        @customer_id := NULL
     ,  @purchase_date := NULL 
  )
   AS init_user_params
  ORDER BY 
     -- keep each customer's rows together, oldest purchase first
     Table1.customer_id ASC
   , STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC
) 
 AS Table1_user_params

Result

| customer_id | purchase_date | diff |
|-------------|---------------|------|
|           1 |    23/04/2017 |    0 |
|           1 |    24/04/2017 |    1 |
|           1 |    01/01/2018 |  252 |
|           2 |    03/05/2017 |    0 |
|           2 |    10/05/2017 |    7 |

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/49532456/
