sql - Distinct column and find the sum for the past months in Postgres

Tags: sql postgresql distinct

I have a sample table like this:

     purchase_datetime    customer_id  value    purchase_id
    2013-01-08 17:13:29      45236       92        2526
    2013-01-03 15:42:35      45236       16        2565
    2013-01-03 15:42:35      45236       16        2565
    2013-03-08 09:04:52      45236       636       2563
    2013-12-08 12:12:24      45236       23        2505
    2013-12-08 12:12:24      45236       23        2505
    2013-12-08 12:12:24      45236       23        2505
    2013-12-08 12:12:24      45236       23        2505
    2013-07-08 22:35:53      35536       73        2576
    2013-07-08 09:52:03      35536        4        5526
    2013-10-08 16:23:29      52626       20        2226
...
    2013-04-08 17:49:31      52626       27        4526
    2013-12-09 20:40:53      52626       27        4626

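Now I need to find the total amount (value) that each customer (customer_id) spent over the past few months, counting each purchase once. The problem is that rows for the same purchase_id are duplicated, so I need to apply DISTINCT on purchase_id.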

This is what I have so far, without DISTINCT. I don't know how to work the DISTINCT in.

SELECT customer_id,
  sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 30) then value else 0 end) as "1month",
  sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 90) then value else 0 end) as "3month",
  sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 180) then value else 0 end) as "6month",
  sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 360) then value else 0 end) as "12month"
FROM table_data
GROUP BY customer_id
ORDER BY "1month" DESC;

Maybe a window function would be better?

Desired output:

    purchase_datetime    customer_id  value    purchase_id
    2013-01-08 17:13:29      45236       92        2526
    2013-01-03 15:42:35      45236       16        2565
    2013-03-08 09:04:52      45236       636       2563
    2013-12-08 12:12:24      45236       23        2505
    2013-07-08 22:35:53      35536       73        2576
    2013-07-08 09:52:03      35536        4        5526
    2013-10-08 16:23:29      52626       20        2226
...
    2013-04-08 17:49:31      52626       27        4526
    2013-12-09 20:40:53      52626       27        4626

Best answer

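You can select from a subquery and use DISTINCT (or GROUP BY) inside that subquery, so that each purchase is counted only once before the sums are computed.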

For example:

SELECT 
  customer_id, 
  sum(case when purchase_datetime::DATE between current_date - interval '1 month' and current_date then "value" else 0 end)  as "1month",
  sum(case when purchase_datetime::DATE between current_date - interval '3 month' and current_date then "value" else 0 end)  as "3month",
  sum(case when purchase_datetime::DATE between current_date - interval '6 month' and current_date then "value" else 0 end)  as "6month",
  sum(case when purchase_datetime::DATE between current_date - interval '1 year' and current_date then "value" else 0 end)  as "12month"
FROM (
  select 
  distinct purchase_id, customer_id, purchase_datetime,  "value"

  -- distinct on (purchase_id) customer_id, purchase_datetime, "value" 
  -- Note: with this type of distinct you assume that for each purchase_id there is only 1 combination of the 3 other field values.

  from table_data
) p
GROUP BY customer_id
ORDER BY "1month" DESC;
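
The GROUP BY alternative mentioned above could look roughly like the following. This is only a sketch under assumptions: it reuses the table_data columns from the question, keeps the same date predicates as the query above, and swaps the CASE expressions for the aggregate FILTER clause, which assumes PostgreSQL 9.4 or later. The coalesce calls are there because a filtered sum over no rows returns NULL rather than 0.

```sql
-- Sketch only: assumes PostgreSQL 9.4+ (FILTER clause) and the table_data
-- layout from the question. Not the answer's original query.
SELECT
  customer_id,
  coalesce(sum("value") FILTER (
    WHERE purchase_datetime::date BETWEEN current_date - interval '1 month' AND current_date), 0) AS "1month",
  coalesce(sum("value") FILTER (
    WHERE purchase_datetime::date BETWEEN current_date - interval '3 month' AND current_date), 0) AS "3month",
  coalesce(sum("value") FILTER (
    WHERE purchase_datetime::date BETWEEN current_date - interval '6 month' AND current_date), 0) AS "6month",
  coalesce(sum("value") FILTER (
    WHERE purchase_datetime::date BETWEEN current_date - interval '1 year' AND current_date), 0) AS "12month"
FROM (
  -- Grouping by all four columns collapses exact duplicate rows,
  -- which has the same effect as SELECT DISTINCT over those columns.
  SELECT purchase_id, customer_id, purchase_datetime, "value"
  FROM table_data
  GROUP BY purchase_id, customer_id, purchase_datetime, "value"
) p
GROUP BY customer_id
ORDER BY "1month" DESC;
```

Like the plain DISTINCT version, this only removes rows that are identical in all four columns. Two rows that share a purchase_id but differ in any other column are both kept and both summed.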

Test data:

create table table_data (purchase_datetime timestamp(0),customer_id int,"value" int,purchase_id int);
insert into table_data (purchase_datetime,customer_id,"value",purchase_id) values
(current_timestamp - interval '11 month',45236,92,2526),
(current_timestamp - interval '11 month',45236,16,2565),
(current_timestamp - interval '1 month',45236,16,2565),
(current_timestamp - interval '2 month',45236,636,2563),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '3 month',35536,73,2576),
(current_timestamp - interval '2 month',35536,4,5526),
(current_timestamp - interval '1 month',52626,20,2226),
(current_timestamp - interval '6 month',52626,27,4526),
(current_timestamp - interval '6 month',52626,27,4626);
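
Before choosing between plain DISTINCT and DISTINCT ON (purchase_id), it may help to check whether any purchase_id appears with more than one combination of the other columns, since that is exactly the assumption the DISTINCT ON note relies on. A minimal check, assuming the table_data definition above (the alias names d and variants are only illustrative):

```sql
-- Sketch only: lists purchase_ids that still have more than one row
-- after exact duplicates are removed.
SELECT purchase_id, count(*) AS variants
FROM (
  SELECT DISTINCT purchase_id, customer_id, purchase_datetime, "value"
  FROM table_data
) d
GROUP BY purchase_id
HAVING count(*) > 1
ORDER BY purchase_id;
```

With the test data above, purchase_id 2565 should show up here, because it is inserted once 11 months back and once 1 month back. Plain DISTINCT keeps both of those rows, while DISTINCT ON (purchase_id) would keep only one of them, and without an ORDER BY which one is kept is not guaranteed.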

Regarding sql - Distinct column and find the sum for the past months in Postgres, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44154242/
