sql - Counting distinct values in one column based on other columns

I have a table that looks like this:

app_id  supplier_reached  creation_date  platform
10001   1                 9/11/2018      iOS
10001   2                 9/18/2018      iOS
10002   1                 5/16/2018      android
10003   1                 5/6/2018       android
10004   1                 10/1/2018      android
10004   1                 2/3/2018       android
10004   2                 2/2/2018       web
10005   4                 1/5/2018       web
10005   2                 5/1/2018       android
10006   3                 10/1/2018      iOS
10005   4                 1/1/2018       iOS

The goal is to find the number of unique app_ids submitted in each month.

If I simply run count(distinct app_id) grouped by month, I get the following result:

Group by month  count(app number)
Jan             1
Feb             1
May             3
September       1
October         2
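
To make the starting point concrete, here is a minimal sketch that reproduces the counts above. It runs the query in SQLite through Python's sqlite3 module rather than in Redshift, and it assumes the month is extracted from creation_date in Python (the dates are treated as M/D/YYYY text). The table name your_table and the helper naive_counts are illustrative names only.

```python
import sqlite3
from datetime import datetime

# Sample rows copied from the table in the question.
ROWS = [
    (10001, 1, "9/11/2018", "iOS"),
    (10001, 2, "9/18/2018", "iOS"),
    (10002, 1, "5/16/2018", "android"),
    (10003, 1, "5/6/2018", "android"),
    (10004, 1, "10/1/2018", "android"),
    (10004, 1, "2/3/2018", "android"),
    (10004, 2, "2/2/2018", "web"),
    (10005, 4, "1/5/2018", "web"),
    (10005, 2, "5/1/2018", "android"),
    (10006, 3, "10/1/2018", "iOS"),
    (10005, 4, "1/1/2018", "iOS"),
]


def naive_counts():
    """Return {month_number: count(distinct app_id)} for the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table your_table "
        "(app_id integer, supplier_reached integer, month integer, platform text)"
    )
    # Assumption: creation_date is M/D/YYYY text, so the month is parsed here.
    conn.executemany(
        "insert into your_table values (?, ?, ?, ?)",
        [(a, s, datetime.strptime(d, "%m/%d/%Y").month, p) for a, s, d, p in ROWS],
    )
    rows = conn.execute(
        "select month, count(distinct app_id) "
        "from your_table group by month order by month"
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    print(naive_counts())
```

Months are keyed by number (1 for January, 10 for October), so January shows a single app even though app 10005 appears there with two different platforms.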

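However, an application is also considered unique based on the combination of other fields. For example, in January the app_id is the same in both rows, but the combination of app_id, supplier_reached, and platform yields two distinct values, so that app_id should be counted twice. Following the same pattern, the desired result is: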

Group by month  Desired answer
Jan             2
Feb             2
May             3
September       2
October         2

Finally, the table may contain many other columns, which may or may not contribute to an application's uniqueness.

Is there a way to do this kind of counting in SQL?

I am using Redshift.

Best answer

As already noted, in Redshift count(distinct ...) does not work on multiple fields.

You can first group by the columns whose combination should be unique, and then count the resulting records, like this:

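-- Inner query: one row per distinct (month, app_id, supplier_reached, platform).
-- "month" is assumed to be a column or an expression derived from creation_date.
select month, count(1) as app_number
from (
    select month, app_id, supplier_reached, platform
    from your_table
    group by 1, 2, 3, 4
) as distinct_combos  -- alias added for engines that require one on derived tables
group by 1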
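
The two-step grouping can be checked against the sample data with a short sketch. It runs in SQLite through Python's sqlite3 module, not in Redshift, so it only illustrates the logic. The month is assumed to be parsed from creation_date as M/D/YYYY text, and the names your_table, distinct_combos, and combo_counts are illustrative.

```python
import sqlite3
from datetime import datetime

# Sample rows copied from the table in the question.
ROWS = [
    (10001, 1, "9/11/2018", "iOS"),
    (10001, 2, "9/18/2018", "iOS"),
    (10002, 1, "5/16/2018", "android"),
    (10003, 1, "5/6/2018", "android"),
    (10004, 1, "10/1/2018", "android"),
    (10004, 1, "2/3/2018", "android"),
    (10004, 2, "2/2/2018", "web"),
    (10005, 4, "1/5/2018", "web"),
    (10005, 2, "5/1/2018", "android"),
    (10006, 3, "10/1/2018", "iOS"),
    (10005, 4, "1/1/2018", "iOS"),
]

QUERY = """
select month, count(1) as app_number
from (
    -- one row per distinct combination of the grouping columns
    select month, app_id, supplier_reached, platform
    from your_table
    group by 1, 2, 3, 4
) as distinct_combos
group by 1
order by 1
"""


def combo_counts():
    """Return {month_number: number of distinct (app_id, supplier, platform)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table your_table "
        "(month integer, app_id integer, supplier_reached integer, platform text)"
    )
    # Assumption: creation_date is M/D/YYYY text, so the month is parsed here.
    conn.executemany(
        "insert into your_table values (?, ?, ?, ?)",
        [(datetime.strptime(d, "%m/%d/%Y").month, a, s, p) for a, s, d, p in ROWS],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    print(combo_counts())
```

If more columns contribute to uniqueness, they would be added to the inner select and its group by list; the outer query stays the same.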

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/52635609/
