sql - Postgres window functions combined with an aggregate GROUP BY

Tags: sql postgresql window-functions

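I want to get a list of email domains together with the top user in each domain. My approach is to sum the questions per email, grouped by domain, and then use a window function to pick the top user. However, this does not work: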

SELECT 
  domain,
  sum(questions_per_email) as questions_per_domain,
  first_value(email) OVER (PARTITION BY domain ORDER BY questions_per_email DESC) as top_user
FROM (
    SELECT email,
           lower(substring(u.email from position('@' in u.email)+1)) as domain,
           count(*) as questions_per_email
      FROM questions q
      JOIN identifiers i ON (q.owner_id = i.id)
      JOIN users u ON (u.identifier_id = i.id)
    GROUP BY email
  ) as per_user
GROUP BY domain, top_user

Postgres gives the following message:

ERROR:  column "per_user.questions_per_email" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5: ...t_value(email) OVER (PARTITION BY domain ORDER BY questions_...
                                                             ^

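I really don't understand why this happens. I'm fairly sure it should be possible to use window functions on aggregated results. Please advise!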

Thanks, Christopher

Best answer

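Window functions are evaluated after GROUP BY and aggregation, so inside OVER you can only reference grouped columns or aggregate results. In your outer query, questions_per_email is neither, which is what the error is complaining about. You can change your query like this, counting per (email, domain) and ordering the window by count(*) in the same step: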

with cte1 as (
    SELECT email,
           lower(substring(u.email from position('@' in u.email)+1)) as domain
      FROM questions q
      JOIN identifiers i ON (q.owner_id = i.id)
      JOIN users u ON (u.identifier_id = i.id)
), cte2 as (
    select
        domain, email,
        count(*) as questions_per_email,
        first_value(email) over (partition by domain order by count(*) desc) as top_user
    from cte1
    group by email, domain
)
select domain, top_user, sum(questions_per_email) as questions_per_domain
from cte2
group by domain, top_user

sql fiddle demo
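
To try the pattern without the original tables, here is a minimal sketch that swaps the three-way join for a few made-up rows. The `sample` alias and the email addresses are assumptions for illustration only; everything else follows the query above.

```sql
-- Hypothetical sample data: one row per question, carrying the asker's email.
-- It stands in for the questions/identifiers/users join in the answer.
with cte1 as (
    select email,
           lower(substring(email from position('@' in email) + 1)) as domain
    from (values
        ('ann@Example.com'),
        ('ann@Example.com'),
        ('bob@example.com'),
        ('cat@test.org')
    ) as sample(email)
), cte2 as (
    -- count(*) is computed per (email, domain) group first;
    -- the window function then runs over those grouped rows.
    select
        domain, email,
        count(*) as questions_per_email,
        first_value(email) over (partition by domain order by count(*) desc) as top_user
    from cte1
    group by email, domain
)
select domain, top_user, sum(questions_per_email) as questions_per_domain
from cte2
group by domain, top_user
order by domain;
-- Result:
--   example.com | ann@Example.com | 3
--   test.org    | cat@test.org    | 1
```

Note that if two users in a domain have the same count, which of them first_value returns is not determined; adding a tiebreaker such as `order by count(*) desc, email` makes the result stable.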

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/18686906/
