sql - Postgres data aggregation with variable columns

Tags: sql database postgresql pivot crosstab

I have a table containing time log information.

create table "time_records" (
    "id" serial NOT NULL PRIMARY KEY,
    "start" timestamp not null,
    "end" timestamp not null,
    "duration" double precision not null,
    "project" varchar(255) not null,
    "case" integer not null,
    "title" text not null,
    "user" varchar(255) not null
);

Here are a few rows of data:

"id","start","end","duration","project","case","title","user"
"1","2014-02-01 11:54:00","2014-02-01 12:20:00","26.18","Project A","933","Something done here","John Smith"
"2","2014-02-02 12:34:00","2014-02-02 15:00:00","146","Project B","990","Something else done","Joshua Kehn"
"3","2014-02-02 17:57:00","2014-02-02 18:39:00","41.38","Project A","933","Another thing done","Bob Frank"
"4","2014-02-03 09:30:00","2014-02-03 11:41:00","131","Project A","983","iOS work","Joshua Kehn"
"5","2014-02-03 10:22:00","2014-02-03 13:29:00","187.7","Project C","966","Created views for things","Alice Swiss"

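I can pull individual pieces of information out of this, for example a list of every project that had time logged between two dates, or of every person who worked between two dates.

I would like to generate a report with the date in the first column and each project across the top, showing the total time logged for that project.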

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
WHERE
    project = 'Project A'
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

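But I want multiple columns in the output, so I somehow need to combine this with selecting the distinct projects.

The final output would look like this: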

date, project a total, project b total, project c total,
2014-02-01,0.5, 0.3, 10,
2014-02-02,1.3, 20, 3,
2014-02-03,20, 10, 10
...

I can get the total per date and per project with:

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

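But then each date appears in multiple rows, one per project. I need one row per date, with each project's total in its own column.

Does this make sense, and is it possible with SQL alone, without writing code to post-process the query result?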

Best answer

For a "pivot table" or cross tabulation, use the crosstab() function of the additional module tablefunc.
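
crosstab() is not part of the core installation, so the tablefunc module has to be enabled once per database before the query further down can run. A minimal sketch, assuming Postgres 9.1 or later and a role that is allowed to create extensions:

```sql
-- Enable the additional module that provides crosstab().
-- Run once per database; IF NOT EXISTS makes the statement safe to repeat.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```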

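Table definition

Consider this cleaned-up table definition without reserved SQL key words as identifiers (using them is a big no-no, even if you can force it with double quotes):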

CREATE TEMP TABLE time_records (
    id serial PRIMARY KEY,
    t_start timestamp not null,
    t_end timestamp not null,
    duration double precision not null,
    project text not null,
    t_case integer not null,
    title text not null,
    t_user text not null
);
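
To reproduce the result shown further down, the five sample rows from the question can be loaded into this table. The sketch below only maps the original columns to the renamed ones (start to t_start, end to t_end, case to t_case, user to t_user) and leaves id to the serial default:

```sql
-- Sample rows from the question, inserted under the renamed columns.
-- id is omitted so the serial column assigns 1 through 5.
INSERT INTO time_records
       (t_start, t_end, duration, project, t_case, title, t_user)
VALUES ('2014-02-01 11:54:00', '2014-02-01 12:20:00',  26.18, 'Project A', 933, 'Something done here',      'John Smith')
     , ('2014-02-02 12:34:00', '2014-02-02 15:00:00', 146,    'Project B', 990, 'Something else done',      'Joshua Kehn')
     , ('2014-02-02 17:57:00', '2014-02-02 18:39:00',  41.38, 'Project A', 933, 'Another thing done',       'Bob Frank')
     , ('2014-02-03 09:30:00', '2014-02-03 11:41:00', 131,    'Project A', 983, 'iOS work',                 'Joshua Kehn')
     , ('2014-02-03 10:22:00', '2014-02-03 13:29:00', 187.7,  'Project C', 966, 'Created views for things', 'Alice Swiss');
```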

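Query

Note how I use the crosstab() variant with two parameters, so that projects missing on a given date are handled correctly in the result.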

SELECT *
FROM  crosstab (
   $$
   SELECT t_start::date
         , project
         , round(sum(duration / 60)::numeric, 2) AS time_logged
   FROM    time_records
   GROUP   BY 1,2
   ORDER   BY 1,2
   $$
  ,$$VALUES ('Project A'), ('Project B'),('Project C')$$
  ) AS t (
      t_start   date
    , project_a text
    , project_b text
    , project_c text
  );

Result:

t_start    | project_a | project_b | project_c
-----------|-----------|-----------|----------
2014-02-01 | 0.44      |           |
2014-02-02 | 0.69      | 2.43      |
2014-02-03 | 2.18      |           | 3.13

Tested with Postgres 9.3.
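
The second parameter does not have to be a VALUES list; any query returning a single column of distinct category values can be used. The sketch below derives the categories from the table itself. It assumes that the three projects in the sample data are the only ones present, because the column definition list must still name exactly one output column per category. A truly variable number of columns therefore cannot come out of a single static statement; the call itself would have to be built dynamically.

```sql
SELECT *
FROM   crosstab (
   $$
   SELECT t_start::date
        , project
        , round(sum(duration / 60)::numeric, 2) AS time_logged
   FROM   time_records
   GROUP  BY 1, 2
   ORDER  BY 1, 2
   $$
   -- Categories taken from the data instead of a hard-coded VALUES list.
   -- ORDER BY fixes which category lands in which output column.
 , $$SELECT DISTINCT project FROM time_records ORDER BY 1$$
   ) AS t (
     t_start   date
   , project_a text
   , project_b text
   , project_c text
   );
```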

Explanation, details and links in this related answer:
PostgreSQL Crosstab Query
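
If installing tablefunc is not an option, a similar report can be produced with plain conditional aggregation. This is a different technique from the crosstab() call above, sketched here for the three projects in the sample data; the output column name t_date is only illustrative, and every additional project needs its own expression.

```sql
-- One row per date, one column per project, hours rounded to two decimals.
-- A CASE without ELSE yields NULL, and sum() ignores NULLs, so a project
-- with no time logged on a date comes out as NULL.
SELECT t_start::date AS t_date
     , round((sum(CASE WHEN project = 'Project A' THEN duration END) / 60)::numeric, 2) AS project_a
     , round((sum(CASE WHEN project = 'Project B' THEN duration END) / 60)::numeric, 2) AS project_b
     , round((sum(CASE WHEN project = 'Project C' THEN duration END) / 60)::numeric, 2) AS project_c
FROM   time_records
GROUP  BY 1
ORDER  BY 1;
```

In Postgres 9.4 or later, the aggregate FILTER clause can replace the CASE expressions.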

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/22131589/
