I have a table like this (simplified version):
+----+-------+-----+--------------+-----+
| id | name  | age | company.name | ... |
+----+-------+-----+--------------+-----+
| 1  | Adam  | 21  | Google       | ... |
| 3  | Peter | 20  | Apple        | ... |
| 2  | Bob   | 20  | Microsoft    | ... |
| 9  | Alice | 18  | Google       | ... |
+----+-------+-----+--------------+-----+
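I need to group the data by an arbitrary column and count the rows in each group. I also need the first row of each group. The user chooses which column is used for grouping.
If the user chooses the age column for grouping, the result is: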
+----+------------+-------+
| id | group_name | count |
+----+------------+-------+
| 9  | 18         | 1     |
| 2  | 20         | 2     |
| 1  | 21         | 1     |
+----+------------+-------+
The grouping column can be numeric or a string.
Currently I do this with the following query:
SELECT id, group_name, users_name, count(id) as count FROM (
SELECT persons.id as id, company.type as group_name, users.name as users_name
FROM persons
LEFT JOIN company on company.id = persons.company_id
LEFT JOIN position on position.id=persons.position_id
...
LEFT JOIN source on source.id=persons.source_id
WHERE ...
ORDER BY if(company.type = '' or company.type is null,1,0) ASC,
company.type ASC, IF(persons.status = '' or persons.status is null,1,0) ASC,
persons.status ASC, persons.id
) t1 GROUP BY group_name
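But with newer versions of MySQL this query stopped working; I think the ORDER BY in the subselect is being ignored.
I know similar questions have already been asked, but the proposed solutions do not work for my query. I have to join many tables, add multiple conditions, use a cascading order, and then select the first row from each group. I would be very glad if the solution could also be optimized for performance.
---- EDIT ----
Suggested solution: SQL select only rows with max value on a column
The suggested approach of MAX() with GROUP BY does not work well, for two reasons:
- If the grouping column contains strings, the query returns the last row of each group rather than the first.
- If my data set has a cascading order, I cannot use MAX on several columns at once.
I created an sqlfiddle that contains an exact example.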
http://sqlfiddle.com/#!9/23225d/11/0
-- EXAMPLE 1 - Group by string
-- base query
SELECT persons.*, company.* FROM persons
LEFT JOIN company ON persons.company_id = company.id
ORDER BY company.name ASC, company.id ASC;
-- grouping query
SELECT MAX(persons.id) as id, company.name, count(persons.id) as count
FROM persons
LEFT JOIN company ON persons.company_id = company.id
GROUP BY company.name
ORDER BY company.name ASC, persons.id ASC;
-- The results will be:
-- |ID | NAME | COUNT|
-- |1 | Google | 2 |
-- |3 | Microsoft| 3 |
-- EXAMPLE 2 - Cascade order
-- base query
SELECT persons.*, company.* FROM persons
LEFT JOIN company ON persons.company_id = company.id
ORDER BY company.type ASC, persons.status ASC;
-- grouping query
SELECT MAX(persons.id) as id, company.type, count(persons.id) as count
FROM persons
LEFT JOIN company ON persons.company_id = company.id
GROUP BY company.type
ORDER BY company.type ASC, persons.status ASC;
-- The results will be:
-- |ID | NAME| COUNT|
-- |3 | 1 | 2 |
-- |2 | 2 | 3 |
Best answer
Just change MAX() to MIN() to get the first row of each group instead of the last.
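As a rough sketch of that change, here is the grouping query from EXAMPLE 1 with MIN() in place of MAX(). It runs against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained; the SELECT itself is plain SQL. The table layouts and sample rows are assumptions made up for illustration and are not the data from the fiddle, and the alias row_count stands in for count.

```python
import sqlite3

# Hypothetical schema and rows, loosely modelled on the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, type INTEGER);
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, company_id INTEGER);
    INSERT INTO company VALUES (1, 'Google', 1), (2, 'Microsoft', 2);
    INSERT INTO persons VALUES
        (1, 'Adam', 1), (2, 'Bob', 2), (3, 'Peter', 2),
        (4, 'Carol', 2), (9, 'Alice', 1);
""")

# MIN(persons.id) returns the lowest id in each group instead of the highest.
first_per_company = conn.execute("""
    SELECT MIN(persons.id) AS id, company.name, COUNT(persons.id) AS row_count
    FROM persons
    LEFT JOIN company ON persons.company_id = company.id
    GROUP BY company.name
    ORDER BY company.name ASC
""").fetchall()

for row in first_per_company:
    print(row)
```

Note that MIN(persons.id) only gives the "first" row when first means lowest id within the group; a cascading order needs the longer form.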
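To get the extreme values of cascading columns, see SQL : Using GROUP BY and MAX on multiple columns. Use that in the subquery part of the query to get the rows containing those extreme values, as in SQL select only rows with max value on a column.
So the full query has the form: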
SELECT t1.id, t1.grouped_column, t2.count
FROM yourTable AS t1
-- t2: for each group, the smallest order_column2 among the rows tied on the smallest order_column1
JOIN (SELECT t3.grouped_column, t3.order_column1, MIN(t4.order_column2) AS order_column2,
             MAX(t3.count) AS count  -- t3.count is constant within a group; SUM would multiply it by the number of tied rows
      FROM (SELECT grouped_column, MIN(order_column1) AS order_column1, COUNT(*) AS count
            FROM yourTable
            GROUP BY grouped_column) AS t3  -- t3: row count and smallest order_column1 per group
      JOIN yourTable AS t4
      ON t3.grouped_column = t4.grouped_column AND t3.order_column1 = t4.order_column1
      GROUP BY t3.grouped_column, t3.order_column1) AS t2
ON t1.grouped_column = t2.grouped_column AND t1.order_column1 = t2.order_column1 AND t1.order_column2 = t2.order_column2
Since you want to operate on joins, I suggest you define a view that performs the joins. You can then use that view in place of yourTable in the query above.
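Here is a minimal sketch of the view idea, again run through Python's sqlite3 module against an in-memory SQLite database purely so it can be executed as-is. Everything in it is an assumption for illustration: the view name person_groups, the sample rows, the choice of company.type as the grouping column with persons.status and then persons.id as the cascading order, and the alias row_count in place of count. The inner count is carried out with MAX() because it is the same on every joined row of a group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, type INTEGER);
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, status TEXT,
                          company_id INTEGER);
    INSERT INTO company VALUES (1, 'Google', 1), (2, 'Microsoft', 2), (3, 'Apple', 1);
    INSERT INTO persons VALUES
        (1, 'Adam', 'b', 1), (2, 'Bob', 'a', 2), (3, 'Peter', 'a', 3),
        (4, 'Carol', 'a', 2), (9, 'Alice', 'a', 1);

    -- Hypothetical view: hides the joins and exposes the generic column names.
    CREATE VIEW person_groups AS
    SELECT persons.id     AS id,
           company.type   AS grouped_column,
           persons.status AS order_column1,
           persons.id     AS order_column2
    FROM persons
    LEFT JOIN company ON persons.company_id = company.id;
""")

first_rows = conn.execute("""
    SELECT t1.id, t1.grouped_column, t2.row_count
    FROM person_groups AS t1
    JOIN (SELECT t3.grouped_column, t3.order_column1,
                 MIN(t4.order_column2) AS order_column2,
                 MAX(t3.row_count) AS row_count
          FROM (SELECT grouped_column,
                       MIN(order_column1) AS order_column1,
                       COUNT(*) AS row_count
                FROM person_groups
                GROUP BY grouped_column) AS t3
          JOIN person_groups AS t4
            ON t3.grouped_column = t4.grouped_column
           AND t3.order_column1 = t4.order_column1
          GROUP BY t3.grouped_column, t3.order_column1) AS t2
      ON t1.grouped_column = t2.grouped_column
     AND t1.order_column1 = t2.order_column1
     AND t1.order_column2 = t2.order_column2
    ORDER BY t1.grouped_column
""").fetchall()

for row in first_rows:
    print(row)
```

In this sample, type 1 has three persons and the first by status and then id is Peter (id 3); type 2 has two persons and the first is Bob (id 2). Rows whose grouping column is NULL (for example, a person without a company) never satisfy the equality joins, so they would need separate handling.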
This question about how to select the first row of each group with a count in MySQL comes from Stack Overflow: https://stackoverflow.com/questions/49061296/