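The following query, which depends on the links table with ~4k rows and the comments table with ~40k rows, currently takes around 0.2s. That seems pretty slow, considering there is not that much data.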
SELECT
t1.id, t1.url, t1.dateAdded
FROM links AS t1 LEFT JOIN
comments AS t2
ON (t1.id = t2.linkId)
WHERE
COALESCE(t2.dateAdded, t1.dateAdded) <= "2020-03-22 20:04:45"
GROUP BY t1.id
ORDER BY
COALESCE(
(
SELECT
MAX(dateAdded)
FROM comments
WHERE
linkId = t1.id AND
dateAdded <= "2020-03-22 20:04:45"
),
t1.dateAdded
) DESC,
t1.id DESC
LIMIT 10
t1.id is the primary key and t2.linkId is a foreign key. I also tried adding an index on dateAdded in both tables, but that did not seem to help.
To find the bottleneck, I reduced the query to the following and noticed that it takes 0.12s when ordering by t1.dateAdded, but only 0.003s when ordering by t1.id.
SELECT
t1.id, t1.url, t1.dateAdded
FROM links AS t1 LEFT JOIN
comments AS t2
ON (t1.id = t2.linkId)
WHERE
COALESCE(t2.dateAdded, t1.dateAdded) <= "2020-03-22 20:04:45"
GROUP BY t1.id
ORDER BY
t1.id DESC -- here I tried both t1.dateAdded and t1.id
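So I tried using EXPLAIN to find the difference. The only difference seems to be in the Extra field: for ORDER BY t1.id it is empty, and for ORDER BY t1.dateAdded it is Using temporary; Using filesort (note that I do have an index on t1.dateAdded). Unfortunately, I am a bit stuck on interpreting what this means and, more generally, on how to optimize the original query. Note that id is INT(10) and dateAdded is DATETIME.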
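In general, what I am trying to achieve is to order the links so that the newest links, or the links with the newest comments, come first, where "newest" is relative to a provided time (i.e. links and comments added after that time are not considered).
Thanks in advance for any help or hints.
EDIT: adding more details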
EXPLAIN for the simplified query ordered by t1.id:
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+-------------+
| 1 | SIMPLE | t1 | index | NULL | PRIMARY | 4 | NULL | 3674 | |
| 1 | SIMPLE | t2 | ref | fk_link_id | fk_link_id | 5 | db1.t1.id | 8 | Using where |
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+-------------+
EXPLAIN for the simplified query ordered by t1.dateAdded:
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+---------------------------------+
| 1 | SIMPLE | t1 | index | NULL | PRIMARY | 4 | NULL | 3674 | Using temporary; Using filesort |
| 1 | SIMPLE | t2 | ref | fk_link_id | fk_link_id | 5 | db1.t1.id | 8 | Using where |
+------+-------------+-------+-------+---------------+------------+---------+--------------+------+---------------------------------+
Info about the links table:
CREATE TABLE `links` (
`id` int(10) UNSIGNED NOT NULL,
`url` varchar(2083) CHARACTER SET utf8mb4 NOT NULL,
`dateAdded` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `links`
ADD PRIMARY KEY (`id`),
ADD KEY `dateAdded` (`dateAdded`);
Info about the comments table:
CREATE TABLE `comments` (
`id` int(10) UNSIGNED NOT NULL,
`linkId` int(10) UNSIGNED DEFAULT NULL,
`userId` int(10) UNSIGNED NOT NULL,
`content` varchar(2000) CHARACTER SET utf8mb4 NOT NULL,
`dateAdded` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `comments`
ADD PRIMARY KEY (`id`),
ADD KEY `fk_link_id` (`linkId`);
ALTER TABLE `comments`
ADD CONSTRAINT `fk_link_id` FOREIGN KEY (`linkId`) REFERENCES `links` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
Best answer
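I can start by pointing out that the GROUP BY in your query is unnecessary (though not wrong), since you are not selecting any aggregates. Beyond that, I think your life would be easier if you simply used MAX() as an analytic function and then ordered by it. Consider this version: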
WITH cte AS (
SELECT t1.id, t1.url, t1.dateAdded,
MAX(t2.dateAdded) OVER (PARTITION BY t1.id) maxDateAdded
FROM links AS t1
LEFT JOIN comments AS t2 ON t1.id = t2.linkId
WHERE
(t2.dateAdded IS NOT NULL AND t2.dateAdded <= '2020-03-22 20:04:45') OR
(t2.dateAdded IS NULL AND t1.dateAdded <= '2020-03-22 20:04:45')
)
SELECT id, url, dateAdded
FROM cte
ORDER BY maxDateAdded DESC, id DESC
LIMIT 10;
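One thing worth checking against real data: a window function does not collapse rows, so the CTE returns one row per matching comment, and maxDateAdded is NULL for links that have no comments (MySQL sorts NULL last under DESC). A minimal sketch of a variant that returns each link once and falls back to the link's own date, as the COALESCE in the question does, might look like the following. The alias lastActivity is made up for illustration.

```sql
-- Sketch only (MySQL 8+): same join and filter as the query above.
-- lastActivity is a hypothetical alias, not a column in either table.
WITH cte AS (
    SELECT DISTINCT
        t1.id, t1.url, t1.dateAdded,
        -- Newest qualifying comment per link, or the link's own date
        -- when the link has no comments at all.
        COALESCE(MAX(t2.dateAdded) OVER (PARTITION BY t1.id),
                 t1.dateAdded) AS lastActivity
    FROM links AS t1
    LEFT JOIN comments AS t2 ON t1.id = t2.linkId
    WHERE
        (t2.dateAdded IS NOT NULL AND t2.dateAdded <= '2020-03-22 20:04:45') OR
        (t2.dateAdded IS NULL AND t1.dateAdded <= '2020-03-22 20:04:45')
)
SELECT id, url, dateAdded
FROM cte
ORDER BY lastActivity DESC, id DESC
LIMIT 10;
```

DISTINCT is applied after the window function has been evaluated, so the repeated rows for a link, which are identical, collapse into one.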
This answer assumes you are using MySQL 8+. With a bit more effort, it could be rewritten for earlier versions of MySQL.
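As a rough sketch of what such a rewrite could look like without CTEs or window functions, the newest qualifying comment per link can be computed in a derived table and joined back to links. The aliases l, c and maxDateAdded are illustrative. Note one behavioral assumption: a link whose comments are all newer than the cutoff is still listed here by its own date, whereas a join followed by a filter drops it.

```sql
-- Sketch for MySQL 5.x: pre-aggregate comments, then join once per link.
SELECT l.id, l.url, l.dateAdded
FROM links AS l
LEFT JOIN (
    -- Newest comment per link, ignoring comments after the cutoff.
    SELECT linkId, MAX(dateAdded) AS maxDateAdded
    FROM comments
    WHERE dateAdded <= '2020-03-22 20:04:45'
    GROUP BY linkId
) AS c ON c.linkId = l.id
WHERE
    c.linkId IS NOT NULL                     -- has a comment before the cutoff
    OR l.dateAdded <= '2020-03-22 20:04:45'  -- or the link itself is old enough
ORDER BY COALESCE(c.maxDateAdded, l.dateAdded) DESC, l.id DESC
LIMIT 10;
```

Because the derived table yields at most one row per link, no GROUP BY or DISTINCT is needed in the outer query.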
To optimize the above query, the following indexes may help:
CREATE INDEX idx2 ON comments (linkID, dateAdded);
CREATE INDEX idx1 ON links (dateAdded, url, id);
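These indexes, if used, would speed up the join and would also allow the call to MAX to be evaluated quickly. Note that I have rewritten the WHERE clause to be sargable, avoiding the call to COALESCE.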
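Whether the optimizer actually picks these indexes depends on the data and the server version, so it is worth confirming rather than assuming. A small sketch of one way to check, using the index name idx2 from above:

```sql
-- List the indexes that currently exist on comments.
SHOW INDEX FROM comments;

-- Ask the optimizer how it would find the newest comment per link.
-- If idx2 shows up in the `key` column of the output, it is being used.
EXPLAIN
SELECT linkId, MAX(dateAdded)
FROM comments
WHERE dateAdded <= '2020-03-22 20:04:45'
GROUP BY linkId;
```

Running EXPLAIN on the full query before and after creating the indexes, and comparing the key and Extra columns, gives the same kind of comparison as the two EXPLAIN outputs in the question.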
Regarding mysql - Optimizing query with ORDER BY on DATETIME column, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/60817361/