I want to optimize the following query:
SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics t, bb_posters ps
WHERE t.topic_id = ps.topic_id
AND forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 20
Query took 0.1475 sec
So at first I replaced the WHERE ... IN filter with an INNER JOIN on a subquery:
SELECT SQL_NO_CACHE t.topic_id
FROM ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) ) t
INNER JOIN bb_posters ps ON t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20
Query took 0.1541 sec
Then I tried creating a temporary table:
CREATE TEMPORARY TABLE IF NOT EXISTS bb_topics_tmp ( INDEX(topic_id) )
ENGINE=MEMORY
AS ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) );
SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics_tmp t, bb_posters ps
WHERE t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20
Query took 0.1467 sec
I don't understand why selecting from the full table of 38,522 rows is so much faster than selecting from the temporary table of 9,943 rows:
SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics t, bb_posters ps
WHERE t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20
Query took 0.0006 sec
Both topic_id and timestamp are indexed.
Interestingly, even a condition like this one is much faster than filtering by the forum list:
AND pt.post_text LIKE '%searchterm%'
Update:
Here is the EXPLAIN output:
SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
a.attach_id, pt.bbcode_uid, pt.post_text
FROM bb_topics t
JOIN bb_posters ps ON ps.topic_id = t.topic_id
LEFT JOIN bb_users u ON u.user_id = t.topic_poster
LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 1, 20
id  select_type  table  type    possible_keys     key       key_len  ref                rows  Extra
1   SIMPLE       t      range   PRIMARY,forum_id  forum_id  2        NULL               8379  Using where; Using temporary; Using filesort
1   SIMPLE       ps     eq_ref  topic_id          topic_id  3        DB.t.topic_id      1
1   SIMPLE       u      eq_ref  PRIMARY           PRIMARY   3        DB.t.topic_poster  1     Using index
1   SIMPLE       pt     eq_ref  PRIMARY           PRIMARY   3        DB.bt.post_id      1     Using index
1   SIMPLE       a      eq_ref  PRIMARY           PRIMARY   3        DB.bt.attach_id    1     Using index
Query took 0.8527 sec
The same query without WHERE t.forum_id IN:
id  select_type  table  type    possible_keys  key        key_len  ref                rows  Extra
1   SIMPLE       ps     index   topic_id       timestamp  4        NULL               21
1   SIMPLE       t      eq_ref  PRIMARY        PRIMARY    3        DB.bt.topic_id     1
1   SIMPLE       u      eq_ref  PRIMARY        PRIMARY    3        DB.t.topic_poster  1
1   SIMPLE       pt     eq_ref  PRIMARY        PRIMARY    3        DB.bt.post_id      1
1   SIMPLE       a      eq_ref  PRIMARY        PRIMARY    3        DB.bt.attach_id    1
Query took 0.0022 sec
Update 2:
Adding USE INDEX (timestamp) solved the problem:
SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
a.attach_id, pt.bbcode_uid, pt.post_text
FROM bb_topics t
JOIN bb_posters ps USE INDEX (timestamp) ON ps.topic_id = t.topic_id
LEFT JOIN bb_users u ON u.user_id = t.topic_poster
LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 1, 20
Query took 0.0023 sec
Best answer
These are not especially difficult queries. You are doing the right thing by using SQL_NO_CACHE and timing them, but you also need to look at the EXPLAIN output.
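As a minimal sketch of that step, prefixing a query with EXPLAIN shows the join order MySQL picked and the index it chose for each table. The query below is the one from the question, written with JOIN syntax; nothing else is assumed.

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics AS t
JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp DESC
LIMIT 20;
```

In the output, "Using temporary; Using filesort" on the first table means MySQL collects the matching rows and sorts them before applying LIMIT, which is what the slow plan in the question shows.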
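Use JOIN syntax rather than a comma-separated list of tables. The two forms should be equivalent, but the old-style syntax is harder to read.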
SELECT SQL_NO_CACHE
t.topic_id
FROM bb_topics AS t
JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 20
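Try using some compound (multi-column) covering indexes to take performance to the next level.
You need to order the bb_posters table by timestamp, and you also need its topic_id. So try this index: (timestamp, topic_id)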
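A sketch of how that index could be created. The index name idx_timestamp_topic is made up for illustration; only the table and column names come from the question.

```sql
-- Hypothetical index name; table and columns are the ones named above.
-- timestamp comes first so rows can be read in ORDER BY order;
-- topic_id is included so the join column can be read from the index itself.
CREATE INDEX idx_timestamp_topic ON bb_posters (`timestamp`, topic_id);
```

After adding it, rerun EXPLAIN on the query to check whether the optimizer actually picks the new index for ps.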
If you can add a condition like
WHERE ps.timestamp >= DATE(NOW()) - INTERVAL 7 DAY
to limit the time range being searched, that will help performance.
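Put together with the forum filter, the time-range restriction might look like the sketch below. It assumes ps.timestamp is a DATETIME or TIMESTAMP column, which the question does not state; the 7-day window is only an example value.

```sql
-- Assumption: ps.timestamp is a DATETIME/TIMESTAMP column.
-- If it stores Unix epoch seconds as an integer, compare against
-- UNIX_TIMESTAMP(DATE(NOW()) - INTERVAL 7 DAY) instead.
SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics AS t
JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
  AND ps.timestamp >= DATE(NOW()) - INTERVAL 7 DAY
ORDER BY ps.timestamp DESC
LIMIT 20;
```

Note that this changes the result when fewer than 20 matching rows fall inside the window, so it only fits if recent rows are all you need.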
You need topic_id and forum_id from the bb_topics table. So try this index: (topic_id, forum_id)
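A sketch of the corresponding statement. As before, the index name idx_topic_forum is hypothetical; the table and columns are the ones from the question.

```sql
-- Hypothetical index name.
-- Holds both columns the query reads from bb_topics, so the join lookup
-- on topic_id and the forum_id filter can be answered from the index.
CREATE INDEX idx_topic_forum ON bb_topics (topic_id, forum_id);
```

If EXPLAIN then reports "Using index" in the Extra column for t, the query is being served from the index without reading the table rows.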
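You can use similar compound covering indexes on the other tables you are joining.
If your tables are well indexed, queries against them should be as efficient as queries against temporary tables. Creating a temporary table tends to have side effects on the server, such as evicting table data cached in RAM, which can hurt performance in unexpected ways.
A similar question about slow MySQL subqueries and temporary tables can be found on Stack Overflow: https://stackoverflow.com/questions/22580314/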