mysql - Slow GROUP BY query on a blob column

Tags: mysql sql indexing query-optimization explain

I'm using the following query to extract the most frequent short values from a mediumblob column:

select bytes, count(*) as n
from pr_value
where bytes is not null && length(bytes)<11 and variable_id=5783
group by bytes order by n desc limit 10;

My problem is that this query takes too long (about 10 seconds for fewer than 1 million rows):

mysql> select bytes, count(*) as n from pr_value where bytes is not null && length(bytes)<11 and variable_id=5783 group by bytes order by n desc limit 10;
+-------+----+
| bytes | n  |
+-------+----+
| 32    | 21 |
| 27    | 20 |
| 52    | 20 |
| 23    | 19 |
| 25    | 19 |
| 26    | 19 |
| 28    | 19 |
| 29    | 19 |
| 30    | 19 |
| 31    | 19 |
+-------+----+

The table looks like this (irrelevant columns not shown):

mysql> describe pr_value;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| product_id  | int(11)       | NO   | PRI | NULL    |       |
| variable_id | int(11)       | NO   | PRI | NULL    |       |
| author_id   | int(11)       | NO   | PRI | NULL    |       |
| bytes       | mediumblob    | YES  | MUL | NULL    |       |
+-------------+---------------+------+-----+---------+-------+

The type is mediumblob because most values are large. Fewer than 10% of them are as short as the ones I'm looking for with this particular query.

I have the following indexes:

mysql> show index from pr_value;
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pr_value |          0 | PRIMARY  |            1 | product_id  | A         |        8961 |     NULL | NULL   |      | BTREE      |         |               |
| pr_value |          0 | PRIMARY  |            2 | variable_id | A         |      842402 |     NULL | NULL   |      | BTREE      |         |               |
| pr_value |          0 | PRIMARY  |            3 | author_id   | A         |      842402 |     NULL | NULL   |      | BTREE      |         |               |
| pr_value |          1 | bytes    |            1 | bytes       | A         |      842402 |       10 | NULL   | YES  | BTREE      |         |               |
| pr_value |          1 | bytes    |            2 | variable_id | A         |      842402 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

This is how MySQL explains my query:

mysql> explain select bytes, count(*) as n from pr_value where bytes is not null && length(bytes)<11 and variable_id=5783 group by bytes order by n desc limit 10; 
+----+-------------+----------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
| id | select_type | table    | type  | possible_keys | key   | key_len | ref  | rows   | Extra                                        |
+----+-------------+----------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | pr_value | range | bytes         | bytes | 13      | NULL | 421201 | Using where; Using temporary; Using filesort |
+----+-------------+----------+-------+---------------+-------+---------+------+--------+----------------------------------------------+

Note that the condition on the length of the bytes column can be removed without changing the duration.

What can I do to make this query faster?

Of course, I'd rather not have to add a column.

Best answer

Your index on (bytes, variable_id) is not well chosen. If your queries always have a condition on variable_id, you should put variable_id first in the index:

(variable_id, bytes)

How much this helps depends on how selective variable_id is, but it should help.
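
As a minimal sketch of that suggestion: an index on a BLOB column needs a prefix length in MySQL, so the statement below reuses the 10-byte prefix that the existing `bytes` index already has (Sub_part = 10 in the SHOW INDEX output). The index name `variable_id_bytes` is made up for illustration.

```sql
-- Hypothetical index name; the prefix length of 10 mirrors the existing
-- `bytes` index and is an assumption, not something the answer specifies.
ALTER TABLE pr_value
  ADD INDEX variable_id_bytes (variable_id, bytes(10));

-- Re-check the plan: the hope is an access on variable_id (type "ref")
-- instead of a range scan over the bytes prefix.
EXPLAIN
SELECT bytes, COUNT(*) AS n
FROM pr_value
WHERE bytes IS NOT NULL AND LENGTH(bytes) < 11 AND variable_id = 5783
GROUP BY bytes
ORDER BY n DESC
LIMIT 10;
```

Because a prefix index cannot act as a covering index, MySQL still has to read the matching rows to evaluate LENGTH(bytes) and to group on the full value, so the gain comes mainly from narrowing the scan to a single variable_id.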

Another trick is to add a new indexed column that stores the result of "length(bytes)<11":

update pr_value set small = length(bytes)<11;

Then add a new index on (small, variable_id).
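
Put together, this second approach might look like the sketch below. The column type, default, and index name `small_variable_id` are assumptions; the answer only names the column `small` and the index columns. The UPDATE also adds an explicit NULL check so that rows with a NULL bytes value get 0 rather than NULL.

```sql
-- Assumed definition of the flag column (not given in the answer).
ALTER TABLE pr_value
  ADD COLUMN small TINYINT(1) NOT NULL DEFAULT 0;

-- Populate the flag: 1 for non-NULL values shorter than 11 bytes, else 0.
UPDATE pr_value
SET small = (bytes IS NOT NULL AND LENGTH(bytes) < 11);

-- Hypothetical index name.
ALTER TABLE pr_value
  ADD INDEX small_variable_id (small, variable_id);

-- The original query, filtering on the flag instead of LENGTH(bytes).
SELECT bytes, COUNT(*) AS n
FROM pr_value
WHERE small = 1 AND variable_id = 5783
GROUP BY bytes
ORDER BY n DESC
LIMIT 10;
```

The flag has to be kept in sync whenever bytes is inserted or updated, for example in application code or with triggers, which is the cost of this approach compared with changing only the index.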

For more on mysql - slow GROUP BY query on a blob column, see the similar question on Stack Overflow: https://stackoverflow.com/questions/11311235/
