When I run the following query on a table with 22M rows, it takes 20 seconds:
select p.*,
(select avg(close)
from endOfDayData p2
where p2.symbol = p.symbol and
p2.date between p.date - interval 6 day and p.date
) as MvgAvg_X
from endOfDayData p
where p.symbol = 'AAPL'
The table structure is as follows:
mysql> desc endOfDayData;
+--------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| date | date | NO | PRI | NULL | |
| symbol | varchar(14) | NO | PRI | NULL | |
| open | decimal(10,4) | NO | | NULL | |
| high | decimal(10,4) | NO | | NULL | |
| low | decimal(10,4) | NO | | NULL | |
| close | decimal(10,4) | NO | | NULL | |
| volume | int(11) | NO | | NULL | |
+--------+---------------+------+-----+---------+-------+
The following indexes exist:
mysql> show index from endOfDayData;
+--------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| endOfDayData | 0 | PRIMARY | 1 | date | A | 162294 | NULL | NULL | | BTREE | | |
| endOfDayData | 0 | PRIMARY | 2 | symbol | A | 24019617 | NULL | NULL | | BTREE | | |
| endOfDayData | 1 | EOD_dates | 1 | date | A | 50145 | NULL | NULL | | BTREE | | |
| endOfDayData | 1 | EOD_symbol | 1 | symbol | A | 14322 | NULL | NULL | | BTREE | | |
+--------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
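The machine is a dedicated box with 80GB of RAM and dual processors. I feel this query should run in under a second with the right indexes. Thanks.
The EXPLAIN output for the query: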
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+------------------------------+------------+---------+--------------------+------+-----------------------+
| 1 | PRIMARY | p | ref | EOD_symbol | EOD_symbol | 16 | const | 8409 | Using index condition |
| 2 | DEPENDENT SUBQUERY | p2 | ref | PRIMARY,EOD_dates,EOD_symbol | EOD_symbol | 16 | financial.p.symbol | 1677 | Using index condition |
I created a new table with an ID int as the primary key and created an index on (symbol, date):
CREATE INDEX EODDateSym ON endOfDayData_new (symbol, date) USING BTREE;
It still takes 17 seconds. Thanks again for all the ideas and help.
My my.conf is:
[mysql]
# CLIENT #
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
# GENERAL #
user = mysql
default-storage-engine = InnoDB
socket = /var/lib/mysql/mysql.sock
pid-file = /var/lib/mysql/mysql.pid
# MyISAM #
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP
# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
# DATA STORAGE #
datadir = /var/lib/mysql/
# BINARY LOGGING #
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1
server_id = 1
# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096
# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 512M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 68G
# LOGGING #
log-error = /var/lib/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/lib/mysql/mysql-slow.log
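Best answer
Since you're matching on both symbol and date, you need a single index covering both columns, (symbol, date), for it to be effective with the conditions you've expressed.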
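MySQL will usually pick the single index best suited to the job for any given table. It can sometimes combine two indexes through its index merge optimization, but that is rarely a good substitute for one composite index on a lookup like this.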
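As a minimal sketch of the composite index described above, assuming the original endOfDayData table from the question: the index name EOD_symbol_date is a made-up placeholder, and whether the optimizer actually chooses the new index depends on the MySQL version and table statistics, so it is worth re-running EXPLAIN afterwards.

```sql
-- Hypothetical index name; the columns come from the table in the question.
ALTER TABLE endOfDayData
  ADD INDEX EOD_symbol_date (symbol, date);

-- Re-check the plan. If the optimizer picks the new index up, the `key`
-- column for p2 shows EOD_symbol_date instead of EOD_symbol, and the
-- `rows` estimate for the dependent subquery should drop.
EXPLAIN
SELECT p.*,
       (SELECT AVG(p2.close)
          FROM endOfDayData p2
         WHERE p2.symbol = p.symbol
           AND p2.date BETWEEN p.date - INTERVAL 6 DAY AND p.date
       ) AS MvgAvg_X
  FROM endOfDayData p
 WHERE p.symbol = 'AAPL';
```

The column order matters here: symbol comes first because it is compared by equality, and date comes second because it is compared by range.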
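Having both of those columns as your primary key is unusual and might hurt performance. A UNIQUE index on them, paired with a conventional INT AUTO_INCREMENT PRIMARY KEY id column, tends to work better. MySQL performs best when the primary key is as compact as possible.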
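The layout suggested above might look roughly like the following. This is only a sketch: the table name endOfDayData_v2, the id column, and the key name uq_symbol_date are hypothetical, while the remaining columns are copied from the desc output in the question.

```sql
-- Sketch only: endOfDayData_v2, id and uq_symbol_date are hypothetical names.
CREATE TABLE endOfDayData_v2 (
  id     INT UNSIGNED  NOT NULL AUTO_INCREMENT,
  date   DATE          NOT NULL,
  symbol VARCHAR(14)   NOT NULL,
  open   DECIMAL(10,4) NOT NULL,
  high   DECIMAL(10,4) NOT NULL,
  low    DECIMAL(10,4) NOT NULL,
  close  DECIMAL(10,4) NOT NULL,
  volume INT           NOT NULL,
  PRIMARY KEY (id),                          -- compact surrogate key
  UNIQUE KEY uq_symbol_date (symbol, date)   -- still one row per symbol per day
) ENGINE=InnoDB;

-- Copy the existing rows; id is generated automatically.
INSERT INTO endOfDayData_v2 (date, symbol, open, high, low, close, volume)
SELECT date, symbol, open, high, low, close, volume
  FROM endOfDayData;
```

The UNIQUE key keeps the one-row-per-symbol-per-day guarantee that the old composite primary key provided, and it doubles as the (symbol, date) lookup index.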
Regarding "mysql - correct indexes on a mysql table", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23142021/