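I'm experimenting with partitioning on some dummy data, and so far I have had no luck speeding up my query.
I downloaded a dataset from the internet that consists of a single table of measurements: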
CREATE TABLE `partitioned_measures` (
`measure_timestamp` datetime NOT NULL,
`station_name` varchar(255) DEFAULT NULL,
`wind_mtsperhour` int(11) NOT NULL,
`windgust_mtsperhour` int(11) NOT NULL,
`windangle` int(3) NOT NULL,
`rain_mm` decimal(5,2) DEFAULT NULL,
`temperature_dht11` int(5) DEFAULT NULL,
`humidity_dht11` int(5) DEFAULT NULL,
`barometric_pressure` decimal(10,2) NOT NULL,
`barometric_temperature` decimal(10,0) NOT NULL,
`lux` decimal(7,2) DEFAULT NULL,
`is_plugged` tinyint(1) DEFAULT NULL,
`battery_level` int(3) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (TO_DAYS(measure_timestamp))
(PARTITION `slow` VALUES LESS THAN (736634) ENGINE = InnoDB,
PARTITION `fast` VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
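Purely as a learning exercise, I wanted to try partitioning the measurements by measure_timestamp (without the help of an index). Specifically, I thought it would be interesting to put the most recent month into a partition of its own. (I know it is better to have partitions of similar size, but I just wanted to try it.)
I added the partitions with the following command (note that the dataset ends in December 2016, and the vast majority of the data points fall in the earlier months):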
ALTER TABLE partitioned_measures
PARTITION BY RANGE(TO_DAYS(measure_timestamp)) (
PARTITION slow VALUES LESS THAN(TO_DAYS('2016-12-01')),
PARTITION fast VALUES LESS THAN (MAXVALUE)
);
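As a quick sanity check on a layout like this, the sketch below decodes the numeric bounds and lists how many rows each partition holds. It assumes the table lives in the currently selected schema; TABLE_ROWS is only an estimate for InnoDB tables.

```sql
-- Convert between dates and the day numbers used as partition bounds.
SELECT TO_DAYS('2016-12-01');   -- 736664
SELECT FROM_DAYS(736634);       -- 2016-11-01

-- Estimated row count and upper bound for each partition.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'partitioned_measures';
```

The bound 736634 in the CREATE TABLE output above decodes to 2016-11-01 rather than 2016-12-01, so it is worth running SHOW CREATE TABLE again after the ALTER TABLE to confirm which bound is actually in effect.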
For the query, I look at all entries from December 2nd onward (just to make sure I only touch the most recent partition):
select SQL_NO_CACHE COUNT(*) FROM partitioned_measures
WHERE measure_timestamp >= '2016-12-02'
AND DAYOFWEEK(measure_timestamp) = 1;
When I prepend EXPLAIN to it, I get the following:
+----+-------------+----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | partitioned_measures | slow,fast | ALL | NULL | NULL | NULL | NULL | 1835458 | 33.33 | Using where |
+----+-------------+----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
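The query time is roughly the same as before partitioning (about 1.6 seconds), and EXPLAIN still lists both partitions. I have never used partitioning before, so I feel like I'm missing something conceptual.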
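One way to check whether pruning would help at all is to restrict the query to a partition by hand. This is a sketch using explicit partition selection, which assumes MySQL 5.6 or later. If this version runs noticeably faster than the original query, the optimizer is not pruning on its own.

```sql
-- Read only the `fast` partition, bypassing the optimizer's pruning decision.
SELECT SQL_NO_CACHE COUNT(*)
FROM partitioned_measures PARTITION (fast)
WHERE measure_timestamp >= '2016-12-02'
  AND DAYOFWEEK(measure_timestamp) = 1;
```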
Best answer
This was tricky, but I found a working solution, or rather a workaround, for what looks like a MySQL bug:
ALTER TABLE partitioned_measures
PARTITION BY RANGE COLUMNS(measure_timestamp) (
PARTITION slow VALUES LESS THAN('2016-12-01'),
PARTITION fast VALUES LESS THAN(MAXVALUE)
);
See the demo, which uses partition pruning correctly.
I took note of the syntax here.
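To check the result locally, rerun EXPLAIN on the original query after the ALTER TABLE. This is a minimal sketch; the expectation that only `fast` is listed follows from the answer's claim, not from output captured here.

```sql
EXPLAIN
SELECT SQL_NO_CACHE COUNT(*)
FROM partitioned_measures
WHERE measure_timestamp >= '2016-12-02'
  AND DAYOFWEEK(measure_timestamp) = 1;
-- Inspect the `partitions` column: with pruning in effect it should list
-- only `fast` instead of `slow,fast`.
```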
I still find it odd that partition pruning does not work correctly with:
ALTER TABLE partitioned_measures
PARTITION BY RANGE(TO_DAYS(measure_timestamp)) (
PARTITION slow VALUES LESS THAN(TO_DAYS('2016-12-01')),
PARTITION fast VALUES LESS THAN (MAXVALUE)
);
MySQL 5.7 should be able to do partition pruning with TO_DAYS() just fine:
Pruning can also be applied for tables partitioned on a DATE or DATETIME column when the partitioning expression uses the YEAR() or TO_DAYS() function. In addition, in MySQL 5.7
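A commonly reported explanation, offered here as an assumption rather than something this answer confirms: TO_DAYS() returns NULL for invalid dates, RANGE partitioning stores NULL values in the lowest partition, and so the optimizer keeps the first partition in the scan list. Since `slow` is the first partition, that would match the `slow,fast` seen in the EXPLAIN output. The usual workaround is a small, empty leading partition; the name `p_empty` below is hypothetical.

```sql
-- Sketch: add an empty first partition so the one that is always scanned is tiny.
ALTER TABLE partitioned_measures
PARTITION BY RANGE (TO_DAYS(measure_timestamp)) (
    PARTITION p_empty VALUES LESS THAN (0),   -- hypothetical name; holds no valid dates
    PARTITION slow VALUES LESS THAN (TO_DAYS('2016-12-01')),
    PARTITION fast VALUES LESS THAN (MAXVALUE)
);
```

If this explanation applies, EXPLAIN would then list `p_empty,fast`, and scanning the empty partition costs next to nothing.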
Regarding "mysql - Can't change query time with partitioning", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55975135/