mysql - 加快MYSQL中的自连接

标签 mysql sql innodb inner-join

我有一个不同对象之间的连接表,我基本上是在尝试使用自连接进行图遍历。 我的表定义为:

CREATE TABLE `connections` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `position` int(11) NOT NULL,
  `dId` bigint(20) NOT NULL,
  `sourceId` bigint(20) NOT NULL,
  `targetId` bigint(20) NOT NULL,
  `type` bigint(20) NOT NULL,
  `weight` float NOT NULL DEFAULT '1',
  `refId` bigint(20) NOT NULL,
  `ts` bigint(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `sourcetype` (`type`,`sourceId`,`targetId`),
  KEY `targettype` (`type`,`targetId`,`sourceId`),
  KEY `complete` (`dId`,`sourceId`,`targetId`,`type`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

该表包含大约 3M 条目(~1K 类型 1,1M 类型 2,2M 类型 3)。

超过 2 或 3 跳的查询实际上非常快(当然需要一段时间才能收到所有结果),但是获取 3 跳的查询计数非常慢(> 30 秒)。

这是查询(返回 2M):

SELECT
  count(*)
FROM
  `connections` AS `t0`
JOIN
  `connections` AS `t1` ON `t1`.`targetid`=`t0`.`sourceid`
JOIN
  `connections` AS `t2` ON `t2`.`targetid`=`t1`.`sourceid`
WHERE
  `t2`.dId = 1
  AND
  `t2`.`sourceid` = 1
  AND
  `t2`.`type` = 1
  AND
  `t1`.`type` = 2
  AND
  `t0`.`type` = 3;

这是相应的解释:

id  select_type  table  type  possible_keys                   key         key_len  ref                         rows  Extra  
1   SIMPLE       t2     ref   targettype,complete,sourcetype  complete    16       const,const                  100  Using where; Using index
1   SIMPLE       t1     ref   targettype,sourcetype           targettype   8       const                       2964  Using where; Using index
1   SIMPLE       t0     ref   targettype,sourcetype           sourcetype  16       const,travtest.t1.targetId  2964  Using index

编辑:这是添加和索引到 type 后的 EXPLAIN:

id  select_type  table  type  possible_keys                        key         key_len  ref                         rows  Extra     
1   SIMPLE       t2     ref   type,complete,sourcetype,targettype  complete    16       const,const                 100   Using where; Using index
1   SIMPLE       t1     ref   type,sourcetype,targettype           sourcetype  16       const,travtest.t2.targetId    2   Using index
1   SIMPLE       t0     ref   type,sourcetype,targettype           sourcetype  16       const,travtest.t1.targetId    2   Using index

有什么办法可以改善吗?

第二次编辑:

EXPLAN EXTENDED:
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys                       | key        | key_len | ref                        | rows | filtered | Extra                    |
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+
|  1 | SIMPLE      | t2    | ref  | type,complete,sourcetype,targettype | complete   | 16      | const,const                |  100 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | t1    | ref  | type,sourcetype,targettype          | sourcetype | 16      | const,travtest.t2.targetId |    1 |   100.00 | Using index              |
|  1 | SIMPLE      | t0    | ref  | type,sourcetype,targettype          | sourcetype | 16      | const,travtest.t1.targetId |    1 |   100.00 | Using index              |
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+

SHOW WARNINGS;
+-------+------+--------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                    |
+-------+------+--------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(0) AS `count(*)` from `travtest`.`connections` `t0`            |
|       |      | join `travtest`.`connections` `t1` join `travtest`.`connections` `t2`                      |
|       |      | where ((`travtest`.`t0`.`sourceId` = `travtest`.`t1`.`targetId`) and                       |
|       |      | (`travtest`.`t1`.`sourceId` = `travtest`.`t2`.`targetId`) and (`travtest`.`t0`.`type` = 3) |
|       |      | and (`travtest`.`t1`.`type` = 2) and (`travtest`.`t2`.`type` = 1) and                      |
|       |      | (`travtest`.`t2`.`sourceId` = 1) and (`travtest`.`t2`.`dId` = 1))                          |
+-------+------+--------------------------------------------------------------------------------------------+

最佳答案

sourceidtargetidtype 列创建索引,然后尝试使用此查询:

SELECT
  count(*)
FROM
  `connections` AS `t0`
JOIN
  `connections` AS `t1` ON `t1`.`targetid`=`t0`.`sourceid` and `t1`.`type` = 2
JOIN
  `connections` AS `t2` ON `t2`.`targetid`=`t1`.`sourceid` and `t2`.dId = 1 AND `t2`.`sourceid` = 1 AND `t2`.`type` = 1
WHERE
  `t0`.`type` = 3;

--------更新-----

我认为这些索引是正确的,并且通过这些大表,您可以达到最佳优化。我认为您无法通过表分区/分片等其他优化来改进此查询。

如果这些数据不经常更改或者我看到的唯一方法是垂直缩放,您可以实现某种缓存

关于mysql - 加快MYSQL中的自连接,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19910020/

相关文章:

mysql - 输出订单号和收入,其中与订单号相关的所有采购订单号已发货或准备发货

c# - 使用存储过程最优雅的方法是什么?

mysql外键问题

MySQL 普通文本到脊椎大小写转换

php - 左外连接查询中的简单逻辑

php - 客户想要一个像 Facebook Like 按钮一样的“关联”按钮

php - 根据各自的日期时间字段从两个不同的表中获取记录 - mysql

sql - 将 sql 整数舍入到最接近的小时

mysql - 无论是否在子查询中,WHERE 子句都会给出不同的结果

php - Wamp 服务器与普通的互联网托管服务提供商有何不同?