mysql - Optimizing a MySQL query with GROUP BY and GROUP_CONCAT that does not use the main PK

Tags: mysql database query-optimization

My instance runs MySQL version 5.6.34-log.

Problem summary: the query below takes 40 seconds. Total row counts per table:

  • ORDER_ITEM: 758423 records

  • PAYMENT: 177272 records

  • submission_entry: 2165698 records

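Details:

  • This is the query, see [1].

  • I added SQL_NO_CACHE so that repeated test runs are not served from
    the query cache.

  • I have optimized the indexes, see [2], but it brought no significant
    improvement.

  • The table structures are in [3].

  • The EXPLAIN plan that was used is in [4].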

[1]

     SELECT SQL_NO_CACHE
          `payment`.`id`                                                                                    AS id,
          `order_item`.`order_id`                                                                           AS order_id,


          GROUP_CONCAT(DISTINCT (CASE WHEN submission_entry.text = '' OR submission_entry.text IS NULL
            THEN ' '
                                 ELSE submission_entry.text END) ORDER BY question.var DESC SEPARATOR 0x1D) AS buyer,


          event.name                                                                                        AS event,
          COUNT(DISTINCT CASE WHEN (`order_item`.status > 0 OR (
            `order_item`.status != -1 AND `order_item`.status >= -2 AND `payment`.payment_type_id != 8 AND
            payment.make_order_free = 1))
            THEN `order_item`.id
                         ELSE NULL END)                                                                     AS qty,
          payment.currency                                                                                  AS `currency`,
          (SELECT SUM(order_item.sub_total)
           FROM order_item
           WHERE payment_id =
                 payment.id)                                                                                AS sub_total,
          CASE WHEN payment.make_order_free = 1
            THEN ROUND(payment.total + COALESCE(refunds_total, 0), 2)
          ELSE ROUND(payment.total, 2) END                                                                  AS 'total',
          `payment_type`.`name`                                                                             AS payment_type,
          payment_status.name                                                                               AS status,
          `payment_status`.`id`                                                                             AS status_id,
          DATE_FORMAT(CONVERT_TZ(order_item.`created`, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS 'created',
          `user`.`name`                                                                                     AS 'agent',
          event.id                                                                                          AS event_id,
          payment.checked,
          DATE_FORMAT(CONVERT_TZ(payment.checked_date, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS checked_date,
          DATE_FORMAT(CONVERT_TZ(`payment`.`complete_date`, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS `complete date`,
          `payment`.`delivery_status`                                                                       AS `delivered`
        FROM `order_item`
          INNER JOIN `payment`
            ON payment.id = `order_item`.`payment_id` AND (payment.status > 0.0 OR payment.status = -3.0)
          LEFT JOIN (SELECT
                       sum(`payment_refund`.total) AS `refunds_total`,
                       payment_refunds.payment_id  AS `payment_id`
                     FROM payment
                       INNER JOIN `payment_refunds` ON payment_refunds.payment_id = payment.id
                       INNER JOIN `payment` AS `payment_refund`
                         ON `payment_refund`.id = `payment_refunds`.payment_id_refund
                     GROUP BY `payment_refunds`.payment_id) AS `refunds` ON `refunds`.payment_id = payment.id
#           INNER JOIN event_date_product ON event_date_product.id = order_item.event_date_product_id
#           INNER JOIN event_date ON event_date.id = event_date_product.event_date_id
          INNER JOIN event ON event.id = order_item.event_id
          INNER JOIN payment_status ON payment_status.id = payment.status
          INNER JOIN payment_type ON payment_type.id = payment.payment_type_id
          LEFT JOIN user ON user.id = payment.completed_by
          LEFT JOIN submission_entry ON submission_entry.form_submission_id = `payment`.`form_submission_id`
          LEFT JOIN question ON question.id = submission_entry.question_id AND question.var IN ('name', 'email')
        WHERE 1 = '1' AND (order_item.status > 0.0 OR order_item.status = -2.0)
        GROUP BY `order_item`.`order_id`
        HAVING 1 = '1'
        ORDER BY `order_item`.`order_id` DESC
        LIMIT 10

[2]

 CREATE INDEX order_id
      ON order_item (order_id);

    CREATE INDEX payment_id
      ON order_item (payment_id);

    CREATE INDEX status
      ON order_item (status);

Second table

CREATE INDEX payment_type_id
  ON payment (payment_type_id);

CREATE INDEX status
  ON payment (status);

[3]

CREATE TABLE order_item
(
  id                         INT AUTO_INCREMENT
    PRIMARY KEY,
  order_id                   INT                                 NOT NULL,
  form_submission_id         INT                                 NULL,
  status                     DOUBLE DEFAULT '0'                  NULL,
  payment_id                 INT DEFAULT '0'                     NULL
);

Second table

CREATE TABLE payment
(
  id                 INT AUTO_INCREMENT,
  payment_type_id    INT                                 NOT NULL,
  status             DOUBLE                              NOT NULL,
  form_submission_id INT                                 NOT NULL,
  PRIMARY KEY (id, payment_type_id)
);

[4] Run the code snippet to see the EXPLAIN output in HTML format

<!DOCTYPE html>
<html>
<head>
  <title></title>
</head>
<body>
<table border="1" style="border-collapse:collapse">
<tr><th>id</th><th>select_type</th><th>table</th><th>type</th><th>possible_keys</th><th>key</th><th>key_len</th><th>ref</th><th>rows</th><th>Extra</th></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment_status</td><td>range</td><td>PRIMARY</td><td>PRIMARY</td><td>8</td><td>NULL</td><td>4</td><td>Using where; Using temporary; Using filesort</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment</td><td>ref</td><td>PRIMARY,payment_type_id,status</td><td>status</td><td>8</td><td>exp_live_18092017.payment_status.id</td><td>17357</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment_type</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment.payment_type_id</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>user</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment.completed_by</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>submission_entry</td><td>ref</td><td>form_submission_id,idx_submission_entry_1</td><td>form_submission_id</td><td>4</td><td>exp_live_18092017.payment.form_submission_id</td><td>2</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>question</td><td>eq_ref</td><td>PRIMARY,var</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.submission_entry.question_id</td><td>1</td><td>Using where</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>order_item</td><td>ref</td><td>status,payment_id</td><td>payment_id</td><td>5</td><td>exp_live_18092017.payment.id</td><td>3</td><td>Using where</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>event</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.order_item.event_id</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>&lt;derived3&gt;</td><td>ref</td><td>key0</td><td>key0</td><td>5</td><td>exp_live_18092017.payment.id</td><td>10</td><td>Using where</td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment_refunds</td><td>index</td><td>payment_id,payment_id_refund</td><td>payment_id</td><td>4</td><td>NULL</td><td>1110</td><td></td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment</td><td>ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment_refunds.payment_id</td><td>1</td><td>Using index</td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment_refund</td><td>ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment_refunds.payment_id_refund</td><td>1</td><td></td></tr>
<tr><td>2</td><td>DEPENDENT SUBQUERY</td><td>order_item</td><td>ref</td><td>payment_id</td><td>payment_id</td><td>5</td><td>func</td><td>3</td><td></td></tr></table>
</body>
</html>

Expected result

The query must take less than 5 seconds instead of 40 seconds.

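Important updates

1) Reply to comment 1: those two tables have no foreign keys at all.

Update 1: Locally, the original query takes 40 seconds. If I remove the following, it drops to 25 seconds, saving 15 seconds: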

GROUP_CONCAT(DISTINCT (CASE WHEN submission_entry.text = '' OR submission_entry.text IS NULL
    THEN ' '
                         ELSE submission_entry.text END) ORDER BY question.var DESC SEPARATOR 0x1D) AS buyer

If I remove the following, it still takes about 40 seconds, so nothing is saved:

COUNT(DISTINCT CASE WHEN (`order_item`.status > 0 OR (
    `order_item`.status != -1 AND `order_item`.status >= -2 AND `payment`.payment_type_id != 8 AND
    payment.make_order_free = 1))
    THEN `order_item`.id
                 ELSE NULL END)                                                                     AS qty,

If I remove the following, it takes about 36 seconds, saving 4 seconds:

(SELECT SUM(order_item.sub_total)
   FROM order_item
   WHERE payment_id =
         payment.id)                                                                                AS sub_total,
  CASE WHEN payment.make_order_free = 1
    THEN ROUND(payment.total + COALESCE(refunds_total, 0), 2)
  ELSE ROUND(payment.total, 2) END                                                                  AS 'total',

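Best answer

Remove HAVING 1=1; the optimizer may not be smart enough to ignore it. Please provide EXPLAIN SELECT (not in HTML) so we can see what the optimizer is doing.

Using a composite PK seems wrong in this case: PRIMARY KEY (id, payment_type_id). Please justify it.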
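One hint that the composite key matters: in the EXPLAIN output in [4], the payment table inside the derived refunds subquery is accessed with type ref on PRIMARY (key_len 4) rather than eq_ref, because id on its own is not declared unique. The following is a minimal sketch of how one might check whether id is unique by itself and, if so, shrink the key. It assumes nothing else depends on the two-column key, and it is best tried on a copy of the data first.

```sql
-- Check: does any payment.id occur more than once?
-- An empty result means id alone identifies a row.
SELECT id, COUNT(*) AS copies
FROM payment
GROUP BY id
HAVING COUNT(*) > 1;

-- Only if the check above returns no rows: make id the sole primary key.
-- payment_type_id keeps its own secondary index from [2].
ALTER TABLE payment
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id);
```

Doing the drop and the add in one ALTER TABLE keeps the AUTO_INCREMENT column indexed throughout, which MySQL requires.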

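Please explain the meaning of status, or why it needs to be a DOUBLE: status DOUBLE

Finding out why the query is so slow will take some effort. Let's start by tossing the normalization parts, such as the dates, the event name and the currency. That is, pare the query down to just enough to find the desired rows, without the details for each row. If it is still slow, let's debug that. If it is "fast", add the other parts back one at a time to find out which one causes the performance problem.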
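The paring-down described above could look roughly like the sketch below. It keeps only the tables and conditions from [1] that decide which orders qualify, and it assumes the inner joins to event, payment_status and payment_type never filter rows out (if they do, they belong in step 1 as well).

```sql
-- Step 1: only what is needed to pick the ten newest qualifying orders.
SELECT SQL_NO_CACHE
       order_item.order_id
FROM order_item
  INNER JOIN payment
    ON payment.id = order_item.payment_id
   AND (payment.status > 0 OR payment.status = -3)
WHERE order_item.status > 0 OR order_item.status = -2
GROUP BY order_item.order_id
ORDER BY order_item.order_id DESC
LIMIT 10;

-- Step 2: if step 1 is fast, wrap it as a derived table and add the
-- details back one at a time, timing each addition (event.name shown here).
SELECT SQL_NO_CACHE
       ids.order_id,
       event.name AS event
FROM (SELECT order_item.order_id
      FROM order_item
        INNER JOIN payment
          ON payment.id = order_item.payment_id
         AND (payment.status > 0 OR payment.status = -3)
      WHERE order_item.status > 0 OR order_item.status = -2
      GROUP BY order_item.order_id
      ORDER BY order_item.order_id DESC
      LIMIT 10) AS ids
  INNER JOIN order_item
    ON order_item.order_id = ids.order_id
   AND (order_item.status > 0 OR order_item.status = -2)
  INNER JOIN event ON event.id = order_item.event_id
GROUP BY ids.order_id
ORDER BY ids.order_id DESC;
```

If step 1 is already slow, the problem lies in the order_item/payment join and its indexes rather than in the GROUP_CONCAT or the subqueries, which narrows down where to look.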

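Is id alone the PRIMARY KEY of every table? Or are there more exceptions (like payment)?

It seems "wrong" to specify a value for question.var, but then use LEFT to imply that the row is optional. Please change all LEFT JOINs to INNER JOINs unless I am mistaken on this point.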
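With LEFT JOIN question, the condition on question.var does not filter anything: a submission_entry row for any other question survives with NULL question columns, and its text still feeds GROUP_CONCAT. The sketch below isolates the buyer column with both joins made INNER, so that only the 'name' and 'email' answers are concatenated. It assumes every payment of interest has such entries (payments without them would drop out), and it omits the CASE wrapper from [1] for brevity.

```sql
SELECT SQL_NO_CACHE
       payment.id AS id,
       GROUP_CONCAT(DISTINCT submission_entry.text
                    ORDER BY question.var DESC
                    SEPARATOR 0x1D) AS buyer
FROM payment
  INNER JOIN submission_entry
    ON submission_entry.form_submission_id = payment.form_submission_id
  INNER JOIN question
    ON question.id = submission_entry.question_id
   AND question.var IN ('name', 'email')
GROUP BY payment.id
ORDER BY payment.id DESC
LIMIT 10;
```

Comparing this output with the buyer column of the original query for the same payments shows whether the LEFT JOIN version was pulling in answers to other questions.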

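Are any of the tables (perhaps submission_entry, event_date_product) "many-to-many" mapping tables? If so, follow the tips here to get some performance gains.

When you come back, please provide SHOW CREATE TABLE for each table.

Regarding mysql - Optimizing a MySQL query with GROUP BY and GROUP_CONCAT that does not use the main PK, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46510329/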
