mysql - 旋转非规范化行——将多行转换为单行——导致 LEFT JOIN 速度慢

标签 mysql sql join left-join pivot-table

我正在尝试获取一个表,其中字符串排列在记录中,这些记录是三个字符串的簇,并返回一个查询,其中数据在一个记录中。

表定义是:

CREATE TABLE Xerox.FIM2 (
  KPIN CHAR(18) NOT NULL COMMENT 'Part Number',
  KSEQ DECIMAL(2, 0) NOT NULL COMMENT 'Sequence Number',
  SDS1 CHAR(40) NOT NULL COMMENT 'Supplemental Description 1',
  SDS2 CHAR(30) NOT NULL COMMENT 'Supplemental Description 2',
  SDS3 CHAR(30) NOT NULL COMMENT 'Supplemental Description 3',
  PRIMARY KEY (KPIN, KSEQ),
  INDEX IDX_FIM2_KSEQ (KSEQ),
  INDEX IDX_FIM2_SDS1 (SDS1),
  INDEX IDX_FIM2_SDS2 (SDS2),
  INDEX IDX_FIM2_SDS3 (SDS3),
  INDEX UK_FIM2_KPIN (KPIN)
)
ENGINE = INNODB
AVG_ROW_LENGTH = 180
CHARACTER SET latin1
COLLATE latin1_swedish_ci
COMMENT = 'Supplemental Part Descriptions';

给定的零件号可能没有记录或最多 8 条记录。这八条记录包含我试图从中转换的字符串数据(描述,每条记录三个):

KPIN  KSEQ  SDS1                                SDS2                            SDS3

R0205   1   COLD ROLLED A1008/A CS TYPE B       MATTE FINISH OIL / EXPOSED      MIN YIELD 30,000 RB 45-60
R0205   2   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020
R0205   3   SQUARENESS TOL +/-.062              MAX LIFT WEIGHT 10,000 LBS      <null>

数据如:

KPIN  KSEQ  DES1                                DES2                            DES3                        DES4                                DES5                            DES6                        DES7                                DES8                            DES9                        DES10                               DES11                           DES12

R0205   1   COLD ROLLED A1008/A CS TYPE B       MATTE FINISH OIL / EXPOSED      MIN YIELD 30,000 RB 45-60   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020   SQUARENESS TOL +/-.062              MAX LIFT WEIGHT 10,000 LBS      <null><null><null><null>...

执行此操作的 SQL 命令是:

    SELECT FIM1.KPIN, FIM1.DES, p.DES1, p.DES2, p.DES3, p.DES4, p.DES5, p.DES6,
                                p.DES7, p.DES8, p.DES9, p.DES10, p.DES11, p.DES12,
                                p.DES13, p.DES14, p.DES15, p.DES16, p.DES17, p.DES18,
                                p.DES19, p.DES20, p.DES21, p.DES22, p.DES23, p.DES24
    FROM FIM1
    JOIN
    (SELECT  KPIN, KSEQ,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS1 END) AS DES1,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS2 END) AS DES2,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS3 END) AS DES3,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS1 END) AS DES4,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS2 END) AS DES5,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS3 END) AS DES6,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS1 END) AS DES7,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS2 END) AS DES8,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS3 END) AS DES9,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS1 END) AS DES10,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS2 END) AS DES11,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS3 END) AS DES12,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS1 END) AS DES13,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS2 END) AS DES14,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS3 END) AS DES15,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS1 END) AS DES16,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS2 END) AS DES17,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS3 END) AS DES18,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS1 END) AS DES19,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS2 END) AS DES20,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS3 END) AS DES21,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS1 END) AS DES22,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS2 END) AS DES23,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS3 END) AS DES24

    FROM FIM2
    GROUP BY KPIN
    ORDER BY KPIN, KSEQ) as p
    ON FIM1.KPIN=p.KPIN ORDER BY FIM1.KPIN

表FIM1(主要件号表)共有118552条记录

表FIM2中有66,303条记录(所有这些描述)

表FIM1中的62,163条记录在FIM2中至少有一条描述记录。

上面写的查询在大约 3 秒内返回 62,163 条记录。如果我将其更改为 LEFT JOIN 而不是 JOIN,我希望获得所有 118,552 条包含大量空值的记录。该操作需要几分钟才能完成。我想了解为什么这两个操作之间存在如此巨大的差异。看起来 LEFT JOIN 迫使引擎一次又一次地遍历 FIM2 表,但为什么简单的 JOIN 不必这样做呢?我意识到我正在为每个项目创建一个类似于临时记录的东西,但简单的连接仍然能够优化它。

此行合并技术基于此处详述的概念: How to pivot? How to convert multiple rows into one row with multiple columns?

我的查询计划: (对于加入):

id  select_type table       type        possible_keys           key         key_len ref     rows    Extra
1   PRIMARY     <derived2>  ALL         (null)                  (null)      (null)  (null)  64512   Using temporary; Using filesort
1   PRIMARY     FIM1        eq_ref      PRIMARY,UK_FIM1_KPIN    PRIMARY     18      p.KPIN  1   
2   DERIVED     FIM2        index       (null)                  PRIMARY     19      (null)  64085   Using temporary; Using filesort

对于左连接:

id  select_type table       type        possible_keys   key             key_len     ref     rows    Extra
1   PRIMARY     FIM1        index       (null)          IDX_FIM1_DES    30          (null)  124199  Using index; Using temporary; Using filesort
1   PRIMARY     <derived2>  ALL         (null)          (null)          (null)      (null)  64512   
2   DERIVED     FIM2        index       (null)          PRIMARY         19          (null)  64085   Using temporary; Using filesort

最佳答案

啊,非规范化数据集的乐趣。这不是您所要求的,但是您能否像这样重做此 FIM2 表,以便每行中只有一项而不是三项?如果您这样做,您的生活变得更轻松。

KPIN  KSEQ  KFLD  ATTRIBUTE
R0205   1      1   COLD ROLLED A1008/A CS TYPE B
R0205   1      2   MATTE FINISH OIL / EXPOSED
R0205   1      3   MIN YIELD 30,000 RB 45-60
R0205   2      1   THICKNESS .032 TOL +.006/-.000
R0205   2      2   WIDTH 48.000 TOL +.188/-.000
R0205   2      3   LENGTH 84.875 TOL +/-.020
R0205   3      1   SQUARENESS TOL +/-.062
R0205   3      2   MAX LIFT WEIGHT 10,000 LBS      

但是,这不是你问的。你想改进你的报告程序,以便它可以在合理的时间内LEFT JOIN这两个表。

分解您的查询很重要。首先,您尝试LEFT JOIN FIM1FIM2 表,然后旋转结果集以将内容扩展到一行。

您的方法在功能上是正确的,是先进行旋转,然后进行连接。让我们尝试另一种方法,首先加入,然后旋转。这应该会减少满足查询所需的数字运算。

     SELECT ff.KPIN, ff.DES,
            a.SDS1 AS DES1,
            a.SDS2 AS DES2,
            a.SDS3 AS DES3,
            b.SDS1 AS DES4,
            b.SDS2 AS DES5,
            b.SDS3 AS DES6,
            c.SDS1 AS DES7,
            c.SDS2 AS DES8, ...
            h.SDS1 AS DES23,
            h.SDS2 AS DES24
       FROM FIM1 AS ff
  LEFT JOIN FIM2 AS a ON ff.KPIN = a.KPIN AND a.KSEQ=1
  LEFT JOIN FIM2 AS b ON ff.KPIN = b.KPIN AND b.KSEQ=2
  LEFT JOIN FIM2 AS c ON ff.KPIN = c.KPIN AND c.KSEQ=3
  LEFT JOIN FIM2 AS d ON ff.KPIN = d.KPIN AND d.KSEQ=4
  LEFT JOIN FIM2 AS e ON ff.KPIN = e.KPIN AND e.KSEQ=5
  LEFT JOIN FIM2 AS f ON ff.KPIN = f.KPIN AND f.KSEQ=6
  LEFT JOIN FIM2 AS g ON ff.KPIN = g.KPIN AND g.KSEQ=7
  LEFT JOIN FIM2 AS h ON ff.KPIN = h.KPIN AND h.KSEQ=8

如果您使用复合索引(KSEQ, KPIN) 索引FIM2,这可能会非常快。不可否认,八个 LEFT JOIN 操作看起来很麻烦,但是您的 MAX() ... GROUP BY 查询也很麻烦。您还应该尝试索引 (KSEQ,KPIN) 以查看哪个更快。

关于mysql - 旋转非规范化行——将多行转换为单行——导致 LEFT JOIN 速度慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/27320532/

相关文章:

php - xampp php文件上传错误

mysql - 如何使用连接表并通过选择查询进行计数

php - mySQL 查询在 PHPmyAdmin 中工作,但在 PHP 文件中不起作用

mysql - 排名排行榜 - MySQL

sql - 在 SQL 表中搜索值列表的最有效方法

c# - c#中的参数无效异常

MySQL - 如何优化大量条件

php - MySQL 动态选择列

join - 连接表时 Hive 查询出错

mysql - 列出子查询中提到的表中的列(MySQL)