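MySQL: still getting "Using filesort" and "Using temporary" even with the right indexes

Tags: mysql sql performance indexing filesort

I have this query, and I believe I have indexed the tables correctly for it. But EXPLAIN still reports Using filesort and Using temporary.

The query is as follows: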

SELECT * FROM
    (SELECT PIH.timestamp, PIH.practice_id, PIH.timestamp as invoice_num, PIH.custom_invnum,
CEIL(PIH.total_invoice + PIH.tax + PIH.other_bill)  as grand_total, PIH.total_invoice, PIH.extra_charge_ph as extra_charge,
PIH.tax, PIH.other_bill, PIH.changed, PIH.source,
PIH.notes, PIH.is_active, PIH.paid as pay,
PIH.covered_amount, IF(PIH.is_active = 1, IF(PIH.total_invoice = 0 OR PIH.total_invoice + PIH.tax + PIH.other_bill - PIH.covered_amount <= PIH.paid, 1, IF(PIH.paid = 0, 0, 2)), '')  as invoice_st,
RPP.patient_id, RPP.first_name as pfname, RPP.last_name as plname, RPP.dob as p_dob, RPP.gender as p_gender, RPP.reff_id as p_reff_id, RPP.mobile_number as p_mobile, IF(PIH.group_doctors IS NOT NULL, NULL, D.doc_title) as doc_title, IF(PIH.group_doctors IS NOT NULL,
PIH.group_doctors, D.first_name) as doc_fname, IF(PIH.group_doctors IS NOT NULL, PIH.group_doctors, D.last_name) as doc_lname, IF(PIH.group_doctors IS NOT NULL, NULL, D.spc_dsg) as spc_dsg, PA.username, TL.timestamp as checkout_time, IP.name as ip_name, PMM.timestamp as mcu_id
            FROM  practice_invoice_header PIH
            INNER JOIN  practice_invoice_detail PID  ON PID.timestamp = PIH.timestamp
              AND  PID.practice_id = PIH.practice_id
            INNER JOIN  practice_queue_list PQL  ON PQL.encounter_id = PID.encounter_id
              AND  PQL.practice_place_id = PIH.practice_id
            INNER JOIN  temp_search_view D  ON D.id = PQL.doctor_id
              AND  D.pp_id = PQL.practice_place_id
            INNER JOIN  practice_place PP  ON PP.id = PIH.practice_id
            INNER JOIN  ref_practice_patient RPP  ON RPP.patient_id = PIH.patient_id
              AND  RPP.practice_id = PP.parent_id
            LEFT JOIN  practice_mcu_module PMM  ON PMM.id = PID.mcu_module_id
              AND  PMM.practice_id = PID.practice_id
            LEFT JOIN  transaction_log TL  ON TL.reff_id = PIH.timestamp
              AND  TL.practice_id = PIH.practice_id
              AND  TL.activity = "CHK"
            LEFT JOIN  practice_admin PA  ON PA.id = TL.admin_id
            LEFT JOIN  insurance_plan IP  ON IP.id = PIH.insurance_plan_id
            WHERE  PIH.source <> 'P'
              AND  PIH.practice_id = 28699
              AND  PIH.is_active = 1
              AND  PQL.cal_id >= 201807010
              AND  PQL.cal_id <= 201807312
            GROUP BY  PIH.timestamp, PIH.practice_id 
    ) AS U  LIMIT 0,20

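Note: I am only showing some of the main tables used in this query, plus the ones flagged with Using filesort / Using temporary. Posting all of them would be too much information.

The query produces an invoice list, which has a header (practice_invoice_header) and details (practice_invoice_detail). The query also joins the practice_place table.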

CREATE TABLE `practice_invoice_header` (
 `timestamp` bigint(20) NOT NULL,
 `practice_id` int(11) NOT NULL,
 `cal_id` int(11) NOT NULL,
 `patient_id` int(11) NOT NULL DEFAULT 0,
 `source` char(1) NOT NULL COMMENT 'E = ENCOUNTER; P = OTHER (PHARM / LAB)',
 `total_invoice` float(30,2) NOT NULL DEFAULT 0.00,
 `tax` float(30,2) NOT NULL DEFAULT 0.00,
 `other_bill` float(30,2) NOT NULL DEFAULT 0.00,
 `changed` float(30,2) NOT NULL DEFAULT 0.00,
 `paid` float(30,2) NOT NULL DEFAULT 0.00,
 `covered_amount` float(30,2) NOT NULL DEFAULT 0.00,
 `notes` varchar(300) DEFAULT NULL,
 `custom_invnum` varchar(30) DEFAULT NULL,
 `insurance_plan_id` varchar(20) DEFAULT NULL,
 `is_active` int(11) NOT NULL DEFAULT 1,
 `cancel_reason` varchar(200) DEFAULT NULL,
 PRIMARY KEY (`timestamp`,`practice_id`),
 KEY `custom_invnum` (`custom_invnum`),
 KEY `insurance_plan_id` (`insurance_plan_id`),
 KEY `practice_id_3` (`practice_id`,`xxx_reff_id`),
 KEY `ph_check_status` (`ph_checked_by`),
 KEY `cal_id` (`cal_id`),
 KEY `practice_id_5` (`practice_id`,`outpx_id`),
 KEY `practice_id_6` (`practice_id`,`cal_id`,`source`,`is_active`),
 KEY `total_invoice` (`total_invoice`),
 KEY `patient_id` (`patient_id`),
 CONSTRAINT `practice_invoice_header_ibfk_1` FOREIGN KEY (`practice_id`)
       REFERENCES `practice_place` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `practice_invoice_detail` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `timestamp` bigint(20) NOT NULL,
 `practice_id` int(11) NOT NULL,
 `item_id` int(11) NOT NULL,
 `item_sub_id` int(11) DEFAULT NULL,
 `item_type` char(1) NOT NULL COMMENT 'D = DRUG; P = PROCEDURE; L = LAB',
 `item_qty` float NOT NULL,
 `item_price` float(22,2) NOT NULL,
 `discount` float NOT NULL DEFAULT 0,
 `is_active` int(11) NOT NULL DEFAULT 1,
 PRIMARY KEY (`id`),
 KEY `item_type` (`item_type`),
 KEY `timestamp` (`timestamp`,`practice_id`),
 KEY `practice_id` (`practice_id`),
 KEY `item_id_2` (`item_id`,`item_sub_id`,`item_type`),
 KEY `timestamp_2` (`timestamp`,`practice_id`,`item_id`,`item_sub_id`,`item_type`),
 KEY `practice_id_3` (`practice_id`,`item_type`),
 KEY `the_id` (`id`,`practice_id`) USING BTREE,
 KEY `timestamp_3` (`timestamp`,`practice_id`,`item_type`,`item_comission`,
      `item_comission_type`, `doctor_id`,`item_id`,`item_sub_id`,`id`) USING BTREE,
 KEY `timestamp_4` (`timestamp`,`practice_id`,`item_id`,`item_sub_id`,`item_type`,
      `item_comission_2`,`item_comission_2_type`,`doctor_id_2`,`id`) USING BTREE,
 KEY `request_id` (`request_id`,`request_practice`),
 KEY `timestamp_5` (`timestamp`,`practice_id`,`is_active`),
 KEY `practice_id_6` (`practice_id`,`encounter_id`,`is_active`),
 KEY `practice_id_7` (`practice_id`,`item_type`,`encounter_id`,`is_active`),
 CONSTRAINT `practice_invoice_detail_ibfk_1` FOREIGN KEY (`timestamp`)
     REFERENCES `practice_invoice_header` (`timestamp`) ON DELETE CASCADE,
 CONSTRAINT `practice_invoice_detail_ibfk_2` FOREIGN KEY (`practice_id`)
     REFERENCES `practice_place` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1447348 DEFAULT CHARSET=latin1

CREATE TABLE `ref_practice_patient` (
 `practice_id` int(11) NOT NULL,
 `patient_id` int(11) NOT NULL,
 `reff_id` varchar(35) DEFAULT NULL,
 `is_user` int(11) NOT NULL DEFAULT 0,
 `parent_user_id` int(11) NOT NULL DEFAULT 0,
 PRIMARY KEY (`practice_id`,`patient_id`),
 KEY `patient_id` (`patient_id`),
 KEY `reff_id` (`reff_id`),
 KEY `practice_id` (`practice_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `practice_place` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(75) NOT NULL,
 `statement` text DEFAULT NULL,
 `address` varchar(200) NOT NULL,
 `phone` varchar(15) NOT NULL,
 `wa_number` varchar(15) DEFAULT NULL,
 `fax` varchar(15) NOT NULL,
 `email` varchar(50) NOT NULL,
 `is_branch` int(11) NOT NULL,
 `parent_id` int(11) NOT NULL,
 `editted_by` int(11) DEFAULT NULL,
 `editted_date` bigint(20) DEFAULT NULL,
 `status` int(11) NOT NULL DEFAULT 1,
 PRIMARY KEY (`id`),
 KEY `parent_id` (`parent_id`),
 KEY `reff_id` (`reff_id`)
) ENGINE=InnoDB AUTO_INCREMENT=29058 DEFAULT CHARSET=latin1

Below is the EXPLAIN output for the query. The row that reports Using filesort is the second one.

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL  NULL NULL NULL  14028
2 DERIVED PP  const PRIMARY,parent_id PRIMARY 4 const 1 Using temporary; Using filesort
2 DERIVED PIH ref   PRIMARY,practice_id_3,practice_id_5,practice_id_6,practice_id_8,pharm_read,lab_read,rad_read,patient_id
                   practice_id_5 4 const  7014 Using where 
2 DERIVED RPP eq_ref PRIMARY,patient_id,practice_id,practice_id_2,practice_id_3 
                   PRIMARY 8 const,k6064619_lokadok.PIH.patient_id  1
2 DERIVED PID ref   timestamp,practice_id,timestamp_2,practice_id_2,practice_id_3,timestamp_3,timestamp_4,practice_id_4,practice_id_5,timestamp_5,practice_id_6,practice_id_7
                   timestamp 12 k6064619_lokadok.PIH.timestamp,const  1 
2 DERIVED PMM eq_ref PRIMARY,id,practice_id
                   PRIMARY 4 k6064619_lokadok.PID.mcu_module_id  1 Using where 
2 DERIVED TL  ref   reff_id reff_id 12 k6064619_lokadok.PIH.timestamp,const  1 Using where 
2 DERIVED PA  eq_ref PRIMARY PRIMARY 4 k6064619_lokadok.TL.admin_id  1 Using where 
2 DERIVED IP  ref   PRIMARY,id PRIMARY 22 k6064619_lokadok.PIH.insurance_plan_id  1 Using where 
2 DERIVED PQL ref   PRIMARY,encounter_id,cal_id_2
                   encounter_id 5 k6064619_lokadok.PID.encounter_id  2 Using where; Using index
2 DERIVED D   ref   doc_id,pp_id,id_2,pp_doc doc_id 4 k6064619_lokadok.PQL.doctor_id  1 Using where

I believe parent_id is indexed in the practice_place table, and in ref_practice_patient, patient_id and practice_id together form the PRIMARY KEY.

Best answer

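Why have the outer query? The optimizer is free to shuffle the result of the inner query, so the LIMIT may pick rows in an order you do not expect. At the very least, add an ORDER BY; better still, throw away the outer SELECT as well.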
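A minimal sketch of that rewrite, assuming the question's table and column names. The select list is cut down to a few PIH columns for brevity, only the joins needed by the WHERE clause are kept, and the ORDER BY columns are an assumption (any deterministic ordering the application wants would do):

```sql
-- Sketch only: the outer SELECT is gone and the LIMIT is paired with an ORDER BY.
-- The ORDER BY columns are an assumed choice, not something the question specifies.
SELECT PIH.timestamp, PIH.practice_id, PIH.custom_invnum, PIH.total_invoice
FROM practice_invoice_header PIH
INNER JOIN practice_invoice_detail PID
        ON PID.timestamp = PIH.timestamp
       AND PID.practice_id = PIH.practice_id
INNER JOIN practice_queue_list PQL
        ON PQL.encounter_id = PID.encounter_id
       AND PQL.practice_place_id = PIH.practice_id
WHERE PIH.source <> 'P'
  AND PIH.practice_id = 28699
  AND PIH.is_active = 1
  AND PQL.cal_id >= 201807010
  AND PQL.cal_id <= 201807312
GROUP BY PIH.timestamp, PIH.practice_id
ORDER BY PIH.timestamp, PIH.practice_id
LIMIT 0, 20;
```

Without the ORDER BY, which 20 rows come back is not guaranteed to be stable from one run to the next.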

Main indexes

Let's look at the parts of the query that can drive index design:

        WHERE  PIH.source <> 'P'
          AND  PIH.practice_id = 28699
          AND  PIH.is_active = 1
          AND  PQL.cal_id >= 201807010
          AND  PQL.cal_id <= 201807312
        GROUP BY  PIH.timestamp, PIH.practice_id 

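Since multiple tables are involved, no single index can handle the whole WHERE clause.

Since the tests are not all = comparisons, an index cannot extend past the WHERE columns to include the GROUP BY columns.

So I see two indexes: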

PIH:  INDEX(practice_id, is_active,   -- in either order
            source)
PQL:  INDEX(cal_id)
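As DDL, those two suggestions might look like the sketch below. The index names are hypothetical, and the EXPLAIN output in the question hints that similar indexes (for example cal_id_2 on practice_queue_list) may already exist, so check SHOW CREATE TABLE before adding anything:

```sql
-- Hypothetical index names; the column lists come from the answer above.
ALTER TABLE practice_invoice_header
    ADD INDEX idx_pih_practice_active_source (practice_id, is_active, source);

ALTER TABLE practice_queue_list
    ADD INDEX idx_pql_cal (cal_id);
```

The two equality columns (practice_id, is_active) come first, in either order; source is tested with <>, so it goes last.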

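Since the index cannot reach the GROUP BY, the optimizer has no choice but to gather all rows that match the WHERE, do the grouping, and then perform the ORDER BY (which, as I said, is missing but needed).

So the GROUP BY and ORDER BY will need one or two temporary tables and filesorts. No, you cannot get rid of them, at least not without changing the query in some way. (Note that a "filesort" may actually be done in RAM.)

Your extra SELECT layer may add another temporary table and filesort.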

Plain EXPLAIN does not point out when there are two sorts. EXPLAIN FORMAT=JSON has such details.
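A sketch of how that might be run against a cut-down version of the question's query (the shortened select list and the ORDER BY are assumptions). The exact keys in the JSON output depend on the server and version; on MySQL 5.7 and later, grouping and ordering typically appear as separate nodes, each with its own temporary-table and filesort flags:

```sql
-- Sketch: look in the JSON for separate grouping and ordering nodes
-- to see whether one or two sorts / temporary tables are involved.
EXPLAIN FORMAT=JSON
SELECT PIH.timestamp, PIH.practice_id, PIH.total_invoice
FROM practice_invoice_header PIH
INNER JOIN practice_invoice_detail PID
        ON PID.timestamp = PIH.timestamp
       AND PID.practice_id = PIH.practice_id
INNER JOIN practice_queue_list PQL
        ON PQL.encounter_id = PID.encounter_id
       AND PQL.practice_place_id = PIH.practice_id
WHERE PIH.source <> 'P'
  AND PIH.practice_id = 28699
  AND PIH.is_active = 1
  AND PQL.cal_id BETWEEN 201807010 AND 201807312
GROUP BY PIH.timestamp, PIH.practice_id
ORDER BY PIH.timestamp, PIH.practice_id
LIMIT 0, 20;
```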

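Other issues...

Having a timestamp in a PRIMARY KEY is risky unless you are sure that two rows cannot occur with the same timestamp, or another column in the PK ensures uniqueness.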

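Don't use FLOAT for money. It incurs extra rounding errors, and it cannot store more than about 7 significant digits (that is, less than $100K to the cent). Don't use FLOAT(30,2); it is even worse, because you force an extra rounding. Use DECIMAL(30,2), but pick a reasonable precision instead of 30: DECIMAL(30,2) takes 14 bytes, mostly wasted space.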
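The precision limit is easy to see with a throwaway table. This is a sketch: the table name money_demo and the DECIMAL(12,2) size are assumptions chosen for illustration, not values from the question:

```sql
-- Hypothetical scratch table comparing the two types side by side.
CREATE TEMPORARY TABLE money_demo (
    amount_float   FLOAT(30,2)   NOT NULL,
    amount_decimal DECIMAL(12,2) NOT NULL
);

-- Nine digits before the decimal point: more than a 4-byte FLOAT can hold exactly.
INSERT INTO money_demo VALUES (123456789.12, 123456789.12);

-- The DECIMAL column returns the value as inserted; the FLOAT column does not keep the cents.
SELECT amount_float, amount_decimal FROM money_demo;

-- A possible migration for one column, assuming 12 digits are enough for the data:
ALTER TABLE practice_invoice_header
    MODIFY total_invoice DECIMAL(12,2) NOT NULL DEFAULT 0.00;
```

The same MODIFY would apply to the other money columns (tax, other_bill, paid, and so on) once a suitable precision has been chosen.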

Whenever you have INDEX(a,b), you don't also need INDEX(a); it is redundant and (slightly) slows down INSERTs.
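Going only by the table definitions posted in the question, a few indexes look like candidates under that rule. Treat this as a sketch: the posted definitions are abbreviated, so confirm against the real schema (and any index hints in application code) before dropping anything:

```sql
-- (practice_id) is a prefix of PRIMARY KEY (practice_id, patient_id).
ALTER TABLE ref_practice_patient
    DROP INDEX practice_id;

-- (practice_id) is a prefix of practice_id_3, practice_id_6 and practice_id_7;
-- (timestamp, practice_id) is a prefix of timestamp_2 and timestamp_5.
ALTER TABLE practice_invoice_detail
    DROP INDEX practice_id,
    DROP INDEX `timestamp`;
```

The foreign keys on practice_invoice_detail still have the longer indexes starting with the same columns available to them.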

LEFT JOIN  transaction_log TL
           ON  TL.reff_id = PIH.timestamp
          AND  TL.practice_id = PIH.practice_id
          AND  TL.activity = "CHK"

needs

INDEX(reff_id, practice_id, activity)  -- in any order
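As DDL this could be written as below. The index name is hypothetical, and the EXPLAIN output suggests transaction_log already has an index named reff_id covering the first two columns, in which case the new index would replace it rather than sit beside it:

```sql
-- Hypothetical index name; all three columns are equality tests, so any order works.
ALTER TABLE transaction_log
    ADD INDEX idx_tl_reff_practice_activity (reff_id, practice_id, activity);
```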

Also,

        INNER JOIN  practice_invoice_detail PID  ON PID.timestamp = PIH.timestamp
          AND  PID.practice_id = PIH.practice_id

PIH:  INDEX(practice_id, timestamp)   -- not the opposite order
PIH:  INDEX(practice_id, is_active, timestamp)

        INNER JOIN  practice_queue_list PQL  ON PQL.encounter_id = PID.encounter_id
          AND  PQL.practice_place_id = PIH.practice_id

PQL:  INDEX(encounter_id, cal_id)
PQL:  INDEX(encounter_id, practice_place_id, cal_id)
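The join-oriented suggestions above could be created as in this sketch. The index names are hypothetical, and the answer presents the indexes as candidates; EXPLAIN will show which ones the optimizer actually picks, and the unused ones can then be dropped:

```sql
-- Hypothetical index names; column lists copied from the suggestions above.
ALTER TABLE practice_invoice_header
    ADD INDEX idx_pih_practice_ts (practice_id, `timestamp`),
    ADD INDEX idx_pih_practice_active_ts (practice_id, is_active, `timestamp`);

ALTER TABLE practice_queue_list
    ADD INDEX idx_pql_enc_cal (encounter_id, cal_id),
    ADD INDEX idx_pql_enc_place_cal (encounter_id, practice_place_id, cal_id);
```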

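Some discussion...

  • With JOINs, EXPLAIN shows one order of working through the tables; it gives you no clue about what would happen if the tables were processed in a different order.
  • I tried to show what indexes would be needed if the optimizer started with PQL or with PIH (that is, an index for that table's WHERE conditions), and then the optimal index for JOINing to the other table.
  • The optimizer will probably not start with any table that is not mentioned in the WHERE clause, but that is not certain.
  • I did not list the optimal indexes for reaching each of the other tables.
  • More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql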

Source: this question was found on Stack Overflow: https://stackoverflow.com/questions/51978527/
