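I have a query that takes more than a day to complete. The temp_message_split table holds 1,700,128 records, so please help me tune this query.
The CREATE TABLE statement and the explain plan are provided below.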
UPDATE TEMP_MESSAGE_SPLIT t1 , TEMP_MESSAGE_SPLIT t2
SET t1.STATUS = 'D'
WHERE
(t1.temp_message_split_key < t2.temp_message_split_key AND t1.DH_MEMBER_ID = t2.DH_MEMBER_ID)
AND nullif(t1.dh_member_id,'') IS NOT NULL;
Here is the CREATE TABLE DDL:
CREATE TABLE
temp_message_split
(
FIRST_NAME VARCHAR(20),
LAST_NAME VARCHAR(30),
DOB VARCHAR(10),
EMPLOYEE_ID VARCHAR(20),
CES_CUST_NUM VARCHAR(7),
MED_POLICY_NUM VARCHAR(20),
EMAIL_ADDR VARCHAR(50),
DH_MEMBER_ID VARCHAR(9),
ALT_ID VARCHAR(20),
DRSN VARCHAR(2),
SSN VARCHAR(9),
EPIPHANY_MEMBER_ID VARCHAR(18),
PORTAL_ADDRESS VARCHAR(30),
STATEMENT_VENDOR VARCHAR(20),
CONTENT_KEY VARCHAR(18),
EPIPHANY_COMMUNICATION_ID VARCHAR(200),
PRIORITY VARCHAR(4),
DAYS_UNTIL_EXPIRED VARCHAR(4),
CONTENT_DTL_KEY VARCHAR(18),
STATUS VARCHAR(1),
ACTIVATION_MEMBER_KEY bigint,
MESSAGE_BOARD_KEY bigint,
PORTAL_STATEMENT_LOC_KEY bigint,
temp_message_split_KEY bigint NOT NULL AUTO_INCREMENT,
PRIMARY KEY (temp_message_split_KEY),
INDEX EPIPHANY_COMMUNICATION_ID_IDX (EPIPHANY_COMMUNICATION_ID),
INDEX TEMP_MESSAGE_SPLIT_IDX2 (ALT_ID),
INDEX TEMP_MESSAGE_SPLIT_IDX3 (DRSN),
INDEX TEMP_MESSAGE_SPLIT_IDX4 (ALT_ID, DRSN),
INDEX TEMP_MESSAGE_SPLIT_IDX1 (DH_MEMBER_ID)
)
ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here is its explain plan:
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | index | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | TEMP_MESSAGE_SPLIT_IDX1 | 30 | NULL | 1619639 | 100.00 | Using index |
| 1 | UPDATE | t1 | NULL | ALL | PRIMARY,TEMP_MESSAGE_SPLIT_IDX1 | NULL | NULL | NULL | 1619639 | 33.33 | Range checked for each record (index map: 0x5) |
+----+-------------+-------+------------+-------+---------------------------------+-------------------------+---------+------+---------+----------+------------------------------------------------+
2 rows in set (0.00 sec)
This query takes more than a day to process the 1,700,128 rows in temp_message_split. We need to tune it so that it runs in as little time as possible.
Best answer
My best guess is that, for each DH_MEMBER_ID, you want to set the status to D on every row except the one with the highest temp_message_split_key.
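The best solution would be NOT EXISTS, but MySQL does not support a NOT EXISTS subquery against the same table that the UPDATE is modifying (it fails with error 1093, "You can't specify target table ... for update in FROM clause").
So an alternative approach uses GROUP BY in a derived table: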
UPDATE TEMP_MESSAGE_SPLIT t1 JOIN
(SELECT t2.DH_MEMBER_ID, MAX(t2.temp_message_split_key) as max_temp_message_split_key
FROM TEMP_MESSAGE_SPLIT t2
GROUP BY t2.DH_MEMBER_ID
) t2
ON t1.DH_MEMBER_ID = t2.DH_MEMBER_ID AND
t1.temp_message_split_key < t2.max_temp_message_split_key
SET t1.STATUS = 'D';
An index on (dh_member_id, temp_message_split_key) might help performance.
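As a rough sketch of adding that index: the name TEMP_MESSAGE_SPLIT_IDX5 is an assumption and does not appear in the original DDL. InnoDB secondary indexes already carry the primary key columns, so the existing TEMP_MESSAGE_SPLIT_IDX1 on DH_MEMBER_ID may behave much the same way. Comparing the plan before and after is the only reliable way to tell whether the extra index pays off.

```sql
-- Hypothetical index name; covers the per-member MAX() lookup and the join.
ALTER TABLE temp_message_split
  ADD INDEX TEMP_MESSAGE_SPLIT_IDX5 (DH_MEMBER_ID, temp_message_split_KEY);

-- Check how the grouped subquery is resolved with the new index in place.
EXPLAIN
SELECT t2.DH_MEMBER_ID, MAX(t2.temp_message_split_KEY) AS max_temp_message_split_key
FROM temp_message_split t2
GROUP BY t2.DH_MEMBER_ID;
```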
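This will still take a long time, because you are (presumably) updating a lot of rows. If it is an option, creating a new table that already contains the values you want may be simpler. That would be much faster, because it avoids most of the logging and locking overhead of a large UPDATE.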
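The new-table approach could look roughly like the sketch below. The table names temp_message_split_new and temp_message_split_old are assumptions, and the sketch presumes that swapping tables is acceptable in your environment. STATUS is computed while copying, so no row is updated in place.

```sql
-- Hypothetical table name; LIKE copies the column definitions and indexes.
CREATE TABLE temp_message_split_new LIKE temp_message_split;

INSERT INTO temp_message_split_new
  (FIRST_NAME, LAST_NAME, DOB, EMPLOYEE_ID, CES_CUST_NUM, MED_POLICY_NUM,
   EMAIL_ADDR, DH_MEMBER_ID, ALT_ID, DRSN, SSN, EPIPHANY_MEMBER_ID,
   PORTAL_ADDRESS, STATEMENT_VENDOR, CONTENT_KEY, EPIPHANY_COMMUNICATION_ID,
   PRIORITY, DAYS_UNTIL_EXPIRED, CONTENT_DTL_KEY, STATUS,
   ACTIVATION_MEMBER_KEY, MESSAGE_BOARD_KEY, PORTAL_STATEMENT_LOC_KEY,
   temp_message_split_KEY)
SELECT
   t1.FIRST_NAME, t1.LAST_NAME, t1.DOB, t1.EMPLOYEE_ID, t1.CES_CUST_NUM,
   t1.MED_POLICY_NUM, t1.EMAIL_ADDR, t1.DH_MEMBER_ID, t1.ALT_ID, t1.DRSN,
   t1.SSN, t1.EPIPHANY_MEMBER_ID, t1.PORTAL_ADDRESS, t1.STATEMENT_VENDOR,
   t1.CONTENT_KEY, t1.EPIPHANY_COMMUNICATION_ID, t1.PRIORITY,
   t1.DAYS_UNTIL_EXPIRED, t1.CONTENT_DTL_KEY,
   -- Mark every row except the highest key per non-empty DH_MEMBER_ID.
   CASE
     WHEN t1.DH_MEMBER_ID <> ''
      AND t1.temp_message_split_KEY < m.max_key THEN 'D'
     ELSE t1.STATUS
   END,
   t1.ACTIVATION_MEMBER_KEY, t1.MESSAGE_BOARD_KEY,
   t1.PORTAL_STATEMENT_LOC_KEY, t1.temp_message_split_KEY
FROM temp_message_split t1
LEFT JOIN (
    SELECT DH_MEMBER_ID, MAX(temp_message_split_KEY) AS max_key
    FROM temp_message_split
    GROUP BY DH_MEMBER_ID
) m ON m.DH_MEMBER_ID = t1.DH_MEMBER_ID;

-- Swap the tables in one statement; keep the old one until the result is verified.
RENAME TABLE temp_message_split TO temp_message_split_old,
             temp_message_split_new TO temp_message_split;
```

The LEFT JOIN keeps rows whose DH_MEMBER_ID is NULL, and the CASE leaves their STATUS unchanged, as it does for rows with an empty DH_MEMBER_ID.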
The NULLIF() is probably not doing anything, but it has little impact on the performance of the query. It is better written as t1.dh_member_id <> ''.
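If empty DH_MEMBER_ID values do occur and must be skipped, as the original query did, one way to keep that behaviour is to filter inside the derived table. This is a sketch under that assumption, not a tested replacement. NULL values need no extra check, because they fail both the <> '' comparison and the equality join.

```sql
UPDATE temp_message_split t1
JOIN (
    SELECT t2.DH_MEMBER_ID,
           MAX(t2.temp_message_split_KEY) AS max_temp_message_split_key
    FROM temp_message_split t2
    WHERE t2.DH_MEMBER_ID <> ''      -- drops both '' and NULL member ids
    GROUP BY t2.DH_MEMBER_ID
) t2
  ON t1.DH_MEMBER_ID = t2.DH_MEMBER_ID
 AND t1.temp_message_split_KEY < t2.max_temp_message_split_key
SET t1.STATUS = 'D';
```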
Regarding "mysql - Need to tune a MySQL query", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54007022/