这是我的第一个存储过程,所以我不确定我是否正确执行此操作。我已经尝试尽可能地优化它,但仍然在运行 10 分钟时出现查询超时。我真的需要它的规模比我目前正在使用的规模更高。任何帮助都会很棒。
我有一个大小合适的数据集(108K 行),其中一个字段包含一个逗号分隔列表(我希望工程师没有这样做)。我需要分解该字段,以便每个条目都位于其自己的行上,并且所有其他字段也分配给该行。我开发了一个存储过程,它逐行循环遍历表,然后分解字段并将其插入到第二个表中。
这是我使用的代码:
DROP TABLE IF EXISTS dwh_inventory.nas_share_temp;
CREATE TABLE dwh_inventory.nas_share_temp (
share_id int(11) NOT NULL,
fileShareId int(11) NOT NULL,
storageId int(11) NOT NULL,
identifier varchar(1024) NOT NULL,
name varchar(255) NOT NULL,
protocol enum('CIFS','NFS') NOT NULL,
ipInterfaces VARCHAR(100) NOT NULL
) ENGINE=INNODB DEFAULT CHARSET=utf8;
DROP PROCEDURE IF EXISTS dwh_inventory.share_step;
DELIMITER $$
CREATE PROCEDURE dwh_inventory.share_step()
BEGIN
DECLARE n INT DEFAULT 0;
DECLARE i INT DEFAULT 0;
DECLARE strLen INT DEFAULT 0;
DECLARE SubStrLen INT DEFAULT 0;
DECLARE ip VARCHAR(20);
SET autocommit = 0;
SELECT COUNT(*) FROM dwh_inventory.nas_share INTO n;
SET i=0;
WHILE i<n DO
SELECT id, fileShareId, storageId, identifier, name, protocol, ipInterfaces
INTO @share_id, @fileShareId, @storageId, @identifier, @name, @protocol, @ipInterfaces
FROM dwh_inventory.nas_share LIMIT i,1;
IF @ipInterfaces IS NULL THEN
SET @ipInterfaces = '';
END IF;
do_this:
LOOP
SET strLen = CHAR_LENGTH(@ipInterfaces);
SET ip = SUBSTRING_INDEX(@ipInterfaces, ',', 1);
INSERT INTO dwh_inventory.nas_share_temp
(share_id, fileShareId, storageId, identifier,name,protocol,ipInterfaces)
VALUES (@share_id,
@fileShareId,
@storageId,
@identifier,
@name,
@protocol,
ip
);
SET SubStrLen = CHAR_LENGTH(SUBSTRING_INDEX(@ipInterfaces, ',', 1)) + 2;
SET @ipInterfaces = MID(@ipInterfaces, SubStrLen, strLen);
IF @ipInterfaces = '' THEN
LEAVE do_this;
END IF;
END LOOP do_this;
COMMIT;
SET i = i + 1;
END WHILE;
SET autocommit = 1;
END;
$$
DELIMITER ;
CALL dwh_inventory.share_step();
数据示例:
id,fileShareId,storageId,identifier,name,protocol,ipInterfaces
1325548,1128971,33309,/vol/vol0/:NFS,/vol/vol0/,NFS,"10.66.213.118,10.68.208.76"
1325549,1128991,33309,/vol/vol0/:NFS,/vol/vol0/,NFS,"10.66.213.119,10.68.208.77"
1325550,1128992,33325,/vol/aggr2_64_hs2032/EPS_ROOT/:NFS,/vol/aggr2_64_hs2032/EPS_ROOT/,NFS,10.17.124.10
1325551,1128993,33325,/vol/aggr2_64_hs2032/GCO_Report/:NFS,/vol/aggr2_64_hs2032/GCO_Report/,NFS,10.17.124.10
1325552,1128995,33325,/vol/aggr2_64_hs2032/PI/:NFS,/vol/aggr2_64_hs2032/PI/,NFS,10.17.124.10
1325553,1128996,33325,/vol/aggr2_64_hs2032/a/:NFS,/vol/aggr2_64_hs2032/a/,NFS,10.17.124.10
1325554,1128997,33325,/vol/aggr1_64_sapserv/:NFS,/vol/aggr1_64_sapserv/,NFS,147.204.2.13
1325555,1128999,33325,/vol/aggr2_64_hs2032/:NFS,/vol/aggr2_64_hs2032/,NFS,10.17.124.10
1325556,1129001,33325,/vol/aggr2_64_hs2032/central/:NFS,/vol/aggr2_64_hs2032/central/,NFS,10.17.124.10
1325557,1129004,33325,/vol/nsvfm0079b_E5V/db_clients/:NFS,/vol/nsvfm0079b_E5V/db_clients/,NFS,"10.21.188.161,10.70.151.93"
1325558,1129006,33325,/vol/aggr2_64_hs2032/istrans/:NFS,/vol/aggr2_64_hs2032/istrans/,NFS,10.17.124.10
1325559,1129008,33325,/vol/nsvfm0017_DEWDFGLD00603/:NFS,/vol/nsvfm0017_DEWDFGLD00603/,NFS,"10.21.188.115,10.70.151.138"
1325560,1129009,33325,/vol/nsvfm0017_vol0/:NFS,/vol/nsvfm0017_vol0/,NFS,"10.21.188.115,10.70.151.138"
1325561,1129011,33325,/vol/nsvfm0017a_ls2278/:NFS,/vol/nsvfm0017a_ls2278/,NFS,"10.21.188.115,10.70.151.138"
1325562,1129015,33325,/vol/nsvfm0051passive_vol0/:NFS,/vol/nsvfm0051passive_vol0/,NFS,10.17.144.249
1325563,1129017,33325,/vol/nsvfm0053_vol0/:NFS,/vol/nsvfm0053_vol0/,NFS,"10.21.189.251,10.70.151.109"
最佳答案
InnoDB 表必须有一个
主键
。
当您浏览表格时,LIMIT i,1
会变得越来越慢 - 在找到您需要的行之前,它必须跳过i
行.不要尝试在 SQL 中分割逗号分隔的文本;使用真正的语言(PHP/Perl/等)。或者,正如 Lew 建议的那样,写出该列,然后使用
LOAD DATA
将其放入另一个表中。LIMIT
前面应有ORDER BY
。
关于mysql 优化存储过程插入,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32055612/