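I am trying to take a blob out of an old MySQL table and create a new table for it, in an effort to reach first normal form. However, converting the data that is already in the database from a blob into multiple rows of the new table is proving difficult.
What is the simplest way to do the conversion with SQL commands?
Parent table: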
CREATE TABLE TEST.People (
`id` INT AUTO_INCREMENT,
`age` INT,
`height` INT,
`weight` INT ,
`variations` BLOB DEFAULT NULL,
PRIMARY KEY (`id`)
);
New table:
CREATE TABLE TEST.Variations (
`id` INT AUTO_INCREMENT,
`chr` INT,
`start` INT,
`stop` INT ,
`type` ENUM('SNP','INDEL','CNV') DEFAULT NULL,
PRIMARY KEY (`id`)
);
When I run SELECT id, variations FROM TEST.People; I get:
+----+----------------------------------------------------------------------------------------------------------------------+
| id | variations |
+----+----------------------------------------------------------------------------------------------------------------------+
| 3 | xp t !3:124093754-124467278/CNVt 7:78030601-79638023/CNV |
| 6 | xp |
| 9 | xp |
| 12 | xp t !1:84289718-85466763/CNV |
| 15 | xp |
| 18 | xp |
| 21 | xp |
| 24 | xp |
| 27 | xp |
| 30 | xp t !10:166909544-166909544/SNPt !2:66903445-66903445/SNPt !2:166897864-166897864/CNVt !7:6892788-6892788/SNP |
+----+----------------------------------------------------------------------------------------------------------------------+
So this is what I would like the TEST.Variations table to contain after the conversion:
+----+-----+-----------+-----------+----------+
| id | chr | start | stop | type |
+----+-----+-----------+-----------+----------+
| 3 | 3 | 124093754 | 124467278 | CNV |
| 3 | 7 | 78030601 | 79638023 | CNV |
| 12 | 1 | 84289718 | 85466763 | CNV |
| 30 | 10 | 166909544 | 166909544 | SNP |
| 30 | 2 | 66903445 | 66903445 | SNP |
| 30 | 2 | 166897864 | 166897864 | CNV |
| 30 | 7 | 6892788 | 6892788 | SNP |
+----+-----+-----------+-----------+----------+
Best answer
Two things first:
1. Your data for id 3 is inconsistent: there is no ! before 7:... I hope that is just a typo.
xp t !3:124093754-124467278/CNVt 7:78030601-79638023/CNV
                                ^^
2. If you want an auto_increment column in your target table, your schema should look something like this:
CREATE TABLE variations (
`var_id` INT NOT NULL AUTO_INCREMENT,
`id` INT, -- id from People goes here and it's not UNIQUE
`chr` INT,
`start` INT,
`stop` INT ,
`type` ENUM('SNP','INDEL','CNV') DEFAULT NULL,
PRIMARY KEY (`var_id`)
);
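One hedged way to look for rows like id 3 before converting: every variation holds exactly one ':' and should be preceded by exactly one 't !', so the two counts should agree. The query below is only a sketch under that assumption, plus the assumption that the leading prefix of the blob contains no ':' itself. It reuses the people table and the LIKE filter from this answer; the column aliases colons and delimiters are made up for illustration.

```sql
-- List blobs whose number of ':' differs from their number of 't !' delimiters.
SELECT id, colons, delimiters
  FROM (
        SELECT id,
               -- each removed ':' shortens the string by 1
               LENGTH(variations) - LENGTH(REPLACE(variations, ':', '')) AS colons,
               -- each removed 't !' shortens the string by 3
               (LENGTH(variations) - LENGTH(REPLACE(variations, 't !', ''))) / 3 AS delimiters
          FROM people
         WHERE variations LIKE 'xp t !%'
       ) c
 WHERE colons <> delimiters;
```

For a value like the one displayed for id 3, colons is 2 and delimiters is 1, so that row would be listed and could be fixed before running the conversion.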
Now you can transfer the data from the People table to the Variations table with this query:
INSERT INTO variations (id, chr, start, stop, type)
SELECT id,
       -- each variation has the form chr:start-stop/type
       SUBSTRING_INDEX(variation, ':', 1) chr,
       SUBSTRING_INDEX(SUBSTRING_INDEX(variation, '-', 1), ':', -1) start,
       SUBSTRING_INDEX(SUBSTRING_INDEX(variation, '-', -1), '/', 1) stop,
       SUBSTRING_INDEX(variation, '/', -1) type
  FROM
(
  -- pick the n-th 't !'-delimited variation out of each blob
  SELECT p.id, SUBSTRING_INDEX(SUBSTRING_INDEX(p.variations, 't !', n.n), 't !', -1) variation
    FROM
  (
    -- keep only blobs that hold variations and skip their first 8 characters
    SELECT id, SUBSTR(variations, 9) variations
      FROM people
     WHERE variations LIKE 'xp t !%'
  ) p CROSS JOIN
  (
    -- numbers 1..100, generated on the fly
    SELECT a.N + b.N * 10 + 1 n
      FROM
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
     ORDER BY n
  ) n
   -- one row per variation: number of 't !' delimiters plus one
   WHERE n.n <= 1 + (LENGTH(p.variations) - LENGTH(REPLACE(p.variations, 't !', ''))) / 3
   ORDER BY id
) q
 ORDER BY id, chr, start, stop, type;
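To see what the nested SUBSTRING_INDEX calls and the delimiter count do, it can help to run them on literals first. The two statements below are a small sketch using values copied from the question's sample output. The aliases v, s, variation_count and second_variation are made up for illustration.

```sql
-- Pull the four fields out of a single variation string.
SELECT SUBSTRING_INDEX(v, ':', 1)                           AS chr,
       SUBSTRING_INDEX(SUBSTRING_INDEX(v, '-', 1), ':', -1) AS start,
       SUBSTRING_INDEX(SUBSTRING_INDEX(v, '-', -1), '/', 1) AS stop,
       SUBSTRING_INDEX(v, '/', -1)                          AS type
  FROM (SELECT '3:124093754-124467278/CNV' AS v) t;
-- chr = '3', start = '124093754', stop = '124467278', type = 'CNV'

-- Count the variations in an already-trimmed list and pick the second one.
SELECT 1 + (LENGTH(s) - LENGTH(REPLACE(s, 't !', ''))) / 3      AS variation_count,
       SUBSTRING_INDEX(SUBSTRING_INDEX(s, 't !', 2), 't !', -1) AS second_variation
  FROM (SELECT '10:166909544-166909544/SNPt !2:66903445-66903445/SNP' AS s) t;
-- variation_count = 2 (returned as a decimal), second_variation = '2:66903445-66903445/SNP'
```

The inner SUBSTRING_INDEX keeps everything up to the n-th delimiter, and the outer call with -1 keeps only the last piece of that, which is the n-th variation.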
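Note: this query splits at most 100 variations per id. If you need more or fewer, you can adjust the limit by editing the inner subquery aliased n, which generates a numbers (tally) table on the fly.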
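As a sketch of how that limit could be raised, adding a third digit subquery (aliased c here, an assumption that is not part of the original query) extends the generated range from 1..100 to 1..1000:

```sql
-- Numbers 1..1000 built from three copies of the digits 0..9.
SELECT a.N + b.N * 10 + c.N * 100 + 1 AS n
  FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
 CROSS JOIN
       (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
 CROSS JOIN
       (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c;
```

Dropping the b subquery and its term instead would shrink the range to 1..10. A larger range means more candidate rows to filter per blob, so it is probably best kept close to the real maximum.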
Result:
| VAR_ID | ID | CHR | START     | STOP      | TYPE |
|--------|----|-----|-----------|-----------|------|
|      1 |  3 |   3 | 124093754 | 124467278 |  CNV |
|      2 |  3 |   7 |  78030601 |  79638023 |  CNV |
|      3 | 12 |   1 |  84289718 |  85466763 |  CNV |
|      4 | 30 |  10 | 166909544 | 166909544 |  SNP |
|      5 | 30 |   2 | 166897864 | 166897864 |  CNV |
|      6 | 30 |   2 |  66903445 |  66903445 |  SNP |
|      7 | 30 |   7 |   6892788 |   6892788 |  SNP |
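In this result chr 10 sorts before chr 2, and start 166897864 before 66903445. That is consistent with the ORDER BY of the INSERT ... SELECT comparing the SUBSTRING_INDEX results as strings, since SUBSTRING_INDEX returns a string. Once the values are stored in the INT columns, a plain read query sorts them numerically. A minimal sketch against the variations table defined in this answer:

```sql
-- chr, start and stop are INT columns here, so the sort is numeric.
SELECT var_id, id, chr, start, stop, type
  FROM variations
 ORDER BY id, chr, start, stop;
```

The var_id values still reflect the order in which the rows were inserted, so they should not be relied on for numeric ordering.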
Here is a SQLFiddle demo.
Regarding converting a MySQL database column from a blob into separate parts, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/18749635/