I have a table that contains some corrupt records because I forgot to add a UNIQUE index on two columns.
Here is a sample of the table:
+----+-------------+--------+------------+
| id | uuid | object | project_id |
+----+-------------+--------+------------+
| 1 | 73621000001 | screw | 1 |
| 2 | 73621000002 | screw | 1 |
| 3 | 73621000003 | screw | 1 |
| 4 | 73621000004 | tube | 1 |
| 5 | 73621000005 | plate | 2 |
| 6 | 73621000006 | plate | 2 |
| 7 | 73621000007 | plate | 2 |
| 8 | 73621000008 | plate | 2 |
| 9 | 73621000009 | plate | 2 |
| 10 | 73621000010 | gear | 4 |
| 11 | 73621000011 | gear | 4 |
+----+-------------+--------+------------+
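As you can see, some object/project_id combinations appear more than once, each time with a different uuid.
I want to delete all duplicate records but keep the one with the highest uuid.
The resulting table should look like this: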
+----+-------------+--------+------------+
| id | uuid | object | project_id |
+----+-------------+--------+------------+
| 3 | 73621000003 | screw | 1 |
| 4 | 73621000004 | tube | 1 |
| 9 | 73621000009 | plate | 2 |
| 11 | 73621000011 | gear | 4 |
+----+-------------+--------+------------+
I can see which objects have duplicates with the following query:
SELECT object, project_id, COUNT(*)
FROM uuid_object_mapping
GROUP BY object, project_id
HAVING COUNT(*) > 1;
I can get the "clean" table with this query:
SELECT MAX(uuid) as uuid, object, project_id
FROM uuid_object_mapping
GROUP BY object, project_id;
I can verify that the "clean" table contains no duplicates with the following query:
SELECT object, project_id, COUNT(*)
FROM (
SELECT MAX(uuid) as uuid, object, project_id
FROM uuid_object_mapping
GROUP BY object, project_id
) AS clean
GROUP BY object, project_id
HAVING COUNT(*) > 1;
But how can I delete everything that is not in the "clean" table?
Best answer
In MySQL, you can use a join, but you need to be careful about NULL values:
delete om
from uuid_object_mapping om join
(select MAX(uuid) as uuid, object, project_id
from uuid_object_mapping
group by object, project_id
) omkeep
on omkeep.object = om.object and
omkeep.project_id <=> om.project_id
where om.uuid <> omkeep.uuid;
The NULL values seem to have disappeared, so you can use this on clause instead:
on omkeep.object = om.object and
omkeep.project_id = om.project_id
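Before running the delete, it may be worth previewing the rows it would remove. Once the table is clean, you can add the UNIQUE index that was missing in the first place. The following is a minimal sketch rather than a tested migration. It reuses the join from the delete statement. The index name uq_object_project is a hypothetical name chosen for illustration, and the ALTER TABLE assumes object is an indexable type such as VARCHAR.

```sql
-- Preview: the rows the DELETE would remove
-- (same join and filter, with SELECT in place of DELETE).
SELECT om.id, om.uuid, om.object, om.project_id
FROM uuid_object_mapping om
JOIN (SELECT MAX(uuid) AS uuid, object, project_id
      FROM uuid_object_mapping
      GROUP BY object, project_id
     ) omkeep
  ON omkeep.object = om.object
 AND omkeep.project_id <=> om.project_id
WHERE om.uuid <> omkeep.uuid
ORDER BY om.id;

-- After the DELETE has run: add the missing constraint so that
-- new duplicates are rejected. uq_object_project is a hypothetical name.
ALTER TABLE uuid_object_mapping
  ADD UNIQUE INDEX uq_object_project (object, project_id);
```

On the sample data from the question, the preview should list the rows with id 1, 2, 5, 6, 7, 8 and 10. Note that a UNIQUE index in MySQL still allows multiple rows in which an indexed column is NULL, so it only prevents duplicates among non-NULL combinations.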
Regarding mysql - How to delete 'duplicate' records?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32139297/