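I have four tables: City, Location, Customer, and Shop. Whoever designed this database set the primary keys to auto-increment, and as a result the DB has accumulated redundant data over time. I am trying to clean up the database, but updating and deleting the rows takes a long time. Sample tables are shown below: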
Table 1: City: ID_city(PK)
| City | ID_City |
|-----------|---------|
| Chennai | 1 |
| Bengaluru | 2 |
| Chennai | 3 |
| Delhi | 4 |
| Chennai | 5 |
| Bengaluru | 6 |
Table 2: Location: ID_Location(PK), ID_City(FK)
| Zip | ID_location | ID_City |
|------|--------------------|---------|
| 0001 | 1 | 1 |
| 0011 | 2 | 2 |
| 0002 | 3 | 1 |
| 0021 | 4 | 3 |
| 0003 | 5 | 1 |
| 0012 | 6 | 2 |
| 0001 | 7 (duplicate of 1) | 1 |
Table 3: Customer: Cust_ID(PK), ID_Location(FK)
| Cust_ID | ID_location |
|---------|-------------|
| 1 | 1 |
| 2 | 3 |
| 3 | 5 |
| 4 | 2 |
| 5 | 7 |
Table 4: Shop: Shop_ID(PK), ID_Location(FK)
| Shop_ID | ID_location |
|---------|-------------|
| 1 | 1 |
| 2 | 2 |
| 3 | 6 |
| 4 | 3 |
| 5 | 7 |
Expected tables:
Table 1: City: ID_city(PK)
| City | ID_City |
|-----------|---------|
| Chennai | 1 |
| Bengaluru | 2 |
| Delhi | 4 |
Table 2: Location: ID_Location(PK), ID_City(FK)
| Zip | ID_Location | ID_City |
|------|-------------|---------|
| 0001 | 1 | 1 |
| 0011 | 2 | 2 |
| 0002 | 3 | 1 |
| 0021 | 4 | 1 |
| 0003 | 5 | 1 |
| 0012 | 6 | 2 |
Table 3: Customer: Cust_ID(PK), ID_Location(FK)
| Cust_ID | ID_Location |
|---------|-------------|
| 1 | 1 |
| 2 | 3 |
| 3 | 5 |
| 4 | 2 |
| 5 | 1 |
Table 4: Shop: Shop_ID(PK), ID_Location(FK)
| Shop_ID | ID_Location |
|---------|-------------|
| 1 | 1 |
| 2 | 2 |
| 3 | 6 |
| 4 | 3 |
| 5 | 1 |
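As you can see, there are duplicate records everywhere, and it takes 3 UPDATE statements (using joins) and 2 DELETE statements to remove a single duplicate city. Is there a way to reduce the number of SQL statements needed for this task?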
The queries I wrote are:
- UPDATE Customer SET ID_Location = 1 WHERE Cust_ID = 5
- UPDATE Shop SET ID_Location = 1 WHERE Shop_ID = 5
- DELETE FROM Location WHERE ID_Location = 7
- UPDATE Location SET ID_City = 1 WHERE ID_City = 3 OR ID_City = 5
- DELETE FROM City WHERE ID_City = 3 OR ID_City = 5
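That is what it takes to remove one duplicate city, and the City table has about 1,300 duplicate cities. Is there a simple way to check for duplicates, update the references, and finally delete them?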
Best answer
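You can update the whole table at once based on a condition. In your case, the condition is that another row with the same value exists.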
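Before changing any data, it may help to list which rows count as duplicates and which row each one would be merged into. The query below is only a sketch: it assumes that "duplicate" means an identical City name and that the lowest Id_City in each group is the row to keep. The aliases oldId and newId are illustrative names.

```sql
-- Preview only: nothing is modified.
-- Each result row pairs a duplicate city (oldId) with the city that
-- would be kept (newId), assuming the lowest Id_City per name wins.
SELECT c.Id_City    AS oldId,
       mstr.Id_City AS newId,
       c.City
FROM City AS c
INNER JOIN (
    SELECT City, MIN(Id_City) AS Id_City
    FROM City
    GROUP BY City
    HAVING COUNT(*) > 1          -- only names that occur more than once
) AS mstr
    ON mstr.City = c.City
   AND mstr.Id_City < c.Id_City  -- skip the row that is being kept
ORDER BY c.City, c.Id_City;
```

For the sample City table, and assuming both Bengaluru rows are spelled identically, this would pair 6 with 2, and both 3 and 5 with 1.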
-- (1) UPDATE DUPLICATE CITIES ON LOCATION
UPDATE l SET l.Id_City = mstr.Id_City
-- SELECT c.Id_City oldId, mstr.Id_City newId -- Check this for your convenience
FROM [Location] l
INNER JOIN City c ON c.Id_City = l.Id_City
INNER JOIN (
    SELECT City, MIN(Id_City) Id_City -- KEEP FIRST ONLY
    FROM City
    GROUP BY City
    HAVING COUNT(1) > 1
) mstr ON mstr.City = c.City
    AND mstr.Id_City < c.Id_City; -- qualified with c: a bare Id_City is ambiguous here
-- (2) DELETE DUPLICATE CITIES
DELETE c
-- SELECT c.Id_City oldId, mstr.Id_City newId -- Check this for your convenience
FROM City c
INNER JOIN (
    SELECT City, MIN(Id_City) Id_City -- KEEP FIRST ONLY
    FROM City
    GROUP BY City
    HAVING COUNT(1) > 1
) mstr ON mstr.City = c.City
    AND mstr.Id_City < c.Id_City; -- qualified with c: a bare Id_City is ambiguous here
-- ...
The remaining queries can be modeled on these examples.
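As one possible way to fill in the remaining steps, the sketch below applies the same "keep the lowest ID" idea to duplicate locations. It is not part of the original answer and rests on several assumptions: two Location rows are duplicates when they share both Zip and Id_City, the city cleanup in steps (1) and (2) has already run, and the temp table #LocationMap with its columns oldId and newId is a hypothetical helper introduced here so the mapping is computed only once.

```sql
BEGIN TRANSACTION;

-- Build the mapping once: each duplicate location (oldId) points at the
-- lowest Id_Location with the same Zip and Id_City (newId).
-- #LocationMap is a hypothetical helper table, not from the original answer.
SELECT l.Id_Location    AS oldId,
       mstr.Id_Location AS newId
INTO #LocationMap
FROM [Location] AS l
INNER JOIN (
    SELECT Zip, Id_City, MIN(Id_Location) AS Id_Location
    FROM [Location]
    GROUP BY Zip, Id_City
    HAVING COUNT(*) > 1
) AS mstr
    ON mstr.Zip = l.Zip
   AND mstr.Id_City = l.Id_City
   AND mstr.Id_Location < l.Id_Location;

-- (3) Repoint customers that reference a duplicate location
UPDATE cu
SET cu.Id_Location = m.newId
FROM Customer AS cu
INNER JOIN #LocationMap AS m ON m.oldId = cu.Id_Location;

-- (4) Repoint shops that reference a duplicate location
UPDATE s
SET s.Id_Location = m.newId
FROM Shop AS s
INNER JOIN #LocationMap AS m ON m.oldId = s.Id_Location;

-- (5) Delete the duplicate locations, which are no longer referenced
DELETE l
FROM [Location] AS l
INNER JOIN #LocationMap AS m ON m.oldId = l.Id_Location;

DROP TABLE #LocationMap;

COMMIT TRANSACTION;
```

With the sample data, the mapping would contain a single row (7 to 1), so Cust_ID 5 and Shop_ID 5 would be repointed to location 1 and location 7 would be deleted. Running the steps inside a transaction is a design choice that allows a ROLLBACK if a check of the intermediate results looks wrong.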
This question and answer, "sql - Multiple updates and deletes across multiple tables in SQL Server", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/48912554/