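I am trying to clean up a database that contains duplicate records. I need to move the references to one record and delete the other.
I have two tables, promoters and venues, each of which references a table named cities. The problem is that some cities share the same name but have different ids, and those duplicates are referenced by venues and promoters.
With this query I can group all promoters and venues under a single city record: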
SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids, GROUP_CONCAT( DISTINCT v.id ) as venues_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
LEFT JOIN venues as v ON v.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 )
GROUP BY c.name
Now I want to run an update query on promoters that sets city_id to the id returned by the query above.
Something like this:
UPDATE promoters AS pr SET pr.city_id = (
SELECT ID
FROM (
SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 ) AND pr.id IN promoters_ids
GROUP BY c.name
) AS T1
)
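How can I do this?
Thanks
Best answer
If I understand correctly, you want to delete the duplicate cities in the end, so you first need to update every promoter that is linked to a city you are going to delete in the process.
I think it makes sense to use the lowest id among the cities that share a name (the highest would work as well, but I would at least want to specify it explicitly rather than leave the choice undefined).
So, to get the right id for a promoter, you need to select the lowest id of all cities that have the same name as the city already linked to that promoter.
Fortunately, this requirement translates almost directly into a query: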
UPDATE promoters AS pr
SET pr.city_id = (
  SELECT
    -- Select the lowest ID ..
    MIN(c.id)
  FROM
    -- .. of all cities ..
    cities c
    -- .. that have the same name ..
    INNER JOIN cities pc ON pc.name = c.name
  WHERE
    -- .. as the city already linked to the promoter being updated
    pc.id = pr.city_id
  GROUP BY
    c.name)
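The trick is to join cities to itself by name, so you can easily get all cities with the same name. I think you attempted the same thing with the IN clause, but that is a bit more complicated than it needs to be.
I don't think you need group_concat at all, other than to check whether the inner query really returns the right cities, and even that makes little sense because you are already grouping by name. Written like this, you can see that it should not go wrong: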
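Before running the update, it may help to preview which promoters would actually change. The following is a minimal sketch, not part of the original answer, that reuses the same self-join on cities by name. It assumes the table and column names from the question (promoters.id, promoters.city_id, cities.id, cities.name); the output aliases are made up for illustration.

```sql
-- Preview only: list promoters whose city_id would be rewritten.
-- Aliases promoter_id, current_city_id and target_city_id are illustrative.
SELECT
  pr.id        AS promoter_id,
  pr.city_id   AS current_city_id,
  MIN(c.id)    AS target_city_id   -- lowest id among same-named cities
FROM promoters AS pr
  -- the city the promoter points at right now
  INNER JOIN cities AS pc ON pc.id = pr.city_id
  -- every city that shares that city's name
  INNER JOIN cities AS c ON c.name = pc.name
GROUP BY
  pr.id, pr.city_id
HAVING
  MIN(c.id) <> pr.city_id;
```

If this returns no rows, every promoter already points at the lowest id for its city name and the update would change nothing.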
SELECT
  -- Select the lowest ID ..
  MIN(c.id) AS id,
  GROUP_CONCAT(c.name) AS names -- < already grouped by this, so why...
FROM
  -- .. of all cities ..
  cities c
  -- .. that have the same name.
  INNER JOIN cities pc ON pc.name = c.name
GROUP BY
  c.name
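To finish the cleanup described in the question, the same pattern could presumably be applied to venues, after which the duplicate cities can be removed. This is only a sketch under several assumptions: venues has a city_id column like promoters, no other table references cities, the tables use a transactional engine such as InnoDB, and promoters has already been updated. The aliases vc, dup and keep are made up for illustration.

```sql
START TRANSACTION;

-- Repoint each venue at the lowest id among the cities that share
-- the name of the city it currently references.
UPDATE venues AS v
SET v.city_id = (
  SELECT MIN(c.id)
  FROM cities AS c
    INNER JOIN cities AS vc ON vc.name = c.name
  WHERE vc.id = v.city_id
);

-- Delete every city for which a same-named city with a lower id exists.
-- This uses MySQL's multi-table DELETE syntax with a self-join.
DELETE dup
FROM cities AS dup
  INNER JOIN cities AS keep
    ON keep.name = dup.name
   AND keep.id < dup.id;

COMMIT;
```

Running the DELETE only after both promoters and venues have been repointed avoids leaving rows that reference a city that no longer exists. Checking the result before COMMIT, and issuing ROLLBACK if something looks off, is a cheap safeguard.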
I hope I understood the question correctly.
This question, about a MySQL query that uses IN with a group_concat result, comes from Stack Overflow: https://stackoverflow.com/questions/18119896/