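This is a SQL problem I have been trying to solve, so far without success:
Suppose I have a table:
sequence(number1 int, number2 int, number3 int, number4 int, number5 int)
If a row such as <1,3,4,2,5> exists in the table, I want to eliminate every other row that is a permutation of it, for example the row <1,2,5,4,3>.
Edit: the primary key is (number1, number2, number3, number4, number5)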
Best answer
This assumes that values cannot repeat within the five columns and that the table has a single-column primary_key:
DELETE t2
FROM `table` t1
INNER JOIN `table` t2
ON (t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
)
AND t1.primary_key < t2.primary_key
-- AND CONCAT(t1.col1, t1.col2, t1.col3, t1.col4, t1.col5) < CONCAT(t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
WHERE t1.col1 NOT IN (t1.col2, t1.col3, t1.col4, t1.col5)
AND t1.col2 NOT IN (t1.col3, t1.col4, t1.col5)
AND t1.col3 NOT IN (t1.col4, t1.col5)
AND t1.col4 <> t1.col5
I have not tested this, so I suggest running it as a SELECT before committing to the DELETE.
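A dry run of that idea can be sketched in Python with the standard-library sqlite3 module. This is only an illustration under assumptions: the table name `t`, the key column `id`, and the sample rows are hypothetical, and SQLite stands in for MySQL, so the multi-table DELETE is replaced by a SELECT of the ids that the join would remove.

```python
import sqlite3

# Hypothetical in-memory table; `t` and `id` are assumed names, not from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, "
    "col1 INT, col2 INT, col3 INT, col4 INT, col5 INT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 3, 4, 2, 5),
        (2, 1, 2, 5, 4, 3),   # permutation of row 1
        (3, 6, 7, 8, 9, 10),  # unrelated row
        (4, 5, 4, 3, 2, 1),   # permutation of row 1
    ],
)

# Same join and filters as the DELETE, but only listing the rows it would remove.
PREVIEW = """
SELECT DISTINCT t2.id
FROM t t1
INNER JOIN t t2
    ON  t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    AND t1.id < t2.id
WHERE t1.col1 NOT IN (t1.col2, t1.col3, t1.col4, t1.col5)
  AND t1.col2 NOT IN (t1.col3, t1.col4, t1.col5)
  AND t1.col3 NOT IN (t1.col4, t1.col5)
  AND t1.col4 <> t1.col5
ORDER BY t2.id
"""

doomed_ids = [row[0] for row in conn.execute(PREVIEW)]
print(doomed_ids)  # ids of the rows the DELETE would remove
```

With this sample data the preview lists rows 2 and 4, so the row with the lowest key in each group of permutations is the one that survives.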
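Update: the following query also works when a row contains duplicate values (1,1,2,2,2 rather than 1,2,3,4,5), but the join is very expensive, so I would be very cautious about running it against a very large data set.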
DELETE t2
FROM `table` t1
INNER JOIN `table` t2
ON ( t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
)
AND (-- compare the number of occurrences of each value in each side
(IF(t1.col1=t1.col1, 1, 0)+IF(t1.col1=t1.col2, 1, 0)+IF(t1.col1=t1.col3, 1, 0)+IF(t1.col1=t1.col4, 1, 0)+IF(t1.col1=t1.col5, 1, 0)) = (IF(t1.col1=t2.col1, 1, 0)+IF(t1.col1=t2.col2, 1, 0)+IF(t1.col1=t2.col3, 1, 0)+IF(t1.col1=t2.col4, 1, 0)+IF(t1.col1=t2.col5, 1, 0))
AND (IF(t1.col2=t1.col1, 1, 0)+IF(t1.col2=t1.col2, 1, 0)+IF(t1.col2=t1.col3, 1, 0)+IF(t1.col2=t1.col4, 1, 0)+IF(t1.col2=t1.col5, 1, 0)) = (IF(t1.col2=t2.col1, 1, 0)+IF(t1.col2=t2.col2, 1, 0)+IF(t1.col2=t2.col3, 1, 0)+IF(t1.col2=t2.col4, 1, 0)+IF(t1.col2=t2.col5, 1, 0))
AND (IF(t1.col3=t1.col1, 1, 0)+IF(t1.col3=t1.col2, 1, 0)+IF(t1.col3=t1.col3, 1, 0)+IF(t1.col3=t1.col4, 1, 0)+IF(t1.col3=t1.col5, 1, 0)) = (IF(t1.col3=t2.col1, 1, 0)+IF(t1.col3=t2.col2, 1, 0)+IF(t1.col3=t2.col3, 1, 0)+IF(t1.col3=t2.col4, 1, 0)+IF(t1.col3=t2.col5, 1, 0))
AND (IF(t1.col4=t1.col1, 1, 0)+IF(t1.col4=t1.col2, 1, 0)+IF(t1.col4=t1.col3, 1, 0)+IF(t1.col4=t1.col4, 1, 0)+IF(t1.col4=t1.col5, 1, 0)) = (IF(t1.col4=t2.col1, 1, 0)+IF(t1.col4=t2.col2, 1, 0)+IF(t1.col4=t2.col3, 1, 0)+IF(t1.col4=t2.col4, 1, 0)+IF(t1.col4=t2.col5, 1, 0))
AND (IF(t1.col5=t1.col1, 1, 0)+IF(t1.col5=t1.col2, 1, 0)+IF(t1.col5=t1.col3, 1, 0)+IF(t1.col5=t1.col4, 1, 0)+IF(t1.col5=t1.col5, 1, 0)) = (IF(t1.col5=t2.col1, 1, 0)+IF(t1.col5=t2.col2, 1, 0)+IF(t1.col5=t2.col3, 1, 0)+IF(t1.col5=t2.col4, 1, 0)+IF(t1.col5=t2.col5, 1, 0))
)
AND t1.primary_key < t2.primary_key
-- AND CONCAT(t1.col1, t1.col2, t1.col3, t1.col4, t1.col5) < CONCAT(t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
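If the table has no single-column primary key, you can use the commented-out comparison instead of the PK comparison, but the PK is definitely preferred. Be aware that CONCAT without a separator can produce the same string for two different rows (1 followed by 11 and 11 followed by 1 both give 111), in which case neither row is deleted; CONCAT_WS with a delimiter avoids that.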
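To see why the occurrence counts are needed once duplicates are allowed, here is a small Python sketch that mirrors the two parts of the join condition. The function names and sample rows are made up for illustration; the logic is a plain-Python restatement of the SQL above, not part of the original answer.

```python
from collections import Counter


def passes_in_test(a, b):
    """Mirror of the five IN conditions: every value of a appears somewhere in b."""
    return all(x in b for x in a)


def same_occurrences(a, b):
    """Mirror of the IF(...) sums: each value of a occurs as often in b as in a."""
    return all(a.count(x) == b.count(x) for x in a)


def is_permutation(a, b):
    """Reference check: two rows are permutations iff their multisets are equal."""
    return Counter(a) == Counter(b)


row_a = (1, 1, 2, 2, 2)
row_b = (1, 1, 1, 2, 2)  # same distinct values, different counts
row_c = (2, 1, 2, 1, 2)  # a genuine permutation of row_a

# The IN test alone wrongly pairs row_a with row_b.
in_only = passes_in_test(row_a, row_b)
# Adding the occurrence comparison rejects that pair.
with_counts = in_only and same_occurrences(row_a, row_b)
# A real permutation passes both checks.
perm_ok = passes_in_test(row_a, row_c) and same_occurrences(row_a, row_c)

print(in_only, with_counts, perm_ok)
```

Because all five values of the first row are counted, matching counts for every value force the second row to hold exactly the same multiset, which is what the reference check `is_permutation` confirms.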
Regarding mysql - eliminating permutations from a table, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/9651498/