I'm using PHP and MySQL. Can anyone suggest an efficient way to filter out duplicate results based on priority?
Example:
I have a table:
ID | Priority 1 | Priority 2 | Priority 3 | E-Mail
--------------------------------------------------------------
1  | Apple      | One        | Low        | abc@abc.com
2  | Banana     | Two        | Medium     | def@abc.com
3  | Banana     | Two        | High       | def@abc.com
4  | Banana     | Two        | High       | def@abc.com
5  | Peach      | Three      | Low        | ghi@abc.com
6  | Peach      | Four       | High       | ghi@abc.com
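In the example above, I'm looking for a way to get only rows 1, 3 (or 4), and 6.
That is, rows 2, 3, 4 are duplicates of one another, and so are rows 5 and 6, because each group shares the same e-mail address. I want to pick one record per e-mail address based on priority.
If the duplicates have the same Priority 1, move on to Priority 2. If that is also the same, move on to Priority 3. If that is the same too, it doesn't matter which record is picked.
If there is a difference, however, I pick the record with the higher priority.
In the example above, the priorities (highest first) are: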
Peach -> Banana -> Apple
Four -> Three -> Two -> One
High -> Medium -> Low
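I will then insert the results into a different database.
So far I have a query that fetches the non-duplicates. I'm thinking of running a second query to handle the duplicates.
The first query handles about 20,000 records. The second query would handle about 5,000 records.
However, I'm not sure what an efficient way to do that would be.
Any help would be appreciated.
Thanks.
Edit: fixed a typo. The rows wanted are 1, 3/4, and 6 (not 1, 2, and 6).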
Best answer
This query should give you the result you need:
SELECT
    MIN(ID),
    EMail,
    MIN(Priority1),
    MIN(Priority2),
    MIN(Priority3)
FROM
    yourtable
WHERE
    -- Level 3: among the rows that survive levels 1 and 2, keep the best Priority3
    (EMail, Priority1, Priority2, FIELD(Priority3, 'High', 'Medium', 'Low')) IN (
        SELECT
            EMail,
            MIN(Priority1),
            MIN(Priority2),
            MIN(FIELD(Priority3, 'High', 'Medium', 'Low')) MinP3
        FROM
            yourtable
        WHERE
            -- Level 2: among the rows that survive level 1, keep the best Priority2
            (EMail, Priority1, FIELD(Priority2, 'Four', 'Three', 'Two', 'One')) IN (
                SELECT
                    EMail,
                    MIN(Priority1),
                    MIN(FIELD(Priority2, 'Four', 'Three', 'Two', 'One')) MinP2
                FROM
                    yourtable
                WHERE
                    -- Level 1: keep the rows with the best Priority1 per e-mail
                    (EMail, FIELD(Priority1, 'Peach', 'Banana', 'Apple')) IN (
                        SELECT
                            EMail,
                            MIN(FIELD(Priority1, 'Peach', 'Banana', 'Apple')) MinP1
                        FROM
                            yourtable
                        GROUP BY
                            EMail)
                GROUP BY
                    EMail)
        GROUP BY
            EMail)
GROUP BY
    EMail
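(It returns row 3 rather than row 2, which should be correct if I understand your question properly.) See the fiddle linked in the original answer. I doubt it will be very fast, and I'm still wondering whether there is a way to speed it up.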
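The query above leans on MySQL's FIELD(), which returns the 1-based position of its first argument within the list that follows it. Listing the values from highest to lowest priority therefore makes the smallest number the best one, which is why MIN() picks the winner. A minimal sketch for inspecting those ranks row by row, assuming the same table and column names as above (the aliases Rank1 to Rank3 are only illustrative):

```sql
-- Show the numeric rank FIELD() assigns to each priority column.
-- Lower numbers mean higher priority because each list runs from best to worst.
SELECT
    ID,
    EMail,
    FIELD(Priority1, 'Peach', 'Banana', 'Apple')       AS Rank1,
    FIELD(Priority2, 'Four', 'Three', 'Two', 'One')    AS Rank2,
    FIELD(Priority3, 'High', 'Medium', 'Low')          AS Rank3
FROM
    yourtable
ORDER BY
    EMail, Rank1, Rank2, Rank3;
```

One thing to watch: FIELD() returns 0 when the value is not in the list, so an unexpected value would rank ahead of every listed one. It may be worth checking the data for such values before relying on the result.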
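Edit
You could try the following query. It uses different logic, and it relies on a priority lookup table with indexed columns, which should be considerably faster than the FIELD function, although the number of joins may slow the query down somewhat.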
CREATE TABLE Priorities (
    Num INT,
    Des VARCHAR(10),
    Priority INT,
    PRIMARY KEY (Num, Des)
);
INSERT INTO Priorities VALUES
    (1, 'Peach', 1),
    (1, 'Banana', 2),
    (1, 'Apple', 3),
    (2, 'Four', 1),
    (2, 'Three', 2),
    (2, 'Two', 3),
    (2, 'One', 4),
    (3, 'High', 1),
    (3, 'Medium', 2),
    (3, 'Low', 3);
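Because the query that follows uses INNER JOINs against Priorities, any row in yourtable whose value is missing from the lookup table would silently drop out of the result. A hedged sanity check, assuming the table and column names used in this answer (the alias t is only illustrative), lists such rows so they can be fixed or added to Priorities first:

```sql
-- List rows whose priority values have no match in the lookup table.
-- An empty result means the INNER JOINs will not lose any rows.
SELECT
    t.ID,
    t.Priority1,
    t.Priority2,
    t.Priority3
FROM
    yourtable t
    LEFT JOIN Priorities p1 ON t.Priority1 = p1.Des AND p1.Num = 1
    LEFT JOIN Priorities p2 ON t.Priority2 = p2.Des AND p2.Num = 2
    LEFT JOIN Priorities p3 ON t.Priority3 = p3.Des AND p3.Num = 3
WHERE
    p1.Des IS NULL
    OR p2.Des IS NULL
    OR p3.Des IS NULL;
```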
SELECT MIN(ID), yourtable.Email, MIN(Priority1) Priority1, MIN(Priority2) Priority2, MIN(Priority3) Priority3
FROM
    yourtable
    INNER JOIN Priorities p1 ON yourtable.Priority1=p1.Des AND p1.Num=1
    INNER JOIN Priorities p2 ON yourtable.Priority2=p2.Des AND p2.Num=2
    INNER JOIN Priorities p3 ON yourtable.Priority3=p3.Des AND p3.Num=3
    INNER JOIN (
        SELECT s1.EMail, MIN(MinP1) M1, MIN(MinP2) M2, MIN(MinP3) M3
        FROM (
            SELECT EMail, MIN(p1.Priority) MinP1
            FROM yourtable INNER JOIN Priorities p1
                ON yourtable.Priority1 = p1.Des AND p1.Num = 1
            GROUP BY EMail) s1
        INNER JOIN (
            SELECT EMail, p1.Priority Pr1, MIN(p2.Priority) MinP2
            FROM yourtable INNER JOIN Priorities p1
                ON yourtable.Priority1 = p1.Des AND p1.Num = 1
            INNER JOIN Priorities p2
                ON yourtable.Priority2 = p2.Des AND p2.Num = 2
            GROUP BY EMail, p1.Priority) s2
            ON s1.EMail=s2.EMail AND s1.MinP1=s2.Pr1
        INNER JOIN (
            SELECT EMail, p1.Priority Pr1, p2.Priority Pr2, MIN(p3.Priority) MinP3
            FROM yourtable INNER JOIN Priorities p1
                ON yourtable.Priority1 = p1.Des AND p1.Num = 1
            INNER JOIN Priorities p2
                ON yourtable.Priority2 = p2.Des AND p2.Num = 2
            INNER JOIN Priorities p3
                ON yourtable.Priority3 = p3.Des AND p3.Num = 3
            GROUP BY EMail, p1.Priority, p2.Priority) s3
            ON s1.Email=s3.Email AND s1.MinP1=s3.Pr1 AND s2.MinP2=s3.Pr2
        GROUP BY
            s1.EMail) s
        ON yourtable.EMail=s.Email
        AND p1.Priority=s.M1
        AND p2.Priority=s.M2
        AND p3.Priority=s.M3
GROUP BY
    yourtable.EMail
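See the fiddle linked in the original answer. If it is still too slow, we could try using my first query together with the lookup table from the second one. Or we could split the query into two parts.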
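If the server happens to run MySQL 8.0 or later (an assumption, since the question does not state a version), a window function is another option worth trying. This is not the approach from the queries above but a swapped-in technique: it numbers the rows within each e-mail address by the three lookup priorities and keeps the first one. The sketch reuses the Priorities table; the aliases t, ranked, and rn are illustrative, and breaking remaining ties by the lowest ID is my own choice, since the question says any of the tied rows is acceptable.

```sql
-- Requires MySQL 8.0+ for ROW_NUMBER().
-- Rank the rows of each e-mail address by priority and keep the best one.
SELECT
    ID, EMail, Priority1, Priority2, Priority3
FROM (
    SELECT
        t.ID, t.EMail, t.Priority1, t.Priority2, t.Priority3,
        ROW_NUMBER() OVER (
            PARTITION BY t.EMail
            ORDER BY p1.Priority, p2.Priority, p3.Priority, t.ID
        ) AS rn
    FROM
        yourtable t
        INNER JOIN Priorities p1 ON t.Priority1 = p1.Des AND p1.Num = 1
        INNER JOIN Priorities p2 ON t.Priority2 = p2.Des AND p2.Num = 2
        INNER JOIN Priorities p3 ON t.Priority3 = p3.Des AND p3.Num = 3
) ranked
WHERE
    rn = 1;
```

Whether this is actually faster than the grouped joins depends on the data and the indexes, so it would need to be measured on the real table.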
Source: "php - SQL filter duplicates based on column priority", a similar question found on Stack Overflow: https://stackoverflow.com/questions/16489198/