我正在尝试获取最频繁的数字对和三元组 (3/3),我的表格如下所示:
+----+------+------+------+------+------+------+------+------+------+------+
| id | nr1 | nr2 | nr3 | nr4 | nr5 | nr6 | nr7 | nr8 | nr9 | nr10 |
+----+------+------+------+------+------+------+------+------+------+------+
| 1 | 1 | 39 | 19 | 23 | 28 | 80 | 3 | 42 | 60 | 32 |
+----+------+------+------+------+------+------+------+------+------+------+
| 2 | 43 | 18 | 3 | 24 | 29 | 33 | 15 | 1 | 61 | 80 |
+----+------+------+------+------+------+------+------+------+------+------+
| 3 | 11 | 25 | 33 | 2 | 30 | 3 | 1 | 44 | 62 | 78 |
+----+------+------+------+------+------+------+------+------+------+------+
我想知道我所有行中最常见的 3 对或三元组数字是什么。
示例:
1,3(3 次)
1,80(2 次)
3,80(2 次) 1,3,80(2 次)
我可以尝试按 1、2、3 之类的顺序添加数字,然后从数据库中提取它们,但我想出的脚本仍然很糟糕,需要数小时才能检查 10000 行
欢迎任何想法.. 非常感谢。
最佳答案
你需要对你的表进行逆透视,但是mysql没有逆透视函数,所以你需要这样做
CREATE TABLE unpivot
SELECT *
FROM (
SELECT id, nr1 as n_value FROM tuple union all
SELECT id, nr2 as n_value FROM tuple union all
SELECT id, nr3 as n_value FROM tuple union all
SELECT id, nr4 as n_value FROM tuple union all
SELECT id, nr5 as n_value FROM tuple union all
SELECT id, nr6 as n_value FROM tuple union all
SELECT id, nr7 as n_value FROM tuple union all
SELECT id, nr8 as n_value FROM tuple union all
SELECT id, nr9 as n_value FROM tuple union all
SELECT id, nr10 as n_value FROM tuple
) as T
现在寻找与自身进行连接的对数。
SELECT n1, n2, count(*) as total
FROM
(
SELECT up1.n_value as n1, up2.n_value as n2
FROM unpivot up1
JOIN unpivot up2
ON up1.`id` = up2.`id`
AND up1.n_value < up2.n_value
) T
GROUP BY n1, n2
ORDER BY total desc
LIMIT 3;
对于三胞胎,你加入了 3 次 table
SELECT n1, n2, n3, count(*) as total
FROM
(
SELECT up1.n_value as n1, up2.n_value as n2, up3.n_value as n3
FROM unpivot up1
JOIN unpivot up2
ON up1.`id` = up2.`id`
AND up1.n_value < up2.n_value
JOIN unpivot up3
ON up2.`id` = up3.`id`
AND up2.n_value < up3.n_value
) T
GROUP BY n1, n2, n3
ORDER BY total desc
LIMIT 3;
更新:
我在 postgresql 上做了测试
创建 50k 行,随机值从 1 到 90
创建索引后,查询只需2秒即可完成。
关于php - 获取 20 个最频繁的数字对 (2/2) 和三元组 (3/3),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/34637127/