php - 获取 20 个最频繁的数字对 (2/2) 和三元组 (3/3)

标签 php mysql

我正在尝试获取最频繁的数字对和三元组 (3/3),我的表格如下所示:

+----+------+------+------+------+------+------+------+------+------+------+
| id | nr1  | nr2  | nr3  | nr4  | nr5  | nr6  | nr7  | nr8  | nr9  | nr10 |
+----+------+------+------+------+------+------+------+------+------+------+
| 1  | 1    | 39   | 19   | 23   | 28   | 80   | 3    | 42   | 60   | 32   |
+----+------+------+------+------+------+------+------+------+------+------+
| 2  | 43   | 18   | 3    | 24   | 29   | 33   | 15   | 1    | 61   | 80   |
+----+------+------+------+------+------+------+------+------+------+------+
| 3  | 11   | 25   | 33   | 2    | 30   | 3    | 1    | 44   | 62   | 78   |
+----+------+------+------+------+------+------+------+------+------+------+

我想知道我所有行中最常见的 3 对或三元组数字是什么。

示例:

1,3(3 次)

1,80(2 次)

3,80(2 次) 1,3,80(2 次)

我可以尝试按 1、2、3 之类的顺序添加数字,然后从数据库中提取它们,但我想出的脚本仍然很糟糕,需要数小时才能检查 10000 行

欢迎任何想法.. 非常感谢。

最佳答案

你需要对你的表进行逆透视,但是mysql没有逆透视函数,所以你需要这样做

SQL Fiddle Demo

CREATE TABLE unpivot
SELECT *
FROM ( 
        SELECT id, nr1 as n_value FROM tuple union all 
        SELECT id, nr2 as n_value FROM tuple union all 
        SELECT id, nr3 as n_value FROM tuple union all 
        SELECT id, nr4 as n_value FROM tuple union all 
        SELECT id, nr5 as n_value FROM tuple union all 
        SELECT id, nr6 as n_value FROM tuple union all 
        SELECT id, nr7 as n_value FROM tuple union all 
        SELECT id, nr8 as n_value FROM tuple union all 
        SELECT id, nr9 as n_value FROM tuple union all 
        SELECT id, nr10 as n_value FROM tuple
     ) as T

现在寻找与自身进行连接的对数。

SELECT n1, n2, count(*) as total
FROM 
    (
    SELECT up1.n_value as n1, up2.n_value as n2
    FROM unpivot up1
    JOIN unpivot up2
      ON up1.`id` = up2.`id`        
     AND up1.n_value < up2.n_value  
   ) T
GROUP BY n1, n2
ORDER BY total desc
LIMIT 3;

对于三胞胎,你加入了 3 次 table

SELECT n1, n2, n3, count(*) as total
FROM 
    (
    SELECT up1.n_value as n1, up2.n_value as n2, up3.n_value as n3
    FROM unpivot up1
    JOIN unpivot up2
      ON up1.`id` = up2.`id`        
     AND up1.n_value < up2.n_value  
     JOIN unpivot up3
      ON up2.`id` = up3.`id`        
     AND up2.n_value < up3.n_value  
   ) T
GROUP BY n1, n2, n3
ORDER BY total desc
LIMIT 3;

更新:

我在 postgresql 上做了测试

创建 50k 行,随机值从 1 到 90

创建索引后,查询只需2秒即可完成。

enter image description here

关于php - 获取 20 个最频繁的数字对 (2/2) 和三元组 (3/3),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/34637127/

相关文章:

javascript - 更新从另一个站点提取数据的 div 标签,我希望该 div 自动刷新而无需重新加载页面

javascript - 如何使用jquery动态设置colspan?

javascript - 动态打印两个不同字符串中的替代字符串字符

java - 在 JDBC 连接字符串中选择负载平衡策略

php - 高效 MySQL 查询 : running various queries through multiple tables, 将结果表分配给数组

php - 带彩色按钮的 Laravel 查询

php - 如何计算 PHP 表中所有带撇号的行?

mysql - 我的 sql 表中的日期列

sql - 如何同时支持一个web上的多个sql查询?

php - 更新mysql数据时遇到问题