mysql - Make HAVING count(*) percentage based - complex query with a percentage calculation

Tags: mysql sql

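This query suggests friendships based on the number of words two users have in common. The in_common parameter sets the threshold.

I would like to know whether it is possible to make this query entirely percentage based.

What I want is to suggest a user to the current user if 30% of that user's words match.

Current user's total word count: 100

in_common threshold: 30

some_other_user's total word count: 10

3 of those words match the current user's list.

Since 3 is 30% of 10, this user is a match for the current user.

Is this possible?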

SELECT users.name_surname, users.avatar, t1.qty, GROUP_CONCAT(words_en.word) AS in_common, (users.id) AS friend_request_id
    FROM (
      SELECT c2.user_id, COUNT(*) AS qty
      FROM `connections` c1
      JOIN `connections` c2
        ON c1.user_id <> c2.user_id 
          AND c1.word_id = c2.word_id
      WHERE c1.user_id = :user_id
      GROUP BY c2.user_id
      HAVING count(*) >= :in_common) as t1
     JOIN users
       ON t1.user_id = users.id
     JOIN connections
       ON connections.user_id = t1.user_id
     JOIN words_en
       ON words_en.id = connections.word_id
     WHERE EXISTS(SELECT * 
                  FROM connections 
                  WHERE connections.user_id = :user_id
                    AND connections.word_id = words_en.id)
     GROUP BY users.id, users.name_surname, users.avatar, t1.qty
     ORDER BY t1.qty DESC, users.name_surname ASC

SQL fiddle: http://www.sqlfiddle.com/#!2/c79a6/9

Best answer

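OK, so the problem is that "users in common" is defined as an asymmetric relation: the same number of shared words is a different percentage of each user's word list. To resolve this, let's assume the in_common percentage threshold is checked against the user with the fewest words.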
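To make the asymmetry concrete, here is a small sketch that uses only the numbers from the question (3 shared words, 100 words for the current user, 10 for some_other_user). The column aliases are made up for illustration.

```sql
-- Numbers from the question: 3 shared words,
-- the current user has 100 words, some_other_user has 10.
SELECT 3 / 100            AS pct_of_current_user,  -- 3% of the longer list
       3 / 10             AS pct_of_other_user,    -- 30% of the shorter list
       3 / LEAST(100, 10) AS pct_of_smaller_list;  -- 30%, the same for both users
```

Dividing by LEAST(...) gives one value per pair of users, so the suggestion no longer depends on which of the two users is treated as the current one.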

Try this query. It gives you the full list of user pairs that have at least 1 word in common and flags the friendship suggestions:

SELECT user1_id, user2_id, user1_wc, user2_wc,
       count(*) AS common_wc, count(*) / least(user1_wc, user2_wc) AS common_wc_pct,
       -- 0.7 is the cutoff used in this answer; the 30% from the question would be 0.3
       CASE WHEN count(*) / least(user1_wc, user2_wc) > 0.7 THEN 1 ELSE 0 END AS friendship_suggestion
FROM (
    SELECT u1.user_id AS user1_id, u2.user_id AS user2_id,
           u1.word_count AS user1_wc, u2.word_count AS user2_wc,
           c1.word_id AS word1_id, c2.word_id AS word2_id
      FROM connections c1
      JOIN connections c2 ON (c1.user_id < c2.user_id AND c1.word_id = c2.word_id)
      JOIN (SELECT user_id, count(*) AS word_count
            FROM connections
            GROUP BY user_id) u1 ON (c1.user_id = u1.user_id)
      JOIN (SELECT user_id, count(*) AS word_count
            FROM connections
            GROUP BY user_id) u2 ON (c2.user_id = u2.user_id)
) AS shared_words
GROUP BY user1_id, user2_id, user1_wc, user2_wc;

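For clarity, the friendship suggestion flag is computed in the SELECT list. You will probably want to filter on it, in which case you can move the condition into a HAVING clause.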
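A minimal sketch of that HAVING variant follows. It assumes that connections holds at most one row per (user_id, word_id) pair, and it uses the 30% cutoff from the question (written as 0.30 with >=) rather than the 0.7 above. Both of these are assumptions, not part of the original answer.

```sql
SELECT user1_id, user2_id, user1_wc, user2_wc,
       COUNT(*) AS common_wc,
       COUNT(*) / LEAST(user1_wc, user2_wc) AS common_wc_pct
FROM (
    -- One row per word shared by a pair of users (user1_id < user2_id),
    -- together with each user's total word count.
    SELECT c1.user_id AS user1_id, c2.user_id AS user2_id,
           u1.word_count AS user1_wc, u2.word_count AS user2_wc
      FROM connections c1
      JOIN connections c2 ON (c1.user_id < c2.user_id AND c1.word_id = c2.word_id)
      JOIN (SELECT user_id, COUNT(*) AS word_count
            FROM connections
            GROUP BY user_id) u1 ON (c1.user_id = u1.user_id)
      JOIN (SELECT user_id, COUNT(*) AS word_count
            FROM connections
            GROUP BY user_id) u2 ON (c2.user_id = u2.user_id)
) AS shared_words
GROUP BY user1_id, user2_id, user1_wc, user2_wc
-- Keep only pairs whose shared words make up at least 30% of the shorter list.
HAVING COUNT(*) / LEAST(user1_wc, user2_wc) >= 0.30;
```

If the cutoff should stay a bound parameter as in the question, 0.30 could presumably be replaced with :in_common / 100. Restricting the result to a single user would need an extra condition on c1.user_id or c2.user_id in the inner query.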

Regarding "mysql - Make HAVING count(*) percentage based", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20008766/
