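sql - Majority vote on columns in SQL

I need to perform something like a "majority vote" over columns in an SQL database. That is, given columns c0, c1, ..., cn, I would like another column that holds, for each row, the most frequent value among those columns (and null or a random value otherwise; it doesn't really matter). For example, if we have the following table: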

+--+--+--+------+
|c0|c1|c2|result|
+--+--+--+------+
| 0| 1| 0|     0|
| 0| 1| 1|     1|
| 2| 2| 0|     2|
| 0| 3| 1|  null|
+--+--+--+------+

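This is what I mean by majority voting over columns c0, c1, c2: in the first row, two columns have the value 0 and one has the value 1, so result = 0. In the second row there is one 0 and two 1s, so result = 1, and so on. In the last row all three values differ, so result is null. Assume that all columns have the same type.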
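
To make the example reproducible, the table above could be set up roughly as follows. This is only a sketch: the table name the_table and the integer column type are assumptions, the multi-row VALUES form is written for Postgres, and result is left out because it is the value the query is supposed to compute.

```sql
-- Assumed table name and column type; only c0..c2 are stored.
create table the_table (
  c0 integer,
  c1 integer,
  c2 integer
);

insert into the_table (c0, c1, c2) values
  (0, 1, 0),  -- wanted result: 0
  (0, 1, 1),  -- wanted result: 1
  (2, 2, 0),  -- wanted result: 2
  (0, 3, 1);  -- wanted result: null (no value occurs more than once)
```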

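It would be great if the query were concise (it can be built dynamically). Native SQL is preferred, but PL/SQL or psql would also be fine.

Thank you in advance.

Best answer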

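This can easily be done by turning the three columns into a table (one row per column value) and applying an aggregate function to it.

The following works in Postgres: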

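select c0, c1, c2,
       (select c
        -- turn the three column values of this row into three rows
        from unnest(array[c0, c1, c2]) as t(c)
        group by c
        -- keep only values that occur more than once
        having count(*) > 1
        order by count(*) desc
        limit 1) as result
from the_table;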
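
As an alternative sketch (not the approach shown above), Postgres 9.4 and later also provide the ordered-set aggregate mode(), which returns the most frequent non-null value directly. The behaviour is slightly different: when all values are distinct it still returns one of them instead of null, which the question says is acceptable. The alias u(c) is just an illustrative name.

```sql
select c0, c1, c2,
       -- mode() picks the most frequent value; nulls are ignored and
       -- one value is chosen arbitrarily if several are equally frequent
       (select mode() within group (order by c)
        from unnest(array[c0, c1, c2]) as u(c)) as result
from the_table;
```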

If you don't want to hard-code the column names, you can use Postgres' JSON functions instead:

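select t.*,
       (select x.v
        -- one row per column of the current row: k = column name, v = value as text
        -- (alias x avoids shadowing the outer table alias t)
        from jsonb_each_text(to_jsonb(t)) as x(k, v)
        group by x.v
        having count(*) > 1
        order by count(*) desc
        limit 1) as result
from the_table t;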

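Note that the above takes all columns into account. If you want to exclude specific columns (e.g. an id column), you need to remove that key from the JSON value using to_jsonb(t) - 'id'.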
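
A minimal sketch of that exclusion, assuming the table has an additional column named id (hypothetical; the sample table in the question has none). Since jsonb_each_text returns the values as text, result is of type text here and may need a cast back to the column type.

```sql
select t.*,
       (select x.v
        -- "- 'id'" drops the id key, so id never takes part in the vote
        from jsonb_each_text(to_jsonb(t) - 'id') as x(k, v)
        group by x.v
        having count(*) > 1
        order by count(*) desc
        limit 1) as result
from the_table t;
```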

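None of these solutions deals with ties (two different values appearing the same number of times).

Online example: https://rextester.com/PJR58760

The first solution can be somewhat "adapted" to Oracle, especially if you can build the SQL dynamically: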
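
With only three columns and the count(*) > 1 filter a tie cannot actually occur; it becomes possible with four or more columns (e.g. 0, 0, 1, 1). One way to make ties explicit is sketched below for Postgres, returning null whenever the highest count is shared. The fourth column c3 and the aliases u, ranked and rnk are assumptions made for illustration only.

```sql
select t.c0, t.c1, t.c2, t.c3,
       -- exactly one top-ranked value: return it; zero or several: null
       (select case when count(*) = 1 then min(c) end
        from (select c,
                     rank() over (order by count(*) desc) as rnk
              from unnest(array[t.c0, t.c1, t.c2, t.c3]) as u(c)
              group by c
              having count(*) > 1) ranked
        where rnk = 1) as result
from the_table t;
```

Values that tie for the highest count all receive rank 1, so the outer count(*) tells a unique winner apart from a tie.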

select t.*, 
       (select c
        from (
          -- this part would need to be done dynamically
          -- if you don't know the columns
          select t.c0 as c from dual union all 
          select t.c1 from dual union all 
          select t.c2 from dual
        ) x
        group by c
        having count(*) > 1
        order by count(*) desc
        fetch first 1 rows only) as result
from the_table t;

Regarding "sql - Majority vote on columns in SQL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54162694/
