ruby - MapReduce arrays in Ruby

Tags: ruby mapreduce

I have two arrays like this:

["1","7","8","10"]

["1","2","3","6","9","11"]

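These arrays hold the IDs of objects of a class named Place, as selected by users. I want to pick the place ID with the most votes. I tried transpose, but the arrays have different sizes, so they cannot be transposed.

The expected output for this example is: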
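To illustrate the transpose problem described above, here is a minimal sketch using the two arrays from the question. The variable names (`selections`, `transposed`, `error_message`) are only illustrative; `Array#transpose` raises an `IndexError` when the inner arrays differ in length.

```ruby
selections = [["1", "7", "8", "10"], ["1", "2", "3", "6", "9", "11"]]

begin
  # transpose expects every inner array to have the same length
  selections.transpose
  transposed = true
rescue IndexError => e
  # The inner arrays have 4 and 6 elements, so we end up here
  transposed = false
  error_message = e.message
end

puts "transpose failed: #{error_message}" unless transposed
```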

{ "1" => 2, "7" => 1, "8" => 1, "10" => 1, "2" => 1, "3" => 1, "6" => 1, "9" => 1, "11" => 1 }

Best answer

You can concatenate all the arrays and count how many times each element occurs, like this:

arrays = [["1","7","8","10"], ["1","2","3","6","9","11"]].reduce(:+)
arrays.inject(Hash.new(0)) { |memo, e| memo.update(e => memo[e] + 1) }
#=> {"1"=>2, "7"=>1, "8"=>1, "10"=>1, "2"=>1, "3"=>1, "6"=>1, "9"=>1, "11"=>1}
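The same counting step can also be written with `flatten` and `each_with_object`, which some readers find easier to follow than `inject` with `update`. This is only a sketch of an alternative, not part of the original answer; the helper name `count_votes` is hypothetical.

```ruby
# Hypothetical helper: counts how often each ID appears across all arrays.
def count_votes(selections)
  # flatten merges the per-user arrays into one list of IDs;
  # Hash.new(0) makes every unseen ID start at a count of zero.
  selections.flatten.each_with_object(Hash.new(0)) do |id, counts|
    counts[id] += 1
  end
end

votes = count_votes([["1", "7", "8", "10"], ["1", "2", "3", "6", "9", "11"]])
puts votes.inspect
```

On Ruby 2.7 or later, `selections.flatten.tally` should give the same counts in a single call.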

Once you have this intermediate result, use max_by to select the key with the largest value from the hash:

arrays = [["1","7","8","10"], ["1","2","3","6","9","11"]].reduce(:+)
arrays.inject(Hash.new(0)) { |memo, e| memo.update(e => memo[e] + 1) }
      .max_by { |_, count| count }[0]
#=> "1"
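Note that max_by returns a single key even when several IDs share the highest count. If ties matter, one possible variation (a sketch under that assumption, with the hypothetical helper name `top_places`) returns every ID that reaches the maximum:

```ruby
# Hypothetical helper: returns all IDs that share the highest vote count.
def top_places(selections)
  counts = selections.flatten.each_with_object(Hash.new(0)) do |id, acc|
    acc[id] += 1
  end
  return [] if counts.empty?

  best = counts.values.max
  # Keep only the entries whose count equals the maximum, then take their keys
  counts.select { |_, count| count == best }.keys
end

single_winner = top_places([["1", "7", "8", "10"], ["1", "2", "3", "6", "9", "11"]])
tied_winners  = top_places([["1", "2"], ["1", "2", "3"]])
puts single_winner.inspect  # ["1"]
puts tied_winners.inspect   # ["1", "2"]
```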

This question about MapReduce arrays in Ruby comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/33424300/
