<分区>
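What factors determine how many mappers and reducers to use for a given dataset to achieve the best performance? I am asking about the Apache Hadoop MapReduce platform.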
<分区>
Best answer
According to the Cloudera blog:
Have you set the optimal number of mappers and reducers?
The number of mappers is by default set to one per HDFS block. This is usually a good default, but see tip 2.
The number of reducers is best set to be the number of reduce slots in the cluster (minus a few to allow for failures). This allows the reducers to complete in a single wave.
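The two rules of thumb quoted above can be turned into a rough back-of-the-envelope calculation. The sketch below is only an illustration: `TaskCountEstimate`, `estimateMappers`, and `estimateReducers` are hypothetical helper names, not part of Hadoop, and the 128 MiB block size, the 40 reduce slots, and the 3 spare slots are assumed values. It also treats the input as a single file, whereas real jobs get splits per file.

```java
// Illustrative sketch only: this class and its methods are hypothetical
// helpers, not Hadoop APIs. All numeric inputs are assumed example values.
final class TaskCountEstimate {

    // Default rule from the quoted answer: one mapper per HDFS block.
    // Uses ceiling division so a partially filled last block still counts.
    static long estimateMappers(long inputBytes, long blockBytes) {
        if (inputBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (inputBytes + blockBytes - 1) / blockBytes;
    }

    // Rule from the quoted answer: the cluster's reduce slots minus a few
    // kept free for failures, so reducers finish in a single wave.
    // Never returns less than 1.
    static int estimateReducers(int reduceSlots, int spareSlots) {
        if (reduceSlots <= 0 || spareSlots < 0) {
            throw new IllegalArgumentException("invalid slot counts");
        }
        return Math.max(1, reduceSlots - spareSlots);
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        long blockBytes = 128L * 1024 * 1024; // assumed HDFS block size

        // 10 GiB of input in 128 MiB blocks
        System.out.println(estimateMappers(10 * oneGiB, blockBytes)); // prints 80

        // assumed cluster with 40 reduce slots, 3 held back
        System.out.println(estimateReducers(40, 3)); // prints 37
    }
}
```

In an actual job the reducer count would then be passed to the framework, for example through `Job.setNumReduceTasks(int)` or the `mapreduce.job.reduces` property, while the mapper count normally follows from the input splits rather than being set directly.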
This question (java: what determines the number of mappers/reducers to use for a given set of data) comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/12932044/