I am trying to use the open-source BlazingCache (http://blazingcache.org/) to implement a coordinator cache for my application.
To test the library, I am using only the WordCount example (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0). Here is my whole code:
public class WordCount2 {
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        //...
        private static Cache<String, String> cache;

        @Override
        public void setup(Context context) throws IOException,
                InterruptedException {
            //...
            initCache();
        }

        private void initCache() {
            CachingProvider provider = Caching.getCachingProvider();
            Properties properties = new Properties();
            properties.put("blazingcache.mode","clustered");
            properties.put("blazingcache.zookeeper.connectstring","localhost:1281");
            properties.put("blazingcache.zookeeper.sessiontimeout","40000");
            properties.put("blazingcache.zookeeper.path","/blazingcache");
            CacheManager cacheManager = provider.getCacheManager(provider.getDefaultURI(), provider.getDefaultClassLoader(), properties);
            MutableConfiguration<String, String> cacheConfiguration = new MutableConfiguration<>();
            cache = cacheManager.createCache("example", cacheConfiguration);
        }

        @Override
        public void map(Object key, Text value, Context context
                ) throws IOException, InterruptedException {
            //...
                cache.put(word.toString(), one.toString());
            }
        }
    }
    //...
}
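The problem is this line:
cache.put(word.toString(), one.toString());
in the map function.
As soon as I add this line, the performance of the whole job drops sharply. (I am running the WordCount example in local mode from Eclipse.)
Why does this happen, and how can I fix it?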
Best answer
If you are testing in local mode (a single JVM), it is better to remove these lines and try again:
properties.put("blazingcache.mode","clustered");
properties.put("blazingcache.zookeeper.connectstring","localhost:1281");
properties.put("blazingcache.zookeeper.sessiontimeout","40000");
properties.put("blazingcache.zookeeper.path","/blazingcache");
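
If the same mapper should also run against a real cluster later, one option is to build the Properties conditionally instead of deleting the lines. The following is only a minimal sketch: the class name CacheProperties, the method build, and the clustered flag are hypothetical names introduced for illustration, while the property keys and values are copied from the question. It uses nothing beyond java.util.Properties.

```java
import java.util.Properties;

// Hypothetical helper (not part of BlazingCache or Hadoop): it only decides
// which of the question's settings are handed to the caching provider.
class CacheProperties {

    // Returns the clustered settings from the question when clustered is true,
    // and an empty Properties object otherwise, which matches the answer's
    // advice to remove those lines for a single-JVM test.
    static Properties build(boolean clustered) {
        Properties properties = new Properties();
        if (clustered) {
            properties.setProperty("blazingcache.mode", "clustered");
            properties.setProperty("blazingcache.zookeeper.connectstring", "localhost:1281");
            properties.setProperty("blazingcache.zookeeper.sessiontimeout", "40000");
            properties.setProperty("blazingcache.zookeeper.path", "/blazingcache");
        }
        return properties;
    }

    public static void main(String[] args) {
        // Print how many settings each variant carries.
        System.out.println("local settings: " + build(false).size());
        System.out.println("clustered settings: " + build(true).size());
    }
}
```

In initCache(), the result of CacheProperties.build(false) would be passed to provider.getCacheManager(...) in place of the hand-built properties object. How the flag is chosen is left open here; reading it from the job configuration (for example by checking Hadoop's mapreduce.framework.name setting) is one possibility, but that wiring is an assumption and not something the answer prescribes.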
For more on "java - Using the open-source BlazingCache in Hadoop degrades performance", see the similar question on Stack Overflow: https://stackoverflow.com/questions/37725470/