I want to use lambda functions to compute the per-key average of a JavaPairRDD<Integer, Double> pairs. To that end, I wrote the following code:
java.util.function.Function<Double, Tuple2<Double, Integer>> createAcc = x -> new Tuple2<Double, Integer>(x, 1);
BiFunction<Tuple2<Double, Integer>, Double, Tuple2<Double, Integer>> addAndCount = (Tuple2<Double, Integer> x, Double y) -> { return new Tuple2(x._1()+y, x._2()+1 ); };
BiFunction<Tuple2<Double, Integer>, Tuple2<Double, Integer>, Tuple2<Double, Integer>> combine = (Tuple2<Double, Integer> x, Tuple2<Double, Integer> y) -> { return new Tuple2(x._1()+y._1(), x._2()+y._2() ); };
JavaPairRDD<Integer, Tuple2<Double, Integer>> avgCounts = pairs.combineByKey(createAcc, addAndCount, combine);
However, Eclipse reports this error:
The method combineByKey(Function<Double,C>, Function2<C,Double,C>, Function2<C,C,C>) in the type JavaPairRDD<Integer,Double> is not applicable for the arguments (Function<Double,Tuple2<Double,Integer>>,
BiFunction<Tuple2<Double,Integer>,Double,Tuple2<Double,Integer>>, BiFunction<Tuple2<Double,Integer>,Tuple2<Double,Integer>,Tuple2<Double,Integer>>)
Best answer
The combineByKey method expects Spark's own functional interfaces, org.apache.spark.api.java.function.Function and Function2, not java.util.function.Function and BiFunction (the error message shows the expected signature: Function<Double,C>, Function2<C,Double,C>, Function2<C,C,C>). So write:
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import scala.Tuple2;

// createAcc: turn the first value seen for a key into a (sum, count) accumulator
Function<Double, Tuple2<Double, Integer>> createAcc =
    x -> new Tuple2<>(x, 1);
// addAndCount: fold one more value into an accumulator
Function2<Tuple2<Double, Integer>, Double, Tuple2<Double, Integer>> addAndCount =
    (x, y) -> new Tuple2<>(x._1() + y, x._2() + 1);
// combine: merge two partial accumulators (e.g. from different partitions)
Function2<Tuple2<Double, Integer>, Tuple2<Double, Integer>, Tuple2<Double, Integer>> combine =
    (x, y) -> new Tuple2<>(x._1() + y._1(), x._2() + y._2());

JavaPairRDD<Integer, Tuple2<Double, Integer>> avgCounts =
    pairs.combineByKey(createAcc, addAndCount, combine);
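To see what the three functions do without a Spark cluster, here is a minimal plain-JDK sketch of the same per-key (sum, count) aggregation. It is an illustration of the combineByKey semantics, not Spark code: `Acc` is a hypothetical stand-in for `scala.Tuple2<Double, Integer>`, and `combine` plays the role of merging partial accumulators from different partitions.

```java
import java.util.*;
import java.util.function.*;

public class AvgByKeySketch {
    // Stand-in for scala.Tuple2<Double, Integer>: a (sum, count) accumulator
    record Acc(double sum, int count) {}

    static Map<Integer, Double> averageByKey(List<Map.Entry<Integer, Double>> pairs) {
        // The same three functions as in the Spark snippet above
        Function<Double, Acc> createAcc = x -> new Acc(x, 1);
        BiFunction<Acc, Double, Acc> addAndCount =
            (acc, y) -> new Acc(acc.sum() + y, acc.count() + 1);
        BiFunction<Acc, Acc, Acc> combine =
            (a, b) -> new Acc(a.sum() + b.sum(), a.count() + b.count());

        // Fold every (key, value) pair into a per-key accumulator;
        // merge() uses combine to join a fresh accumulator with an existing one,
        // mimicking how Spark merges partial results across partitions
        Map<Integer, Acc> accs = new HashMap<>();
        for (var p : pairs) {
            accs.merge(p.getKey(), createAcc.apply(p.getValue()), combine::apply);
        }

        // Final step: derive the average from each (sum, count) accumulator
        Map<Integer, Double> avg = new HashMap<>();
        accs.forEach((k, a) -> avg.put(k, a.sum() / a.count()));
        return avg;
    }

    public static void main(String[] args) {
        var avg = averageByKey(List.of(
            Map.entry(1, 2.0), Map.entry(1, 4.0), Map.entry(2, 10.0)));
        System.out.println(avg); // key 1 -> 3.0, key 2 -> 10.0
    }
}
```

In Spark itself, the last step corresponds to `avgCounts.mapValues(t -> t._1() / t._2())`, which turns each (sum, count) pair into the per-key average.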
Regarding "java - Spark combineByKey JAVA lambda expressions", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28806792/