java - Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class (Java)

Tags: java scala apache-spark apache-kafka

I am running a Spark Streaming program written in Java that reads data from Kafka, but I get this error. I am trying to figure out whether it is because the Scala or Java version I am using is too old. I am using JDK 15 and still get the error. Can anyone help me resolve it? Thanks.

This is the terminal output when I run the project:

Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class
        at org.apache.spark.streaming.kafka010.PreferConsistent$.<init>(LocationStrategy.scala:42)
        at org.apache.spark.streaming.kafka010.PreferConsistent$.<clinit>(LocationStrategy.scala)
        at org.apache.spark.streaming.kafka010.LocationStrategies$.PreferConsistent(LocationStrategy.scala:66)
        at org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent(LocationStrategy.scala)
        at demo.KafkaDemo.main(KafkaDemo.java:47)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
        at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:435)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
        ... 17 more
21/05/31 14:42:51 INFO SparkContext: Invoking stop() from shutdown hook
21/05/31 14:42:51 INFO SparkUI: Stopped Spark web UI at http://192.168.1.24:4040
21/05/31 14:42:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/05/31 14:42:51 INFO MemoryStore: MemoryStore cleared
21/05/31 14:42:51 INFO BlockManager: BlockManager stopped
21/05/31 14:42:51 INFO BlockManagerMaster: BlockManagerMaster stopped
21/05/31 14:42:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/05/31 14:42:51 INFO SparkContext: Successfully stopped SparkContext
21/05/31 14:42:51 INFO ShutdownHookManager: Shutdown hook called
21/05/31 14:42:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-9bf7b2b8-aa48-4d13-91d6-7efd096200ef
21/05/31 14:42:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-0cabea66-391d-4376-b851-02b923209992

This is the project's pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>TikiData</groupId>
<artifactId>TikiData</artifactId>
<version>V1</version>
<dependencies>
    <!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.8.6</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>3.1.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.0.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.3.3</version>
        <scope>provided</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.12.12</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-compiler -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.12.12</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-reflect -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-reflect</artifactId>
        <version>2.12.13</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>2.7.0</version>
    </dependency>
</dependencies>
<build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.1</version>
            <configuration>
                <release>15</release>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                    <configuration>
                        <archive>
                            <manifest>
                                <mainClass>
                                    demo.KafkaDemo
                                </mainClass>
                            </manifest>
                        </archive>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>4.4.0</version>
            <configuration>
                <scalaVersion>2.12.2</scalaVersion>
            </configuration>
        </plugin>
    </plugins>
</build>
</project>

This is the project's main class:

package demo;

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import scala.Tuple2;

public class KafkaDemo {
    public static void main(String[] args) throws InterruptedException {
        // Create a local StreamingContext with a batch interval of 10 seconds
        SparkConf conf = new SparkConf().setMaster("local").setAppName("Kafka Spark Integration");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Define the Kafka parameters
        Map<String, Object> kafkaParams = new HashMap<String, Object>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "0");
        // Automatically reset the offset to the earliest offset
        kafkaParams.put("auto.offset.reset", "earliest");
        kafkaParams.put("enable.auto.commit", false);

        // Define the list of Kafka topics to subscribe to
        Collection<String> topics = Arrays.asList("hello-kafka");

        // Create an input DStream that consumes messages from the Kafka topics
        JavaInputDStream<ConsumerRecord<String, String>> stream;
        stream = KafkaUtils.createDirectStream(jssc,LocationStrategies.PreferConsistent(),ConsumerStrategies.Subscribe(topics, kafkaParams));


        // Read the value of each message from Kafka
        JavaDStream<String> lines = stream.map((Function<ConsumerRecord<String, String>, String>) kafkaRecord -> kafkaRecord.value());

        // Split message into words
        JavaDStream<String> words = lines.flatMap((FlatMapFunction<String, String>) line -> Arrays.asList(line.split(" ")).iterator());

        // Map every word to a tuple of (word, 1)
        JavaPairDStream<String,Integer> wordMap = words.mapToPair((PairFunction<String, String, Integer>) word -> new Tuple2<>(word,1));

        // Count the occurrences of each word
        JavaPairDStream<String,Integer> wordCount = wordMap.reduceByKey((Function2<Integer, Integer, Integer>) (first, second) -> first+second);

        // Print the word counts
        wordCount.print();

        // Start the computation
        jssc.start();
        jssc.awaitTermination();
    }
} 
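For reference, a typical way to build and submit this job, given the assembly configuration above, would be something like the following (the jar name is inferred from the pom's artifactId and version, and is an assumption):

    mvn clean package
    spark-submit --class demo.KafkaDemo target/TikiData-V1-jar-with-dependencies.jar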

Best answer

A mismatch between the Spark and Scala versions is the cause of this. The missing class, scala.Product$class, only exists in Scala 2.11 and earlier (Scala 2.12 changed how traits are compiled and dropped the *$class implementation classes), so a Spark artifact built for Scala 2.11 fails exactly this way on a Scala 2.12 classpath. If you use the following set of dependencies, the problem should be resolved.

One observation of mine (which may not be 100% correct) is that whenever I have spark-core_2.11 (or any spark-xxxx_2.11) but the scala-library version is 2.12.x, I always run into problems. An easy rule of thumb: if you have spark-xxxx_2.11, use scala-library 2.11.x rather than 2.12.x, because the suffix on the artifactId is the Scala binary version the artifact was compiled against.
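To check for this kind of mismatch, one option (a standard maven-dependency-plugin feature, not something from the original answer) is to filter the dependency tree down to the Scala and Spark artifacts:

    mvn dependency:tree -Dincludes=org.scala-lang,org.apache.spark

Every org.apache.spark artifactId should carry the same _2.11 or _2.12 suffix, and the org.scala-lang versions should match that suffix.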

Please also fix the scala-reflect and scala-compiler versions to 2.11.x (a sketch of those entries follows the dependency list below).

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
        <version>2.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.11</artifactId>
        <version>2.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.11.8</version>
    </dependency>
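The list above does not repeat the scala-reflect and scala-compiler entries mentioned earlier; a minimal sketch, assuming they should track the scala-library version used here (2.11.8):

    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-reflect</artifactId>
        <version>2.11.8</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.11.8</version>
    </dependency>

Alternatively, since the question's pom already uses spark-sql_2.12 3.1.1, the mismatch can be fixed in the other direction by upgrading the streaming artifacts to Scala 2.12 builds instead of downgrading everything to 2.11 (versions chosen here to match the existing spark-sql artifact):

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.12</artifactId>
        <version>3.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
        <version>3.1.1</version>
    </dependency>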

Regarding java - Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class (Java), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/67769876/
