apache-spark - SparkConf class not found

Tags: apache-spark

I am trying to analyze a Facebook network with Spark. When I create the SparkConf object, I get the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf.

I am using Scala version 2.11, which should work fine. My IntelliJ IDEA also has the Scala SDK 2.12 loaded. The code in question is:

import org.apache.spark.SparkConf
import org.apache.spark.graphx.GraphLoader
import org.apache.spark.sql.SparkSession

object Sample {

   def main(args: Array[String]): Unit = {
    print("Hello World")

    val conf = new SparkConf().setMaster("local[2]")
    val spark = SparkSession
      .builder
      .appName("SampleApp").config(conf)
      .getOrCreate()
    val sc = spark.sparkContext

    val graph = GraphLoader.edgeListFile(sc, "facebook_combined.txt")

    val ranks = graph.pageRank(1).vertices

    print(ranks.collect().mkString("\n"))

    spark.stop()

  }
}

My pom.xml file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>Sample</groupId>
<artifactId>CMPE256</artifactId>
<version>1.0-SNAPSHOT</version>



<properties>
    <spark.version>2.1.1</spark.version>
    <scala.dep.version>2.11</scala.dep.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.dep.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.dep.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-graphx_${scala.dep.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.6.6</version>
    </dependency>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.14</version>
    </dependency>
</dependencies>



<build>
    <plugins>
        <!-- mixed scala/java compile -->
        <plugin>
            <groupId>org.scala-tools</groupId>
            <artifactId>maven-scala-plugin</artifactId>
            <version>2.15.2</version>
            <executions>
                <execution>
                    <id>compile</id>
                    <goals>
                        <goal>compile</goal>
                    </goals>
                    <phase>compile</phase>
                </execution>
                <execution>
                    <id>test-compile</id>
                    <goals>
                        <goal>testCompile</goal>
                    </goals>
                    <phase>test-compile</phase>
                </execution>
                <execution>
                    <phase>process-resources</phase>
                    <goals>
                        <goal>compile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.6.1</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
    </plugins>
  </build>
</project>

Best Answer

Comment out <scope>provided</scope> in your pom.xml; that should fix the problem.

For more information on why, see: MavenDoc
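As a minimal sketch of the change, reusing the artifact and version properties from the pom above, the spark-core dependency would look like this once the provided scope is commented out; the same edit applies to the spark-sql and spark-graphx entries:

    <!-- Spark core, now bundled on the runtime classpath instead of being
         expected to be supplied by the environment -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.dep.version}</artifactId>
        <version>${spark.version}</version>
        <!-- <scope>provided</scope> -->
    </dependency>

With provided scope, Maven assumes the runtime environment (for example a cluster you submit to with spark-submit) will supply the Spark jars, so they are left off the classpath when you run main directly from the IDE. That is why the code compiles but fails at runtime with NoClassDefFoundError for SparkConf.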

Regarding apache-spark - SparkConf class not found, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47623275/
