Whenever I run any Apache Pig code from the terminal, everything works and I get results, so I concluded that my Pig 0.15.0 and Hadoop 2.7.0 installations are fine. The problem occurs when I run PigServer from Java code:
PigServer pigServer = new PigServer(ExecType.MAPREDUCE, conf); // conf, JobId, scriptUrl, params are defined elsewhere
pigServer.setBatchOn();
pigServer.debugOff();
pigServer.setJobName(JobId);
pigServer.registerScript(scriptUrl, params);
pigServer.executeBatch();
My Maven dependencies are:
<dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.15.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.0</version>
</dependency>
I get the following error:
WARN org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
at java.lang.Class.getDeclaredField(Class.java:1948)
at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:100)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:313)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:199)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:277)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1367)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1352)
at org.apache.pig.PigServer.execute(PigServer.java:1341)
at org.apache.pig.PigServer.executeBatch(PigServer.java:392)
at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:479)
I used to run the code above on Hadoop 1 and it worked fine, but it no longer does.
Best answer
By default, Pig assumes it is running against Hadoop 0.20, so when it is actually running against a newer version you get this error.
You can run Pig with a different version of Hadoop by setting HADOOP_HOME to point to the directory where you have Hadoop installed. If you do not set HADOOP_HOME, by default Pig will run with the embedded version, currently Hadoop 0.20.2. -- from the Apache Pig site: https://pig.apache.org/docs/r0.9.2/start.html
Setting HADOOP_HOME in Eclipse:
Run Configurations --> ClassPath --> User Entries --> Advanced --> Add ClassPath Variables --> New --> Name (HADOOP_HOME) --> Path (your Hadoop directory path)
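Since the symptom here is Pig silently falling back to its embedded Hadoop, it can help to verify that HADOOP_HOME is actually visible to the JVM that constructs PigServer. The sketch below is a minimal, hypothetical sanity check (not part of the original answer); the class and method names are my own:

```java
// Sanity check: confirm HADOOP_HOME reaches the JVM before creating PigServer.
public class HadoopHomeCheck {

    // Returns true only if HADOOP_HOME is set to a non-empty value.
    static boolean hadoopHomeIsSet() {
        String hadoopHome = System.getenv("HADOOP_HOME");
        return hadoopHome != null && !hadoopHome.isEmpty();
    }

    public static void main(String[] args) {
        if (hadoopHomeIsSet()) {
            System.out.println("HADOOP_HOME = " + System.getenv("HADOOP_HOME"));
        } else {
            System.err.println("HADOOP_HOME is not set; Pig will fall back to its embedded Hadoop version.");
        }
    }
}
```

Running this with the same Run Configuration you use for the PigServer code shows whether the variable configured in Eclipse actually made it into the process environment.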
Required Maven dependencies:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.7.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.7.1</version>
    </dependency>
    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.4</version>
    </dependency>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.16</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pig</artifactId>
        <version>0.15.0</version>
    </dependency>
    <dependency>
        <groupId>org.antlr</groupId>
        <artifactId>antlr-runtime</artifactId>
        <version>3.4</version>
    </dependency>
</dependencies>
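One additional avenue worth noting (an assumption on my part, not stated in the original answer): as far as I can tell, the plain pig artifact on Maven Central is built against the Hadoop 1 API, and an h2-classified build is published for Hadoop 2. If setting HADOOP_HOME alone does not resolve the error, swapping in the Hadoop 2 build of the jar may help:

<dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.15.0</version>
    <classifier>h2</classifier> <!-- Hadoop-2-compatible build of Pig; verify availability for your version -->
</dependency>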
Regarding "hadoop - Embedded Pig error when running Pig 0.15 on Hadoop 2", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31999398/