java - "Not a valid JAR" when running wordcount on the hadoop-examples jar file

Tags: java hadoop mapreduce word-count

I am trying to run the wordcount example, but I get an error saying the jar file is not valid:

hduser@tong-VirtualBox:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-1.0.3.jar wordcount /user/hduser/Text /user/hduser/Text-output
Not a valid JAR: /usr/local/hadoop/hadoop-examples-1.0.3.jar

How can I fix this? Where can I find these example jar files? I actually have the source code for word count. How do I create the jar file myself?

hduser@tong-VirtualBox:/usr/local/hadoop$ bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hduser/Text /user/hduser/Text-out
15/10/14 15:41:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/14 15:41:18 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/10/14 15:41:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/10/14 15:41:19 INFO input.FileInputFormat: Total input paths to process : 1
15/10/14 15:41:19 INFO mapreduce.JobSubmitter: number of splits:1
15/10/14 15:41:19 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1980044638_0001
15/10/14 15:41:20 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/10/14 15:41:20 INFO mapreduce.Job: Running job: job_local1980044638_0001
15/10/14 15:41:20 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/10/14 15:41:20 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
15/10/14 15:41:20 INFO mapred.LocalJobRunner: Waiting for map tasks
15/10/14 15:41:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1980044638_0001_m_000000_0
15/10/14 15:41:20 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
15/10/14 15:41:20 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/Text/Text:0+0
15/10/14 15:41:20 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/10/14 15:41:20 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/10/14 15:41:20 INFO mapred.MapTask: soft limit at 83886080
15/10/14 15:41:20 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/10/14 15:41:20 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/10/14 15:41:20 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/10/14 15:41:21 INFO mapred.MapTask: Starting flush of map output
15/10/14 15:41:21 INFO mapred.LocalJobRunner: map task executor complete.
15/10/14 15:41:21 WARN mapred.LocalJobRunner: job_local1980044638_0001
java.lang.Exception: java.io.FileNotFoundException: Path is not a file: /user/hduser/Text/Text
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:70)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.FileNotFoundException: Path is not a file: /user/hduser/Text/Text
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:70)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1222)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1210)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1200)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:271)
	at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:238)
	at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:231)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1498)
	at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
	at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:545)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:783)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path is not a file: /user/hduser/Text/Text
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:70)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

	at org.apache.hadoop.ipc.Client.call(Client.java:1468)
	at org.apache.hadoop.ipc.Client.call(Client.java:1399)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
	at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1220)
	... 21 more
15/10/14 15:41:21 INFO mapreduce.Job: Job job_local1980044638_0001 running in uber mode : false
15/10/14 15:41:21 INFO mapreduce.Job:  map 0% reduce 0%
15/10/14 15:41:21 INFO mapreduce.Job: Job job_local1980044638_0001 failed with state FAILED due to: NA
15/10/14 15:41:21 INFO mapreduce.Job: Counters: 0

Best answer

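The jar path you passed does not exist, which is why the CLI complains. Make sure you give the full path to the jar. My hadoop-mapreduce-examples-2.7.0.jar is located at /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar, but if you are using the Cloudera distribution or anything else, the location may differ.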
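
If you want to check a candidate path before handing it to bin/hadoop jar, something like the following might help. It is only a sketch: the class name JarPathCheck and the method diagnose are made up for illustration, and the two default paths are simply the ones that appear in the question, so adjust them to your own installation. It uses only the standard java.util.jar.JarFile and java.nio.file classes, nothing from Hadoop itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Hadoop: reports why a path may be rejected as a jar.
class JarPathCheck {

    /** Returns a short diagnosis for the given jar path. */
    public static String diagnose(Path jar) {
        if (!Files.exists(jar)) {
            return "missing";
        }
        if (!Files.isRegularFile(jar)) {
            return "not a regular file";
        }
        // JarFile throws an IOException if the file is not a readable zip/jar archive.
        try (JarFile jf = new JarFile(jar.toFile())) {
            return "ok (" + jf.size() + " entries)";
        } catch (IOException e) {
            return "unreadable as a JAR: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Defaults are the two paths from the question (assumed locations).
        String[] candidates = args.length > 0 ? args : new String[] {
            "/usr/local/hadoop/hadoop-examples-1.0.3.jar",
            "/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + diagnose(Paths.get(candidate)));
        }
    }
}
```

Run it with the paths you intend to use as arguments. A result of "missing" for the first path would be consistent with the diagnosis above, namely that the jar is simply not where the command expects it.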

As for how to create the jar, see this answer.

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/33134196/
