python-2.7 - Permission denied error 13 - Python on Hadoop

Tags: python-2.7 hadoop-streaming

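I am running a simple Python mapper and reducer and getting an error 13, permission denied.

I'm not sure what is going on here and need help. I am new to the Hadoop world.

I am running a simple MapReduce job to count words. The mapper and reducer each run fine on their own on Linux or in Windows PowerShell.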

======================================================================


hadoop@ubuntu:~/hadoop-1.2.1$ bin/hadoop jar contrib/streaming/hadoop-streaming-1.2.1.jar -file /home/hadoop/mapper.py -mapper mapper.py -file /home/hadoop/reducer.py -reducer reducer.py -input /deepw/pg4300.txt -output /deepw/pg3055
Warning: $HADOOP_HOME is deprecated.

packageJobJar: [/home/hadoop/mapper.py, /home/hadoop/reducer.py, /tmp/hadoop-hadoop/hadoop-unjar2961168567699201508/] [] /tmp/streamjob4125164474101219622.jar tmpDir=null
15/09/23 14:39:16 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/09/23 14:39:16 WARN snappy.LoadSnappy: Snappy native library not loaded
15/09/23 14:39:16 INFO mapred.FileInputFormat: Total input paths to process : 1
15/09/23 14:39:16 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-hadoop/mapred/local]
15/09/23 14:39:16 INFO streaming.StreamJob: Running job: job_201509231312_0003
15/09/23 14:39:16 INFO streaming.StreamJob: To kill this job, run:
15/09/23 14:39:16 INFO streaming.StreamJob: /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -Dmapred.job.tracker=192.168.56.102:9001 -kill job_201509231312_0003
15/09/23 14:39:16 INFO streaming.StreamJob: Tracking URL: http://192.168.56.102:50030/jobdetails.jsp?jobid=job_201509231312_0003
15/09/23 14:39:17 INFO streaming.StreamJob: map 0% reduce 0%
15/09/23 14:39:41 INFO streaming.StreamJob: map 100% reduce 100%
15/09/23 14:39:41 INFO streaming.StreamJob: To kill this job, run:
15/09/23 14:39:41 INFO streaming.StreamJob: /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -Dmapred.job.tracker=192.168.56.102:9001 -kill job_201509231312_0003
15/09/23 14:39:41 INFO streaming.StreamJob: Tracking URL: http://192.168.56.102:50030/jobdetails.jsp?jobid=job_201509231312_0003
15/09/23 14:39:41 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201509231312_0003_m_000000
15/09/23 14:39:41 INFO streaming.StreamJob: killJob...
Streaming Command Failed!

================================================================
java.io.IOException: Cannot run program "/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201509231312_0003/attempt_201509231312_0003_m_000001_3/work/./mapper.py": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 24 more

Best answer

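Your mapper file does not appear to be executable. With -mapper mapper.py the file is launched directly as a program (the stack trace shows ProcessBuilder failing to start .../work/./mapper.py), so it needs the execute permission. Try chmod a+x mapper.py before submitting the job.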
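If you would rather set the permission from Python than from the shell, the following is a minimal sketch of the equivalent of chmod a+x. The helper name make_executable is hypothetical and not part of Hadoop; the path in the usage note is the one from the question.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x).

    Hypothetical helper for illustration; returns True if the current
    user can execute the file afterwards.
    """
    current_mode = os.stat(path).st_mode
    # OR the three execute bits into the existing mode, leaving the
    # read/write bits untouched.
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return os.access(path, os.X_OK)
```

For example, make_executable("/home/hadoop/mapper.py") would be called on the local copy before submitting the job. A script started this way also needs an interpreter line such as #!/usr/bin/env python as its first line.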

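Alternatively, in your command, you can replace

-mapper mapper.py

with

-mapper "python mapper.py"

so that the Python interpreter is started and reads mapper.py as an ordinary file, which does not require the execute permission.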
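To make the difference between the two forms concrete, here is a rough sketch that assembles the streaming command from the question as an argument list. The function name build_streaming_command is hypothetical, and applying the same change to the reducer is an assumption, since the answer only mentions the mapper.

```python
def build_streaming_command(run_with_interpreter=True):
    """Return the hadoop streaming command from the question as a list.

    Hypothetical helper for illustration. When run_with_interpreter is
    True, the scripts are passed to `python` instead of being executed
    directly, so they do not need the execute permission.
    """
    if run_with_interpreter:
        mapper = "python mapper.py"
        reducer = "python reducer.py"  # assumption: same fix for the reducer
    else:
        mapper = "mapper.py"
        reducer = "reducer.py"
    # Paths and jar name are copied from the command in the question.
    return [
        "bin/hadoop", "jar", "contrib/streaming/hadoop-streaming-1.2.1.jar",
        "-file", "/home/hadoop/mapper.py", "-mapper", mapper,
        "-file", "/home/hadoop/reducer.py", "-reducer", reducer,
        "-input", "/deepw/pg4300.txt", "-output", "/deepw/pg3055",
    ]
```

Passing a list like this to subprocess.call from the Hadoop install directory avoids shell quoting problems, because "python mapper.py" stays a single argument to -mapper.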

Regarding "python-2.7 - Permission denied error 13 - Python on Hadoop", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32735668/
