python-2.7 - Hadoop command fails under python3 but works on python 2.7

Tags: python-2.7 python-3.x hadoop mrjob

I have a MacBook Pro on which I installed Hadoop 2.7.3 following this video: https://www.youtube.com/watch?v=06hpB_Rfv-w

I am trying to run a Hadoop MRJob command through python3, and it gives me this error:

bhoots21304s-MacBook-Pro:2.7.3 bhoots21304$ python3 /Users/bhoots21304/PycharmProjects/untitled/MRJobs/Mr_Jobs.py -r hadoop /Users/bhoots21304/PycharmProjects/untitled/MRJobs/File.txt
No configs found; falling back on auto-configuration
Looking for hadoop binary in /usr/local/Cellar/hadoop/2.7.3/bin...
Found hadoop binary: /usr/local/Cellar/hadoop/2.7.3/bin/hadoop
Using Hadoop version 2.7.3
Looking for Hadoop streaming jar in /usr/local/Cellar/hadoop/2.7.3...
Found Hadoop streaming jar: /usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar
Creating temp directory /var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/T/Mr_Jobs.bhoots21304.20170328.165022.965610
Copying local files to hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/files/...
Running step 1 of 1...
  Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  packageJobJar: [/var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/T/hadoop-unjar5078580082326840824/] [] /var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/T/streamjob2711596457025539343.jar tmpDir=null
  Connecting to ResourceManager at /0.0.0.0:8032
  Connecting to ResourceManager at /0.0.0.0:8032
  Total input paths to process : 1
  number of splits:2
  Submitting tokens for job: job_1490719699504_0003
  Submitted application application_1490719699504_0003
  The url to track the job: http://bhoots21304s-MacBook-Pro.local:8088/proxy/application_1490719699504_0003/
  Running job: job_1490719699504_0003
  Job job_1490719699504_0003 running in uber mode : false
   map 0% reduce 0%
  Task Id : attempt_1490719699504_0003_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

  Task Id : attempt_1490719699504_0003_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1490719699504_0003_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

  Task Id : attempt_1490719699504_0003_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1490719699504_0003_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1490719699504_0003_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

   map 100% reduce 100%
  Job job_1490719699504_0003 failed with state FAILED due to: Task failed task_1490719699504_0003_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

  Job not successful!
  Streaming Command Failed!
Counters: 17
    Job Counters 
        Data-local map tasks=2
        Failed map tasks=7
        Killed map tasks=1
        Killed reduce tasks=1
        Launched map tasks=8
        Other local map tasks=6
        Total megabyte-milliseconds taken by all map tasks=18991104
        Total megabyte-milliseconds taken by all reduce tasks=0
        Total time spent by all map tasks (ms)=18546
        Total time spent by all maps in occupied slots (ms)=18546
        Total time spent by all reduce tasks (ms)=0
        Total time spent by all reduces in occupied slots (ms)=0
        Total vcore-milliseconds taken by all map tasks=18546
        Total vcore-milliseconds taken by all reduce tasks=0
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
Scanning logs for probable cause of failure...
Looking for history log in hdfs:///tmp/hadoop-yarn/staging...
STDERR: 17/03/28 22:21:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
STDERR: ls: `hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/output/_logs': No such file or directory
STDERR: 17/03/28 22:21:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
STDERR: ls: `hdfs:///tmp/hadoop-yarn/staging/userlogs/application_1490719699504_0003': No such file or directory
STDERR: 17/03/28 22:21:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
STDERR: ls: `hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/output/_logs/userlogs/application_1490719699504_0003': No such file or directory
Probable cause of failure:

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Step 1 of 1 failed: Command '['/usr/local/Cellar/hadoop/2.7.3/bin/hadoop', 'jar', '/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar', '-files', 'hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/files/Mr_Jobs.py#Mr_Jobs.py,hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/files/mrjob.zip#mrjob.zip,hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/files/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/files/File.txt', '-output', 'hdfs:///user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.165022.965610/output', '-mapper', 'sh -ex setup-wrapper.sh python3 Mr_Jobs.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python3 Mr_Jobs.py --step-num=0 --reducer']' returned non-zero exit status 256.

The problem is that if I run the same command with python2.7, it runs fine and prints the correct output.

I have added Python3 in my bash_profile:

export JAVA_HOME=$(/usr/libexec/java_home)

export PATH=/usr/local/bin:$PATH
export PATH=/usr/local/bin:/usr/local/sbin:$PATH

# Setting PATH for Python 2.6
PATH="/System/Library/Frameworks/Python.framework/Versions/2.6/bin:${PATH}"
export PATH

# Setting PATH for Python 2.7
PATH="/System/Library/Frameworks/Python.framework/Versions/2.7/bin:${PATH}"
export PATH

# added by Anaconda2 4.2.0 installer
export PATH="/Users/bhoots21304/anaconda/bin:$PATH"

export HADOOP_HOME=/usr/local/Cellar/hadoop/2.7.3
export PATH=$HADOOP_HOME/bin:$PATH

export HIVE_HOME=/usr/local/Cellar/hive/2.1.0/libexec
export PATH=$HIVE_HOME:$PATH

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/libexec/share/hadoop/common 
export PATH=$HADOOP_COMMON_LIB_NATIVE_DIR:$PATH

export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/libexec/share/hadoop"
export PATH=$HADOOP_OPTS:$PATH

export PYTHONPATH="$PYTHONPATH:/usr/local/Cellar/python3/3.6.1/bin"

# Setting PATH for Python 3.6
# The original version is saved in .bash_profile.pysave
PATH="/usr/local/Cellar/python3/3.6.1/bin:${PATH}"
export PATH

Here is my MR_Jobs.py:

from mrjob.job import MRJob
import re

WORD_RE = re.compile(r"[\w']+")


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def combiner(self, word, counts):
        yield (word, sum(counts))

    def reducer(self, word, counts):
        yield (word, sum(counts))


if __name__ == '__main__':
    MRWordFreqCount.run()
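As a quick sanity check of the same word-count logic without Hadoop or mrjob involved at all, the mapper/combiner/reducer steps above boil down to plain tokenize-lowercase-and-count (a standalone sketch, stdlib only):

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[\w']+")

def word_freq(lines):
    # Equivalent of the mapper (tokenize + lowercase) and the
    # reducer (sum the per-word counts) in MRWordFreqCount.
    counts = Counter()
    for line in lines:
        for word in WORD_RE.findall(line):
            counts[word.lower()] += 1
    return dict(counts)

print(word_freq(["Hello hadoop", "hello python"]))
# {'hello': 2, 'hadoop': 1, 'python': 1}
```

If this produces the expected counts locally, the job logic itself is fine and the failure is in how the cluster launches the python3 subprocess.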


I run it on hadoop using this command:

python3 /Users/bhoots21304/PycharmProjects/untitled/MRJobs/Mr_Jobs.py -r hadoop /Users/bhoots21304/PycharmProjects/untitled/MRJobs/File.txt

If I run the same file with the above command on my Ubuntu machine, it works, but when I run the same thing on my Mac it gives me an error.

Here are the logs from my Mac machine:

2017-03-28 23:05:51,751 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-03-28 23:05:51,863 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-03-28 23:05:51,965 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-03-28 23:05:51,965 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2017-03-28 23:05:51,976 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2017-03-28 23:05:51,976 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1490719699504_0005, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@209da20d)
2017-03-28 23:05:52,254 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2017-03-28 23:05:52,632 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2017-03-28 23:05:52,632 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2017-03-28 23:05:52,632 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

+ __mrjob_PWD=/tmp/nm-local-dir/usercache/bhoots21304/appcache/application_1490719699504_0005/container_1490719699504_0005_01_000010
+ exec
+ python3 -c 'import fcntl; fcntl.flock(9, fcntl.LOCK_EX)'
setup-wrapper.sh: line 6: python3: command not found


2017-03-28 23:05:47,691 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-03-28 23:05:47,802 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-03-28 23:05:47,879 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-03-28 23:05:47,879 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2017-03-28 23:05:47,889 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2017-03-28 23:05:47,889 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1490719699504_0005, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@209da20d)
2017-03-28 23:05:48,079 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2017-03-28 23:05:48,316 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /tmp/nm-local-dir/usercache/bhoots21304/appcache/application_1490719699504_0005
2017-03-28 23:05:48,498 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2017-03-28 23:05:48,805 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2017-03-28 23:05:48,810 INFO [main] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
2017-03-28 23:05:48,810 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : null
2017-03-28 23:05:48,908 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://localhost:9000/user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.173517.724664/files/File.txt:0+32
2017-03-28 23:05:48,923 INFO [main] org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2017-03-28 23:05:48,983 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
2017-03-28 23:05:48,984 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 100
2017-03-28 23:05:48,984 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 83886080
2017-03-28 23:05:48,984 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 104857600
2017-03-28 23:05:48,984 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 26214396; length = 6553600
2017-03-28 23:05:48,989 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2017-03-28 23:05:49,001 INFO [main] org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/bin/sh, -ex, setup-wrapper.sh, python3, Mr_Jobs.py, --step-num=0, --mapper]
2017-03-28 23:05:49,010 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.work.output.dir is deprecated. Instead, use mapreduce.task.output.dir
2017-03-28 23:05:49,010 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start
2017-03-28 23:05:49,011 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: job.local.dir is deprecated. Instead, use mapreduce.job.local.dir
2017-03-28 23:05:49,011 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
2017-03-28 23:05:49,011 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2017-03-28 23:05:49,011 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
2017-03-28 23:05:49,011 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
2017-03-28 23:05:49,012 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2017-03-28 23:05:49,012 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2017-03-28 23:05:49,012 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length
2017-03-28 23:05:49,012 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
2017-03-28 23:05:49,012 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
2017-03-28 23:05:49,013 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
2017-03-28 23:05:49,025 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2017-03-28 23:05:49,026 INFO [Thread-14] org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2017-03-28 23:05:49,027 INFO [main] org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed!
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2017-03-28 23:05:49,028 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2017-03-28 23:05:49,031 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2017-03-28 23:05:49,035 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://localhost:9000/user/bhoots21304/tmp/mrjob/Mr_Jobs.bhoots21304.20170328.173517.724664/output/_temporary/1/_temporary/attempt_1490719699504_0005_m_000000_2
2017-03-28 23:05:49,140 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2017-03-28 23:05:49,141 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2017-03-28 23:05:49,141 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

Mar 28, 2017 11:05:33 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Mar 28, 2017 11:05:33 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Mar 28, 2017 11:05:33 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Mar 28, 2017 11:05:33 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Mar 28, 2017 11:05:33 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Mar 28, 2017 11:05:34 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Mar 28, 2017 11:05:34 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Server).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
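The decisive line in the logs above is `setup-wrapper.sh: line 6: python3: command not found`: the YARN container runs the mapper under a non-login `/bin/sh`, which never sources `~/.bash_profile`, so the PATH entries added there are invisible to the task and `python3` cannot be resolved (hence streaming's exit code 127). The resolution step the container's shell performs can be mimicked with `shutil.which` (an illustration; the binary name below is deliberately fake):

```python
import shutil

# shutil.which resolves a name against PATH, much like the container's
# /bin/sh does when setup-wrapper.sh invokes `python3`.
print(shutil.which("sh"))                  # an absolute path on any POSIX system
print(shutil.which("no-such-binary-xyz"))  # None -> "command not found", exit 127
```

This is why the fix below uses an absolute `python_bin` path instead of relying on PATH.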

Best Answer

Just create a ~/.mrjob.conf file with the following contents:

runners:
  hadoop:
    python_bin: /usr/local/bin/python3
    hadoop_bin: /usr/local/opt/hadoop/bin/hadoop
    hadoop_streaming_jar: /usr/local/opt/hadoop/libexec/share/hadoop/tools/lib/hadoop-streaming-*.jar

Then run your program with this command:

python3 your_program.py -r hadoop input.txt

Regarding "python-2.7 - Hadoop command fails under python3 but works on python 2.7", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43075674/
