python - mrjob NoFileFound exception on a Cloudera CDH 5 cluster

Tags: python hadoop cloudera mrjob

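I get this error when I try to run the mrjob example on a Hadoop cluster.
I have set HADOOP_HOME, and I can create a new directory on the HDFS file system.
I can run a Python map-reduce job with Hadoop Streaming. The problem only occurs with mrjob.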

When I run this command:

python mr_word_freq_count.py -r hadoop --hadoop-bin /usr/bin/hadoop -o hdfs:///user/zkdmkrq/out1 hdfs:///user/zkdmkrq/input1

I get:
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /tmp/mr_word_freq_count.zkdmkrq.20150226.172000.917957
writing wrapper script to /tmp/mr_word_freq_count.zkdmkrq.20150226.172000.917957/setup-wrapper.sh
STDERR: mkdir: `hdfs:///user/zkdmkrq/tmp/mrjob/mr_word_freq_count.zkdmkrq.20150226.172000.917957/files/': No such file or directory
Traceback (most recent call last):
  File "mr_word_freq_count.py", line 37, in <module>
    MRWordFreqCount.run()
  File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/lib/python2.6/site-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/lib/python2.6/site-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/lib/python2.6/site-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 238, in _run
    self._upload_local_files_to_hdfs()
  File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 265, in _upload_local_files_to_hdfs
    self._mkdir_on_hdfs(self._upload_mgr.prefix)
  File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 273, in _mkdir_on_hdfs
    self.invoke_hadoop(['fs', '-mkdir', path])
  File "/usr/lib/python2.6/site-packages/mrjob/fs/hadoop.py", line 109, in invoke_hadoop
    raise CalledProcessError(proc.returncode, args)
subprocess.CalledProcessError: Command '['/usr/bin/hadoop', 'fs', '-mkdir', 'hdfs:///user/zkdmkrq/tmp/mrjob/mr_word_freq_count.zkdmkrq.20150226.172000.917957/files/']' returned non-zero exit status 1

Best answer

I actually found a fix for this problem.
I had to change the mrjob/hadoop.py file. The exact fix is described here:

https://github.com/Yelp/mrjob/issues/850

I hope it helps anyone else who runs into this.
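
For context, the command that fails in the traceback is `hadoop fs -mkdir` on a nested path whose parent directories do not exist yet. On Hadoop 2, which CDH 5 ships, `-mkdir` only creates missing parents when given `-p`. The sketch below assumes that this is the cause here and shows what the adjusted command would look like. The contents of the linked issue are not reproduced in this post, so treat this as an illustration rather than the actual patch. `build_mkdir_args` and `mkdir_on_hdfs` are hypothetical helpers, not part of mrjob.

```python
import subprocess


def build_mkdir_args(hadoop_bin, path, create_parents=True):
    """Build the argv list for `hadoop fs -mkdir` (hypothetical helper)."""
    args = [hadoop_bin, 'fs', '-mkdir']
    if create_parents:
        # On Hadoop 2, -p makes mkdir create missing parent directories
        args.append('-p')
    args.append(path)
    return args


def mkdir_on_hdfs(hadoop_bin, path):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_mkdir_args(hadoop_bin, path))
```

Calling `mkdir_on_hdfs('/usr/bin/hadoop', path)` with the `hdfs:///user/zkdmkrq/tmp/mrjob/.../files/` path from the traceback would run the same command mrjob issued, plus `-p`. In the traceback, the corresponding call inside mrjob is `self.invoke_hadoop(['fs', '-mkdir', path])` in `_mkdir_on_hdfs`.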

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/28748933/