python - Error when streaming MapReduce files

Tags: python hadoop mapreduce

Error

hadoop jar ~/hadoop-streaming-0.23.6.jar -files ~/word_mapper.py,word_reducer.py -mapper word_mapper.py -reducer word_reducer.py -input count_of_monte_cristo.txt -output ~/output
packageJobJar: [] [/opt/cloudera/parcels/CDH-5.3.2-1.cdh5.3.2.p711.386/jars/hadoop-streaming-2.5.0-cdh5.3.2.jar] /tmp/streamjob36518    15242888241212.jar tmpDir=null
15/06/08 04:55:00 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 330541 for a565240 on ha-hdfs:nameservice1
15/06/08 04:55:00 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
15/06/08 04:55:00 INFO security.TokenCache: Got dt for hdfs://nameservice1; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (HDFS_DELEGATION_TOKEN token 330541 for a565240)
15/06/08 04:55:00 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm511
15/06/08 04:55:00 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
15/06/08 04:55:00 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
15/06/08 04:55:01 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
15/06/08 04:55:01 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
15/06/08 04:55:01 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 892e1b26e03f741c349833217a4d416bb27eada1]
15/06/08 04:55:01 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
15/06/08 04:55:01 INFO mapred.FileInputFormat: Total input paths to process : 1
15/06/08 04:55:01 INFO mapreduce.JobSubmitter: number of splits:2
15/06/08 04:55:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1430001030776_41947
15/06/08 04:55:01 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (HDFS_DELEGATION_TOKEN token 330541 for a565240)
15/06/08 04:55:01 INFO impl.YarnClientImpl: Submitted application application_1430001030776_41947
15/06/08 04:55:01 INFO mapreduce.Job: The url to track the job: http://dojo3m20003.rtp1.hadoop.fmr.com:8088/proxy/application_1430001030776_41947/
15/06/08 04:55:01 INFO mapreduce.Job: Running job: job_1430001030776_41947
15/06/08 04:55:17 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
15/06/08 04:55:17 INFO mapreduce.Job: Job job_1430001030776_41947 running in uber mode : false
15/06/08 04:55:17 INFO mapreduce.Job:  map 0% reduce 0%
15/06/08 04:55:17 INFO mapreduce.Job: Job job_1430001030776_41947 failed with state FAILED due to:
15/06/08 04:55:17 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!

Why does this happen?
Why can't I stream the word_mapper and word_reducer scripts?

Any help is appreciated.

Best answer

See here for details. It looks like this was fixed recently, so check which version you are running. Happy coding.
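The question does not show word_mapper.py or word_reducer.py, so the following is only a sketch of what a streaming word-count pair commonly looks like. It may help rule out the scripts themselves before looking at the cluster. The function names map_lines and reduce_lines and the sample text are hypothetical, and the sort call merely stands in for Hadoop's shuffle phase.

```python
#!/usr/bin/env python
"""Sketch of word-count logic for Hadoop Streaming.

Hypothetical stand-in for word_mapper.py / word_reducer.py, which the
question does not include.
"""
from itertools import groupby


def map_lines(lines):
    """Yield one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reduce_lines(lines):
    """Sum the counts per word; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Local simulation: sorted() plays the role of Hadoop's shuffle and sort.
sample = ["the count of monte cristo", "the count"]
for record in reduce_lines(sorted(map_lines(sample))):
    print(record)
```

In real streaming scripts, each function would be fed sys.stdin and its records printed to stdout. Because the command passes the script names directly to -mapper and -reducer, each file would also need a shebang line like the one above and the executable bit set. A pipeline such as `cat count_of_monte_cristo.txt | ./word_mapper.py | sort | ./word_reducer.py` is a common way to test the pair outside the cluster.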

Regarding python - Error when streaming MapReduce files, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30703658/
