c# - Strange error! HDInsight Hadoop MapReduce fails with code 255

Tags: c# azure hadoop azure-hdinsight

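I am using Microsoft Azure HDInsight with 1 head node and 1 data node.

If I run the MapReduce program I wrote on a small dataset (85 MB), everything works and I get the desired output in the container/blob. Larger files fail with the error below.

I have read some articles that suggest setting mapreduce.map.memory.mb to "1024" so that the mappers have more memory. Given that I want to process a 190 GB file, and no machine in the cluster has anywhere near that much RAM, I don't see how that scales.

I'm sure I'm missing something small, but does anyone know what I should do to 1) fix this problem and 2) fix it in a way that lets me scale the MapReduce process to large input files without these errors?

With an 8 GB input file, I get the following error: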

15/08/01 05:43:17 INFO mapreduce.Job:  map 56% reduce 0%
15/08/01 05:43:23 INFO mapreduce.Job:  map 57% reduce 0%
15/08/01 05:43:29 INFO mapreduce.Job:  map 58% reduce 0%
15/08/01 05:43:30 INFO mapreduce.Job: Task Id : attempt_1438405138600_0006_m_000010_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

15/08/01 05:43:31 INFO mapreduce.Job:  map 57% reduce 0%
15/08/01 05:43:36 INFO mapreduce.Job:  map 58% reduce 0%
15/08/01 05:43:43 INFO mapreduce.Job:  map 59% reduce 0%
15/08/01 05:43:47 INFO mapreduce.Job:  map 60% reduce 0%
15/08/01 05:43:53 INFO mapreduce.Job:  map 61% reduce 0%
15/08/01 05:43:59 INFO mapreduce.Job:  map 62% reduce 0%
15/08/01 05:44:05 INFO mapreduce.Job:  map 63% reduce 0%
15/08/01 05:44:05 INFO mapreduce.Job: Task Id : attempt_1438405138600_0006_m_000010_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

15/08/01 05:44:06 INFO mapreduce.Job:  map 61% reduce 0%
15/08/01 05:44:07 INFO mapreduce.Job:  map 62% reduce 0%
15/08/01 05:44:16 INFO mapreduce.Job:  map 63% reduce 0%
15/08/01 05:44:20 INFO mapreduce.Job:  map 64% reduce 0%
15/08/01 05:44:27 INFO mapreduce.Job:  map 65% reduce 0%
15/08/01 05:44:35 INFO mapreduce.Job:  map 66% reduce 0%
15/08/01 05:44:41 INFO mapreduce.Job:  map 69% reduce 0%
15/08/01 05:44:48 INFO mapreduce.Job:  map 70% reduce 0%
15/08/01 05:44:48 INFO mapreduce.Job: Task Id : attempt_1438405138600_0006_m_000010_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

15/08/01 05:44:49 INFO mapreduce.Job:  map 68% reduce 0%
15/08/01 05:44:50 INFO mapreduce.Job:  map 71% reduce 0%
15/08/01 05:45:02 INFO mapreduce.Job:  map 72% reduce 0%
15/08/01 05:45:05 INFO mapreduce.Job:  map 75% reduce 0%
15/08/01 05:45:13 INFO mapreduce.Job:  map 76% reduce 0%
15/08/01 05:45:18 INFO mapreduce.Job:  map 77% reduce 0%
15/08/01 05:45:21 INFO mapreduce.Job:  map 78% reduce 0%
15/08/01 05:45:23 INFO mapreduce.Job:  map 100% reduce 100%
15/08/01 05:45:28 INFO mapreduce.Job: Job job_1438405138600_0006 failed with state FAILED due to: Task failed task_1438405138600_0006_m_000010
Job failed as tasks failed. failedMaps:1 failedReduces:0

15/08/01 05:45:28 INFO mapreduce.Job: Counters: 35
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=605590393
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                WASB: Number of bytes read=5915160278
                WASB: Number of bytes written=0
                WASB: Number of read operations=0
                WASB: Number of large read operations=0
                WASB: Number of write operations=0
        Job Counters
                Failed map tasks=4
                Killed map tasks=3
                Launched map tasks=18
                Other local map tasks=3
                Rack-local map tasks=15
                Total time spent by all maps in occupied slots (ms)=1441061
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=1441061
                Total vcore-seconds taken by all map tasks=1441061
                Total megabyte-seconds taken by all map tasks=1475646464
        Map-Reduce Framework
                Map input records=16375
                Map output records=319985
                Map output bytes=5051193751
                Map output materialized bytes=604451353
                Input split bytes=1210
                Combine input records=0
                Spilled Records=319985
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=2836
                CPU time spent (ms)=1003820
                Physical memory (bytes) snapshot=9417678848
                Virtual memory (bytes) snapshot=13221376000
                Total committed heap usage (bytes)=11603542016
        File Input Format Counters
                Bytes Read=5915148288
15/08/01 05:45:28 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!

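Best answer

First: 1024 is 1 GB of RAM. A1, the standard size people usually use when testing, has 1.75 GB; A0 has 768 MB. However, this does not read like a memory problem. If it were, I would expect an error similar to this: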

"Container [pid=container_1406552545451_0009_01_000002,containerID=container_234132_0001_01_000001] is running beyond physical memory limits. Current usage: 569.1 MB of 512 MB physical memory used; 970.1 MB of 1.0 GB virtual memory used. Killing container."
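The distinction between the two failure signatures can be checked mechanically against a task's diagnostic text. A minimal sketch in Python, assuming the English wording of the two messages quoted in this thread; `classify_failure` and its return labels are hypothetical names for illustration, not part of Hadoop:

```python
def classify_failure(message: str) -> str:
    """Roughly classify a failed-task diagnostic by its wording.

    The substrings are taken from the messages quoted in this thread;
    the labels returned are made up for this illustration.
    """
    text = message.lower()
    if "running beyond physical memory limits" in text:
        # YARN killed the container for exceeding its memory allowance.
        return "container-memory"
    if "subprocess failed with code" in text:
        # The streaming mapper/reducer executable exited with a non-zero code.
        return "mapper-exit-code"
    return "unknown"
```

Applied to the log in the question, every failed attempt falls into the second category, which is why raising mapreduce.map.memory.mb is unlikely to be the fix.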

This error output reads to me like a job configuration problem. Have you made sure that the build target is x64 and that the 32-bit option is unchecked? See this thread on MSDN: https://social.msdn.microsoft.com/Forums/en-US/d79befb1-be5d-4c5a-bb05-30ea9fccc475/hdinsight-mapreduce-fails-with-pipemapredwaitoutputthreads-subprocess-failed-with-code-255?forum=hdinsight

Regarding "c# - Strange error! HDInsight Hadoop MapReduce fails with code 255", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31766674/
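The "code 255" in the stack trace is the exit code of the mapper executable that Hadoop streaming launched, so the Java trace itself says nothing about why the mapper stopped. One way to narrow it down is to have the mapper report its own failure on stderr, which typically ends up in the task attempt's logs. The sketch below is in Python for brevity; the asker's mapper is a C# executable, where the same pattern would be a try/catch around the read loop. `run_mapper` and `first_field` are hypothetical names, and the map logic is a placeholder:

```python
import sys
import traceback


def first_field(line: str):
    """Placeholder map logic: emit (first tab-separated field, "1")."""
    return line.split("\t", 1)[0], "1"


def run_mapper(map_fn, stdin, stdout, stderr) -> int:
    """Stream records one at a time and return a process exit code.

    On any exception, the failing input line number and the traceback
    are written to stderr before a non-zero code is returned.
    """
    line_no = 0
    try:
        for line_no, line in enumerate(stdin, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank records
            key, value = map_fn(line)
            stdout.write(f"{key}\t{value}\n")
    except Exception:
        stderr.write(f"mapper failed at input line {line_no}\n")
        traceback.print_exc(file=stderr)
        return 1
    return 0

# As a streaming mapper script, the entry point would be:
#   sys.exit(run_mapper(first_field, sys.stdin, sys.stdout, sys.stderr))
```

Because records are handled one at a time, memory use in this sketch depends on the largest single record rather than on the size of the input file. Each map task also receives only one split of the input, so an 8 GB or 190 GB file does not have to fit in any one machine's RAM.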
