When I run my Cascading job, I get the error:
Split metadata size exceeded 10000000
I tried to raise the limit on a per-job basis by passing the following on the command line:
xxx.jar -D mapreduce.job.split.metainfo.maxsize=30000000
I also tried:
xxx.jar -D mapreduce.jobtracker.split.metainfo.maxsize=30000000
But neither of them works; I still get the same error, so the parameter is apparently not being picked up. I am using Hadoop 2.5. Can anyone point out what I am doing wrong?
Best answer
Could you try setting the following property in conf/mapred-site.xml:
<!-- No limits if set to -1 -->
<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>-1</value>
</property>
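If editing mapred-site.xml on the cluster is not an option, the same keys can also be set per flow in the code that submits the job. The following is only a minimal sketch, assuming Cascading 2.x with the Hadoop planner (HadoopFlowConnector); the Main class and the omitted taps/pipes are hypothetical placeholders:

import java.util.Properties;

import cascading.flow.FlowConnector;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.property.AppProps;

public class Main
{
  public static void main( String[] args )
  {
    Properties properties = new Properties();
    AppProps.setApplicationJarClass( properties, Main.class );

    // Raise (or disable with -1) the split metainfo limit for the MapReduce
    // jobs this flow submits. Both the old and the new Hadoop property names
    // are set, since different Hadoop versions read different keys.
    properties.setProperty( "mapreduce.jobtracker.split.metainfo.maxsize", "-1" );
    properties.setProperty( "mapreduce.job.split.metainfo.maxsize", "-1" );

    FlowConnector flowConnector = new HadoopFlowConnector( properties );
    // ... build taps and pipes, then flowConnector.connect( ... ).complete();
  }
}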
Not sure whether the following helps, but give it a try:
xxx.jar -D mapreduce.jobtracker.split.metainfo.maxsize=-1
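Note also that the `-D key=value` form is only honored when the jar's main class runs its arguments through GenericOptionsParser (typically via ToolRunner); a plain `main()` silently ignores it, which may be why the flag appeared to have no effect. A minimal driver sketch, with a hypothetical MyJob class, would look like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver: because it goes through ToolRunner, any -D key=value
// arguments on the command line are parsed by GenericOptionsParser and
// merged into getConf() before run() is invoked.
public class MyJob extends Configured implements Tool
{
  @Override
  public int run( String[] args ) throws Exception
  {
    Configuration conf = getConf();
    // conf now contains mapreduce.job.split.metainfo.maxsize if it was
    // passed as: hadoop jar xxx.jar MyJob -D mapreduce.job.split.metainfo.maxsize=-1
    // ... configure and submit the job here ...
    return 0;
  }

  public static void main( String[] args ) throws Exception
  {
    System.exit( ToolRunner.run( new MyJob(), args ) );
  }
}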
Reference: https://archive.cloudera.com/cdh/3/hadoop/mapred-default.html
| Name | Default Value | Description |
|------|---------------|-------------|
| mapred.jobtracker.job.history.block.size | 3145728 | The block size of the job history file. Since the job recovery uses job history, its important to dump job history to disk as soon as possible. Note that this is an expert level parameter. The default value is set to 3 MB. |
| mapreduce.jobtracker.split.metainfo.maxsize | 10000000 | The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limits if set to -1. |
Related question on Stack Overflow (Hadoop: Split metadata size exceeded 10000000): https://stackoverflow.com/questions/39039945/