The title says it all. Why does the JobTracker have two separate sections, one showing completed jobs and one showing retired jobs?
Thanks.
Best answer
Job retirement is the normal process by which the JobTracker persists a job to disk and clears it from memory. You can read more details on the Cloudera blog:
Once a job is complete it is kept in memory (up to mapred.jobtracker.completeuserjobs.maximum) and on disk as per the above. There is a configuration value that controls the overall retirement policy of completed jobs:
Key: mapred.jobtracker.retirejob.interval
Default: 24 * 60 * 60 * 1000 (1 day in msecs)
In other words, completed jobs are retired after one day by default. The check for jobs to be retired is done by default every minute and can be controlled with:
Key: mapred.jobtracker.retirejob.check
Default: 60 * 1000 (60s in msecs)
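To make the arithmetic behind these two defaults concrete, here is a minimal Python sketch. It is not JobTracker code: the constant names and the `retirement_window` helper are hypothetical, and it assumes the check fires at a fixed period and takes no time itself.

```python
# Illustrative constants mirroring the defaults quoted above (names are hypothetical).
RETIRE_JOB_INTERVAL_MS = 24 * 60 * 60 * 1000  # mapred.jobtracker.retirejob.interval
RETIRE_JOB_CHECK_MS = 60 * 1000               # mapred.jobtracker.retirejob.check


def retirement_window(finish_ms,
                      interval_ms=RETIRE_JOB_INTERVAL_MS,
                      check_ms=RETIRE_JOB_CHECK_MS):
    """Return (earliest, latest) times in ms at which a job could be retired.

    Assumption: a job becomes eligible once interval_ms has elapsed since it
    finished, and is picked up by the next periodic check, which can be up to
    check_ms later.
    """
    earliest = finish_ms + interval_ms
    latest = earliest + check_ms
    return earliest, latest


if __name__ == "__main__":
    earliest, latest = retirement_window(finish_ms=0)
    print(earliest, latest)
```

Under these assumptions, a job that finishes at time 0 leaves the completed list somewhere between 24 hours and 24 hours plus one minute later.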
The check runs continually while the JobTracker is running. If a job is retired it is simply removed from the in-memory list of the JobTracker (it also removes all Tasks for the job etc.). Jobs are not retired until at least 1 minute (hardcoded in JobTracker.java) after their finish time. The retire call also removes the JobTracker Local (see above) file for the job. All that is left are the two files per retired job in the history directory (hadoop.job.history.location) plus, if enabled, the Per Job files (hadoop.job.history.user.location).
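The difference between a completed job and a retired job can be illustrated with a toy model. This is only a sketch of the behaviour described in the quote, not the actual JobTracker implementation: the `ToyJobTracker` class, its fields, and the job ID are all hypothetical, and the history files are represented by a plain set.

```python
from dataclasses import dataclass, field

# The quote says jobs are never retired within 1 minute of finishing.
MIN_TIME_BEFORE_RETIRE_MS = 60 * 1000


@dataclass
class ToyJobTracker:
    """Hypothetical model: completed jobs live in memory, history stays on disk."""
    retire_interval_ms: int = 24 * 60 * 60 * 1000
    completed: dict = field(default_factory=dict)  # job_id -> finish time (ms)
    history: set = field(default_factory=set)      # stands in for history files

    def job_completed(self, job_id, finish_ms):
        # A finished job is kept in memory and also recorded in the history.
        self.completed[job_id] = finish_ms
        self.history.add(job_id)

    def retire_check(self, now_ms):
        # Retire jobs older than the interval, but never younger than 1 minute.
        threshold = max(self.retire_interval_ms, MIN_TIME_BEFORE_RETIRE_MS)
        retired = [j for j, t in self.completed.items() if now_ms - t > threshold]
        for job_id in retired:
            del self.completed[job_id]  # gone from memory, history is untouched
        return retired


if __name__ == "__main__":
    jt = ToyJobTracker()
    jt.job_completed("job_1", finish_ms=0)
    print(jt.retire_check(now_ms=60 * 60 * 1000))           # one hour later
    print(jt.retire_check(now_ms=24 * 60 * 60 * 1000 + 1))  # just over a day later
    print("job_1" in jt.completed, "job_1" in jt.history)
```

In this model a job shows up as "completed" while it is still in the in-memory map, and as "retired" once only its history entry remains, which is why the two appear in separate sections.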
Regarding hadoop - the difference between completed jobs and retired jobs in Hadoop, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17869490/