hadoop - Difference between completed jobs and retired jobs in Hadoop

Tags: hadoop jobs

The title says it all. Why does the JobTracker have two separate sections, one listing completed jobs and one listing retired jobs?

Thanks.

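Best answer

Job retirement is the normal process by which the JobTracker persists a job to disk and clears it from memory. You can read more details on the Cloudera blog: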

Once a job is complete it is kept in memory (up to mapred.jobtracker.completeuserjobs.maximum) and on disk as per the above. There is a configuration value that controls the overall retirement policy of completed jobs:

Key: mapred.jobtracker.retirejob.interval
Default: 24 * 60 * 60 * 1000 (1 day)

In other words, completed jobs are retired after one day by default. The check for jobs to be retired is done by default every minute and can be controlled with:

Key: mapred.jobtracker.retirejob.check
Default: 60 * 1000 (60s in msecs)

The check runs continually while the JobTracker is running. If a job is retired it is simply removed from the in-memory list of the JobTracker (it also removes all Tasks for the job etc.). Jobs are not retired under at least 1 minute (hardcoded in JobTracker.java) of their finish time. The retire call also removes the JobTracker Local (see above) file for the job. All that is left are the two files per retired job in the history directory (hadoop.job.history.location) plus – if enabled – the Per Job files (hadoop.job.history.user.location).
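To make the timing rules in the quoted passage concrete, here is a minimal Python sketch of the retirement decision. It is a simplified model built only from the description above, not the actual code in JobTracker.java: the `CompletedJob` class, the function names, and the job IDs are hypothetical, and the per-user in-memory limit (mapred.jobtracker.completeuserjobs.maximum) is deliberately not modelled.

```python
from dataclasses import dataclass

# Defaults quoted above, all in milliseconds.
RETIRE_JOB_INTERVAL_MS = 24 * 60 * 60 * 1000  # mapred.jobtracker.retirejob.interval
RETIRE_JOB_CHECK_MS = 60 * 1000               # mapred.jobtracker.retirejob.check
MIN_TIME_BEFORE_RETIRE_MS = 60 * 1000         # the hardcoded one-minute floor


@dataclass
class CompletedJob:
    """Hypothetical stand-in for a finished job held in JobTracker memory."""
    job_id: str
    finish_time_ms: int


def should_retire(job, now_ms, retire_interval_ms=RETIRE_JOB_INTERVAL_MS):
    """Return True if the job is old enough to be dropped from memory."""
    age_ms = now_ms - job.finish_time_ms
    # Both conditions must hold: the one-minute floor and the retire interval.
    return age_ms > MIN_TIME_BEFORE_RETIRE_MS and age_ms > retire_interval_ms


def run_retire_check(in_memory_jobs, now_ms,
                     retire_interval_ms=RETIRE_JOB_INTERVAL_MS):
    """One pass of the periodic check: split jobs into (completed, retired)."""
    completed, retired = [], []
    for job in in_memory_jobs:
        if should_retire(job, now_ms, retire_interval_ms):
            retired.append(job)    # shown under "Retired Jobs"; only history files remain
        else:
            completed.append(job)  # still in memory, shown under "Completed Jobs"
    return completed, retired


# Usage: one job finished just over a day ago, another a few minutes ago.
now = RETIRE_JOB_INTERVAL_MS + 1
jobs = [CompletedJob("job_a", 0), CompletedJob("job_b", now - 400_000)]
completed, retired = run_retire_check(jobs, now)
```

Under this model, a job moves from the completed list to the retired list purely because of its age, which is why the web UI shows the two groups separately: completed jobs are still held in memory, while retired jobs exist only as history files on disk.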

For "hadoop - Difference between completed jobs and retired jobs in Hadoop", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17869490/
