HDFS replication factor change

Tags: hdfs

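If the replication factor is changed in the cluster, say, from 5 to 3, and the cluster is restarted, what happens to the old file blocks? Will they be considered over-replicated and get deleted, or does the replication factor apply only to new files? That would mean old file blocks are replicated 5 times and new file blocks (after the restart) are replicated 3 times.
What happens if the cluster is not restarted?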

Best answer

If the replication factor is changed in the cluster, say, from 5 to 3 and the cluster is restarted, what happens to the old file blocks?

Nothing happens to the existing/old file blocks.

Will they be considered as over replicated and get deleted or replication factor is applicable to only new files?

The new replication factor applies only to new files, because the replication factor is not an HDFS-wide setting but a per-file attribute.
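Because the replication factor is stored per file, it can be inspected for any individual file. The sketch below wraps `hadoop fs -stat` with the `%r` format, which prints a file's replication factor. The helper name `show_replication` and the path in the usage comment are hypothetical.

```shell
# Print the replication factor recorded for a single HDFS file.
# show_replication is a hypothetical helper name, not part of Hadoop.
show_replication() {
  # %r tells -stat to print the file's replication factor
  hadoop fs -stat "%r" "$1"
}

# Usage (example path, assumed to exist on your cluster):
#   show_replication /user/alice/old-file.txt
```

Running it against a file written before the default changed and one written after should show two different values, which is the per-file behavior described above.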

Which means old file blocks are replicated 5 times and the new file blocks (after restart) are replicated 3 times.

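Yes, in this scenario. Existing files that were created with a replication factor of 5 keep 5 replicas of each block. New files created under the lower default replication factor of 3 get 3 replicas of each block.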

What happens if the cluster is not restarted?

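Nothing happens whether or not you restart the cluster. Because the property is per-file and is supplied by the client when the file is created, no cluster restart is needed to change this configuration either. You only need to update your client configuration.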
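Since the client supplies the replication factor at file creation, one way to see this is to override `dfs.replication` for a single upload with the generic `-D` option of `hadoop fs`. This is a minimal sketch: the helper name `put_with_replication` and the paths in the usage comment are hypothetical, and nothing in the cluster's own configuration is modified.

```shell
# Upload one local file with an explicit replication factor.
# put_with_replication is a hypothetical helper name.
# Arguments: <replication factor> <local source> <HDFS destination>
put_with_replication() {
  # -D overrides dfs.replication for this client invocation only
  hadoop fs -D dfs.replication="$1" -put "$2" "$3"
}

# Usage (example paths):
#   put_with_replication 3 ./report.csv /user/alice/report.csv
```

To make the new value the default for every file a client writes, the same `dfs.replication` property would be set in that client's `hdfs-site.xml` instead of on the command line.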

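If you want to change the replication factor of all the old files, consider running the replication changer command with the factor you want. For example, this sets every existing file under / to 5 replicas: hadoop fs -setrep -R 5 /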
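The same command can be limited to one directory tree and told to wait until the change has taken effect. The sketch below assumes the `-w` flag of `-setrep`, which blocks until the target replication is reached and can take a long time on large trees. The helper name `set_tree_replication` and the path in the usage comment are hypothetical.

```shell
# Change the replication factor of every file under one HDFS directory
# and wait for the change to complete.
# set_tree_replication is a hypothetical helper name.
# Arguments: <replication factor> <HDFS directory>
set_tree_replication() {
  # -R applies the change recursively, -w waits for replication to finish
  hadoop fs -setrep -R -w "$1" "$2"
}

# Usage (example path):
#   set_tree_replication 3 /user/alice/archive
```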

The original question on the HDFS replication factor change is on Stack Overflow: https://stackoverflow.com/questions/17079513/
