hadoop - hdfs + namenode + edits files keep growing and how to limit the size of the edits files

Tags: hadoop hdfs ambari hdp namenode

We have an HDP cluster with 7 DataNode machines.

Under /hadoop/hdfs/namenode/current/ we can see more than 1500 edits files.
Each file is around 7M to 20M, as shown below:

7.8M    /hadoop/hdfs/namenode/current/edits_0000000002331008695-0000000002331071883
7.0M    /hadoop/hdfs/namenode/current/edits_0000000002331071884-0000000002331128452
7.8M    /hadoop/hdfs/namenode/current/edits_0000000002331128453-0000000002331189702
7.1M    /hadoop/hdfs/namenode/current/edits_0000000002331189703-0000000002331246584
11M     /hadoop/hdfs/namenode/current/edits_0000000002331246585-0000000002331323246
8.0M    /hadoop/hdfs/namenode/current/edits_0000000002331323247-0000000002331385595
7.7M    /hadoop/hdfs/namenode/current/edits_0000000002331385596-0000000002331445237
7.9M    /hadoop/hdfs/namenode/current/edits_0000000002331445238-0000000002331506718
9.1M    /hadoop/hdfs/namenode/current/edits_0000000002331506719-0000000002331573154
9.0M    /hadoop/hdfs/namenode/current/edits_0000000002331573155-0000000002331638086
7.8M    /hadoop/hdfs/namenode/current/edits_0000000002331638087-0000000002331697435
7.8M    /hadoop/hdfs/namenode/current/edits_0000000002331697436-0000000002331755881
8.0M    /hadoop/hdfs/namenode/current/edits_0000000002331755882-0000000002331814933
9.8M    /hadoop/hdfs/namenode/current/edits_0000000002331814934-0000000002331884369
11M     /hadoop/hdfs/namenode/current/edits_0000000002331884370-0000000002331955341
8.7M    /hadoop/hdfs/namenode/current/edits_0000000002331955342-0000000002332019335
7.8M    /hadoop/hdfs/namenode/current/edits_0000000002332019336-0000000002332074498
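
Each finalized edits file is named after the first and last transaction ID it contains, so the number of transactions per file can be read straight from the listing. The following is only a minimal Python sketch of that arithmetic; the helper name `transactions_in_segment` is hypothetical, and the two sample paths are copied from the listing above.

```python
import re

# Finalized edit log segments are named edits_<first txid>-<last txid>.
SEGMENT_RE = re.compile(r"edits_(\d+)-(\d+)$")


def transactions_in_segment(path):
    """Return how many transactions one finalized edits file covers."""
    match = SEGMENT_RE.search(path)
    if match is None:
        raise ValueError(f"not a finalized edits segment: {path}")
    first_txid, last_txid = (int(group) for group in match.groups())
    # Both ends of the range are inclusive.
    return last_txid - first_txid + 1


# Two file names taken from the listing in the question.
sample = [
    "/hadoop/hdfs/namenode/current/edits_0000000002331008695-0000000002331071883",
    "/hadoop/hdfs/namenode/current/edits_0000000002331071884-0000000002331128452",
]

for path in sample:
    file_name = path.rsplit("/", 1)[-1]
    print(file_name, transactions_in_segment(path))
```

For these two files the sketch reports 63189 and 56569 transactions, which gives a rough sense of how many transactions a single edits file holds on this cluster.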

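Is it possible to minimize the file size (or the number of edits files) through some HDFS configuration?
Our disks are small and the disk is now 100% full: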
/dev/sdb                   100G   100G     0 100% /hadoop/hdfs

Best answer

You can configure the dfs.namenode.num.checkpoints.retained and dfs.namenode.num.extra.edits.retained properties to control the size of the NameNode edits directory.

  • dfs.namenode.num.checkpoints.retained: The number of image checkpoint files that are retained in storage directories. All edit logs necessary to recover an up-to-date namespace from the oldest retained checkpoint are also retained.
  • dfs.namenode.num.extra.edits.retained: The number of extra transactions that should be retained beyond what is minimally necessary for a NameNode restart. This can be useful for audit purposes, or for an HA setup where a remote Standby Node may have been offline for some time and require a longer backlog of retained edits in order to start again.
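
As a rough illustration, the two properties above would appear as `<property>` entries in hdfs-site.xml. The Python sketch below only renders such entries as text. The helper `render_property` is hypothetical, and the values are placeholders chosen for illustration, not recommendations; pick values that fit your own recovery and HA requirements, and apply them however your cluster manages HDFS configuration.

```python
from xml.sax.saxutils import escape


def render_property(name, value):
    """Render one Hadoop-style <property> entry as indented XML text."""
    return (
        "  <property>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <value>{escape(str(value))}</value>\n"
        "  </property>"
    )


# Placeholder values for illustration only (assumptions, not tuned settings).
settings = {
    "dfs.namenode.num.checkpoints.retained": 2,
    "dfs.namenode.num.extra.edits.retained": 100000,
}

entries = "\n".join(render_property(name, value) for name, value in settings.items())
print("<configuration>\n" + entries + "\n</configuration>")
```

Lowering dfs.namenode.num.extra.edits.retained reduces the backlog of transactions kept beyond what a restart needs, at the cost of a shorter window for a Standby NameNode that has been offline to catch up.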


Source: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/data-storage/content/properties_to_set_the_size_of_the_namenode_edits_directory.html

This question was found on Stack Overflow: https://stackoverflow.com/questions/61849771/
