hadoop - Limiting the number of files when using INSERT INTO in Hive SQL

Tags: hadoop hive hiveql

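Every time I run INSERT INTO in Hive SQL, new files are created. How can I limit the number of files when using INSERT INTO?

I'm afraid that one day there will be so many files in HDFS that they break it.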

hive> insert into table bi_st.st_usr_member_active_day
    > select * from bi_temp.zjy_ini_st_usr_member_active_day_temp88;
Query ID = root_20170209100404_5acdd3bf-071d-4178-aeff-b40d16499aac
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1484675879577_4078, Tracking URL = http://hadoopmaster:8088/proxy/application_1484675879577_4078/
Kill Command = /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop/bin/hadoop job  -kill job_1484675879577_4078
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 2
2017-02-09 10:04:41,247 Stage-1 map = 0%,  reduce = 0%
2017-02-09 10:04:47,425 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.17 sec
2017-02-09 10:04:53,598 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 3.02 sec
2017-02-09 10:04:57,727 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.81 sec
MapReduce Total cumulative CPU time: 4 seconds 810 msec
Ended Job = job_1484675879577_4078
Loading data to table bi_st.st_usr_member_active_day
Table bi_st.st_usr_member_active_day stats: [numFiles=8, numRows=548, totalSize=31267, rawDataSize=0]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 2   Cumulative CPU: 4.81 sec   HDFS Read: 56745 HDFS Write: 10220 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 810 msec
OK
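
The log above shows two reducers, and on the MapReduce engine each reducer normally writes its own output file, so every INSERT INTO adds more files to the table directory. One commonly used approach, sketched here rather than taken from the original thread, is to let Hive merge small output files at the end of the job. The property names below are standard Hive settings; the size values are illustrative assumptions, not tuned recommendations.

```sql
-- Merge small files produced by map-only jobs (already true by default).
SET hive.merge.mapfiles=true;
-- Merge small files produced by map-reduce jobs (false by default).
SET hive.merge.mapredfiles=true;
-- Run the merge step when the average output file size is below this
-- threshold, in bytes (illustrative value).
SET hive.merge.smallfiles.avgsize=16000000;
-- Target size, in bytes, of each merged file (illustrative value).
SET hive.merge.size.per.task=256000000;

-- Same statement as in the question; with the settings above Hive may
-- append a merge stage after the main job.
INSERT INTO TABLE bi_st.st_usr_member_active_day
SELECT * FROM bi_temp.zjy_ini_st_usr_member_active_day_temp88;
```

As far as I know, these settings only merge the files written by the current job, so files left by earlier inserts stay as they are. For a result this small, `set mapreduce.job.reduces=1`, one of the options the log itself prints, may also bring each insert down to a single file, although Hive can override that value in some cases.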

Best answer

For more on limiting the number of files when using INSERT INTO in Hive SQL, see this similar question on Stack Overflow: https://stackoverflow.com/questions/42127547/
