hadoop - hive > insert overwrite table / local directory not working

Tags: hadoop insert hive overwrite

I am trying to join two tables and store the output in a table or in a local directory.

The MapReduce job succeeds, but there is nothing in the output path/table.
Can anyone help me?
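
For reference, the local-directory form of the statement has the shape below; this is only a syntax sketch, and the output path is an illustrative placeholder, not the one I actually used:

hive> insert overwrite local directory '/tmp/order_result'
    > select e.emp_id, count(distinct p.product_id), sum(p.quantity)
    > from emp e join orders p on e.emp_id = p.emp_id
    > group by e.emp_id;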

hive> insert overwrite table order_result
    > select e.emp_id as emp_id,
    >        count(distinct p.product_id) as product_id,
    >        sum(p.quantity) as quantity
    > from emp e
    > join orders p on e.emp_id = p.emp_id
    > group by e.emp_id
    > order by quantity desc, product_id asc;
Total jobs = 3
Stage-1 is selected by condition resolver.
Launching Job 1 out of 3
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1438631656520_0053, Tracking URL = http://localhost:8088/proxy/application_1438631656520_0053/
Kill Command = /usr/lib/hadoop-2.2.0/bin/hadoop job  -kill job_1438631656520_0053
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2015-08-04 07:45:28,470 Stage-1 map = 0%,  reduce = 0%
2015-08-04 07:45:58,648 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 11.62 sec
2015-08-04 07:46:01,302 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 12.05 sec
...
MapReduce Total cumulative CPU time: 3 seconds 0 msec
Ended Job = job_1438631656520_0055
Loading data to table test_join.order_result
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://localhost:8020/user/hive/warehouse/test_join.db/order_result
Table test_join.order_result stats: [numFiles=1, numRows=0, totalSize=0, rawDataSize=0]
MapReduce Jobs Launched: 
Job 0: Map: 3  Reduce: 1   Cumulative CPU: 305.34 sec   HDFS Read: 354101279 HDFS Write: 96 SUCCESS
Job 1: Map: 1  Reduce: 1   Cumulative CPU: 2.76 sec   HDFS Read: 462 HDFS Write: 96 SUCCESS
Job 2: Map: 1  Reduce: 1   Cumulative CPU: 3.0 sec   HDFS Read: 462 HDFS Write: 48 SUCCESS
Total MapReduce CPU Time Spent: 5 minutes 11 seconds 100 msec
OK
Time taken: 817.424 seconds
hive> select * from order_result;
OK
Time taken: 0.146 seconds

Best Answer

From the MR log you shared, you can check whether the query produced any output: the input data size is 354101279 bytes, while the output is only 96 bytes:

HDFS Read: 354101279 HDFS Write: 96 SUCCESS

I believe the query is running correctly, but it is producing no output.
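
A quick way to confirm that is to run the same join as a plain count (a minimal diagnostic sketch reusing the table and column names from the question); if it returns 0 while both tables are non-empty, the join condition itself matches nothing:

hive> select count(*) from emp e join orders p on e.emp_id = p.emp_id;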

A likely cause is something like the following:

Both input tables have data, but the data type of emp_id is incorrect or does not match between the two tables.
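
You can verify this by comparing the column type of emp_id in both tables and, if the types differ, casting both join keys to a common type. A sketch under that assumption (casting to string here is illustrative; use whichever type actually holds clean values):

hive> describe emp;
hive> describe orders;
hive> insert overwrite table order_result
    > select e.emp_id as emp_id,
    >        count(distinct p.product_id) as product_id,
    >        sum(p.quantity) as quantity
    > from emp e
    > join orders p on cast(e.emp_id as string) = cast(p.emp_id as string)
    > group by e.emp_id
    > order by quantity desc, product_id asc;

When the key types differ (for example int vs. string), Hive implicitly coerces both sides to double, so values with stray whitespace or non-numeric characters become NULL and silently drop out of the join; an explicit cast keeps the comparison predictable.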

Regarding hadoop - hive > insert overwrite table / local directory not working, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31799905/
