hadoop - Hive lateral view explode internals

Tags: hadoop hive mapreduce

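I am using lateral view explode multiple times (about 9) in a single query against a single table (about 12GB). This produces a huge amount of map-side data (100Pb+). I don't understand how 12GB can generate that much data.

Can anyone explain how lateral view explode works internally?

Thanks in advance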

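Best answer

Demo. Each lateral view explode is applied to every row produced by the one before it, so within a single input row the result is the cross product of the elements of all exploded arrays: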

create table mytable (a1 array<int>,a2 array<int>,a3 array<int>);
insert into mytable select array(1,2),array(3,4,5),array(6,7,8,9);

select  *
from    mytable
        lateral view explode (a1) e1 as a1_val
        lateral view explode (a2) e2 as a2_val
        lateral view explode (a3) e3 as a3_val
;

+-------+---------+-----------+--------+--------+--------+
|  a1   |   a2    |    a3     | a1_val | a2_val | a3_val |
+-------+---------+-----------+--------+--------+--------+
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      3 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      4 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      1 |      5 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      3 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      4 |      9 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      6 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      7 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      8 |
| [1,2] | [3,4,5] | [6,7,8,9] |      2 |      5 |      9 |
+-------+---------+-----------+--------+--------+--------+    
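
The 24 rows above are 2 × 3 × 4, the product of the three array sizes. As a rough sketch, the same product can be computed without exploding anything, which gives an estimate of how many rows a chain of lateral views would emit. It assumes no array in the table is NULL (Hive's size() returns -1 for NULL by default, which would distort the product); the column aliases are made up for illustration.

```sql
-- Rows produced per input row: the product of the array sizes.
-- For the demo row this is 2 * 3 * 4 = 24.
select  size(a1) * size(a2) * size(a3) as rows_after_explode
from    mytable
;

-- Total across the whole table. The cast to bigint keeps the
-- product from overflowing int when the arrays are large.
select  sum(cast(size(a1) as bigint) * size(a2) * size(a3)) as total_rows_after_explode
from    mytable
;
```

With nine lateral views the product has nine factors, and every output row still carries the original columns. If each array held ten elements (a hypothetical figure, not taken from the question), a single input row would become 10^9 output rows, which is how a 12GB table can turn into an enormous volume of map-side data.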

A similar question about hadoop - Hive lateral view explode internals can be found on Stack Overflow: https://stackoverflow.com/questions/43961256/
