arrays - How to convert a string to a complex array of structs and explode it in Hive

Tags: arrays hadoop struct hive explode

I have the following Hive table:

id     string 
code   string
config string  

Values:
dummyID|codeA|[{"pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"},{"pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"},{"pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[],"mapb":"c88"},{"pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"},{"pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}]

I need to explode the array so that the output looks like the following (any element inside the struct may be optional):
dummyID|codeA|{"pmc":"111","scc":"aa1","pgtp":"a22","pgn":"a33","pgrc":"a44"}
dummyID|codeA|{"pmc":"222","scc":"bb1","pgtp":"b22","pgn":"b33","pgrc":"b44","sen":"b77"}
dummyID|codeA|{"pmc":"333","scc":"cc1","pgtp":"c22","pgn":"c33","pgrc":"c44","pscc":[{"qtgm":"tt1","swrt":"rr2"}],"mapb":"c88"}
dummyID|codeA|{"pmc":"444","scc":"dd1","pgtp":"d22","pgn":"d33","pgrc":"d44","pscc":["ghgh"],"mapb":"d88"}
dummyID|codeA|{"pmc":"555","scc":"ee1","pgtp":"e22","pgn":"e33","pgrc":"e44","mapb":"e88"}

I tried:
select 
id,
code,
exp_val   
FROM   temp 
LATERAL VIEW explode(array(config)) temp AS exp_val ;

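The query above does not raise an error, but it does not explode the array; I get a single row back.
LATERAL VIEW inline does not work either.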

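I also tried creating a table with the schema below and inserting records from the string config field shown above, but the insert failed with a data type mismatch error: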
id    string,
code  string,
config  array<struct<pmc:String,scc:String,pgtp:string,pgn:string,pgrc:string,pscc:Array<String>,sen:Array<String>,mapb:Array<String>>> 

When I run a select query on config, I get the following result:
|dummyID|codeA|{"pmc":"[{\"pmc\":\"111\",\"scc\":\"aa1\",\"pgtp\":\"a22\",\"pgn\":\"a33\",\"pgrc\":\"a44\"},{\"pmc\":\"222\",\"scc\":\"bb1\",\"pgtp\":\"b22\",\"pgn\":\"b33\",\"pgrc\":\"b44\",\"sen\":\"b77\"},{\"pmc\":\"333\",\"scc\":\"cc1\",\"pgtp\":\"c22\",\"pgn\":\"c33\",\"pgrc\":\"c44\",\"pscc\":[],\"mapb\":\"c88\"},{\"pmc\":\"444\",\"scc\":\"dd1\",\"pgtp\":\"d22\",\"pgn\":\"d33\",\"pgrc\":\"d44\",\"pscc\":[\"ghgh\"],\"mapb\":\"d88\"},{\"pmc\":\"555\",\"scc\":\"ee1\",\"pgtp\":\"e22\",\"pgn\":\"e33\",\"pgrc\":\"e44\",\"mapb\":\"e88\"}]","scc":null,"pgtp":null,"pgn":null,"pgrc":null,"pscc":null,"sen":null,"mapb":null} 

explode does not work on this dataset either.

Am I missing something?

Best answer

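Remove array from the explode call. array(config) wraps the whole value in a one-element array, which is why the query returned a single row. explode accepts only an array or a map, so this assumes config is declared as array<struct<...>>. Try the following: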

select 
 id,
 code,
 exp_val   
FROM temp 
LATERAL VIEW explode(config) temp AS exp_val ;

Second option:
select 
 t.id,
 t.code,
 e.*   
FROM temp t
LATERAL VIEW outer inline(t.config) e ;
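
Both options above rely on config being an array type. If config is still a plain string, as in the first table in the question, one possible workaround is to split the JSON text into one string per object before exploding. The query below is only a sketch under assumptions that are not in the original post: the sequence },{ occurs only between top-level objects, the marker ~~~ (an arbitrary choice) never appears in the data, the value has no whitespace around the outer brackets, and backslashes are doubled because Hive string literals normally require it. The columns pmc and mapb are picked only as examples.

```sql
-- Sketch for a STRING config column holding a JSON array of objects.
-- Assumes '},{' appears only between top-level objects and '~~~' never appears in the data.
SELECT
  t.id,
  t.code,
  e.exp_val,                                      -- one JSON object per row, still a string
  get_json_object(e.exp_val, '$.pmc')  AS pmc,    -- example field extraction
  get_json_object(e.exp_val, '$.mapb') AS mapb    -- NULL when the key is absent
FROM temp t
LATERAL VIEW OUTER explode(
  split(
    regexp_replace(
      regexp_replace(t.config, '^\\[|\\]$', ''),  -- drop the outer [ and ]
      '\\}\\s*,\\s*\\{', '}~~~{'),                -- mark the boundaries between objects
    '~~~')
) e AS exp_val;
```

Because get_json_object returns NULL for a missing key, optional elements such as sen, pscc and mapb do not need special handling. A nested array that itself contains several objects would break the },{ assumption, so a JSON SerDe or a dedicated UDF may be a safer choice for such data.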

Source: this question appears on Stack Overflow as "arrays - How to convert a string to a complex array of structs and explode it in Hive": https://stackoverflow.com/questions/61261872/
