I am new to Hive and have been stuck on the following problem. Please help.
I have json records of the following type:
{"issues":[{"key":"COV-2073","labels":[ "java","db"]}]}
I want to unnest it, or transform it as shown below.
Key labels
"COV-2073" "java","db"
I used the following kind of query, but could not get the desired output.
select v2.key from demo_example as d lateral view json_tuple(d.a1,'issues') v1 as issue lateral view json_tuple(v1.issue,'key') v2 as key;
I would even accept output of the following form:
Key labels
"COV-2073" "java"
"COV-2073" "db"
Please help me solve the problem above.
Best answer
Here is an example in SparkSQL:
//data.json
{"name":"John","age":"30","cars": [{ "name":"Ford", "models":["Fiesta", "Focus", "Mustang"]}, {"name":"BMW", "models":["320", "X3", "X5"]}, {"name":"Fiat", "models":["500", "Panda"]}]}
//SparkSQL
>>> sqlContext.sql("""select name,age,col1.name, col2 from json.`data.json` lateral view explode(cars) v1 as col1 lateral view explode(col1.models) v2 as col2""").show()
+----+---+----+-------+
|name|age|name|   col2|
+----+---+----+-------+
|John| 30|Ford| Fiesta|
|John| 30|Ford|  Focus|
|John| 30|Ford|Mustang|
|John| 30| BMW|    320|
|John| 30| BMW|     X3|
|John| 30| BMW|     X5|
|John| 30|Fiat|    500|
|John| 30|Fiat|  Panda|
+----+---+----+-------+
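The same two-level pattern applies to the record from the question: explode `issues` first, then explode each issue's `labels`. The sketch below mimics that in plain Python (no Spark or Hive involved) just to show which rows the two chained explodes produce. The function name `explode_issues` is made up for illustration, and the SQL fragments in the comments are an assumed adaptation of the answer's query, not something the original answer ran.

```python
import json

# The record from the question.
record = json.loads('{"issues":[{"key":"COV-2073","labels":["java","db"]}]}')

def explode_issues(rec):
    """Yield one (key, label) row per label, like two chained LATERAL VIEW explode calls."""
    for issue in rec["issues"]:        # ~ lateral view explode(issues) v1 as issue
        for label in issue["labels"]:  # ~ lateral view explode(issue.labels) v2 as label
            yield issue["key"], label

rows = list(explode_issues(record))
for key, label in rows:
    print(key, label)
```

This yields one row per label, which is the second output shape the question says would be acceptable.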
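If some rows of the json have no value for a particular column and you want NULL displayed for them, use lateral view outer instead of lateral view.
For example, the json below has 2 entries: one with all the details and one without cars, models, etc.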
{"name":"John","age":"30","cars": [{ "name":"Ford", "models":["Fiesta", "Focus", "Mustang"]}, {"name":"BMW", "models":["320", "X3", "X5"]}, {"name":"Fiat", "models":["500", "Panda"]}]}
{"name":"Dough","age":"90"}
In this case, using outer produces null for the Dough entry:
>>> sqlContext.sql("""select name,age,col1.name, col2 from json.`data.json` lateral view outer explode(cars) v1 as col1 lateral view outer explode(col1.models) v2 as col2 order by col2""").show()
+-----+---+----+-------+
| name|age|name|   col2|
+-----+---+----+-------+
|Dough| 90|null|   null|
| John| 30| BMW|    320|
| John| 30|Fiat|    500|
| John| 30|Ford| Fiesta|
| John| 30|Ford|  Focus|
| John| 30|Ford|Mustang|
| John| 30|Fiat|  Panda|
| John| 30| BMW|     X3|
| John| 30| BMW|     X5|
+-----+---+----+-------+
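The difference between the plain and the outer form comes down to what happens when the array is missing or empty: the outer form still emits one row, with null in the exploded column. A rough plain-Python imitation of that behaviour follows; the helper `explode_outer` and the shortened sample data are illustrative assumptions, not part of the original answer.

```python
def explode_outer(values):
    """Mimic LATERAL VIEW OUTER explode: a missing or empty array still yields one row (None)."""
    if not values:
        yield None
    else:
        yield from values

# Shortened version of the two entries above: John has cars, Dough has none.
people = [
    {"name": "John", "age": "30", "cars": [{"name": "Ford", "models": ["Fiesta", "Focus"]}]},
    {"name": "Dough", "age": "90"},
]

outer_rows = []
for person in people:
    for car in explode_outer(person.get("cars")):
        # When there is no car, there are no models either; pass None through.
        models = car["models"] if car else None
        for model in explode_outer(models):
            outer_rows.append((person["name"], car["name"] if car else None, model))

for row in outer_rows:
    print(row)
```

Without the outer behaviour (looping over the arrays directly and skipping missing ones), the Dough entry would drop out of the result entirely.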
If you want all the models of a car kept together as one array, then:
>>> sqlContext.sql("""select name, age, car.name as car, car.models from json.`data.json` lateral view outer explode(cars) v1 as car""").show()
+-----+---+----+--------------------+
| name|age| car|              models|
+-----+---+----+--------------------+
| John| 30|Ford|[Fiesta, Focus, M...|
| John| 30| BMW|       [320, X3, X5]|
| John| 30|Fiat|        [500, Panda]|
|Dough| 90|null|                null|
+-----+---+----+--------------------+
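This single-explode form corresponds to the first output shape the question asks for: one row per key, with all labels in one cell. A plain-Python sketch of that shape for the question's record is below. Joining the labels with a comma is an assumption about the desired formatting; in Spark SQL or Hive the equivalent step would presumably be `concat_ws(',', issue.labels)`, which the original answer does not show.

```python
import json

# The record from the question.
record = json.loads('{"issues":[{"key":"COV-2073","labels":["java","db"]}]}')

# Explode only the outer array; keep each issue's labels together in one cell.
grouped_rows = [(issue["key"], ",".join(issue["labels"])) for issue in record["issues"]]

for key, labels in grouped_rows:
    print(key, labels)
```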
Source: the Stack Overflow question "json - Convert json data to normal rows in Hive": https://stackoverflow.com/questions/48442068/