json - Error while loading JSON data in Hive

Tags: json hadoop hive

The JSON data looks like this:

{"id":"U101", "name":"Rakesh", "place":{"city":"MUMBAI","state":"MAHARASHTRA"}, "age":20, "occupation":"STUDENT"}
{"id":"","name":"Rakesh", "place":{"city":"MUMBAI","state":"MAHARASHTRA"}, "age":20, "occupation":"STUDENT"}
{"id":"U103", "name":"Rakesh", "place":{"city":"","state":""}, "age":20, "occupation":"STUDENT"}

I get the following error when I try to select data from the table:

hive (ecom)> select * from users_info_raw; 
OK 
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException:
org.codehaus.jackson.JsonParseException: Unexpected character ('2'
(code 50)): was expecting comma to separate OBJECT entries  at
[Source: java.io.StringReader@15b0734; line: 1, column: 222] 
Time taken: 0.144 seconds

The CREATE TABLE DDL:

CREATE TABLE users_info_raw(
  id string,
  name string,
  place struct<city:string,state:string>,
  age INT,
  occupation string
)
ROW FORMAT SERDE
  'com.cloudera.hive.serde.JSONSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

Best answer

I used the Hive HCatalog SerDe (org.apache.hive.hcatalog.data.JsonSerDe) instead, and it handles your input data without errors.

CREATE TABLE info_raw(
  id string,
  name string,
  place struct<city:string,state:string>,
  age INT,
  occupation string
)
ROW FORMAT SERDE
  'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
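
To try the table out, a session along the following lines should work. This is only a sketch: the jar path and the data file path are hypothetical and depend on your installation, and the ADD JAR step is only needed if the HCatalog SerDe is not already on Hive's classpath.

```sql
-- Assumption: hive-hcatalog-core is not yet on the classpath.
-- The path below is hypothetical; adjust it to your installation.
ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;

-- Assumption: the sample records are stored one JSON object per line
-- in this (hypothetical) local file.
LOAD DATA LOCAL INPATH '/tmp/users_info.json' INTO TABLE info_raw;

-- Fields of the nested struct are addressed with dot notation.
SELECT id, name, place.city, place.state, age, occupation
FROM info_raw;
```

Each record has to sit on a single line, because TextInputFormat hands the SerDe one line at a time.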


Source: "json - Error while loading JSON data in Hive", a similar question on Stack Overflow: https://stackoverflow.com/questions/46295638/
