hadoop - Loading data into a Hive table

Tags: hadoop mapreduce hive

CREATE TABLE IF NOT EXISTS TestingTable2
(
  USER_ID BIGINT,
  PURCHASED_ITEM ARRAY<STRUCT<PRODUCT_ID: BIGINT, TIMESTAMPS: STRING>>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '-'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/rkost/output2';

Below is my data. It is a single line, and I need to load it into the table above.

1015826235-[{"product_id":220003038067,"timestamps":"1340321132000"},{"product_id":300003861266,"timestamps":"1340271857000"},{"product_id":140002997245,"timestamps":"1339694926000"},{"product_id":200002448035,"timestamps":"1339172659000"},{"product_id":260003553381,"timestamps":"1339072514000"}]-

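After loading the data, a SELECT query does not show the correct data. I should get exactly one row like the following, but that is not what the table returns: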

**USER_ID**     **PURCHASED_ITEM**
1015826235     [{"product_id":220003038067,"timestamps":"1340321132000"},    {"product_id":300003861266,"timestamps":"1340271857000"},    {"product_id":140002997245,"timestamps":"1339694926000"},    {"product_id":200002448035,"timestamps":"1339172659000"},    {"product_id":260003553381,"timestamps":"1339072514000"}]

Instead of the data above, the SELECT query returns something like the following. Is there something wrong with the delimiters?

1015826235      [{"product_id":null,"timestamps":" 220003038067"},{"product_id":null,"timestamps":" \"1340321132000\"}"},{"product_id":null,"timestamps":"  
                                 300003861266"},{"product_id":null,"timestamps":" \"1340271857000\"}"},{"product_id":null,"timestamps":" 140002997245"},
                                      {"product_id":null,"timestamps":" \"1339694926000\"}"},{"product_id":null,"timestamps":" 200002448035"},
                                            {"product_id":null,"timestamps":" \"1339172659000\"}"},{"product_id":null,"timestamps":" 260003553381"},
                                                       {"product_id":null,"timestamps":" \"1339072514000\"}]"}]

Can anyone point out what I am doing wrong?

Best answer

Put double quotes around the product IDs:

1015826235-[{"product_id":"220003038067","timestamps":"1340321132000"},{"product_id":"300003861266","timestamps":"1340271857000"},{"product_id":"140002997245","timestamps":"1339694926000"},{"product_id":"200002448035","timestamps":"1339172659000"},{"product_id":"260003553381","timestamps":"1339072514000"}]-
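As a supplementary sketch (not part of the answer above), the output shown in the question looks consistent with the table's declared delimiters being applied to the JSON text itself: the array is cut at every `,` and each piece is cut at `:` into the two struct fields, so JSON keys land in `PRODUCT_ID` and fail to parse as `BIGINT`. The Python below is only a rough model of that splitting, under the assumption that array elements use the collection-items delimiter and struct fields inside them use the map-keys delimiter; it is not Hive's own code. The function names `split_by_table_delimiters` and `to_delimited` are hypothetical. If quoting alone does not produce the expected row, one alternative that keeps the table definition unchanged may be to rewrite the input into the layout those delimiters describe, which `to_delimited` illustrates.

```python
import json

# Delimiters declared in the CREATE TABLE statement from the question.
FIELD_SEP, ITEM_SEP, KEY_SEP = "-", ",", ":"


def split_by_table_delimiters(line):
    """Rough model of how the declared delimiters carve up one text line.

    Assumption: array elements are split on ITEM_SEP, and the two struct
    fields inside each element are split on KEY_SEP.
    """
    user_id, items = line.split(FIELD_SEP)[:2]
    rows = []
    for element in items.split(ITEM_SEP):
        parts = element.split(KEY_SEP)
        first = parts[0]
        second = parts[1] if len(parts) > 1 else None
        # PRODUCT_ID is BIGINT, so a non-numeric token is modeled as None (NULL).
        product_id = int(first) if first.isdigit() else None
        rows.append((product_id, second))
    return int(user_id), rows


def to_delimited(line):
    """Rewrite 'user_id-<JSON array>-' as 'user_id-pid:ts,pid:ts,...'."""
    user_id, payload = line.split(FIELD_SEP, 1)
    # Drop the trailing field delimiter so only the JSON array remains.
    payload = payload.rstrip().rstrip(FIELD_SEP)
    items = json.loads(payload)
    encoded = ITEM_SEP.join(
        f"{item['product_id']}{KEY_SEP}{item['timestamps']}" for item in items
    )
    return f"{user_id}{FIELD_SEP}{encoded}"
```

Calling `split_by_table_delimiters` on the original line from the question yields ten elements whose first field is `None`, which mirrors the `null` product IDs in the question's output. Calling it on the result of `to_delimited` yields five `(product_id, timestamps)` pairs with numeric IDs. Whether Hive itself behaves exactly this way should be verified against a real table.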

This question and answer, "hadoop - Loading data into a Hive table", come from a similar question on Stack Overflow: https://stackoverflow.com/questions/11367684/
