I have two tables, oldTable and newTable, with the following contents:
oldTable:
key value volume
======================
1 abc 10000
2 def 5000
newTable:
key value volume
======================
1 abc 2000
2 def 3000
3 xyz 7000
I want to create a new table that sums the volume from both tables. That is, the new table should contain the following:
joined_table:
key value volume
======================
1 abc 12000
2 def 8000
3 xyz 7000
I tried the following statement, but without success:
CREATE TABLE joined_table AS
SELECT key, value, volume
FROM (
    SELECT IF(oldTable.key != NULL, oldTable.key, newTable.key) AS key,
           IF(oldTable.value != NULL, oldTable.value, newTable.value) AS value,
           IF(oldTable.volume AND newTable.volume, oldTable.volume + newTable.volume,
              IF(oldTable.volume != NULL, oldTable.volume, newTable.volume)) AS volume
    FROM (
        SELECT oldTable.key, oldTable.value, oldTable.volume, newTable.key, newTable.value, newTable.volume
        FROM newTable FULL OUTER JOIN oldTable ON newTable.key = oldTable.key
    ) alias
) anotherAlias;
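But this gives me the following error:
Query returned non-zero code: 10, cause: FAILED: Error in semantic analysis: Ambiguous column reference key
I tried changing the column names for joined_table in the query above, but got the same error. Any help on how to achieve this? Also, is there a way to write the result over an existing table, such as oldTable, instead of creating this new table?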
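Best answer
The word key that you use in your query is a reserved keyword. That may be why the parser throws the ambiguity error. You can wrap it in backticks so that the parser does not read it as a reserved word.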
CREATE TABLE joined_table AS
SELECT `key`, value, volume
FROM (
    SELECT IF(old_key IS NOT NULL, old_key, new_key) AS `key`,
           IF(old_value IS NOT NULL, old_value, new_value) AS value,
           -- Comparing with != NULL never evaluates to true; test with IS NOT NULL instead.
           IF(old_volume IS NOT NULL AND new_volume IS NOT NULL, old_volume + new_volume,
              IF(old_volume IS NOT NULL, old_volume, new_volume)) AS volume
    FROM (
        -- Alias every column so the subquery does not expose two columns with the same name.
        SELECT oldTable.`key` AS old_key, oldTable.value AS old_value, oldTable.volume AS old_volume,
               newTable.`key` AS new_key, newTable.value AS new_value, newTable.volume AS new_volume
        FROM newTable FULL OUTER JOIN oldTable ON newTable.`key` = oldTable.`key`
    ) alias
) anotherAlias;
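The nested IF logic above can probably be collapsed with COALESCE, which returns its first non-NULL argument. The following is only a sketch under some assumptions that the question does not state: volume is numeric, key is unique within each table, and the short aliases o and n are introduced here purely for illustration. The second statement addresses the follow-up question by copying the result over oldTable once joined_table exists. Going through the intermediate table avoids reading and overwriting oldTable in the same statement, which may or may not be safe depending on the engine and version.

```sql
-- Sketch only: aliases o and n are illustrative, not from the original post.
-- Assumes volume is numeric and key is unique within each table.
CREATE TABLE joined_table AS
SELECT COALESCE(o.`key`, n.`key`)                    AS `key`,
       COALESCE(o.value, n.value)                    AS value,
       -- A row missing from one side contributes 0 to the sum.
       COALESCE(o.volume, 0) + COALESCE(n.volume, 0) AS volume
FROM newTable n
FULL OUTER JOIN oldTable o ON n.`key` = o.`key`;

-- Replace the contents of the existing table with the summed rows.
-- Assumes oldTable has the columns (key, value, volume) in this order.
INSERT OVERWRITE TABLE oldTable
SELECT `key`, value, volume
FROM joined_table;
```

With the sample data from the question, key 3 exists only in newTable, so its row should come through with the volume from newTable unchanged.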
Regarding hadoop - creating a table in Shark/Hive that joins two existing tables, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22525293/