hadoop - Hive: How do I create a table with all the columns from another table except one of them?

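I need to turn a regular column into a partition column (convert normal column as partition column in hive), so I want to create a new table that copies every column except one. The original table currently has more than 50 columns. Is there a clean way to do this?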

Something like:

CREATE student_copy LIKE student EXCEPT age AND hair_color;

Thanks!

Best answer

You can use a regular expression, that is, CTAS with a REGEX column specification:

set hive.support.quoted.identifiers=none;
CREATE TABLE student_copy AS SELECT `(age|hair_color)?+.+` FROM student;
set hive.support.quoted.identifiers=column;
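
To see why the pattern excludes exactly those columns: as far as I understand it, Hive tests each column name against the whole expression using Java regex semantics. `(age|hair_color)?+` is a possessive optional group, so once it has consumed `age` it does not give the match back, the following `.+` has nothing left to consume, and the column is rejected. A name such as `agent` would still be selected, because `.+` can consume the trailing `nt`. Here is a minimal sketch, assuming a hypothetical four-column `student` table (the columns and types are illustrative, not taken from the question):

```sql
-- Hypothetical source table; the real one has more than 50 columns.
CREATE TABLE student (
  id         INT,
  name       STRING,
  age        INT,
  hair_color STRING
);

-- Treat backquoted names as regular expressions for the next statement.
SET hive.support.quoted.identifiers=none;

-- Copies every column whose name is not exactly age or hair_color.
CREATE TABLE student_copy AS
SELECT `(age|hair_color)?+.+` FROM student;

-- Restore the default so that backticks quote identifiers again.
SET hive.support.quoted.identifiers=column;

-- Inspect the result; with the table above, only id and name should remain.
DESCRIBE student_copy;
```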

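However, as Kishore Kumar Suthar pointed out, this will not create a partitioned table, because CTAS (CREATE TABLE AS SELECT) does not support partitioned tables.

I think the only way to get a partitioned table is to take the table's full CREATE statement (as Abraham mentioned):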

SHOW CREATE TABLE student;

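Edit that statement so that it creates a partition on the column you want. After that, you can use a SELECT with a regular expression when inserting into the new table. If your partition column is already part of this SELECT, you need to make sure it is the last column you insert. If it is not, you can exclude that column in the regular expression and add it explicitly at the end. Also, if you expect the INSERT statement to create more than one partition, you need to enable "dynamic partitioning":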

set hive.support.quoted.identifiers=none;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE student_copy PARTITION(partcol1) SELECT `(age|hair_color|partcol1)?+.+`, partcol1 FROM student;
set hive.support.quoted.identifiers=column;

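"hive.support.quoted.identifiers=none" is required so that the backquoted (`) part of the query is interpreted as a regular expression. After the statement, I set the parameter back to its original value: "hive.support.quoted.identifiers=column".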
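
Put together, the edited DDL and the load could look like the sketch below. It assumes a hypothetical `student` table with the columns `id`, `name`, `age`, `hair_color` in that order, and, purely for illustration, that `age` is the column being turned into the partition column. The table name `student_by_age` is also made up. In practice, start from the column list printed by SHOW CREATE TABLE. Because the insert is positional, the non-partition columns of the new table must keep the order in which the regular expression returns them.

```sql
-- Edited copy of the original DDL: age is removed from the column list
-- and declared as the partition column instead.
CREATE TABLE student_by_age (
  id         INT,
  name       STRING,
  hair_color STRING
)
PARTITIONED BY (age INT);

SET hive.support.quoted.identifiers=none;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The regex selects every column except age; age is then appended
-- explicitly so that the partition column comes last.
INSERT INTO TABLE student_by_age PARTITION (age)
SELECT `(age)?+.+`, age FROM student;

SET hive.support.quoted.identifiers=column;

-- List the partitions that the insert created.
SHOW PARTITIONS student_by_age;
```

Partitioning on a column with many distinct values produces many small partitions, so a real table would usually pick a coarser column than this example does.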

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/32260483/
