I do the following:
A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
foo:int
,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
foo:int
,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
A::foo AS foo
,A::bar AS bar
,B::baz AS baz
;
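How can I do the join and define the schema in a single step?
Best answer
According to the documentation, you cannot define a schema when joining relations.
Note:
Syntactically, you can nest the statements to save a few steps, for example: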
D = foreach
(join (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int ,bar:chararray)) by foo,
(LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int ,baz:long)) by foo
) generate $0 as foo, $1 as bar, $3 as baz;
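But I would avoid doing this. It is confusing to read, and it produces the same explain plan as the original script, so it gains nothing at execution time.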
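To check the positional references ($0, $1, $3) and the identical-plan claim yourself, Pig's DESCRIBE and EXPLAIN operators can be used. The following is only a minimal sketch, assuming the same input files, delimiter, and schemas as in the question:

```pig
-- Assumes the same input files and delimiter as the question.
A = LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int, bar:chararray);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int, baz:long);
C = JOIN A BY foo, B BY foo;

-- Print the schema of the joined relation. All of A's fields come first,
-- followed by all of B's fields, each prefixed with its source alias,
-- which is why B's baz ends up at position $3.
DESCRIBE C;

D = FOREACH C GENERATE A::foo AS foo, A::bar AS bar, B::baz AS baz;

-- Print the execution plans for D so they can be compared with the
-- plans produced by the nested one-statement version.
EXPLAIN D;
```

Running EXPLAIN on both the multi-statement and the nested version is a quick way to confirm that the shorter form only changes how the script reads.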
Regarding "hadoop - Pig - how to join and define the schema in one step", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23080279/