I am practicing with the Cloudera YARN VM on VMware Player (non-commercial use).
My Pig script is:
a1 = load '/user/training/my_hdfs/id' using PigStorage('\t') as (id:int,name:chararray,desig:chararray);
a2 = load '/user/training/my_hdfs/trips' using PigStorage('\t') as(id:int,place:chararray,no_trips:int);
a3 = join a1 by id,a2 by id;
a4 = group a3 by a1::id;
illustrate a4;
After running illustrate, it shows this message:
2017-08-21 07:52:11,926 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception : Error compiling operator POLocalRearrange
The datasets are:
Table id
101 aaa executive
102 bbb manager
104 hhh manager
106 ccc trainee
109 hhh trainee
Table trips
101 pune 1
101 hyd 2
102 pune 2
102 hyd 3
102 bang 4
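Best answer
When I tried to run the script with the data provided, I also got some errors, because the delimiters in the files were inconsistent: some places had spaces and others had tabs (probably a result of copy-pasting). I made the delimiter consistent (tabs everywhere), and everything worked.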
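One way to make the delimiters consistent before loading is to rewrite each line so that every run of spaces or tabs becomes a single tab. Below is a minimal Python sketch of that idea, not necessarily how the files were fixed in this case. It assumes the files have been copied to the local filesystem and that no field value contains whitespace (true for the sample data in the question). The names normalize_line and normalize_file are hypothetical.

```python
import re


def normalize_line(line: str) -> str:
    """Collapse each run of spaces/tabs between fields into a single tab."""
    # strip() drops leading/trailing whitespace, including the newline.
    return re.sub(r"[ \t]+", "\t", line.strip())


def normalize_file(src_path: str, dst_path: str) -> None:
    """Write a tab-delimited copy of src_path to dst_path, skipping blank lines."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(normalize_line(line) + "\n")


# A line that mixes a space and a tab becomes purely tab-delimited.
print(repr(normalize_line("101 aaa\texecutive")))  # '101\taaa\texecutive'
```

The cleaned files would then need to be uploaded back to HDFS before rerunning the script.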
Try running dump a1; or dump a2; and check whether the data shows up in the correct columns.
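A similar check can be approximated outside Pig. PigStorage('\t') splits only on tabs, so a line that uses spaces should be read as a single field, leaving the remaining columns empty. The Python sketch below flags such lines. It assumes a local copy of the file; find_bad_lines is a hypothetical helper name, and the sample rows are illustrative, built from the question's data with mixed delimiters.

```python
def find_bad_lines(lines, expected_fields, delimiter="\t"):
    """Return (line_number, field_count) for lines with the wrong field count."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text:
            continue  # ignore blank lines
        count = len(text.split(delimiter))
        if count != expected_fields:
            bad.append((number, count))
    return bad


# Illustrative rows: line 1 is tab-delimited, line 2 uses spaces only,
# line 3 mixes a tab and a space.
sample = ["101\taaa\texecutive\n", "102 bbb manager\n", "104\thhh manager\n"]
print(find_bad_lines(sample, expected_fields=3))  # [(2, 1), (3, 2)]
```

An empty result suggests that every line splits into the number of columns declared in the load schema.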
For me, after making the delimiter consistent, it works perfectly, and illustrate a4 gives the following output:
------------------------------------------------------------------
| a1 | id:int | name:chararray | desig:chararray |
------------------------------------------------------------------
| | 101 | aaa | executive |
| | 101 | aaa | executive |
------------------------------------------------------------------
----------------------------------------------------------------
| a2 | id:int | place:chararray | no_trips:int |
----------------------------------------------------------------
| | 101 | pune | 1 |
| | 101 | hyd | 2 |
----------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
| a3 | a1::id:int | a1::name:chararray | a1::desig:chararray | a2::id:int | a2::place:chararray | a2::no_trips:int |
------------------------------------------------------------------------------------------------------------------------------------------------
| | 101 | aaa | executive | 101 | pune | 1 |
| | 101 | aaa | executive | 101 | hyd | 2 |
| | 101 | aaa | executive | 101 | pune | 1 |
| | 101 | aaa | executive | 101 | hyd | 2 |
------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| a4 | group:int | a3:bag{:tuple(a1::id:int,a1::name:chararray,a1::desig:chararray,a2::id:int,a2::place:chararray,a2::no_trips:int)} |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| | 101 | {(101, ..., 1), ..., (101, ..., 2)} |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regarding hadoop - 'Error compiling operator POLocalRearrange' error in Pig, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/45797284/