我正在尝试过滤位置变量。
X = FILTER C BY($14 matches '.*USD.*');
STORE X into '$output' using PigStorage(',');
上面的语句不起作用,但如果我尝试只输出 $14
E = FOREACH C GENERATE FLATTEN($14);
STORE C into '$output' using PigStorage(',');
它工作正常
样本数据:
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD120
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD0
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,GBP0
样本输出
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD0
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,GBP0
最佳答案
在 'BY' 和 '(' 之间添加一个空格
X = FILTER C BY (FLATTEN($14) matches '.*USD.*');
STORE X into '$output' using PigStorage(',');
关于hadoop - 位置变量上的 pig 过滤器,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39376986/