Hadoop 'grep' example

In the Hadoop 'grep' example (which ships with the Hadoop package), what is the group argument? Can you give me an example?

Best answer

Disclaimer: I have not run this example; I am answering only after reading http://wiki.apache.org/hadoop/Grep.

The CLI invocation is:

bin/hadoop org.apache.hadoop.examples.Grep <indir> <outdir> <regex> [<group>]

You want to know what <group> is.

I suspect it refers to a capture group in the regular expression. (A random link: http://www.exampledepot.com/egs/java.util.regex/Group.html)

As the Hadoop Grep link states:

The command works different than the Unix grep call: it doesn't display the complete matching line, but only the matching string

What I take from this is that if you specify a <group> value (a number), it will output only the value of that group.

An example (pulled from the group link):

input: aba
regex: (a(b)*)+
group 0: aba
group 1: a
group 2: b

If <group> is 1, the result would be a. Group 0 is the full match, not the original string; in this case the two just happen to be the same.
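To check the group numbering locally without a Hadoop cluster, here is a minimal sketch in plain java.util.regex. It assumes (as suspected above, not verified against the Hadoop source) that the example emits the requested capture group for each match. GroupDemo and extract are hypothetical names made up for illustration; they are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: collects the requested capture
// group from every match of the regex found in the input string.
class GroupDemo {
    static List<String> extract(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            // group(0) is the whole match; group(n) is the n-th parenthesized group
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String regex = "(a(b)*)+";
        String input = "aba";
        // Print what each group number yields for the example above
        for (int g = 0; g <= 2; g++) {
            System.out.println("group " + g + ": " + extract(regex, input, g));
        }
    }
}
```

Under that assumption, passing 1 as the optional fourth argument to the Grep example would correspond to calling extract with group 1, and leaving it off would correspond to group 0.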


Regarding the Hadoop 'grep' example, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6250784/
