Here is an example:
The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.
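Now, suppose I have two ranks, "Officer" and "Senior Officer", and I want to
replace the names that follow them with the generic token "PERSON". As you can see, three names follow a rank: Stuart, Jess, and George.
I don't understand why my regex solution fails to capture all of them. Here is my code: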
public static void main(String[] args) {
    String input = "The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.";
    ArrayList<String> ranks = new ArrayList<String>();
    ranks.add("Senior Officer");
    ranks.add("Officer");
    for (String rank : ranks) {
        Pattern pattern = Pattern.compile(".*" + rank + " ([a-zA-Z]*?) .*");
        Matcher m = pattern.matcher(input);
        if (m.find()) {
            System.out.println(rank);
            System.out.println(m.group(1));
        }
    }
}
This is its output:
Senior Officer
Stuart
Officer
Stuart
It captures Stuart twice (once via "Senior Officer" and once via "Officer") but misses Jess and George. I expected this output:
Senior Officer
Stuart
Officer
Stuart
Officer
Jess
Officer
George
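Best answer
Remove the greedy .* on both sides of the pattern, so that the match no longer swallows the whole line, and call m.find() in a while loop instead of a single if. This is enough: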
for (String rank : ranks) {
    Pattern pattern = Pattern.compile("\\b" + rank + "\\s+([a-zA-Z]*)");
    Matcher m = pattern.matcher(input);
    while (m.find()) {
        System.out.println(rank);
        System.out.println(m.group(1));
    }
}
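The loop above only prints the captured names. For the replacement the question actually asks for, something like the following might work. It is a minimal sketch, not part of the original answer: the class name RankMasker and the method maskNames are hypothetical, and it assumes a name is a single run of ASCII letters directly after the rank.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, introduced only for illustration.
class RankMasker {

    // Replaces the single word that follows any of the given ranks with "PERSON".
    static String maskNames(String input, List<String> ranks) {
        // Quote each rank so it is matched literally, then join them into one alternation.
        String alternation = ranks.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));

        // Group 1 holds the rank; the letters after the whitespace are the name.
        Pattern pattern = Pattern.compile("\\b(" + alternation + ")\\s+[a-zA-Z]+");
        Matcher m = pattern.matcher(input);

        // Keep the rank, drop the name.
        return m.replaceAll("$1 PERSON");
    }

    public static void main(String[] args) {
        String input = "The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.";
        System.out.println(maskNames(input, List.of("Senior Officer", "Officer")));
    }
}
```

Because all ranks are combined into one pattern, each name is replaced exactly once, so "Senior Officer Stuart" is not processed a second time as "Officer Stuart".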
Regex breakdown (per the comments). This describes an extended pattern that also captures names of more than one word:
Officer                          # Match "Officer" literally
(                                # Capturing group
  (?:                            # Non-capturing group
    \s                           # Match a whitespace character
    (?!(?:Senior\s+)?Officer)    # Negative lookahead: the next word must not be "Officer" or "Senior Officer"
    [A-Z][a-zA-Z]*               # Match a capital letter followed by any number of letters
  )*                             # Repeat until the next word does not start with a capital letter or is a rank
)
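The pattern broken down above could be applied roughly as follows. This is a sketch under assumptions: the class name NameExtractor, the method extractNames, and the sample sentence in main are made up for illustration, and the regex is assembled from the breakdown rather than copied from tested code in the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, introduced only for illustration.
class NameExtractor {

    // The pattern from the breakdown, written as a Java string literal.
    private static final Pattern RANK_AND_NAME = Pattern.compile(
            "Officer((?:\\s(?!(?:Senior\\s+)?Officer)[A-Z][a-zA-Z]*)*)");

    // Returns every run of capitalized words that directly follows "Officer".
    static List<String> extractNames(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = RANK_AND_NAME.matcher(input);
        while (m.find()) {
            // Group 1 starts with the separating whitespace, so trim it.
            String name = m.group(1).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up sentence with a two-word name.
        String input = "Senior Officer John Smith and Officer Jess met Officer George.";
        for (String name : extractNames(input)) {
            System.out.println(name);
        }
    }
}
```

Since "Senior Officer" ends in "Officer", this single pattern covers both ranks; the lookahead only keeps a following rank from being read as part of a name.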
Source: Stack Overflow question on Java, "Regex fails to capture all matches": https://stackoverflow.com/questions/38038516/