Java Regular Expression Matcher does not find all possible matches

Tags: java regex

I was looking at some code on TutorialsPoint, and something has been bothering me ever since. Take a look at this code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches
{
    public static void main( String args[] ){

      // String to be scanned to find the pattern.
      String line = "This order was placed for QT3000! OK?";
      String pattern = "(.*)(\\d+)(.*)";

      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);

      // Now create matcher object.
      Matcher m = r.matcher(line);
      while(m.find( )) {
         System.out.println("Found value: " + m.group(1));
         System.out.println("Found value: " + m.group(2));
         System.out.println("Found value: " + m.group(3));
      }
   }
}

This code successfully prints:

Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?

But given the regex "(.*)(\\d+)(.*)", why doesn't it return the other possible results, such as:

Found value: This order was placed for QT30
Found value: 00
Found value: ! OK?

Found value: This order was placed for QT
Found value: 3000
Found value: ! OK?

If this code is not suited to doing that, how can I write code that finds all possible matches?

Best answer

This happens because of the greediness of *, followed by backtracking.

String:

This order was placed for QT3000! OK?

Regex:

(.*)(\\d+)(.*)

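We all know that .* is greedy and matches as many characters as possible. So the first .* matches every character up to and including the final ?, and then backtracks in order to provide a match. The next pattern in our regex is \d+, so the engine backtracks until it reaches a digit. As soon as it finds one, \d+ matches that digit, because the condition is satisfied at that point (\d+ matches one or more digits). Now the first (.*) captures This order was placed for QT300 and the following (\\d+) captures the digit 0 located just before the ! symbol.

The next pattern, (.*), then captures all the remaining characters, !<space>OK?. m.group(1) refers to the characters captured by group index 1, m.group(2) refers to those captured by group index 2, and so on. The engine stops at the first way of dividing the string that succeeds, and because that match consumes the whole string, the next call to find() has nothing left to search. That is why the loop body runs only once.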
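To make the backtracking result visible, the following is a minimal sketch that prints where each group starts and ends in the input. The class name GreedyTrace and the helper split are hypothetical names introduced here for illustration; only the standard Matcher methods find, group, start, end and groupCount are used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyTrace {
    // Hypothetical helper: joins the three groups of the first match with '|',
    // or returns null when the regex does not match at all.
    static String split(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + "|" + m.group(2) + "|" + m.group(3);
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(line);
        // find() succeeds once here: the single match spans the whole line.
        while (m.find()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                // start(g) is inclusive, end(g) is exclusive.
                System.out.println("group " + g + " [" + m.start(g) + "," + m.end(g) + "): " + m.group(g));
            }
        }
        System.out.println(split("(.*)(\\d+)(.*)", line));
    }
}
```

The offsets show that group 1 ends exactly where the last digit begins, which is the first position the engine reaches while giving characters back from the end of the string.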


To get the output you want:

String line = "This order was placed for QT3000! OK?";
  String pattern = "(.*)(\\d{2})(.*)";

  // Create a Pattern object
  Pattern r = Pattern.compile(pattern);

  // Now create matcher object.
  Matcher m = r.matcher(line);
  while(m.find( )) {
     System.out.println("Found value: " + m.group(1));
     System.out.println("Found value: " + m.group(2));
     System.out.println("Found value: " + m.group(3));
  }

Output:

Found value: This order was placed for QT30
Found value: 00
Found value: ! OK?

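With (.*)(\\d{2}), the engine backtracks until two consecutive digits are available, and that provides the match.

Change your pattern to this,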

String pattern = "(.*?)(\\d+)(.*)";

to get output like this:

Found value: This order was placed for QT
Found value: 3000
Found value: ! OK?

A ? placed after * forces the * to match non-greedily, so the first group takes as few characters as possible.
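The three patterns discussed so far can be compared side by side. This is a small sketch, not part of the original answer; the class name QuantifierComparison and the helper digitsCaptured are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierComparison {
    // Hypothetical helper: returns what group 2 captured in the first match,
    // or null when the regex does not match.
    static String digitsCaptured(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        String[] patterns = {
            "(.*)(\\d+)(.*)",    // greedy prefix, one or more digits
            "(.*)(\\d{2})(.*)",  // greedy prefix, exactly two digits
            "(.*?)(\\d+)(.*)"    // non-greedy prefix, one or more digits
        };
        for (String p : patterns) {
            System.out.println(p + " -> " + digitsCaptured(p, line));
        }
    }
}
```

Only the quantifiers differ between the three patterns, so the change in group 2 comes entirely from how many characters the first group is willing to give up or take.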

Use extra capturing groups to get both outputs from a single program.

String line = "This order was placed for QT3000! OK?";
String pattern = "((.*?)(\\d{2}))(?:(\\d{2})(.*))";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while(m.find( )) {
   // First split: prefix plus first two digits, last two digits, remainder
   System.out.println("Found value: " + m.group(1));
   System.out.println("Found value: " + m.group(4));
   System.out.println("Found value: " + m.group(5));
   // Second split: prefix only, all four digits, remainder
   System.out.println("Found value: " + m.group(2));
   System.out.println("Found value: " + m.group(3) + m.group(4));
   System.out.println("Found value: " + m.group(5));
}

Output:

Found value: This order was placed for QT30
Found value: 00
Found value: ! OK?
Found value: This order was placed for QT
Found value: 3000
Found value: ! OK?
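The patterns above each produce one chosen split. If the goal is truly every way that (.*)(\d+)(.*) could divide the whole string, one option is to enumerate the splits by hand instead of asking the regex engine for them. The following is a sketch under stated assumptions and is not from the original answer: the class AllSplits and its methods are hypothetical names, the input is assumed to be a single line (since . does not match line terminators by default), and digits are taken to be ASCII 0-9, which is what \d matches in Java by default.

```java
import java.util.ArrayList;
import java.util.List;

public class AllSplits {
    // \d matches only ASCII digits unless UNICODE_CHARACTER_CLASS is set,
    // so this check mirrors the default behaviour.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Returns every {prefix, digits, suffix} triple such that
    // prefix + digits + suffix equals input and digits is a non-empty digit run.
    static List<String[]> allSplits(String input) {
        List<String[]> result = new ArrayList<>();
        for (int start = 0; start < input.length(); start++) {
            // Grow the digit run one character at a time from 'start'.
            for (int end = start + 1;
                 end <= input.length() && isAsciiDigit(input.charAt(end - 1));
                 end++) {
                result.add(new String[] {
                    input.substring(0, start),
                    input.substring(start, end),
                    input.substring(end)
                });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        for (String[] parts : allSplits(line)) {
            System.out.println("Found value: " + parts[0]);
            System.out.println("Found value: " + parts[1]);
            System.out.println("Found value: " + parts[2]);
            System.out.println();
        }
    }
}
```

For the sample line this yields ten splits, including the three listed in the question. The nested loop is quadratic in the length of the longest digit run, which is fine for short inputs like this one.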

For more on the Java regular expression Matcher not finding all possible matches, see the similar question on Stack Overflow: https://stackoverflow.com/questions/28038364/
