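java - How do I find a pattern using the Pattern and Matcher classes?

Tags: java regex

I am writing a program that needs to check for the position of a set of characters. My code currently is: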

String checkerLoop = "ForeclosureResutls_CaseNum_";
Pattern checkerLoopPattern = Pattern.compile("(?<="+Pattern.quote(checkerLoop)+").*?(?="+checkerNumber+")");
Matcher checkerLoopMatcher = checkerLoopPattern.matcher(scraper.getPage().getWebResponse().getContentAsString());

while (checkerLoopMatcher.find()) {
  checker = true;
}

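The string I need to find is "ForeclosureResutls_CaseNum_" + checkerNumber, where checkerNumber is an int. I adapted this from earlier code that finds a set of characters between two groups, so I believe that may be why it is not working.

A sample input string follows: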

<a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_0" href="javascript:__doPostBack(&#39;ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl02$lbCaseNum&#39;,&#39;&#39;)" style="display:inline-block;width:100px;">CV-13-798497</a>
                    </td><td align="center">488-05-029</td><td align="center">I</td><td align="center">01/02/2013</td>
        </tr><tr style="background-color:Gainsboro;">
            <td align="left">UNKNOWN HEIRS, ETC OF D.C. RUFUS, ET AL  </td><td align="left">10603 HAMPDEN AVENUE</td><td align="center">CLEVELAND</td><td align="center">44108-0000</td><td align="center">
                        <a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1" href="javascript:__doPostBack(&#39;ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl03$lbCaseNum&#39;,&#39;&#39;)" style="display:inline-block;width:100px;">CV-13-798498</a>
                    </td><td align="center">109-16-094</td><td align="center">A</td><td align="center">01/02/2013</td>
        </tr><tr style="background-color:LightGrey;">
            <td align="left">SHARECE MILLER, ET AL  </td><td align="left">13514 ALVIN AVENUE</td><td align="center">GARFIELD HTS</td><td align="center">44105-0000</td><td align="center">
                        <a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_2" href="javascript:__doPostBack(&#39;ctl00$Shee

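Best answer

OK, here's what I've got. I haven't fully met your requirements, but this should help put you on the right path.

First, ForeclosureResutls_CaseNum_ does not appear anywhere in this demo data. ForeclosureResutls_lbCaseNum_ does, so that is what I went with.

In addition, I am ignoring checkerNumber and assuming you want to check for any number, since there are three in this input and I don't know how yours is derived. Hence the \\d.

As far as I can tell, the regular expression in your post is crazy considering what you need to do. The one I use is trivial by comparison.

Try this: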
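To make the comparison concrete, here is a minimal sketch that runs both patterns against one id attribute copied from the sample input. The class name CompareCheckerPatterns, the helper found, and the checkerNumber value of 1 are made up for illustration; they are not from the question.

```java
import java.util.regex.Pattern;

public class CompareCheckerPatterns {
   // Returns true if the regex matches anywhere in the input.
   static boolean found(String regex, String input) {
      return Pattern.compile(regex).matcher(input).find();
   }

   public static void main(String[] args) {
      // One id attribute copied from the sample input.
      String input = "<a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1\"";
      int checkerNumber = 1;  // assumed value, for illustration only

      // The question's pattern: a lookbehind for a prefix that never occurs in the data.
      String original = "(?<=" + Pattern.quote("ForeclosureResutls_CaseNum_") + ").*?(?=" + checkerNumber + ")";
      // The simpler pattern: the prefix that does occur, followed by a single digit.
      String simple = Pattern.quote("ForeclosureResutls_lbCaseNum_") + "\\d";

      System.out.println("original: " + found(original, input));  // false
      System.out.println("simple:   " + found(simple, input));    // true
   }
}
```

The lookbehind fails because the data contains "lbCaseNum", not "CaseNum", directly after "ForeclosureResutls_", so the question's loop body never runs.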

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
   <P>{@code java ParseForclosureResultsXmpl}</P>
 **/
public class ParseForclosureResultsXmpl  {
   public static final void main(String[] igno_red)  {
      String sLS = System.getProperty("line.separator", "\n");

      StringBuilder sdInput = new StringBuilder().
         append("<a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_0\" href=\"javascript:__doPostBack(&#39;ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl02$lbCaseNum&#39;,&#39;&#39;)\" style=\"display:inline-block;width:100px;\">CV-13-798497</a>").append(sLS).
         append("              </td><td align=\"center\">488-05-029</td><td align=\"center\">I</td><td align=\"center\">01/02/2013</td>").append(sLS).
         append("  </tr><tr style=\"background-color:Gainsboro;\">").append(sLS).
         append("      <td align=\"left\">UNKNOWN HEIRS, ETC OF D.C. RUFUS, ET AL  </td><td align=\"left\">10603 HAMPDEN AVENUE</td><td align=\"center\">CLEVELAND</td><td align=\"center\">44108-0000</td><td align=\"center\">").append(sLS).
         append("                  <a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1\" href=\"javascript:__doPostBack(&#39;ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl03$lbCaseNum&#39;,&#39;&#39;)\" style=\"display:inline-block;width:100px;\">CV-13-798498</a>").append(sLS).
         append("              </td><td align=\"center\">109-16-094</td><td align=\"center\">A</td><td align=\"center\">01/02/2013</td>").append(sLS).
         append("  </tr><tr style=\"background-color:LightGrey;\">").append(sLS).
         append("      <td align=\"left\">SHARECE MILLER, ET AL  </td><td align=\"left\">13514 ALVIN AVENUE</td><td align=\"center\">GARFIELD HTS</td><td align=\"center\">44105-0000</td><td align=\"center\">").append(sLS).
         append("                  <a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_2\" href=\"javascript:__doPostBack(&#39;ctl00$Shee").append(sLS);

      String sRqdValuePrefix = "ForeclosureResutls_lbCaseNum_";
      Pattern checkerLoopPattern = Pattern.compile(sRqdValuePrefix + "\\d");
      Matcher m = checkerLoopPattern.matcher("");  //Placeholder input, never searched; lets the matcher be reused in the loop.

      int iLn = 0;
      String[] asInput = sdInput.toString().split(sLS);
      for(String s : asInput)  {
         iLn++;    //1st iteration: Was zero, now 1

         //Reusing the matcher instead of retrieving a new one from the Pattern each iteration
         m.reset(s);

         if(m.find())  {
            int iCheckerNumber = Integer.parseInt(s.substring(m.start() + sRqdValuePrefix.length(), m.end()));
            System.out.println("Found on line " + iLn + ", at index " + m.start() + " with checker number " + iCheckerNumber);
         }
      }
   }
}

Output:

[C:\java_code\]java ParseForclosureResultsXmpl
Found on line 1, at index 39 with checker number 0
Found on line 5, at index 57 with checker number 1
Found on line 9, at index 57 with checker number 2
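If you do need one specific checkerNumber rather than any digit, or your indexes can run past 9, the following is a rough sketch of how the same idea might be extended. The class name FindCheckerNumber, both helper methods, and the shortened two-id page string are hypothetical and only there for illustration; the prefix is the one found in the demo data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindCheckerNumber {
   static final String PREFIX = "ForeclosureResutls_lbCaseNum_";

   // Returns the index where PREFIX + checkerNumber starts in page, or -1 if absent.
   static int indexOfChecker(String page, int checkerNumber) {
      // (?!\d) stops checkerNumber 1 from matching inside 10, 11, 12, ...
      Pattern p = Pattern.compile(Pattern.quote(PREFIX) + checkerNumber + "(?!\\d)");
      Matcher m = p.matcher(page);
      return m.find() ? m.start() : -1;
   }

   // Collects every number that follows PREFIX, including multi-digit ones.
   static List<Integer> allCheckerNumbers(String page) {
      Matcher m = Pattern.compile(Pattern.quote(PREFIX) + "(\\d+)").matcher(page);
      List<Integer> numbers = new ArrayList<>();
      while (m.find()) {
         numbers.add(Integer.parseInt(m.group(1)));
      }
      return numbers;
   }

   public static void main(String[] args) {
      // Shortened, made-up stand-in for the page content.
      String page = "id=\"gvForeclosureResutls_lbCaseNum_12\" id=\"gvForeclosureResutls_lbCaseNum_1\"";

      System.out.println(indexOfChecker(page, 12));  // 6
      System.out.println(indexOfChecker(page, 1));   // 45
      System.out.println(indexOfChecker(page, 3));   // -1
      System.out.println(allCheckerNumbers(page));   // [12, 1]
   }
}
```

This version searches the whole page in one pass instead of splitting it into lines, so it reports a character index rather than a line number.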

Feel free to ask any questions.

Regarding "java - How do I find a pattern using the Pattern and Matcher classes?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21371165/
