java - Behavior of the | symbol in a regular expression

Tags: java regex split

This is my string:

String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC";

I split it with:

String a[] = s.split(s, i);

Output for i=0:

        |   |   1   9   0   |   |   R   U   E       R   A   C   H   E   L   L   E   |   |   S   T   |   |   |   L   E   S       C   È   D   R   E   S   |   J   7   T   1   J   9   |   Q   C   

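The first two array elements are empty, and each following element holds a single character.

For i=1, the output is the entire original string: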

asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC

For i=2, the output is:

    ||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC  

The first array element is empty, and the second holds the substring that starts at the first | symbol.

For i=3, the output is:

        ||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC

The first two elements are empty, and the last one holds the same substring as for i=2.

For i=4, the output is:

        |   |190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC   

The first two elements are empty, the next one holds a single pipe, and the last one holds the rest.

For i=5, the output is:

        |   |   190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC    

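The first two elements are empty, the next two each hold one pipe character, and the last one holds the rest.

As i increases further, the output follows the same pattern:

the first two elements are empty
every following element except the last holds one character
the last element holds the remaining string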

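My questions are:

  1. Why is the first word, the one before the first pipe symbol, left out of the output?
  2. Why are the first two elements empty for every value of i except 1?
  3. The pattern here is the string itself, so what is actually being matched, and how does this output come about?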

Another thing: if I replace the pipe symbol with any other symbol, such as @, ! or %, the output is as follows.

For i>=2:

The array has length 2, and both elements are empty strings.

For i=0:

The array has length 0.

For i=1:

The array has length 1 and contains the whole string.

Is the pipe symbol treated as a special regex symbol?

Any help would be appreciated.

Best answer

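The split method takes a regular expression as its first argument. Your regex here is asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC, in which | is the alternation operator. The second argument, i, is the limit: for a positive limit the pattern is applied at most i-1 times and the array has at most i elements, while a limit of 0 applies the pattern as many times as possible and discards trailing empty strings. Here is a breakdown of your regex: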

                         // Match either the regular expression below (attempting the next alternative only if this one fails)
   "asadsdas357902" +       // Match the characters “asadsdas357902” literally
"|" +                    // Or match regular expression number 2 below (attempting the next alternative only if this one fails)
   "|" +                    // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match
                         // Or match regular expression number 3 below (attempting the next alternative only if this one fails)
   "190" +                  // Match the characters “190” literally
"|" +                    // Or match regular expression number 4 below (attempting the next alternative only if this one fails)
   "|" +                    // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match
                         // Or match regular expression number 5 below (attempting the next alternative only if this one fails)
   "RUE\\ RACHELLE" +        // Match the characters “RUE RACHELLE” literally
"|" +                    // Or match regular expression number 6 below (attempting the next alternative only if this one fails)
   "|" +                    // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match
                         // Or match regular expression number 7 below (attempting the next alternative only if this one fails)
   "ST" +                   // Match the characters “ST” literally
"|" +                    // Or match regular expression number 8 below (attempting the next alternative only if this one fails)
   "|" +                    // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match
                         // Or match regular expression number 9 below (attempting the next alternative only if this one fails)
   "|" +                    // Empty alternative effectively truncates the regex at this point because it will always find a zero-width match
                         // Or match regular expression number 10 below (attempting the next alternative only if this one fails)
   "LES\\ CÈDRES" +          // Match the characters “LES CÈDRES” literally
"|" +                    // Or match regular expression number 11 below (attempting the next alternative only if this one fails)
   "J7T1J9" +               // Match the characters “J7T1J9” literally
"|" +                    // Or match regular expression number 12 below (the entire match attempt fails if this one fails to match)
   "QC"                     // Match the characters “QC” literally

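So your regex is effectively equivalent to asadsdas357902|: wherever the first alternative fails, the empty alternative succeeds with a zero-width match, so the alternatives after it are never tried. The leading word is therefore consumed as a delimiter, and every later position yields a zero-width match that splits the rest of the string into single characters. See the documentation for String#split.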
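
The claim can be checked by listing the spans the matcher actually finds. This is a minimal sketch, assuming a shortened version of the question's string (only the prefix matters here); the class name AlternationMatches and the helper firstSpans are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationMatches {
    // Collects up to `count` matches of `regex` in `input` as "start-end" spans.
    static List<String> firstSpans(String regex, String input, int count) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (spans.size() < count && m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        // Shortened form of the string from the question (an assumption for brevity).
        String s = "asadsdas357902||190||RUE RACHELLE";
        // The string is used as its own regex, as in the question.
        System.out.println(firstSpans(s, s, 4)); // prints [0-14, 14-14, 15-15, 16-16]
    }
}
```

Only the first span has a positive width; every later one is a zero-width match produced by the empty alternative, which is why "190" and the alternatives after it never get a chance to match.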

The following code gives the same output as splitting on the full string:

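import java.util.Arrays;

// Splits on the reduced regex for limits 0 through 9 and prints each result.
private static void splitWithPipe() {
    String s = "asadsdas357902||190||RUE RACHELLE||ST|||LES CÈDRES|J7T1J9|QC";
    for (int i = 0; i < 10; i++) {
        // "asadsdas357902|" matches the leading word or, failing that, the empty string
        String[] a = s.split("asadsdas357902|", i);
        System.out.println(Arrays.toString(a));
    }
}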
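
If the intent was to split on the literal pipe character (an assumption; the question does not say so), the pipe has to be escaped so that it is no longer read as alternation. A minimal sketch using Pattern.quote, which also shows how the limit argument changes the result; the class name LiteralPipeSplit, the helper splitOnPipe, and the sample input are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralPipeSplit {
    // Splits on a literal "|"; Pattern.quote disables its alternation meaning.
    // The escaped form "\\|" would work as well.
    static String[] splitOnPipe(String input, int limit) {
        return input.split(Pattern.quote("|"), limit);
    }

    public static void main(String[] args) {
        String sample = "357902||190|QC||"; // hypothetical sample with trailing empty fields
        // limit 0: split everywhere, drop trailing empty strings
        System.out.println(Arrays.toString(splitOnPipe(sample, 0)));  // [357902, , 190, QC]
        // limit 2: apply the pattern once, keep the rest intact
        System.out.println(Arrays.toString(splitOnPipe(sample, 2)));  // [357902, |190|QC||]
        // negative limit: split everywhere, keep trailing empty strings
        System.out.println(Arrays.toString(splitOnPipe(sample, -1))); // [357902, , 190, QC, , ]
    }
}
```

Passing a negative limit is the usual choice when empty trailing fields carry meaning, as they often do in delimited records.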

Regarding java - behavior of the | symbol in a regular expression, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9919971/
