java - Grouping sentences separated by a specific word

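I am trying to group two sub-sentences of any reasonable length that are separated by a specific word ("AND" in the example), where the second sub-sentence is optional. Some examples: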

Case 1:

foo sentence A AND foo sentence B

should give

"foo sentence A" --> matching group 1

"AND" --> matching group 2 (optionally)

"foo sentence B" --> matching group 3

Case 2:

foo sentence A

should give

"foo sentence A" --> matching group 1

"" --> matching group 2 (optionally)

"" --> matching group 3

I tried the following regular expression:

(.*) (AND (.*))?$

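It works, but in case 2 only if I put a space at the very end of the string; otherwise the pattern does not match. If I move the space before "AND" inside the parenthesized group, then in case 1 the matcher puts the whole string into the first group. I wondered about lookahead and lookbehind assertions, but I am not sure they would help. Any suggestions? Thanks.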
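
The behavior described above can be reproduced with a short sketch. The class and method names here are hypothetical, and the second pattern is only one reading of "the space moved inside the group"; neither comes from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OriginalPatternDemo {
    // The pattern from the question: the space before the optional group is mandatory.
    static final Pattern ORIGINAL = Pattern.compile("(.*) (AND (.*))?$");

    // Assumed variant: the space moved inside the optional group.
    static final Pattern SPACE_INSIDE = Pattern.compile("(.*)( AND (.*))?$");

    /** Returns group 1 if the whole input matches the pattern, or null if it does not. */
    static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String case1 = "foo sentence A AND foo sentence B";
        String case2 = "foo sentence A";

        System.out.println(firstGroup(ORIGINAL, case1));       // foo sentence A
        System.out.println(firstGroup(ORIGINAL, case2 + " ")); // foo sentence A
        System.out.println(firstGroup(ORIGINAL, case2));       // null (no match)

        // The greedy (.*) consumes everything and the optional group is skipped.
        System.out.println(firstGroup(SPACE_INSIDE, case1));   // the whole string
    }
}
```

In both patterns the first group is greedy, so it either needs the trailing space (first pattern) or takes the entire input (second pattern).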

Best answer

I would use this regular expression:

^(.*?)(?: (AND) (.*))?$

Explanation:

The regular expression:

(?-imsx:^(.*?)(?: (AND) (.*))?$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      AND                      'AND'
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \3:
----------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \3
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
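
As a rough illustration of how this pattern might be used from Java, here is a minimal sketch. The class name `SentenceSplitter` and the helper `split` are hypothetical and not part of the answer. Note that in Java an optional group that did not take part in the match returns `null` from `Matcher.group`, so the sketch maps `null` to `""` to get the empty strings the question asks for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceSplitter {
    // The pattern from the answer: lazy first group, then an optional
    // non-capturing " AND ..." tail that holds groups 2 and 3.
    static final Pattern SPLIT = Pattern.compile("^(.*?)(?: (AND) (.*))?$");

    /** Returns {group 1, group 2, group 3}, using "" for groups that did not participate. */
    static String[] split(String input) {
        Matcher m = SPLIT.matcher(input);
        if (!m.matches()) {
            // Can happen for multi-line input, because '.' does not match '\n' by default.
            throw new IllegalArgumentException("no match: " + input);
        }
        String[] parts = new String[3];
        for (int i = 0; i < 3; i++) {
            String g = m.group(i + 1);
            parts[i] = (g == null) ? "" : g;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Case 1: all three groups are filled.
        System.out.println(String.join(" | ", split("foo sentence A AND foo sentence B")));
        // Case 2: only group 1 is filled; groups 2 and 3 become "".
        System.out.println(String.join(" | ", split("foo sentence A")));
    }
}
```

Because the first group is lazy, an input with several separators is split at the first " AND ": for "a AND b AND c", group 1 is "a" and group 3 is "b AND c".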

Regarding "java - Grouping sentences separated by a specific word", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16753986/
