java - Grouping sentences separated by a specific word

Tags: java regex

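I am trying to group two sub-sentences of any reasonable length that are separated by a specific word ("AND" in the example), where the second sub-sentence is optional. Some examples: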

Case 1:

foo sentence A AND foo sentence B

should give

"foo sentence A" --> matching group 1

"AND" --> matching group 2 (optionally)

"foo sentence B" --> matching group 3

Case 2:

foo sentence A

should give

"foo sentence A" --> matching group 1
"" --> matching group 2 (optionally)
"" --> matching group 3

I tried the following regex:

(.*) (AND (.*))?$

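It works, but only if, in case 2, I put a space at the very end of the string; otherwise the pattern does not match. If I move the space before "AND" inside the parenthesized group, then in case 1 the matcher puts the whole string into the first group. I wondered about lookahead and lookbehind assertions, but I am not sure whether they would help. Any suggestions? Thanks.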
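The behavior described above can be reproduced with a short sketch. The class name OriginalPatternDemo, the constant names, and the helper firstGroup are hypothetical and not from the question, and the sketch assumes the pattern is applied with Matcher.find(), which the question does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OriginalPatternDemo {
    // The pattern from the question: a mandatory space follows group 1.
    public static final Pattern SPACE_OUTSIDE = Pattern.compile("(.*) (AND (.*))?$");
    // The variant with the space moved inside the optional group.
    public static final Pattern SPACE_INSIDE = Pattern.compile("(.*)( AND (.*))?$");

    /** Returns group 1 of the first match, or null when the pattern does not match. */
    public static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // No trailing space: the mandatory space can never be followed by the end of input.
        System.out.println(firstGroup(SPACE_OUTSIDE, "foo sentence A"));
        // Trailing space: the mandatory space is satisfied, so group 1 is the sentence.
        System.out.println(firstGroup(SPACE_OUTSIDE, "foo sentence A "));
        // Space inside the group: the greedy (.*) consumes everything and the
        // optional group is simply skipped.
        System.out.println(firstGroup(SPACE_INSIDE, "foo sentence A AND foo sentence B"));
    }
}
```

In the last case the greedy first group reaches the end of the input before the optional group is ever tried, and since an empty match of the optional part already satisfies $, the regex engine has no reason to backtrack.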

Best answer

I would use this regex:

^(.*?)(?: (AND) (.*))?$

Explanation:

The regular expression:

(?-imsx:^(.*?)(?: (AND) (.*))?$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      AND                      'AND'
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \3:
----------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \3
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
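As a minimal sketch of how this regex might be used from Java (the class name SentenceSplitter and the method split are hypothetical, not part of the answer): Matcher.group(n) returns null for a group that did not take part in the match, so the sketch maps null to "" to get the empty strings asked for in case 2.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SentenceSplitter {
    // The regex from the answer: lazy first group, then an optional
    // non-capturing tail that holds " AND " and the second sentence.
    private static final Pattern PATTERN = Pattern.compile("^(.*?)(?: (AND) (.*))?$");

    /** Returns {group 1, group 2, group 3}, using "" for groups that did not match. */
    public static String[] split(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            // Reachable for input containing a line break, since . does not match \n.
            throw new IllegalArgumentException("No match: " + input);
        }
        String[] groups = new String[3];
        for (int i = 0; i < groups.length; i++) {
            String g = m.group(i + 1); // null when the optional part is absent
            groups[i] = (g == null) ? "" : g;
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(split("foo sentence A")));
    }
}
```

Because the first group is lazy, an input with several separators such as "foo AND bar AND baz" is split at the first " AND ", leaving "bar AND baz" in group 3.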

Regarding java - grouping sentences separated by a specific word, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/16753986/
