I am trying to extract sentences from a paragraph that follows this pattern:
Current. time is six thirty at Scotland. Past. time was five thirty at India; Current. time is five thirty at Scotland. Past. time was five thirty at Scotland. Current. time is five ten at Scotland.
When I use the regex
/current\..*scotland\./i
it matches the entire string:
Current. time is six thirty at Scotland. Past. time was six thirty at India; Current. time is five thirty at Scotland. Past. time was five thirty at Scotland. Current. time is five ten at Scotland.
Instead, I want each match to stop at the first occurrence of "." and to capture all the groups, like:
Current. time is six thirty at Scotland.
Current. time is five ten at Scotland.
Similarly, for text like
Past. time was five thirty at India; Current. time is six thirty at Scotland. Past. time was five thirty at Scotland. Past. time was five ten at India;
when I use a regex like this
/past\..*india\;/i
it matches the entire string:
Past. time was five thirty at India; Current. time is six thirty at Scotland. Past. time was five thirty at Scotland. Past. time was five ten at India;
Here I want to capture all the groups, or just the first one, and stop at the first occurrence of ";":
Past. time was five thirty at India;
Past. time was five ten at India;
How can I make the regex stop at "." or ";" for the examples above?
Best answer
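There are a few things you really shouldn't be doing with your regex. First, as Arnal Murali pointed out, you shouldn't use a greedy quantifier; use the lazy version instead: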
/current\..*?scotland\./i
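As a quick check of the difference, here is a minimal sketch that runs both the greedy and the lazy pattern over the first sample text with String#scan. The variable names (text, greedy, lazy) are only illustrative and are not part of the original question.

```ruby
# First sample text from the question, split across lines for readability.
text = "Current. time is six thirty at Scotland. Past. time was five thirty at India; " \
       "Current. time is five thirty at Scotland. Past. time was five thirty at Scotland. " \
       "Current. time is five ten at Scotland."

# Greedy: .* runs to the end and backtracks only to the LAST "Scotland.".
greedy = text.scan(/current\..*scotland\./i)

# Lazy: .*? stops at the FIRST "Scotland." after each "Current.".
lazy = text.scan(/current\..*?scotland\./i)

puts greedy.length  # 1 (a single match covering the whole string)
puts lazy.length    # 3
puts lazy
```

The lazy version returns each "Current. ... Scotland." sentence separately, whereas the greedy version returns one match that spans the whole text.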
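I think reaching for the lazy option first is a good general rule with regexes, since it is usually what you want. Second, you really don't want to use . to match everything, because you don't want that part of the regex to match "." or ";" at all. You can put them in a negated character class so that it matches anything except them: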
/current\.[^.]*?scotland\./i
and
/past\.[^.;]*?india;/i
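To see why excluding the terminators matters, here is a small sketch on the second sample text. It assumes a pattern that starts at "Past." and excludes both "." and ";" between the start and "India;". The names text2, lazy_only and bounded are made up for this illustration.

```ruby
# Second sample text from the question.
text2 = "Past. time was five thirty at India; Current. time is six thirty at Scotland. " \
        "Past. time was five thirty at Scotland. Past. time was five ten at India;"

# Lazy alone: the second match starts at "Past. ... Scotland." and keeps
# expanding across the "." until it finally reaches "India;".
lazy_only = text2.scan(/past\..*?india;/i)

# Negated character class: the match may not cross "." or ";", so a
# "Past." sentence that ends in "Scotland." is skipped entirely.
bounded = text2.scan(/past\.[^.;]*?india;/i)

puts lazy_only.last  # spans two sentences
puts bounded
```

With the negated class, only the two "Past. ... India;" sentences are returned, which is the output the question asks for.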
or, to cover both at once:
/(current|past)\.[^.;]*?(india|scotland)[.;]/i
(Obviously this may not be what you actually want to do; it is only included to show how the pattern can be extended.)
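One thing to watch for with the combined pattern in Ruby: when a regex contains capture groups, String#scan returns the captured groups rather than the full matches. The sketch below, on a shortened sample string chosen for illustration, shows both behaviours; the non-capturing (?:...) variant is an addition here, not part of the original answer.

```ruby
sample = "Past. time was five thirty at India; Current. time is six thirty at Scotland."

# With capture groups, scan yields one array of groups per match.
pairs = sample.scan(/(current|past)\.[^.;]*?(india|scotland)[.;]/i)

# With non-capturing groups, scan yields the full matched sentences.
sentences = sample.scan(/(?:current|past)\.[^.;]*?(?:india|scotland)[.;]/i)

p pairs      # [["Past", "India"], ["Current", "Scotland"]]
p sentences  # ["Past. time was five thirty at India;", "Current. time is six thirty at Scotland."]
```

Note that this combined pattern also accepts pairings such as "Past. ... Scotland.", so it is broader than the two separate patterns.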
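It is also a good rule of thumb that, if you are having trouble with a regex, you should make any wildcards more specific (in this case, changing from . which matches any character, to [^.;] which matches any character except "." and ";").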
Regarding "ruby - Want regex to stop at the first occurrence of "." and ";"", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24204735/