regex - Match everything between two strings

Tags: regex, r

Suppose I have this string:

string <- "I2-1-EX-1-I3-1-EX-1-I2-1-I1-1-EX-1-I3-1-I2-1-EX-1-I2-1-I2-1-I1-1-I3-1-N2-1-I1-1-I1-1-I2-1-N2-1-N3-1-I1-1-NR-1-FA-1-NR-1-I3-1-I1-1-NR-1-N1-1-EX-1-QU-1-I3-1-NR-1-FA-1-EX-1-QU-1-NR-1-I2-1-I2-1-I2-1-NR-1-TR-1-I1-1-I2-1-I3-1-NR-1-I1-1-I1-1-EX-1-NR-1-NR-1-I1-1-NR-1-NR-1-I3-1-I2-1-NR-1-I1-1-QU-1-QU-1-I1-1-TR-1-QU-1-NR-1-NR-1-QU-1-TR-1-NR-1-I1-1-TR-1-I1-1-FA-1-I1-1-I2-1-QU-1-TR-1-FA-1-EX-1-QU-1-QU-1-QU-1-NR-1-QU-1-I1-1-TR-1-FA-1-QU-1-FA-1-FA-1-TR-1-FA-1-QU-1-EX-1-QU-1-I1-1-QU-1-QU-1-FA-1-FA-1-QU-1-QU-1-FA-1-FA-1-I3-1-NR-1-FA-1-I1-1-I2-1-FA-1-QU-1-FA-1-I2-1-FA-1-NR-1-I1-1-NR-1-TR-1-NR-1-EX-1-NR-1-NR-1-EX-1-TR-1-I3-1-I1-1-NR-1-NR-1-FA-1-I1-1-TR-1-EX-1-NR-1-NR-1-I1-1-I1-1-NR-1-I1-1-NR-1-EX-1-EX-1-EX-1-NR-1-NR-1-NR-1-FA-1-FA"

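I want to match everything that occurs between two tokens containing "I". For example, starting from the beginning of the string, this means matching: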

-EX-
-EX-
-EX-
-EX-
-N2-
-N2-1-N3-
-NR-1-FA-1-NR-
etc...

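How can I achieve this kind of matching with a regex (ideally one that works in R)?

I tried something like (?=<1|2|3).*(?=I), but it does not seem to work. My reasoning behind that regex was that every I token ends in 1, 2, or 3, which would be the left-hand boundary the lookbehind should find, while I is the right-hand boundary the lookahead should find.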

Best answer

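It seems you are trying to get all the characters between I[123]-1 and 1-I[123]. \K keeps the text matched so far out of the overall regex match. (?:(?!I[123]).)*? matches any single character only if it is not the I that begins an I[123] token; otherwise the match fails at that position.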
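As a rough illustration of what \K changes, the sketch below runs the pattern with and without it on a shortened input. The string `short` and the variable names are made up for this example and are not from the question.

```r
# Shortened, made-up input for illustration only (not the string from the question).
short <- "I2-1-EX-1-I3-1-FA"

# Same pattern as in the answer, once with \K and once without it.
with_k    <- "I[123]-1\\K-(?:(?!I[123]).)*?-(?=1-I[123])"
without_k <- "I[123]-1-(?:(?!I[123]).)*?-(?=1-I[123])"

# perl = TRUE is needed because \K and lookarounds are PCRE features.
res_with    <- regmatches(short, gregexpr(with_k, short, perl = TRUE))[[1]]
res_without <- regmatches(short, gregexpr(without_k, short, perl = TRUE))[[1]]

print(res_with)     # "-EX-"
print(res_without)  # "I2-1-EX-"
```

Without \K the leading I2-1 is part of the reported match; with \K it still has to be present, but it is dropped from the result. The trailing 1-I3 is never part of the match in either case, because it sits inside a lookahead.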

> x <- "I2-1-EX-1-I3-1-EX-1-I2-1-I1-1-EX-1-I3-1-I2-1-EX-1-I2-1-I2-1-I1-1-I3-1-N2-1-I1-1-I1-1-I2-1-N2-1-N3-1-I1-1-NR-1-FA-1-NR-1-I3-1-I1-1-NR-1-N1-1-EX-1-QU-1-I3-1-NR-1-FA-1-EX-1-QU-1-NR-1-I2-1-I2-1-I2-1-NR-1-TR-1-I1-1-I2-1-I3-1-NR-1-I1-1-I1-1-EX-1-NR-1-NR-1-I1-1-NR-1-NR-1-I3-1-I2-1-NR-1-I1-1-QU-1-QU-1-I1-1-TR-1-QU-1-NR-1-NR-1-QU-1-TR-1-NR-1-I1-1-TR-1-I1-1-FA-1-I1-1-I2-1-QU-1-TR-1-FA-1-EX-1-QU-1-QU-1-QU-1-NR-1-QU-1-I1-1-TR-1-FA-1-QU-1-FA-1-FA-1-TR-1-FA-1-QU-1-EX-1-QU-1-I1-1-QU-1-QU-1-FA-1-FA-1-QU-1-QU-1-FA-1-FA-1-I3-1-NR-1-FA-1-I1-1-I2-1-FA-1-QU-1-FA-1-I2-1-FA-1-NR-1-I1-1-NR-1-TR-1-NR-1-EX-1-NR-1-NR-1-EX-1-TR-1-I3-1-I1-1-NR-1-NR-1-FA-1-I1-1-TR-1-EX-1-NR-1-NR-1-I1-1-I1-1-NR-1-I1-1-NR-1-EX-1-EX-1-EX-1-NR-1-NR-1-NR-1-FA-1-FA"
> regmatches(x, gregexpr("I[123]-1\\K-(?:(?!I[123]).)*?-(?=1-I[123])", x , perl=TRUE))
[[1]]
 [1] "-EX-"                                             
 [2] "-EX-"                                             
 [3] "-EX-"                                             
 [4] "-EX-"                                             
 [5] "-N2-"                                             
 [6] "-N2-1-N3-"                                        
 [7] "-NR-1-FA-1-NR-"                                   
 [8] "-NR-1-N1-1-EX-1-QU-"                              
 [9] "-NR-1-FA-1-EX-1-QU-1-NR-"                         
[10] "-NR-1-TR-"                                        
[11] "-NR-"                                             
[12] "-EX-1-NR-1-NR-"                                   
[13] "-NR-1-NR-"                                        
[14] "-NR-"                                             
[15] "-QU-1-QU-"                                        
[16] "-TR-1-QU-1-NR-1-NR-1-QU-1-TR-1-NR-"               
[17] "-TR-"                                             
[18] "-FA-"                                             
[19] "-QU-1-TR-1-FA-1-EX-1-QU-1-QU-1-QU-1-NR-1-QU-"     
[20] "-TR-1-FA-1-QU-1-FA-1-FA-1-TR-1-FA-1-QU-1-EX-1-QU-"
[21] "-QU-1-QU-1-FA-1-FA-1-QU-1-QU-1-FA-1-FA-"          
[22] "-NR-1-FA-"                                        
[23] "-FA-1-QU-1-FA-"                                   
[24] "-FA-1-NR-"                                        
[25] "-NR-1-TR-1-NR-1-EX-1-NR-1-NR-1-EX-1-TR-"          
[26] "-NR-1-NR-1-FA-"                                   
[27] "-TR-1-EX-1-NR-1-NR-"                              
[28] "-NR-" 
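
If the regex feels hard to maintain, a non-regex cross-check is possible. The sketch below is not the answer's method: it assumes every token is separated by the literal text -1-, and the helper name `tokens_between_i` and the shortened input are hypothetical. It splits the string into tokens, locates the I1/I2/I3 tokens, and rebuilds each run of tokens found between two consecutive I tokens.

```r
# Hypothetical helper, assuming tokens are always separated by the literal "-1-".
tokens_between_i <- function(s) {
  tokens <- strsplit(s, "-1-", fixed = TRUE)[[1]]
  i_pos  <- which(grepl("^I[123]$", tokens))   # positions of the I tokens
  if (length(i_pos) < 2) return(character(0))  # need two boundaries

  out <- character(0)
  for (k in seq_len(length(i_pos) - 1)) {
    lo <- i_pos[k] + 1       # first token after the left I token
    hi <- i_pos[k + 1] - 1   # last token before the right I token
    if (lo <= hi) {
      # Rebuild the run in the same "-A-1-B-" shape the regex returns.
      out <- c(out, paste0("-", paste(tokens[lo:hi], collapse = "-1-"), "-"))
    }
  }
  out
}

# Shortened, made-up input for illustration only.
short_input <- "I2-1-EX-1-I3-1-I1-1-N2-1-N3-1-I1-1-FA"
split_result <- tokens_between_i(short_input)

# The regex from the answer, applied to the same input for comparison.
pattern <- "I[123]-1\\K-(?:(?!I[123]).)*?-(?=1-I[123])"
regex_result <- regmatches(short_input, gregexpr(pattern, short_input, perl = TRUE))[[1]]

print(split_result)
print(identical(split_result, regex_result))
```

On this shortened input both approaches return "-EX-" and "-N2-1-N3-". Adjacent I tokens (I3 followed by I1) produce nothing, and the trailing FA is skipped because no I token follows it.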


Source: a similar question on Stack Overflow about regex - matching everything between two strings: https://stackoverflow.com/questions/29025619/
