I'm trying to come up with a regular expression in R that matches strings in which two different characters are each repeated.
x <- c("aaaaaaah" ,"aaaah","ahhhh","cooee","helloee","mmmm","noooo","ohhhh","oooaaah","ooooh","sshh","ummmmm","vroomm","whoopee","yippee")
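This regular expression matches all of the above, including strings such as "mmmm" and "ohhhh", where the repeated letter is the same in the first and second repetition:
grep(".*([a-z])\\1.*([a-z])\\2", x, value = T)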
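To see why strings like "mmmm" slip through, it can help to look at what each capture group holds. The following is a small sketch, not part of the original question: the variable names (pattern, words, groups, same_letter) are made up for illustration, and it assumes an R version where regexec() accepts perl = TRUE.

```r
# The original pattern, which places no constraint between group 1 and group 2
pattern <- ".*([a-z])\\1.*([a-z])\\2"
words <- c("mmmm", "ohhhh", "cooee")

# regexec() returns the full match followed by each capture group
m <- regmatches(words, regexec(pattern, words, perl = TRUE))

# Keep only the two captured letters (elements 2 and 3 of each match)
groups <- t(sapply(m, function(g) g[2:3]))
dimnames(groups) <- list(words, c("group1", "group2"))
groups

# TRUE where both groups captured the same letter
same_letter <- groups[, "group1"] == groups[, "group2"]
same_letter
```

For "mmmm" and "ohhhh" both groups capture the same letter, which is exactly the case the question wants to exclude.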
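What I want to match in x
are the strings whose two repeated letters are different: "cooee","helloee","oooaaah","sshh","vroomm","whoopee","yippee"
How can I adjust the regular expression to make sure the second repeated character is different from the first?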
Best answer
You can restrict the second character pattern with a negative lookahead:
grep(".*([a-z])\\1.*(?!\\1)([a-z])\\2", x, value=TRUE, perl=TRUE)
#                   ^^^^^^^
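See the regex demo. The (?!\\1)([a-z]) part matches any lowercase ASCII letter that differs from the value in group 1 and captures it into group 2. The lookahead requires perl=TRUE because R's default regex engine (TRE) does not support lookarounds.
R demo: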
x <- c("aaaaaaah" ,"aaaah","ahhhh","cooee","helloee","mmmm","noooo","ohhhh","oooaaah","ooooh","sshh","ummmmm","vroomm","whoopee","yippee")
grep(".*([a-z])\\1.*(?!\\1)([a-z])\\2", x, value=TRUE, perl=TRUE)
# => "cooee" "helloee" "oooaaah" "sshh" "vroomm" "whoopee" "yippee"
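As a cross-check that does not rely on a lookahead, one could collect every doubled letter in each string and keep the strings that contain at least two distinct doubles. This is only a sketch of an alternative, not the answer's method; the helper name has_two_distinct_doubles is hypothetical.

```r
x <- c("aaaaaaah", "aaaah", "ahhhh", "cooee", "helloee", "mmmm", "noooo",
       "ohhhh", "oooaaah", "ooooh", "sshh", "ummmmm", "vroomm", "whoopee", "yippee")

# Hypothetical helper: TRUE when a string contains doubled letters
# of at least two different kinds (e.g. "oo" and "ee")
has_two_distinct_doubles <- function(s) {
  # All non-overlapping doubled letters per string, e.g. c("oo", "ee")
  doubles <- regmatches(s, gregexpr("([a-z])\\1", s, perl = TRUE))
  vapply(doubles, function(d) length(unique(d)) >= 2, logical(1))
}

alt <- x[has_two_distinct_doubles(x)]

# Result of the lookahead pattern from the answer, for comparison
lookahead <- grep(".*([a-z])\\1.*(?!\\1)([a-z])\\2", x, value = TRUE, perl = TRUE)

identical(alt, lookahead)
```

On this sample both approaches select the same strings. The lookahead version stays a single grep() call, while the helper makes the "two different doubled letters" condition explicit.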
Regarding "r - How to match different repeated characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62551649/