regex - How to match a pattern that repeats non-consecutively more than n times?

Tags: regex r

I am trying to match the pattern "City, State" (e.g. "Austin, TX") against this example vector:

> s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX", 
"Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

> grepl('[[:alnum:]], [[:alnum:]$]', s)
[1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

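However, there are two cases where I want to get FALSE:

- when there is more than one comma (e.g. "Austin, TX, USA")

- when there is another punctuation mark before the comma (e.g. "Ft. Worth, TX")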

Best answer

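You can use the following regex pattern, which anchors the match to the whole string and allows only letters and spaces before the comma: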

grepl("^[a-z ]+, [a-z]+$", s, perl = TRUE, ignore.case = TRUE)
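
As a quick check, here is a minimal sketch that applies this pattern to the vector from the question. The variable name `matches` is only illustrative and does not appear in the original answer.

```r
# Example vector copied from the question.
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# ^ and $ force the whole string to be "letters/spaces, letters",
# so a period before the comma or a second comma makes the match fail.
matches <- grepl("^[a-z ]+, [a-z]+$", s, perl = TRUE, ignore.case = TRUE)

print(matches)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE

# Keep only the strings that look like "City, State".
print(s[matches])
```

Note that "Austin TX" is still rejected, because the pattern requires a comma followed by a single space.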


Regex explanation:

/^[a-z ]+, [a-z]+$/gmi

    ^ assert position at start of a line
    [a-z ]+ match a single character present in the list below
        Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
        a-z a single character in the range between a and z (case insensitive)
        (space) the literal space character
    ", " matches a comma followed by a space literally
    [a-z]+ match a single character present in the list below
        Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
        a-z a single character in the range between a and z (case insensitive)
    $ assert position at end of a line
    ignore.case: insensitive. Case insensitive match (ignores case of [a-zA-Z])
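
For comparison, the following sketch suggests why the attempt in the question returned TRUE for the unwanted cases: that pattern is not anchored, so a single "x, y" fragment anywhere in the string is enough, and a `$` placed inside a bracket expression is treated as a literal dollar sign rather than an end-of-string anchor. The names `loose` and `dollar_is_literal` are illustrative only.

```r
# Example vector copied from the question.
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# Unanchored pattern from the question: it only needs one alphanumeric
# character, a comma, a space, and one more character somewhere in the string.
loose <- grepl("[[:alnum:]], [[:alnum:]$]", s)
print(loose)
# [1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE

# Inside [...], `$` is an ordinary character, so this matches a bare "$".
dollar_is_literal <- grepl("[[:alnum:]$]", "$")
print(dollar_is_literal)
# [1] TRUE
```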

Regarding "regex - How to match a pattern that repeats non-consecutively more than n times?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37380687/
