I am trying to match the pattern "City, State" (e.g. "Austin, TX") in this example vector:
> s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
"Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")
> grepl('[[:alnum:]], [[:alnum:]$]', s)
[1] TRUE TRUE TRUE FALSE TRUE TRUE
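However, there are two cases where I want FALSE to be returned:
- when there is more than one comma (e.g. "Austin, TX, USA")
- when another punctuation mark appears before the comma (e.g. "Ft. Worth, TX")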
Best answer
You can use the following regex pattern, where s is the vector from the question:
grepl("^[a-z ]+, [a-z]+$", s, perl = TRUE, ignore.case = TRUE)
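As a quick check, the pattern can be run against the vector from the question. This is only a minimal sketch; the variable name `matches` is illustrative and not part of the original answer.

```r
# Example vector from the question
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# Anchored pattern: letters or spaces, then ", ", then letters, and nothing else
matches <- grepl("^[a-z ]+, [a-z]+$", s, perl = TRUE, ignore.case = TRUE)
print(matches)
# TRUE TRUE FALSE FALSE FALSE FALSE

# Keep only the strings that match
print(s[matches])
# "Austin, TX" and "Forth Worth, TX"
```

Note that "Austin TX" is still rejected, because the pattern requires a comma followed by a space.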
Regex explanation:
^[a-z ]+, [a-z]+$ (case insensitive)
^ asserts position at the start of the string
[a-z ]+ matches characters from the set below
    Quantifier +: between one and unlimited times, as many times as possible, giving back as needed (greedy)
    a-z: a single character in the range between a and z (case insensitive)
    " ": the literal space character
", " matches a comma followed by a space literally
[a-z]+ matches characters from the set below
    Quantifier +: between one and unlimited times, as many times as possible, giving back as needed (greedy)
    a-z: a single character in the range between a and z (case insensitive)
$ asserts position at the end of the string
ignore.case = TRUE: case-insensitive match, so [a-z] also matches A-Z
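To see why the anchors ^ and $ matter, the sketch below compares the same character classes with and without them. The names `unanchored` and `anchored` are illustrative assumptions, not part of the original answer.

```r
# Example vector from the question
s <- c("Austin, TX", "Forth Worth, TX", "Ft. Worth, TX",
       "Austin TX", "Austin, TX, USA", "Ft. Worth, TX, USA")

# Without anchors, the pattern may match any substring,
# e.g. " Worth, TX" inside "Ft. Worth, TX"
unanchored <- grepl("[a-z ]+, [a-z]+", s, perl = TRUE, ignore.case = TRUE)

# With anchors, the whole string must fit the pattern
anchored <- grepl("^[a-z ]+, [a-z]+$", s, perl = TRUE, ignore.case = TRUE)

# Side-by-side comparison
print(data.frame(s, unanchored, anchored))
```

Without the anchors, the extra comma in "Austin, TX, USA" and the period in "Ft. Worth, TX" are simply skipped over, which is the behaviour the question wanted to avoid.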
Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/37380687/