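I have struggled for hours to get this match-and-replace working with gsub in R, still without success.
I am trying to match the pattern "Reason:" in a string and extract everything after it up to the first occurrence of a dot (.).
For example: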
Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.
should return
"Not interested"
Best answer
Here is a solution:
s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
sub(".*Reason: (.*?)\\..*", "\\1", s)
# [1] "Not interested"
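The `?` after `.*` inside the capture group is what makes the match stop at the first dot. The sketch below contrasts the lazy and greedy forms, and also shows that `sub` returns its input unchanged when the pattern does not match. The variable names `lazy`, `greedy` and `unmatched` are illustrative only, and `perl = TRUE` is an addition here so that both calls follow PCRE quantifier semantics.

```r
s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

# Lazy group (.*?): captures as little as possible, so it ends at the first dot
lazy <- sub(".*Reason: (.*?)\\..*", "\\1", s, perl = TRUE)

# Greedy group (.*): captures as much as possible, so it runs to the last dot
greedy <- sub(".*Reason: (.*)\\..*", "\\1", s, perl = TRUE)

# No "Reason: " in the input: sub() finds nothing to replace and returns the input as is
unmatched <- sub(".*Reason: (.*?)\\..*", "\\1", "no match example", perl = TRUE)

print(lazy)
print(greedy)
print(unmatched)
```

Because a non-matching string comes back untouched rather than as `NA`, `sub` alone cannot tell "no reason found" apart from a real result.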
Update (to address the comments):
If you also have strings that do not match the pattern, I recommend using regexpr instead of sub:
s2 <- c("no match example",
"Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.")
match <- regexpr("(?<=Reason: ).*?(?=\\.)", s2, perl = TRUE)
ifelse(match == -1, NA, regmatches(s2, match))
# [1] NA               "Not interested"
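One caveat with the `ifelse` idiom: `regmatches` returns only the substrings that matched, so its result can be shorter than the input, and `ifelse` then recycles it. With one matching string this happens to work, but with several matching and non-matching strings mixed together the values can end up misaligned. A minimal sketch of a position-safe variant follows; the helper name `extract_reason` is hypothetical and not part of the original answer.

```r
# Hypothetical helper: returns the text between "Reason: " and the next dot,
# or NA for elements without a match.
extract_reason <- function(x) {
  m <- regexpr("(?<=Reason: ).*?(?=\\.)", x, perl = TRUE)
  hit <- !is.na(m) & m > -1L           # TRUE where the pattern was found
  out <- rep(NA_character_, length(x)) # start with NA everywhere
  out[hit] <- regmatches(x, m)         # fill matches into their own positions
  out
}

mixed <- c("no match example",
           "Reason: A. ChannelID: CARE.",
           "another miss",
           "Reason: B.")
print(extract_reason(mixed))
```

Assigning through the logical index `hit` keeps every result at the position of the string it came from.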
For the second example, you can use the following regular expressions:
s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."
# a)
sub(".*type (.*?) for.*", "\\1", s3)
# [1] "Proof of Payment"
# b)
match <- regexpr("(?<=type ).*?(?= for)", s3, perl = TRUE)
ifelse(match == -1, NA, regmatches(s3, match))
# [1] "Proof of Payment"
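Both examples follow the same shape: a fixed left delimiter, a lazy match, and a fixed right delimiter. As a rough sketch, that shape could be wrapped in one function. The name `extract_between` and its arguments are hypothetical; the delimiters are wrapped in PCRE's `\Q...\E` so that characters such as `.` are treated literally, which assumes neither delimiter itself contains `\E`.

```r
# Hypothetical helper: text between literal `left` and the first following literal `right`
extract_between <- function(x, left, right) {
  # \Q...\E quotes the delimiters, so "." means a literal dot
  pattern <- paste0("(?<=\\Q", left, "\\E).*?(?=\\Q", right, "\\E)")
  m <- regexpr(pattern, x, perl = TRUE)
  hit <- !is.na(m) & m > -1L
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)
  out
}

s  <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."

print(extract_between(s, "Reason: ", "."))
print(extract_between(s3, "type ", " for"))
```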
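On the topic of regex in R (extracting a substring from the end of a pattern up to the first occurrence of a character), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/22428997/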