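regex - R regex to parse the token after @ when the string contains no other tokens

I am having trouble parsing an address out of a text string. A typical string looks like "@address token token token" or "@address token token /ntoken".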

string <- c("@address token token token", "@address token token /ntoken")
gsub("^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$", "\\1", string)

Both of these are parsed correctly:

[1] "address" "address"

However, in some cases the address is the only token in the string, and then the regex returns the address including the @:

string <- c("@address token token token", "@address token token /ntoken", "@address")
gsub("^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$", "\\1", string)
# [1] "address"  "address"  "@address"

How can I make the regex handle the single-token case as well?

Best answer

in some circumstances the address will be the only token in the string, then regex will return the address including the @

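That is because there is no match in that case: [^a-z0-9_]+ requires at least one non-token character after the address, and gsub returns a string unchanged when the pattern does not match it.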
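A quick way to confirm this is to check which elements the original pattern matches at all. The sketch below reuses the strings and pattern from the question; the variable names `pattern`, `matched` and `result` are only illustrative.

```r
string <- c("@address token token token", "@address token token /ntoken", "@address")
pattern <- "^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]+.*$"

# grepl() reports which elements the pattern matches at all
matched <- grepl(pattern, string)
print(matched)
# [1]  TRUE  TRUE FALSE

# gsub() only rewrites elements that matched; the others pass through unchanged
result <- gsub(pattern, "\\1", string)
print(result)
# [1] "address"  "address"  "@address"
```

The third element is never matched, so nothing is substituted and the leading @ survives.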

A small change is enough:

Change [^a-z0-9_]+ to [^a-z0-9_]? so that the separator becomes optional.

^\.?@([a-z0-9_]{1,25})[^a-z0-9_]?.*$
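Applied in R (with the backslash doubled inside the string literal), the modified pattern could be used as sketched below. The second pattern, `strict`, is not part of the answer: it is a hypothetical variant that makes the whole "separator plus rest" part optional, so that an address longer than 25 characters is left unmatched rather than truncated. All variable names are illustrative.

```r
string <- c("@address token token token", "@address token token /ntoken", "@address")

# Fix from the answer: the separator after the address is optional
fixed <- "^\\.?@([a-z0-9_]{1,25})[^a-z0-9_]?.*$"
res_fixed <- gsub(fixed, "\\1", string)
print(res_fixed)
# [1] "address" "address" "address"

# Hypothetical stricter variant (an assumption, not from the answer):
# the address must be followed by a non-token character or the end of the string
strict <- "^\\.?@([a-z0-9_]{1,25})([^a-z0-9_].*)?$"
res_strict <- gsub(strict, "\\1", string)
print(res_strict)
# [1] "address" "address" "address"

# The two patterns differ only on over-long addresses (here 30 characters)
long <- paste0("@", strrep("a", 30))
long_fixed <- gsub(fixed, "\\1", long)    # truncated to the first 25 characters
long_strict <- gsub(strict, "\\1", long)  # no match, returned unchanged
print(nchar(long_fixed))
print(identical(long_strict, long))
```

Which behaviour is preferable depends on whether over-long addresses should be rejected or cut off; the question does not say, so the stricter form is offered only as an option.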

Source: a similar question on Stack Overflow, "regex - R regex to parse the token after @ when the string contains no other tokens": https://stackoverflow.com/questions/24946533/
