JavaScript regex for hashtags: match #foo and #foo-fåäö but not http://this.is/no#hashtag

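We currently use the JavaScript expression new RegExp('#[^,#=!\s][^,#=!\s]*') (see [1]). It mostly works, except that it also matches URLs with an anchor, such as http://this.is/no#hashtag, and we would also prefer not to match foo#bar.

I have made some attempts with lookahead, but it doesn't seem to work, or I just don't understand it.

Given the following source text: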

#public #writable #kommentarer-till-beta -- all these should be matched
Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool. -- mixed within text
http://this.is/no#hashtag -- problem
xxy#bar      -- We'd prefer not matching this one, and...
#foo=bar   =foo#bar  -- we probably shouldn't match any of those either.
#foo,bar #foo;bar #foo-bar #foo:bar   -- We're flexible on whether these get matched in part or in full

we would like the following output:

(For readability, $ is shown instead of ...)

$ $ $ -- all these should be matched
Verkligen $ jobbat! T ex $ till $ och $, kool. -- mixed within text
http://this.is/no$ -- problem
xxy$      -- We'd prefer not matching this one, and...
$=bar   =foo$  -- we probably shouldn't match any of those either.
$,bar $ $ $   -- We're flexible on whether these get matched in part or in full

[1] http://github.com/ether/pad/blob/master/etherpad/src/plugins/twitterStyleTags/hooks.js

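Best answer

I believe looking for a word boundary does the trick here (or rather, the absence of a word boundary, which seems rather counterintuitive to me). Because # is itself a non-word character, \B directly before it only succeeds when the preceding character is also a non-word character, or when # is at the start of the text.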

\B#[^,#=!\s]+ does not match anything on the third or fourth line. It does, however, match the #foo in #foo=bar, along with everything else covered by a $ sign in your example.
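To make that behaviour concrete, here is a minimal sketch that runs the pattern over the six sample lines from the question. The helper name findTags is hypothetical, and the pattern is written as a regex literal so that \s and \B need no extra escaping.

```javascript
// Sample lines copied from the question (trailing remarks omitted).
const sampleLines = [
  '#public #writable #kommentarer-till-beta',
  'Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool.',
  'http://this.is/no#hashtag',
  'xxy#bar',
  '#foo=bar   =foo#bar',
  '#foo,bar #foo;bar #foo-bar #foo:bar',
];

// \B succeeds only where there is NO word boundary, so a "#" that directly
// follows a word character (as in "no#hashtag" or "xxy#bar") is rejected.
const tagPattern = /\B#[^,#=!\s]+/g;

// Hypothetical helper: returns every match found in one line.
function findTags(line) {
  return line.match(tagPattern) || [];
}

for (const line of sampleLines) {
  console.log(JSON.stringify(line), '->', findTags(line));
}
```

Lines 3 and 4 produce an empty array, and line 5 yields only #foo. One caveat: JavaScript's \b and \B are defined over ASCII word characters, so a # directly after a letter such as ö is not treated as following a word and would still match.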

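Edit: after some fiddling, \B#[^,#=!\s]+[\s,] matches everything on the first and second lines. Nothing on lines 3-5 matches, and on line 6 everything except #foo,bar is matched in full (#foo,bar is only matched up to the comma).

You probably want a capture group that leaves out the trailing whitespace or comma, hence \B(#[^,#=!\s]+)[\s,].

(If you do want all tags on line 6 matched in full, remove the comma from the first character class.)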
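The two variants can be compared side by side. The sketch below is only an illustration: the names strictTag, looseTag and captureTags are hypothetical, and String.prototype.matchAll is used to collect group 1 of each match.

```javascript
// Variant from the answer: comma excluded from the tag, delimiter consumed
// outside the capture group.
const strictTag = /\B(#[^,#=!\s]+)[\s,]/g;
// Variant with the comma removed from the first character class.
const looseTag = /\B(#[^#=!\s]+)[\s,]/g;

// Hypothetical helper: collects capture group 1 of every match.
function captureTags(text, pattern) {
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

const line2 = 'Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool.';
const line6 = '#foo,bar #foo;bar #foo-bar #foo:bar   -- flexible';

console.log(captureTags(line6, strictTag)); // first tag stops at the comma
console.log(captureTags(line6, looseTag));  // first tag is "#foo,bar"
console.log(captureTags(line2, strictTag)); // last tag is "#blogginlägg"
console.log(captureTags(line2, looseTag));  // last tag is "#blogginlägg,"
```

The last call shows the trade-off of the looser class: once a comma is allowed inside a tag, a comma that merely follows a tag in running text is captured along with it.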

Note that you may need more than this for perfect coverage, but it at least satisfies your current test cases.
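One such gap is a tag at the very end of the input: because the pattern requires a trailing whitespace character or comma, a final tag with nothing after it is skipped. A possible adjustment, which is an assumption of this sketch and not part of the answer above, is to replace the consumed delimiter with a lookahead that also accepts the end of input. The names consumingTag, lookaheadTag and findTagsWithLookahead are hypothetical.

```javascript
// Pattern from the answer: needs a whitespace character or comma after the tag.
const consumingTag = /\B(#[^,#=!\s]+)[\s,]/g;
// Assumed alternative: the delimiter is only checked, not consumed, and the
// end of the input ($) is accepted as well.
const lookaheadTag = /\B#[^,#=!\s]+(?=[\s,]|$)/g;

// Hypothetical helper built on the lookahead variant.
function findTagsWithLookahead(text) {
  return text.match(lookaheadTag) || [];
}

const trailing = 'Tack för #kommentarer';

// The consuming pattern finds nothing because no delimiter follows the tag.
console.log(Array.from(trailing.matchAll(consumingTag), (m) => m[1]));
// The lookahead variant finds the final tag.
console.log(findTagsWithLookahead(trailing));
// The rejected cases from the question stay rejected.
console.log(findTagsWithLookahead('http://this.is/no#hashtag'));
console.log(findTagsWithLookahead('#foo=bar   =foo#bar'));
```

Since the lookahead does not consume the delimiter, the whole match is the tag itself and no capture group is needed.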

Source: the corresponding question on Stack Overflow, https://stackoverflow.com/questions/2588882/
