c# - How do I fix this regex for mentions and hashtags?

Tags: c# android regex string regex-group

I used the following tool to build a valid regex for mentions and hashtags. I have managed to match what I want in the inserted text, but I still need to solve the following matching problems.

  • Only match substrings that are delimited by whitespace on both sides. A valid substring (hashtag or mention) at the very beginning or very end of the string should also be matched.

  • The matches should contain only the part without spaces: the surrounding spaces are part of the rule, but not part of the matched substring.



"@hello friend" - @hello must be matched as a mention.
"@ hello friend" - here there should be no matches.
"hey@hello @hello" - here only the last @hello must be matched as a mention.
"@hello! hi @hello #hi ##hello" - here only the second @hello and #hi must be matched as a mention and hashtag respectively.

Another example, from the image, where only "@word" should be a valid mention:

[image]

Update, March 15, 2018, 16:35 (GMT-4)

I found a way to solve the problem, using the tool in PCRE mode (server) with a negative lookbehind and a negative lookahead:



[image]

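But now the question is: will negative lookahead and negative lookbehind work with regular expressions in C#? In JavaScript, for example, it does not work; as seen in the tool, the pattern is underlined in red.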
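The .NET regex engine supports both negative lookbehind and negative lookahead, so the approach carries over to C#. Below is a minimal sketch of the idea. The pattern is an assumption written for illustration (the pattern built in the tool is not reproduced in this post), and `LookaroundDemo` and `ExtractTags` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Assumed pattern, not the one from the question:
    // (?<!\S)  negative lookbehind: no non-whitespace character before the tag
    // [@#]\w+  the lead character followed by one or more word characters
    // (?!\S)   negative lookahead: no non-whitespace character after the tag
    private static readonly Regex TagPattern = new Regex(@"(?<!\S)[@#]\w+(?!\S)");

    // Hypothetical helper: returns every mention or hashtag found in the text.
    public static List<string> ExtractTags(string text)
    {
        var tags = new List<string>();
        foreach (Match match in TagPattern.Matches(text))
        {
            tags.Add(match.Value);
        }
        return tags;
    }

    public static void Main()
    {
        // The four sample strings from the question.
        string[] samples =
        {
            "@hello friend",
            "@ hello friend",
            "hey@hello @hello",
            "@hello! hi @hello #hi ##hello"
        };
        foreach (string sample in samples)
        {
            Console.WriteLine("\"" + sample + "\" -> [" + string.Join(", ", ExtractTags(sample)) + "]");
        }
    }
}
```

Because the lookarounds are zero-width, the surrounding spaces are checked but never become part of the match, and the start and end of the string count as valid boundaries without any special casing.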





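  • (?: creates a non-capturing group
  • ^|\s+ matches the start of the string or one or more whitespace characters
  • (?: creates a non-capturing group
  • (?<mention>@)|(?<hash>#) creates groups that match @ or # and names them mention and hash respectively
  • (?<item>\w+) matches one or more word characters (letters, digits, or underscore) and makes the item easy to extract from the group
  • (?=\s+) creates a positive lookahead that requires whitespace to follow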
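The parts listed above could be assembled roughly as follows. This is a sketch, not necessarily the exact pattern from the answer, since the full expression is not shown here. The `|$` inside the lookahead is an added assumption so that a tag at the very end of the string is accepted, and `NamedGroupDemo` is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    public static void Main()
    {
        // Assembled from the listed parts. The "|$" alternative in the lookahead
        // is an assumption added here to accept a tag at the end of the string.
        var pattern = new Regex(@"(?:^|\s+)(?:(?<mention>@)|(?<hash>#))(?<item>\w+)(?=\s+|$)");

        string input = "@hello! hi @hello #hi ##hello";
        foreach (Match match in pattern.Matches(input))
        {
            // Exactly one of the two named groups succeeds for each match.
            string kind = match.Groups["mention"].Success ? "mention" : "hashtag";
            Console.WriteLine(kind + ": " + match.Groups["item"].Value);
        }
    }
}
```

Reading `match.Groups["item"].Value` rather than `match.Value` matters here: the leading `\s+` is consumed by the match, so the full match text can start with whitespace, while the `item` group holds only the name after the lead character.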

Fiddle: Live Demo


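Update: Since you mentioned that you are using C#, I thought I would give you a .NET solution to your problem that does not need RegEx. I have not benchmarked it, but I would guess it is also faster than using RegEx.

Personally, my flavor of .NET is Visual Basic, so I am giving you a VB.NET solution, but you can easily run it through a converter, since I never use anything that cannot be used in C#: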

Private Function FindTags(ByVal lead As Char, ByVal source As String) As String()
    Dim matches As List(Of String) = New List(Of String)
    Dim current_index As Integer = 0

    'Loop through all but the last character in the source
    For index As Integer = 0 To source.Length - 2
        'Reset the current index
        current_index = index

        'Check if the current character is the lead ("@" or "#"), it is at the beginning of the String or preceded by whitespace, and the next character is a letter or digit
        If source(index) = lead AndAlso (index = 0 OrElse Char.IsWhiteSpace(source, index - 1)) AndAlso Char.IsLetterOrDigit(source, index + 1) Then
            'Loop until the next character is no longer a letter or digit
            Do
                current_index += 1
            Loop While current_index + 1 < source.Length AndAlso Char.IsLetterOrDigit(source, current_index + 1)

            'Check if we're at the end of the line or the next character is whitespace
            If current_index = source.Length - 1 OrElse Char.IsWhiteSpace(source, current_index + 1) Then
                'Add the match to the collection
                matches.Add(source.Substring(index, current_index + 1 - index))
            End If
        End If
    Next

    Return matches.ToArray()
End Function
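
For readers who want C# directly, here is a hand-written sketch of the same scanning approach rather than the output of a converter. The class name `TagScanner` is hypothetical, and this version assumes that a tag needs at least one letter or digit right after the lead character. Note that `char.IsLetterOrDigit` does not accept underscores, unlike `\w` in a regex.

```csharp
using System;
using System.Collections.Generic;

public static class TagScanner
{
    // Sketch of a C# equivalent of the VB.NET FindTags function above.
    public static string[] FindTags(char lead, string source)
    {
        var matches = new List<string>();

        // Stop before the last character: a tag needs at least one character after the lead.
        for (int index = 0; index < source.Length - 1; index++)
        {
            // The lead must sit at the start of the string or right after whitespace.
            bool atBoundary = index == 0 || char.IsWhiteSpace(source, index - 1);
            if (source[index] != lead || !atBoundary || !char.IsLetterOrDigit(source, index + 1))
            {
                continue;
            }

            // Advance to the last consecutive letter or digit.
            int end = index + 1;
            while (end + 1 < source.Length && char.IsLetterOrDigit(source, end + 1))
            {
                end++;
            }

            // Keep the tag only if it is followed by whitespace or the end of the string.
            if (end == source.Length - 1 || char.IsWhiteSpace(source, end + 1))
            {
                matches.Add(source.Substring(index, end + 1 - index));
            }
        }

        return matches.ToArray();
    }

    public static void Main()
    {
        string text = "@hello! hi @hello #hi ##hello";
        Console.WriteLine(string.Join(", ", FindTags('@', text)));
        Console.WriteLine(string.Join(", ", FindTags('#', text)));
    }
}
```

The function is called once per lead character, so mentions and hashtags come back as separate arrays.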

Fiddle: Live Demo

Regarding "c# - How do I fix this regex for mentions and hashtags?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49308174/

