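I am trying to create a pattern that finds placeholders in a string so that I can later replace them with variables. With my requirements, I have run into a problem finding all of these placeholders within a single string.
I already found this post, but it only helped a little: Regex match ; but not \;
The placeholders look like this: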
{&var} --> Variable stored in a dictionary --> dict("var")
{$prop} --> Property of a class cls.prop read by CallByName and PropGet
{#const} --> Some constant values by name from a function
Normally I use this pattern, and it works well:
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.pattern = "\{([#\$&])([\w\.]+)\}"
RegEx.Global = True ' needed so that Execute returns every match, not only the first
For example, for the string "Value of foo is '{&var}' and bar is '{$prop}'" I get 2 matches, as expected:
- (&)(var)
- ($)(prop)
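As a minimal sketch of how these matches and their captured groups could be listed, the procedure below runs the pattern over the example string. The name ListPlaceholders is hypothetical and only used for illustration. Global is set to True because Execute otherwise returns only the first match.

```vba
Sub ListPlaceholders()
    ' Hypothetical helper for illustration; late-bound, so no library reference is needed.
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\{([#\$&])([\w\.]+)\}"
    re.Global = True   ' return all matches, not only the first one

    For Each m In re.Execute("Value of foo is '{&var}' and bar is '{$prop}'")
        ' SubMatches is 0-based: (0) = type character, (1) = name
        Debug.Print m.Value, m.SubMatches(0), m.SubMatches(1)
    Next m
End Sub
```

Each line in the Immediate window shows the full placeholder followed by its type character and its name.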
I would also like to add a .NET-like formatting part to this expression.
String.Format("This is a date: {0:dd.MM.yyyy}", DateTime.Now);
// This is a date: 05.07.2019
String.Format("This is a date, too: {0:dd.(MM).yyyy}", DateTime.Now);
// This is a date, too: 05.(07).2019
I extended the RegEx to capture that optional format string:
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.pattern = "\{([#\$&])([\w\.]+):{0,1}([^\}]*)\}"
RegEx.Global = True ' needed so that Execute returns every match, not only the first
RegEx.Execute("Value of foo is '{&var:DD.MM.YYYY}' and bar is '{$prop}'")
I get the expected 2 matches:
- (&)(var)(DD.MM.YYYY)
- ($)(prop)()
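At this point I noticed that I have to take care of escaping "{" and "}", because I may want to include some braces in the formatted result.
This does not work properly, because my pattern stops after "...{MM":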
RegEx.Execute("Value of foo is '{&var:DD.{MM}.YYYY}' and bar is '{$prop}'")
Adding escape characters to the text before it is checked against the RegEx would be fine:
RegEx.Execute("Value of foo is '{&var:DD.\{MM\}.YYYY}' and bar is '{$prop}'")
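But how can I correctly add a negative lookbehind?
Second: how does this work for variables that should not be resolved even though they have the correct syntax, because the outer braces are escaped?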
RegEx.Execute("This should not match '\{&var:DD.\{MM\}.YYYY\}' but this one '{&var:DD.\{MM\}.YYYY}'")
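I hope my question is not confusing and that someone can help me.
Update, 5 July 2019, 12:50: With the great help of @wiktor-stribiżew, the result is complete.
As requested, here is some sample code: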
Sub testRegEx()
    Debug.Print FillVariablesInText(Nothing, "Date\\\\{$var01:DD.\{MM\}.YYYY}\\\\ Var:\{$nomatch\}{$var02} Double: {#const}{$var01} rest of string")
End Sub

' Note: the Dictionary type requires a reference to Microsoft Scripting Runtime
Function FillVariablesInText(ByRef dict As Dictionary, ByVal txt As String) As String
    Const c_varPattern As String = "(?:(?:^|[^\\\n])(?:\\{2})*)\{([#&\$])([\w.]+)(?:\:([^}\\]*(?:\\.[^\}\\]*)*))?(?=\})"
    Dim part As String
    Dim snippets As New Collection
    Dim allMatches, m
    Dim i As Long, j As Long, x As Long, n As Long

    ' Create a RegEx object and execute pattern
    Dim RegEx As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    RegEx.pattern = c_varPattern
    RegEx.MultiLine = True
    RegEx.Global = True
    Set allMatches = RegEx.Execute(txt)

    ' Start at position 1 of txt
    j = 1
    n = 0
    For Each m In allMatches
        n = n + 1
        Debug.Print "(" & n & "):" & m.value
        Debug.Print " [0] = " & m.SubMatches(0) ' Type [&$#]
        Debug.Print " [1] = " & m.SubMatches(1) ' Name
        Debug.Print " [2] = " & m.SubMatches(2) ' Format
        part = "{" & m.SubMatches(0)

        ' Get the offset of the placeholder inside the match (skips the pre-match characters)
        x = 1 ' Index to position: at least +1
        Do While Mid(m.value, x, 2) <> part
            x = x + 1
        Loop

        ' Position of the opening "{" in txt (1-based)
        i = m.FirstIndex + x

        ' Anything to add to the result?
        If i <> j Then
            snippets.Add Mid(txt, j, i - j)
        End If

        ' Next start position (not index!), + 1 for the "}" left unconsumed by the positive lookahead
        j = m.FirstIndex + m.Length + 2

        ' Here a function would fetch the actual value
        ' e.g.: snippets.Add dict(m.SubMatches(1))
        ' or  : snippets.Add Format(dict(m.SubMatches(1)), m.SubMatches(2))
        snippets.Add "<<" & m.SubMatches(0) & m.SubMatches(1) & ">>"
    Next m

    ' Any text at the end? (<= so that a single trailing character is kept)
    If j <= Len(txt) Then
        snippets.Add Mid(txt, j)
    End If

    ' Join snippets
    For i = 1 To snippets.Count
        FillVariablesInText = FillVariablesInText & snippets(i)
    Next
End Function
The function testRegEx produces this debug output and result:
(1):e\\\\{$var01:DD.\{MM\}.YYYY
[0] = $
[1] = var01
[2] = DD.\{MM\}.YYYY
(2):}{$var02
[0] = $
[1] = var02
[2] =
(3): {#const
[0] = #
[1] = const
[2] =
(4):}{$var01
[0] = $
[1] = var01
[2] =
Date\\\\<<$var01>>\\\\ Var:\{$nomatch\}<<$var02>> Double: <<#const>><<$var01>> rest of string
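The captured format string still contains the escape characters (for example DD.\{MM\}.YYYY). A minimal sketch of how they could be removed before the format is used is shown below. UnescapeFormat is a hypothetical helper that is not part of the code above, and it assumes that a backslash always escapes exactly the one character that follows it.

```vba
Function UnescapeFormat(ByVal fmt As String) As String
    ' Hypothetical helper: turns "\{" into "{", "\}" into "}" and "\\" into "\".
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\\(.)"   ' a backslash followed by any single character
    re.Global = True
    UnescapeFormat = re.Replace(fmt, "$1")   ' keep only the escaped character
End Function

Sub testUnescapeFormat()
    Debug.Print UnescapeFormat("DD.\{MM\}.YYYY")   ' prints DD.{MM}.YYYY
End Sub
```

In the loop above, such a helper could be applied to m.SubMatches(2) before the value is formatted.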
Best answer
You can use
((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?}
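To make sure consecutive matches are found as well, turn the final } into a lookahead. When extracting the matches, simply append it to the result, or, if you need the indices, increase the match length by 1: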
((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?(?=})
                                                                      ^^^^^
See regex demo and regex demo #2.
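The difference between the two variants can be sketched by counting matches on two adjacent placeholders. CountMatches and CompareClosingBrace are hypothetical names used only for this illustration, and the closing brace is written as \} here as a precaution. With a consuming }, the character that the second match needs in front of its { has already been used up by the first match.

```vba
Function CountMatches(ByVal pat As String, ByVal txt As String) As Long
    ' Hypothetical helper: number of matches of pat in txt.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pat
    re.Global = True
    CountMatches = re.Execute(txt).Count
End Function

Sub CompareClosingBrace()
    ' Common part of both patterns, up to but excluding the closing brace
    Const HEAD As String = "((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?"
    Const TXT As String = "{#const}{$var01}"

    Debug.Print CountMatches(HEAD & "\}", TXT)      ' consuming "}": 1
    Debug.Print CountMatches(HEAD & "(?=\})", TXT)  ' lookahead:     2
End Sub
```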
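Details
- ((?:^|[^\\])(?:\\{2})*) - Group 1 (ensures that the { that comes next is not escaped): the start of the string or any character other than \, followed by 0 or more double backslashes
- \{ - a { character
- ([#$&]) - Group 2: any one of the three characters
- ([\w.]+) - Group 3: 1 or more word or dot characters
- (?::([^}\\]*(?:\\.[^}\\]*)*))? - an optional sequence of : followed by Group 4:
  - [^}\\]* - 0 or more characters other than } and \
  - (?:\\.[^}\\]*)* - 0 or more repetitions of a \-escaped character followed by 0 or more characters other than } and \
- } - a } character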
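To see how the groups described above map onto the SubMatches collection, which is 0-based (Group 1 is SubMatches(0), and so on), the sketch below runs the lookahead variant of the pattern over the string from the question. ShowGroups is a hypothetical name used only for illustration.

```vba
Sub ShowGroups()
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?(?=\})"
    re.Global = True

    For Each m In re.Execute("This should not match '\{&var:DD.\{MM\}.YYYY\}' but this one '{&var:DD.\{MM\}.YYYY}'")
        Debug.Print "Group 1 (prefix): [" & m.SubMatches(0) & "]"
        Debug.Print "Group 2 (type):   [" & m.SubMatches(1) & "]"
        Debug.Print "Group 3 (name):   [" & m.SubMatches(2) & "]"
        Debug.Print "Group 4 (format): [" & m.SubMatches(3) & "]"
    Next m
End Sub
```

Only the second placeholder should be reported, with ' as the prefix and DD.\{MM\}.YYYY as the format. The first one is skipped because its opening { is preceded by a single backslash.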
Regarding "regex - How to match escaped group signs {&date:dd.\{mm\}.yyyy} but not {&date:dd.{mm}.yyyy} with vba and regex", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/56894777/