lua - Replacing double backslashes in input with single backslashes in Lua

Tags: lua escaping

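Suppose I have a variable str to which I assign the value test\\ttest (or, for the purposes of this example, it might really just be \\). What I want to do is replace the double backslash with a single backslash.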

The goal is clear: I want to output the \t escape sequence (a horizontal tab), whereas right now it is output as the plain text \t.

Obviously I can't use:

str:gsub("\\","\")

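because that causes a syntax error: \" is recognized as an escape sequence. I have tried everything I could think of. I also tried loadstring() (including nested loadstring() calls), but that failed too.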

Please don't tell me to do:

str:gsub("\\t","\t")

Sure, that would work, but it isn't what I need. I need to replace a double backslash with a single backslash.

Best answer

I suspect you are confused by quoting, because string.gsub can replace backslash characters just fine:

C:...> lua
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
> a="test\\\\ttest"
> =a
test\\ttest
> =a:gsub([[\\]],[[\]])
test\ttest       1
>

The backslash is used as a character escape in double- and single-quoted strings, but not in long strings written with the [[...]] notation. In the usual string constant, a backslash consumes one or more following characters and replaces the whole sequence with a single byte in the internal string value. So "\\" is a single-byte string containing a single backslash, "\" is a syntax error, and "\"" is a single-byte string containing a double quotation mark.
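The difference between the two literal forms can be checked directly. The following is a minimal sketch; the variable names quoted, long, and dquote are illustrative only and do not come from the question.

```lua
-- Quoted literals process backslash escapes; long-bracket literals do not.
local quoted = "\\"      -- one byte: a single backslash
local long   = [[\\]]    -- two bytes: both backslashes are taken verbatim
local dquote = "\""      -- one byte: a double quotation mark

print(#quoted, #long, #dquote)   --> 1  2  1
print(quoted == [[\]])           --> true
print("a\tb" == [[a\tb]])        --> false (the left side holds a real tab)
```

The length operator makes the byte counts visible without relying on how a terminal happens to render a backslash or a tab.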

Adding to the confusion is that Lua patterns as understood by string.gsub (and its siblings) use % characters for quoting and for naming special patterns. This is one of the more visible differences between Lua patterns and the regular expressions supported by other languages. To a Lua pattern, a backslash is just an ordinary character.
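A small sketch of that point, using a made-up Windows-style path as sample data: the backslash matches itself in a pattern, while magic characters such as the dot are escaped with %.

```lua
-- Hypothetical sample input; the long-bracket form keeps the backslashes verbatim.
local path = [[C:\temp\file.txt]]

-- In a pattern, a backslash is an ordinary character and matches itself.
local slashes = path:gsub([[\]], "/")
print(slashes)                   --> C:/temp/file.txt

-- A literal dot is escaped with "%", not with a backslash.
local ext = path:match("%.(%a+)$")
print(ext)                       --> txt

-- "%" is also the escape character inside a replacement string:
-- "%0" is the whole match and "%%" is a literal percent sign.
local pct = ("50"):gsub("%d+", "%0%%")
print(pct)                       --> 50%
```

Assigning the result of gsub to a single local keeps only the modified string and drops the match count.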

So when I set the value of a above, I used extra backslashes to get the string value to have two total. I could have written a=[[test\\ttest]] to the same effect. The call to gsub was written with the simple pattern that replaced doubled backslashes with singles. As can be seen, it succeeded and the result is the string test\ttest (along with a count of matches as the second return value).

In short, the substitution you ask for in the question "just works" as expected.

But reading between the lines, that isn't quite what you wanted. It appears you are trying to convert the string test\\ttest to test<TAB>test. If that single conversion is what you wanted, then just write it as such: a:gsub([[\\t]],"\t"). (Note that I used quotes so that the string literal will interpret the \t to mean an ASCII character in the replacement value.)
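As a quick check of that single conversion, here is a sketch that assumes the input really does contain two backslashes followed by the letter t, as in the question. The name converted is illustrative.

```lua
local a = [[test\\ttest]]                  -- 11 bytes: test, \, \, ttest

-- Pattern: two literal backslashes and a "t". Replacement: one real tab byte.
local converted = a:gsub([[\\t]], "\t")

print(converted == "test\ttest")           --> true
print(#a, #converted)                      --> 11  9
```

The three-byte sequence \\t collapses to a single tab byte, which accounts for the drop in length from 11 to 9.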

The more general case is more difficult, because you not only have to handle the normal single-letter escapes for tab, bell, backspace, carriage return, newline, and so forth, but you also have to handle the one to three digit decimal code sequence.
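For reference, this is how the compiler itself treats the decimal form inside quoted literals. The sketch below only illustrates the behavior a general-purpose converter would need to reproduce; the variable names are made up for the example.

```lua
-- A backslash followed by one to three decimal digits names a byte value.
local two_digits   = "\65"      -- byte 65, the letter A
local three_digits = "\065\066" -- bytes 65 and 66
local one_digit    = "\9"       -- byte 9, the same as "\t"
local four_digits  = "\0651"    -- at most three digits are consumed

print(two_digits == "A")        --> true
print(three_digits == "AB")     --> true
print(one_digit == "\t")        --> true
print(four_digits == "A1")      --> true
```

Because at most three digits belong to the escape, a digit that immediately follows a complete three-digit code is ordinary text.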

Update: The temptation to write something that handles all backslash escapes as the Lua compiler does for string literals proved too strong.

function unbackslashed(s)
    local ch = {
        ["\\a"] = '\\007', --'\a' alarm             Ctrl+G BEL
        ["\\b"] = '\\008', --'\b' backspace         Ctrl+H BS
        ["\\f"] = '\\012', --'\f' formfeed          Ctrl+L FF
        ["\\n"] = '\\010', --'\n' newline           Ctrl+J LF
        ["\\r"] = '\\013', --'\r' carriage return   Ctrl+M CR
        ["\\t"] = '\\009', --'\t' horizontal tab    Ctrl+I HT
        ["\\v"] = '\\011', --'\v' vertical tab      Ctrl+K VT
        ["\\\n"] = '\\010',--     newline
        ["\\\\"] = '\\092',--     backslash
        ["\\'"] = '\\039', --     apostrophe
        ['\\"'] = '\\034', --     quote
    }
    return s:gsub("(\\.)", ch)
        :gsub("\\(%d%d?%d?)", function(n)
            return string.char(tonumber(n))
        end)
end

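A function like this could be useful if you are parsing user-supplied text and want to process backslash escapes in it. String literals will already have been processed by the compiler.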

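Another caveat: if you find yourself with partially translated strings, you may really have a design that is not clear enough. Needing a function like this for anything other than parsing user input suggests a deeper problem in the design.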

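The function unbackslashed works by first replacing every recognized sequence of the form backslash followed by a single character with its equivalent numeric form. A second pass then converts all numeric forms to their literal characters. Two passes are needed because the string patterns understood by string.gsub do not support the alternation notation that full regular expression parsers support. Otherwise, the pattern to match could be written something like Perl's /\\([0-9]{1,3})|\\(.)/ and the substitution performed in a single pass.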
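The two passes can be traced on a small input. This sketch uses a reduced table with only two entries (escapes, a hypothetical stand-in for the full ch table) and a made-up sample string, so that the intermediate value is easy to read.

```lua
-- Reduced escape table for illustration; the real table has more entries.
local escapes = {
    ["\\t"]  = "\\009",   -- backslash + t     -> backslash + 009
    ["\\\\"] = "\\092",   -- backslash + backslash -> backslash + 092
}

local input = [[a\tb\\c\65]]

-- Pass 1: named escapes become three-digit decimal escapes. Matches with
-- no table entry (here \6) are left unchanged by gsub.
local pass1 = input:gsub("(\\.)", escapes)
print(pass1)                   --> a\009b\092c\65

-- Pass 2: each decimal escape becomes the byte it names.
local pass2 = pass1:gsub("\\(%d%d?%d?)", function(n)
    return string.char(tonumber(n))
end)
print(pass2 == "a\tb\\cA")     --> true
```

Mapping an escaped backslash to \092 in the first pass, rather than to a bare backslash, keeps it from being mistaken for the start of a new escape before the second pass runs.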

For more on "lua - Replacing double backslashes in input with single backslashes in Lua", see the similar question on Stack Overflow: https://stackoverflow.com/questions/19961598/
