regex - Emacs regular expression: any characters spanning multiple lines between matching patterns

I want to find I - <characters> I - and replace it with I - <characters>, I - .
<characters> can be anything, including Tab, Newline, Whitespace, *, etc.

For example: I - John M. Smith I - should be replaced with I - John M. Smith, I - .

I tried something like this:

M-x Query replace regexp
\(I - \)\([a-z]+\) \(I - \)
\1\2, \3

It does not work. Can you help?

Best answer

This can be done with a few adjustments to the regular expression.

Input

I - abc I - 
I - defgh I - 
I - John M. Smith I - 
I - 1234567 I - 
I - 12345
67 I - 
I - 12345
6789ABC
DE F G H IJK
LM N O P I -

Command

M-x query-replace-regexp
\(I - \)\(\(.*?
\)*?.*?\)\( I - \)
\1\2,\4

Note that the search regexp above, which contains a literal line break, is really this...

\(I - \)\(\(.*?\n\)*?.*?\)\( I - \)

...with \n standing for the newline character. In the minibuffer, you enter that newline by typing C-q C-j.

Output

I - abc, I - 
I - defgh, I - 
I - John M. Smith, I - 
I - 1234567, I - 
I - 12345
67, I - 
I - 12345
6789ABC
DE F G H IJK
LM N O P, I -
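
For experimenting with the same search and replacement outside Emacs, here is a minimal sketch in Python. It is a translation, not the Emacs command itself: Python's re module writes groups as ( ) instead of \( \), and the helper name add_comma and the sample text are assumptions made for illustration.

```python
import re

# Python translation of the Emacs regexp \(I - \)\(\(.*?\n\)*?.*?\)\( I - \).
# Group 1: leading "I - ", group 2: the (possibly multi-line) middle part,
# group 3: the inner "line plus newline" repetition, group 4: trailing " I - ".
PATTERN = re.compile(r"(I - )((.*?\n)*?.*?)( I - )")


def add_comma(text: str) -> str:
    """Insert a comma after the middle part, mirroring the replacement \\1\\2,\\4."""
    return PATTERN.sub(r"\1\2,\4", text)


# Hypothetical sample: one single-line entry and one entry spanning two lines.
sample = "I - John M. Smith I - \nI - 12345\n67 I - \n"
result = add_comma(sample)
print(result)
```

Note that the pattern expects a space after the trailing I -, so lines whose trailing whitespace has been stripped will not match.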

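Explanation

Your original regexp matched the middle part with the character class [a-z]+, which only covers letters. However, you also said:

The <characters> can be anything including Tab, Newline, Whitespace, *, & etc.

To support this, we can change it to .* to match any character except a newline. However, that can consume too much of the input, so we add ? to make the match lazy. The last tricky bit is multi-line matching, since you said there may be newlines. To support that, we add the \n handling.

Looking at just the middle part, we have...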

\(\(.*?\n\)*?.*?\)

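...and you can read this as "match any number of characters (lazily) followed by a newline, repeated any number of times (lazily), and then once more any number of characters (lazily, so as not to consume the trailing I - part of the line)."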
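
To see why the lazy ? matters, the sketch below compares a greedy and a lazy version of the middle part. It uses Python's re module as a stand-in for Emacs (an assumption: lazy and greedy quantifiers behave the same way for this pattern, but group syntax differs), and the two-line sample text is made up for illustration.

```python
import re

# Hypothetical input: two separate entries, one per line.
text = "I - abc I - \nI - defgh I - \n"

# Greedy version: * grabs as much as possible, so the middle part
# runs across both entries and only one match is found.
greedy = re.compile(r"(I - )((.*\n)*.*)( I - )")

# Lazy version: *? stops at the first trailing " I - ", so each
# entry is matched on its own.
lazy = re.compile(r"(I - )((.*?\n)*?.*?)( I - )")

greedy_middles = [m.group(2) for m in greedy.finditer(text)]
lazy_middles = [m.group(2) for m in lazy.finditer(text)]

print(greedy_middles)  # one middle part spanning both lines
print(lazy_middles)    # one middle part per entry
```

With the greedy version, a replacement would add only a single comma near the end of the input, which is the "consumes too much" problem described above.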

References

GNU Emacs Manual 15.10.2: Regexp Replacement

Emacs Wiki Multiline Regexp

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/45181208/
