I have a text file that contains a repeating structure of header and detail records, for example:

StopService::
697::12::test::20::a@yahoo.com::20 Main Rd::Alcatraz::CA::1200::Please send me Information to
A@gmail.com::0::::

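I want to remove the line break between the header and the detail record so that they can be processed as a single record. Because the detail record can itself contain line breaks, I need to remove only the line breaks that directly follow a ::.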

I am no expert with regular expressions, so I searched and tried this approach, but it does not work:

 string text = File.ReadAllText(path);
 Regex.Replace(text, @"(?<=(:))(?!\1):\n", String.Empty);
 File.WriteAllText(path, text);

I also tried this:

Regex.Replace(text, @"(?<=::)\n", String.Empty);

Any idea how to use a regex lookbehind in this case? My output should look like this:

StopService::697::12::test::20::a@yahoo.com::20 Main Rd::Alcatraz::CA::1200::Please send me Information to
A@gmail.com::0::::

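Best answer

Non-regex way

Read the file line by line. Check whether the first line is equal to StopService:: and, if it is, do not add a line break (Environment.NewLine) after it.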
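The line-by-line idea could be sketched roughly as follows. This is only an illustration, not code from the answer: the names HeaderJoiner and JoinHeaders are hypothetical, and the sketch assumes that every header line is exactly the text StopService:: with nothing else on the line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HeaderJoiner
{
    // Hypothetical helper: glues each header line onto the line that follows it.
    public static string JoinHeaders(IEnumerable<string> lines, string header = "StopService::")
    {
        var sb = new StringBuilder();
        foreach (var line in lines)
        {
            sb.Append(line);
            // Only a line that is exactly the header loses its line break;
            // every other line (including detail continuations) keeps it.
            if (line != header)
                sb.Append(Environment.NewLine);
        }
        return sb.ToString();
    }
}
```

A possible call would be File.WriteAllText(path, HeaderJoiner.JoinHeaders(File.ReadAllLines(path))). Note that this sketch always ends the result with a line break, even if the original file did not.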

Regex way

You can match the line break after the first :: with a (?<=^[^:]*::) lookbehind:

using System;
using System.Text.RegularExpressions;

var str = "StopService::\r\n697::12::test::20::a@yahoo.com::20 Main Rd::Alcatraz::CA::1200::Please send me Information to\r\nA@gmail.com::0::::";
var rgx = new Regex(@"(?<=^[^:]*::)[\r\n]+"); // line breaks right after the first "::" in the string
Console.WriteLine(rgx.Replace(str, string.Empty));

Output:

StopService::697::12::test::20::a@yahoo.com::20 Main Rd::Alcatraz::CA::1200::Please send me Information to
A@gmail.com::0::::

See the IDEONE demo

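The lookbehind ( (?<=...) ) matches:

^ - the start of the string
[^:]* - 0 or more characters other than :
:: - 2 colons

The [\r\n]+ pattern makes sure that all line break characters are matched, even if there are several of them.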
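Since the question describes a repeating structure, one possible variation (not part of the answer above) is to add RegexOptions.Multiline so that ^ matches at the start of every line instead of only at the start of the string. The sketch below assumes that no detail line has a line break directly after its first ::, and the names RecordMerger and Merge are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RecordMerger
{
    // With RegexOptions.Multiline, ^ matches at the start of every line,
    // so the lookbehind can succeed after the first "::" of each line.
    private static readonly Regex HeaderBreak =
        new Regex(@"(?<=^[^:]*::)[\r\n]+", RegexOptions.Multiline);

    public static string Merge(string text)
    {
        // Regex.Replace returns a new string; the input is left unchanged.
        return HeaderBreak.Replace(text, string.Empty);
    }
}
```

The result has to be assigned before it is written back, for example text = RecordMerger.Merge(text); the snippets in the question discard the return value of Regex.Replace. In addition, with Windows line endings a :: is followed by \r rather than \n, so (?<=::)\n alone finds nothing to remove.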

Regarding "c# - How to use a lookbehind in a C# regex to remove line breaks?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30602447/
