C# regex: match a keyword in text and get a few words around the match

Tags: c#, regex

I need to match a keyword in some text and get the words around the match.

For example, my text is in HTML format; I will use the sample below:

<p>Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it</p>
<p>Edit: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue</p>

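Match cases:

Case 1 (the keyword is in the middle of the text): occurence

Result: I don't exactly know what happens if it wants to replace an occurence and can't find it

Case 2 (the keyword is at the start of the text): Do not

Result: Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it

Case 3 (the keyword is at the end of the text): to continue

Result: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue

If the keyword is in the middle of the text, the result should contain the text around it. If the keyword is the first word, the result should start at the keyword itself.

If the keyword is the last word, the result should contain the text leading up to the keyword.

Regex: (?<=(\w+)\s)?(continue)(?=\s(\w+))?

It only matches the keyword itself. How can I get, say, 10-15 words around the matched keyword?

Is this possible with a regular expression?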

Best answer

Case 1:

([\w\s']+(?:occurence)[^<]+)|>((?:occurence)[^<]+)|[^>]+(?:occurence)<

Output:

I don't exactly know what happens if it wants to replace an occurence and can't find it

Case 2:

([\w\s']+(?:Do not)[^<]+)|>((?:Do not)[^<]+)|[^>]+(?:Do not)<

Output:

Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it

Case 3:

([\w\s']+(?:to continue)[^<]+)|>((?:to continue)[^<]+)|[^>]+(?:to continue)<

Output:

Edit: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue
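
The three patterns above differ only in the keyword, so they can be built from one template. The following is a minimal sketch of how that might look in C#; the class and method names (KeywordContext, BuildPattern, FindContext) are hypothetical and not part of the original answer. It assumes the keyword should be treated as literal text, so it is passed through Regex.Escape, and it trims the leading space and trailing "<" that the raw match can carry.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class KeywordContext
{
    // Builds the answer's three-branch pattern for an arbitrary keyword.
    public static string BuildPattern(string keyword)
    {
        string k = Regex.Escape(keyword); // treat the keyword as literal text
        return @"([\w\s']+(?:" + k + @")[^<]+)"  // branch 1: keyword in the middle
             + @"|>((?:" + k + @")[^<]+)"        // branch 2: keyword right after a tag
             + @"|[^>]+(?:" + k + @")<";         // branch 3: keyword right before a tag
    }

    // Returns the text around the first match, or an empty string if none.
    public static string FindContext(string html, string keyword)
    {
        Match m = Regex.Match(html, BuildPattern(keyword));
        if (!m.Success) return string.Empty;

        // Branches 1 and 2 capture the text in a group.
        if (m.Groups[1].Success) return m.Groups[1].Value.Trim();
        if (m.Groups[2].Success) return m.Groups[2].Value.Trim();

        // Branch 3 has no group and its match ends with '<'; strip it.
        return m.Value.TrimEnd('<').Trim();
    }
}
```

With the two sample paragraphs joined into one string, calling KeywordContext.FindContext(html, "occurence") should return the Case 1 output shown above, and the keywords "Do not" and "to continue" should return the Case 2 and Case 3 outputs.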

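Limiting the number of words (the {0,100} quantifiers cap how many words are taken on each side of the keyword; lower the upper bound, for example to {0,15}, for a smaller window):

Case 1: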

>(Do not(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?Do not(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?Do not)<

Case 2:

>(occurence(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?occurence(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?occurence)<

Case 3:

>(continue(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?continue(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?continue)<
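
For the 10-15 word window asked about in the question, a bounded quantifier with a configurable upper limit may be enough. The sketch below is a simplified alternative, not the answer's pattern: it assumes the paragraph text has already been extracted from the HTML (no tags in the input), treats any run of non-whitespace characters as a word, and uses hypothetical names (KeywordWindow, Around, maxWords).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class KeywordWindow
{
    // Returns the keyword plus up to maxWords words on each side,
    // or an empty string when the keyword is not found.
    // Assumes maxWords is non-negative and text contains no HTML tags.
    public static string Around(string text, string keyword, int maxWords)
    {
        string n = maxWords.ToString();
        string pattern =
            @"(?<!\S)"                        // start at the beginning of a word
          + @"(?:\S+\s+){0," + n + "}"        // up to n words before the keyword
          + Regex.Escape(keyword)             // the keyword as literal text
          + @"(?:\s+\S+){0," + n + "}";       // up to n words after the keyword

        Match m = Regex.Match(text, pattern);
        return m.Success ? m.Value : string.Empty;
    }
}
```

Because both quantifiers allow zero repetitions, the same pattern covers a keyword at the start, in the middle, or at the end of the text. For the first sample paragraph, KeywordWindow.Around(text, "occurence", 3) should give "to replace an occurence and can't find"; passing 15 instead of 3 widens the window accordingly.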

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/35771067/
