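I need to match a keyword in some text and get the words surrounding the match.
My text is in HTML format; I will use the sample below as an example: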
<p>Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it</p>
<p>Edit: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue</p>
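Match cases:
Case 1 (the keyword is in the middle of the text): occurence
Result: I don't exactly know what happens if it wants to replace an occurence and can't find it
Case 2 (the keyword is at the start of the text): Do not
Result: Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it
Case 3 (the keyword is at the end of the text): to continue
Result: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue
If the keyword is in the middle of the text, the result should be the text surrounding it. If the keyword is the first word, the result should start from that word.
If the keyword is the last word, the result should be the text leading up to and including it.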
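The regular expression I have so far: (?<=(\w+)\s)?(continue)(?=\s(\w+))?
It only matches the keyword itself. How can I also get, say, 10-15 words around the matched keyword?
Is this possible with a regular expression?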
Best answer
Case 1:
([\w\s']+(?:occurence)[^<]+)|>((?:occurence)[^<]+)|[^>]+(?:occurence)<
Output:
I don't exactly know what happens if it wants to replace an occurence and can't find it
Case 2:
([\w\s']+(?:Do not)[^<]+)|>((?:Do not)[^<]+)|[^>]+(?:Do not)<
Output:
Do not forget the error handling, I don't exactly know what happens if it wants to replace an occurence and can't find it
Case 3:
([\w\s']+(?:to continue)[^<]+)|>((?:to continue)[^<]+)|[^>]+(?:to continue)<
Output:
Edit: If you have multiple entries which should be replaced, loop the replace part until it will not be able to replace anymore then it will throw an error you can catch to continue
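The three patterns above differ only in the keyword, so they can be generated from one template. The following is a minimal sketch in Python rather than C#, used here only for illustration. The helper name `sentence_around` and the constant `SAMPLE_HTML` are hypothetical and not part of the original answer. Note that the third alternative has no capture group and its match ends with the `<` of the closing tag, so the sketch trims that character.

```python
import re

# The sample HTML from the question (two paragraphs).
SAMPLE_HTML = (
    "<p>Do not forget the error handling, I don't exactly know what happens "
    "if it wants to replace an occurence and can't find it</p>\n"
    "<p>Edit: If you have multiple entries which should be replaced, loop the "
    "replace part until it will not be able to replace anymore then it will "
    "throw an error you can catch to continue</p>"
)


def sentence_around(keyword, html):
    """Return the text around the first match of `keyword`, or None.

    Hypothetical helper: builds the answer's three-way alternation
    (keyword in the middle | keyword right after '>' | keyword right before '<').
    """
    kw = re.escape(keyword)  # treat the keyword literally
    pattern = rf"([\w\s']+(?:{kw})[^<]+)|>((?:{kw})[^<]+)|[^>]+(?:{kw})<"
    m = re.search(pattern, html)
    if m is None:
        return None
    # Alternatives 1 and 2 capture the text; alternative 3 does not, and its
    # full match ends with the '<' of the closing tag, which is trimmed here.
    text = m.group(1) or m.group(2) or m.group(0).rstrip("<")
    return text.strip()


if __name__ == "__main__":
    for kw in ("occurence", "Do not", "to continue"):
        print(repr(kw), "->", sentence_around(kw, SAMPLE_HTML))
```

Because the first alternative only allows word characters, whitespace, and apostrophes before the keyword, the text returned for a mid-sentence keyword starts after the nearest preceding punctuation mark, which is why Case 1 begins at "I don't" rather than at "Do not".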
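Limiting the number of words (the {0,100} quantifiers set the maximum number of words captured on each side of the keyword; lower them, e.g. to {0,15}, for a smaller window):
Case 1: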
>(Do not(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?Do not(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?Do not)<
Case 2:
>(occurence(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?occurence(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?occurence)<
Case 3:
>(continue(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w']+)){0,100}\s?continue(?:\s(?:[\w']+),?){0,100})|((?:\s(?:[\w',]+)){0,100}\s?continue)<
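To get a window of roughly 10-15 words as the question asks, the upper bound of each quantifier can be made a parameter. The sketch below is again in Python for illustration only. The helper name `words_around` and its `max_words` parameter are hypothetical, and the pattern is the answer's word-limited template with `{0,100}` replaced by `{0,max_words}`.

```python
import re

# The sample HTML from the question (two paragraphs).
SAMPLE_HTML = (
    "<p>Do not forget the error handling, I don't exactly know what happens "
    "if it wants to replace an occurence and can't find it</p>\n"
    "<p>Edit: If you have multiple entries which should be replaced, loop the "
    "replace part until it will not be able to replace anymore then it will "
    "throw an error you can catch to continue</p>"
)


def words_around(keyword, html, max_words=10):
    """Return the keyword with at most `max_words` words on each side, or None.

    Hypothetical helper built on the answer's word-limited pattern.
    """
    kw = re.escape(keyword)  # treat the keyword literally
    n = int(max_words)
    pattern = (
        # keyword directly after a tag, followed by up to n words
        rf">({kw}(?:\s(?:[\w']+),?){{0,{n}}})"
        # up to n words, the keyword, then up to n more words
        rf"|((?:\s(?:[\w']+)){{0,{n}}}\s?{kw}(?:\s(?:[\w']+),?){{0,{n}}})"
        # up to n words, then the keyword directly before a tag
        rf"|((?:\s(?:[\w',]+)){{0,{n}}}\s?{kw})<"
    )
    m = re.search(pattern, html)
    if m is None:
        return None
    # Exactly one of the three capture groups participates in a match.
    captured = next(g for g in m.groups() if g is not None)
    return captured.strip()


if __name__ == "__main__":
    for kw in ("Do not", "occurence", "continue"):
        print(repr(kw), "->", words_around(kw, SAMPLE_HTML, max_words=3))
```

With `max_words=3` the keyword `occurence` yields "to replace an occurence and can't find". The word sub-pattern in front of the keyword does not accept commas in the middle alternative, so the leading context stops at a word that ends in a comma.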
For more on using a C# regular expression to match a keyword in text and get a few words around the match, see the similar question on Stack Overflow: https://stackoverflow.com/questions/35771067/