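I'm developing software that has to check whether a text contains a word taken from one specified list and also a word taken from another specified list.
Example: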
list 1: dog, cat
list 2: house, tree
The following texts must match:
the dog is in the house -> contains dog and house
my house is full of dogs -> contains dog and house
the cat is on the tree -> contains cat and tree
The following examples must not match:
the frog is in the house -> there is no word from the first list
Boby is the name of my dog -> there is no word from the second list
Outside my house there is a tree -> there is no word from the first list
As a quick fix, I listed out patterns like these:
dog.*house, house.*dog, cat.*house, ...
But I'm pretty sure there is a smarter way...
Best answer
You can use an alternation (|) for each set of alternatives, and a wrapping alternation to cover both orders. So:
(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))
JavaScript example (non-capturing groups and alternation work the same way in Java and JavaScript):
var tests = [
    {match: true, text: "the dog is in the house -> contains dog and house"},
    {match: true, text: "my house is full of dogs -> contains dog and house"},
    {match: true, text: "the cat is on the tree -> contains cat and tree"},
    {match: false, text: "the frog is in the house -> there is no word from the first list"},
    {match: false, text: "Boby is the name of my dog -> there is no word from the second list"},
    {match: false, text: "Outside my house there is a tree -> there is no word from the first list"}
];
var rex = /(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))/;
tests.forEach(function(test) {
    var result = rex.test(test.text);
    if (!!result == !!test.match) {
        console.log('GOOD: "' + test.text + '": ' + result);
    } else {
        console.log('BAD: "' + test.text + '": ' + result + ' (expected ' + test.match + ')');
    }
});
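Since the two lists are given as data, the same pattern could also be assembled from them instead of being typed by hand. The following is only a sketch and not part of the original answer: `escapeRegex` and `buildEitherOrderRegex` are hypothetical helper names, and the escaping step assumes list entries might contain regex metacharacters.

```javascript
// Hypothetical helper: escape regex metacharacters so a list entry is matched literally.
function escapeRegex(word) {
    return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build "(A.*B)|(B.*A)", where A and B are
// non-capturing alternations over the two word lists.
function buildEitherOrderRegex(list1, list2) {
    var a = '(?:' + list1.map(escapeRegex).join('|') + ')';
    var b = '(?:' + list2.map(escapeRegex).join('|') + ')';
    return new RegExp('(?:' + a + '.*' + b + ')|(?:' + b + '.*' + a + ')');
}

var rexFromLists = buildEitherOrderRegex(['dog', 'cat'], ['house', 'tree']);
console.log(rexFromLists.source);
console.log(rexFromLists.test('the dog is in the house'));
console.log(rexFromLists.test('the frog is in the house'));
```

Keep in mind that `.` does not match line terminators by default, so if the two words can sit on different lines the pattern would need the `s` flag in JavaScript (or `Pattern.DOTALL` in Java).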
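Note that the pattern above does not check for words, only for sequences of letters. If you want actual words, you'll need to add word-break assertions or similar. Left as an exercise for the reader...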
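One possible take on that exercise, sketched under assumptions rather than taken from the answer: put `\b` in front of each alternation so that a list entry has to start at a word boundary. A trailing `\b` is left out on purpose, because the question expects "dogs" to count as "dog"; the trade-off is that any word that merely starts with a list entry (such as "category" for "cat") still matches. The names `wordRex` and `checks` are illustrative only.

```javascript
// Leading \b only: a list entry must begin a word, but may be followed by more letters.
var wordRex = /(?:\b(?:dog|cat).*\b(?:house|tree))|(?:\b(?:house|tree).*\b(?:dog|cat))/;

var checks = [
    "my house is full of dogs",    // plural still counts as "dog"
    "the hotdog is in the house",  // "dog" is not at the start of a word here
    "a category of house"          // caveat: "category" starts with "cat"
];
checks.forEach(function(text) {
    console.log('"' + text + '": ' + wordRex.test(text));
});
```

In a Java string literal the backslash has to be doubled, so `\b` is written as `"\\b"`.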
Regarding "java - Regex that matches only when the string contains a word from each list", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48563933/