java - Regex that matches only when the string contains a word from each list

Tags: java regex permutation

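I'm developing software that has to check whether a text contains a word taken from one specified list and also a word taken from another specified list.

Example: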

list 1: dog, cat
list 2: house, tree

The following texts must match:

the dog is in the house -> contains dog and house
my house is full of dogs -> contains dog and house
the cat is on the tree -> contains cat and tree

The following examples must not match:

the frog is in the house -> there is no word from the first list
Boby is the name of my dog -> there is no word from the second list
Outside my house there is a tree -> there is no word from the first list

As a quick fix, I wrote out a list of patterns like this:

dog.*house, house.*dog, cat.*house, ...

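But I'm fairly sure there is a smarter way...

Best answer

You can use an alternation (|) for each set of alternatives, and a wrapping alternation to cover both orders. So: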

(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))
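Since the question is about Java, here is a minimal sketch of how the pattern above might be applied there. The class name `TwoListMatch` and the method `containsBoth` are made up for illustration. It uses `Matcher.find()`, which searches anywhere in the text, rather than `matches()`, which requires the whole string to match.

```java
import java.util.regex.Pattern;

class TwoListMatch {
    // Same pattern as above: a word from list 1 followed by a word from list 2,
    // or the reverse order. Compiled once and reused.
    private static final Pattern BOTH = Pattern.compile(
        "(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))");

    // Hypothetical helper: true if the text contains a letter sequence
    // from each list, in either order.
    static boolean containsBoth(String text) {
        return BOTH.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBoth("the dog is in the house"));    // true
        System.out.println(containsBoth("my house is full of dogs"));   // true
        System.out.println(containsBoth("the frog is in the house"));   // false
        System.out.println(containsBoth("Boby is the name of my dog")); // false
    }
}
```

Note that `.` does not match line terminators by default in Java, so for multi-line text you would probably want to compile with `Pattern.DOTALL`.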

JavaScript example (non-capturing groups and alternation work the same way in Java and JavaScript):

var tests = [
    {match: true,  text: "the dog is in the house -> contains dog and house"},
    {match: true,  text: "my house is full of dogs -> contains dog and house"},
    {match: true,  text: "the cat is on the tree -> contains cat and tree"},
    {match: false, text: "the frog is in the house -> there is no word from the first list"},
    {match: false, text: "Boby is the name of my dog -> there is no word from the second list"},
    {match: false, text: "Outside my house there is a tree -> there is no word from the first list"}
];
var rex = /(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))/;
tests.forEach(function(test) {
  var result = rex.test(test.text);
  if (!!result == !!test.match) {
    console.log('GOOD: "' + test.text + '": ' + result);
  } else {
    console.log('BAD: "' + test.text + '": ' + result + ' (expected ' + test.match + ')');
  }
});

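Note that the above does not check for words, only for sequences of letters. If you want actual words, you need to add word-boundary assertions or similar. That is left as an exercise for the reader...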
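One possible way to do that exercise in Java is sketched below. It builds the pattern from the two lists and wraps each alternation in `\b` word boundaries. All names (`WordListMatcher`, `group`, `build`, `containsBoth`) are hypothetical. The optional `s?` is an assumption added here so that "dogs" still counts as "dog", as the question requires; real plural handling would need more than that.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordListMatcher {
    // Builds \b(?:w1|w2|...)s?\b for one list.
    // Pattern.quote keeps any regex metacharacters in the words literal.
    static String group(List<String> words) {
        String alternatives = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return "\\b(?:" + alternatives + ")s?\\b";
    }

    // Wraps both orders in an outer alternation, as in the answer above.
    static Pattern build(List<String> first, List<String> second) {
        String a = group(first);
        String b = group(second);
        return Pattern.compile("(?:" + a + ".*" + b + ")|(?:" + b + ".*" + a + ")");
    }

    private static final Pattern BOTH =
            build(List.of("dog", "cat"), List.of("house", "tree"));

    static boolean containsBoth(String text) {
        return BOTH.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBoth("my house is full of dogs"));   // true
        System.out.println(containsBoth("the hotdog is in the house")); // false
        System.out.println(containsBoth("the category of the tree"));   // false
    }
}
```

With the original pattern, "hotdog" and "category" would both count as hits, because it only looks for letter sequences.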

This page on "java - Regex that matches only when the string contains a word from each list" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/48563933/
