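I am trying to implement a search feature.
The user enters a phrase, and I want to match any single word of that phrase, as well as the phrase itself, against an array of strings.
The problem is that the phrase is stored in a variable, so the Pattern.compile method does not interpret its special characters.
I pass the following flags to the compile method: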
Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.LITERAL | Pattern.MULTILINE
How can I get the intended result?
Thanks in advance.
Edit: for example, the phrase:
"Dogs cats donuts"
would produce the following pattern:
Dogs | cats | donuts | Dogs cats donuts
Best answer
- Split the user-specified phrase on \s+ into an array arr.
- Build the following pattern:
"\\b(?:" + Pattern.quote(arr[0]) + "|" + Pattern.quote(arr[1]) + "|" + Pattern.quote(arr[2]) + ... + ")\\b"
- Compile it without the Pattern.LITERAL option.
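The steps above can be sketched roughly as follows. This is only an illustration: the class name PhraseWordPattern and the helpers build and findAll are made up for this example, and it assumes the phrase contains at least one non-whitespace word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class PhraseWordPattern {
    // Builds \b(?:w1|w2|...)\b, quoting each word so its characters are literal.
    static Pattern build(String phrase) {
        String[] arr = phrase.trim().split("\\s+");
        StringBuilder regex = new StringBuilder("\\b(?:");
        for (int i = 0; i < arr.length; i++) {
            if (i > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(arr[i]));
        }
        regex.append(")\\b");
        // No Pattern.LITERAL here, so the pipes and \b keep their regex meaning.
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Collects every match of the pattern in the given text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = build("Dogs cats donuts");
        System.out.println(findAll(p, "My cats ate the DONUTS, not the dogs."));
        // prints [cats, DONUTS, dogs]
    }
}
```

One caveat with this sketch: \b only helps when a word starts and ends with a word character, so a user word such as "c++" would need different boundary handling.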
In other words, if you want your pattern to match words in a user-specified phrase, you have to use alternation (the pipes) so that any one of those words can be considered a match. However, the Pattern.LITERAL option makes the alternation operators literal as well, so you have to "literalize" just the words themselves, using the Pattern.quote(...) method. The \\b tokens are word boundaries, so that a word in the user's phrase such as "bar" does not match inside text like "barrage".
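To make the difference concrete, here is a small sketch comparing the two approaches. The class LiteralVersusQuote and its methods are hypothetical names chosen for this example, and the two-word signature is just a simplification.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LiteralVersusQuote {
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

    // Compiles the whole string with LITERAL: '|' becomes an ordinary character.
    static Pattern wholeLiteral(String s) {
        return Pattern.compile(s, FLAGS | Pattern.LITERAL);
    }

    // Quotes only the words; the pipe and \b stay regex operators.
    static Pattern quotedWords(String a, String b) {
        return Pattern.compile(
                "\\b(?:" + Pattern.quote(a) + "|" + Pattern.quote(b) + ")\\b", FLAGS);
    }

    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        Pattern literal = wholeLiteral("bar|cats");
        System.out.println(found(literal, "I like cats"));          // false
        System.out.println(found(literal, "the string bar|cats"));  // true

        Pattern quoted = quotedWords("bar", "cats");
        System.out.println(found(quoted, "I like cats"));           // true
        System.out.println(found(quoted, "meet me at the bar"));    // true
        System.out.println(found(quoted, "a barrage of noise"));    // false
    }
}
```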
Edit, in response to your edit: if you want the longest possible match, e.g. not "Dogs" and "cats" and "donuts" separately but rather "Dogs cats donuts", place the complete phrase at the beginning of the alternation series, because the regex engine tries the alternatives from left to right, e.g.
\\b(Dogs cats donuts|Dogs|cats|donuts)\\b
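A possible way to build such a phrase-first pattern from a variable is sketched below. The names LongestPhraseFirst, build and findAll are invented for this example, and it assumes the text contains the phrase with exactly the same spacing the user typed, since the quoted phrase is matched literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LongestPhraseFirst {
    // Builds \b(?:phrase|w1|w2|...)\b with the full phrase as the first alternative.
    static Pattern build(String phrase) {
        String whole = phrase.trim();
        StringBuilder regex = new StringBuilder("\\b(?:");
        regex.append(Pattern.quote(whole));
        for (String word : whole.split("\\s+")) {
            regex.append('|').append(Pattern.quote(word));
        }
        regex.append(")\\b");
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Collects every match of the pattern in the given text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = build("Dogs cats donuts");
        System.out.println(findAll(p, "dogs cats donuts, then just cats"));
        // prints [dogs cats donuts, cats]
    }
}
```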
Regarding "Java regex: Match any word from pattern", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17985990/