java - How can Lucene's scoring depend on the relative position of the query terms?

I use WhitespaceAnalyzer as the query analyzer. Suppose I have 2 documents:

| text | a b c |
| text | b a c |

text is a field.

The index structure now looks like this:

|Term|  in document | 
| a  | a b c / b a c|
| b  | a b c / b a c|
| c  | a b c / b a c|

I have a query:

| text | a b c |

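How can I get a higher score for a b c and a lower score for b a c?

Does Lucene support computing scores based on the relative position of terms?

I found that this helps: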

PhraseQuery phraseQuery = new PhraseQuery();
phraseQuery.setSlop(1);

This way the two documents get different scores.
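As a rough illustration of why the scores differ, here is a plain-Java sketch of sloppy phrase matching. It is a simplified model, not Lucene's actual implementation: the class and method names are hypothetical, and it assumes each term occurs at most once per document. The weight 1/(distance + 1) mirrors what Lucene's DefaultSimilarity.sloppyFreq returns, and the distance follows the PhraseQuery.setSlop documentation, which says that swapping two words takes two moves.

```java
import java.util.Arrays;
import java.util.List;

public class SloppyPhraseSketch {

    /**
     * Simplified match distance: for each phrase term, take its position in
     * the document minus its offset in the phrase, then return max - min of
     * those shifted values. Returns -1 if a phrase term is missing.
     * Assumption: each term occurs at most once in the document.
     */
    static int matchDistance(List<String> phrase, List<String> docTokens) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int offset = 0; offset < phrase.size(); offset++) {
            int position = docTokens.indexOf(phrase.get(offset));
            if (position < 0) {
                return -1; // term not present, so the phrase cannot match
            }
            int shifted = position - offset;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    /** Weight of a match: 0 if it exceeds the slop, otherwise 1 / (distance + 1). */
    static float sloppyWeight(int distance, int slop) {
        if (distance < 0 || distance > slop) {
            return 0f;
        }
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        List<String> phrase = Arrays.asList("a", "b", "c");
        List<String> doc1 = Arrays.asList("a", "b", "c");
        List<String> doc2 = Arrays.asList("b", "a", "c");
        for (int slop = 1; slop <= 2; slop++) {
            System.out.println("slop=" + slop
                    + " [a b c]=" + sloppyWeight(matchDistance(phrase, doc1), slop)
                    + " [b a c]=" + sloppyWeight(matchDistance(phrase, doc2), slop));
        }
    }
}
```

In this model a b c has distance 0 and b a c has distance 2. With a slop of 1 only a b c matches, and with a slop of 2 or more both match but b a c gets a smaller weight.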

I have another question here: https://stackoverflow.com/questions/18394532/how-can-lucenes-scoring-depend-on-terms-relative-position-in-the-document

Best answer

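It depends on the type of query you use. Some queries can give a higher score when the terms of the phrase you search for appear in the correct order (for example, new york versus york new). According to the Lucene documentation, you can use score explanations to see why A B C scores higher than B A C.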

Scoring is very much dependent on the way documents are indexed, so it is important to understand indexing (see Apache Lucene - Getting Started Guide and the Lucene file formats before continuing on with this section.) It is also assumed that readers know how to use the Searcher.explain(Query query, int doc) functionality, which can go a long way in informing why a score is returned.

http://lucene.apache.org/core/3_6_2/scoring.html
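A minimal sketch of how the explain call might be used on the two documents from the question. It assumes Lucene 3.6 on the classpath; the class name PhraseExplainSketch, the in-memory RAMDirectory, and the slop values are illustrative choices, not part of the original question.

```java
import java.io.IOException;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class PhraseExplainSketch {

    // Index the two example documents into an in-memory directory.
    static Directory buildIndex() throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_36, new WhitespaceAnalyzer(Version.LUCENE_36));
        IndexWriter writer = new IndexWriter(dir, config);
        for (String text : new String[] {"a b c", "b a c"}) {
            Document doc = new Document();
            doc.add(new Field("text", text, Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);
        }
        writer.close();
        return dir;
    }

    // Run the phrase "a b c" with the given slop and print an explanation per hit.
    static ScoreDoc[] search(Directory dir, int slop) throws IOException {
        PhraseQuery query = new PhraseQuery();
        query.add(new Term("text", "a"));
        query.add(new Term("text", "b"));
        query.add(new Term("text", "c"));
        query.setSlop(slop);

        IndexReader reader = IndexReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            ScoreDoc[] hits = searcher.search(query, 10).scoreDocs;
            for (ScoreDoc hit : hits) {
                System.out.println(searcher.doc(hit.doc).get("text") + " -> " + hit.score);
                // The explanation breaks the score down into tf, idf and norm factors.
                System.out.println(searcher.explain(query, hit.doc));
            }
            return hits;
        } finally {
            searcher.close();
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Directory dir = buildIndex();
        search(dir, 1); // strict enough that only "a b c" should match
        search(dir, 2); // loose enough to also match "b a c", with a lower score
    }
}
```

Comparing the two explanations for a slop of 2 should show the difference in the phrase frequency term, since the other factors are the same for both documents.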

UPD: If you use Lucene 3 and want to store term positions, take a look at http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/document/Field.TermVector.html

Regarding "java - How can Lucene's scoring depend on the relative position of the query terms?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/18247778/
