Solr: fieldNorm differs per document, with no document boost

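I want my search results ordered by score, and they are, but the score is being computed incorrectly. Or rather, not necessarily incorrectly, but differently than I expect, and I'm not sure why. My goal is to eliminate whatever is altering the score.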

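If I run a search that matches two objects (where ObjectA is expected to score higher than ObjectB), ObjectB is returned first.

For this example, assume my query is a single term: "apples".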

ObjectA's title: "apples are apples" (2/3 terms)
ObjectA's description: "There were apples in the apples-apples and now the apples went all apples all over the apples!" (6/18 terms)
ObjectB's title: "apples are great" (1/3 terms)
ObjectB's description: "There were apples in the apples-room and now the apples went all bad all over the apples!" (4/18 terms)

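The title field has no boost (or rather, a boost of 1), and the description field has a boost of 0.8. I have not specified a document boost, either in solrconfig.xml or in the query I'm sending. If there is another way to specify a document boost, it may be one I'm missing.

After analyzing the explain output, it looks like ObjectA is correctly scoring higher than ObjectB, just as I want, except for one difference: ObjectB's title fieldNorm is always higher than ObjectA's.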


Below is the explain output. For reference: the title field is mditem5_tns and the description field is mditem7_tns:

ObjectB:
1.3327172 = (MATCH) sum of:
  1.0352166 = (MATCH) max plus 0.1 times others of:
    0.9766194 = (MATCH) weight(mditem5_tns:appl in 0), product of:
      0.53929156 = queryWeight(mditem5_tns:appl), product of:
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.8109303 = (MATCH) fieldWeight(mditem5_tns:appl in 0), product of:
        1.0 = tf(termFreq(mditem5_tns:appl)=1)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        1.0 = fieldNorm(field=mditem5_tns, doc=0)
    0.58597165 = (MATCH) weight(mditem7_tns:appl^0.8 in 0), product of:
      0.43143326 = queryWeight(mditem7_tns:appl^0.8), product of:
        0.8 = boost
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.3581977 = (MATCH) fieldWeight(mditem7_tns:appl in 0), product of:
        2.0 = tf(termFreq(mditem7_tns:appl)=4)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.375 = fieldNorm(field=mditem7_tns, doc=0)
  0.2975006 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(lastmodified)))+1000.0)), product of:
    0.999001 = 1000.0/(1.0*float(1)+1000.0)
    1.0 = boost
    0.2977981 = queryNorm

ObjectA:
1.2324848 = (MATCH) sum of:
  0.93498427 = (MATCH) max plus 0.1 times others of:
    0.8632177 = (MATCH) weight(mditem5_tns:appl in 0), product of:
      0.53929156 = queryWeight(mditem5_tns:appl), product of:
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.6006513 = (MATCH) fieldWeight(mditem5_tns:appl in 0), product of:
        1.4142135 = tf(termFreq(mditem5_tns:appl)=2)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.625 = fieldNorm(field=mditem5_tns, doc=0)
    0.7176658 = (MATCH) weight(mditem7_tns:appl^0.8 in 0), product of:
      0.43143326 = queryWeight(mditem7_tns:appl^0.8), product of:
        0.8 = boost
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.6634457 = (MATCH) fieldWeight(mditem7_tns:appl in 0), product of:
        2.4494898 = tf(termFreq(mditem7_tns:appl)=6)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.375 = fieldNorm(field=mditem7_tns, doc=0)
  0.2975006 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(lastmodified)))+1000.0)), product of:
    0.999001 = 1000.0/(1.0*float(1)+1000.0)
    1.0 = boost
    0.2977981 = queryNorm
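The arithmetic in the two explain trees can be checked by hand. The following is a minimal sketch, not Solr code: the constants are copied from the output above, and the function names (`field_score`, `total_score`) are made up for illustration. It assumes, as the output shows, that tf is the square root of the term frequency and that the two field scores are combined as "max plus 0.1 times others".

```python
import math

# Constants copied from the explain output above.
IDF = 1.8109303             # idf(docFreq=3, maxDocs=9)
QUERY_NORM = 0.2977981      # queryNorm
TIE = 0.1                   # "max plus 0.1 times others"
FUNCTION_SCORE = 0.2975006  # FunctionQuery contribution, identical for both docs


def field_score(term_freq, field_norm, boost=1.0):
    """weight = queryWeight * fieldWeight for a single field."""
    query_weight = boost * IDF * QUERY_NORM
    field_weight = math.sqrt(term_freq) * IDF * field_norm
    return query_weight * field_weight


def total_score(title, description):
    """title and description are (term_freq, field_norm) pairs."""
    scores = [field_score(*title), field_score(*description, boost=0.8)]
    best = max(scores)
    return best + TIE * (sum(scores) - best) + FUNCTION_SCORE


score_b = total_score(title=(1, 1.0), description=(4, 0.375))
score_a = total_score(title=(2, 0.625), description=(6, 0.375))

# Hypothetical: ObjectA's title given the same fieldNorm as ObjectB's.
score_a_equal_norm = total_score(title=(2, 1.0), description=(6, 0.375))

print(score_b, score_a, score_a_equal_norm)
```

With the reported norms, ObjectB comes out ahead. With the title fieldNorm held equal, ObjectA would rank first, which isolates the title fieldNorm as the only factor reversing the order.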

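Best answer

The problem is caused by the stemmer. It expands "apples are apples" to "apples appl are apples appl", which makes the field longer. Since document B contains only one term that the stemmer expands, its field is shorter than document A's.

This results in different fieldNorms: fieldNorm decreases as the number of indexed terms in a field grows, so the longer title in document A gets the lower value.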
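To illustrate how field length turns into a fieldNorm, here is a rough sketch. It assumes the classic Lucene similarity, where the length norm is 1/sqrt(number of indexed terms) and is stored in a single lossy byte with three mantissa bits. The Python functions below are illustrative names modeled on that encoding; they are not a Solr or Lucene API, and other versions or custom similarities may behave differently.

```python
import math
import struct


def float_to_byte315(f):
    """Pack a positive float into one byte: 3 mantissa bits, zero exponent 15."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> (24 - 3)
    if small <= ((63 - 15) << 3):
        return 0 if bits <= 0 else 1   # underflow
    if small >= ((63 - 15) << 3) + 0x100:
        return 255                     # overflow
    return small - ((63 - 15) << 3)


def byte315_to_float(b):
    """Decode the byte back into a float; low mantissa bits are lost."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_field_norm(num_terms, boost=1.0):
    """Length norm as it would read back after the one-byte round trip."""
    raw = boost / math.sqrt(num_terms)
    return byte315_to_float(float_to_byte315(raw))


for n in (1, 2, 3, 4):
    print(n, stored_field_norm(n))
```

Under these assumptions a field with one indexed term reads back as 1.0 and a field with two reads back as 0.625, the two title values in the explain output. The coarse encoding also means that fields of slightly different lengths can share a norm, while one extra term can cause a visible jump.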

Source: Stack Overflow, https://stackoverflow.com/questions/3102895/
