Solr: fieldNorm differs per document, with no document boost


Reposted by: 行者123 Updated: 2023-12-02 02:09:06

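I want my search results ordered by score, and they are, but the scores are not being calculated the way I expect. They are not necessarily wrong, just different from what I expected, and I am not sure why. My goal is to remove whatever is altering the score.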

If I run a search that matches two objects (where ObjectA is expected to score higher than ObjectB), ObjectB is returned first.

For this example, assume my query is a single term: "apples".

ObjectA's title: "apples are apples" (2/3 terms)
ObjectA's description: "There were apples in the apples-apples and now the apples went all apples all over the apples!" (6/18 terms)
ObjectB's title: "apples are great" (1/3 terms)
ObjectB's description: "There were apples in the apples-room and now the apples went all bad all over the apples!" (4/18 terms)

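The title field has no boost (or rather, a boost of 1), and the description field has a boost of 0.8. I have not specified a document boost, either in solrconfig.xml or in the query I am sending. If there is another way to specify a document boost, I may be missing it.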

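After analyzing the explain printout, it looks like ObjectA is correctly calculating a higher score than ObjectB, just as I want, except for one difference: ObjectB's title fieldNorm is always higher than ObjectA's.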


Here is the explain printout. For reference, the title field is mditem5_tns and the description field is mditem7_tns:

ObjectB:
1.3327172 = (MATCH) sum of:
  1.0352166 = (MATCH) max plus 0.1 times others of:
    0.9766194 = (MATCH) weight(mditem5_tns:appl in 0), product of:
      0.53929156 = queryWeight(mditem5_tns:appl), product of:
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.8109303 = (MATCH) fieldWeight(mditem5_tns:appl in 0), product of:
        1.0 = tf(termFreq(mditem5_tns:appl)=1)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        1.0 = fieldNorm(field=mditem5_tns, doc=0)
    0.58597165 = (MATCH) weight(mditem7_tns:appl^0.8 in 0), product of:
      0.43143326 = queryWeight(mditem7_tns:appl^0.8), product of:
        0.8 = boost
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.3581977 = (MATCH) fieldWeight(mditem7_tns:appl in 0), product of:
        2.0 = tf(termFreq(mditem7_tns:appl)=4)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.375 = fieldNorm(field=mditem7_tns, doc=0)
  0.2975006 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(lastmodified)))+1000.0)), product of:
    0.999001 = 1000.0/(1.0*float(1)+1000.0)
    1.0 = boost
    0.2977981 = queryNorm

ObjectA:
1.2324848 = (MATCH) sum of:
  0.93498427 = (MATCH) max plus 0.1 times others of:
    0.8632177 = (MATCH) weight(mditem5_tns:appl in 0), product of:
      0.53929156 = queryWeight(mditem5_tns:appl), product of:
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.6006513 = (MATCH) fieldWeight(mditem5_tns:appl in 0), product of:
        1.4142135 = tf(termFreq(mditem5_tns:appl)=2)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.625 = fieldNorm(field=mditem5_tns, doc=0)
    0.7176658 = (MATCH) weight(mditem7_tns:appl^0.8 in 0), product of:
      0.43143326 = queryWeight(mditem7_tns:appl^0.8), product of:
        0.8 = boost
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.2977981 = queryNorm
      1.6634457 = (MATCH) fieldWeight(mditem7_tns:appl in 0), product of:
        2.4494898 = tf(termFreq(mditem7_tns:appl)=6)
        1.8109303 = idf(docFreq=3, maxDocs=9)
        0.375 = fieldNorm(field=mditem7_tns, doc=0)
  0.2975006 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(lastmodified)))+1000.0)), product of:
    0.999001 = 1000.0/(1.0*float(1)+1000.0)
    1.0 = boost
    0.2977981 = queryNorm
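
To see how much of the ranking comes from fieldNorm alone, the arithmetic in the explain output can be replayed by hand. The sketch below is not Solr code. It assumes the classic Lucene TF-IDF formulas that the printed numbers are consistent with (tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1))), and the function names are hypothetical. All constants are copied from the printout above.

```python
import math

QUERY_NORM = 0.2977981   # queryNorm, copied from the explain output
TIE_BREAKER = 0.1        # from "max plus 0.1 times others"


def idf(doc_freq, max_docs):
    # Classic TF-IDF idf; gives ~1.81093 for docFreq=3, maxDocs=9.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def clause_score(term_freq, field_norm, boost, idf_value):
    # weight(...) = queryWeight * fieldWeight, as in the explain tree.
    query_weight = boost * idf_value * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf_value * field_norm
    return query_weight * field_weight


def document_score(title_tf, title_norm, desc_tf, desc_norm):
    idf_value = idf(doc_freq=3, max_docs=9)
    clauses = [
        clause_score(title_tf, title_norm, 1.0, idf_value),  # mditem5_tns
        clause_score(desc_tf, desc_norm, 0.8, idf_value),    # mditem7_tns^0.8
    ]
    best = max(clauses)
    dismax = best + TIE_BREAKER * (sum(clauses) - best)
    # FunctionQuery line: 1000/(1*1+1000) * boost(1.0) * queryNorm
    function_query = 1000.0 / (1.0 * 1 + 1000.0) * 1.0 * QUERY_NORM
    return dismax + function_query


# Inputs read off the explain output (termFreq and fieldNorm per field).
score_b = document_score(title_tf=1, title_norm=1.0, desc_tf=4, desc_norm=0.375)
score_a = document_score(title_tf=2, title_norm=0.625, desc_tf=6, desc_norm=0.375)

# Counterfactual: ObjectA with the same title fieldNorm as ObjectB.
score_a_equal_norm = document_score(title_tf=2, title_norm=1.0, desc_tf=6, desc_norm=0.375)

print(f"ObjectB: {score_b:.7f} (explain reports 1.3327172)")
print(f"ObjectA: {score_a:.7f} (explain reports 1.2324848)")
print(f"ObjectA with title fieldNorm 1.0: {score_a_equal_norm:.7f}")
```

Under these assumptions, giving ObjectA the same title fieldNorm as ObjectB is enough to rank it first, which suggests the title fieldNorm is the only factor reversing the expected order.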

Best answer

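The problem is caused by the stemmer. It expands "apples are apples" to "apples appl are apples appl", which makes the field longer. Because document B contains only one term that the stemmer expands, its field is shorter than document A's.

Lucene's length normalization gives shorter fields a higher norm, so this results in different fieldNorms.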
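
The effect of field length on the norm can be sketched as follows, assuming the classic Lucene length norm of 1/sqrt(number of tokens) with a field boost of 1. The token lists are hypothetical and simply follow the answer's description of the stemmer output. The real analyzer chain (stop words, for instance) is not shown in the question, and Lucene stores the norm in a single byte with reduced precision, so these unrounded values are not expected to match the 0.625 and 1.0 in the explain output.

```python
import math


def length_norm(num_tokens, field_boost=1.0):
    # Classic length normalization: longer fields get a smaller norm.
    return field_boost / math.sqrt(num_tokens)


# Hypothetical token streams after the stemmer keeps both the original
# and the stemmed form of "apples", as the answer describes.
title_a_tokens = ["apples", "appl", "are", "apples", "appl"]
title_b_tokens = ["apples", "appl", "are", "great"]

norm_a = length_norm(len(title_a_tokens))
norm_b = length_norm(len(title_b_tokens))

print(f"ObjectA title: {len(title_a_tokens)} tokens, norm {norm_a:.4f}")
print(f"ObjectB title: {len(title_b_tokens)} tokens, norm {norm_b:.4f}")
```

Each extra occurrence of the query term adds two tokens instead of one, so under this assumption the document that matches more often is also penalized more by length normalization.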

This post, "Solr: fieldNorm differs per document, with no document boost", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/3102895/
