
Solr - No extra score for query words repeated in a document

Reposted. Author: 行者123. Updated: 2023-12-04 14:27:52

I want a matching term to be scored only once per document, no matter how many times it occurs.

Ex - Search Query - Parle G Biscuits

Document 1 - Parle G Biscuits
Document 2 - Parle G Biscuits. I can eat 10 packets of Parle G Biscuits anytime.
Document 3 - Parle G Biscuits V2

I want to rank documents as Doc 1 > Doc 3 > Doc 2
Default answer from Solr - Doc 2 > Doc 1 > Doc 3

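This happens because the query string is found twice in the longer document. If I could somehow stop the second occurrence from being scored, I would get the ranking I want, because Documents 2 and 3 would then be penalized slightly for their greater length.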

How can I modify Solr to work this way?

Thanks!

Best answer

If you don't need term positions (for example, if you never search with phrases such as foo:"word1 word2"), you can set the field to drop any term frequency information, payloads and positions: omitTermFreqAndPositions="true"

If true, omits term frequency, positions, and payloads from postings for this field. This can be a performance boost for fields that don't require that information. It also reduces the storage space required for the index. Queries that rely on position that are issued on a field with this option will silently fail to find documents. This property defaults to true for all field types that are not text fields.
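As a rough illustration of applying that setting, the sketch below builds a Schema API "replace-field" command in Python. It assumes a managed schema, a local Solr at http://localhost:8983, a collection named "products", and a field "product_name" of type "text_general"; all of these names are hypothetical and not taken from the question. Because the property changes what is written to the index, existing documents would need to be reindexed before the change has any effect.

```python
import json
import urllib.request

# Hypothetical endpoint: host, port and collection name are assumptions.
SCHEMA_URL = "http://localhost:8983/solr/products/schema"


def replace_field_command(field_name, field_type):
    """Build a Schema API 'replace-field' command that drops term
    frequencies, positions and payloads for the given field."""
    return {
        "replace-field": {
            "name": field_name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            "omitTermFreqAndPositions": True,
        }
    }


def post_schema_command(command, url=SCHEMA_URL):
    """Send the command to Solr. Not called here; requires a running
    Solr instance with a managed schema."""
    request = urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Field name and type are placeholders for the field being searched.
command = replace_field_command("product_name", "text_general")
payload = json.dumps(command, indent=2)
print(payload)
```

With a classic schema.xml instead of a managed schema, the same attribute would be set directly on the field definition rather than through the Schema API.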

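Since there is no separate setting that drops only term frequency, you will have to implement a custom similarity if you need either of the other two features that this setting disables.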
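To see what such a similarity would change, here is a small Python model of the arithmetic. It is only a sketch: it uses a simplified BM25-style term weight rather than Solr's actual scoring code, leaves out idf (every query term occurs in all three documents, so idf would not change the order), and assumes an average field length of 50 tokens across the index, a value chosen purely for illustration. In Lucene itself the real change would be made in a Java Similarity subclass, which this sketch does not attempt to reproduce.

```python
# Simplified BM25-style model; NOT Solr's exact implementation.
K1 = 1.2          # common BM25 default
B = 0.75          # common BM25 default
AVG_LEN = 50.0    # assumed average field length in the index (hypothetical)

QUERY = "Parle G Biscuits"
DOCS = {
    "Doc 1": "Parle G Biscuits",
    "Doc 2": "Parle G Biscuits. I can eat 10 packets of Parle G Biscuits anytime.",
    "Doc 3": "Parle G Biscuits V2",
}


def tokenize(text):
    """Lowercase, split on whitespace, strip trailing punctuation."""
    tokens = (t.strip(".,").lower() for t in text.split())
    return [t for t in tokens if t]


def score(query, doc, count_repeats=True):
    """Sum a BM25-style weight over the distinct query terms.

    With count_repeats=False every matching term is treated as if it
    occurred exactly once, which is what a frequency-ignoring
    similarity would do. Length normalization is kept in both modes.
    """
    doc_terms = tokenize(doc)
    length_penalty = K1 * (1 - B + B * len(doc_terms) / AVG_LEN)
    total = 0.0
    for term in set(tokenize(query)):
        freq = doc_terms.count(term)
        if freq == 0:
            continue
        if not count_repeats:
            freq = 1
        total += freq * (K1 + 1) / (freq + length_penalty)
    return total


def rank(query, docs, count_repeats=True):
    """Return document names ordered from highest to lowest score."""
    return sorted(
        docs,
        key=lambda name: score(query, docs[name], count_repeats),
        reverse=True,
    )


print(rank(QUERY, DOCS))                       # ['Doc 2', 'Doc 1', 'Doc 3']
print(rank(QUERY, DOCS, count_repeats=False))  # ['Doc 1', 'Doc 3', 'Doc 2']
```

Under these assumptions, counting repeats puts Doc 2 first, while ignoring them leaves only the length penalty to separate the documents and gives Doc 1 > Doc 3 > Doc 2. The result depends on the assumed average length: with a very short average, the length penalty alone can already push Doc 2 to the bottom.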

This post, "Solr - No extra score for query words repeated in a document", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/41724377/
