elasticsearch - Why do I get a combined score of different fields instead of the score of the best matching field

Reposted. Author: 行者123. Updated: 2023-12-03 02:15:57

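I am trying to find the best-matching field across documents with multiple fields by using multi_match with best_fields together with a custom analyzer (shingles and/or ngrams applied to each field). I expect to get the score of the best-matching field, but instead I get a combined score from different fields of the best-matching document. Why does this happen? Is there another way to do this? Can someone tell me how to get the score of the best-matching field?
Here is an example of what I am doing: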

PUT /my-index1
{
  "mappings": {
    "_doc": {
      "properties": {
        "name1": {
          "type": "text",
          "fields": {
            "name_shingles": {
              "type": "text",
              "analyzer": "shingle_analyz",
              "search_analyzer": "shingle_analyz"
            }
          }
        },
        "name2": {
          "type": "text",
          "fields": {
            "name_shingles": {
              "type": "text",
              "analyzer": "shingle_analyz",
              "search_analyzer": "shingle_analyz"
            }
          }
        }
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": "1",
      "analysis": {
        "filter": {
          "shingle_filter": {
            "max_shingle_size": "3",
            "min_shingle_size": "2",
            "output_unigrams": false,
            "type": "shingle"
          },
          "duplicate_token_filter": {
            "type": "unique",
            "only_on_same_position": false
          }
        },
        "analyzer": {
          "shingle_analyz": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "asciifolding",
              "shingle_filter",
              "duplicate_token_filter"
            ],
            "char_filter": [
              "my_char_filter"
            ]
          }
        },
        "char_filter": {
          "my_char_filter": {
            "type": "mapping",
            "mappings": [
              "'s => s",
              "'S => S"
            ]
          }
        }
      },
      "number_of_replicas": "1"
    }
  }
}
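
To make it easier to see which terms the queries are actually scored on, here is a rough plain-Python approximation of the shingle_analyz chain defined above. It is only a sketch: the function name shingle_analyze is hypothetical, the regular expression is a crude stand-in for the standard tokenizer, and the asciifolding step is left out. It is not how Lucene implements these filters.

```python
import re

def shingle_analyze(text, min_size=2, max_size=3):
    """Approximate the shingle_analyz chain in plain Python (not Lucene)."""
    # char_filter "my_char_filter": "'s => s" and "'S => S"
    text = text.replace("'s", "s").replace("'S", "S")
    # Rough stand-in for the standard tokenizer plus the lowercase filter.
    # The asciifolding filter is intentionally omitted in this sketch.
    tokens = re.findall(r"\w+", text.lower())
    seen, shingles = set(), []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break
            shingle = " ".join(tokens[start:start + size])
            # "unique" filter with only_on_same_position=false:
            # drop every repeated token, wherever it occurs.
            if shingle not in seen:
                seen.add(shingle)
                shingles.append(shingle)
    return shingles

for value in ["WA WA WA WOMAN", "WA WA WOMAN", "wa wa", "help"]:
    print(value, "->", shingle_analyze(value))
```

Under these assumptions a single-word value such as "help" produces no terms at all, because output_unigrams is false, and the query text "WA WA WOMAN" becomes several shingle terms rather than one.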


PUT /my-index1/_doc/1
{
  "name1": "WA WA WA WOMAN",
  "name2": "wa wa"
}

PUT /my-index1/_doc/2
{
  "name1": "WA WA WOMAN",
  "name2": "help"
}

PUT /my-index1/_doc/3
{
  "name1": "the great showman",
  "name2": "WA WA WOMAN"
}


GET my-index1/_search
{
  "explain": true,
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "fields": [
              "name1.name_shingles",
              "name2.name_shingles"
            ],
            "query": "WA WA WOMAN",
            "type": "best_fields",
            "operator": "or"
          }
        }
      ]
    }
  }
}
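
For reference, the behaviour expected from best_fields can be written down as a small formula. The multi_match documentation describes best_fields as one match query per field wrapped in a dis_max query, whose score is the best clause's score plus tie_breaker times the scores of the other matching clauses (tie_breaker defaults to 0.0). The sketch below only illustrates that arithmetic; the function name and the per-field scores are invented for the example and do not come from the index above.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores as dis_max does: best + tie_breaker * others."""
    matching = [score for score in field_scores.values() if score > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)

# Hypothetical per-field scores, invented for illustration only.
scores = {"name1.name_shingles": 2.0, "name2.name_shingles": 0.5}

print(best_fields_score(scores))                   # only the best field counts
print(best_fields_score(scores, tie_breaker=1.0))  # equivalent to a plain sum
```

If the explain output shows contributions from both fields being added together even though no tie_breaker is set, that would suggest the clauses are not being combined in this per-field dis_max shape for the analyzed query.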

Best answer

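After some research, I came to the conclusion that match and multi_match queries produce unreliable scores when shingles and multiple fields are involved. I tried many different configurations, and the scores of the different fields always ended up mixed together.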
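
Since the search request in the question already sets "explain": true, one possible way to recover a per-field figure on the client side is to walk each hit's _explanation tree and group the term weights by field. The following is a minimal sketch, not a confirmed fix: per_field_scores is a hypothetical helper, it assumes leaf descriptions of the usual Lucene form "weight(field:term in doc) ...", which may vary between versions, and the sample tree is hand-written for illustration rather than real output.

```python
import re
from collections import defaultdict

# Matches the first "field:" prefix inside a description string.
FIELD_RE = re.compile(r"([\w.]+):")

def per_field_scores(explanation):
    """Sum the 'weight(...)' nodes of an explain tree, grouped by field name."""
    totals = defaultdict(float)

    def walk(node):
        description = node.get("description", "")
        if description.startswith("weight("):
            match = FIELD_RE.search(description)
            if match:
                totals[match.group(1)] += node["value"]
            return  # do not descend into the term's own scoring formula
        for child in node.get("details", []):
            walk(child)

    walk(explanation)
    return dict(totals)

# Hand-written explanation tree, for illustration only (not real output).
sample = {
    "value": 3.0,
    "description": "sum of:",
    "details": [
        {"value": 2.0, "details": [],
         "description": "weight(name1.name_shingles:wa wa in 0) [PerFieldSimilarity], result of:"},
        {"value": 1.0, "details": [],
         "description": "weight(name2.name_shingles:wa wa in 0) [PerFieldSimilarity], result of:"},
    ],
}

scores = per_field_scores(sample)
best_field = max(scores, key=scores.get)
print(best_field, scores[best_field])
```

Taking the maximum of the grouped totals gives a best-field score that does not depend on how the server combined the clauses, at the cost of requesting and parsing explanations for every hit.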

Regarding "elasticsearch - Why do I get a combined score of different fields instead of the score of the best matching field", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63544424/
