
elasticsearch - Wrong scores in Elasticsearch results

Reposted. Author: 行者123. Updated: 2023-12-02 22:21:14

I am not getting the correct scores for the results of my Elasticsearch query.
ES query:

{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "(emergency) OR (emergency*) OR (*emergency) OR (*emergency*)",
            "fields": [
              "MDMGlobalData.Name1"
            ]
          }
        }
      ]
    }
  }
}
ES result:
{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 798,
      "relation": "eq"
    },
    "max_score": 9.169065,
    "hits": [
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551037160",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "PARAGON EMERGENCY"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551040507",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "EMERGENCY MD"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551076447",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "COASTAL EMERGENCY"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551100746",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "EMERGENCY MD"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551090880",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "PAFFORD EMERGENCY"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551106787",
        "_score": 9.169065,
        "_source": {
          "MDMGlobalData": {
            "Name1": "CAPROCK EMERGENCY"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551021568",
        "_score": 9.121077,
        "_source": {
          "MDMGlobalData": {
            "Name1": "WILTON EMERGENCY"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551124137",
        "_score": 9.121077,
        "_source": {
          "MDMGlobalData": {
            "Name1": "EMERGENCY ONE"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551125549",
        "_score": 9.121077,
        "_source": {
          "MDMGlobalData": {
            "Name1": "EMERGENCY ONE"
          }
        }
      },
      {
        "_index": "customermasterdata",
        "_type": "_doc",
        "_id": "MDMCM551133066",
        "_score": 9.121077,
        "_source": {
          "MDMGlobalData": {
            "Name1": "EMERGENCY MD"
          }
        }
      }
    ]
  }
}
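Ideally, the first group of results should be the documents whose Name1 is exactly "emergency" or starts with the word "emergency".
How can nearly the first five results all get the same score when their Name1 values are different?
Because of the wrong scoring, the results are a mess.
How can I correct the scores in the results?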

Best answer

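No, it doesn't have to be that way, because ES follows the Lucene scoring function.
Why the scores are identical:

  • Each document contains only two terms: emergency and one more word.
  • The word Emergency matches as-is, and the field length is the same.
  • It occurs once in each document, i.e. the term frequencies are the same.
  • The relevance weight of the term (idf) is the same for every document.
  • Coord (where the similarity still uses it) is the same, because each of your documents contains Emergency only once.

But if you had a document containing Emergency X Y Z, its score would be lower than those of your other documents. The term frequency is unchanged, but the field is longer, and field-length normalization lowers the score.
If a document contained only Emergency, it would score higher than all the others.
Identical scores are perfectly normal in your scenario, since there is no way to tell which emergency the user meant.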
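For reference, the factors listed above can be written out as a formula. This is only a sketch, assuming the index uses BM25 (the default similarity since Elasticsearch 5.0); the symbols are the usual BM25 notation rather than names taken from the question.

```latex
\[
\mathrm{score}(t, d) = \mathrm{idf}(t) \cdot
  \frac{f_{t,d}\,(k_1 + 1)}
       {f_{t,d} + k_1 \left(1 - b + b \cdot \frac{|d|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{idf}(t) = \ln\!\left(1 + \frac{N - n_t + 0.5}{n_t + 0.5}\right)
\]
```

Here f_{t,d} is how often term t occurs in the field of document d, |d| is the field length in terms, avgdl is the average field length, N is the number of documents that have the field, and n_t is the number of documents containing t. The defaults are k_1 = 1.2 and b = 0.75. Two documents with the same f_{t,d} and the same |d| necessarily get the same score, and a longer field with the same f_{t,d} gets a lower one.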
Update:
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "MDMGlobalData.Name1": "emergency"
        }
      }
    }
  }
}
With sample data, the output is:
"hits": [
  {
    "_index": "emerge",
    "_type": "_doc",
    "_id": "iN1hKnMBojxRtp6HNI7d",
    "_score": 0.10938574,
    "_source": {
      "MDMGlobalData": {
        "Name1": "EMERGENCY"
      }
    }
  },
  {
    "_index": "emerge",
    "_type": "_doc",
    "_id": "g91TKnMBojxRtp6Hto4q",
    "_score": 0.08701137,
    "_source": {
      "MDMGlobalData": {
        "Name1": "PARAGON EMERGENCY"
      }
    }
  },
  {
    "_index": "emerge",
    "_type": "_doc",
    "_id": "hN1TKnMBojxRtp6H2I6A",
    "_score": 0.08701137,
    "_source": {
      "MDMGlobalData": {
        "Name1": "EMERGENCY MD"
      }
    }
  },
  {
    "_index": "emerge",
    "_type": "_doc",
    "_id": "hd1TKnMBojxRtp6H_I6_",
    "_score": 0.08701137,
    "_source": {
      "MDMGlobalData": {
        "Name1": "COASTAL EMERGENCY"
      }
    }
  },
  {
    "_index": "emerge",
    "_type": "_doc",
    "_id": "h91VKnMBojxRtp6HYI4e",
    "_score": 0.07223585,
    "_source": {
      "MDMGlobalData": {
        "Name1": "EMERGENCY MD X"
      }
    }
  }
]
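The sample scores above can be reproduced by hand, which makes the effect of field length concrete. The sketch below assumes that the sample index holds exactly these five documents on a single shard, that Name1 is split into one term per word, and that the default BM25 parameters (k1 = 1.2, b = 0.75) apply. None of this is stated in the answer, and the function name bm25_score is made up for illustration.

```python
import math

K1 = 1.2   # default BM25 term-frequency saturation (assumed)
B = 0.75   # default BM25 length-normalization strength (assumed)

def bm25_score(freq, doc_len, avg_len, doc_count, docs_with_term):
    """BM25 score of a single query term for one document."""
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    length_norm = 1 - B + B * doc_len / avg_len
    return idf * freq * (K1 + 1) / (freq + K1 * length_norm)

# Assumed contents of the sample index, taken from the hits above.
names = ["EMERGENCY", "PARAGON EMERGENCY", "EMERGENCY MD",
         "COASTAL EMERGENCY", "EMERGENCY MD X"]
lengths = [len(name.split()) for name in names]   # field length in terms
avg_len = sum(lengths) / len(lengths)

scores = {}
for name, length in zip(names, lengths):
    # "emergency" occurs once in every document, so freq is always 1
    # and every document contains the term.
    scores[name] = bm25_score(1, length, avg_len, len(names), len(names))
    print(f"{name:20s} {scores[name]:.6f}")
```

Under these assumptions the computed values agree with the _score fields above to about six decimal places: roughly 0.109386 for the one-word name, 0.087011 for the two-word names and 0.072236 for the three-word name. The term frequency and idf are identical in all five documents, so the field length alone decides the order.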

Regarding elasticsearch - Wrong scores in Elasticsearch results, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62778671/
