elasticsearch - How do I boost results in Elasticsearch so that matches in field1 always rank above matches in field2?

Reposted. Author: 行者123. Updated: 2023-12-03 00:29:45

I have a mapping, 3 fields, and 9 documents:

#! /bin/bash

#DELETE
curl -XDELETE 'http://localhost:9200/test'
echo
# CREATE
curl -XPUT 'http://localhost:9200/test?pretty=1' -d '{
"settings": {
"analysis" : {
"analyzer" : {
"my_analyz_1" : {
"filter" : [
"standard",
"lowercase",
"asciifolding"
],
"type" : "custom",
"tokenizer" : "standard"
}
}
}
}
}'
echo
# DEFINE
curl -XPUT 'http://localhost:9200/test/posts/_mapping?pretty=1' -d '{
"posts" : {
"properties" : {
"section" : {
"type" : "string",
"analyzer" : "my_analyz_1"
},
"category" : {
"type" : "string",
"analyzer" : "my_analyz_1"
},
"title" : {
"type" : "string",
"analyzer" : "my_analyz_1"
}
}
}
}'
echo
# INSERT
curl localhost:9200/test/posts/1 -d '{"section": "Bicycle", "category": "Small", "title": "Diamondback Grind-16"}'
curl localhost:9200/test/posts/2 -d '{"section": "Bicycle", "category": "Big", "title": "Diamondback JrViper"}'
curl localhost:9200/test/posts/3 -d '{"section": "Bicycle", "category": "Small", "title": "2-Hip Cyclone small"}'
curl localhost:9200/test/posts/4 -d '{"section": "Bicycle", "category": "Big", "title": "2-Hip Bizzle"}'
curl localhost:9200/test/posts/5 -d '{"section": "Small", "category": "Small", "title": "Toyota"}'
curl localhost:9200/test/posts/6 -d '{"section": "Car", "category": "Big", "title": "Subaru Impreza small"}'
curl localhost:9200/test/posts/7 -d '{"section": "Small", "category": "Big", "title": "Toyota Corona MARK II"}'
curl localhost:9200/test/posts/8 -d '{"section": "Car", "category": "Small", "title": "Hyundai Elantra"}'
curl localhost:9200/test/posts/9 -d '{"section": "Car", "category": "Big", "title": "Ford Maverick small"}'
echo
# REFRESH
curl -XPOST localhost:9200/test/_refresh
echo

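I want to search for the word "small", but I always want the results in the following order:
  • results with "small" in section
  • results with "small" in category
  • results with "small" in title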

So I search with this query:
    curl "localhost:9200/test/posts/_search?pretty=1" -d '{
    "query": {
    "bool": {
    "must": [
    {
    "multi_match": {
    "query": "small",
    "fields": ["section^3", "category^2", "title"]
    }
    }
    ]
    }
    }
    }'

The results are:
    {"_id": 7} {section: "Small",   category: "Big",   title: "Toyota Corona MARK II"}
    {"_id": 1} {section: "Bicycle", category: "Small", title: "Diamondback Grind-16"}
    {"_id": 5} {section: "Small", category: "Small", title: "Toyota"}
    {"_id": 3} {section: "Bicycle", category: "Small", title: "2-Hip Cyclone small"}
    {"_id": 8} {section: "Car", category: "Small", title: "Hyundai Elantra"}
    {"_id": 9} {section: "Car", category: "Big", title: "Ford Maverick small"}
    {"_id": 6} {section: "Car", category: "Big", title: "Subaru Impreza small"}

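This is not what I want. 5 should be second, because its match is in section. 3 should come after 7 and 5, because its matches are in category and title.

So my question is: how do I get results where a match in section always outranks a match in category, and a match in category always outranks a match in title?

Thanks in advance!

Edit:

The problem was solved by using the search type "dfs_query_then_fetch", which computes the TF-IDF values across all shards. For more information, see http://www.elasticsearch.org/guide/reference/api/search/search-type/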
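A minimal sketch of what the edit describes, assuming the same index, type, and query as in the question. The search type is passed as a URL parameter and the request body stays unchanged. The names `DFS_URL`, `DFS_QUERY`, and `search_dfs` are illustrative and do not come from the original post.

```shell
#!/bin/bash
# Sketch only: DFS_URL, DFS_QUERY and search_dfs are hypothetical names.

# search_type is a URL parameter; dfs_query_then_fetch gathers term
# statistics from all shards before scoring.
DFS_URL='localhost:9200/test/posts/_search?search_type=dfs_query_then_fetch&pretty=1'

# Same boosted multi_match query as in the question.
DFS_QUERY='{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "small",
            "fields": ["section^3", "category^2", "title"]
          }
        }
      ]
    }
  }
}'

# Sends the search request; requires the "test" index from the setup script.
search_dfs() {
  curl -s "$DFS_URL" -d "$DFS_QUERY"
}
```

Call `search_dfs` once the setup script has created and refreshed the index. With only 9 documents spread over several shards, per-shard term statistics can differ noticeably, which is the likely reason the global statistics change the ordering here.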

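Best answer

Have you tried setting use_dis_max to false?

This should mean that documents with "small" in both category and title score higher than documents with "small" only in category.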
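The suggestion can be sketched as follows, assuming an Elasticsearch release contemporary with the question, where `multi_match` still accepts `use_dis_max`. With the default (`true`) the per-field queries are combined in a dis_max query, which scores a document by its best-matching field; with `false` they are combined in a bool query, so the scores of all matching fields contribute. The names `NO_DIS_MAX_QUERY` and `search_without_dis_max` are illustrative.

```shell
#!/bin/bash
# Sketch only: NO_DIS_MAX_QUERY and search_without_dis_max are hypothetical names.

# The question's multi_match query with use_dis_max switched off, so that
# matches in several fields add up instead of only the best field counting.
NO_DIS_MAX_QUERY='{
  "query": {
    "multi_match": {
      "query": "small",
      "fields": ["section^3", "category^2", "title"],
      "use_dis_max": false
    }
  }
}'

# Sends the search request; requires the "test" index from the setup script.
search_without_dis_max() {
  curl -s 'localhost:9200/test/posts/_search?pretty=1' -d "$NO_DIS_MAX_QUERY"
}
```

Note that this only makes multi-field matches score higher; it does not by itself guarantee that a match in section always beats any combination of matches in category and title.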

As for the odd behavior you are seeing between the second and third results, I'm a bit lost... You could run the query and ask for an explanation of how the scores were calculated.
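One way to request that explanation, as a sketch: add `"explain": true` at the top level of the search body, and each hit should then carry an `_explanation` section that breaks its score down. The names `EXPLAIN_QUERY` and `search_with_explain` are illustrative, and the query is the one from the question.

```shell
#!/bin/bash
# Sketch only: EXPLAIN_QUERY and search_with_explain are hypothetical names.

# "explain": true sits next to "query" in the request body and asks
# Elasticsearch to include a score breakdown for every hit.
EXPLAIN_QUERY='{
  "explain": true,
  "query": {
    "multi_match": {
      "query": "small",
      "fields": ["section^3", "category^2", "title"]
    }
  }
}'

# Sends the search request; requires the "test" index from the setup script.
search_with_explain() {
  curl -s 'localhost:9200/test/posts/_search?pretty=1' -d "$EXPLAIN_QUERY"
}
```

Comparing the explanations for documents 1 and 5 should show which factors (term frequency, inverse document frequency, field norms, boosts) put one ahead of the other.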

A similar question about "elasticsearch - How do I boost results in Elasticsearch so that matches in field1 always rank above matches in field2?" can be found on Stack Overflow: https://stackoverflow.com/questions/17030561/
