elasticsearch - Giving weight to a field in Elasticsearch

Repost. Author: 行者123. Updated: 2023-12-02 23:07:30

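I have an Elasticsearch index containing products, and I am trying to build a product search over a text field.
A short sample of the dataset:
{"name": "foo", "count": 10}
{"name": "bar", "count": 5}
{"name": "foo bar"}
{"name": "foo baz", "count": 20}
At first, I was using this request:

GET /product/_search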
{
"query": {
"match": {"name": "foo"}
}
}
This works well, but now I want to give extra weight to some products (using the count field).
I am using this query:
GET /product/_search
{
"query": {
"function_score": {
"query": {
"match": {"name": "foo bar"}
},
"field_value_factor": {
"field": "count",
"missing": 0
}
}
}
}
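But with this query I get foo first, then bar, then foo bar. It seems that the name match matters less than count. I would like to have foo bar, then foo and bar. But when looking for foo, I would like foo baz, foo, and foo bar.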

Best answer

But looking for foo I would like foo baz, foo and foo bar


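Adding a working example with index data, search query, and search result.
Refer to the function score query documentation for a detailed explanation.
Index data: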
{"name": "foo", "count": 10} 
{"name": "bar", "count": 5}
{"name": "foo bar"}
{"name": "foo baz", "count": 20}
Search query:


{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "foo"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "count",
"factor": 1.0,
"missing": 0
}
}
],
"boost_mode": "multiply"
}
}
}
Search result:
"hits": [
{
"_index": "stof_64169215",
"_type": "_doc",
"_id": "4",
"_score": 6.2774796,
"_source": {
"name": "foo baz",
"count": 20
}
},
{
"_index": "stof_64169215",
"_type": "_doc",
"_id": "1",
"_score": 4.1299205,
"_source": {
"name": "foo",
"count": 10
}
},
{
"_index": "stof_64169215",
"_type": "_doc",
"_id": "3",
"_score": 0.0,
"_source": {
"name": "foo bar"
}
}
]
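The scores above are consistent with "boost_mode": "multiply": each document's text-relevance score is multiplied by its count value, so a document without count (where missing is 0) ends up with a score of 0. Here is a minimal Python sketch of that arithmetic. The function name multiply_boost is hypothetical, and the per-document query scores are assumptions back-computed from the results above (final score divided by count), not values reported by Elasticsearch.

```python
def multiply_boost(query_score, count=None, factor=1.0, missing=0.0):
    """Combine a query score with a field value the way boost_mode "multiply" does."""
    value = missing if count is None else count
    return query_score * (factor * value)


# (name, assumed query score, count); the query scores are back-computed assumptions.
docs = [
    ("foo baz", 0.31387398, 20),
    ("foo", 0.41299205, 10),
    ("foo bar", 0.31387398, None),  # no count field, so missing=0 applies
]

ranked = sorted(
    ((name, multiply_boost(score, count)) for name, score, count in docs),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.4f}")
```

With missing set to 0, any document lacking count is pushed to the bottom no matter how well its name matches, which is why foo bar scores 0.0 here.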
Update 1:

I would like to have foo bar then foo and bar


Search query:
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "foo bar"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "count",
"factor": 1.0,
"missing": 9,
"modifier": "sqrt"
}
}
],
"boost_mode": "sum"
}
}
}
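Explain API result:
To understand the search query above, you need to understand how the score of a query is computed.
  • The search is for "name": "foo bar", which ideally should return foo bar, foo, and bar. With a plain match query for foo bar (and no function score query), you would get that result.
  • Now, for your use case, you want to add weight based on the count field. For that you used the function score query, which lets you modify the score of the documents retrieved by a query.
  • In addition, several functions can be combined. The function_score query provides several types of score functions. The field_value_factor function lets you use a field from a document to influence the score.
  • field_value_factor has several options: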

    • factor - Optional factor to multiply the field value with, defaults to 1
    • modifier - Modifier to apply to the field value
    • missing - Value used if the document doesn’t have that field.

    This produces the following scoring formula:

    sqrt(1.0 * doc['count'].value)
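The formula above can be sketched in Python to show how factor, modifier, and missing interact. This is an illustration, not Elasticsearch's implementation: the function name and the MODIFIERS table are hypothetical, and only a subset of the modifiers Elasticsearch supports is modelled.

```python
import math

# Subset of modifiers, for illustration only.
MODIFIERS = {
    "none": lambda v: v,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
    "reciprocal": lambda v: 1.0 / v,
}


def field_value_factor(doc, field="count", factor=1.0, modifier="none", missing=None):
    """Return modifier(factor * value), using `missing` when the field is absent."""
    value = doc.get(field, missing)
    if value is None:
        raise ValueError(f"document has no '{field}' field and no missing value was given")
    return MODIFIERS[modifier](factor * value)


print(field_value_factor({"name": "foo bar"}, modifier="sqrt", missing=9))
print(field_value_factor({"name": "foo", "count": 10}, modifier="sqrt", missing=9))
```

The missing value goes through the same factor and modifier as a real field value, so missing = 9 with sqrt contributes 3.0.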


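    Now, the document containing foo bar has no count field, so the missing value is used (defined in the query, i.e. 9). Its function score is sqrt(1.0 * 9) = 3.0.
    If you set missing to any value less than 9, the order of the results changes, because the score contributed by the count field is different. (With missing set to 0, foo bar gets its score from the match query alone, and field_value_factor adds nothing.) The final score is computed as match query score + field_value_factor (on the count field), so the total score of foo bar would then be lower than that of the other documents.
    For example, for foo bar the final score is computed as 0.78038335 + 3.0 = 3.7803833. Go through the result below for a detailed view of how the score is computed.
    See this blog to understand how scoring works in Elasticsearch.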
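To make the arithmetic concrete, the following Python sketch recomputes the three final scores from the BM25 terms listed in the explain output (boost 2.2, k1 1.2, b 0.75, N 3, avgdl 1.3333334) and adds the sqrt function score, as "boost_mode": "sum" does. It is a rough check under those stated values, not Elasticsearch's code; the helper names are hypothetical, and Python's double precision differs slightly from the float values Elasticsearch prints.

```python
import math

# Constants as reported in the explain output.
BOOST, K1, B = 2.2, 1.2, 0.75
N, AVGDL = 3, 4 / 3


def bm25_term(n, dl, freq=1.0):
    """Score of one matching term: boost * idf * tf."""
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    tf = freq / (freq + K1 * (1 - B + B * dl / AVGDL))
    return BOOST * idf * tf


def total_score(match_score, count, missing):
    """boost_mode "sum": match score plus sqrt(1.0 * count), using missing if absent."""
    value = missing if count is None else count
    return match_score + math.sqrt(1.0 * value)


foo_bar_match = 2 * bm25_term(n=2, dl=2)  # both "foo" and "bar" match, field length 2
single_match = bm25_term(n=2, dl=1)       # one matching term, field length 1

scores = {
    "foo bar": total_score(foo_bar_match, None, missing=9),
    "foo": total_score(single_match, 10, missing=9),
    "bar": total_score(single_match, 5, missing=9),
}

# Value of missing at which foo bar would tie with foo.
threshold = (scores["foo"] - foo_bar_match) ** 2

for name, score in scores.items():
    print(f"{name}: {score:.6f}")
print(f"missing must exceed about {threshold:.2f} for foo bar to rank first")
```

Under these numbers the tie point is a little below 9, which is why missing = 9 keeps foo bar on top while smaller whole-number values let foo overtake it.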
    {
    "took": 4,
    "timed_out": false,
    "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
    },
    "hits": {
    "total": {
    "value": 3,
    "relation": "eq"
    },
    "max_score": 3.7803833,
    "hits": [
    {
    "_shard": "[stof_64169215][0]",
    "_node": "fVeabsK0Q1GnCZ_8oROXjA",
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "3",
    "_score": 3.7803833,
    "_source": {
    "name": "foo bar"
    },
    "_explanation": {
    "value": 3.7803833,
    "description": "sum of",
    "details": [
    {
    "value": 0.78038335,
    "description": "sum of:",
    "details": [
    {
    "value": 0.39019167,
    "description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
    "details": [
    {
    "value": 0.39019167,
    "description": "score(freq=1.0), computed as boost * idf * tf from:",
    "details": [
    {
    "value": 2.2,
    "description": "boost",
    "details": []
    },
    {
    "value": 0.47000363,
    "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
    "details": [
    {
    "value": 2,
    "description": "n, number of documents containing term",
    "details": []
    },
    {
    "value": 3,
    "description": "N, total number of documents with field",
    "details": []
    }
    ]
    },
    {
    "value": 0.37735844,
    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
    "details": [
    {
    "value": 1.0,
    "description": "freq, occurrences of term within document",
    "details": []
    },
    {
    "value": 1.2,
    "description": "k1, term saturation parameter",
    "details": []
    },
    {
    "value": 0.75,
    "description": "b, length normalization parameter",
    "details": []
    },
    {
    "value": 2.0,
    "description": "dl, length of field",
    "details": []
    },
    {
    "value": 1.3333334,
    "description": "avgdl, average length of field",
    "details": []
    }
    ]
    }
    ]
    }
    ]
    },
    {
    "value": 0.39019167,
    "description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
    "details": [
    {
    "value": 0.39019167,
    "description": "score(freq=1.0), computed as boost * idf * tf from:",
    "details": [
    {
    "value": 2.2,
    "description": "boost",
    "details": []
    },
    {
    "value": 0.47000363,
    "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
    "details": [
    {
    "value": 2,
    "description": "n, number of documents containing term",
    "details": []
    },
    {
    "value": 3,
    "description": "N, total number of documents with field",
    "details": []
    }
    ]
    },
    {
    "value": 0.37735844,
    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
    "details": [
    {
    "value": 1.0,
    "description": "freq, occurrences of term within document",
    "details": []
    },
    {
    "value": 1.2,
    "description": "k1, term saturation parameter",
    "details": []
    },
    {
    "value": 0.75,
    "description": "b, length normalization parameter",
    "details": []
    },
    {
    "value": 2.0,
    "description": "dl, length of field",
    "details": []
    },
    {
    "value": 1.3333334,
    "description": "avgdl, average length of field",
    "details": []
    }
    ]
    }
    ]
    }
    ]
    }
    ]
    },
    {
    "value": 3.0,
    "description": "min of:",
    "details": [
    {
    "value": 3.0,
    "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
    "details": []
    },
    {
    "value": 3.4028235E38,
    "description": "maxBoost",
    "details": []
    }
    ]
    }
    ]
    }
    },
    {
    "_shard": "[stof_64169215][0]",
    "_node": "fVeabsK0Q1GnCZ_8oROXjA",
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "1",
    "_score": 3.685826,
    "_source": {
    "name": "foo",
    "count": 10
    },
    "_explanation": {
    "value": 3.685826,
    "description": "sum of",
    "details": [
    {
    "value": 0.52354836,
    "description": "sum of:",
    "details": [
    {
    "value": 0.52354836,
    "description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
    "details": [
    {
    "value": 0.52354836,
    "description": "score(freq=1.0), computed as boost * idf * tf from:",
    "details": [
    {
    "value": 2.2,
    "description": "boost",
    "details": []
    },
    {
    "value": 0.47000363,
    "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
    "details": [
    {
    "value": 2,
    "description": "n, number of documents containing term",
    "details": []
    },
    {
    "value": 3,
    "description": "N, total number of documents with field",
    "details": []
    }
    ]
    },
    {
    "value": 0.50632906,
    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
    "details": [
    {
    "value": 1.0,
    "description": "freq, occurrences of term within document",
    "details": []
    },
    {
    "value": 1.2,
    "description": "k1, term saturation parameter",
    "details": []
    },
    {
    "value": 0.75,
    "description": "b, length normalization parameter",
    "details": []
    },
    {
    "value": 1.0,
    "description": "dl, length of field",
    "details": []
    },
    {
    "value": 1.3333334,
    "description": "avgdl, average length of field",
    "details": []
    }
    ]
    }
    ]
    }
    ]
    }
    ]
    },
    {
    "value": 3.1622777,
    "description": "min of:",
    "details": [
    {
    "value": 3.1622777,
    "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
    "details": []
    },
    {
    "value": 3.4028235E38,
    "description": "maxBoost",
    "details": []
    }
    ]
    }
    ]
    }
    },
    {
    "_shard": "[stof_64169215][0]",
    "_node": "fVeabsK0Q1GnCZ_8oROXjA",
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "2",
    "_score": 2.7596164,
    "_source": {
    "name": "bar",
    "count": 5
    },
    "_explanation": {
    "value": 2.7596164,
    "description": "sum of",
    "details": [
    {
    "value": 0.52354836,
    "description": "sum of:",
    "details": [
    {
    "value": 0.52354836,
    "description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
    "details": [
    {
    "value": 0.52354836,
    "description": "score(freq=1.0), computed as boost * idf * tf from:",
    "details": [
    {
    "value": 2.2,
    "description": "boost",
    "details": []
    },
    {
    "value": 0.47000363,
    "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
    "details": [
    {
    "value": 2,
    "description": "n, number of documents containing term",
    "details": []
    },
    {
    "value": 3,
    "description": "N, total number of documents with field",
    "details": []
    }
    ]
    },
    {
    "value": 0.50632906,
    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
    "details": [
    {
    "value": 1.0,
    "description": "freq, occurrences of term within document",
    "details": []
    },
    {
    "value": 1.2,
    "description": "k1, term saturation parameter",
    "details": []
    },
    {
    "value": 0.75,
    "description": "b, length normalization parameter",
    "details": []
    },
    {
    "value": 1.0,
    "description": "dl, length of field",
    "details": []
    },
    {
    "value": 1.3333334,
    "description": "avgdl, average length of field",
    "details": []
    }
    ]
    }
    ]
    }
    ]
    }
    ]
    },
    {
    "value": 2.236068,
    "description": "min of:",
    "details": [
    {
    "value": 2.236068,
    "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
    "details": []
    },
    {
    "value": 3.4028235E38,
    "description": "maxBoost",
    "details": []
    }
    ]
    }
    ]
    }
    }
    ]
    }
    }
    Search result:
    "hits": [
    {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "3",
    "_score": 3.7803833,
    "_source": {
    "name": "foo bar"
    }
    },
    {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "1",
    "_score": 3.685826,
    "_source": {
    "name": "foo",
    "count": 10
    }
    },
    {
    "_index": "stof_64169215",
    "_type": "_doc",
    "_id": "2",
    "_score": 2.7596164,
    "_source": {
    "name": "bar",
    "count": 5
    }
    }
    ]

    Regarding elasticsearch - Giving weight to a field in Elasticsearch, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64169215/
