
elasticsearch - Boosting Elasticsearch results with function_score on a nested property

Reposted. Author: 行者123. Updated: 2023-11-29 02:50:17

In Elasticsearch, given the following document structure:

"workhistory": {
  "positions": [{
    "company": "Some company",
    "position": "Some Job Title",
    "start": 1356998400,
    "end": 34546576576,
    "description": "",
    "source": [
      "some source",
      "some other source"
    ]
  },
  {
    "company": "Some other company",
    "position": "Job Title",
    "start": 1356998400,
    "end": "",
    "description": "",
    "source": [
      "some other source"
    ]
  }]
}

and this mapping for that structure:

workhistory: {
  properties: {
    positions: {
      type: "nested",
      include_in_parent: true,
      properties: {
        company: {
          type: "multi_field",
          fields: {
            company: {type: "string"},
            original: {type: "string", analyzer: "string_lowercase"}
          }
        },
        position: {
          type: "multi_field",
          fields: {
            position: {type: "string"},
            original: {type: "string", analyzer: "string_lowercase"}
          }
        }
      }
    }
  }
}

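I want to be able to search on "company" and match documents where, for example, company = "some company". I then want to get the TF-IDF _score. I also want to build a function_score query that boosts the score of this match depending on the values in the "source" array. Basically, if source contains "some source", boost the _score by some amount x. I can change the structure of the "source" property if needed.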

This is what I have so far:

{
  "bool": {
    "should": [
      {
        "filtered": {
          "query": {
            "bool": {
              "should": [
                {
                  "bool": {
                    "should": [
                      {
                        "match": {
                          "workhistory.positions.company.original": "some company"
                        }
                      }
                    ]
                  }
                }
              ],
              "minimum_should_match": "100%"
            }
          },
          "filter": {
            "and": [
              {
                "bool": {
                  "should": [
                    {
                      "term": {
                        "workhistory.positions.company.original": "some company"
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      },
      {
        "function_score": {
          "query": {
            "bool": {
              "should": [
                {
                  "bool": {
                    "should": [
                      {
                        "match": {
                          "workhistory.positions.company.original": "some company"
                        }
                      }
                    ]
                  }
                }
              ],
              "minimum_should_match": "100%"
            }
          },
          "filter": {
            "and": [
              {
                "bool": {
                  "should": [
                    {
                      "term": {
                        "workhistory.positions.company.original": "some company"
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      }
    ]
  }
}

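There are some filters in here as well, because I only want to return documents that have the filtered values. In this example the filter and the query are essentially the same, but in the larger version of this query I have other "optional" matches that boost optional values, and so on. The function_score isn't doing much right now because I can't really figure out how to use it. The goal is to be able to tune the amount of boost in my application code and pass it into the search query.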

I am using Elasticsearch version 1.3.4.

Best answer

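Honestly, I'm not sure why you repeat all those filters and queries in there. Maybe I'm missing something, but based on your description I believe all you need is a single "function_score". From the documentation: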

The function_score allows you to modify the score of documents that are retrieved by a query.

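So you define a query (for example, one that matches the company name) and then a list of functions that should boost the _score of some subset of documents. From the same documentation: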

Furthermore, several functions can be combined. In this case one can optionally choose to apply the function only if a document matches a given filter
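
To make the quoted behaviour concrete, here is a small Python model of how filtered functions might combine into a final score. It is only an illustrative sketch, not Elasticsearch code. It assumes the combination order is: sum the results of the functions whose filter matches (score_mode "sum"), cap that sum at max_boost, then add it to the query score (boost_mode "sum"). That order is my understanding of function_score and should be checked against the documentation for the version in use. The helper name model_score and the (source, bonus) pairs are hypothetical; each pair stands in for a function with a filter on "source" and a script of the form _score + bonus. The case where no filter matches is deliberately left out of the model.

```python
from typing import List, Tuple


def model_score(
    query_score: float,
    doc_sources: List[str],
    functions: List[Tuple[str, float]],
    max_boost: float,
) -> float:
    """Toy model of function_score with score_mode=sum and boost_mode=sum.

    Each (source, bonus) pair models one function: its filter matches when
    `source` is among the document's sources (exact membership here, whereas
    a real match query works on analyzed text), and its script returns
    `_score + bonus`, where `_score` is the query score.
    """
    matched = [
        query_score + bonus
        for source, bonus in functions
        if source in doc_sources
    ]
    if not matched:
        # Out of scope for this sketch: behaviour when no filter matches.
        raise ValueError("no function filter matched this document")

    function_result = sum(matched)             # score_mode: sum
    capped = min(function_result, max_boost)   # max_boost caps the function result
    return query_score + capped                # boost_mode: sum


if __name__ == "__main__":
    functions = [("some source", 2.0), ("xxx", 4.0)]
    print(model_score(1.0, ["some source"], functions, max_boost=5.0))
    print(model_score(1.0, ["some source", "xxx"], functions, max_boost=5.0))
```

Under these assumptions, a script of the form _score + 2 combined with boost_mode "sum" counts the query score twice, once inside the script and once when the function result is added back. If only a flat bonus is wanted, that interaction is worth checking on real data.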

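So you have a query that finds companies with a given name, and then, for each function, a filter that restricts which documents have their _score modified by that function. In this case the filter is that "source" should contain a certain value. The function itself is a script: _score + 2. In the end, this is what I have in mind: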

{
  "query": {
    "bool": {
      "should": [
        {
          "function_score": {
            "query": {
              "bool": {
                "should": [
                  {
                    "bool": {
                      "should": [
                        {
                          "match": {
                            "workhistory.positions.company.original": "some company"
                          }
                        }
                      ]
                    }
                  }
                ],
                "minimum_should_match": "100%"
              }
            },
            "functions": [
              {
                "filter": {
                  "nested": {
                    "path": "workhistory.positions",
                    "query": {
                      "bool": {
                        "should": [
                          {
                            "match": {
                              "workhistory.positions.source": "some source"
                            }
                          }
                        ]
                      }
                    }
                  }
                },
                "script_score": {
                  "script": "_score + 2"
                }
              },
              {
                "filter": {
                  "nested": {
                    "path": "workhistory.positions",
                    "query": {
                      "bool": {
                        "should": [
                          {
                            "match": {
                              "workhistory.positions.source": "xxx"
                            }
                          }
                        ]
                      }
                    }
                  }
                },
                "script_score": {
                  "script": "_score + 4"
                }
              }
            ],
            "max_boost": 5,
            "score_mode": "sum",
            "boost_mode": "sum"
          }
        }
      ]
    }
  }
}
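
Since the question asks for boost amounts that can be tuned from application code, the query above could be generated rather than written by hand. The following Python sketch shows one way to do that, assuming the application holds a mapping from source value to bonus. The function name build_company_query and its parameters are hypothetical. The field names, the nested path and the function_score options are taken from the query above, and the single-clause bool/should wrappers are dropped for brevity, which is assumed not to change matching.

```python
import json
from typing import Dict


def build_company_query(
    company: str,
    source_boosts: Dict[str, int],
    max_boost: int = 5,
) -> dict:
    """Build a function_score request body with one function per source value.

    `source_boosts` maps a value of workhistory.positions.source to the
    bonus added by that function's script (`_score + bonus`).
    """
    functions = [
        {
            # Apply this function only to documents that have a nested
            # position whose source matches the given value.
            "filter": {
                "nested": {
                    "path": "workhistory.positions",
                    "query": {
                        "match": {"workhistory.positions.source": source}
                    },
                }
            },
            "script_score": {"script": f"_score + {bonus}"},
        }
        for source, bonus in source_boosts.items()
    ]
    return {
        "query": {
            "function_score": {
                "query": {
                    "match": {
                        "workhistory.positions.company.original": company
                    }
                },
                "functions": functions,
                "max_boost": max_boost,
                "score_mode": "sum",
                "boost_mode": "sum",
            }
        }
    }


if __name__ == "__main__":
    body = build_company_query("some company", {"some source": 2, "xxx": 4})
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with json.dumps and sent as the search request body. Changing a boost then only means changing a number in the mapping passed to the builder.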

Regarding "elasticsearch - Boosting Elasticsearch results with function_score on a nested property", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26237063/
