elasticsearch - Giving weight to an Elasticsearch field

Reposted by: 行者123 Updated: 2023-12-02 23:07:30

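I have an Elasticsearch index containing products, and I am trying to build a product search over a text field.
Sample of the dataset:

{"name": "foo", "count": 10}
{"name": "bar", "count": 5}
{"name": "foo bar"}
{"name": "foo baz", "count": 20}

At first, I was using this request: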

GET /product/_search
{
  "query": {
    "match": {"name": "foo"}
  }
}

It works well, but now I want to give more weight to some products (using the count field).
I am using this query:

GET /product/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {"name": "foo bar"}
      },
      "field_value_factor": {
        "field": "count",
        "missing": 0
      }
    }
  }
}

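But with this query I get foo first, then bar, then foo bar. It seems that the name match matters less than count. I would like to have foo bar, then foo and bar. But looking for foo, I would like foo baz, foo and foo bar.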

Best answer

But looking for foo I would like foo baz, foo and foo bar

Adding a working example with index data, search query and search result.
Refer to the function score query documentation for a detailed explanation.
Index data:

{"name": "foo", "count": 10} 
{"name": "bar", "count": 5} 
{"name": "foo bar"} 
{"name": "foo baz", "count": 20}

Search query:

But looking for foo I would like foo baz, foo and foo bar

{
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {
                            "match": {
                                "name": {
                                    "query": "foo"
                                }
                            }
                        }
                    ]
                }
            },
            "functions": [
                {
                    "field_value_factor": {
                        "field": "count",
                        "factor": 1.0,
                        "missing": 0
                    }
                }
            ],
            "boost_mode": "multiply"
        }
    }
}

Search result:

"hits": [
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "4",
        "_score": 6.2774796,
        "_source": {
          "name": "foo baz",
          "count": 20
        }
      },
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "1",
        "_score": 4.1299205,
        "_source": {
          "name": "foo",
          "count": 10
        }
      },
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.0,
        "_source": {
          "name": "foo bar"
        }
      }
    ]
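
The 0.0 score for foo bar follows from how the two scores are combined. Here is a minimal Python sketch, assuming that with factor 1.0 and no modifier the function score is just the field value, and that "boost_mode": "multiply" multiplies the query score by it. The helper name combine is hypothetical, and the match scores are back-computed from the results above (score divided by count) rather than read from an explain output:

```python
def combine(query_score, count, missing=0.0, factor=1.0):
    """Hypothetical helper: boost_mode "multiply" with field_value_factor, no modifier."""
    field_value = missing if count is None else count
    return query_score * (factor * field_value)

# Match scores implied by the results above (final score / count).
foo_baz_match = 6.2774796 / 20
foo_match = 4.1299205 / 10

print(combine(foo_baz_match, 20))  # reproduces the foo baz score
print(combine(foo_match, 10))      # reproduces the foo score
# foo bar has no count, so missing=0 zeroes any match score:
print(combine(1.0, None))          # 0.0, whatever the match score is
```

With multiply, a missing value of 0 always pushes documents without count to the bottom, which is what the query in the question does as well.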

Update 1:

I would like to have foo bar then foo and bar

Search query:

{
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {
                            "match": {
                                "name": {
                                    "query": "foo bar"
                                }
                            }
                        }
                    ]
                }
            },
            "functions": [
                {
                    "field_value_factor": {
                        "field": "count",
                        "factor": 1.0,
                        "missing": 9,
                        "modifier": "sqrt"
                    }
                }
            ],
            "boost_mode": "sum"
        }
    }
}

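Explain API result:
To understand the search query above, you need to understand how the score of a query is computed.

The search is run for "name": "foo bar", which should ideally return foo bar, foo and bar. With a plain match query for foo bar (and no function score query), you would get that result.

Now, according to your use case, you want to add weight based on the count field. For that you used the function score query, which lets you modify the score of the documents retrieved by a query.

In addition, several functions can be combined. The function_score query provides several types of score functions. The field_value_factor function lets you use a field from the document to influence the score.

field_value_factor has several options: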

factor - Optional factor to multiply the field value with, defaults to 1.
modifier - Modifier to apply to the field value.
missing - Value used if the document doesn't have that field.

This produces the following scoring formula:

sqrt(1.0 * doc['count'].value)
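
The formula can be checked by hand with a small Python sketch. It assumes factor 1.0, the sqrt modifier and a missing value of 9, the value that appears in the explain output. The function below is an illustrative stand-in, not Elasticsearch code:

```python
import math

def field_value_factor(count, factor=1.0, missing=9.0, modifier=math.sqrt):
    """Sketch of modifier(factor * value); `missing` is used when the field is absent."""
    value = missing if count is None else count
    return modifier(factor * value)

print(field_value_factor(10))    # foo: sqrt(10)
print(field_value_factor(5))     # bar: sqrt(5)
print(field_value_factor(None))  # foo bar has no count: sqrt(9) = 3.0
```

These are the values listed as "field value function" in the explain output.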

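Now, the document containing foo bar has no count field, so the missing value is used (defined in the query, i.e. 9). Its function score will be sqrt(1.0 * 9) = 3.0.
If you set missing to a value lower than 9, the order of the results will change, because the score contributed by the count field will differ. For example, when you set missing to 0, foo bar only gets a score from the match query, and field_value_factor adds nothing. The final score is computed as the match query score plus the field_value_factor score (on the count field), so the total score of foo bar would then be lower than that of the other documents.
For example, for foo bar the final score is computed as 0.78038335 + 3.0 = 3.7803833. Please go through the result below to see in detail how the score is computed.
See this blog post on how scoring works in Elasticsearch.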

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 3.7803833,
    "hits": [
      {
        "_shard": "[stof_64169215][0]",
        "_node": "fVeabsK0Q1GnCZ_8oROXjA",
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "3",
        "_score": 3.7803833,
        "_source": {
          "name": "foo bar"
        },
        "_explanation": {
          "value": 3.7803833,
          "description": "sum of",
          "details": [
            {
              "value": 0.78038335,
              "description": "sum of:",
              "details": [
                {
                  "value": 0.39019167,
                  "description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
                  "details": [
                    {
                      "value": 0.39019167,
                      "description": "score(freq=1.0), computed as boost * idf * tf from:",
                      "details": [
                        {
                          "value": 2.2,
                          "description": "boost",
                          "details": []
                        },
                        {
                          "value": 0.47000363,
                          "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                          "details": [
                            {
                              "value": 2,
                              "description": "n, number of documents containing term",
                              "details": []
                            },
                            {
                              "value": 3,
                              "description": "N, total number of documents with field",
                              "details": []
                            }
                          ]
                        },
                        {
                          "value": 0.37735844,
                          "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "freq, occurrences of term within document",
                              "details": []
                            },
                            {
                              "value": 1.2,
                              "description": "k1, term saturation parameter",
                              "details": []
                            },
                            {
                              "value": 0.75,
                              "description": "b, length normalization parameter",
                              "details": []
                            },
                            {
                              "value": 2.0,
                              "description": "dl, length of field",
                              "details": []
                            },
                            {
                              "value": 1.3333334,
                              "description": "avgdl, average length of field",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                },
                {
                  "value": 0.39019167,
                  "description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
                  "details": [
                    {
                      "value": 0.39019167,
                      "description": "score(freq=1.0), computed as boost * idf * tf from:",
                      "details": [
                        {
                          "value": 2.2,
                          "description": "boost",
                          "details": []
                        },
                        {
                          "value": 0.47000363,
                          "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                          "details": [
                            {
                              "value": 2,
                              "description": "n, number of documents containing term",
                              "details": []
                            },
                            {
                              "value": 3,
                              "description": "N, total number of documents with field",
                              "details": []
                            }
                          ]
                        },
                        {
                          "value": 0.37735844,
                          "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "freq, occurrences of term within document",
                              "details": []
                            },
                            {
                              "value": 1.2,
                              "description": "k1, term saturation parameter",
                              "details": []
                            },
                            {
                              "value": 0.75,
                              "description": "b, length normalization parameter",
                              "details": []
                            },
                            {
                              "value": 2.0,
                              "description": "dl, length of field",
                              "details": []
                            },
                            {
                              "value": 1.3333334,
                              "description": "avgdl, average length of field",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 3.0,
              "description": "min of:",
              "details": [
                {
                  "value": 3.0,
                  "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
                  "details": []
                },
                {
                  "value": 3.4028235E38,
                  "description": "maxBoost",
                  "details": []
                }
              ]
            }
          ]
        }
      },
      {
        "_shard": "[stof_64169215][0]",
        "_node": "fVeabsK0Q1GnCZ_8oROXjA",
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "1",
        "_score": 3.685826,
        "_source": {
          "name": "foo",
          "count": 10
        },
        "_explanation": {
          "value": 3.685826,
          "description": "sum of",
          "details": [
            {
              "value": 0.52354836,
              "description": "sum of:",
              "details": [
                {
                  "value": 0.52354836,
                  "description": "weight(name:foo in 0) [PerFieldSimilarity], result of:",
                  "details": [
                    {
                      "value": 0.52354836,
                      "description": "score(freq=1.0), computed as boost * idf * tf from:",
                      "details": [
                        {
                          "value": 2.2,
                          "description": "boost",
                          "details": []
                        },
                        {
                          "value": 0.47000363,
                          "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                          "details": [
                            {
                              "value": 2,
                              "description": "n, number of documents containing term",
                              "details": []
                            },
                            {
                              "value": 3,
                              "description": "N, total number of documents with field",
                              "details": []
                            }
                          ]
                        },
                        {
                          "value": 0.50632906,
                          "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "freq, occurrences of term within document",
                              "details": []
                            },
                            {
                              "value": 1.2,
                              "description": "k1, term saturation parameter",
                              "details": []
                            },
                            {
                              "value": 0.75,
                              "description": "b, length normalization parameter",
                              "details": []
                            },
                            {
                              "value": 1.0,
                              "description": "dl, length of field",
                              "details": []
                            },
                            {
                              "value": 1.3333334,
                              "description": "avgdl, average length of field",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 3.1622777,
              "description": "min of:",
              "details": [
                {
                  "value": 3.1622777,
                  "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
                  "details": []
                },
                {
                  "value": 3.4028235E38,
                  "description": "maxBoost",
                  "details": []
                }
              ]
            }
          ]
        }
      },
      {
        "_shard": "[stof_64169215][0]",
        "_node": "fVeabsK0Q1GnCZ_8oROXjA",
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "2",
        "_score": 2.7596164,
        "_source": {
          "name": "bar",
          "count": 5
        },
        "_explanation": {
          "value": 2.7596164,
          "description": "sum of",
          "details": [
            {
              "value": 0.52354836,
              "description": "sum of:",
              "details": [
                {
                  "value": 0.52354836,
                  "description": "weight(name:bar in 0) [PerFieldSimilarity], result of:",
                  "details": [
                    {
                      "value": 0.52354836,
                      "description": "score(freq=1.0), computed as boost * idf * tf from:",
                      "details": [
                        {
                          "value": 2.2,
                          "description": "boost",
                          "details": []
                        },
                        {
                          "value": 0.47000363,
                          "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                          "details": [
                            {
                              "value": 2,
                              "description": "n, number of documents containing term",
                              "details": []
                            },
                            {
                              "value": 3,
                              "description": "N, total number of documents with field",
                              "details": []
                            }
                          ]
                        },
                        {
                          "value": 0.50632906,
                          "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "freq, occurrences of term within document",
                              "details": []
                            },
                            {
                              "value": 1.2,
                              "description": "k1, term saturation parameter",
                              "details": []
                            },
                            {
                              "value": 0.75,
                              "description": "b, length normalization parameter",
                              "details": []
                            },
                            {
                              "value": 1.0,
                              "description": "dl, length of field",
                              "details": []
                            },
                            {
                              "value": 1.3333334,
                              "description": "avgdl, average length of field",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 2.236068,
              "description": "min of:",
              "details": [
                {
                  "value": 2.236068,
                  "description": "field value function: sqrt(doc['count'].value?:9.0 * factor=1.0)",
                  "details": []
                },
                {
                  "value": 3.4028235E38,
                  "description": "maxBoost",
                  "details": []
                }
              ]
            }
          ]
        }
      }
    ]
  }
}
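
The per-term match scores in this output can be reproduced from the idf and tf formulas quoted in its description fields. Here is a minimal Python sketch, with all parameter values copied from the explain output. The function name bm25_term_score is hypothetical:

```python
import math

def bm25_term_score(n, N, freq, dl, avgdl, k1=1.2, b=0.75, boost=2.2):
    """boost * idf * tf, using the formulas quoted in the explain descriptions."""
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
    tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return boost * idf * tf

avgdl = 4 / 3  # shown as 1.3333334 in the output

two_term = bm25_term_score(n=2, N=3, freq=1.0, dl=2.0, avgdl=avgdl)
one_term = bm25_term_score(n=2, N=3, freq=1.0, dl=1.0, avgdl=avgdl)

print(two_term)      # each term of "foo bar", about 0.3902
print(2 * two_term)  # both terms match "foo bar", about 0.7804
print(one_term)      # "foo" or "bar", about 0.5235
```

The longer field (dl = 2) lowers tf for each term, but foo bar matches both terms, so its match score is still higher than that of foo or bar alone.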

Search result:

"hits": [
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "3",
        "_score": 3.7803833,
        "_source": {
          "name": "foo bar"
        }
      },
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "1",
        "_score": 3.685826,
        "_source": {
          "name": "foo",
          "count": 10
        }
      },
      {
        "_index": "stof_64169215",
        "_type": "_doc",
        "_id": "2",
        "_score": 2.7596164,
        "_source": {
          "name": "bar",
          "count": 5
        }
      }
    ]
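
To see how the missing value decides this ordering, here is a rough Python sketch. It takes the match scores from the explain output and assumes "boost_mode": "sum" simply adds sqrt(count) to them. The names docs, final_score and ranking are hypothetical:

```python
import math

# Match scores and counts from the explain output (None = no count field).
docs = {
    "foo bar": (0.78038335, None),
    "foo": (0.52354836, 10),
    "bar": (0.52354836, 5),
}

def final_score(match_score, count, missing):
    """boost_mode "sum": match score + sqrt(count, or missing if absent)."""
    value = missing if count is None else count
    return match_score + math.sqrt(value)

def ranking(missing):
    scores = {name: final_score(m, c, missing) for name, (m, c) in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(ranking(missing=9))  # ['foo bar', 'foo', 'bar']
print(ranking(missing=0))  # ['foo', 'bar', 'foo bar']

# Smallest missing value that still keeps foo bar level with foo.
threshold = (final_score(*docs["foo"], missing=0) - docs["foo bar"][0]) ** 2
print(threshold)  # roughly 8.44
```

Under these assumptions, any missing value above roughly 8.44 keeps foo bar first for this data set. The right value depends on the count values actually present in the index.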

Regarding "elasticsearch - Giving weight to an Elasticsearch field", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64169215/
