elasticsearch - Elasticsearch is not returning the document I expect in search results

Reposted. Author: 行者123. Updated: 2023-12-03 01:21:33

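I have a collection of customers with a first name, last name, email, description, and owner ID. I want to take a single string from the application and search across all fields in priority order. I am using boosting to achieve this.

At the moment I have many test customers named Sean, with the name appearing in various fields of the documents. Two documents contain the email sean.jones@email.com. One of them also contains the same email in its description.

When I run the search below, the document that does not contain the email in its description is missing from the results.

Here is my query: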

{
  "query" : {
    "bool" : {
      "filter" : {
        "match" : {
          "ownerId" : "acct_123"
        }
      },
      "must" : [
        {
          "bool" : {
            "should" : [
              {
                "prefix" : {
                  "firstName" : {
                    "value" : "sean",
                    "boost" : 3
                  }
                }
              },
              {
                "prefix" : {
                  "lastName" : {
                    "value" : "sean",
                    "boost" : 3
                  }
                }
              },
              {
                "terms" : {
                  "boost" : 2,
                  "description" : [
                    "sean"
                  ]
                }
              },
              {
                "prefix" : {
                  "email" : {
                    "value" : "sean",
                    "boost" : 1
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Here is the document that is missing:
{
  "_index" : "xxx",
  "_id" : "cus_123",
  "_version" : 1,
  "_type" : "customers",
  "_seq_no" : 9096,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "firstName" : null,
    "id" : "cus_123",
    "lastName" : null,
    "email" : "sean.jones@email.com",
    "ownerId" : "acct_123",
    "description" : null
  }
}

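When I look at the current results, all documents have a score of 3.0. They also have "Sean" in their names, so they score higher. When I run _explain on the missing document with the query above, I get the following: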
{
  "_index": "xxx",
  "_type": "customers",
  "_id": "cus_123",
  "matched": true,
  "explanation": {
    "value": 1.0,
    "description": "sum of:",
    "details": [
      {
        "value": 1.0,
        "description": "sum of:",
        "details": [
          {
            "value": 1.0,
            "description": "ConstantScore(email._index_prefix:sean)",
            "details": []
          }
        ]
      },
      {
        "value": 0.0,
        "description": "match on required clause, product of:",
        "details": [
          {
            "value": 0.0,
            "description": "# clause",
            "details": []
          },
          {
            "value": 1.0,
            "description": "ownerId:acct_123",
            "details": []
          }
        ]
      }
    ]
  }
}
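
The nested explanation is easier to read when flattened to its leaf clauses. The following is a minimal sketch in Python that uses a trimmed copy of the response above; `leaf_clauses` is a hypothetical helper written for this illustration, not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the _explain response shown above.
explain_response = json.loads("""
{
  "matched": true,
  "explanation": {
    "value": 1.0,
    "description": "sum of:",
    "details": [
      {"value": 1.0, "description": "sum of:", "details": [
        {"value": 1.0, "description": "ConstantScore(email._index_prefix:sean)", "details": []}
      ]},
      {"value": 0.0, "description": "match on required clause, product of:", "details": [
        {"value": 0.0, "description": "# clause", "details": []},
        {"value": 1.0, "description": "ownerId:acct_123", "details": []}
      ]}
    ]
  }
}
""")


def leaf_clauses(node):
    """Yield (value, description) for every leaf of an explanation tree."""
    details = node.get("details") or []
    if not details:
        yield node["value"], node["description"]
        return
    for child in details:
        yield from leaf_clauses(child)


leaves = list(leaf_clauses(explain_response["explanation"]))
for value, description in leaves:
    print(f"{value:>4} {description}")
```

Read this way, the only clause that contributes to the score is the prefix match on email; the ownerId filter matches but adds 0. The document therefore does match the query, with a total score of 1.0.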

Here is my mapping:
{
  "properties": {
    "firstName": {
      "type": "text",
      "index_prefixes": {
        "max_chars": 10,
        "min_chars": 1
      }
    },
    "email": {
      "analyzer": "my_email_analyzer",
      "type": "text",
      "index_prefixes": {
        "max_chars": 10,
        "min_chars": 1
      }
    },
    "lastName": {
      "type": "text",
      "index_prefixes": {
        "max_chars": 10,
        "min_chars": 1
      }
    },
    "description": {
      "type": "text"
    },
    "ownerId": {
      "type": "text"
    }
  }
}

"my_email_analyzer": {
  "type": "custom",
  "tokenizer": "uax_url_email"
}
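
To see why the prefix clause on email can match this document at all, here is a rough Python model of what index_prefixes with min_chars 1 and max_chars 10 makes searchable. It is a simplification for illustration, not how Lucene implements the prefix field, and it assumes the uax_url_email tokenizer emits the whole address as a single token. `edge_prefixes` is a hypothetical helper.

```python
def edge_prefixes(term, min_chars=1, max_chars=10):
    """Return the prefixes of `term` with length between min_chars and max_chars."""
    longest = min(len(term), max_chars)
    return [term[:n] for n in range(min_chars, longest + 1)]


# Assumption: the uax_url_email tokenizer keeps the address as one token.
email_token = "sean.jones@email.com"
prefixes = edge_prefixes(email_token)

print(prefixes)
print("sean" in prefixes)
```

Under these assumptions "sean" is one of the indexed prefixes of the email token, which is consistent with the email._index_prefix:sean clause reported by _explain.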

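If I understand this correctly, the document is not meeting some threshold because it only scores 1.0. I've tried adjusting min_score, but had no luck. Any ideas on how to get this document included in the search results?

Many thanks

Best answer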

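It depends on what you mean by "missing":

  1. Is the document not counted in the number of hits ("total")?
  2. Or is the document itself not listed among the returned hits?

If it is #2, you may need to increase the number of documents Elasticsearch fetches and returns by adding a size parameter to your search request (the default size is 10):

Example:
    "size": 50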
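
The effect of the default page size can be sketched in plain Python. Everything here is illustrative: the hit list is made-up data (twelve name matches scoring 3.0 plus the email-only match at 1.0), `first_page` and `with_size` are hypothetical helpers, and the `match_all` query is only a placeholder for the bool query from the question.

```python
import copy


def with_size(search_body, size):
    """Return a copy of a search request body with an explicit `size`."""
    body = copy.deepcopy(search_body)
    body["size"] = size
    return body


def first_page(scored_hits, size=10):
    """Sort hits by score, highest first, and keep only the first `size`."""
    ranked = sorted(scored_hits, key=lambda hit: hit[1], reverse=True)
    return ranked[:size]


# Made-up data: twelve customers matching on name, one matching on email only.
hits = [(f"cus_name_{i}", 3.0) for i in range(12)] + [("cus_123", 1.0)]

default_ids = [doc_id for doc_id, _ in first_page(hits)]
larger_ids = [doc_id for doc_id, _ in first_page(hits, size=50)]

print("cus_123" in default_ids)
print("cus_123" in larger_ids)

# Placeholder query; in practice this would be the bool query from the question.
request = with_size({"query": {"match_all": {}}}, 50)
print(request)
```

In this toy data the low-scoring document still matches, but it sits below more than ten higher-scoring hits, so it only appears once the page size is raised.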

Regarding "elasticsearch - Elasticsearch is not returning the document I expect in search results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60043143/
