gpt4 book ai didi

lucene - 查询嵌套文档中的缺失字段

转载 作者:行者123 更新时间:2023-12-02 22:52:39 26 4
gpt4 key购买 nike

我有一个包含许多标签的用户文档
这是映射:

{
"user" : {
"properties" : {
"tags" : {
"type" : "nested",
"properties" : {
"id" : {
"type" : "string",
"index" : "not_analyzed",
"store" : "yes"
},
"current" : {
"type" : "boolean"
},
"type" : {
"type" : "string"
},
"value" : {
"type" : "multi_field",
"fields" : {
"value" : {
"type" : "string",
"analyzer" : "name_analyzer"
},
"value_untouched" : {
"type" : "string",
"index" : "not_analyzed",
"include_in_all" : false
}
}
}
}
}
}
}
}

以下是样本用户文档:
用户1
{
"created_at": 1317484762000,
"updated_at": 1367040856000,
"tags": [
{
"type": "college",
"value": "Dhirubhai Ambani Institute of Information and Communication Technology",
"id": "a6f51ef8b34eb8f24d1c5be5e4ff509e2a361829"
},
{
"type": "company",
"value": "alma connect",
"id": "58ad4afcc8415216ea451339aaecf311ed40e132"
},
{
"type": "company",
"value": "Google",
"id": "93bc8199c5fe7adfd181d59e7182c73fec74eab5",
"current": true
},
{
"type": "discipline",
"value": "B.Tech.",
"id": "a7706af7f1477cbb1ac0ceb0e8531de8da4ef1eb",
"institute_id": "4fb424a5addf32296f00013a"
},
]
}

使用者2:
{
"created_at": 1318513355000,
"updated_at": 1364888695000,
"tags": [
{
"type": "college",
"value": "Dhirubhai Ambani Institute of Information and Communication Technology",
"id": "a6f51ef8b34eb8f24d1c5be5e4ff509e2a361829"
},
{
"type": "college",
"value": "Bharatiya Vidya Bhavan's Public School, Jubilee hills, Hyderabad",
"id": "d20730345465a974dc61f2132eb72b04e2f5330c"
},
{
"type": "company",
"value": "Alma Connect",
"id": "93bc8199c5fe7adfd181d59e7182c73fec74eab5"
},
{
"type": "sector",
"value": "Website and Software Development",
"id": "dc387d78fc99ab43e6ae2b83562c85cf3503a8a4"
}
]
}

使用者3:
{
"created_at": 1318513355001,
"updated_at": 1364888695010,
"tags": [
{
"type": "college",
"value": "Dhirubhai Ambani Institute of Information and Communication Technology",
"id": "a6f51ef8b34eb8f24d1c5be5e4ff509e2a361821"
},
{
"type": "sector",
"value": "Website and Software Development",
"id": "dc387d78fc99ab43e6ae2b83562c85cf3503a8a1"
}
]
}

使用上面的ES文档进行搜索,我想构造一个查询,在这里我需要获取嵌套标签文档中具有公司标签的用户或没有任何公司标签的用户。我的搜索查询是什么?

例如,在上述情况下,如果搜索google标签,则返回的文档应为“用户1”和“用户3”(因为用户1具有公司标签google,而用户3没有公司标签)。用户2未返回,因为它也具有Google以外的公司标签。

最佳答案

一点都不琐碎,主要是由于没有type:company标签子句。这是我想出的:

{
"or" : {
"filters" : [ {
"nested" : {
"filter" : {
"and" : {
"filters" : [ {
"term" : {
"tags.value" : "google"
}
}, {
"term" : {
"tags.type" : "company"
}
} ]
}
},
"path" : "tags"
}
}, {
"not" : {
"filter" : {
"nested" : {
"filter" : {
"term" : {
"tags.type" : "company"
}
},
"path" : "tags"
}
}
}
} ]
}
}

它包含一个带有两个嵌套子句的 or filter:第一个查找具有tag.type:company和tags.value:google的文档,而第二个查找没有任何tag.type:company的所有文档。

尽管和/或/和过滤器不像 term filter那样利用与位集一起使用的过滤器缓存,但仍需要对此进行优化。最好花一些时间来找到一种使用 bool filter并获得相同结果的方法。看看 this article了解更多。

关于lucene - 查询嵌套文档中的缺失字段,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16399464/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com