json - Is there another way to optimize this Elasticsearch query across multiple nested fields in JSON?

Reposted. Author: 行者123. Updated: 2023-12-03 01:28:21

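I am new to Elasticsearch. Below is the sample data I need to run Elasticsearch queries against. I am trying to fetch the documents whose account_type_name is "Credit Card" and whose source_name is "SOMEVALUE".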

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "bureau_data",
            "_type" : "_doc",
            "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
            "_score" : 1.0,
            "_source" : {
              "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
              "raw_derived" : {
                "gender" : "MALE",
                "firstname" : "trsqlsz",
                "middlename" : "rgj",
                "lastname" : "ggksb",
                "mobilephone" : "2125954664",
                "dob" : "1988-06-28 00:00:00",
                "applications" : [
                  {
                    "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                    "createdat" : "2019-06-07 19:28:54",
                    "updatedat" : "2019-06-07 19:28:55",
                    "source" : "4",
                    "source_name" : "EXPERIAN",
                    "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
                    "accounts" : [
                      {
                        "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
                        "currentbalance" : 418288,
                        "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
                        "accounttypeid" : "5",
                        "institution_name" : "HDFC BANK",
                        "account_type_name" : "Personal Loan"
                      }
                    ]
                  }
                ]
              }
            }
          }
        ]
      }
    }

I have tried the query below and it works. I would like to know whether there is a more optimized way to query multiple nested fields.

    GET /my_index/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "raw_derived.applications.accounts",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "raw_derived.applications.accounts.account_type_name": "Credit Card"
                        }
                      }
                    ]
                  }
                }
              }
            },
            {
              "nested": {
                "path": "raw_derived.applications",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "raw_derived.applications.source_name": "CIBIL"
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }

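If I have to query more nested fields, the query becomes very long. Please suggest another way to query nested fields or to combine multiple AND conditions.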

Best answer

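Your optimization should always start with the data model and mapping, because that, and not your query, is usually the cause of performance problems.

That said, you can avoid nested queries by flattening your data. A flat data model results in one document per application and account element.

Because Elasticsearch is a non-relational data store, indexing "redundant" data is perfectly fine. This is not a lazy approach but a common way to handle these kinds of data structures.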

Sample document 1:

    {
      "_index" : "bureau_data",
      "_type" : "_doc",
      "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
      "_score" : 1.0,
      "_source" : {
        "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "gender" : "MALE",
        "firstname" : "trsqlsz",
        "middlename" : "rgj",
        "lastname" : "ggksb",
        "mobilephone" : "2125954664",
        "dob" : "1988-06-28 00:00:00",
        "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
        "createdat" : "2019-06-07 19:28:54",
        "updatedat" : "2019-06-07 19:28:55",
        "source" : "4",
        "source_name" : "EXPERIAN",
        "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
        "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
        "currentbalance" : 418288,
        "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
        "accounttypeid" : "5",
        "institution_name" : "HDFC BANK",
        "account_type_name" : "Personal Loan"
      }
    }

If the same user creates another account, you send exactly the same ("redundant") data again, except for the account-specific fields, like this:

    {
      "_index" : "bureau_data",
      "_type" : "_doc",
      "_id" : "another, from es generated id",
      "_score" : 1.0,
      "_source" : {
        "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "gender" : "MALE",
        "firstname" : "trsqlsz",
        "middlename" : "rgj",
        "lastname" : "ggksb",
        "mobilephone" : "2125954664",
        "dob" : "1988-06-28 00:00:00",
        "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
        "createdat" : "2019-06-07 19:28:54",
        "updatedat" : "2019-06-07 19:28:55",
        "source" : "4",
        "source_name" : "EXPERIAN",
        "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
        "applicationcreditreportaccountid" : "the new id",
        "currentbalance" : 4711,
        "institutionid" : "foo",
        "accounttypeid" : "bar",
        "institution_name" : "foo bar",
        "account_type_name" : "foo baz"
      }
    }
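
The flattening described above could be done in application code before indexing. The following is only a minimal Python sketch, not part of the original answer: the function name `flatten_user` is hypothetical, the input is an abridged copy of the nested `_source` from the question, and it assumes that field names do not clash between the user, application and account levels.

```python
from typing import Any, Dict, Iterator


def flatten_user(source: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat document per (application, account) pair.

    Assumes field names are unique across the user, application and
    account levels, as in the sample data; on a clash the innermost wins.
    Applications without accounts produce no document.
    """
    derived = dict(source.get("raw_derived", {}))  # shallow copy, input stays intact
    applications = derived.pop("applications", [])
    user_fields = {"userid": source.get("userid"), **derived}
    for application in applications:
        app_fields = dict(application)  # copy so the input is not mutated
        accounts = app_fields.pop("accounts", [])
        for account in accounts:
            # Repeat the user and application fields on every account document.
            yield {**user_fields, **app_fields, **account}


# Abridged version of the nested sample document from the question,
# with a second (placeholder) account added for illustration.
nested_source = {
    "userid": "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
    "raw_derived": {
        "gender": "MALE",
        "applications": [
            {
                "applicationid": "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                "source_name": "EXPERIAN",
                "accounts": [
                    {"institution_name": "HDFC BANK", "account_type_name": "Personal Loan"},
                    {"institution_name": "foo bar", "account_type_name": "foo baz"},
                ],
            }
        ],
    },
}

flat_docs = list(flatten_user(nested_source))
for doc in flat_docs:
    print(doc)
```

Each yielded dictionary carries the user and application fields alongside one account, which is the "redundant" shape shown in the two sample documents.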

With this data model you can run a simple query to get the results:

    GET /my_index/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "account_type_name": "Credit Card"
              }
            },
            {
              "match": {
                "source_name": "CIBIL"
              }
            }
          ]
        }
      }
    }
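
Since the question was about queries growing long as more AND conditions are added, here is a hedged Python sketch of how the flat query body above might be generated from a plain mapping of field names to values. The helper name `build_and_query` is hypothetical; it only builds the request body as a dictionary and does not call Elasticsearch.

```python
import json
from typing import Any, Dict


def build_and_query(conditions: Dict[str, Any]) -> Dict[str, Any]:
    """Build a bool/must query with one match clause per field (logical AND)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: value}}
                    for field, value in conditions.items()
                ]
            }
        }
    }


# Same two conditions as the flat query above; adding a condition is one more entry.
body = build_and_query({
    "account_type_name": "Credit Card",
    "source_name": "CIBIL",
})
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the hand-written request body, so it could be sent to the `_search` endpoint with whichever client you already use.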

Regarding "json - Is there another way to optimize this Elasticsearch query across multiple nested fields in JSON?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57408646/
