
elasticsearch - Elasticsearch aggregation on inner_hits

Reposted. Author: 行者123. Updated: 2023-12-03 00:16:39

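I am trying to run some aggregations on the inner_hits of a nested object ("queries"), where the inner hits are filtered by query date. The aggregation in the block below aggregates over the main document and all objects in "queries", not just the objects in the inner hits.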

GET /networkcollection/branch_routers/_search/
{
  "_source": false,
  "query": {
    "filtered": {
      "query": {
        "match": {
          "mh": 123
        }
      },
      "filter": {
        "nested": {
          "path": "queries",
          "filter": {
            "range": {
              "queries.dateQuery": {
                "gt": "20160101T200000.000Z",
                "lte": "now"
              }
            }
          },
          "inner_hits": {}
        }
      }
    }
  },
  "aggs": {
    "queries": {
      "filter": {
        "nested": {
          "path": "queries",
          "filter": {
            "range": {
              "queries.dateQuery": {
                "gte": "20160101T200000.000Z",
                "lte": "now"
              }
            }
          }
        }
      },
      "aggs": {
        "minDateQuery": {
          "min": {
            "field": "queries.dateQuery"
          }
        }
      }
    }
  }
}

How can I write this aggregation so that it only aggregates the "queries" objects returned in inner_hits?

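Best answer

I am very late with this answer, but it is entirely possible to aggregate over only the nested objects that would appear in inner_hits.

My ES version: 6.2.3

I am providing a detailed response, including the index mapping, a few dummy documents, and the search query plus its response.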

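The basic idea is to use a "filter" aggregation. You do not need the "query" part of the search request at all unless you are running some very complex query (to narrow the scope of the aggregation). Most simple queries can easily be specified in the aggregation's "filter" instead.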
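
One common way to express this idea is a nested aggregation that wraps a filter aggregation with a range clause, with the min metric underneath. The sketch below builds such a request body in Python. The helper name build_min_date_agg, its parameters, and the aggregation names nested_queries and in_range are hypothetical and chosen for illustration; the field names come from the question. Note that this is an alternative shape, not the query used later in this answer, which relies on date_range rather than filter plus range.

```python
import json


def build_min_date_agg(path, date_field, gte, lte="now"):
    """Build an aggs body shaped as nested -> filter(range) -> min.

    Hypothetical helper for illustration; only the Elasticsearch
    aggregation types (nested, filter, range, min) are real.
    """
    field = f"{path}.{date_field}"
    return {
        "size": 0,  # skip the hits and return aggregations only
        "aggs": {
            "nested_queries": {
                # step into the nested objects stored under `path`
                "nested": {"path": path},
                "aggs": {
                    "in_range": {
                        # keep only nested objects inside the date window
                        "filter": {"range": {field: {"gte": gte, "lte": lte}}},
                        "aggs": {
                            # metric over the nested objects that survived
                            "minDateQuery": {"min": {"field": field}}
                        },
                    }
                },
            }
        },
    }


body = build_min_date_agg("queries", "dateQuery", gte="2016-01-01")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a _search request against the index. A top-level "query" (for example a match on "mh") can still be added if the set of parent documents needs narrowing first.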

Index setup:

PUT networkcollection
{
  "mappings": {
    "branch_routers": {
      "properties": {
        "mh": {
          "type": "text"
        },
        "queries": {
          "type": "nested",
          "properties": {
            "dateQuery": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}

PUT networkcollection/branch_routers/1
{
  "mh": "corona",
  "queries": [
    { "dateQuery": "2012-04-23" },
    { "dateQuery": "2013-04-23" },
    { "dateQuery": "2014-04-23" },
    { "dateQuery": "2015-04-23" },
    { "dateQuery": "2016-04-23" },
    { "dateQuery": "2017-04-23" },
    { "dateQuery": "2018-04-23" },
    { "dateQuery": "2019-04-23" },
    { "dateQuery": "2020-04-23" }
  ]
}

PUT networkcollection/branch_routers/2
{
  "mh": "happy",
  "queries": [
    { "dateQuery": "2009-04-23" },
    { "dateQuery": "2008-04-23" },
    { "dateQuery": "2007-04-23" },
    { "dateQuery": "2015-04-23" },
    { "dateQuery": "2016-04-23" },
    { "dateQuery": "2017-04-23" },
    { "dateQuery": "2018-04-23" },
    { "dateQuery": "2019-04-23" },
    { "dateQuery": "2020-04-23" }
  ]
}

PUT networkcollection/branch_routers/3
{
  "mh": "happy",
  "queries": [
    { "dateQuery": "2001-04-23" },
    { "dateQuery": "2008-04-23" },
    { "dateQuery": "2007-04-23" },
    { "dateQuery": "2015-04-23" },
    { "dateQuery": "2016-04-23" },
    { "dateQuery": "2017-04-23" },
    { "dateQuery": "2018-04-23" },
    { "dateQuery": "2019-04-23" },
    { "dateQuery": "2020-04-23" }
  ]
}


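We have now added three basic documents. Next we filter on "mh" = "happy", and we want the minimum dateQuery among the nested objects whose date falls between 2016 and now (we are currently in the middle of the coronavirus lockdown, so you know which year it is :)).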
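
To make the goal concrete, the expected result can be worked out by hand over the three sample documents. The following is a plain-Python sketch, not Elasticsearch code. It assumes that "now" is pinned to the timestamp shown in the response further down (2020-05-14), that the match on "mh" behaves like simple equality for these one-word values, and that the date-math lower bound "now-4y/y" means "subtract four years, then round down to the start of that year". All helper names are hypothetical.

```python
from datetime import datetime


def dates(*years):
    """Every sample nested object is dated April 23 of some year."""
    return [datetime(y, 4, 23) for y in years]


# Sample data copied from the three documents indexed above.
DOCS = [
    {"mh": "corona", "queries": dates(2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020)},
    {"mh": "happy", "queries": dates(2009, 2008, 2007, 2015, 2016, 2017, 2018, 2019, 2020)},
    {"mh": "happy", "queries": dates(2001, 2008, 2007, 2015, 2016, 2017, 2018, 2019, 2020)},
]

# Assumption: "now" fixed to the response timestamp so the run is repeatable.
NOW = datetime(2020, 5, 14, 23, 2, 31)


def start_of_year_minus(now, years):
    """Mimic "now-<years>y/y": subtract years, round down to January 1."""
    return datetime(now.year - years, 1, 1)


def min_date_in_range(docs, mh, lower, upper):
    """Count and find the minimum of nested dates in [lower, upper)."""
    matching = [
        d
        for doc in docs
        if doc["mh"] == mh
        for d in doc["queries"]
        if lower <= d < upper  # "from" is inclusive, "to" is exclusive
    ]
    return len(matching), (min(matching) if matching else None)


count, earliest = min_date_in_range(DOCS, "happy", start_of_year_minus(NOW, 4), NOW)
print(count, earliest.date())  # 10 2016-04-23
```

Each of the two "happy" documents contributes five nested objects dated 2016 through 2020, which gives a count of 10 and a minimum of 2016-04-23.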

Search query:
GET networkcollection/branch_routers/_search
{
  "_source": false,
  "query": {
    "match": {
      "mh": "happy"
    }
  },
  "aggs": {
    "filtered_agg": {
      "filter": {
        "match": {
          "mh": "happy"
        }
      },
      "aggs": {
        "filtered_nested": {
          "nested": {
            "path": "queries"
          },
          "aggs": {
            "dateQuery_agg": {
              "date_range": {
                "field": "queries.dateQuery",
                "ranges": [
                  {
                    "from": "now-4y/y",
                    "to": "now"
                  }
                ]
              },
              "aggs": {
                "min_date": {
                  "min": {
                    "field": "queries.dateQuery"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Response:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "networkcollection",
        "_type": "branch_routers",
        "_id": "2",
        "_score": 0.2876821
      },
      {
        "_index": "networkcollection",
        "_type": "branch_routers",
        "_id": "3",
        "_score": 0.2876821
      }
    ]
  },
  "aggregations": {
    "filtered_agg": {
      "doc_count": 2,
      "filtered_nested": {
        "doc_count": 18,
        "dateQuery_agg": {
          "buckets": [
            {
              "key": "2016-01-01T00:00:00.000Z-2020-05-14T23:02:31.611Z",
              "from": 1451606400000,
              "from_as_string": "2016-01-01T00:00:00.000Z",
              "to": 1589497351611,
              "to_as_string": "2020-05-14T23:02:31.611Z",
              "doc_count": 10,
              "min_date": {
                "value": 1461369600000,
                "value_as_string": "2016-04-23T00:00:00.000Z"
              }
            }
          ]
        }
      }
    }
  }
}

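As you can see, it correctly filters out the document with "mh" = "corona" and keeps only the two documents with "mh" = "happy" (filtered_agg doc_count: 2). Of their 18 nested "queries" objects, it then keeps only the 10 that fall within the date range I specified, and finally reports the min_date over those.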
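
For completeness, here is a minimal sketch of how a client might read that result out of the response. It works on a trimmed copy of the "aggregations" section shown above as a plain Python dict, so no Elasticsearch client is involved; the function name extract_min_date is hypothetical.

```python
from datetime import datetime, timezone

# Trimmed copy of the "aggregations" section of the response above.
response = {
    "aggregations": {
        "filtered_agg": {
            "doc_count": 2,
            "filtered_nested": {
                "doc_count": 18,
                "dateQuery_agg": {
                    "buckets": [
                        {
                            "doc_count": 10,
                            "min_date": {
                                "value": 1461369600000,
                                "value_as_string": "2016-04-23T00:00:00.000Z",
                            },
                        }
                    ]
                },
            },
        }
    }
}


def extract_min_date(resp):
    """Return (nested object count, min date as UTC datetime or None)."""
    nested = resp["aggregations"]["filtered_agg"]["filtered_nested"]
    bucket = nested["dateQuery_agg"]["buckets"][0]
    millis = bucket["min_date"]["value"]
    if millis is None:  # an empty bucket has no minimum
        return bucket["doc_count"], None
    # the metric value is epoch milliseconds
    when = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return bucket["doc_count"], when


count, when = extract_min_date(response)
print(count, when.isoformat())
```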

Regarding "elasticsearch - Elasticsearch aggregation on inner_hits", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35594652/
