
Elasticsearch: first and last value within a date range, plus other aggregations

Reposted. Author: 行者123. Updated: 2023-11-29 02:51:51

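I have data indexed in Elasticsearch as shown below, together with the output I expect. The data is grouped by sku_id. For each group I need the average rank over the whole date range, plus the first and the last value of last_7days_avg_rank within that range (ordered by date), returned as two separate fields.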

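Can anyone tell me whether this is possible in Elasticsearch? Right now I do this calculation in the service layer, but the response time has become unacceptable, so I want to move the logic into ES itself. I don't know how to implement it there.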

Input:

date      sku_id  last_7days_avg_rank  rank
20180101  S1      200                  200
20180102  S1      210                  200
20180105  S1      220                  200
20180108  S1      230                  200

20180101  S2      180                  300
20180103  S2      200                  300
20180106  S2      250                  300
20180107  S2      300                  300

Expected output:

sku  first_val_last7day_avg  last_val_last7days_avg  avg(rank)
S1   200                     230                     200
S2   180                     300                     300
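
To pin down exactly what is being asked for, here is a minimal plain-Python sketch of the computation on the sample rows above. It is only a reference for the intended semantics (roughly what a service-layer version might do), not the asker's actual code; the names `SAMPLE_ROWS` and `summarize` are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (date, sku_id, last_7days_avg_rank, rank)
SAMPLE_ROWS = [
    ("20180101", "S1", 200, 200),
    ("20180102", "S1", 210, 200),
    ("20180105", "S1", 220, 200),
    ("20180108", "S1", 230, 200),
    ("20180101", "S2", 180, 300),
    ("20180103", "S2", 200, 300),
    ("20180106", "S2", 250, 300),
    ("20180107", "S2", 300, 300),
]


def summarize(rows):
    """Per sku_id: first/last last_7days_avg_rank by date, and the mean rank."""
    groups = defaultdict(list)
    for date, sku, avg7, rank in rows:
        groups[sku].append((date, avg7, rank))

    summary = {}
    for sku, items in groups.items():
        # yyyymmdd strings sort chronologically when compared as text
        items.sort(key=lambda item: item[0])
        ranks = [rank for _, _, rank in items]
        summary[sku] = {
            "first_val_last7day_avg": items[0][1],
            "last_val_last7days_avg": items[-1][1],
            "avg_rank": sum(ranks) / len(ranks),
        }
    return summary


SUMMARY = summarize(SAMPLE_ROWS)
```
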

Thanks!

Best answer

You can get the desired result with aggregations. The query below assumes the date field is named my_date:

{
  "size": 0,
  "aggs": {
    "GROUP": {
      "terms": {
        "field": "sku_id"
      },
      "aggs": {
        "AVG_RANK": {
          "avg": {
            "field": "rank"
          }
        },
        "FIRST_7_RANK": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "my_date": {
                  "order": "asc"
                }
              }
            ]
          }
        },
        "LAST_7_RANK": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "my_date": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
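
The query above aggregates over every document in the index. Since the question asks for a specific date range, one possible extension is to add a range query on the date field. The following is a sketch under assumptions: the helper `build_query`, its parameters, and the `max_skus` bucket limit are illustrative names that do not appear in the original answer, and the date strings must match whatever format the my_date mapping accepts. It also sets `size` on the terms aggregation, because a terms aggregation returns only the top 10 buckets by default.

```python
import json


def build_query(start_date, end_date, date_field="my_date",
                group_field="sku_id", max_skus=1000):
    """Build the aggregation request body, restricted to a date range."""

    def edge_hit(order):
        # top_hits with size 1: the earliest ("asc") or latest ("desc") document
        return {
            "top_hits": {
                "size": 1,
                "sort": [{date_field: {"order": order}}],
            }
        }

    return {
        "size": 0,  # no search hits needed, only aggregations
        "query": {
            # Assumption: both bounds are inclusive
            "range": {date_field: {"gte": start_date, "lte": end_date}}
        },
        "aggs": {
            "GROUP": {
                # max_skus is a hypothetical cap on the number of buckets returned
                "terms": {"field": group_field, "size": max_skus},
                "aggs": {
                    "AVG_RANK": {"avg": {"field": "rank"}},
                    "FIRST_7_RANK": edge_hit("asc"),
                    "LAST_7_RANK": edge_hit("desc"),
                },
            }
        },
    }


QUERY_BODY = build_query("2018-01-01", "2018-01-08")
QUERY_JSON = json.dumps(QUERY_BODY, indent=2)
```

`QUERY_JSON` can then be sent as the body of a search request with whichever client is in use.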

You get a result like the following as output:

"aggregations": {
  "GROUP": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "S1",
        "doc_count": 40,
        "LAST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9MU6JeKRzn3ttxGOr",
                "_score": null,
                "_source": {
                  "my_date": "2018-01-08",
                  "sku_id": "S1",
                  "last_7days_avg_rank": 230,
                  "rank": 200
                },
                "sort": [
                  1515369600000
                ]
              }
            ]
          }
        },
        "AVG_RANK": {
          "value": 200
        },
        "FIRST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9LYVpeKRzn3ttxGOQ",
                "_score": null,
                "_source": {
                  "my_date": "20180101",
                  "sku_id": "S1",
                  "last_7days_avg_rank": 200,
                  "rank": 200
                },
                "sort": [
                  20180101
                ]
              }
            ]
          }
        }
      },
      {
        "key": "S2",
        "doc_count": 40,
        "LAST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9MU6JeKRzn3ttxGOv",
                "_score": null,
                "_source": {
                  "my_date": "2018-01-07",
                  "sku_id": "S2",
                  "last_7days_avg_rank": 300,
                  "rank": 300
                },
                "sort": [
                  1515283200000
                ]
              }
            ]
          }
        },
        "AVG_RANK": {
          "value": 300
        },
        "FIRST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9LYVpeKRzn3ttxGOU",
                "_score": null,
                "_source": {
                  "my_date": "20180101",
                  "sku_id": "S2",
                  "last_7days_avg_rank": 180,
                  "rank": 300
                },
                "sort": [
                  20180101
                ]
              }
            ]
          }
        }
      }
    ]
  }
}

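The result above contains two buckets (groups), one for S1 and one for S2. In each bucket, AVG_RANK holds the group's average rank. For first_val_last7day_avg, read the value at "FIRST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank". Similarly, for last_val_last7days_avg, read the value at "LAST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank". The inner "hits" is an array, so take its first (and only) element. Hope this helps.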
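
The extraction described above can be sketched in Python as follows. This assumes the response has already been parsed into a dict and that the aggregation names match the query (GROUP, AVG_RANK, FIRST_7_RANK, LAST_7_RANK). The function name `summarize_buckets`, the helper `_hit`, and the trimmed `sample_response` are illustrative only and keep just the fields the function reads.

```python
def summarize_buckets(response, value_field="last_7days_avg_rank"):
    """Flatten the terms/top_hits aggregation response into one row per SKU."""
    rows = []
    for bucket in response["aggregations"]["GROUP"]["buckets"]:
        # Each top_hits aggregation was requested with size 1, so the inner
        # "hits" list holds a single document for every non-empty bucket.
        first_doc = bucket["FIRST_7_RANK"]["hits"]["hits"][0]["_source"]
        last_doc = bucket["LAST_7_RANK"]["hits"]["hits"][0]["_source"]
        rows.append({
            "sku": bucket["key"],
            "first_val_last7day_avg": first_doc[value_field],
            "last_val_last7days_avg": last_doc[value_field],
            "avg_rank": bucket["AVG_RANK"]["value"],
        })
    return rows


def _hit(value):
    # Hypothetical helper: builds a trimmed top_hits result for the sample below
    return {"hits": {"hits": [{"_source": {"last_7days_avg_rank": value}}]}}


# Trimmed version of the response shown above
sample_response = {
    "aggregations": {
        "GROUP": {
            "buckets": [
                {"key": "S1", "AVG_RANK": {"value": 200},
                 "FIRST_7_RANK": _hit(200), "LAST_7_RANK": _hit(230)},
                {"key": "S2", "AVG_RANK": {"value": 300},
                 "FIRST_7_RANK": _hit(180), "LAST_7_RANK": _hit(300)},
            ]
        }
    }
}

flat_rows = summarize_buckets(sample_response)
```

Each entry of `flat_rows` corresponds to one line of the expected output table in the question.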

For more on first and last values within a date range combined with other aggregations in Elasticsearch, see the similar question on Stack Overflow: https://stackoverflow.com/questions/49325741/
