
Sum of the duration field for the max id per group in Elasticsearch

Reposted. Author: bug小助手. Updated: 2023-10-25 23:17:58

I would like to create a visualization that sums the duration field after retrieving the max id per group in Elasticsearch. For example:


Data is:

id  workflow  sid  duration
1   A         x1   1m
1   A         x2   2m
2   A         x1   2m
2   A         x2   3m
1   B         y1   1m
1   B         y2   2m
2   B         y1   2m
2   B         y2   3m
3   B         y1   4m
3   B         y2   2m


Given the table above, the expected result is as follows: for each workflow, take the rows with the max id and sum their durations.

id  workflow  total
2   A         5m
3   B         6m
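
To make the intended result concrete, here is a minimal Python sketch of the same computation done in memory. It is only a reference for checking numbers, not an Elasticsearch query; the function name total_for_max_id is hypothetical, and the durations are assumed to be whole minutes.

```python
from collections import defaultdict

# Rows copied from the table above: (id, workflow, sid, duration in minutes).
ROWS = [
    (1, "A", "x1", 1), (1, "A", "x2", 2),
    (2, "A", "x1", 2), (2, "A", "x2", 3),
    (1, "B", "y1", 1), (1, "B", "y2", 2),
    (2, "B", "y1", 2), (2, "B", "y2", 3),
    (3, "B", "y1", 4), (3, "B", "y2", 2),
]


def total_for_max_id(rows):
    """Return {workflow: (max_id, total_minutes)} over the rows with the max id."""
    # First pass: find the largest id seen in each workflow.
    max_id = {}
    for row_id, workflow, _sid, _minutes in rows:
        max_id[workflow] = max(row_id, max_id.get(workflow, row_id))

    # Second pass: sum durations only for rows carrying that largest id.
    totals = defaultdict(int)
    for row_id, workflow, _sid, minutes in rows:
        if row_id == max_id[workflow]:
            totals[workflow] += minutes

    return {wf: (max_id[wf], totals[wf]) for wf in max_id}


print(total_for_max_id(ROWS))  # {'A': (2, 5), 'B': (3, 6)}
```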


I'm new to Elasticsearch queries and Kibana. I would appreciate a pointer on how to solve this.


{
  "size": 0,
  "aggs": {
    "my-bucket": {
      "terms": {
        "field": "workflow"
      },
      "aggs": {
        "max_id": {
          "max": {
            "field": "id"
          }
        }
      }
    }
  }
}

The search query above returns the expected workflow buckets and the max id for each. How can I use that max id to retrieve the matching sid values and sum up their durations?


Recommended answers

Instead of computing max_id, it might be easier to sort the id buckets within each workflow in descending order and keep only the top one:


DELETE test
PUT test
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "duration_min": {
        "type": "integer"
      },
      "sid": {
        "type": "keyword"
      },
      "workflow": {
        "type": "keyword"
      }
    }
  }
}


POST test/_bulk?refresh
{"index":{}}
{"id": 1, "workflow": "A", "sid": "x1", "duration_min": 1}
{"index":{}}
{"id": 1, "workflow": "A", "sid": "x2", "duration_min": 2}
{"index":{}}
{"id": 2, "workflow": "A", "sid": "x1", "duration_min": 2}
{"index":{}}
{"id": 2, "workflow": "A", "sid": "x2", "duration_min": 3}
{"index":{}}
{"id": 1, "workflow": "B", "sid": "y1", "duration_min": 1}
{"index":{}}
{"id": 1, "workflow": "B", "sid": "y2", "duration_min": 2}
{"index":{}}
{"id": 2, "workflow": "B", "sid": "y1", "duration_min": 2}
{"index":{}}
{"id": 2, "workflow": "B", "sid": "y2", "duration_min": 3}
{"index":{}}
{"id": 3, "workflow": "B", "sid": "y1", "duration_min": 4}
{"index":{}}
{"id": 3, "workflow": "B", "sid": "y2", "duration_min": 2}
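
Note that the bulk data above stores the duration as an integer number of minutes (duration_min) rather than the "1m" strings from the question, so that it can be summed. A minimal Python sketch of that conversion follows; the helper names parse_minutes and to_bulk_body are hypothetical, and it assumes every duration is a whole number followed by "m".

```python
import json


def parse_minutes(text):
    """Convert a string such as "3m" into the integer 3 (minutes only)."""
    if not text.endswith("m"):
        raise ValueError(f"unsupported duration: {text!r}")
    return int(text[:-1])


def to_bulk_body(rows):
    """Build an NDJSON _bulk body from (id, workflow, sid, duration) rows."""
    lines = []
    for row_id, workflow, sid, duration in rows:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps({
            "id": row_id,
            "workflow": workflow,
            "sid": sid,
            "duration_min": parse_minutes(duration),
        }))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


print(to_bulk_body([(1, "A", "x1", "1m"), (1, "A", "x2", "2m")]), end="")
```

The resulting string could be sent as the body of POST test/_bulk?refresh with whatever HTTP client you already use.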

GET test/_search
{
  "size": 0,
  "aggs": {
    "by_workflow": {
      "terms": {
        "field": "workflow"
      },
      "aggs": {
        "by_id": {
          "terms": {
            "field": "id"
          },
          "aggs": {
            "sids": {
              "terms": {
                "field": "sid"
              }
            },
            "duration_sum": {
              "sum": {
                "field": "duration_min"
              }
            },
            "sales_bucket_sort": {
              "bucket_sort": {
                "sort": [
                  { "_key": { "order": "desc" } }
                ],
                "size": 1
              }
            }
          }
        }
      }
    }
  }
}
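
If you want to consume the result outside Kibana, the nested buckets can be flattened as in the Python sketch below. The function name top_totals is hypothetical, and the SAMPLE dictionary is hand-written to follow the usual shape of a terms/sum aggregation response; it is an assumption for illustration, not output captured from a cluster.

```python
def top_totals(response):
    """Map each workflow key to (max id, summed duration) from the response."""
    result = {}
    for wf_bucket in response["aggregations"]["by_workflow"]["buckets"]:
        id_buckets = wf_bucket["by_id"]["buckets"]
        if not id_buckets:
            continue
        top = id_buckets[0]  # bucket_sort left only the highest id
        result[wf_bucket["key"]] = (top["key"], top["duration_sum"]["value"])
    return result


# Hand-written sample shaped like a terms/sum aggregation response; it is an
# assumption for illustration, not output captured from a real cluster.
SAMPLE = {
    "aggregations": {
        "by_workflow": {
            "buckets": [
                {"key": "A", "doc_count": 4, "by_id": {"buckets": [
                    {"key": 2, "doc_count": 2, "duration_sum": {"value": 5.0}}
                ]}},
                {"key": "B", "doc_count": 6, "by_id": {"buckets": [
                    {"key": 3, "doc_count": 2, "duration_sum": {"value": 6.0}}
                ]}},
            ]
        }
    }
}

print(top_totals(SAMPLE))  # {'A': (2, 5.0), 'B': (3, 6.0)}
```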


Here is another approach that I learned from the Elastic Stack community.


GET test/_search
{
  "size": 0,
  "aggs": {
    "workflow": {
      "terms": {
        "field": "workflow"
      },
      "aggs": {
        "ids": {
          "terms": {
            "field": "id",
            "order": { "max_id": "desc" },
            "size": 1
          },
          "aggs": {
            "max_id": {
              "max": {
                "field": "id"
              }
            },
            "sum_duration": {
              "sum": {
                "field": "duration_min"
              }
            }
          }
        }
      }
    }
  }
}

Comments

Thanks for sharing; it helps my understanding.

Heh. Not sure what I was thinking here. Yeah, your solution is much better. :)

I think you should remove "max_id": { "max": { "field": "id" } }, since that value is already available as the key of the parent terms bucket (the terms order would then be { "_key": "desc" }), and then accept this as the solution.
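
The simplification suggested in the comment above could look roughly like the following Python sketch, which builds the search body as a dictionary. The function name build_query and its parameters are hypothetical; the default field names assume the test index mapping from the first answer.

```python
import json


def build_query(group_field="workflow", id_field="id", sum_field="duration_min"):
    """Return a search body: per group, keep the highest id bucket and sum a field."""
    return {
        "size": 0,
        "aggs": {
            "workflow": {
                "terms": {"field": group_field},
                "aggs": {
                    "ids": {
                        # Ordering by the bucket key replaces the max_id sub-aggregation.
                        "terms": {"field": id_field, "order": {"_key": "desc"}, "size": 1},
                        "aggs": {"sum_duration": {"sum": {"field": sum_field}}},
                    }
                },
            }
        },
    }


print(json.dumps(build_query(), indent=2))
```

The printed JSON can be pasted after GET test/_search in the Kibana console.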
