gpt4 book ai didi

kibana - Elastic Search上日期直方图和日期范围之间的不同结果

转载 作者:行者123 更新时间:2023-12-02 23:34:59 25 4
gpt4 key购买 nike

我想使用Elastic Search / Kibana分析我的日志数据,并按月统计唯一身份客户。
当我使用日期直方图聚合和日期范围聚合时,结果会有所不同。

这是日期直方图查询:

"query": {
"query_string": {
"query": "_type:logs AND created_at:[2015-04-01 TO now]",
"analyze_wildcard": true
}
},
"size": 0,
"aggs": {
"2": {
"date_histogram": {
"field": "created_at",
"interval": "1M",
"min_doc_count": 1
},
"aggs": {
"1": {
"cardinality": {
"field": "customer.id"
}
}
}
}
}

结果:
"aggregations": {
"2": {
"buckets": [
{
"1": {
"value": 595805
},
"key_as_string": "2015-04-01T00:00:00.000Z",
"key": 1427839200000,
"doc_count": 6410438
},
{
"1": {
"value": 647788
},
"key_as_string": "2015-05-01T00:00:00.000Z",
"key": 1430431200000,
"doc_count": 6669555
},...

这是日期范围查询:
"query": {
"query_string": {
"query": "_type:logs AND created_at:[2015-04-01 TO now]",
"analyze_wildcard": true
}
},
"size": 0,
"aggs": {
"2": {
"date_range": {
"field": "created_at",
"ranges": [
{
"from": "2015-04-01",
"to": "2015-05-01"
},
{
"from": "2015-05-01",
"to": "2015-06-01"
}
]
},
"aggs": {
"1": {
"cardinality": {
"field": "customer.id"
}
}
}
}
}

和响应:
"aggregations": {
"2": {
"buckets": [
{
"1": {
"value": 592179
},
"key": "2015-04-01T00:00:00.000Z-2015-05-01T00:00:00.000Z",
"from": 1427846400000,
"from_as_string": "2015-04-01T00:00:00.000Z",
"to": 1430438400000,
"to_as_string": "2015-05-01T00:00:00.000Z",
"doc_count": 6411884
},
{
"1": {
"value": 616995
},
"key": "2015-05-01T00:00:00.000Z-2015-06-01T00:00:00.000Z",
"from": 1430438400000,
"from_as_string": "2015-05-01T00:00:00.000Z",
"to": 1433116800000,
"to_as_string": "2015-06-01T00:00:00.000Z",
"doc_count": 6668060
}
]
}
}

在第一种情况下,我4月有595,805,5月有647,788
在第二种情况下,我四月份有592,179,五月份有616,995

有人可以解释我为什么这些用例之间有这些区别?

谢谢

我更新了我的第一篇文章以添加另一个示例

我添加了另一个示例,该示例中的数据较少(在1天之内),但存在相同的问题。这是第一个带有日期直方图的请求:
{
"size": 0,
"query": {
"query_string": {
"query": "_type:logs AND logs.created_at:[2015-04-01 TO 2015-04-01]",
"analyze_wildcard": true
}
},
"aggs": {
"2": {
"date_histogram": {
"field": "created_at",
"interval": "1h",
"pre_zone": "00:00",
"pre_zone_adjust_large_interval": true,
"min_doc_count": 1
},
"aggs": {
"1": {
"cardinality": {
"field": "customer.id"
}
}
}
}
}
}

我们可以看到第一个小时中有660个唯一计数和1717个文档计数:
{  
"hits":{
"total":203961,
"max_score":0,
"hits":[

]
},
"aggregations":{
"2":{
"buckets":[
{
"1":{
"value":660
},
"key_as_string":"2015-04-01T00:00:00.000Z",
"key":1427846400000,
"doc_count":1717
},
{
"1":{
"value":324
},
"key_as_string":"2015-04-01T01:00:00.000Z",
"key":1427850000000,
"doc_count":776
},
{
"1":{
"value":190
},
"key_as_string":"2015-04-01T02:00:00.000Z",
"key":1427853600000,
"doc_count":481
}
]
}
}
}

但是在第二个请求中,日期范围为:
{
"size": 0,
"query": {
"query_string": {
"query": "_type:logs AND logs.created_at:[2015-04-01 TO 2015-04-01]",
"analyze_wildcard": true
}
},
"aggs": {
"2": {
"date_range": {
"field": "created_at",
"ranges": [
{
"from": "2015-04-01T00:00:00",
"to": "2015-04-01T01:00:00"
},
{
"from": "2015-04-01T01:00:00",
"to": "2015-04-01T02:00:00"
}
]
},
"aggs": {
"1": {
"cardinality": {
"field": "customer.id"
}
}
}
}
}
}

我们只能看到633个唯一计数和1717个文档计数:
{  
"hits":{
"total":203961,
"max_score":0,
"hits":[

]
},
"aggregations":{
"2":{
"buckets":[
{
"1":{
"value":633
},
"key":"2015-04-01T00:00:00.000Z-2015-04-01T01:00:00.000Z",
"from":1427846400000,
"from_as_string":"2015-04-01T00:00:00.000Z",
"to":1427850000000,
"to_as_string":"2015-04-01T01:00:00.000Z",
"doc_count":1717
},
{
"1":{
"value":328
},
"key":"2015-04-01T01:00:00.000Z-2015-04-01T02:00:00.000Z",
"from":1427850000000,
"from_as_string":"2015-04-01T01:00:00.000Z",
"to":1427853600000,
"to_as_string":"2015-04-01T02:00:00.000Z",
"doc_count":776
}
]
}
}
}

请有人可以告诉我为什么?谢谢

最佳答案

当使用date_histogram聚合时,您需要考虑timezone,而date_range并不如此,因为它始终使用GMT时区。

如果查看结果中的长毫秒值,则会看到以下内容:

对于您的日期直方图,from: 1427839200000实际上等于2015-03-31T22:00:00.000Z,它不同于根据GMT时区格式化的key_as_string值(即2015-04-01T00:00:00.000Z)。

在您的第一次聚合中,尝试显式指定time_zone参数为您的当前时区(显然是GMT + 2),您应该获得相同的结果:

  "date_histogram": {
"field": "created_at",
"interval": "1M",
"min_doc_count": 1,
"time_zone": -2
},

关于kibana - Elastic Search上日期直方图和日期范围之间的不同结果,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32440855/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com