
elasticsearch - How to group by only the date part of a datetime field in Elasticsearch

Reposted. Author: 行者123. Last updated: 2023-12-02 22:45:11

I am using Elasticsearch to store and retrieve data.

curl http://localhost:9200/test/test -X POST -H "Content-type: application/json" -d '{"id":1, "created_at": "2015-03-02T12:00:00", "name": "test1"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":2, "created_at": "2015-03-03T12:00:00", "name": "test2"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:00:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:01:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:02:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":4, "created_at": "2015-03-02T12:00:00", "name": "test4"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":5, "created_at": "2015-03-02T12:00:00", "name": "test5"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":6, "created_at": "2015-03-03T12:00:00", "name": "test6"}'

When I group by created_at, it works fine:

curl http://localhost:9200/test/test/_search -X POST -d '{"size": "0", "aggs": {"group_by_created_at":{"terms":{"field": "created_at"}}}}' | python -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 570 100 490 100 80 69900 11412 --:--:-- --:--:-- --:--:-- 81666
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {
                    "doc_count": 3,
                    "key": 1425297600000,
                    "key_as_string": "2015-03-02"
                },
                {
                    "doc_count": 3,
                    "key": 1425384000000,
                    "key_as_string": "2015-03-03"
                },
                {
                    "doc_count": 1,
                    "key": 1425384060000,
                    "key_as_string": "2015-03-03T12:01:00.000Z"
                },
                {
                    "doc_count": 1,
                    "key": 1425384120000,
                    "key_as_string": "2015-03-03T12:02:00.000Z"
                }
            ]
        }
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 8
    },
    "timed_out": false,
    "took": 3
}

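In the example above, the records dated 2015-03-03 fall into three separate buckets because their times differ. I want them counted together under a single date.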

The output I want looks like this:

{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {
                    "doc_count": 3,
                    "key": 1425297600000,
                    "key_as_string": "2015-03-02"
                },
                {
                    "doc_count": 5,
                    "key": 1425384000000,
                    "key_as_string": "2015-03-03"
                }
            ]
        }
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 8
    },
    "timed_out": false,
    "took": 3
}

I tried using a range aggregation:

curl http://localhost:9200/test/test/_search -X POST -d '{"size": "0", "aggs": {"group_by_created_at":{"range":{"field": "created_at", "ranges": [{"gte": "2015-03-02T00:00:00", "lte": "2015-03-02T23:59:59"}, {"gte": "2015-03-03T00:00:00", "lte": "2015-03-03T23:59:59"}]}}}}' | python -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 446 100 230 100 216 37581 35294 --:--:-- --:--:-- --:--:-- 38333
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {
                    "doc_count": 8,
                    "key": "*-*"
                },
                {
                    "doc_count": 8,
                    "key": "*-*"
                }
            ]
        }
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 8
    },
    "timed_out": false,
    "took": 2
}

But it reports all 8 documents in both buckets. If I use the same range in a filtered query, it works fine:

curl http://localhost:9200/test/test/_search -X POST -d '{"query": {"filtered": {"filter":{"range":{"created_at" : {"gte": "2015-03-03T00:00:00", "lte": "2015-03-03T23:59:59"}}}}}}}' | python -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 994 100 870 100 124 110k 16105 --:--:-- --:--:-- --:--:-- 106k
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "mJs0WKiPTByQ6dLwJnKO8Q",
                "_index": "test",
                "_score": 1.0,
                "_source": {
                    "created_at": "2015-03-03T12:00:00",
                    "id": 2,
                    "name": "test2"
                },
                "_type": "test"
            },
            {
                "_id": "49a3pQX2TYa_KV029c0NLQ",
                "_index": "test",
                "_score": 1.0,
                "_source": {
                    "created_at": "2015-03-03T12:02:00",
                    "id": 3,
                    "name": "test3"
                },
                "_type": "test"
            },
            {
                "_id": "qWtAgCwSR_CTKsV1ibYVMg",
                "_index": "test",
                "_score": 1.0,
                "_source": {
                    "created_at": "2015-03-03T12:01:00",
                    "id": 3,
                    "name": "test3"
                },
                "_type": "test"
            },
            {
                "_id": "VoxSH6tXQmuugOVOmmrD2g",
                "_index": "test",
                "_score": 1.0,
                "_source": {
                    "created_at": "2015-03-03T12:00:00",
                    "id": 6,
                    "name": "test6"
                },
                "_type": "test"
            },
            {
                "_id": "oQmTxr5YRFaa3q7bvFOQLg",
                "_index": "test",
                "_score": 1.0,
                "_source": {
                    "created_at": "2015-03-03T12:00:00",
                    "id": 3,
                    "name": "test3"
                },
                "_type": "test"
            }
        ],
        "max_score": 1.0,
        "total": 5
    },
    "timed_out": false,
    "took": 2
}

I'm missing something, and I don't know what :(
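
A possible explanation for the `*-*` buckets, offered as an assumption rather than something stated in the original post: the range-style aggregations take their bounds as `from` and `to`, not the `gte` and `lte` keys used by the range filter, so each range above appears to have been treated as unbounded and matched all 8 documents. Below is a minimal sketch of the same two day-long ranges written as a `date_range` aggregation. It assumes the `test` index and the `created_at` field from the question; `to` is exclusive, so each range ends at the following midnight. The names `BODY_RANGE` and `search_by_range` are hypothetical.

```shell
# Request body: two explicit one-day ranges on created_at.
# "from" is inclusive and "to" is exclusive in a date_range aggregation.
BODY_RANGE='{
  "size": 0,
  "aggs": {
    "group_by_created_at": {
      "date_range": {
        "field": "created_at",
        "ranges": [
          {"from": "2015-03-02T00:00:00", "to": "2015-03-03T00:00:00"},
          {"from": "2015-03-03T00:00:00", "to": "2015-03-04T00:00:00"}
        ]
      }
    }
  }
}'

# Hypothetical helper: sends the body to the index used in the question.
search_by_range() {
  curl -s http://localhost:9200/test/test/_search -X POST \
    -H "Content-type: application/json" -d "$BODY_RANGE" | python -m json.tool
}

# Call search_by_range once the node from the question is running.
```

Every day has to be listed by hand with this approach, so it is mainly useful for a small, fixed set of ranges.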

Best answer

There is a date_histogram aggregation that groups on any given time interval. To group by date, you would use:

"date_histogram": {
    "field": "created_at",
    "interval": "1d"
}
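
For context, the fragment above goes inside the aggs section of the search request from the question. The following is a minimal sketch, assuming the same `test` index, the aggregation name `group_by_created_at` from the question, and an Elasticsearch version that still accepts `interval` (newer releases replace it with `calendar_interval`, for example `"calendar_interval": "day"`). The `format` setting is an optional addition that is not part of the original answer; it renders key_as_string as a plain date. The names `BODY` and `search_by_day` are hypothetical.

```shell
# Request body: one bucket per calendar day of created_at.
# "format" is an optional extra (an assumption, not in the original answer).
BODY='{
  "size": 0,
  "aggs": {
    "group_by_created_at": {
      "date_histogram": {
        "field": "created_at",
        "interval": "1d",
        "format": "yyyy-MM-dd"
      }
    }
  }
}'

# Hypothetical helper: sends the body to the index used in the question.
search_by_day() {
  curl -s http://localhost:9200/test/test/_search -X POST \
    -H "Content-type: application/json" -d "$BODY" | python -m json.tool
}

# Call search_by_day once the node from the question is running.
```

With the eight documents from the question, this should produce two buckets: 2015-03-02 with 3 documents and 2015-03-03 with 5 documents.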

Regarding elasticsearch - How to group by only the date part of a datetime field in Elasticsearch, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28895627/
