elasticsearch - How to sort scripted aggregations?

Reposted. Author: 行者123. Updated: 2023-12-02 23:48:47

I need to use the format property with a terms aggregation on a date field, but I am facing 2 main problems:

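  • With a normal terms aggregation: if the format covers a long period such as a month or a year, I can get 2 buckets with the same key_as_string. For example, with the MMMM format I get:

        {
          "key": 1427922000000,
          "key_as_string": "April",
          "doc_count": 20
        },
        {
          "key": 1428094800000,
          "key_as_string": "April",
          "doc_count": 20
        }

  • I cannot use the format property for terms inside composite sources; it raises an error.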
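
The first problem can be illustrated outside Elasticsearch with a minimal Python sketch. It assumes that a terms aggregation on a date field buckets documents by their exact epoch-millisecond value and that format only changes how each key is displayed. The helper name month_label is hypothetical, and UTC is assumed for rendering.

```python
from datetime import datetime, timezone

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def month_label(epoch_millis: int) -> str:
    """Render an epoch-millisecond key as a full month name (UTC assumed)."""
    dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return MONTHS[dt.month - 1]

# The two bucket keys from the response quoted in the question
keys = [1427922000000, 1428094800000]
labels = [month_label(k) for k in keys]
print(labels)  # two distinct keys that share one display label
```

The keys differ, so they stay in separate buckets, yet both render as the same month name.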

So I work around this with a custom script that converts the dates into strings:
    "terms": {
      "script": {
        "source": """
          // check if document has this field to avoid errors
          if (!doc.containsKey(params.field) || doc[params.field].toString() == "[]") return "";
          // Get each field value as string
          String datetime = doc[params.field].value.toString();
          // Parse datetime into ZonedDateTime to use format function
          ZonedDateTime zdt = ZonedDateTime.parse(datetime);
          // Create format object based on user option
          DateTimeFormatter formatter = DateTimeFormatter.ofPattern(params.format);
          // return formatted date
          return zdt.format(formatter);
        """,
        "params": {
          "field": "Order Date",
          "format": "MMMM"
        }
      },
      "order": {
        "_key": "asc"
      }
    }

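It works as expected, but I have a problem with _key ordering: Elasticsearch treats each bucket key as a string, so it orders the months alphabetically, like this: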
    {
      "key": "April",
      "doc_count": 90
    },
    {
      "key": "August",
      "doc_count": 46
    },
    {
      "key": "December",
      "doc_count": 61
    },
    {
      "key": "February",
      "doc_count": 67
    }

My question is: how can I customize the ordering for aggs, or return the script value as key_as_string, so that Elasticsearch sorts by key?

Best answer

I solved this by adding a nested min aggregation on the same field and telling Elasticsearch to order the buckets by that min value.

The point of using min is to get a date value that represents each formatted bucket, so that Elasticsearch can sort the buckets by that date.
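
The idea can be sketched in plain Python, independent of Elasticsearch: group the values by their formatted label, keep the earliest value per group, and sort the groups by it. This is only an illustration under the assumption that the terms order and the min sub-aggregation behave as described above. The function name order_buckets_by_min and the sample list are hypothetical, and UTC is assumed.

```python
from collections import defaultdict
from datetime import datetime, timezone

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def order_buckets_by_min(timestamps_ms):
    """Group epoch-millisecond values by month name, then order the
    groups by their earliest value (mimicking order-by-min)."""
    earliest = {}
    counts = defaultdict(int)
    for ts in timestamps_ms:
        dt = datetime.fromtimestamp(ts / 1000, tz=timezone.utc)
        name = MONTHS[dt.month - 1]
        counts[name] += 1
        earliest[name] = min(ts, earliest.get(name, ts))
    return [
        {"key": name, "doc_count": counts[name], "SORT_BY_DATE": earliest[name]}
        for name in sorted(earliest, key=earliest.get)
    ]

# Hypothetical sample: two April values, one January, one February
sample = [1427922000000, 1420149600000, 1428094800000, 1422828000000]
ordered = order_buckets_by_min(sample)
print([b["key"] for b in ordered])
```

One caveat that follows from this logic: if the data spans more than one year, each label's earliest value may fall in a different year, so the resulting order reflects first appearance rather than calendar position.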

    {
      "size": 0,
      "aggs": {
        "Order Date": {
          "terms": {
            "script": {
              "source": """
                // check if document has this field to avoid errors
                if (doc.containsKey(params.field) && doc[params.field].toString() != "[]") {
                  // Get each field value as string
                  String datetime = doc[params.field].value.toString();
                  // Parse datetime into ZonedDateTime to use format function
                  ZonedDateTime zdt = ZonedDateTime.parse(datetime);
                  // Create format object based on user option
                  DateTimeFormatter formatter = DateTimeFormatter.ofPattern(params.format);
                  // return formatted date
                  return zdt.format(formatter);
                }
              """,
              "params": {
                "field": "Order Date",
                "format": "MMMM"
              }
            },
            "size": 10000,
            "order": {
              "SORT_BY_DATE": "asc"
            }
          },
          "aggs": {
            "SORT_BY_DATE": {
              "min": {
                "field": "Order Date"
              }
            }
          }
        }
      }
    }

This produces:
    {
      "key": "January",
      "doc_count": 64,
      "SORT_BY_DATE": {
        "value": 1420149600000,
        "value_as_string": "2015-01-01T22:00:00.000Z"
      }
    },
    {
      "key": "February",
      "doc_count": 67,
      "SORT_BY_DATE": {
        "value": 1422828000000,
        "value_as_string": "2015-02-01T22:00:00.000Z"
      }
    },
    {
      "key": "March",
      "doc_count": 32,
      "SORT_BY_DATE": {
        "value": 1425247200000,
        "value_as_string": "2015-03-01T22:00:00.000Z"
      }
    },
    {
      "key": "April",
      "doc_count": 90,
      "SORT_BY_DATE": {
        "value": 1427922000000,
        "value_as_string": "2015-04-01T21:00:00.000Z"
      }
    },
    {
      "key": "May",
      "doc_count": 85,
      "SORT_BY_DATE": {
        "value": 1430514000000,
        "value_as_string": "2015-05-01T21:00:00.000Z"
      }
    }

I then simply ignore the value of the SORT_BY_DATE aggregation in the response, as if it were not there.
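
If the helper value should not reach downstream code at all, it can be dropped on the client side after the response is parsed. The following is a minimal sketch that works on plain dictionaries shaped like the buckets above; the function name strip_sort_helper is hypothetical and no Elasticsearch client API is involved.

```python
def strip_sort_helper(buckets, helper="SORT_BY_DATE"):
    """Return copies of the buckets without the helper sub-aggregation."""
    return [{k: v for k, v in b.items() if k != helper} for b in buckets]

# Two buckets copied from the response shown in the answer
buckets = [
    {"key": "January", "doc_count": 64,
     "SORT_BY_DATE": {"value": 1420149600000,
                      "value_as_string": "2015-01-01T22:00:00.000Z"}},
    {"key": "February", "doc_count": 67,
     "SORT_BY_DATE": {"value": 1422828000000,
                      "value_as_string": "2015-02-01T22:00:00.000Z"}},
]
cleaned = strip_sort_helper(buckets)
print(cleaned)
```

The bucket order is preserved and the original list is left unchanged, so the sort established by the server survives the cleanup.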

Regarding "elasticsearch - How to sort scripted aggregations?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59244896/
