
Elasticsearch query with grouping

Reposted. Author: 行者123. Updated: 2023-12-02 22:21:23

I have a database of products. Each product consists of the fields uuid, group_id, title, since, and till.
since and till define the availability interval.

The intervals [since, till] are pairwise disjoint within each group_id, so no two products in the same group have intersecting intervals.

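I need to fetch a list of products that satisfies the following conditions:

  • The list contains at most 1 product per group
  • Every product matches the given title
  • Every product is current (since <= NOW <= till); if its group has no current product, it should be the nearest future product instead (min(since) among products with since >= NOW)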

ES mapping:
    {
      "products": {
        "mappings": {
          "dynamic": "false",
          "properties": {
            "group_id": {
              "type": "long",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "title": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "since": {
              "type": "date",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "till": {
              "type": "date",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }

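    Is it possible to write such a query in Elasticsearch?

    Best answer

    Based on your mapping, I have created sample documents, a query, and its response, as shown below:

    Sample documents: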

    POST product_index/_doc/1
    {
      "group_id": 1,
      "title": "nike",
      "since": "2020-01-01",
      "till": "2020-03-31"
    }

    POST product_index/_doc/2
    {
      "group_id": 2,
      "title": "nike",
      "since": "2020-01-01",
      "till": "2020-03-31"
    }

    POST product_index/_doc/3
    {
      "group_id": 3,
      "title": "nike",
      "since": "2020-03-15",
      "till": "2020-03-31"
    }

    POST product_index/_doc/4
    {
      "group_id": 3,
      "title": "nike",
      "since": "2020-03-19",
      "till": "2020-03-31"
    }

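    As the documents above show, there are 4 documents in total: groups 1 and 2 have one document each, while group 3 has two documents, both with since >= now.
    Query request:

    The query is summarized below: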
    Bool
      - Must
        - Match title as nike
        - Should
          - clause 1 - since <= now <= till
          - clause 2 - now <= since
    Agg
      - Terms on GroupId
        - Top Hits (return only the first document, since you want at most one per group, sorted by since in ascending order)
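The outline above can be mimicked in plain Python to check the selection rule without a cluster. This is only a sketch under stated assumptions: `pick_per_group` and `SAMPLE_DOCS` are hypothetical names, the title check is a crude stand-in for the `match` query (lowercased whitespace tokens), and dates are compared at day granularity rather than against Elasticsearch's millisecond `now`.

```python
from datetime import date

# The four sample documents from this answer.
SAMPLE_DOCS = [
    {"group_id": 1, "title": "nike", "since": "2020-01-01", "till": "2020-03-31"},
    {"group_id": 2, "title": "nike", "since": "2020-01-01", "till": "2020-03-31"},
    {"group_id": 3, "title": "nike", "since": "2020-03-15", "till": "2020-03-31"},
    {"group_id": 3, "title": "nike", "since": "2020-03-19", "till": "2020-03-31"},
]


def pick_per_group(products, title, today):
    """Filter like the bool query, then keep the lowest `since` per group."""
    best = {}
    for product in products:
        # Must: title matches (assumption: simple token comparison).
        if title.lower() not in product["title"].lower().split():
            continue
        since = date.fromisoformat(product["since"])
        till = date.fromisoformat(product["till"])
        is_current = since <= today <= till  # Should, clause 1
        is_future = since >= today           # Should, clause 2
        if not (is_current or is_future):
            continue
        # Terms + Top Hits(size 1, since asc): keep the earliest `since`.
        kept = best.get(product["group_id"])
        if kept is None or since < date.fromisoformat(kept["since"]):
            best[product["group_id"]] = product
    return best


if __name__ == "__main__":
    for group_id, product in sorted(pick_per_group(SAMPLE_DOCS, "nike", date(2020, 3, 10)).items()):
        print(group_id, product["since"], product["till"])
```

Because intervals within a group are disjoint and expired products are filtered out, the earliest remaining `since` is the current product when one exists and the nearest future product otherwise.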

    Below is the actual query. The first should clause matches current products (since <= now <= till) and the second matches future products (since >= now). The top_hits size of 1 returns at most one document per group, and the ascending sort on since makes that document the one with the lowest since:
    POST product_index/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "title": "nike"
              }
            },
            {
              "bool": {
                "should": [
                  {
                    "bool": {
                      "must": [
                        { "range": { "till": { "gte": "now" } } },
                        { "range": { "since": { "lte": "now" } } }
                      ]
                    }
                  },
                  {
                    "bool": {
                      "must": [
                        { "range": { "since": { "gte": "now" } } }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      "aggs": {
        "my_groups": {
          "terms": {
            "field": "group_id.keyword",
            "size": 10
          },
          "aggs": {
            "my_docs": {
              "top_hits": {
                "size": 1,
                "sort": [
                  { "since": { "order": "asc" } }
                ]
              }
            }
          }
        }
      }
    }

    Note that I have used a Terms Aggregation with Top Hits as its sub-aggregation.
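If the title or the number of groups needs to vary, the same request body could be assembled in code. The sketch below is an assumption-laden illustration, not part of the original answer: `build_search_body` and `max_groups` are hypothetical names, the single-clause inner bool is flattened to a bare range, and `minimum_should_match` is spelled out even though 1 is already the default for a bool that contains only should clauses. Sending the body to a cluster is left out.

```python
import json


def build_search_body(title, max_groups=10):
    """Build the search body from this answer as a plain dict."""
    # Clause 1: since <= now <= till
    current = {
        "bool": {
            "must": [
                {"range": {"till": {"gte": "now"}}},
                {"range": {"since": {"lte": "now"}}},
            ]
        }
    }
    # Clause 2: since >= now
    upcoming = {"range": {"since": {"gte": "now"}}}
    return {
        "size": 0,  # only the aggregation result is needed
        "query": {
            "bool": {
                "must": [
                    {"match": {"title": title}},
                    {"bool": {"should": [current, upcoming], "minimum_should_match": 1}},
                ]
            }
        },
        "aggs": {
            "my_groups": {
                # The terms size caps how many groups come back.
                "terms": {"field": "group_id.keyword", "size": max_groups},
                "aggs": {
                    "my_docs": {
                        "top_hits": {"size": 1, "sort": [{"since": {"order": "asc"}}]}
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("nike"), indent=2))
```

Keep in mind that the terms aggregation returns at most `size` buckets, so with more than 10 groups the value in the original query would have to be raised.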

    Response:
    {
      "took" : 7,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 4,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "my_groups" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "3",
              "doc_count" : 2,
              "my_docs" : {
                "hits" : {
                  "total" : {
                    "value" : 2,
                    "relation" : "eq"
                  },
                  "max_score" : null,
                  "hits" : [
                    {
                      "_index" : "product_index",
                      "_type" : "_doc",
                      "_id" : "3",
                      "_score" : null,
                      "_source" : {
                        "group_id" : 3,
                        "title" : "nike",
                        "since" : "2020-03-15",
                        "till" : "2020-03-31"
                      },
                      "sort" : [
                        1584230400000
                      ]
                    }
                  ]
                }
              }
            },
            {
              "key" : "1",
              "doc_count" : 1,
              "my_docs" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : null,
                  "hits" : [
                    {
                      "_index" : "product_index",
                      "_type" : "_doc",
                      "_id" : "1",
                      "_score" : null,
                      "_source" : {
                        "group_id" : 1,
                        "title" : "nike",
                        "since" : "2020-01-01",
                        "till" : "2020-03-31"
                      },
                      "sort" : [
                        1577836800000
                      ]
                    }
                  ]
                }
              }
            },
            {
              "key" : "2",
              "doc_count" : 1,
              "my_docs" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : null,
                  "hits" : [
                    {
                      "_index" : "product_index",
                      "_type" : "_doc",
                      "_id" : "2",
                      "_score" : null,
                      "_source" : {
                        "group_id" : 2,
                        "title" : "nike",
                        "since" : "2020-01-01",
                        "till" : "2020-03-31"
                      },
                      "sort" : [
                        1577836800000
                      ]
                    }
                  ]
                }
              }
            }
          ]
        }
      }
    }
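Since the products sit a few levels deep inside the aggregation, a small helper can flatten the response into one product per group. This is a hedged sketch: `products_from_response` and `SAMPLE_RESPONSE` are hypothetical names, the sample is a trimmed copy of the response above, and only the aggregation names used in this answer (`my_groups`, `my_docs`) are assumed.

```python
def _bucket(key, source):
    """Build a trimmed bucket shaped like the ones in the response above."""
    return {
        "key": key,
        "my_docs": {"hits": {"hits": [{"_id": key, "_source": source}]}},
    }


# Trimmed copy of the response shown above (metadata fields omitted).
SAMPLE_RESPONSE = {
    "aggregations": {
        "my_groups": {
            "buckets": [
                _bucket("3", {"group_id": 3, "title": "nike", "since": "2020-03-15", "till": "2020-03-31"}),
                _bucket("1", {"group_id": 1, "title": "nike", "since": "2020-01-01", "till": "2020-03-31"}),
                _bucket("2", {"group_id": 2, "title": "nike", "since": "2020-01-01", "till": "2020-03-31"}),
            ]
        }
    }
}


def products_from_response(response):
    """Return the single top hit of every group bucket, in bucket order."""
    products = []
    for bucket in response["aggregations"]["my_groups"]["buckets"]:
        hits = bucket["my_docs"]["hits"]["hits"]
        if hits:  # top_hits was requested with size 1
            products.append(hits[0]["_source"])
    return products


if __name__ == "__main__":
    for product in products_from_response(SAMPLE_RESPONSE):
        print(product["group_id"], product["since"], product["till"])
```

The buckets arrive ordered by document count by default, so sort the returned list yourself if a particular group order matters.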

    Let me know if this helps!

    For this Elasticsearch grouping query, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60621420/
