
elasticsearch - How to get the average count of missing fields per document with Elasticsearch?

Reposted. Author: 行者123. Updated: 2023-12-02 23:22:57

In short: using Elasticsearch, and given a list of fields, how can I get the average number of missing fields per document?

Details

With the missing aggregation type, I can get the total number of documents that lack a given field. So, given the following data:

"hits": [{
  "name": "A name",
  "nickname": "A nickname",
  "bestfriend": "A friend",
  "hobby": "An hobby"
},{
  "name": "A name",
  "hobby": "An hobby"
},{
  "name": "A name",
  "nickname": "A nickname",
  "hobby": "An hobby"
},{
  "name": "A name",
  "bestfriend": "A friend"
}]

I can run the following query:
{
  "aggs": {
    "name_missing": {
      "missing": {"field": "name"}
    },
    "nickname_missing": {
      "missing": {"field": "nickname"}
    },
    "hobby_missing": {
      "missing": {"field": "hobby"}
    },
    "bestfriend_missing": {
      "missing": {"field": "bestfriend"}
    }
  }
}

And I get the following aggregations:
...
"aggregations": {
  "name_missing": {
    "doc_count": 0
  },
  "nickname_missing": {
    "doc_count": 2
  },
  "hobby_missing": {
    "doc_count": 1
  },
  "bestfriend_missing": {
    "doc_count": 2
  }
}
...

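What I need now is the average number of missing fields per document. I can get it in client code by doing the math on the results:
  • sum the doc_count of all the missing aggregations
  • divide by the total number of hits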
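
The client-side arithmetic described above could look like the following Python sketch. The helper name average_missing_fields is hypothetical, and the "aggregations" dictionary is built locally from the sample documents as a stand-in for a real search response, so no Elasticsearch client is involved.

```python
from typing import Dict, List, Mapping


def average_missing_fields(
    aggregations: Mapping[str, Mapping[str, int]], total_hits: int
) -> float:
    """Sum the doc_count of every `missing` aggregation and divide by the hit count."""
    if total_hits == 0:
        return 0.0  # avoid dividing by zero on an empty result set
    total_missing = sum(agg["doc_count"] for agg in aggregations.values())
    return total_missing / total_hits


# Sample documents from the question, used here instead of a live index.
docs: List[Dict[str, str]] = [
    {"name": "A name", "nickname": "A nickname",
     "bestfriend": "A friend", "hobby": "An hobby"},
    {"name": "A name", "hobby": "An hobby"},
    {"name": "A name", "nickname": "A nickname", "hobby": "An hobby"},
    {"name": "A name", "bestfriend": "A friend"},
]
fields = ["name", "nickname", "hobby", "bestfriend"]

# Stand-in for the "aggregations" section of the response:
# one doc_count per field, counting the documents that lack it.
aggregations = {
    f"{field}_missing": {"doc_count": sum(1 for doc in docs if field not in doc)}
    for field in fields
}

average = average_missing_fields(aggregations, total_hits=len(docs))
print(f"average missing fields per document: {average}")
```

With a real response, the same function would take the parsed "aggregations" object and the total hit count as they are returned by the search.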

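But how would you obtain the same result directly from an Elasticsearch aggregation?

Thanks for any answers or suggestions.

Best answer

This is an ugly solution, but it does the job.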

GET missing/missing/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": {
        "script": "'aaa'"
      },
      "aggs": {
        "name_missing": {
          "missing": {
            "field": "name"
          }
        },
        "nickname_missing": {
          "missing": {
            "field": "nickname"
          }
        },
        "hobby_missing": {
          "missing": {
            "field": "hobby"
          }
        },
        "bestfriend_missing": {
          "missing": {
            "field": "bestfriend"
          }
        },
        "avg_missing": {
          "bucket_script": {
            "buckets_path": {
              "name_missing": "name_missing._count",
              "nickname_missing": "nickname_missing._count",
              "hobby_missing": "hobby_missing._count",
              "bestfriend_missing": "bestfriend_missing._count",
              "count": "_count"
            },
            "script": "(name_missing+nickname_missing+hobby_missing+bestfriend_missing)/count"
          }
        }
      }
    }
  }
}

The buckets_path section works like a set of variable definitions. name_missing._count takes the doc_count of the name_missing aggregation, and the same goes for the others (nickname_missing, hobby_missing, bestfriend_missing). "count": "_count" takes the doc_count of the documents the aggregation runs on, that is, the total number of hits. The script then adds up all the missing counts and divides by the total number of hits, as you require.

The terms aggregation on the constant script 'aaa' is the ugly part: it puts every document into a single bucket, which gives bucket_script the multi-bucket parent aggregation it needs.

I have shown you how to do it; it is now up to you to adjust the parameters and extract what you need.
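
One way to adjust the query for a different list of fields is to generate the request body rather than write it by hand. The sketch below only builds a Python dictionary that mirrors the query above; build_avg_missing_query and its var_prefix parameter are hypothetical names, and it assumes the field names are simple identifiers that can be used as script variables.

```python
import json
from typing import Any, Dict, List


def build_avg_missing_query(fields: List[str], var_prefix: str = "") -> Dict[str, Any]:
    """Build the single-bucket terms + missing + bucket_script request body.

    var_prefix is prepended to every script variable. The answer uses bare
    names (empty prefix); versions whose bucket_script runs on Painless
    expect variables to be read through "params.", so pass "params." there.
    """
    # One `missing` sub-aggregation per field, named "<field>_missing".
    sub_aggs: Dict[str, Any] = {
        f"{field}_missing": {"missing": {"field": field}} for field in fields
    }

    # Map each script variable to the doc_count of its sub-aggregation,
    # plus "count" for the doc_count of the enclosing bucket.
    buckets_path = {
        f"{field}_missing": f"{field}_missing._count" for field in fields
    }
    buckets_path["count"] = "_count"

    numerator = "+".join(f"{var_prefix}{field}_missing" for field in fields)
    sub_aggs["avg_missing"] = {
        "bucket_script": {
            "buckets_path": buckets_path,
            "script": f"({numerator})/{var_prefix}count",
        }
    }

    return {
        "size": 0,
        "aggs": {
            # Constant script: every document lands in one bucket.
            "result": {"terms": {"script": "'aaa'"}, "aggs": sub_aggs}
        },
    }


query = build_avg_missing_query(["name", "nickname", "hobby", "bestfriend"])
print(json.dumps(query, indent=2))
```

The printed JSON can be pasted as the body of the search request shown above; the function does not send anything to a cluster.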

Regarding "elasticsearch - How to get the average count of missing fields per document with Elasticsearch?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46808140/
