
elasticsearch - How to include all documents in an Elasticsearch aggregation and avoid sum_other_doc_count > 0

Reposted. Author: 行者123. Updated: 2023-12-03 02:25:44

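Elasticsearch is not the main part of my work, and there is one behavior I have not been able to fix. I have a fairly simple aggregation query: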

GET /my_index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "request_type": "some_type"
          }
        },
        {
          "match": {
            "carrier_name.keyword": "some_carrier"
          }
        }
      ]
    }
  },
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date",
        "order": {
          "_term": "asc"
        }
      },
      "aggs": {
        "carrier_total": {
          "sum": {
            "field": "total_count"
          }
        }
      }
    }
  }
}

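My understanding of https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html is that not all documents are necessarily included in the aggregation. Indeed, depending on the query part, I do see "sum_other_doc_count" in the result with a value greater than zero.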
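To make it concrete where a non-zero sum_other_doc_count comes from, here is a minimal Python sketch that imitates a key-ascending terms aggregation on a single shard: only the first `size` buckets are returned, and the documents in the remaining buckets are only counted. The function name `terms_agg` and the sample dates are hypothetical and used purely for illustration; this is not Elasticsearch code.

```python
from collections import Counter

def terms_agg(values, size=10):
    """Imitate a key-ascending terms aggregation on one shard (illustration only)."""
    counts = Counter(values)
    # Order buckets by key ascending, like "order": {"_term": "asc"} in the query.
    ordered = sorted(counts.items())
    kept = ordered[:size]
    dropped = ordered[size:]
    return {
        "buckets": [{"key": key, "doc_count": count} for key, count in kept],
        # Documents that fell into buckets beyond the first `size` buckets.
        "sum_other_doc_count": sum(count for _, count in dropped),
    }

# Hypothetical data: 12 distinct dates, 2 documents per date.
dates = [f"2020-04-{day:02d}" for day in range(1, 13) for _ in range(2)]

default = terms_agg(dates)              # size defaults to 10
enlarged = terms_agg(dates, size=1000)  # large enough to hold every bucket
```

With the default size, the two latest dates are cut off and their four documents end up in `sum_other_doc_count`; with a size at least as large as the number of distinct dates, nothing is left over.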

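My question: is there a way to build the search so that all documents are included? The number of documents is small, usually fewer than 1k.

Thanks in advance,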
鲁汶

Best answer

According to the documentation:

size defaults to 10

from + size can not be more than the index.max_result_window index setting, which defaults to 10,000.



In your case the number of documents is small, close to 1k, so retrieving 1k results is easily feasible.

The size parameter can be set to define how many term buckets should be returned out of the overall terms list. By default, the node coordinating the search process will request each shard to provide its own top size term buckets and once all shards respond, it will reduce the results to the final list that will then be returned to the client.



So ask the terms aggregation on the date field to return up to 1000 term buckets:

...
"by_date": {
  "terms": {
    "field": "date",
    "order": {
      "_term": "asc"
    },
    "size": 1000
  }

...
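For anyone sending this from code, the change can be sketched in Python as below. This is a sketch under assumptions, not a definitive client: `build_search_body` and `all_docs_bucketed` are hypothetical helper names, and `sample_response` is a hand-written fragment shaped like a terms aggregation result rather than real output. The body itself is the question's query with the `size` from this answer added.

```python
def build_search_body(terms_size=1000):
    """Build the question's search body with an explicit terms `size`."""
    return {
        "size": 0,  # return no hits, only aggregations
        "query": {
            "bool": {
                "must": [
                    {"match": {"request_type": "some_type"}},
                    {"match": {"carrier_name.keyword": "some_carrier"}},
                ]
            }
        },
        "aggs": {
            "by_date": {
                "terms": {
                    "field": "date",
                    # Order key kept as in the question; recent versions call it "_key".
                    "order": {"_term": "asc"},
                    "size": terms_size,
                },
                "aggs": {"carrier_total": {"sum": {"field": "total_count"}}},
            }
        },
    }

def all_docs_bucketed(response, agg_name="by_date"):
    """True when no document was left outside the returned buckets."""
    return response["aggregations"][agg_name]["sum_other_doc_count"] == 0

# Hypothetical response fragment, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "by_date": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [],
        }
    }
}

body = build_search_body()
complete = all_docs_bucketed(sample_response)
```

The dictionary can be serialized as JSON and posted to /my_index/_search with whatever HTTP client is already in use; checking `sum_other_doc_count` afterwards confirms whether the chosen size was large enough.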

The larger the requested size, the more accurate the results will be, but the more expensive it is to compute the final result.
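The accuracy side of this trade-off can be illustrated with a small Python simulation of the per-shard step quoted above: each shard returns only its own top buckets, and the coordinating node merges them. This is a simplified sketch, assuming buckets are ordered by document count and that each shard returns exactly `size` buckets; real Elasticsearch has a separate `shard_size` setting that lets shards return more candidates. The helper names and the two-shard data are hypothetical.

```python
from collections import Counter

def shard_top(values, size):
    """Top `size` terms by document count on one shard."""
    return Counter(values).most_common(size)

def reduce_shards(shard_results, size):
    """Merge per-shard buckets and keep the overall top `size`."""
    merged = Counter()
    for buckets in shard_results:
        for key, count in buckets:
            merged[key] += count
    return merged.most_common(size)

# Hypothetical term values held by two shards.
shard_a = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
shard_b = ["c"] * 4 + ["d"] * 3 + ["a"] * 1

def run(size):
    partial = [shard_top(shard_a, size), shard_top(shard_b, size)]
    return dict(reduce_shards(partial, size))

small = run(2)     # each shard drops some of its terms
large = run(1000)  # every term from every shard is returned
```

With `size` 2, the single "a" on the second shard and the two "c" on the first shard never reach the coordinating node, so both counts come out too low; with a size that covers every term, the merged counts are exact, at the cost of shipping and merging more buckets.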

For more information, see the official doc.

Regarding "elasticsearch - How to include all documents in an Elasticsearch aggregation and avoid sum_other_doc_count > 0", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61106956/
