Elasticsearch: group by a document field and count occurrences

Reposted. Author: 行者123. Updated: 2023-12-02 22:54:44

The documents in my Elasticsearch 6.5.2 index look like this:

{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "cCYuHW4BvwH6Y3jL87ul",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "telecom"
  }
},
{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "cSYuHW4BvwH6Y3jL_Lvt",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "telecom"
  }
},
{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "eCb6O24BvwH6Y3jLP7tM",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "industry"
  }
}

I would like a query that returns this result:

"result": [
  {
    "querySearched" : "telecom",
    "number" : 2
  },
  {
    "querySearched" : "industry",
    "number" : 1
  }
]

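I just want to group by the searched term and get the number of occurrences of each one, limited to the ten largest counts. I tried using aggregations, but the buckets come back empty. Thanks!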

Best answer

Suppose your mapping is:

PUT /index
{
  "mappings": {
    "doc": {
      "properties": {
        "querySearched": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }
}

Your query should then look like this:

GET index/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": {
        "field": "querySearched",
        "size": 10
      }
    }
  }
}

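You need to set fielddata: true in the mapping to enable aggregations on a text field (more on that in the Elasticsearch fielddata documentation). In the query, the outer "size": 0 suppresses the search hits so that only the aggregation is returned, and the inner size caps the number of buckets: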

    "size": 10  => limits the result to the 10 terms with the highest counts
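
To make the group-and-count behavior concrete, here is a rough Python sketch of what the terms aggregation computes, run over the three sample documents from the question. It only imitates the bucket shape (key and doc_count) in memory and does not talk to Elasticsearch; the function name top_terms is made up for this illustration.

```python
from collections import Counter

def top_terms(docs, field, size=10):
    """Hypothetical helper: count documents per exact field value, like a terms aggregation."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # most_common(size) sorts by count, descending, and keeps at most `size` entries
    return [{"key": key, "doc_count": count} for key, count in counts.most_common(size)]

# The _source bodies of the three sample documents from the question
docs = [
    {"querySearched": "telecom"},
    {"querySearched": "telecom"},
    {"querySearched": "industry"},
]

buckets = top_terms(docs, "querySearched")
print(buckets)
```

With size set to 1, only the "telecom" bucket would be kept, which is the same effect the inner "size" has in the query.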

After a short discussion with @Kamal, I feel obliged to let you know that if you choose to enable fielddata: true, it can consume a lot of heap space.

From the link I shared:

Fielddata can consume a lot of heap space, especially when loading high cardinality text fields. Once fielddata has been loaded into the heap, it remains there for the lifetime of the segment. Also, loading fielddata is an expensive process which can cause users to experience latency hits. This is why fielddata is disabled by default.

Another option (a more efficient one) is to add a keyword sub-field to the mapping:

PUT /index
{
  "mappings": {
    "doc": {
      "properties": {
        "querySearched": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

Then your aggregation query becomes:

GET index/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": {
        "field": "querySearched.keyword",
        "size": 10
      }
    }
  }
}
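
The aggregation returns buckets with key and doc_count fields rather than the querySearched / number shape asked for in the question, so a small client-side reshaping step may be needed. The sketch below shows one way to do that in Python. The helper name buckets_to_result is hypothetical, and sample_response is a hand-written stand-in for the "aggregations" part of a search response, not output captured from a real cluster.

```python
def buckets_to_result(response, agg_name="result"):
    """Hypothetical helper: reshape terms-aggregation buckets into the format from the question."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        {"querySearched": bucket["key"], "number": bucket["doc_count"]}
        for bucket in buckets
    ]

# Hand-written example of the relevant part of a search response (illustrative only)
sample_response = {
    "aggregations": {
        "result": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "telecom", "doc_count": 2},
                {"key": "industry", "doc_count": 1},
            ],
        }
    }
}

result = buckets_to_result(sample_response)
print(result)
```

The agg_name argument has to match the aggregation name used in the request, which is "result" in the queries above.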

Both solutions work, but you should take the heap cost of fielddata into consideration when choosing between them.
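
Apart from heap usage, the two approaches can also produce different buckets once a value contains more than one word, because a text field is aggregated on its analyzed terms while a keyword field is aggregated on the whole value. The Python sketch below approximates that difference. The regex tokenizer is only a crude stand-in for the default standard analyzer, and the multi-word value "Telecom industry" is an invented example that does not appear in the question's data.

```python
import re
from collections import Counter

def text_field_terms(value):
    """Crude approximation of the standard analyzer: lowercase, split on non-word characters."""
    return set(re.findall(r"\w+", value.lower()))

def count_as_text(values):
    """Count documents per analyzed term, as with a text field and fielddata enabled."""
    counts = Counter()
    for value in values:
        # A set ensures each document is counted once per distinct term
        counts.update(text_field_terms(value))
    return dict(counts)

def count_as_keyword(values):
    """Count documents per exact, unanalyzed value, as with a keyword sub-field."""
    return dict(Counter(values))

# Hypothetical search strings; the second one is invented for illustration
searches = ["telecom", "Telecom industry", "industry"]

print(count_as_text(searches))
print(count_as_keyword(searches))
```

In this sketch the text-style count merges "Telecom industry" into the "telecom" and "industry" buckets, while the keyword-style count keeps it as its own bucket, which is usually what a report of whole search phrases needs.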

Hope this helps.

Regarding "Elasticsearch: group by a document field and count occurrences", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58733898/
