
elasticsearch - How to return the actual value (not lowercased) when searching with a terms aggregation?

Reposted. Author: 行者123. Updated: 2023-11-29 02:52:20

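I'm working on an ElasticSearch (6.2) project where the index has many keyword fields, and they are normalized with a lowercase filter so that searches are case-insensitive. Searching works well and returns the actual (not lowercased) values of the normalized fields. Aggregations, however, do not return the actual field values; they return the lowercased ones.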

The following example is taken from the ElasticSearch documentation.

https://www.elastic.co/guide/en/elasticsearch/reference/master/normalizer.html

Create the index:

PUT index
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}

Insert the documents:

PUT index/_doc/1
{
  "foo": "Bar"
}

PUT index/_doc/2
{
  "foo": "Baz"
}

Search with an aggregation:

GET index/_search
{
  "size": 0,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo"
      }
    }
  }
}

Result:

{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.0,
    "hits": {
      "total": 2,
      "max_score": 0.47000363,
      "hits": [
        {
          "_index": "index",
          "_type": "_doc",
          "_id": "1",
          "_score": 0.47000363,
          "_source": {
            "foo": "Bar"
          }
        },
        {
          "_index": "index",
          "_type": "_doc",
          "_id": "2",
          "_score": 0.47000363,
          "_source": {
            "foo": "Baz"
          }
        }
      ]
    }
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "bar",
          "doc_count": 1
        },
        {
          "key": "baz",
          "doc_count": 1
        }
      ]
    }
  }
}

If you check the aggregation, you will see that lowercased values are returned, e.g. "key": "bar".

Is there a way to change the aggregation so that it returns the actual values?

e.g. "key": "Bar"

Best answer

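If you want case-insensitive search but exact values in your aggregations, you don't need a normalizer at all. You can simply use a text field (which lowercases its tokens and therefore allows case-insensitive search by default) with a keyword sub-field. You search on the former and aggregate on the latter. It goes like this: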

PUT index
{
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
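To make the difference between the two mappings concrete, here is a rough model in plain Python. It is only a sketch, not how Elasticsearch is implemented: normalize() is a hypothetical stand-in for the lowercase and asciifolding filters, the text field is approximated by lowercasing and splitting on whitespace, and all function and key names are made up for illustration.

```python
import unicodedata
from collections import Counter


def normalize(value: str) -> str:
    """Hypothetical stand-in for the lowercase + asciifolding filters."""
    decomposed = unicodedata.normalize("NFKD", value)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return folded.lower()


def index_document(value: str) -> dict:
    """Model what each kind of field would keep for one value of foo."""
    return {
        "_source": value,                        # returned unchanged in search hits
        "normalized_keyword": normalize(value),  # keyword field with my_normalizer
        "text_tokens": value.lower().split(),    # text field (very rough analyzer)
        "raw_keyword": value,                    # foo.keyword sub-field, no normalizer
    }


def terms_aggregation(docs: list, field: str) -> Counter:
    """Bucket documents by the value stored for the given field."""
    return Counter(doc[field] for doc in docs)


docs = [index_document(v) for v in ("Bar", "Baz")]

# Aggregating on the normalized keyword only ever sees the lowercased terms.
normalized_buckets = terms_aggregation(docs, "normalized_keyword")

# Aggregating on the raw keyword sub-field sees the original values.
raw_buckets = terms_aggregation(docs, "raw_keyword")

# A case-insensitive search on the text field still finds "Bar".
matches = [d["_source"] for d in docs if "bar" in d["text_tokens"]]
```

One consequence of this design: because foo.keyword is not normalized, values that differ only in case, such as "Bar" and "BAR", end up in separate buckets.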

After indexing your two documents, you can issue a terms aggregation on foo.keyword:

GET index/_search
{
  "size": 2,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo.keyword"
      }
    }
  }
}

The result looks like this:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "foo": "Baz"
        }
      },
      {
        "_index": "index",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "foo": "Bar"
        }
      }
    ]
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Bar",
          "doc_count": 1
        },
        {
          "key": "Baz",
          "doc_count": 1
        }
      ]
    }
  }
}
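The answer says to search on the text field and aggregate on the keyword sub-field, but only shows the aggregation. As a minimal sketch, the helper below builds a request body that does both at once. The function name build_search_body and its parameters are hypothetical. The match query and the terms aggregation are standard query DSL, and the field names foo and foo.keyword come from the mapping in the answer.

```python
import json


def build_search_body(search_text: str, size: int = 2) -> dict:
    """Build a _search body: match on the text field, aggregate on the keyword sub-field."""
    return {
        "size": size,
        # Case-insensitive search: the text field foo is analyzed, so "bar" matches "Bar".
        "query": {"match": {"foo": search_text}},
        # Exact-value buckets: foo.keyword stores the value as it was indexed.
        "aggs": {"foo_terms": {"terms": {"field": "foo.keyword"}}},
    }


body = build_search_body("bar")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET index/_search. Note that the aggregation only covers documents matched by the query, so a search for "bar" would be expected to produce a single "Bar" bucket rather than both.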

Regarding "elasticsearch - How to return the actual value (not lowercased) when searching with a terms aggregation?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51664234/
