
elasticsearch - How to match query terms containing hyphens or trailing spaces in Elasticsearch

Reposted. Author: 行者123. Updated: 2023-12-02 22:17:23

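The section of the Elasticsearch documentation on the mapping char_filter is rather vague, and I am having a hard time understanding whether and how to use a char filter in an analyzer: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-mapping-charfilter.html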

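Basically, the data we store in the index are IDs of type String that look like this: "008392342000". I want to be able to find such an ID when the query term actually contains a hyphen or a trailing space, like this: "008392342-000 "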

How would you advise me to set up the analyzer?
Currently, this is the definition of the field:

"mappings": {
  "client": {
    "properties": {
      "ucn": {
        "type": "multi_field",
        "fields": {
          "ucn_autoc": {
            "type": "string",
            "index": "analyzed",
            "index_analyzer": "autocomplete_index",
            "search_analyzer": "autocomplete_search"
          },
          "ucn": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}

And these are the index settings, including the analyzers:
"settings": {
  "analysis": {
    "filter": {
      "autocomplete_ngram": {
        "max_gram": 15,
        "min_gram": 1,
        "type": "edge_ngram"
      },
      "ngram_filter": {
        "type": "nGram",
        "min_gram": 2,
        "max_gram": 8
      }
    },
    "analyzer": {
      "lowercase_analyzer": {
        "filter": [
          "lowercase"
        ],
        "tokenizer": "keyword"
      },
      "autocomplete_index": {
        "filter": [
          "lowercase",
          "autocomplete_ngram"
        ],
        "tokenizer": "keyword"
      },
      "ngram_index": {
        "filter": [
          "ngram_filter",
          "lowercase"
        ],
        "tokenizer": "keyword"
      },
      "autocomplete_search": {
        "filter": [
          "lowercase"
        ],
        "tokenizer": "keyword"
      },
      "ngram_search": {
        "filter": [
          "lowercase"
        ],
        "tokenizer": "keyword"
      }
    },
    "index": {
      "number_of_shards": 6,
      "number_of_replicas": 1
    }
  }
}
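
To make the autocomplete_index analyzer above more concrete, here is a rough Python sketch of what a keyword tokenizer followed by an edge_ngram filter with min_gram 1 and max_gram 15 would emit for one ID. The function name edge_ngrams and the sample value are illustrative assumptions, and the sketch only approximates Elasticsearch's behavior (no offsets or positions are modeled).

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 15) -> List[str]:
    """Return the prefixes of `token` with lengths min_gram..max_gram.

    Hypothetical helper: a simplified stand-in for the edge_ngram
    token filter configured as "autocomplete_ngram" in the settings.
    """
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


# The keyword tokenizer keeps the whole field value as a single token,
# and lowercase is a no-op for digits, so only the n-gram step matters here.
grams = edge_ngrams("008392342000")
print(grams)
```

Because every prefix of the stored ID becomes an indexed term, a search-side token only has to equal one of those prefixes for the document to be found.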

Best answer

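You haven't provided your actual analyzers, the data that goes in, or what you expect to come out, but based on the information you did provide I would start with this: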

{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_mapping": {
          "type": "mapping",
          "mappings": [
            "-=>"
          ]
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "tokenizer": "keyword",
          "char_filter": [
            "my_mapping"
          ],
          "filter": [
            "trim"
          ]
        },
        "autocomplete_index": {
          "tokenizer": "keyword",
          "filter": [
            "trim"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "ucn": {
          "type": "multi_field",
          "fields": {
            "ucn_autoc": {
              "type": "string",
              "index": "analyzed",
              "index_analyzer": "autocomplete_index",
              "search_analyzer": "autocomplete_search"
            },
            "ucn": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

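The char_filter replaces - with nothing: -=>. I would also use the trim filter to get rid of any trailing or leading whitespace. Not knowing what your autocomplete_index analyzer is, I just built it on the keyword tokenizer.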
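
As a rough illustration of the order in which those steps run on the search side (char filter first, then the keyword tokenizer, then the trim token filter), here is a small Python sketch. The function names are made up for this example, and the char filter is simplified to sequential string replacement, which is equivalent to the real mapping char filter only for a single rule such as "-=>".

```python
from typing import List


def apply_mapping_char_filter(text: str, mappings: List[str]) -> str:
    """Apply rules written as "key=>value" to the raw text (simplified)."""
    for rule in mappings:
        key, value = rule.split("=>", 1)
        text = text.replace(key, value)
    return text


def analyze_search(text: str) -> List[str]:
    """Hypothetical stand-in for the autocomplete_search analyzer."""
    # 1. char_filter "my_mapping": drop every hyphen before tokenizing.
    filtered = apply_mapping_char_filter(text, ["-=>"])
    # 2. keyword tokenizer: the whole input becomes one token.
    token = filtered
    # 3. trim token filter: strip leading and trailing whitespace.
    return [token.strip()]


print(analyze_search("008392342-000 "))
```

Running it on the query term from the question yields the single token 008392342000, the same value that is stored in the index.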

Testing the analyzer with GET /my_index/_analyze?analyzer=autocomplete_search&text= 0123-34742-000 gives:
"tokens": [
  {
    "token": "012334742000",
    "start_offset": 0,
    "end_offset": 17,
    "type": "word",
    "position": 1
  }
]

This means it does remove both the - and the whitespace.
A typical query would be:
{
  "query": {
    "match": {
      "ucn.ucn_autoc": " 0123-34742-000 "
    }
  }
}
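
One way to reason about when such a match query finds a document is to compare the term produced at index time with the term produced at search time. The Python sketch below does that under the assumption that both analyzers emit exactly one term, as in the settings from the answer; index_terms, search_terms and matches are hypothetical names, not Elasticsearch APIs.

```python
from typing import Set


def index_terms(stored_value: str) -> Set[str]:
    """autocomplete_index from the answer: keyword tokenizer + trim."""
    return {stored_value.strip()}


def search_terms(query_text: str) -> Set[str]:
    """autocomplete_search from the answer: strip hyphens, keyword, trim."""
    return {query_text.replace("-", "").strip()}


def matches(stored_value: str, query_text: str) -> bool:
    """A document matches when the two analyzers share at least one term."""
    return bool(index_terms(stored_value) & search_terms(query_text))


print(matches("008392342000", "008392342-000 "))   # stored without hyphen
print(matches("008392342-000", "008392342-000 "))  # stored with hyphen
```

The second call suggests a caveat: only the search analyzer removes hyphens, so this setup relies on the stored IDs being hyphen-free, as they are in the question. If hyphenated IDs could also be indexed, adding the same char_filter to the index analyzer would probably be needed.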

Regarding elasticsearch - How to match query terms containing hyphens or trailing spaces in Elasticsearch, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28204012/
