
elasticsearch - Elasticsearch results are not as expected

Reposted. Author: 行者123. Updated: 2023-12-02 22:13:02

I have a field that is indexed with a custom analyzer, configured as follows:

"COMPNAYNAME" : {
  "type" : "text",
  "analyzer" : "textAnalyzer"
}

"textAnalyzer" : {
  "filter" : [
    "lowercase"
  ],
  "char_filter" : [ ],
  "type" : "custom",
  "tokenizer" : "ngram_tokenizer"
}

"tokenizer" : {
  "ngram_tokenizer" : {
    "type" : "ngram",
    "min_gram" : "2",
    "max_gram" : "3"
  }
}

When I search for the text "ikea", I get the results shown below.

Query:

GET company_info_test_1/_search
{
  "query": {
    "match": {
      "COMPNAYNAME": {"query": "ikea"}
    }
  }
}

The results are as follows:

1. mikea
2. likeable
3. maaikeart
4. likeables
5. ikea b.v. <------
6. likeachef
7. ikea breda <------
8. bernikeart
9. ikea duiven
10. mikea media

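I expected exact matches to be boosted above the rest of the results. If I need both exact matching and fuzzy search, could you help me find the best way to index this field?

Thanks in advance.

Best answer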

You can keep the ngram tokenizer and add "search_analyzer": "standard" to the field, so that the query text is not itself split into n-grams. See the Elasticsearch documentation on search_analyzer for details.
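To make the effect of the search analyzer concrete, here is a minimal Python sketch that imitates a character n-gram tokenizer. The function `ngrams` is a hypothetical helper written for this illustration, not part of Elasticsearch, and it only approximates the real tokenizer (it lowercases and ignores `token_chars`).

```python
def ngrams(text, min_gram, max_gram):
    """Character n-grams of lengths min_gram..max_gram, lowercased first."""
    text = text.lower()
    return [
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    ]

# Question's setup: the 2..3 ngram analyzer also runs on the query text.
query_grams = ngrams("ikea", 2, 3)
print(query_grams)

# Every one of those grams also occurs in "likeable", so that title
# matches all query grams, just like a title starting with "ikea".
print(set(query_grams) <= set(ngrams("likeable", 2, 3)))

# With a standard search analyzer the query stays the single token "ikea".
# It can only be found if the index holds grams of length 4 or more.
print("ikea" in ngrams("likeable", 2, 3))   # index built with max_gram 3
print("ikea" in ngrams("likeable", 2, 10))  # index built with max_gram 10
```

The last two lines suggest why switching the search analyzer alone would not be enough with the question's max_gram of 3: a four-character query token would have no indexed gram to match, so max_gram also has to be at least as long as the terms being searched.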

As @EvaldasBuinauskas pointed out, you can also use the edge_ngram tokenizer here if you want tokens to be generated only from the start of each word rather than from the middle.
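The difference between ngram and edge_ngram can be sketched in a few lines of Python. `edge_ngrams` is a hypothetical helper that approximates an edge_ngram tokenizer with `token_chars` set to letters and digits; the gram sizes 2 and 10 are assumptions chosen for the illustration.

```python
import re

def edge_ngrams(text, min_gram, max_gram):
    """Prefix n-grams of every letter/digit run in text."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits (underscore excluded).
    for word in re.findall(r"[^\W_]+", text):
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens

print(edge_ngrams("ikea breda", 2, 10))
print(edge_ngrams("mikea", 2, 10))

# Only titles with a word that starts with "ikea" contain that token.
print("ikea" in edge_ngrams("ikea breda", 2, 10))
print("ikea" in edge_ngrams("mikea", 2, 10))
```

Under these assumptions a query for "ikea" would no longer match "mikea" or "maaikeart", because their prefixes never produce the token "ikea". Whether that is desirable depends on whether matches in the middle of a word are still wanted.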

Here is a working example with index data, mapping, search query, and search results.

Index data:

{ "title": "ikea b.v." }
{ "title": "mikea" }
{ "title": "maaikeart" }

Index mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 50
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
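To see what this mapping puts into the index for the three sample titles, the following Python sketch approximates `my_analyzer`. The helper `index_tokens` is hypothetical and only imitates the tokenizer settings above (split on anything that is not a letter or digit, then emit grams of length 2 to 10); it is not how Elasticsearch is implemented.

```python
import re

def index_tokens(title, min_gram=2, max_gram=10):
    """Approximate my_analyzer: split on non letter/digit, then n-gram."""
    tokens = []
    for word in re.findall(r"[^\W_]+", title):
        for i in range(len(word)):
            for n in range(min_gram, max_gram + 1):
                if i + n <= len(word):
                    tokens.append(word[i:i + n])
    return tokens

titles = ["ikea b.v.", "mikea", "maaikeart"]
for title in titles:
    tokens = index_tokens(title)
    # Title, number of indexed tokens, and whether "ikea" is among them.
    print(title, len(tokens), "ikea" in tokens)
```

All three titles contain the token "ikea", so all three match the query, and they produce 6, 10 and 36 tokens respectively. That is consistent with the score ordering in the search results, since BM25 length normalization tends to favor shorter fields when the term frequency is the same.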

Search query:

{
  "query": {
    "match": {
      "title": "ikea"
    }
  }
}

Search results:

"hits": [
  {
    "_index": "normal",
    "_type": "_doc",
    "_id": "4",
    "_score": 0.1499838, <-- note this
    "_source": {
      "title": "ikea b.v."
    }
  },
  {
    "_index": "normal",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.13562363, <-- note this
    "_source": {
      "title": "mikea"
    }
  },
  {
    "_index": "normal",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.083597526,
    "_source": {
      "title": "maaikeart"
    }
  }
]
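The gap between "ikea b.v." and "mikea" in these scores is small. If whole-word matches should rank clearly above substring matches, one possible variation (not part of the original answer) is to add a sub-field analyzed with the standard analyzer and boost it in a bool query. The sketch below only builds the request bodies as Python dictionaries; the sub-field name `std`, the function `build_query` and the boost value 2.0 are assumptions made for illustration.

```python
import json

# Hypothetical extension of the "title" mapping: the sub-field "std"
# indexes whole words, so "ikea" is a token in "ikea b.v." but not in "mikea".
title_mapping = {
    "type": "text",
    "analyzer": "my_analyzer",
    "search_analyzer": "standard",
    "fields": {
        "std": {"type": "text", "analyzer": "standard"}
    },
}

def build_query(text, boost=2.0):
    """Match on n-grams, and add a boosted clause for whole-word matches."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"title": text}},
                    {"match": {"title.std": {"query": text, "boost": boost}}},
                ]
            }
        }
    }

print(json.dumps(title_mapping, indent=2))
print(json.dumps(build_query("ikea"), indent=2))
```

With this shape, a title such as "mikea" would still match through the first clause, while a title containing the separate word "ikea" would match both clauses and collect the extra boosted score. The right boost value would need to be tuned against real data.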

Regarding "elasticsearch - Elasticsearch results are not as expected", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63917546/
