
ElasticSearch - Searching with hyphens

Reposted. Author: 行者123. Last updated: 2023-11-29 02:46:34

Elasticsearch 1.6

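I want to index text that contains hyphens, for example U-12, U-17, WU-12, t-shirt ..., and be able to search it with a simple_query_string query.

Data sample (simplified):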

{"title":"U-12 Soccer",
"comment": "the t-shirts are dirty"}

Since there are already many questions about hyphens, I tried the following solution:

Use a char filter: ElasticSearch - Searching with hyphens in name.

So I created this mapping:

{
  "settings": {
    "analysis": {
      "char_filter": {
        "myHyphenRemoval": {
          "type": "mapping",
          "mappings": [
            "-=>"
          ]
        }
      },
      "analyzer": {
        "default": {
          "type": "custom",
          "char_filter": [ "myHyphenRemoval" ],
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "title": {
          "type": "string"
        },
        "comment": {
          "type": "string"
        }
      }
    }
  }
}
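To make the effect of this analyzer concrete, here is a minimal Python sketch of the chain it configures (hyphen-removing char filter, then tokenizer, then lowercase). It is only an approximation: the real standard tokenizer follows Unicode word-boundary rules, while this sketch simply splits on runs of ASCII letters and digits, which is enough for the sample data. The function names `remove_hyphens` and `analyze` are made up for illustration and are not part of Elasticsearch.

```python
import re
from typing import List


def remove_hyphens(text: str) -> str:
    # Stand-in for the "mapping" char filter with the rule "-=>" (hyphen mapped to nothing).
    return text.replace("-", "")


def analyze(text: str) -> List[str]:
    # Approximation of: char filter -> standard tokenizer -> lowercase filter.
    # Assumption: splitting on runs of ASCII letters/digits is close enough to the
    # standard tokenizer for these samples.
    stripped = remove_hyphens(text)
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", stripped)]


print(analyze("U-12 Soccer"))
print(analyze("the t-shirts are dirty"))
```

Under these assumptions the index holds tokens such as `u12` and `tshirts`, and no indexed token contains a hyphen.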

The search is done with the following query:

{
  "_source": true,
  "query": {
    "simple_query_string": {
      "query": "<Text>",
      "default_operator": "AND"
    }
  }
}

  1. What works:

    "U-12", "U*", "t*", "ts*"

  2. What does not work:

    "U-*", "u-1*", "t-*", "t-sh*", ...

So it seems the char filter is not applied to the search string? What can I do to make this work?

Best answer

The answer is simple:

Quoting Igor Motov: Configuring the standard tokenizer

By default the simple_query_string query doesn't analyze the words with wildcards. As a result it searches for all tokens that start with i-ma. The word i-mac doesn't match this request because during analysis it's split into two tokens i and mac and neither of these tokens starts with i-ma. In order to make this query find i-mac you need to make it analyze wildcards:

{
  "_source": true,
  "query": {
    "simple_query_string": {
      "query": "u-1*",
      "analyze_wildcard": true,
      "default_operator": "AND"
    }
  }
}
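The difference can be illustrated with a toy Python model of prefix matching against the tokens the question's analyzer would produce. This is a sketch under stated assumptions, not how Lucene works internally: `INDEXED`, `analyze`, and `matches` are hypothetical names, tokenization is approximated by splitting on ASCII letters and digits, wildcard terms are assumed to be lowercased before matching (consistent with "U*" finding `u12` in the question), and only a single trailing `*` is handled.

```python
import re
from typing import List

# Tokens the question's analyzer would plausibly produce for the sample document.
INDEXED = ["u12", "soccer", "the", "tshirts", "are", "dirty"]


def analyze(text: str) -> List[str]:
    # Approximation of the analyzer: drop hyphens, split into words, lowercase.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text.replace("-", ""))]


def matches(query: str, analyze_wildcard: bool) -> bool:
    if query.endswith("*"):
        prefix = query[:-1].lower()  # assumption: wildcard terms are lowercased
        if analyze_wildcard:
            tokens = analyze(prefix)
            if len(tokens) != 1:
                return False  # the sketch only handles single-token prefixes
            prefix = tokens[0]
        # Without analysis the raw prefix, hyphen included, is compared to the tokens.
        return any(token.startswith(prefix) for token in INDEXED)
    # Non-wildcard text is always analyzed; every resulting token must be present (AND).
    return all(token in INDEXED for token in analyze(query))


for q in ["U-12", "U*", "ts*", "U-*", "u-1*", "t-sh*"]:
    print(q, matches(q, analyze_wildcard=False), matches(q, analyze_wildcard=True))
```

In this model the raw prefix `u-1` matches nothing because no indexed token contains a hyphen, whereas the analyzed prefix becomes `u1` and matches `u12`. That agrees with the behaviour reported in the question and with the quoted explanation.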

Regarding ElasticSearch - Searching with hyphens, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30917043/
