
elasticsearch - Elasticsearch custom analyzer is ignored

Reposted. Author: 行者123. Updated: 2023-12-02 23:31:51

I'm using Elasticsearch 2.2.0 and I'm trying to apply the lowercase + asciifolding filters to a field.

This is the output of http://localhost:9200/myindex/:

{
"myindex": {
"aliases": {},
"mappings": {
"products": {
"properties": {
"fold": {
"analyzer": "folding",
"type": "string"
}
}
}
},
"settings": {
"index": {
"analysis": {
"analyzer": {
"folding": {
"token_filters": [
"lowercase",
"asciifolding"
],
"tokenizer": "standard",
"type": "custom"
}
}
},
"creation_date": "1456180612715",
"number_of_replicas": "1",
"number_of_shards": "5",
"uuid": "vBMZEasPSAyucXICur3GVA",
"version": {
"created": "2020099"
}
}
},
"warmers": {}
}
}

When I try to test the custom folding analyzer with the _analyze API, this is the output of http://localhost:9200/myindex/_analyze?analyzer=folding&text=%C3%89sta%20est%C3%A1%20loca:
{
"tokens": [
{
"end_offset": 4,
"position": 0,
"start_offset": 0,
"token": "Ésta",
"type": "<ALPHANUM>"
},
{
"end_offset": 9,
"position": 1,
"start_offset": 5,
"token": "está",
"type": "<ALPHANUM>"
},
{
"end_offset": 14,
"position": 2,
"start_offset": 10,
"token": "loca",
"type": "<ALPHANUM>"
}
]
}

As you can see, the returned tokens are Ésta, está, loca instead of esta, esta, loca. What's going on? It looks like the folding analyzer is being ignored.

Best answer

This looks like a simple typo made when the index was created.

In your "analysis":{"analyzer":{...}} block, this:

"token_filters": [...]

should be
"filter": [...]
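
A misnamed key like this can be caught before the index is created. The following is a minimal sketch in Python, not part of Elasticsearch or any client library: the function name `unknown_analyzer_keys` is hypothetical, and the set of accepted keys is an assumption that should be checked against the custom analyzer documentation for your Elasticsearch version.

```python
# Keys assumed to be accepted by a "custom" analyzer definition.
# This list is an assumption; verify it against the docs for your ES version.
KNOWN_CUSTOM_ANALYZER_KEYS = {
    "type",
    "tokenizer",
    "filter",
    "char_filter",
    "position_increment_gap",
}


def unknown_analyzer_keys(analysis):
    """Return {analyzer_name: [unrecognized keys]} for custom analyzers."""
    problems = {}
    for name, definition in analysis.get("analyzer", {}).items():
        # Only inspect custom analyzers; built-in types take other parameters.
        if definition.get("type", "custom") != "custom":
            continue
        extra = sorted(set(definition) - KNOWN_CUSTOM_ANALYZER_KEYS)
        if extra:
            problems[name] = extra
    return problems


# The "analysis" block from the question, with the misnamed key.
broken = {
    "analyzer": {
        "folding": {
            "type": "custom",
            "tokenizer": "standard",
            "token_filters": ["lowercase", "asciifolding"],
        }
    }
}

# The same block with the key renamed to "filter".
fixed = {
    "analyzer": {
        "folding": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "asciifolding"],
        }
    }
}

print(unknown_analyzer_keys(broken))  # {'folding': ['token_filters']}
print(unknown_analyzer_keys(fixed))   # {}
```

Running a check like this on the settings body before sending it would flag `token_filters` instead of letting it be dropped silently.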

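Check the documentation to confirm the parameter name. Because the filter array was under an unrecognized key, ES ignored it entirely, so the analyzer ran with only the standard tokenizer and no token filters (which is why Ésta was not even lowercased). Here's a small example written with the Sense Chrome plugin. Run the requests in order: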
DELETE /test

PUT /test
{
"analysis": {
"analyzer": {
"folding": {
"type": "custom",
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "standard"
}
}
}
}

GET /test/_analyze
{
"analyzer":"folding",
"text":"Ésta está loca"
}

And the result of the final GET /test/_analyze:
"tokens": [
{
"token": "esta",
"start_offset": 0,
"end_offset": 4,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "esta",
"start_offset": 5,
"end_offset": 9,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "loca",
"start_offset": 10,
"end_offset": 14,
"type": "<ALPHANUM>",
"position": 2
}
]
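
As a sanity check that esta, esta, loca is what lowercasing plus ASCII folding should produce for this text, here is a rough approximation in Python. It is only an illustration: the helper `lowercase_and_fold` is hypothetical, it splits on whitespace instead of using the standard tokenizer, and it folds by Unicode decomposition (NFKD) and dropping combining marks. Lucene's asciifolding filter uses its own character mapping and also handles characters that do not decompose this way (for example ø or ß).

```python
import unicodedata


def lowercase_and_fold(token):
    """Lowercase a token, then strip accents via NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    # Drop combining marks (accents) left over after decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


text = "Ésta está loca"
# Whitespace split stands in for the standard tokenizer on this simple input.
tokens = [lowercase_and_fold(t) for t in text.split()]
print(tokens)  # ['esta', 'esta', 'loca']
```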

Regarding "elasticsearch - Elasticsearch custom analyzer is ignored", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35565421/
