
elasticsearch - How do I use Smart Chinese Analysis with Elasticsearch?

Reposted. Author: 行者123. Last updated: 2023-11-29 02:54:01

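I have installed Smart Chinese Analysis for Elasticsearch on our ES cluster, but I cannot find any documentation on how to specify the correct analyzer. I would expect that I need to set a tokenizer and a filter that specify stop words and a stemmer...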

For example, for Dutch:

"dutch": {
  "type": "custom",
  "tokenizer": "uax_url_email",
  "filter": ["lowercase", "asciifolding", "dutch_stemmer_filter", "dutch_stop_filter"]
}

with:

"dutch_stemmer_filter": {
  "type": "stemmer",
  "name": "dutch"
},

"dutch_stop_filter": {
  "type": "stop",
  "stopwords": ["_dutch_"]
}

How do I configure my analyzer for Chinese?

Best answer

Try this on an index (the analyzer is 'smartcn' and the tokenizer is 'smartcn_tokenizer'):

PUT /test_chinese
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "type": "smartcn"
          }
        }
      }
    }
  }
}
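If the analyzer has to be spelled out with a tokenizer and a filter chain, as in the Dutch configuration from the question, one option is a custom analyzer built on 'smartcn_tokenizer'. The Python sketch below only builds the request body for such an index and prints it as JSON. The analyzer name `chinese_custom` is hypothetical, and the filter chain (`porter_stem`, `smartcn_stop`) is an assumption: the `smartcn_stop` filter is only available in plugin versions that ship it, so check the plugin documentation for your Elasticsearch version before relying on it.

```python
import json


def build_chinese_index_settings(analyzer_name="chinese_custom"):
    """Build index settings with a custom analyzer on top of smartcn_tokenizer.

    Assumptions (not from the original answer):
    - analyzer_name is a hypothetical name chosen for illustration.
    - "smartcn_stop" exists only in plugin versions that provide it.
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        analyzer_name: {
                            "type": "custom",
                            # Tokenizer named in the answer above.
                            "tokenizer": "smartcn_tokenizer",
                            # Assumed filter chain; adjust to your version.
                            "filter": ["porter_stem", "smartcn_stop"],
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be sent with PUT /<index_name>.
    body = build_chinese_index_settings()
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

The printed JSON has the same shape as the `PUT /test_chinese` body above, with the `default` analyzer entry replaced by a named custom analyzer. Chinese text is not stemmed the way Dutch is, so there is no direct counterpart to `dutch_stemmer_filter`; the stemming filter in this sketch only affects Latin-script tokens.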

GET /test_chinese/_analyze?text='叻出色'

It should output two tokens (the test is taken from the plugin test classes):

{
  "tokens": [
    {
      "token": "叻",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 2
    },
    {
      "token": "出色",
      "start_offset": 2,
      "end_offset": 4,
      "type": "word",
      "position": 3
    }
  ]
}
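The offsets in this output start at 1 rather than 0, which is consistent with the single quotes in the request being analyzed as part of the text. A small Python sketch can check that reading by slicing the quoted string with the reported offsets. It assumes the analyzed text was exactly `'叻出色'` including the quotes, and that character offsets match Python string indices, which holds for these characters but not necessarily for text outside the Basic Multilingual Plane.

```python
def slice_by_offsets(text, tokens):
    """Return the substring of text covered by each token's offsets."""
    return [text[t["start_offset"]:t["end_offset"]] for t in tokens]


# Assumed analyzed text: the query value including its single quotes.
query_text = "'叻出色'"

# Offsets copied from the response above.
tokens = [
    {"token": "叻", "start_offset": 1, "end_offset": 2},
    {"token": "出色", "start_offset": 2, "end_offset": 4},
]

recovered = slice_by_offsets(query_text, tokens)
print(recovered)  # ['叻', '出色']

# Each slice should equal the token text reported by the analyzer.
print(all(r == t["token"] for r, t in zip(recovered, tokens)))  # True
```

Passing the text without the surrounding quotes would presumably shift both offsets down by one.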

Regarding "elasticsearch - How do I use Smart Chinese Analysis with Elasticsearch?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26087072/
