
elasticsearch - word_delimiter on a block of text

Reposted. Author: 行者123. Updated: 2023-12-02 23:30:25

It looks like word_delimiter is designed to work on single terms only. What should I do if I have a block of text like this:

 "Contra-indications of paracetamol can be of certain sorts"

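In this case word_delimiter takes the whole sentence and catenates it, whereas I only need the hyphenated word to be catenated, so that I can search within the block of text for "Contra-indications", "contra indications", or "contra-indications".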

Best answer

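You need an analyzer like the one below. The whitespace tokenizer splits the text into individual words first, so word_delimiter only ever sees one word at a time. (The mapping uses the older "string" field type under a mapping type named assets; Elasticsearch 5.x and later use "text" instead, and 7.x and later drop mapping types.)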

{
  "settings": {
    "analysis": {
      "filter": {
        "delimiter_filter": {
          "type": "word_delimiter",
          "catenate_words": true,
          "preserve_original": true
        }
      },
      "analyzer": {
        "delimiter_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "delimiter_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "assets": {
      "properties": {
        "domain": {
          "type": "string",
          "analyzer": "delimiter_analyzer"
        }
      }
    }
  }
}
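
One way to check what this analyzer produces is the index-level `_analyze` endpoint. The sketch below only builds and prints the request, so it runs without a cluster. The index name `my_index` is a hypothetical placeholder, not something from the original answer; replace it with whatever index was created with the settings above.

```python
import json

# Hypothetical index name (an assumption, not from the original answer).
INDEX = "my_index"

# Request body for the _analyze endpoint: run the custom analyzer
# defined in the index settings against the sample sentence.
analyze_request = {
    "analyzer": "delimiter_analyzer",
    "text": "Contra-indications of paracetamol can be of certain sorts",
}

path = f"/{INDEX}/_analyze"
print("GET", path)
print(json.dumps(analyze_request, indent=2))
```

Sending that body with any HTTP client should return the token stream, which can then be compared against the terms expected for the field.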

For your example text, Contra-indications of paracetamol can be of certain sorts, these are the terms it generates:
"domain": [
  "be",
  "can",
  "certain",
  "contra",
  "contra-indications",
  "contraindications",
  "indications",
  "of",
  "paracetamol",
  "sorts"
]
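
To see why these terms come out, here is a rough pure-Python stand-in for the analysis chain (whitespace tokenizer, then lowercase, then word_delimiter with catenate_words and preserve_original). It is a simplified sketch, not Lucene's actual implementation: it only splits on non-alphanumeric ASCII characters and ignores the filter's other rules, such as splitting at letter/digit boundaries or stripping possessives. The function name `analyze` is made up for illustration.

```python
import re

def analyze(text):
    """Approximate: whitespace tokenizer -> lowercase -> word_delimiter
    (catenate_words=true, preserve_original=true). Simplified sketch only."""
    terms = set()
    for token in text.split():                   # whitespace tokenizer: one word at a time
        token = token.lower()                    # lowercase filter
        parts = re.findall(r"[a-z0-9]+", token)  # sub-words between delimiters
        terms.add(token)                         # preserve_original
        terms.update(parts)                      # word parts (generated by default)
        if len(parts) > 1:
            terms.add("".join(parts))            # catenate_words
    return sorted(terms)

indexed = analyze("Contra-indications of paracetamol can be of certain sorts")
print(indexed)

# Each query form from the question, analyzed the same way,
# yields only terms that are present in the indexed field.
for query in ["Contra-indications", "contra indications", "Contraindications"]:
    print(query, "->", set(analyze(query)) <= set(indexed))
```

Because catenation happens per whitespace-separated token, only "contra-indications" is joined into "contraindications"; the rest of the sentence is left as individual words.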

Regarding elasticsearch - word_delimiter on a block of text, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37148057/
