
elasticsearch - Creating a custom analyzer with a large char_filter list in Elasticsearch

Reposted. Author: 行者123. Updated: 2023-12-03 00:11:44

I am trying to add a custom analyzer to Elasticsearch. My list of synonym "mappings" (mapper_list) is too big: it contains about 30,000 elements.

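import json
import requests

# es_host and mapper_list are defined elsewhere in the script.

# Analysis settings can only be updated while the index is closed.
requests.post(es_host + '/_close')

settings = {
    "settings": {
        "analysis": {
            "char_filter": {
                "my_mapping": {
                    "type": "mapping",
                    "mappings": mapper_list
                }
            },
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    "char_filter": ["my_mapping"]
                }
            }
        }
    }
}

requests.put(es_host + '/_settings', data=json.dumps(settings))

# Reopen the index so the new analyzer takes effect.
requests.post(es_host + '/_open')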

The resulting error message from Elasticsearch:
[test-index] IndexCreationException[failed to create index]; nested: ArrayIndexOutOfBoundsException[256];
at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:360)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewIndices(IndicesClusterStateService.java:313)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:174)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Any suggestions on how to solve this problem would be appreciated.

Information about the ES version:
"version" : {
  "number" : "2.4.1",
  "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
  "build_timestamp" : "2016-09-27T18:57:55Z",
  "build_snapshot" : false,
  "lucene_version" : "5.5.2"
}

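Best answer

I think the error is caused by mapping very long strings. What exactly are you trying to map? If you look at the source code, there is a limit of 256 characters, and exceeding that limit triggers the exception. I get the same exception

ArrayIndexOutOfBoundsException[256]

if I try to map a large string.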
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_mapping": {
          "type": "mapping",
          "mappings": ["More than 256 characters. Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. => exception will be thrown"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "my_mapping"
          ]
        }
      }
    }
  }
}

I don't know your use case, but you need to reduce the length of the strings you are mapping; after that it should work.
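
To find which of the roughly 30,000 entries need shortening, the list can be scanned before it is sent to Elasticsearch. The following is only a rough sketch: it assumes each rule has the form "key => value", that the 256-character limit mentioned above applies to each side of the rule separately, and that escape sequences can be ignored. The helper names split_mapping and find_oversized are made up for this illustration, and the sample mapper_list is stand-in data.

```python
MAX_MAPPING_CHARS = 256  # limit reported in the answer; treated as an assumption


def split_mapping(rule):
    """Split a 'key => value' rule into its two whitespace-trimmed sides."""
    key, sep, value = rule.partition("=>")
    if not sep:
        raise ValueError("not a 'key => value' rule: %r" % rule)
    return key.strip(), value.strip()


def find_oversized(rules, limit=MAX_MAPPING_CHARS):
    """Return (index, rule) pairs where either side is longer than `limit`."""
    oversized = []
    for i, rule in enumerate(rules):
        key, value = split_mapping(rule)
        if len(key) > limit or len(value) > limit:
            oversized.append((i, rule))
    return oversized


if __name__ == "__main__":
    # Stand-in data; replace with the real mapper_list.
    mapper_list = ["ph => f", "x" * 300 + " => y"]
    for index, rule in find_oversized(mapper_list):
        print(index, len(rule), rule[:40] + "...")
```

Entries reported by find_oversized are the candidates to shorten or drop before updating the index settings. If the limit turns out to apply differently in your version, adjust the limit argument accordingly.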

This post on "elasticsearch - Creating a custom analyzer with a large char_filter list in Elasticsearch" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/40175225/
