
elasticsearch - Elasticsearch: indexing and searching for the currency symbols $ and £

Reposted. Author: 行者123. Updated: 2023-12-02 22:24:04

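Some of my documents contain the $ or £ symbol. I want to search for £ and retrieve the documents that contain it. I have gone through the documentation, but I am left with some cognitive dissonance.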

# Delete the `my_index` index
DELETE /my_index

# Create a custom analyzer
PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": [
            "&=> and ",
            "$=> dollar "
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&_to_and"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}

As the documentation states, this returns "the", "quick", "and", "brown", "fox":
# Test out the new analyzer
GET /my_index/_analyze?analyzer=my_analyzer&text=The%20quick%20%26%20brown%20fox

This returns "the", "quick", "dollar", "brown", "fox":
GET /my_index/_analyze?analyzer=my_analyzer&text=The%20quick%20%24%20brown%20fox

Add some records:
PUT /my_index/test/1
{
  "title": "The quick & fast fox"
}

PUT /my_index/test/2
{
  "title": "The daft fox owes me $100"
}

I would have expected to get a result when searching for "dollar". Instead, I get nothing:
GET /my_index/test/_search
{
  "query": {
    "simple_query_string": {
      "query": "dollar"
    }
  }
}

Even when using the analyzer with "$":
GET /my_index/test/_search
{
  "query": {
    "query_string": {
      "query": "dollar10",
      "analyzer": "my_analyzer"
    }
  }
}

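Best answer

Your problem is that you defined a custom analyzer but never used it. You can verify this with term vectors. Follow these steps:

When creating the index, define the custom analyzer and assign it to the title field in the mapping: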

PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": [
            "&=> and ",
            "$=> dollar "
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&_to_and"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}

Insert the data:
PUT /my_index/test/1
{
  "title": "The daft fox owes me $100"
}

Check the term vectors:
GET /my_index/test/1/_termvectors?fields=title

Response:
{
  "_index": "my_index",
  "_type": "test",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 3,
  "term_vectors": {
    "title": {
      "field_statistics": {
        "sum_doc_freq": 6,
        "doc_count": 1,
        "sum_ttf": 6
      },
      "terms": {
        "daft": {
          "term_freq": 1,
          "tokens": [
            { "position": 1, "start_offset": 4, "end_offset": 8 }
          ]
        },
        "dollar100": {    <-- You can see it here
          "term_freq": 1,
          "tokens": [
            { "position": 5, "start_offset": 21, "end_offset": 25 }
          ]
        },
        "fox": {
          "term_freq": 1,
          "tokens": [
            { "position": 2, "start_offset": 9, "end_offset": 12 }
          ]
        },
        "me": {
          "term_freq": 1,
          "tokens": [
            { "position": 4, "start_offset": 18, "end_offset": 20 }
          ]
        },
        "owes": {
          "term_freq": 1,
          "tokens": [
            { "position": 3, "start_offset": 13, "end_offset": 17 }
          ]
        },
        "the": {
          "term_freq": 1,
          "tokens": [
            { "position": 0, "start_offset": 0, "end_offset": 3 }
          ]
        }
      }
    }
  }
}
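
The analyzer attached to the title field can also be checked directly with the _analyze API, without indexing a document first. This is only a minimal sketch: it assumes the `field` query parameter of `_analyze`, as found in the Elasticsearch versions that still use the `string` field type. Given the term vector above, it should yield the single token dollar100 for the last word, which is why a search for plain "dollar" has nothing to match.

```console
# Assumption: this Elasticsearch version accepts field=... on _analyze
# %24 is the URL-encoded "$"
GET /my_index/_analyze?field=title&text=owes%20me%20%24100
```

If this returns "100" instead of "dollar100", the title field is still being analyzed with the default analyzer and the mapping was not applied.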

Now search:
GET /my_index/test/_search
{
  "query": {
    "match": {
      "title": "dollar100"
    }
  }
}

That finds a match. But a query-string search such as:
GET /my_index/test/_search
{
  "query": {
    "simple_query_string": {
      "query": "dollar100"
    }
  }
}

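This finds nothing, because it searches the special _all field. As far as I can see, _all aggregates the field values as they are, not as analyzed by my_analyzer:
GET /my_index/test/_search
{
  "query": {
    "match": {
      "_all": "dollar100"
    }
  }
}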

This finds no results. But:
GET /my_index/test/_search
{
  "query": {
    "match": {
      "_all": "$100"
    }
  }
}

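This finds it. I am not sure, but the reason may be that the default analyzer is not the custom analyzer: the standard analyzer drops the "$", so "$100" is indexed and searched in _all as "100". To set the custom analyzer as the default, see: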

Changing the default analyzer in ElasticSearch or LogStash

http://elasticsearch-users.115913.n3.nabble.com/How-we-can-change-Elasticsearch-default-analyzer-td4040411.html

http://grokbase.com/t/gg/elasticsearch/148kwsxzee/overriding-built-in-analyzer-and-set-it-as-default

http://elasticsearch-users.115913.n3.nabble.com/How-to-set-the-default-analyzer-td3935275.html
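
As a rough sketch of what those threads describe, an analyzer registered under the name `default` in the index settings is used wherever no other analyzer is specified. The request below reuses the char filter from above and simply renames `my_analyzer` to `default`. Whether _all then picks it up is an assumption that should be confirmed on your own version, for example by repeating the `_all` searches above.

```console
# Assumption: the index does not exist yet (run DELETE /my_index first)
# Assumption: fields without an explicit analyzer, including _all,
# fall back to the analyzer named "default"
PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": [
            "&=> and ",
            "$=> dollar "
          ]
        }
      },
      "analyzer": {
        "default": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "&_to_and"
          ],
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}
```

Note that this mapping only covers & and $. A pound sign would presumably need its own entry in the char filter (for instance a hypothetical "£=> pound "), which is not part of the original configuration.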

Regarding "elasticsearch - Elasticsearch: indexing and searching for the currency symbols $ and £", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37075970/
