php - Ignoring accented characters (ASCII folding) in Elasticsearch

Reposted. Author: 行者123. Updated: 2023-12-02 23:37:39

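How can I make Elasticsearch ignore accents, i.e. fold accented characters to their ASCII equivalents?
I have read http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-asciifolding-tokenfilter.html, but I still don't really know how to apply it.

I am using the PHP package.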

public function createIndex()
{
    $indexParams['index'] = $this->data['index'];

    $mapping = [
        '_source' => [
            'enabled' => true
        ],
        'properties' => [
            'history.name' => [
                'type' => 'string',
                '_boost' => 0.2
            ]
        ]
    ];
    $settings = [
        "analysis" => [
            "analyzer" => [
                "default" => [
                    "tokenizer" => "standard",
                    "filter" => ["standard", "asciifolding"]
                ]
            ]
        ]
    ];
    $indexParams['body']['mappings'][$this->data['type']] = $mapping;
    $indexParams['body']['settings'][$this->data['type']] = $settings;

    $this->es->client->indices()->create($indexParams);
}

But this still does not ignore accented characters.

Thanks,

Best answer

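A few corrections and suggestions:

  • I prefer to set an explicit analyzer rather than change the default; that means fewer surprises later. So in your example, I set analyzer: ascii_folding explicitly on the field.
  • Accordingly, rename the analyzer from default to ascii_folding.
  • Finally, settings are per index, not per type. The JSON structure is:
    {
        "settings" : {
            "analysis" : {}
        },
        "mappings" : {
            "my_type" : {}
        }
    }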

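  • Edit: I replaced the old example with a tested, working snippet. Some values are hard-coded (index, type, etc.), but it is otherwise the same. It returns the document as a hit, so there must be some other error in your query.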
    $indexParams['index'] = 'test';
    $mapping = [
        '_source' => [
            'enabled' => true
        ],
        'properties' => [
            'history.name' => [
                'type' => 'string',
                '_boost' => 0.2,
                'analyzer' => 'ascii_folding'
            ]
        ]
    ];
    $settings = [
        "analysis" => [
            "analyzer" => [
                "ascii_folding" => [
                    "tokenizer" => "standard",
                    "filter" => ["standard", "asciifolding"]
                ]
            ]
        ]
    ];
    $indexParams['body']['mappings']['test'] = $mapping;
    $indexParams['body']['settings'] = $settings;

    // Create the index and wait for yellow status
    $client->indices()->create($indexParams);
    $client->cluster()->health(['wait_for_status' => 'yellow']);


    // Index your document, then refresh to make it visible to search
    $params = [
        'index' => 'test',
        'type' => 'test',
        'id' => 1,
        'body' => [
            'history.name' => 'Nicôlàs Wîdàrt'
        ]
    ];
    $client->index($params);
    $client->indices()->refresh();

    // Now search for it
    $params = [
        'index' => 'test',
        'type' => 'test',
        'body' => [
            'query' => [
                'match' => [
                    'history.name' => 'Nicolas'
                ]
            ]
        ]
    ];
    $results = $client->search($params);
    print_r($results);

    This returns the single document as a hit:
    Array
    (
        [took] => 3
        [timed_out] =>
        [_shards] => Array
            (
                [total] => 5
                [successful] => 5
                [failed] => 0
            )
        [hits] => Array
            (
                [total] => 1
                [max_score] => 0.19178301
                [hits] => Array
                    (
                        [0] => Array
                            (
                                [_index] => test
                                [_type] => test
                                [_id] => 1
                                [_score] => 0.19178301
                                [_source] => Array
                                    (
                                        [history.name] => Nicôlàs Wîdàrt
                                    )
                            )
                    )
            )
    )
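
To see why the match query for Nicolas finds Nicôlàs Wîdàrt, it can help to picture what the asciifolding filter does to each token at index time and at search time. The sketch below is plain PHP, not Elasticsearch code: foldAccents(), analyzeText() and the small character map are hypothetical stand-ins that cover only a handful of accented letters, whereas the real filter runs inside Elasticsearch and handles a far larger range of characters.

```php
<?php
// Illustrative only: a tiny, hypothetical stand-in for the asciifolding
// token filter. It is NOT how Elasticsearch implements the filter.
function foldAccents(string $text): string
{
    // Hand-picked map (an assumption for this example, not the full table).
    $map = [
        'à' => 'a', 'â' => 'a', 'é' => 'e', 'è' => 'e',
        'î' => 'i', 'ô' => 'o', 'û' => 'u', 'ç' => 'c',
    ];
    // strtr() with an array replaces every occurrence of each key.
    return strtr($text, $map);
}

// Rough imitation of the analyzer: split on whitespace, then fold each token.
function analyzeText(string $text): array
{
    $tokens = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_map('foldAccents', $tokens);
}

$indexed = analyzeText('Nicôlàs Wîdàrt'); // tokens stored for the document
$queried = analyzeText('Nicolas');        // tokens produced from the search text

// The query matches when at least one query token equals an indexed token.
var_dump(count(array_intersect($queried, $indexed)) > 0);
```

Like this sketch, the ascii_folding analyzer defined above has no lowercase filter, so tokens are compared case-sensitively; a search for nicolas would presumably only match after adding lowercase to the analyzer's filter list.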

Regarding php - ignoring accented characters in Elasticsearch, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28584780/
