
mysql - Sphinx returns empty results for Cyrillic queries

Reposted. Author: 行者123. Updated: 2023-11-29 03:41:36

Searching with Latin text, and even with Russian translit (!), still works fine:

$ search sumka   
using config file '/etc/sphinx/sphinx.conf'...
index 'test1': query 'sumka ': returned 636 matches of 636 total in 0.000 sec

displaying matches:
1. document=154143, weight=1660, name=Сумка Sony LCS-MS10 Gray Alpha Текстильная сумка для фотокамеры Alpha Серый цвет, casual style (сумка почтальона) [LCSMS10H.AE], description_short=Сумка Sony LCS-MS10 Gray Alpha Текстильная сумка для фотокамеры Alpha Серый цвет, casual style (сумка почтальона) [LCSMS10H.AE]
...

$ search сумка

using config file '/etc/sphinx/sphinx.conf'...
index 'test1': query 'сумка ': returned 0 matches of 0 total in 0.000 sec

words:
1. 'сумка': 0 documents, 0 hits

This looks like a charset problem, but I use utf8 both in MySQL and in the query:

mysql> show variables like "character%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

mysql> show variables like "collation%";
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+

$ file words
words: UTF-8 Unicode text
$ cat words | search --stdin

using config file '/etc/sphinx/sphinx.conf'...
index 'test1': query 'сумка
': returned 0 matches of 0 total in 0.000 sec

words:
1. 'сумка': 0 documents, 0 hits
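The `file words` check above can be repeated at the byte level. The following is a minimal Python sketch, not part of the original question; the function name `inspect_query_bytes` is hypothetical. It only inspects the query side (the bytes sent to `search`), so it says nothing about the encoding of the data that `indexer` actually received from MySQL.

```python
def inspect_query_bytes(raw: bytes) -> dict:
    """Report whether `raw` is valid UTF-8 and count its Cyrillic and Latin letters.

    Hypothetical helper for illustration only.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 (for example, cp1251-encoded Cyrillic text).
        return {"utf8": False, "cyrillic_letters": 0, "latin_letters": 0}

    # Count letters in the basic Cyrillic block U+0400..U+04FF.
    cyrillic = sum(1 for ch in text if "\u0400" <= ch <= "\u04FF")
    # Count ASCII Latin letters.
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return {"utf8": True, "cyrillic_letters": cyrillic, "latin_letters": latin}


if __name__ == "__main__":
    # Same content as the `words` file above: the query term plus a newline.
    print(inspect_query_bytes("сумка\n".encode("utf-8")))
    # The same word in a single-byte Cyrillic encoding is not valid UTF-8.
    print(inspect_query_bytes("сумка\n".encode("cp1251")))
```

If the first call reports valid UTF-8 with five Cyrillic letters, the query itself is probably fine, and the indexed data or the index settings are the more likely suspects.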

The same happens with the PHP client and with the MySQL-protocol client for Sphinx.

The full Sphinx config is here, but the important parts are quoted below:

source src1
{
...
sql_query_pre = SET NAMES utf8
sql_query_pre = SET CHARACTER SET utf8
...
}
index test1
{
...
charset_type = utf-8
...
}

I found only one similar question, but in that case the database used the latin1 charset.

Software versions:

mysql  Ver 14.14 Distrib 5.5.20, for Linux (x86_64) using readline 5.1
Sphinx 2.0.6-id64-release (r3473)
centos 5.8

Update

I added a charset_table to the config, using the table from http://sphinxsearch.com/wiki/doku.php?id=charset_tables#cyrillic, but it still does not work.
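To show what a Cyrillic entry in `charset_table` is meant to do, here is a small Python sketch of a simplified model. It is not Sphinx's implementation, and the helper names `parse_charset_table` and `tokenize` are hypothetical. It assumes only the basic rule forms (`a..z`, `A..Z->a..z`, single characters, `U+XXX` code points) and treats every character not listed in the table as a separator. The two table strings are illustrative, not copied from the wiki page mentioned above.

```python
def parse_charset_table(spec: str) -> dict:
    """Parse a simplified charset_table string into a {codepoint: codepoint} map."""
    def codepoint(token: str) -> int:
        token = token.strip()
        if token.upper().startswith("U+"):
            return int(token[2:], 16)
        if len(token) == 1:
            return ord(token)
        raise ValueError(f"unsupported token: {token!r}")

    mapping = {}
    for rule in spec.split(","):
        rule = rule.strip()
        if not rule:
            continue
        src, _, dst = rule.partition("->")
        src_lo, _, src_hi = src.partition("..")
        lo = codepoint(src_lo)
        hi = codepoint(src_hi) if src_hi else lo
        # Without "->" the range maps to itself; otherwise it maps onto the
        # destination range, offset by offset.
        base = codepoint(dst.partition("..")[0]) if dst else lo
        for offset, cp in enumerate(range(lo, hi + 1)):
            mapping[cp] = base + offset
    return mapping


def tokenize(text: str, table: dict) -> list:
    """Split text into keywords; characters missing from the table are separators."""
    tokens, current = [], []
    for ch in text:
        mapped = table.get(ord(ch))
        if mapped is None:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(chr(mapped))
    if current:
        tokens.append("".join(current))
    return tokens


LATIN_ONLY = "0..9, A..Z->a..z, _, a..z"
WITH_CYRILLIC = LATIN_ONLY + ", U+410..U+42F->U+430..U+44F, U+430..U+44F"

if __name__ == "__main__":
    title = "Сумка Sony LCS-MS10"
    print(tokenize(title, parse_charset_table(LATIN_ONLY)))
    print(tokenize(title, parse_charset_table(WITH_CYRILLIC)))
```

In this model the Latin-only table drops the Cyrillic word entirely, while the extended table folds "Сумка" to "сумка". Since `charset_table` affects how documents are tokenized at indexing time, a change to it presumably takes effect only after the index is rebuilt.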

I also installed Sphinx 2.0.5-release (r3308) on my local Gentoo machine, and it handles Cyrillic queries out of the box.

Best answer

Can you try the MySQL interface? (Run mysql -P 9306 -h 127.0.0.1 and then do a SELECT * FROM test1 WHERE MATCH('сумка'); )

Regarding "mysql - Sphinx returns empty results for Cyrillic queries", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13177560/
