
full-text-search - How does Sphinx calculate weight?

Reposted. Author: 行者123. Updated: 2023-12-02 19:42:12

Note:
This is a cross-post. It was first posted on the Sphinx forum, but I did not get an answer there, so I am posting it here.

First, an example.

The following is my table (for testing only):

+----+--------------------------+----------------------+
| Id | title                    | body                 |
+----+--------------------------+----------------------+
|  1 | National first hospital  | NASA                 |
|  2 | National second hospital | Space Administration |
|  3 | National govenment       | Support the hospital |
+----+--------------------------+----------------------+

I want to search the contents of the title and body fields, so I configured sphinx.conf as follows:

--------The sphinx config file----------
source mysql
{
        type = mysql
        sql_host = localhost
        sql_user = root
        sql_pass = 0000
        sql_db = testfull
        sql_port = 3306 # optional, default is 3306
        sql_query_pre = SET NAMES utf8
        sql_query = SELECT * FROM test
}

index mysql
{
        source = mysql
        path = var/data/mysql_old_test
        docinfo = extern
        mlock = 0
        morphology = stem_en, stem_ru, soundex
        min_stemming_len = 1
        min_word_len = 1
        charset_type = utf-8
        html_strip = 0
}

indexer
{
        mem_limit = 128M
}

searchd
{
        listen = 9312
        read_timeout = 5
        max_children = 30
        max_matches = 1000
        seamless_rotate = 0
        preopen_indexes = 0
        unlink_old = 1
        pid_file = var/log/searchd_mysql.pid
        log = var/log/searchd_mysql.log
        query_log = var/log/query_mysql.log
}
------------------

Then I reindex the db and start the searchd daemon.

On the client side I set the options as follows:

----------Client side config-------------------

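SphinxClient sc = new SphinxClient();
// ... other setup

// Per-field weights: title counts heavily, body is meant to be ignored
HashMap<String, Integer> weiMap = new HashMap<String, Integer>();
weiMap.put("title", 100);
weiMap.put("body", 0);
sc.SetFieldWeights(weiMap);

// Require every query word to match
sc.SetMatchMode(SphinxClient.SPH_MATCH_ALL);

// Sort by weight, highest first
sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED, "@weight DESC");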

When I search for "National hospital", I get the following output:

Query 'National hospital' retrieved 3 of 3 matches in 0.0 sec.
Query stats:
        'nation' found 3 times in 3 documents
        'hospit' found 3 times in 3 documents
Matches:
1. id=3, weight=101
2. id=1, weight=100
3. id=2, weight=100

The number of matches (three) is right; however, the order of the results is not what I wanted.

Obviously, the documents with id 1 and 2 should be the closest matches to the requested string ("National hospital"), so in my opinion they should be given the largest weights, but they are ordered last.

I wonder if there is any way to meet my requirement?

PS:

1) Please do not suggest that I set the sort mode to:

sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED,"@weight ASC");

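That might work for this example only, and it would cause other potential problems.

2) Actually, the content of my table is all in Chinese; I only used "National Hosp..l" to make up an example.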

Best answer

1° You ask for "National hospital", but Sphinx searches for "nation" and "hospit" (see the query stats), because of

 morphology = stem_en, stem_ru, soundex

2° You assign the weights

 weiMap.put("title", 100);
weiMap.put("body", 0);

to text fields that do not exist in

 sql_query = SELECT * FROM test

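3° Finally, my simple answer to the main question:

You sort by weight, and the third row has the greater weight, because there are no words between National and hospital.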
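To make the weight values 100 and 101 more concrete, here is a minimal sketch of one way such totals could arise. It assumes a proximity-style ranking in which each field contributes (longest run of consecutive query keywords found in that field) times (field weight), summed over all fields. The class WeightSketch and its methods are hypothetical and are not part of the Sphinx API; stemming is ignored; and a body weight of 1 is used purely as an assumption, since the source does not say how Sphinx treats a weight of 0.

```java
import java.util.Arrays;
import java.util.List;

public class WeightSketch {

    // Lowercase and split on whitespace. Stemming is deliberately ignored here.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // Length of the longest run of query keywords that appears in the field
    // consecutively and in query order.
    static int lcs(List<String> query, List<String> field) {
        int best = 0;
        for (int i = 0; i < field.size(); i++) {
            for (int j = 0; j < query.size(); j++) {
                int k = 0;
                while (i + k < field.size() && j + k < query.size()
                        && field.get(i + k).equals(query.get(j + k))) {
                    k++;
                }
                best = Math.max(best, k);
            }
        }
        return best;
    }

    // Assumed scoring rule (not taken from the source):
    // sum over fields of lcs * field weight.
    static int score(String query, String title, String body,
                     int titleWeight, int bodyWeight) {
        List<String> q = tokens(query);
        return lcs(q, tokens(title)) * titleWeight
             + lcs(q, tokens(body)) * bodyWeight;
    }

    public static void main(String[] args) {
        String q = "National hospital";
        // Title weight 100 as in the question; body weight 1 is an assumption.
        System.out.println(score(q, "National first hospital", "NASA", 100, 1));
        System.out.println(score(q, "National second hospital", "Space Administration", 100, 1));
        System.out.println(score(q, "National govenment", "Support the hospital", 100, 1));
    }
}
```

Under these assumptions the three rows score 100, 100 and 101: rows 1 and 2 match only in the title, while row 3 matches one keyword in the title and one in the body. With a body weight of exactly 0 the sketch gives 100 for all three rows, so the observed 101 would be consistent with the body field still contributing a small weight.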

Regarding "full-text-search - How does Sphinx calculate weight?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3699398/
