gpt4 book ai didi

postgresql - Postgres 全文搜索按位置排序

转载 作者:行者123 更新时间:2023-11-29 13:24:53 24 4
gpt4 key购买 nike

我有一张电影表,我想搜索标题并返回最接近的匹配项。

我认为全文搜索可能有用,但它似乎无法按单词的位置排序,尽管 postgres 知道该位置。这在 postgres 中可能吗?

这是我的查询:

SELECT collectibles.id, collectibles.title, ts_rank_cd(to_tsvector('english', collectibles.title), plainto_tsquery('old school')) AS score
FROM collectibles WHERE to_tsvector('english', collectibles.title) @@ plainto_tsquery('old school')
ORDER BY score DESC;

这是一些结果:(这是我能想到的最好的格式,抱歉!)

id | title | score

- 277568 | Wilson Meadows: Live At The 15th Old School & Blues Festival | 0.1
- 3545 | 5 Film Collection: Will Ferrell: Campaign / Old School (Unrtated Version) / Blades Of Glory / Roxbury / Semi-Pro | 0.1
- 10366 | Alice Cooper: Old School: 1964-1974 (DVD/CD Combo) | 0.1
- 13004 | American Classics: Old School (3-Disc Set) | 0.1
- 13005 | American Classics: Old School: Classic Chevrolets | 0.1
- 13006 | American Classics: Old School: Classic Travel Trailers | 0.1
- 13007 | American Classics: Old School: Kings Of Kustomizing | 0.1
- 14592 | Anchorman: The Legend Of Ron Burgundy (Widescreen/ Extended Edition) / Old School (R-Rated Version) (Back-To-Back) | 0.1
- 14593 | Anchorman: The Legend Of Ron Burgundy (Widescreen/ Extended Edition) / Old School (R-Rated Version) (Side-By-Side) | 0.1
- 20242 | Audiovisualize: Mixed By Addictive TV: Snake Worship Island / Corp. Inc. / Old School Futures / These Melodies / Robot War / ... | 0.1
- 192057 | Old School (DreamWorks/ Widescreen/ Unrated Version/ Special Edition) | 0.1
- 192058 | Old School (DreamWorks/ Widescreen/ Unrated Version/ Special Edition) / Road Trip (R-Rated) (Back-To-Back) | 0.1
- 192059 | Old School (DreamWorks/ Widescreen/ Unrated Version/ Special Edition) / Road Trip (R-Rated) (Side-By-Side) | 0.1
- 192060 | Old School (DreamWorks/ Widescreen/ Unrated Version/ Special Edition) / Road Trip (Unrated) (Back-To-Back) | 0.1
- 192061 | Old School (DreamWorks/ Widescreen/ Unrated Version/ Special Edition) / Road Trip (Unrated) (Side-By-Side) | 0.1
- 192062 | Old School (Warner Brothers/ R-Rated Version) | 0.1
- 192063 | Old School (Warner Brothers/ R-Rated Version/ Blu-ray) | 0.1
- 192064 | Old School (Warner Brothers/ Unrated Version) | 0.1
- 192065 | Old School (Warner Brothers/ Unrated Version/ Blu-ray) | 0.1
- 192066 | Old School Comedy (4-Pack): Atoll K / Jack And The Beanstalk / The Flying Deuces / Africa Screams | 0.1
- 192067 | Old School Hip Hop Dance #1: Beginner | 0.1
- 192068 | Old School Hip Hop Greatest | 0.1
- 192069 | Old School Hip Hop: Run DMC & Flava Flav (2-Disc) | 0.1
- 192070 | Old School Hits Movie Marathon Collection (3-Disc) | 0.1
- 192071 | Old School Returns | 0.1

所有这些的分数都是 0.1,但许多标题中的单词位置更靠近字符串的前面。有什么方法可以将它们排名更高吗?不幸的是,字符串或 ID 的长度并不是很好的排名限定符。

最佳答案

这里需要对ts_rank(tsvector,tsquery,normalization factor)使用归一化功能。在下面的代码片段中,我使用了 normalization=1(将排名除以 1 + 文档长度的对数),但您可以根据您的实际需要调整它.这是示例:

WITH s(id,tsv) AS ( VALUES
(1,to_tsvector('english','Alice Cooper: Old School: 1964-1974 (DVD/CD Combo)')),
(2,to_tsvector('english','American Classics: Old School: Kings Of Kustomizing')),
(3,to_tsvector('english','Old School Hip Hop Greatest')),
(4,to_tsvector('english','Old School Returns'))
)
SELECT id,ts_rank(tsv,tsq,1) AS rank
FROM s,to_tsquery('english','old & school') tsq
ORDER BY rank DESC;

结果:

 id |   rank    
----+-----------
4 | 0.0495516
3 | 0.0383384
2 | 0.0353013
1 | 0.0312636
(4 rows)

关于postgresql - Postgres 全文搜索按位置排序,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/35643744/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com