string - Using to_tsvector and to_tsquery to filter non-Roman characters

Reposted. Author: 行者123. Updated: 2023-11-29 13:21:08
I want to provide multilingual search support for my application.

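The PostgreSQL 9.6 documentation (Search Controls) says I need tsvector and tsquery to parse and normalize text correctly. This works for languages written in Roman script, but not for non-Roman characters.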

Consider this search snippet:

where to_tsvector(title) @@ to_tsquery('hola')

Here I am looking for the title "hola mi amiga", and it is found. However, given:

where to_tsvector(title) @@ to_tsquery('你') -- language = Chinese, code = zh-CN

Here I am looking for a title containing 你好吗, and it is not found.

What do I need to consider so that string normalization handles non-Roman characters?

Best answer

Make sure you have the correct configuration set:

default_text_search_config (string): Selects the text search configuration that is used by those variants of the text search functions that do not have an explicit argument specifying the configuration. See Chapter 12 for further information. The built-in default is pg_catalog.simple, but initdb will initialize the configuration file with a setting that corresponds to the chosen lc_ctype locale, if a configuration matching that locale can be identified.

You can see the current value with:

SHOW default_text_search_config;
or SELECT get_current_ts_config();

You can change it for the current session with SET default_text_search_config = newconfiguration; or, for a whole database, with ALTER DATABASE <db> SET default_text_search_config = newconfiguration;
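As a minimal sketch, the session-level and database-level settings could look like the following. The database name mydb is hypothetical, and pg_catalog.simple is used only because it is the built-in default named in the quoted documentation, not because it is known to suit Chinese text.

```sql
-- Show the configuration currently in effect for this session.
SHOW default_text_search_config;

-- Change it for the current session only.
SET default_text_search_config = 'pg_catalog.simple';

-- Make it the default for future sessions in one database.
-- (mydb is a hypothetical database name.)
ALTER DATABASE mydb SET default_text_search_config = 'pg_catalog.simple';
```

The ALTER DATABASE setting only applies to sessions started afterwards; the current session keeps its value until it is changed with SET or the session reconnects.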

From Chapter 12. Full Text Search

During installation an appropriate configuration is selected and default_text_search_config is set accordingly in postgresql.conf. If you are using the same text search configuration for the entire cluster you can use the value in postgresql.conf. To use different configurations throughout the cluster but the same configuration within any one database, use ALTER DATABASE ... SET. Otherwise, you can set default_text_search_config in each session.

Each text search function that depends on a configuration has an optional regconfig argument, so that the configuration to use can be specified explicitly. default_text_search_config is used only when this argument is omitted.

In psql, you can use \dF to list the text search configurations you have installed.
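Outside psql, roughly the same list can be read from the system catalogs. This is a sketch rather than the exact query that \dF runs; the column aliases are chosen here for readability.

```sql
-- List installed text search configurations with their schemas.
SELECT n.nspname AS schema_name,
       c.cfgname AS config_name
FROM pg_catalog.pg_ts_config AS c
JOIN pg_catalog.pg_namespace AS n ON n.oid = c.cfgnamespace
ORDER BY schema_name, config_name;
```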

So what you want is something like this:

where to_tsvector('newconfig', title) @@ to_tsquery('newconfig', '你')

I don't know the language of the query well enough to answer the rest, or to say which configuration would stem that language properly.
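One way to investigate is to look at how a given configuration tokenizes the title before comparing it with the query. The sketch below uses the built-in simple configuration purely as an example; the output depends on the database encoding and locale, so none is shown here.

```sql
-- How the parser and dictionaries of the 'simple' configuration
-- break the title into tokens and lexemes.
SELECT alias, token, lexemes
FROM ts_debug('simple', '你好吗');

-- Compare the document lexemes with the query lexeme side by side.
SELECT to_tsvector('simple', '你好吗') AS document,
       to_tsquery('simple', '你')      AS query,
       to_tsvector('simple', '你好吗') @@ to_tsquery('simple', '你') AS matches;
```

If ts_debug reports the whole string 你好吗 as a single token, a query for the single character 你 cannot match it as a complete lexeme, which would be consistent with the behavior described in the question.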

Regarding "string - Using to_tsvector and to_tsquery to filter non-Roman characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41601379/
