
search - Duplicates when searching via text_general

Reposted. Author: 行者123. Updated: 2023-12-01 23:18:41

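I'm using Solr 8 and running it with an almost default configuration. I want to search across several text fields, so I copy them into the _text_ field, which has the type text_general.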

Here is part of my managed_schema.xml:

<field name="id" type="string" stored="true" required="true" />
<field name="p_name" type="string" indexed="true" stored="true" required="true" />
<field name="p_additional_info" type="string" indexed="true" stored="true" />
<field name="p_brand" type="string" indexed="true" stored="true" />
<field name="p_manufacturer" type="string" indexed="true" stored="true" />
<field name="p_image_link" type="string" indexed="false" stored="true" />

<uniqueKey>id</uniqueKey>

<field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
<field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- copy these 3 values to the basic search field -->
<copyField source="p_name" dest="_text_"/>
<copyField source="p_brand" dest="_text_"/>
<copyField source="p_additional_info" dest="_text_"/>

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

When I query with q=*:*, the results contain no duplicates. When I query with q=men, I see the following:

{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"q":"men",
"_":"1555579368807"}},
"response":{"numFound":9,"start":0,"docs":[
{
"p_additional_info":"A creme especially made for men, suitable for face, body and hands.",
"p_name":"NIVEA MEN CREME",
"id":"16",
"_version_":1630705876203470848},
{
"p_additional_info":"A creme especially made for men, suitable for face, body and hands.",
"p_name":"NIVEA MEN CREME",
"id":"16",
"_version_":1630702978343108608},
...
]}}
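One quick way to confirm the duplication in a response like the one above is to count how often each id occurs among the returned docs. The following is a minimal Python sketch and not part of the original question; the helper name find_duplicate_ids and the trimmed sample response are illustrative assumptions.

```python
from collections import Counter


def find_duplicate_ids(response, key="id"):
    """Return {id: count} for every id that occurs more than once in the docs."""
    docs = response.get("response", {}).get("docs", [])
    counts = Counter(doc[key] for doc in docs if key in doc)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Trimmed copy of the response above: same id, two different _version_ values.
sample = {
    "response": {
        "numFound": 9,
        "start": 0,
        "docs": [
            {"p_name": "NIVEA MEN CREME", "id": "16",
             "_version_": 1630705876203470848},
            {"p_name": "NIVEA MEN CREME", "id": "16",
             "_version_": 1630702978343108608},
        ],
    }
}

if __name__ == "__main__":
    # Report which ids appear more than once in the sample response.
    print(find_duplicate_ids(sample))
```

Two documents that share an id but carry different _version_ values suggest that the uniqueKey is not being enforced on the index as expected, which points at the schema rather than at the query.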

Does anyone know how to fix this?

UPD: I import my documents from the DB via DIH:

<dataConfig>
<dataSource name="jdbc" driver="org.postgresql.Driver" url="jdbc:postgresql://mydb.rds.amazonaws.com:5432/myapp" user="***" password="***" readOnly="true" />
<document>
<entity name="products"
query="select id, additional_info, brand, name, image_link, country, manufacturer from products"
dataSource="jdbc" pk="id">

<field name="id" column="id" />
<field name="p_additional_info" column="additional_info" />
<field name="p_brand" column="brand" />
<field name="p_name" column="name" />
<field name="p_country" column="country" />
<field name="p_manufacturer" column="manufacturer" />
<field name="p_image_link" column="image_link" />

</entity>
</document>
</dataConfig>

Best answer

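It seems my managed_schema contained some errors that I introduced while editing it by hand. After I created a new one and made the changes through the Schema API, everything worked, just as @EricLavault had suspected.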
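For illustration, the fields and copyField rules from the question could be expressed as Schema API commands roughly as follows. This is a sketch under assumptions, not the author's actual commands: the URL, the core name products, and both helper names are hypothetical, and it assumes a freshly created core in which id and _text_ already exist (as in Solr's default configset) while the p_* fields do not. The add-field and add-copy-field commands and the /schema endpoint belong to Solr's Schema API.

```python
import json
import urllib.request

# Hypothetical location of the core; adjust to your installation.
SOLR_SCHEMA_URL = "http://localhost:8983/solr/products/schema"


def build_schema_commands():
    """Build Schema API commands mirroring the p_* fields and copyFields in the question."""
    fields = [
        {"name": "p_name", "type": "string", "indexed": True,
         "stored": True, "required": True},
        {"name": "p_additional_info", "type": "string",
         "indexed": True, "stored": True},
        {"name": "p_brand", "type": "string", "indexed": True, "stored": True},
        {"name": "p_manufacturer", "type": "string",
         "indexed": True, "stored": True},
        {"name": "p_image_link", "type": "string",
         "indexed": False, "stored": True},
    ]
    # The three fields that feed the catch-all search field _text_.
    searchable = ["p_name", "p_brand", "p_additional_info"]
    copy_fields = [{"source": name, "dest": "_text_"} for name in searchable]
    return {"add-field": fields, "add-copy-field": copy_fields}


def post_schema_commands(commands, url=SOLR_SCHEMA_URL):
    """POST the commands to the Schema API and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


if __name__ == "__main__":
    # Only print the payload here; call post_schema_commands() against a live core.
    print(json.dumps(build_schema_commands(), indent=2))
```

Letting Solr apply the changes itself avoids the kind of hand-editing mistakes described above, because the Schema API validates each command and rewrites the managed schema file on its own. Documents indexed under the old schema would still need to be re-imported afterwards.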

For more on "search - Duplicates when searching via text_general", see the similar question on Stack Overflow: https://stackoverflow.com/questions/55743206/
