
solr - Highlighting an exact phrase with Solr

Reposted. Author: 行者123. Updated: 2023-12-01 05:00:28

I am using SolrJ as the client to index documents on a Solr server. I am new to Solr and I have a problem with highlighting: highlighting an exact phrase does not work.

For example, if the keyword is "dulce hogar", it returns:

<i> dulce </i> <i> hogar </i> 

It should be:

<i> dulce hogar </i> 

I don't understand what the problem is.

My configuration in schema.xml:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

In solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">

  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
    <bool name="hl.usePhraseHighlighter">true</bool>
  </lst>

</requestHandler>
<!-- Highlighting Component

http://wiki.apache.org/solr/HighlightingParameters
-->
<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <!-- Configure the standard fragmenter -->
    <!-- This could most likely be commented out in the "default" case -->
    <fragmenter name="gap"
                default="true"
                class="solr.highlight.GapFragmenter">
      <lst name="defaults">
        <int name="hl.fragsize">100</int>
      </lst>
    </fragmenter>

    <!-- A regular-expression-based fragmenter
    (for sentence extraction)
    -->
    <fragmenter name="regex"
                class="solr.highlight.RegexFragmenter" default="true">
      <lst name="defaults">
        <!-- slightly smaller fragsizes work better because of slop -->
        <int name="hl.fragsize">70</int>
        <!-- allow 50% slop on fragment sizes -->
        <float name="hl.regex.slop">0.5</float>
        <!-- a basic sentence pattern -->
        <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
        <bool name="hl.usePhraseHighlighter">true</bool>
        <bool name="hl.highlightMultiTerm">true</bool>
      </lst>
    </fragmenter>

    <!-- Configure the standard formatter -->
    <formatter name="html"
               default="true"
               class="solr.highlight.HtmlFormatter">
      <lst name="defaults">
        <str name="hl.simple.pre"><![CDATA[<em>]]></str>
        <str name="hl.simple.post"><![CDATA[</em>]]></str>
      </lst>
    </formatter>

Thanks in advance for your help,

锡尔。

Best answer

See this post. You need to set the hl.q="dulce hogar" parameter, and also enable the FastVectorHighlighter and the phrase highlighter (hl.usePhraseHighlighter).
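As a rough illustration of the parameters the answer mentions, here is a minimal sketch that builds the query string for a /select request using only the Java standard library. The class name PhraseHighlightQuery and the helper buildQuery are hypothetical, and the field name text is assumed from the df default in the question's solrconfig.xml. Note that hl.useFastVectorHighlighter only takes effect if the highlighted field is indexed with termVectors, termPositions and termOffsets enabled. With SolrJ, the same name/value pairs could presumably be set on a SolrQuery through its set(name, value) method.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: assembles the request parameters suggested in the answer.
class PhraseHighlightQuery {

    // Builds a URL-encoded query string that asks Solr to highlight `phrase`
    // as one unit in `field` (the field name is an assumption, not from the answer).
    static String buildQuery(String phrase, String field) {
        String quoted = "\"" + phrase + "\"";   // quotes make it a phrase query

        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", field + ":" + quoted);
        params.put("hl", "true");                          // turn highlighting on
        params.put("hl.fl", field);                        // field to highlight
        params.put("hl.q", quoted);                        // highlight the phrase itself
        params.put("hl.usePhraseHighlighter", "true");
        params.put("hl.useFastVectorHighlighter", "true"); // needs term vectors on the field

        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joined.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Append the result to the handler URL, e.g. <solr core URL>/select?<query string>
        System.out.println(buildQuery("dulce hogar", "text"));
    }
}
```

The resulting string can be appended to the /select handler URL of the core being queried. Whether the phrase comes back as a single highlighted span still depends on the field's analysis chain and on how the field was indexed.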

Regarding "solr - Highlighting an exact phrase with Solr", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19266432/
