
hibernate-search - Hibernate Search 5.0 numeric Lucene query HSEARCH000233 problem

Reposted. Author: 行者123. Updated: 2023-12-01 11:34:55

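Question: How can we give Hibernate Search a raw Lucene query string that contains both numeric and non-numeric fields?

Background: We recently upgraded to Hibernate Search 5.0, and because of changes in the Hibernate Search query parser (pre-Lucene), many of our queries now fail with the following error:
The specified query contains a string based sub query which targets the numeric encoded field(s)
In most cases we use Lucene's text syntax together with a MultiFieldQueryParser to pass queries to Hibernate Search, because of the complexity of the queries we run. Up to Hibernate Search 5.0 this worked fine. Since the upgrade, Hibernate Search throws exceptions that stop our application from running queries that used to work. We do not understand why the exception is thrown or what the best way forward is.

In trying to track down the problem, I reduced it to the simplest form that shows what works and what does not. (This is built on Hibernate Search's QueryValidationTest.)

Example:

Given the following entity class: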

@Entity
@Indexed
public static class B {
    @Id
    @GeneratedValue
    private long id;

    @Field
    private long value;

    @Field
    private String text;
}

Test 1 (how we write queries for Hibernate Search: FAILURE):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","num"},new StandardAnalyzer());
Query query = parser.parse("+(value:1 text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();

This results in:
org.hibernate.search.exception.SearchException: HSEARCH000233: The specified query '+(value:1 text:test)' contains a string based sub query which targets the numeric encoded field(s) 'value'. Check your query or try limiting the targeted entities.
at org.hibernate.search.query.engine.impl.LazyQueryState.validateQuery(LazyQueryState.java:163)
at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:102)
at org.hibernate.search.query.engine.impl.QueryHits.updateTopDocs(QueryHits.java:227)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:122)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:94)
at org.hibernate.search.query.engine.impl.HSQueryImpl.getQueryHits(HSQueryImpl.java:436)
at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:257)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.list(FullTextQueryImpl.java:200)
at org.hibernate.search.test.query.validation.QueryValidationTest.testRawLuceneWithNumericValue(QueryValidationTest.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:62)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.hibernate.testing.junit4.FailureExpectedHandler.evaluate(FailureExpectedHandler.java:58)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.hibernate.testing.junit4.BeforeClassCallbackHandler.evaluate(BeforeClassCallbackHandler.java:43)
at org.hibernate.testing.junit4.AfterClassCallbackHandler.evaluate(AfterClassCallbackHandler.java:42)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Test 2 (the numeric range variant fails in the same way: FAILURE):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","text"},new StandardAnalyzer());
Query query = parser.parse("+(value:[1 TO 1] text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();

Test 3 (using Lucene term queries: SUCCESS):
TermQuery query = new TermQuery( new Term("text", "bar") );
TermQuery nq = new TermQuery( new Term("value", "1") );

BooleanQuery bq = new BooleanQuery();
bq.add(query, Occur.SHOULD);
bq.add(nq, Occur.SHOULD);

FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( bq, B.class );

Note: the full version of the test case, with tests that illustrate what we are seeing: https://github.com/abrin/hibernate-search/blob/3fdcc8229f0bfa00329b9d977172fd218d82cac2/orm/src/test/java/org/hibernate/search/test/query/validation/QueryValidationTest.java

Thanks

Best answer

First, the cause of your problem is that, as of Search 5, numeric types are indexed as Lucene numeric fields (as opposed to string-based fields). Besides the performance gain, this also allows, for example, sorting on numeric fields without padding. The Search 5 documentation says the following:

Prior to Search 5, numeric field encoding was only chosen if explicitly requested via @NumericField. As of Search 5 this encoding is automatically chosen for numeric types. To avoid numeric encoding you can explicitly specify a non numeric field bridge via @Field.bridge or @FieldBridge. The package org.hibernate.search.bridge.builtin contains a set of bridges which encode numbers as strings, for example org.hibernate.search.bridge.builtin.IntegerBridge.


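So, if you want to keep the old behavior, you need to make sure your numeric values are still indexed as strings. In your example, value needs to be indexed with org.hibernate.search.bridge.builtin.LongBridge. You can achieve this with the @FieldBridge annotation (you can ignore the id case, since document ids are indexed as strings anyway):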
@Field
@FieldBridge(impl = LongBridge.class)
private long value;
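
One thing to keep in mind when staying with string-indexed numbers is the padding remark above: strings compare character by character, so sorting and range checks only agree with numeric order if the values are padded to a fixed width. The following is a minimal sketch in plain Java, assuming the bridge writes the plain decimal string. The class and method names (PaddingSketch, pad, sortedAsStrings) are hypothetical and not part of Hibernate Search or Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only: contrasts string ordering with numeric ordering.
// All names here are hypothetical helpers, not library API.
final class PaddingSketch {

    // Left-pads a non-negative number with zeros to a fixed width,
    // so that string order matches numeric order.
    static String pad(long value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Sorts a copy of the values the way string terms are ordered:
    // lexicographically, character by character.
    static List<String> sortedAsStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded: "10" sorts before "2" because '1' < '2'.
        System.out.println(sortedAsStrings(Arrays.asList("9", "10", "2")));

        // Padded to width 3: string order now follows numeric order.
        System.out.println(sortedAsStrings(
                Arrays.asList(pad(9, 3), pad(10, 3), pad(2, 3))));
    }
}
```

The same effect applies to string range queries: an unpadded text range from "1" to "5" would also cover "10", since "10" lies between "1" and "5" lexicographically.
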
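Some comments on your test scenarios:
  • Test 1: The query parser only creates string-based queries. At this level Lucene has no knowledge of which fields are indexed numerically. Numeric fields can only be targeted/searched with an appropriate NumericRangeQuery. If you still want to use the query parser, you need to provide your own subclass and handle the numeric fields yourself. See also: How do I make the QueryParser in Lucene handle numeric ranges?
  • Test 2: Same problem. Even though you are using the range syntax value:[1 TO 1], it just creates a text/string range query.
  • Test 3: I think this does not actually work. It may not throw an exception, but I am pretty sure that if you look at the search results you will notice that the value term is ignored. A TermQuery is string based and cannot find matches in a numeric encoded field. See also: Lucene 3.0.3 Numeric term query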
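
To make the last point more concrete, here is a rough sketch in plain Java of why a string term such as "1" cannot match a numerically encoded value. The encodeLong method is a simplified illustration loosely modelled on the prefix coding Lucene uses for numeric fields; the exact byte layout, the marker value 0x20, and all names (NumericTermSketch, encodeLong, stringTerm) are assumptions made for this example, not Lucene API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: a simplified numeric term encoding compared with a string term.
// The byte layout is an assumption for this sketch, not a guaranteed match for Lucene's format.
final class NumericTermSketch {

    // Hypothetical marker byte that identifies a full-precision long term.
    static final int LONG_MARKER = 0x20;

    // Encodes a long as a marker byte followed by ten 7-bit groups.
    static byte[] encodeLong(long value) {
        int groups = 10;                       // 10 * 7 bits covers all 64 bits
        byte[] bytes = new byte[groups + 1];
        bytes[0] = (byte) LONG_MARKER;
        // Flip the sign bit so that byte order follows numeric order.
        long sortableBits = value ^ 0x8000000000000000L;
        for (int i = groups; i > 0; i--) {
            bytes[i] = (byte) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return bytes;
    }

    // A string-based term is simply the UTF-8 bytes of the text.
    static byte[] stringTerm(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] numeric = encodeLong(1L);
        byte[] text = stringTerm("1");
        System.out.println("numeric term bytes: " + Arrays.toString(numeric));
        System.out.println("string term bytes:  " + Arrays.toString(text));
        System.out.println("same term? " + Arrays.equals(numeric, text));
    }
}
```

Because the indexed term and the string term are different byte sequences, a TermQuery on "1" has nothing to match against in a numerically encoded field. For an exact match on such a field, a NumericRangeQuery whose lower and upper bounds are both 1 and both inclusive is the usual approach.
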
Regarding this Hibernate Search 5.0 numeric Lucene query HSEARCH000233 problem, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28138308/
