
java - How to get all terms of a Lucene field in Lucene 4

Reposted. Author: 搜寻专家. Updated: 2023-10-30 21:12:41

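I'm trying to update my code from Lucene 3.4 to 4.1. I have figured out all of the changes except one. I have code that needs to iterate over all term values of a single field. In Lucene 3.1 there was an IndexReader#terms() method that returned a TermEnum I could iterate over. This seems to have changed in Lucene 4.1, and even after several hours of searching the documentation I can't figure out how to do it. Can someone point me in the right direction?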

Thanks.

Best answer

Please follow the Lucene 4 Migration guide:

How you obtain the enums has changed. The primary entry point is the Fields class. If you know your reader is a single segment reader, do this:

Fields fields = reader.fields();
if (fields != null) {
...
}

If the reader might be multi-segment, you must do this:

Fields fields = MultiFields.getFields(reader);
if (fields != null) {
...
}

The fields may be null (e.g. if the reader has no fields).

Note that the MultiFields approach entails a performance hit on MultiReaders, as it must merge terms/docs/positions on the fly. It's generally better to instead get the sequential readers (use oal.util.ReaderUtil) and then step through those readers yourself, if you can (this is how Lucene drives searches).

If you pass a SegmentReader to MultiFields.getFields it will simply return reader.fields(), so there is no performance hit in that case.
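To make the cost of merging on the fly more concrete, here is a rough, self-contained sketch of a k-way merge over per-segment sorted term lists. It is only an illustration and not Lucene's implementation: the class SegmentTermMerge, the Cursor helper, and the use of plain Java strings in place of Lucene's byte-ordered terms are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration: each "segment" is a sorted list of terms.
// Lucene orders terms by their bytes; plain String ordering is used here
// only to keep the sketch short.
public final class SegmentTermMerge {

    private SegmentTermMerge() {}

    // One cursor per segment, positioned on that segment's current term.
    private static final class Cursor implements Comparable<Cursor> {
        private final Iterator<String> rest;
        private String current;

        Cursor(Iterator<String> terms) {
            this.rest = terms;
            this.current = terms.next();
        }

        // Moves to the next term; returns false when the segment is exhausted.
        boolean advance() {
            if (!rest.hasNext()) {
                return false;
            }
            current = rest.next();
            return true;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }

    // Merges the per-segment term lists into one sorted list without duplicates.
    public static List<String> mergeSortedTerms(List<List<String>> segments) {
        PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>();
        for (List<String> segment : segments) {
            if (!segment.isEmpty()) {
                queue.add(new Cursor(segment.iterator()));
            }
        }
        List<String> merged = new ArrayList<String>();
        while (!queue.isEmpty()) {
            Cursor smallest = queue.poll();
            // The same term can occur in several segments; emit it once.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(smallest.current)) {
                merged.add(smallest.current);
            }
            if (smallest.advance()) {
                queue.add(smallest);
            }
        }
        return merged;
    }
}
```

Every step pays for a priority-queue operation across all segments, which is the kind of overhead that stepping through each segment's reader separately avoids.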

Once you have a non-null Fields you can do this:

Terms terms = fields.terms("field");
if (terms != null) {
...
}

The terms may be null (e.g. if the field does not exist).

Once you have a non-null terms you can get an enum like this:

TermsEnum termsEnum = terms.iterator(null); // Lucene 4.x takes a TermsEnum to reuse (or null); Lucene 5+ uses terms.iterator()

The returned TermsEnum will not be null.

You can then .next() through the TermsEnum; next() returns each term as a BytesRef and returns null once the terms are exhausted.
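Putting the steps above together, a minimal sketch for Lucene 4.x might look like the following. The class name FieldTerms and the method collectTerms are hypothetical names chosen for this example, and it assumes the terms of the field are UTF-8 text so that they can be decoded with utf8ToString(). It needs lucene-core 4.x on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Fields;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper, not part of Lucene.
public final class FieldTerms {

    private FieldTerms() {}

    // Returns every term of the given field, in the order the TermsEnum yields them.
    public static List<String> collectTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<String>();

        // Works for single-segment and multi-segment readers alike.
        Fields fields = MultiFields.getFields(reader);
        if (fields == null) {
            return result; // the reader has no fields
        }

        Terms terms = fields.terms(field);
        if (terms == null) {
            return result; // the field does not exist
        }

        // Lucene 4.x signature: pass null when there is no TermsEnum to reuse.
        TermsEnum termsEnum = terms.iterator(null);
        BytesRef term;
        while ((term = termsEnum.next()) != null) {
            // The BytesRef may be reused by the enum, so decode it right away.
            result.add(term.utf8ToString());
        }
        return result;
    }
}
```

A call such as FieldTerms.collectTerms(reader, "field") would then return the terms as strings. For large multi-segment indexes, the performance note from the migration guide still applies to this approach.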

Regarding "java - How to get all terms of a Lucene field in Lucene 4", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15290980/
