
java - Prefix search using Lucene

Reposted. Author: 行者123. Updated: 2023-11-30 08:15:39

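I am trying to use Lucene's search functionality for autocomplete. The code below searches by a query prefix, but it also returns every sentence that merely contains a word with that prefix, whereas I want it to show only the sentences or words that start with the prefix.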

For example:

    -- holiday mansion houseboat
    -- eye muscles
    -- movies of all time
    -- machine

I want it to show only the last two entries. I am stuck on how to do this, and I am new to Lucene. Any help would be appreciated. Thanks in advance.

    private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
        Document doc = new Document();
        // title is analyzed (tokenized), so queries run against its individual words
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.ANALYZED));

        // use a non-analyzed field for isbn because we don't want it tokenized
        doc.add(new Field("isbn", isbn, Field.Store.YES, Field.Index.NOT_ANALYZED));
        w.addDocument(doc);
    }

Main:

        try {
            // 0. Specify the analyzer for tokenizing text.
            // The same analyzer should be used for indexing and searching.
            StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

            // 1. Create the index
            Directory index = FSDirectory.open(new File(indexDir));
            IndexWriter writer = new IndexWriter(index, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);

            for (int i = 0; i < source.size(); i++) {
                addDoc(writer, source.get(i), (i + 1) + "z");
            }

            writer.close();

            // 2. Query: build a prefix query on the title field
            Term term = new Term("title", querystr);
            PrefixQuery query = new PrefixQuery(term);

            // 3. Search
            int hitsPerPage = 20;
            IndexReader reader = IndexReader.open(index);
            IndexSearcher searcher = new IndexSearcher(reader);
            TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
            searcher.search(query, collector);
            ScoreDoc[] hits = collector.topDocs().scoreDocs;

            // 4. Get results
            for (int i = 0; i < hits.length; ++i) {
                int docId = hits[i].doc;
                Document d = searcher.doc(docId);
                System.out.println(d.get("title"));
            }

            reader.close();
        } catch (Exception e) {
            System.out.println("Exception (LuceneAlgo.getSimilarString()) : " + e);
        }
    }
}
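
The extra hits described in the question are consistent with how an analyzed field is indexed: the analyzer splits each title into lowercase tokens, and a PrefixQuery is evaluated against those individual tokens rather than against the whole title. The plain-Java sketch below models that behavior without Lucene. The whitespace tokenizer is only a simplified stand-in for StandardAnalyzer, and the class name, method names, sample titles, and the prefix "m" are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TokenPrefixModel {

    // Simplified stand-in for an analyzer: lowercase, then split on whitespace.
    static String[] tokenize(String title) {
        return title.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    // Models a prefix query over an analyzed field:
    // a title is a hit if ANY of its tokens starts with the prefix.
    static List<String> anyTokenStartsWith(List<String> titles, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            for (String token : tokenize(title)) {
                if (token.startsWith(prefix)) {
                    hits.add(title);
                    break; // one matching token is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "holiday mansion houseboat", "eye muscles", "movies of all time", "machine");
        // All four titles are hits, because each contains some word starting with "m".
        System.out.println(anyTokenStartsWith(titles, "m"));
    }
}
```

In this model "holiday mansion houseboat" is a hit because of "mansion" and "eye muscles" because of "muscles", which mirrors the unwanted results in the question.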

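Best answer

I see two solutions:

  1. As Yahnoosh suggested, store the title field twice: once as a TextField (analyzed) and once as a StringField (not analyzed), and run the prefix query against the StringField

  2. Store it as a TextField, but query it with a SpanFirstQuery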

    // 2. Query
    Term term = new Term("title", querystr);
    // Create the prefix query, then wrap it so it can be used as a span query
    PrefixQuery pq = new PrefixQuery(term);
    SpanQuery wrapper = new SpanMultiTermQueryWrapper<PrefixQuery>(pq);
    // end = 1 keeps only matches on the first token of the field
    Query spanFirst = new SpanFirstQuery(wrapper, 1);
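
To make the difference between the two options concrete, here is a plain-Java sketch that models their matching rules without Lucene. It assumes option 1 compares the prefix against the whole untokenized title verbatim (so case and spaces matter), and that option 2 accepts a prefix match only on the first token of the analyzed title. The whitespace tokenizer is a simplified stand-in for a real analyzer, and the class name, method names, and sample titles are illustrative, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PrefixOptionsModel {

    // Option 1 model: prefix test against the whole, untokenized title.
    static List<String> wholeTitleStartsWith(List<String> titles, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            if (title.startsWith(prefix)) {
                hits.add(title);
            }
        }
        return hits;
    }

    // Option 2 model: prefix test against the first lowercase token only.
    static List<String> firstTokenStartsWith(List<String> titles, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            String[] tokens = title.toLowerCase(Locale.ROOT).trim().split("\\s+");
            if (tokens.length > 0 && tokens[0].startsWith(prefix)) {
                hits.add(title);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "holiday mansion houseboat", "eye muscles", "movies of all time", "machine");

        // Single-word prefix: both models return the last two titles.
        System.out.println(wholeTitleStartsWith(titles, "m"));
        System.out.println(firstTokenStartsWith(titles, "m"));

        // Prefix containing a space: only the whole-title model can match,
        // because no single token contains a space.
        System.out.println(wholeTitleStartsWith(titles, "movies of"));
        System.out.println(firstTokenStartsWith(titles, "movies of"));
    }
}
```

In this model the two options agree for a single-word prefix but diverge once the typed prefix contains a space, which may matter for autocomplete over multi-word titles. Conversely, the whole-title comparison is case-sensitive here, so the stored value and the typed prefix would likely need the same normalization.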

Regarding java - prefix search using Lucene, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29707267/
