
java - Loading a Lucene index previously written to HDFS into a RAMDirectory


This is the error message:

Exception in thread "main" org.apache.lucene.index.IndexNotFoundException: no segments* file found in RAMDirectory@1cff1d4a lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@2ddf0c3: files: [/prod/hdfs/LUCENE/index/140601/_0.cfe, /prod/hdfs/LUCENE/index/140601/segments_2, /prod/hdfs/LUCENE/index/140601/_0.si, /prod/hdfs/LUCENE/index/140601/segments.gen, /prod/hdfs/LUCENE/index/140601/_0.cfs]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:801)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)

I have committed and closed the index writer correctly.

Here is the searcher code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SearchFiles {

    private SearchFiles() {}

    public static void main(String[] args) throws Exception {

        String filenm = "";
        // Creating a FileSystem object, to be able to work with HDFS
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://127.0.0.1:9000/");
        config.addResource(new Path("/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/core-site.xml"));
        FileSystem dfs = FileSystem.get(config);

        // Getting the list of index files present in the directory into an array.
        FileStatus[] status = dfs.listStatus(new Path("/prod/hdfs/LUCENE/index/140601"));

        // Creating a RAMDirectory object, to be able to hold the index in memory.
        RAMDirectory rdir = new RAMDirectory();

        for (int i = 0; i < status.length; i++) {

            // Reading data from an index file in the HDFS directory.
            FSDataInputStream filereader = dfs.open(status[i].getPath());
            int size = filereader.available();

            // Reading data from the file into a byte array.
            byte[] bytarr = new byte[size];
            filereader.read(bytarr, 0, size);
            filereader.close();

            // Creating a file in the RAM directory with the same name as the
            // index file present in the HDFS directory.
            filenm = status[i].getPath().toString();
            String sSplitValue = filenm.substring(21, filenm.length());
            System.out.println(sSplitValue);

            IndexOutput indxout = rdir.createOutput(sSplitValue, null);

            // Writing data from the byte array to the file in the RAM directory.
            indxout.writeBytes(bytarr, bytarr.length);
            indxout.flush();
            indxout.close();
        }

        // IndexReader indexReader = IndexReader.open(rdir);
        IndexReader indexReader = DirectoryReader.open(rdir);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
        QueryParser parser = new QueryParser(Version.LUCENE_47, "FUNDG_SRCE_CD", analyzer);
        Query query = parser.parse("D");
        TopDocs results = searcher.search(query, 1000);

        int numTotalHits = results.totalHits;
        ScoreDoc[] hits = results.scoreDocs;

        // Printing the number of documents that match the search query.
        System.out.println("Total Hits = " + numTotalHits);
        for (int j = 0; j < hits.length; j++) {
            int docId = hits[j].doc;
            Document d = searcher.doc(docId);
            System.out.println(d.get("FUNDG_SRCE_CD") + " " + d.get("ACCT_NUM"));
        }
    }
}
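Worth noting: the exception above lists the RAMDirectory contents under their full HDFS paths (e.g. /prod/hdfs/LUCENE/index/140601/segments_2), which suggests the hard-coded substring(21, ...) is not stripping the directory prefix as intended, so Lucene finds no segments_* entry at the directory root. A minimal plain-Java sketch of a safer way to derive the bare file name (the baseName helper is hypothetical; Hadoop's Path.getName() does the same job directly):

```java
public class IndexFileName {

    // Derive the bare file name from a full path string, regardless of
    // how long the directory prefix is. Hard-coded offsets like
    // substring(21, ...) break as soon as the path length changes.
    static String baseName(String fullPath) {
        int slash = fullPath.lastIndexOf('/');
        return slash >= 0 ? fullPath.substring(slash + 1) : fullPath;
    }

    public static void main(String[] args) {
        System.out.println(baseName("/prod/hdfs/LUCENE/index/140601/segments_2")); // segments_2
        System.out.println(baseName("_0.cfe")); // _0.cfe
    }
}
```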

Best Answer

I don't think you should be passing null as the IOContext argument to createOutput. Try IOContext.DEFAULT instead. I can't say for sure whether that alone will fix it, but it's a step in the right direction.
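Separately from the IOContext point, the copy loop in the question relies on available() plus a single read(bytarr, 0, size), and a plain read is not guaranteed to fill the whole buffer in one call, which can silently truncate an index file. FSDataInputStream extends java.io.DataInputStream, so readFully can be used instead; a stdlib-only sketch of the safer pattern (ReadFullyDemo and copyFully are illustrative names, not part of the original code):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

public class ReadFullyDemo {

    // Copies all of src through a DataInputStream using readFully, which
    // loops internally until the destination buffer is completely filled
    // (or throws EOFException). A single read(buf, 0, len) call may
    // legally return after transferring fewer bytes than requested.
    static byte[] copyFully(byte[] src) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(src));
        byte[] dst = new byte[src.length];
        in.readFully(dst);
        return dst;
    }

    public static void main(String[] args) throws IOException {
        byte[] src = new byte[1024];
        for (int i = 0; i < src.length; i++) src[i] = (byte) i;
        System.out.println(Arrays.equals(src, copyFully(src))); // true
    }
}
```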

But why not make it easy on yourself? RAMDirectory has a constructor that copies an existing index from another Directory:

public static void main(String[] args) throws Exception {
    // FSDirectory is abstract; open a concrete instance via FSDirectory.open.
    Directory fsDirectory = FSDirectory.open(new File("/prod/hdfs/LUCENE/index/140601"));
    Directory rdir = new RAMDirectory(fsDirectory, IOContext.DEFAULT);
    IndexReader indexReader = DirectoryReader.open(rdir);
    // etc.
}

Note that FSDirectory reads from the local filesystem, so this approach applies when the index directory is accessible as a local path (for example, mounted or copied out of HDFS).

Regarding "java - Loading a Lucene index previously written to HDFS into a RamDirectory", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24636212/
