
java - org.apache.lucene.store.LockObtainFailedException : Lock obtain timed out:

Reposted · Author: 搜寻专家 · Updated: 2023-11-01 03:25:18

I am trying to index a large number of log files retrieved from a Tomcat server. I have written code that opens each file, creates an index entry for each line, and stores each line using Apache Lucene. All of this is done with multiple threads.

When I run this code, I get this exception:

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:

Code:

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.commit();
    indexWriter.close();

Then I thought that since I commit the index every time, it might be causing the write lock. So I removed indexWriter.commit():

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.close();

Now I no longer get the exception.

Q. So my question is: why does indexWriter.commit() cause the exception? Even after removing indexWriter.commit(), I have no problems when searching; I get exactly the results I want. So why use indexWriter.commit() at all?

Accepted Answer

In short, it is similar to a database commit: until you commit the transaction, documents added to Solr are only held in memory. Only on commit are the documents persisted in the index.
If Solr crashes while the documents are still in memory, you may lose those documents.
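As an analogy (plain JDK I/O, not Lucene's API; the class and file names below are invented for illustration), a BufferedWriter behaves the same way: bytes sit in an in-memory buffer, invisible to any reader and lost on a crash, until flush() pushes them to the file, much as commit() makes buffered documents durable and visible:

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CommitAnalogy {
    /**
     * Writes a line through a buffer and returns what a reader sees
     * before flush (index 0) and after flush (index 1).
     */
    public static String[] beforeAndAfterFlush() throws IOException {
        Path file = Files.createTempFile("analogy", ".txt");
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(file.toFile()))) {
            writer.write("doc-1");                    // like addDocument(): buffered in RAM
            String before = Files.readString(file);   // a concurrent reader sees nothing yet
            writer.flush();                           // like commit(): bytes reach the file
            String after = Files.readString(file);
            return new String[] { before, after };
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] r = beforeAndAfterFlush();
        System.out.println("before flush: [" + r[0] + "], after flush: [" + r[1] + "]");
    }
}
```

Running this prints an empty string before the flush and "doc-1" after it: the data existed the whole time, but only the flush made it observable outside the writer.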

Explanation:

One of the principles in Lucene since day one is the write-once policy. We never write a file twice. When you add a document via IndexWriter it gets indexed into the memory and once we have reached a certain threshold (max buffered documents or RAM buffer size) we write all the documents from the main memory to disk; you can find out more about this here and here. Writing documents to disk produces an entire new index called a segment. Now, when you index a bunch of documents or you run incremental indexing in production here you can see the number of segments changing frequently. However, once you call commit Lucene flushes its entire RAM buffer into segments, syncs them and writes pointers to all segments belonging to this commit into the SEGMENTS file.
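The write-once, segment, and commit-point mechanics described above can be sketched with plain files. This is a toy model, not Lucene's actual on-disk format, and every name here (SegmentSketch, the "_N.seg" and "SEGMENTS" file names, the flush threshold of 2) is invented for illustration: each flush writes a brand-new segment file that is never modified again, and commit records the full list of live segments in a SEGMENTS file that a reader treats as the only source of truth.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SegmentSketch {
    private final Path dir;
    private final List<String> ramBuffer = new ArrayList<>(); // docs held in memory
    private final List<String> flushed = new ArrayList<>();   // segment file names written so far
    private int segNo = 0;

    public SegmentSketch(Path dir) { this.dir = dir; }

    // "addDocument": buffer in RAM; flush a new write-once segment at a threshold
    public void addDocument(String doc) throws IOException {
        ramBuffer.add(doc);
        if (ramBuffer.size() >= 2) flush();
    }

    // Write-once policy: each flush creates a new segment file, never rewrites one
    private void flush() throws IOException {
        if (ramBuffer.isEmpty()) return;
        Path seg = dir.resolve("_" + (segNo++) + ".seg");
        Files.write(seg, ramBuffer);
        flushed.add(seg.getFileName().toString());
        ramBuffer.clear();
    }

    // "commit": flush the RAM buffer, then record every live segment in SEGMENTS
    public void commit() throws IOException {
        flush();
        Files.write(dir.resolve("SEGMENTS"), flushed);
    }

    // A "reader" only trusts segments listed in the last committed SEGMENTS file
    public static List<String> committedSegments(Path dir) throws IOException {
        Path segments = dir.resolve("SEGMENTS");
        return Files.exists(segments) ? Files.readAllLines(segments) : List.of();
    }
}
```

In this model, documents added but not yet committed live either in ramBuffer or in segment files that no SEGMENTS file points to; a reader opened before the commit sees none of them, which mirrors why searches only pick up new documents after commit.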

If a document already exists in Solr, it will be overwritten (as determined by its unique id).
Hence your search may still appear to work, but the latest documents are not searchable until you commit.

Also, once you open an IndexWriter it acquires a lock on the index, and you must close the writer to release that lock. Since IndexWriter is thread-safe, the usual pattern is to open one writer, share it across all your indexing threads, commit periodically, and close it once at the end, rather than opening a writer per file or per thread.
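The lock in question is file-based: Lucene keeps a write.lock file in the index directory. This JDK-only sketch (not Lucene code; the class name and helper are invented for illustration) shows the underlying behaviour: while one channel holds a lock on the file, a second attempt in the same JVM is rejected, which is essentially what a second IndexWriter on the same index runs into as LockObtainFailedException:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockDemo {
    /**
     * Returns true if a second lock attempt on the same file fails while the
     * first lock is still held -- the situation behind "Lock obtain timed out"
     * when two writers target one index.
     */
    public static boolean secondWriterIsRejected(Path lockFile) throws IOException {
        try (FileChannel first = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock held = first.lock()) {                  // first "IndexWriter" holds the lock
            try (FileChannel second = FileChannel.open(lockFile,
                    StandardOpenOption.WRITE)) {
                second.tryLock();                             // second "IndexWriter" tries too
                return false;                                 // lock unexpectedly granted
            } catch (OverlappingFileLockException alreadyHeld) {
                return true;                                  // rejected while the first lock lives
            }
        }                                                     // closing releases the lock
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempFile("write", ".lock");
        System.out.println("second writer rejected: " + secondWriterIsRejected(lock));
        Files.deleteIfExists(lock);
    }
}
```

This also suggests why your symptoms depended on timing rather than on commit() itself: with one writer per thread, whichever thread still holds the lock when another opens a writer causes the failure; removing commit() merely changed the timing, not the underlying contention.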

Regarding java - org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15470674/
