grails - Searchable index gets locked on manual update (LockObtainFailedException)

Reposted. Author: 行者123. Updated: 2023-12-02 03:54:49

We have a Grails project that runs behind a load balancer. There are three instances of the Grails application running on the server (using separate Tomcat instances). Each instance has its own searchable index. Because the indexes are separate, automatic updating is not enough to keep them consistent across application instances. For this reason we have disabled searchable index mirroring, and updates to the index are done manually in a scheduled quartz job. To our understanding, no other part of the application should modify the index.

The quartz job runs once a minute; it checks the database for rows that the application has updated and re-indexes those objects. The job also checks whether the same job is already running, so it does not do any concurrent indexing. The application runs fine for a few hours after startup, and then, as the job starts, it suddenly throws a LockObtainFailedException:

22.10.2012 11:20:40 [xxxx.ReindexJob] ERROR Could not update searchable index, class org.compass.core.engine.SearchEngineException: Failed to open writer for sub index [product]; nested exception is org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/home/xxx/tomcat/searchable-index/index/product/lucene-a7bbc72a49512284f5ac54f5d7d32849-write.lock

According to the log from the job's previous execution, re-indexing finished without any errors and the job completed successfully. Still, this re-index attempt throws the locking exception, as if the previous operation had not finished and the lock had not been released. The lock is not released until the application is restarted.

We tried to work around the problem by manually opening the locked index, which caused the following error to be printed to the log:

22.10.2012 11:21:30 [manager.IndexWritersManager ] ERROR Illegal state, marking an index writer as open, while another is marked as open for sub index [product]

After this the job seems to work correctly and does not run into the locking issue again. However, it causes the application to constantly use 100% of the CPU. Below is a simplified version of the quartz job code.

Any help with solving the problem would be much appreciated. Thanks in advance.
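The ConcurrencyHelper used in the job below is not shown in the question. As an illustration only, a minimal non-reentrant run guard of the kind the job relies on could be built on an AtomicBoolean (JobGuard is a hypothetical name, not part of the project):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a run guard like the ConcurrencyHelper used below:
// only one caller at a time may hold the guard, and a second acquisition
// attempt fails immediately instead of blocking.
class JobGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    // Returns true only if the guard was free; the caller then owns it.
    boolean tryAcquire() {
        return running.compareAndSet(false, true);
    }

    // Frees the guard so the next run can acquire it.
    void release() {
        running.set(false);
    }
}
```

Note that a guard like this only prevents two copies of the job from indexing at once; it says nothing about Lucene's own file-based write lock, which is the one in the stack trace above.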

class ReindexJob {

    def compass
    ...

    static Calendar lastIndexed

    static triggers = {
        // Every day every minute (at xx:xx:30), start delay 2 min
        // cronExpression: "s m h D M W [Y]"
        cron name: "ReindexTrigger", cronExpression: "30 * * * * ?", startDelay: 120000
    }

    def execute() {
        if (ConcurrencyHelper.isLocked(ConcurrencyHelper.Locks.LUCENE_INDEX)) {
            log.error("Search index has been locked, not doing anything.")
            return
        }

        try {
            boolean acquiredLock = ConcurrencyHelper.lock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
            if (!acquiredLock) {
                log.warn("Could not lock search index, not doing anything.")
                return
            }

            Calendar reindexDate = lastIndexed
            Calendar newReindexDate = Calendar.instance
            if (!reindexDate) {
                reindexDate = Calendar.instance
                reindexDate.add(Calendar.MINUTE, -3)
                lastIndexed = reindexDate
            }

            log.debug("+++ Starting ReindexJob, last indexed ${TextHelper.formatDate("yyyy-MM-dd HH:mm:ss", reindexDate.time)} +++")
            Long start = System.currentTimeMillis()

            String reindexMessage = ""

            // Retrieve the ids of products that have been modified since the job last ran
            String productQuery = "select p.id from Product ..."
            List<Long> productIds = Product.executeQuery(productQuery, ["lastIndexedDate": reindexDate.time, "lastIndexedCalendar": reindexDate])

            if (productIds) {
                reindexMessage += "Found ${productIds.size()} product(s) to reindex. "

                final int BATCH_SIZE = 10
                Long time = TimeHelper.timer {
                    for (int inserted = 0; inserted < productIds.size(); inserted += BATCH_SIZE) {
                        log.debug("Indexing from ${inserted + 1} to ${Math.min(inserted + BATCH_SIZE, productIds.size())}: ${productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size()))}")
                        Product.reindex(productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size())))
                        Thread.sleep(250)
                    }
                }

                reindexMessage += " (${time / 1000} s). "
            } else {
                reindexMessage += "No products to reindex. "
            }

            log.debug(reindexMessage)

            // Re-index brands
            Brand.reindex()

            lastIndexed = newReindexDate

            log.debug("+++ Finished ReindexJob (${(System.currentTimeMillis() - start) / 1000} s) +++")
        } catch (Exception e) {
            log.error("Could not update searchable index, ${e.class}: ${e.message}")
            if (e instanceof org.apache.lucene.store.LockObtainFailedException || e instanceof org.compass.core.engine.SearchEngineException) {
                log.info("This is a Lucene index locking exception.")
                for (String subIndex in compass.searchEngineIndexManager.getSubIndexes()) {
                    if (compass.searchEngineIndexManager.isLocked(subIndex)) {
                        log.info("Releasing Lucene index lock for sub index ${subIndex}")
                        compass.searchEngineIndexManager.releaseLock(subIndex)
                    }
                }
            }
        } finally {
            ConcurrencyHelper.unlock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
        }
    }
}
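The loop above walks productIds in fixed-size sublists, sleeping briefly between batches. Stripped of the Grails specifics, the subList arithmetic can be sketched in plain Java (BatchHelper is a hypothetical name, not part of the project):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the subList arithmetic of the reindex loop:
// split a list of ids into consecutive batches of at most batchSize elements.
class BatchHelper {
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // Math.min guards the final, possibly short, batch
            result.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return result;
    }
}
```

In the job, each such batch is passed to Product.reindex(), with a short Thread.sleep between batches so indexing does not monopolize the writer.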

Based on JMX CPU samples, Compass appears to be doing some scheduling behind the scenes. From 1-minute CPU samples, a few things stand out when comparing a normal instance with a 100%-CPU instance:

  • org.apache.lucene.index.IndexWriter.doWait() is using most of the CPU time.
  • A Compass Scheduled Executor Thread is shown in the thread list; it is not seen in the normal situation.
  • One Compass Executor Thread is doing commitMerge; in the normal situation, none of these threads is doing commitMerge.

Best answer

You can try increasing the 'compass.transaction.lockTimeout' setting. The default is 10 (seconds).

Another option is to disable concurrency in Compass and make it synchronous. This is controlled with the 'compass.transaction.processor.read_committed.concurrentOperations': 'false' setting. You may also have to set 'compass.transaction.processor' to 'read_committed'.

These are the Compass settings we currently use:

compassSettings = [
    'compass.engine.optimizer.schedule.period': '300',
    'compass.engine.mergeFactor': '1000',
    'compass.engine.maxBufferedDocs': '1000',
    'compass.engine.ramBufferSize': '128',
    'compass.engine.useCompoundFile': 'false',
    'compass.transaction.processor': 'read_committed',
    'compass.transaction.processor.read_committed.concurrentOperations': 'false',
    'compass.transaction.lockTimeout': '30',
    'compass.transaction.lockPollInterval': '500',
    'compass.transaction.readCommitted.translog.connection': 'ram://'
]

This turns off concurrency. You can make it asynchronous again by changing the 'compass.transaction.processor.read_committed.concurrentOperations' setting to 'true' (or by removing the entry).

Compass configuration reference: http://static.compassframework.org/docs/latest/core-configuration.html

Documentation on the concurrency of the read_committed processor: http://www.compass-project.org/docs/latest/reference/html/core-searchengine.html#core-searchengine-transaction-read_committed

If you want to keep asynchronous operation, you can also control the number of threads it uses. The compass.transaction.processor.read_committed.concurrencyLevel=1 setting allows asynchronous operation but uses only one thread (the default is 5 threads). There are also the compass.transaction.processor.read_committed.backlog and compass.transaction.processor.read_committed.addTimeout settings.
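Putting these suggestions together, a configuration that keeps asynchronous processing but throttles it to a single thread might look like the following sketch (the backlog and addTimeout values here are illustrative placeholders, not recommendations):

```groovy
compassSettings = [
    'compass.transaction.processor': 'read_committed',
    // Keep asynchronous operations enabled...
    'compass.transaction.processor.read_committed.concurrentOperations': 'true',
    // ...but limit the processor to a single worker thread (default is 5)
    'compass.transaction.processor.read_committed.concurrencyLevel': '1',
    // Illustrative values only: queue size for pending operations, and how
    // long (ms) to wait when the backlog is full
    'compass.transaction.processor.read_committed.backlog': '100',
    'compass.transaction.processor.read_committed.addTimeout': '10000'
]
```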

I hope this helps.

The original question, grails - Searchable index gets locked on manual update (LockObtainFailedException), can be found on Stack Overflow: https://stackoverflow.com/questions/13123173/
