java - How to improve Neo4j 2.0 cypher/ExecutionResult performance under heavy load?


Background: we have noticed that performance degrades when retrieving data from an ExecutionResult as the number of concurrent threads increases. Our production application has 200 worker threads and uses Neo4j 2.0.0 Community in embedded mode. Example timings, in milliseconds:

  1. Threads: 1, Cypher time: 0, fetch time: 188
  2. Threads: 10, Cypher time: 1, fetch time: 188
  3. Threads: 50, Cypher time: 1, fetch time: 2481
  4. Threads: 100, Cypher time: 1, fetch time: 4466

Example program output (filtered to the results of one thread):

2013-12-23 14:39:31,137 [main] INFO  net.ahm.graph.CypherLab  - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 1
2013-12-23 14:39:31,137 [main] INFO net.ahm.graph.CypherLab - >>>> STARTED GRAPHDB
2013-12-23 14:39:39,203 [main] INFO net.ahm.graph.CypherLab - >>>> CREATED NODES
2013-12-23 14:39:43,510 [main] INFO net.ahm.graph.CypherLab - >>>> WARMED UP
2013-12-23 14:39:43,510 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER TOOK: 0 m-secs
2013-12-23 14:39:43,698 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> GETTING RESULTS TOOK: 188 m-secs
2013-12-23 14:39:43,698 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:39:43,698 [Thread-4] INFO net.ahm.graph.CypherLab - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!



2013-12-23 14:40:10,470 [main] INFO net.ahm.graph.CypherLab - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 10
...
2013-12-23 14:40:23,985 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:40:25,219 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> GETTING RESULTS TOOK: 188 m-secs
2013-12-23 14:40:25,219 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:40:25,234 [Thread-4] INFO net.ahm.graph.CypherLab - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!


2013-12-23 14:41:28,850 [main] INFO net.ahm.graph.CypherLab - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 50
...
2013-12-23 14:41:41,781 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:41:45,720 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> GETTING RESULTS TOOK: 2481 m-secs
2013-12-23 14:41:45,720 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:41:46,855 [Thread-4] INFO net.ahm.graph.CypherLab - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!


2013-12-23 14:44:09,267 [main] INFO net.ahm.graph.CypherLab - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 100
...
2013-12-23 14:44:22,077 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:44:30,915 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> GETTING RESULTS TOOK: 4466 m-secs
2013-12-23 14:44:30,915 [pool-1-thread-1] INFO net.ahm.graph.CypherLab - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:44:31,680 [Thread-4] INFO net.ahm.graph.CypherLab - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!

Test program:

package net.ahm.graph;

import java.io.File;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.log4j.Logger;
import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.DynamicLabel;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.graphdb.factory.GraphDatabaseSettings;
import org.neo4j.graphdb.schema.IndexDefinition;
import org.neo4j.graphdb.schema.Schema;
import org.neo4j.kernel.impl.util.FileUtils;
import org.neo4j.kernel.impl.util.StringLogger;

public class CypherLab {
    private static final Logger LOG = Logger.getLogger(CypherLab.class);
    private final static int CONCURRENCY = 100;

    public static void main(String[] args) throws Exception {
        FileUtils.deleteRecursively(new File("graphdb"));
        final GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder("graphdb")
                .setConfig(GraphDatabaseSettings.use_memory_mapped_buffers, "true")
                .setConfig(GraphDatabaseSettings.cache_type, "strong")
                .newGraphDatabase();
        registerShutdownHook(graphDb);
        LOG.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: " + CONCURRENCY);
        LOG.info(">>>> STARTED GRAPHDB");
        createIndex("Parent", "name", graphDb);
        createIndex("Child", "name", graphDb);
        try (Transaction tx = graphDb.beginTx()) {
            Node parent = graphDb.createNode(DynamicLabel.label("Parent"));
            parent.setProperty("name", "parent");
            for (int i = 0; i < 50000; i++) {
                Node child = graphDb.createNode(DynamicLabel.label("Child"));
                child.setProperty("name", "child" + i);
                parent.createRelationshipTo(child, RelationshipTypes.PARENT_CHILD);
            }
            tx.success();
        }
        LOG.info(">>>> CREATED NODES");
        final ExecutionEngine engine = new ExecutionEngine(graphDb, StringLogger.SYSTEM);
        for (int i = 0; i < 10; i++) {
            try (Transaction tx = graphDb.beginTx()) {
                ExecutionResult result = engine.execute("match (n:Parent)-[:PARENT_CHILD]->(m:Child) return n.name, m.name");
                for (Map<String, Object> row : result) {
                    assert ((String) row.get("n.name") != null);
                    assert ((String) row.get("m.name") != null);
                }
                tx.success();
            }
        }
        LOG.info(">>>> WARMED UP");
        ExecutorService es = Executors.newFixedThreadPool(CONCURRENCY);
        final CountDownLatch cdl = new CountDownLatch(CONCURRENCY);
        for (int i = 0; i < CONCURRENCY; i++) {
            es.execute(new Runnable() {
                @Override
                public void run() {
                    try (Transaction tx = graphDb.beginTx()) {
                        long time = System.currentTimeMillis();
                        ExecutionResult result = engine.execute("match (n:Parent)-[:PARENT_CHILD]->(m:Child) return n.name, m.name");
                        LOG.info(">>>> CYPHER TOOK: " + (System.currentTimeMillis() - time) + " m-secs");
                        int count = 0;
                        time = System.currentTimeMillis();
                        for (Map<String, Object> row : result) {
                            assert ((String) row.get("n.name") != null);
                            assert ((String) row.get("m.name") != null);
                            count++;
                        }
                        LOG.info(">>>> GETTING RESULTS TOOK: " + (System.currentTimeMillis() - time) + " m-secs");
                        tx.success();
                        LOG.info(">>>> CYPHER RETURNED ROWS: " + count);
                    } catch (Throwable t) {
                        LOG.error(t);
                    } finally {
                        cdl.countDown();
                    }
                }
            });
        }
        cdl.await();
        es.shutdown();
    }

    private static void createIndex(String label, String propertyName, GraphDatabaseService graphDb) {
        IndexDefinition indexDefinition;
        try (Transaction tx = graphDb.beginTx()) {
            Schema schema = graphDb.schema();
            indexDefinition = schema.indexFor(DynamicLabel.label(label)).on(propertyName).create();
            tx.success();
        }
        try (Transaction tx = graphDb.beginTx()) {
            Schema schema = graphDb.schema();
            schema.awaitIndexOnline(indexDefinition, 10, TimeUnit.SECONDS);
            tx.success();
        }
    }

    private static void registerShutdownHook(final GraphDatabaseService graphDb) {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                LOG.info("### GRAPHDB SHUTDOWNHOOK INVOKED !!!");
                graphDb.shutdown();
            }
        });
    }

    private enum RelationshipTypes implements RelationshipType {
        PARENT_CHILD
    }
}

Best answer

It should get better once this commit is merged in; that will ship as part of 2.0.1. There are a few other smaller bottlenecks as well.

Can you try limiting your webserver threads to a multiple of your cores (or cores * 2) and see whether that helps?

My understanding is that after warming up and getting the hot dataset into the cache, the reads are purely CPU-bound and no longer I/O-bound. So with too many threads you end up starving the CPUs and the workers.
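To make that suggestion concrete, here is a minimal sketch of sizing the worker pool to the machine's cores instead of a fixed 100 or 200 threads. The BoundedWorkers class and the runQuery() placeholder are illustrative stand-ins and not code from the question or the answer.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedWorkers {
    public static void main(String[] args) throws InterruptedException {
        // Size the pool to the number of cores (or cores * 2), as suggested above.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService es = Executors.newFixedThreadPool(cores);

        // Submit all 100 logical tasks; only `cores` of them run at any one time,
        // so iterating the ExecutionResult does not oversubscribe the CPUs.
        for (int i = 0; i < 100; i++) {
            es.execute(new Runnable() {
                @Override
                public void run() {
                    runQuery(); // placeholder for the question's Cypher-executing Runnable body
                }
            });
        }
        es.shutdown();
        es.awaitTermination(10, TimeUnit.MINUTES);
    }

    private static void runQuery() {
        // hypothetical stand-in: open a transaction, execute the Cypher statement,
        // iterate the rows and mark the transaction successful here
    }
}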

If I run the test with 8 and with 100 threads, I get the following distributions for executing the query and fetching the 50k results:

  • 8 threads: 50th percentile at 500 ms, 90th percentile at 650 ms
  • 100 threads: 50th percentile at 2600 ms, 90th percentile at 6000 ms

Code and detailed histograms: https://gist.github.com/jexp/a164f6cf9686b8125872
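For reference, a latency distribution like the one quoted above can be produced by recording each worker's fetch time and sorting the samples. The snippet below is a minimal, self-contained sketch of that bookkeeping; it is not the code from the gist, and the sample values are placeholders.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LatencyPercentiles {
    public static void main(String[] args) {
        // In the real test each worker would add its measured fetch time here
        // (e.g. via a synchronized or concurrent collection); these values are made up.
        List<Long> fetchMillis = new ArrayList<Long>();
        Collections.addAll(fetchMillis, 480L, 510L, 530L, 650L, 700L, 2600L);

        Collections.sort(fetchMillis);
        System.out.println("50th percentile: " + percentile(fetchMillis, 50) + " ms");
        System.out.println("90th percentile: " + percentile(fetchMillis, 90) + " ms");
    }

    // Nearest-rank percentile over a sorted list of samples.
    private static long percentile(List<Long> sorted, int pct) {
        int index = (int) Math.ceil(pct / 100.0 * sorted.size()) - 1;
        return sorted.get(Math.max(0, index));
    }
}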

Regarding "java - How to improve Neo4j 2.0 cypher/ExecutionResult performance under heavy load?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20750741/
