gpt4 book ai didi

neo4j - 与带有 Neo4j 的 MySql 相比,性能较差

转载 作者:行者123 更新时间:2023-12-02 09:05:58 26 4
gpt4 key购买 nike

我将 MySQL 数据库迁移到 Neo4j 并测试了一个简单的请求。我非常惊讶地发现 Neo4j 中的同等请求比 MySql 中的时间长 10 到 100 倍。我正在开发 Neo4j 2.0.1。

在原始的 MySql 架构中,我有以下三个表:

  • 国家:包含“代码”、“大陆 ID”和“选定” bool 值,
  • 城市:包含“国家/地区代码”、“名称”和“状态” bool 值,
  • 剧院:包含“city_id”和“public” bool 值,

每个属性都有一个索引。我想在几个条件下显示给定大陆的按城市划分的剧院数量。请求是:

SELECT count(*) as nb, c.name 
FROM `cities` c LEFT JOIN theaters t ON c.id = t.city_id
WHERE c.country_code IN
(SELECT code FROM countries WHERE selected is true AND continent_id = 4)
AND c.status=1 AND t.public = 1
GROUP BY c.name ORDER BY nb DESC


Neo4j 中的数据库架构如下:

(:大陆)-[:包含]->(:国家/地区{选定的:bool})-[:包含]->(:城市{名称:字符串>, status:bool})-[:包含]->(:Theater{public:bool})

每个属性还定义了一个索引。密码请求是:

MATCH (:Continent{code: 4})-[:Include]->(:Country{selected:true})-[:Include]->(city:City{status:true})-[:Include]->(:Theater{public: true})
RETURN city.name, count(*) AS nb ORDER BY nb DESC


每个数据库中约有 70,000 个城市和 140,000 个剧院。

在 ID 为 4 的大陆上,MySql 请求大约花费了 0.02 秒,而 Neo4j 则花费了 0.4 秒。此外,如果我在 Cypher 请求中引入国家和城市之间的可变关系长度 (...(:Country{selected:true})-[:Ininclude*..3]->(city:City{status :true})...) 因为我希望能够添加中间级别(例如区域),因此请求需要超过 2 秒的时间。

我知道在这种特殊情况下,使用 Neo4j 代替 MySql 没有任何好处,但我希望看到两种技术之间大致相当的性能,并且我想利用 Neo4j 地理层次结构功能。

我错过了什么还是这是 Neo4j 的限制?

感谢您的回答。

编辑:首先您会找到数据库转储文件 here 。 Neo4j server configuration是开箱即用的。我在 Ruby 环境中工作,并且使用 neography gem。我还单独运行 Neo4J 服务器因为我不在 JRuby 上,所以它通过 Rest API 发送密码请求。

该数据库包含 244 个国家、69000 个城市和 138,000 个剧院。对于 Continental_id 4,有 46,982 个城市(37,210 个状态 bool 值设置为 true)和 74,420 个剧院。

请求返回 2256 行。第三次运行花费了 338 毫秒。以下是带有分析信息的请求输出:

profile MATCH (:Continent{code: 4})-[:Include]->(country:Country{selected:true})-[:Include*..1]->(city:City{status:true})-[:Include]->(theater:Theater{public: true}) RETURN city.name, count(*) AS nb ORDER BY nb DESC;

==> ColumnFilter(symKeys=["city.name", " INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2"], returnItemNames=["city.name", "nb"], _rows=2256, _db_hits=0)
==> Sort(descr=["SortItem(Cached( INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2 of type Integer),false)"], _rows=2256, _db_hits=0)
==> EagerAggregation(keys=["Cached(city.name of type Any)"], aggregates=["( INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2,CountStar())"], _rows=2256, _db_hits=0)
==> Extract(symKeys=["city", " UNNAMED27", " UNNAMED7", "country", " UNNAMED113", "theater", " UNNAMED72"], exprKeys=["city.name"], _rows=2257, _db_hits=2257)
==> Filter(pred="(hasLabel(theater:Theater(3)) AND Property(theater,public(5)) == true)", _rows=2257, _db_hits=2257)
==> SimplePatternMatcher(g="(city)-[' UNNAMED113']-(theater)", _rows=2257, _db_hits=4514)
==> Filter(pred="(((hasLabel(city:City(2)) AND hasLabel(city:City(2))) AND Property(city,status(4)) == true) AND Property(city,status(4)) == true)", _rows=2257, _db_hits=74420)
==> TraversalMatcher(start={"label": "Continent", "query": "Literal(4)", "identifiers": [" UNNAMED7"], "property": "code", "producer": "SchemaIndex"}, trail="( UNNAMED7)-[ UNNAMED27:Include WHERE (((hasLabel(NodeIdentifier():Country(1)) AND hasLabel(NodeIdentifier():Country(1))) AND Property(NodeIdentifier(),selected(3)) == true) AND Property(NodeIdentifier(),selected(3)) == true) AND true]->(country)-[:Include*1..1]->(city)", _rows=37210, _db_hits=37432)

最佳答案

你说得对,我自己尝试过,查询时间只降到了 100 毫秒。

 MATCH (:Continent{code: 4})-[:Include]->
(country:Country{selected:true})-[:Include]->
(city:City{status:true})-[:Include]->
(theater:Theater{public: true})
RETURN city.name, count(*) AS nb
ORDER BY nb DESC;

| "Forbach" | 1 |
| "Stuttgart" | 1 |
| "Mirepoix" | 1 |
| "Bonnieux" | 1 |
| "Saint Cyprien Plage" | 1 |
| "Crissay sur Manse" | 1 |
+--------------------------------------+
2256 rows
**85 ms**

请注意,自 2.0.x 起,cypher 尚未进行性能优化,该工作从 Neo4j 2.1 开始,并将持续到 2.3。内核中还计划进行更多性能工作,这也将加快速度。

我也在 Java 中实现了该解决方案,并将其降低到 19 毫秒。它当然不是那么漂亮,但这也是我们使用 cypher 的目标:

class City {
Node city;
int count = 1;

public City(Node city) {
this.city = city;
}

public void inc() { count++; }

@Override
public String toString() {
return String.format("City{city=%s, count=%d}", city.getProperty("name"), count);
}
}

private List<?> queryJava3() {
long start = System.currentTimeMillis();
Node continent = IteratorUtil.single(db.findNodesByLabelAndProperty(CONTINENT, "code", 4));
Map<Node,City> result = new HashMap<>();
for (Relationship rel1 : continent.getRelationships(Direction.OUTGOING,Include)) {
Node country = rel1.getEndNode();
if (!(country.hasLabel(COUNTRY) && (Boolean) country.getProperty("selected", false))) continue;
for (Relationship rel2 : country.getRelationships(Direction.OUTGOING, Include)) {
Node city = rel2.getEndNode();
if (!(city.hasLabel(CITY) && (Boolean) city.getProperty("status", false))) continue;
for (Relationship rel3 : city.getRelationships(Direction.OUTGOING, Include)) {
Node theater = rel3.getEndNode();
if (!(theater.hasLabel(THEATER) && (Boolean) theater.getProperty("public", false))) continue;
City city1 = result.get(city);
if (city1==null) result.put(city,new City(city));
else city1.inc();
}
}
}
List<City> list = new ArrayList<>(result.values());
Collections.sort(list, new Comparator<City>() {
@Override
public int compare(City o1, City o2) {
return Integer.compare(o2.count,o1.count);
}
});
output("java", start, list.iterator());
return list;
}


java time = 19ms
first = City{city=Val de Meuse, count=1} total-count 22561

关于neo4j - 与带有 Neo4j 的 MySql 相比,性能较差,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/22900972/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com