
neo4j - How does Titan achieve constant-time lookups using HBase/Cassandra?

Reposted. Author: 行者123. Updated: 2023-12-04 16:52:41

Chapter 6 of the O'Reilly book "Graph Databases", in describing how Neo4j stores a graph database, says:

To understand why native graph processing is so much more efficient than graphs based on heavy indexing, consider the following. Depending on the implementation, index lookups could be O(log n) in algorithmic complexity versus O(1) for looking up immediate relationships. To traverse a network of m steps, the cost of the indexed approach, at O(m log n), dwarfs the cost of O(m) for an implementation that uses index-free adjacency.
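
To make the O(m log n) versus O(m) comparison concrete, here is a minimal sketch that counts lookup steps over a toy ring graph. It is not how any real database is implemented: `indexed_next`, `direct_next` and `traverse` are hypothetical names, the "global index" is just a sorted Python list searched by binary search, and "index-free adjacency" is modelled as a direct list access.

```python
def indexed_next(sources, targets, node):
    """Find a node's successor through a sorted global index.

    Returns (successor, probes), where probes is the number of
    binary-search comparisons, roughly log2(n).
    """
    lo, hi, probes = 0, len(sources), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sources[mid] < node:
            lo = mid + 1
        else:
            hi = mid
    return targets[lo], probes


def direct_next(next_pointer, node):
    """Follow a stored pointer: one step regardless of graph size."""
    return next_pointer[node], 1


def traverse(n, m):
    """Walk m hops around a ring of n nodes with both strategies."""
    sources = list(range(n))                    # sorted index keys
    targets = [(i + 1) % n for i in range(n)]   # node i points to i + 1
    a = b = 0
    cost_indexed = cost_direct = 0
    for _ in range(m):
        a, probes = indexed_next(sources, targets, a)
        cost_indexed += probes
        b, step = direct_next(targets, b)
        cost_direct += step
    assert a == b  # both strategies reach the same node
    return cost_indexed, cost_direct


if __name__ == "__main__":
    print(traverse(n=1024, m=50))
```

The direct cost equals m, while the indexed cost grows with both m and log n. This only counts abstract steps; it says nothing about whether each step hits memory or disk.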



It then explains that Neo4j achieves this constant-time lookup by storing all nodes and relationships as fixed-size records:

With fixed sized records and pointer-like record IDs, traversals are implemented simply by chasing pointers around a data structure, which can be performed at very high speed. To traverse a particular relationship from one node to another, the database performs several cheap ID computations (these computations are much cheaper than searching global indexes, as we’d have to do if faking a graph in a non-graph native database)
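
The "cheap ID computations" in the quote can be sketched as follows. The record layout here is invented for illustration and is not Neo4j's actual store format: each hypothetical relationship record holds three 4-byte integers (start node, end node, ID of the next relationship in the chain), so a record's byte offset is simply its ID times the record size.

```python
import struct

# Hypothetical fixed-size record: (start_node, end_node, next_rel_id).
# This is an assumption for illustration, not Neo4j's real layout.
REL_RECORD = struct.Struct("<III")
NO_REL = 0xFFFFFFFF  # sentinel marking the end of a relationship chain


def write_rels(rels):
    """Pack a list of (start, end, next_rel) tuples into one byte string."""
    buf = bytearray(REL_RECORD.size * len(rels))
    for rel_id, rec in enumerate(rels):
        REL_RECORD.pack_into(buf, rel_id * REL_RECORD.size, *rec)
    return bytes(buf)


def read_rel(store, rel_id):
    """Locate a record by arithmetic on its ID: no index search needed."""
    return REL_RECORD.unpack_from(store, rel_id * REL_RECORD.size)


def neighbours(store, first_rel):
    """Chase next_rel pointers starting from a node's first relationship."""
    out = []
    rel_id = first_rel
    while rel_id != NO_REL:
        _start, end, next_rel = read_rel(store, rel_id)
        out.append(end)
        rel_id = next_rel
    return out


if __name__ == "__main__":
    store = write_rels([(0, 1, 2), (5, 6, NO_REL), (0, 3, NO_REL)])
    print(neighbours(store, first_rel=0))
```

Each hop costs one multiplication and one fixed-size read. Whether that read is fast depends on where the bytes live, which is the point the answer below turns on.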



That last sentence prompts my question: how does Titan, which uses Cassandra or HBase as its storage backend, achieve these performance gains or make up for them?

Best answer

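Neo4j only achieves O(1) when the data is in memory in the same JVM. When the data is on disk, Neo4j is slow because it has to chase pointers on disk (its on-disk representation is poor).

Titan likewise only achieves O(1) when the data is in memory in the same JVM. When the data is on disk, Titan is faster than Neo4j because it has a better on-disk representation.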

See the following blog post, which explains the above quantitatively:
http://thinkaurelius.com/2013/11/24/boutique-graph-data-with-titan/

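So when people say O(1), it is important to understand which part of the memory hierarchy they are talking about. Inside a single JVM (a single machine), it is easy to be fast, as both Neo4j and Titan demonstrate with their respective caching engines. When you cannot fit the entire graph in memory, you have to rely on intelligent disk layouts, distributed caches, and the like.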
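
As a rough illustration of what a disk layout suited to traversal can look like, the sketch below keeps a vertex's whole adjacency list in one row of a Bigtable-style store, so that reading a vertex's neighbours takes a single row read rather than one read per edge. Everything here is an assumption for illustration: `ToyWideRowStore` is an in-memory stand-in, not the Cassandra or HBase client API, and the column encoding is invented rather than Titan's actual format.

```python
from collections import defaultdict


class ToyWideRowStore:
    """In-memory stand-in for a wide-row backend: row key -> columns."""

    def __init__(self):
        self.rows = defaultdict(dict)
        self.row_reads = 0  # counts simulated backend round trips

    def put(self, row_key, column, value):
        self.rows[row_key][column] = value

    def get_row(self, row_key):
        """Return all columns of one row, sorted by column key."""
        self.row_reads += 1
        return sorted(self.rows[row_key].items())


def add_edge(store, src, label, dst):
    # Hypothetical encoding: one column per outgoing edge, keyed by
    # (label, destination), stored in the source vertex's row.
    store.put(src, (label, dst), b"")


def neighbours(store, vertex, label):
    """Fetch the vertex's row once, then filter its columns by label."""
    return [col[1] for col, _ in store.get_row(vertex) if col[0] == label]


if __name__ == "__main__":
    store = ToyWideRowStore()
    add_edge(store, 1, "knows", 3)
    add_edge(store, 1, "knows", 2)
    add_edge(store, 1, "likes", 9)
    print(neighbours(store, 1, "knows"), store.row_reads)
```

Finding the row in a real backend is itself a keyed lookup and not free, which is why the question of which level of the memory hierarchy serves that lookup matters so much.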

See the following two blog posts for more information:

http://thinkaurelius.com/2013/11/01/a-letter-regarding-native-graph-databases/
http://thinkaurelius.com/2013/07/22/scalable-graph-computing-der-gekrummte-graph/

Regarding "neo4j - How does Titan achieve constant-time lookups using HBase/Cassandra?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26009102/
