
bigdata - How to handle a large vault size in Corda?

Reposted. Author: 行者123. Updated: 2023-12-03 23:54:47

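The data in our vault is manageable at the moment, but it will eventually grow very large. We cannot keep every day's transactions in the live data set indefinitely, so we would like to archive or offload data periodically in order to preserve query performance.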

Have you considered how to handle large data sets, and what would you recommend?

Best answer

From the corda-dev mailing list:

Yep, we should do some design work around this. As you note it’s not a pressing issue right now but may become one in future.

Our current implementation is actually designed to keep data around even when it’s no longer ‘current’ on the ledger. The ORM mapped vault tables prefer to mark a row as obsolete rather than actually delete the data from the underlying database. Also, the transaction store has no concept of garbage collection or pruning so it never deletes data either. This has clear benefits from the perspective of understanding the history of the ledger and how it got into its current state, but it poses operational issues as well.
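To make the "mark a row as obsolete" behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `vault_state` table, its columns, and the helper functions are hypothetical names chosen for illustration; they are not Corda's actual schema or API. The sketch only shows why the table keeps growing even though the set of current states stays small.

```python
import sqlite3

# Hypothetical schema for illustration only; not Corda's real vault tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vault_state ("
    " state_ref TEXT PRIMARY KEY,"
    " payload TEXT NOT NULL,"
    " consumed INTEGER NOT NULL DEFAULT 0)"
)

def record_state(state_ref, payload):
    """Insert a new, current (unconsumed) state."""
    conn.execute(
        "INSERT INTO vault_state (state_ref, payload) VALUES (?, ?)",
        (state_ref, payload),
    )

def consume_state(state_ref):
    """Soft delete: flag the row as obsolete instead of removing it."""
    conn.execute(
        "UPDATE vault_state SET consumed = 1 WHERE state_ref = ?",
        (state_ref,),
    )

def current_states():
    """Return the refs of states that are still current."""
    rows = conn.execute(
        "SELECT state_ref FROM vault_state WHERE consumed = 0 ORDER BY state_ref"
    )
    return [row[0] for row in rows]

def total_rows():
    """Count every row, current or obsolete."""
    return conn.execute("SELECT COUNT(*) FROM vault_state").fetchone()[0]

record_state("tx1:0", "IOU 100")
record_state("tx2:0", "IOU 40")
consume_state("tx1:0")

print(current_states())  # only the unconsumed state is returned
print(total_rows())      # the obsolete row is still stored
```

Queries for current states filter on the flag, so history stays available, but the stored row count never shrinks.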

I think people will have different preferences here depending on their resources and jurisdiction. Let’s tackle the two data stores separately:

Making the relationally mapped tables delete data is easy, it’s just a policy change. Instead of marking a row as gone, we actually issue a SQL DELETE call.

The transaction store is trickier. Corda benefits from its blockless design here; in theory we can garbage collect old transactions. The devil is in the details however because for nodes that use SGX the tx store will be encrypted. Thus not only do we need to develop a parallel GC for the tx graph, but also, run it entirely inside the enclaves. A fun systems engineering problem.
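The idea of garbage collecting the transaction graph can be sketched as a plain mark-and-sweep pass. This is an illustration under stated assumptions, not Corda's design: it assumes a transaction must be kept as long as it lies in the back-chain of some transaction that still has an unconsumed output, it is single-threaded rather than parallel, and it ignores encryption and enclaves entirely. The function name and the graph representation are hypothetical.

```python
from collections import deque

def collect_garbage(tx_inputs, live_txs):
    """Return the ids of transactions that could be pruned.

    tx_inputs: dict mapping a transaction id to the ids of the transactions
               whose outputs it consumes (its direct parents).
    live_txs:  ids of transactions that still have an unconsumed output.
    """
    # Mark phase: keep everything in the back-chain of a live transaction.
    reachable = set()
    queue = deque(live_txs)
    while queue:
        tx_id = queue.popleft()
        if tx_id in reachable:
            continue
        reachable.add(tx_id)
        queue.extend(tx_inputs.get(tx_id, ()))
    # Sweep phase: anything left unmarked is a candidate for deletion.
    return set(tx_inputs) - reachable

# Hypothetical graph: A <- B <- C, where C still has an unconsumed output,
# and X <- Y, where every output has been consumed or exited.
graph = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": [],
    "Y": ["X"],
}

print(sorted(collect_garbage(graph, live_txs={"C"})))
```

In this toy graph only the X/Y chain is collectable, because nothing current depends on it. Whether a real node may safely drop such transactions is a policy and design question that the answer leaves open.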

If the concern is just query performance, one obvious move is to shift the tx store into a scalable K/V store like Cassandra, hosted BigTable etc. There’s no deep reason the tx store must be in the same RDBMS as the rest of the data, it’s just convenient to have a single database to back up. Scalable K/V stores don’t really lose query performance as the dataset grows, so, this is also a nice solution.
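The reason a K/V store fits is that the transaction store is essentially a lookup from transaction id to serialized transaction. A rough sketch of that access pattern follows. The `TransactionStore` interface and the in-memory stand-in are hypothetical, and deriving the key as the SHA-256 of the bytes is a simplification for illustration, not how Corda computes transaction ids. A real deployment would put a client for Cassandra or a similar store behind the same two methods.

```python
import hashlib
from typing import Dict, Optional, Protocol

class TransactionStore(Protocol):
    """Hypothetical interface: the only operations are put and get by key."""

    def put(self, tx_bytes: bytes) -> str: ...

    def get(self, tx_id: str) -> Optional[bytes]: ...

class InMemoryTransactionStore:
    """Stand-in for a remote K/V store, backed by a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, tx_bytes: bytes) -> str:
        # Assumption for illustration: the key is the SHA-256 of the bytes.
        tx_id = hashlib.sha256(tx_bytes).hexdigest()
        self._data[tx_id] = tx_bytes
        return tx_id

    def get(self, tx_id: str) -> Optional[bytes]:
        # Point lookup by key; no joins or range scans are involved.
        return self._data.get(tx_id)

store: TransactionStore = InMemoryTransactionStore()
tx_id = store.put(b"serialized transaction bytes")
print(store.get(tx_id))
```

Because every access is a point lookup by key, the store can be partitioned by key across machines, which is what lets lookup cost stay roughly flat as the data set grows.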

W.R.T. things like the GDPR, being able to delete data might help or it might be irrelevant. As with all things GDPR related nobody knows because the EU didn’t bother to define any answers - auditing a distributed ledger might count as a “legitimate need” for data, or it might not, depending on who the judge is on the day of the case.

It is at any rate only an issue when personal data is stored on ledger, which is not most use cases today.

For "bigdata - How to handle a large vault size in Corda?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50581600/
