
Why use SSTables instead of one file per key in key-value databases?

Reposted. Author: bug小助手. Updated: 2023-10-25 20:36:16



I'm trying to understand the internals of databases and why certain design decisions were made, and I ran into a question.

Let's assume that the only requirement is to get the value for a given key. No other access patterns need to be supported.

If this is the scenario, why not just use one file per key on disk instead of the traditional LSM tree + SSTable approach? Inserts would be O(1), since you just create a new file, and lookups would also be O(1), since we only need to check whether the file for that key is present on disk.
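
To make the proposal concrete, here is a minimal sketch of a one-file-per-key store. The class name `FilePerKeyStore` and the SHA-256 file naming are assumptions made for illustration and are not taken from any real database. Each `put` and `get` is a single call in the code, but what it costs depends on how the filesystem creates, names, and looks up files in a directory.

```python
import hashlib
import os
import tempfile


class FilePerKeyStore:
    """Toy key-value store that keeps each value in its own file (illustrative only)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Hash the key so that arbitrary strings map to safe, fixed-length file names.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.root, name)

    def put(self, key, value):
        # One file create/open, one write, and one close per key.
        with open(self._path(key), "wb") as f:
            f.write(value)

    def get(self, key):
        # One path lookup and one open per read; a missing file means a missing key.
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None


store = FilePerKeyStore(tempfile.mkdtemp())
store.put("user:1", b"alice")
print(store.get("user:1"))
print(store.get("user:2"))
```

In this sketch every key costs at least one directory entry and one file, however small the value is.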

I understand there must be some reason not to use this approach, but I can't work out what that reason would be.

One reason I could think of is that data on disk is stored in blocks, and blocks retrieved from disk are cached. In most cases, this reduces disk I/O. Seeking within an already open file is also faster than opening and reading a separate file from disk, but again, this holds only in the average case.

In the worst case, disk I/O will still be lower, since the number of SSTable files is much smaller than the number of files needed when storing one file per key.
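
For comparison, the "seek in an already open file" idea can be sketched as a single file of key-sorted records plus an index of byte offsets. This is a deliberate simplification and not a real SSTable format: the record layout, the function names `write_sorted_table` and `read_value`, and the choice to keep a full index in memory are all assumptions made for illustration.

```python
import os
import struct
import tempfile


def write_sorted_table(path, items):
    """Write key/value pairs sorted by key; return a {key: byte offset} index."""
    index = {}
    with open(path, "wb") as f:
        for key, value in sorted(items.items()):
            k = key.encode("utf-8")
            index[key] = f.tell()
            # Record layout (assumed): 4-byte key length, 4-byte value length, key, value.
            f.write(struct.pack(">II", len(k), len(value)))
            f.write(k)
            f.write(value)
    return index


def read_value(f, index, key):
    """Look up one key with a seek in an already open file."""
    offset = index.get(key)
    if offset is None:
        return None
    f.seek(offset)
    klen, vlen = struct.unpack(">II", f.read(8))
    f.seek(klen, os.SEEK_CUR)  # skip over the key bytes
    return f.read(vlen)


path = os.path.join(tempfile.mkdtemp(), "table.dat")
index = write_sorted_table(path, {"b": b"2", "a": b"1", "c": b"3"})
with open(path, "rb") as table:
    print(read_value(table, index, "b"))
    print(read_value(table, index, "z"))
```

Here all three keys share one file and one open file handle, and neighbouring keys sit next to each other on disk, so a block read for one key is likely to bring its neighbours into the cache as well.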

Are there any other reasons, besides disk I/O, why we don't store one file per key?
