java - Running time of disk I/O algorithms

Reposted by: 塔克拉玛干. Updated: 2023-11-03 03:29:54

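In memory-based computing models, the only running-time calculations that need to be done can be done abstractly, by considering the data structure.

However, there is not much documentation on high-performance disk I/O algorithms. So I ask the following set of questions:

1) How can we estimate the running time of disk I/O operations? I assume there is a simple set of constants we could add for looking up a value on disk rather than in memory...

2) More specifically, how does performance differ when accessing a specific index in a file? Is this a constant-time operation, or does it depend on how "far down" the index is?

3) And finally... how does the JVM optimize access to indexed portions of a file?

Also, as far as resources go: are there any good idioms or libraries for implementing on-disk data structures?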

Best answer

1) how can we estimate running time of disk I/O operations? I assume there is a simple set of constants which we might add for looking up a value on disk, rather than in memory...

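Chapter 6 of Computer Systems: A Programmer's Perspective gives a pretty practical mathematical model for how long it takes to read some data from a typical magnetic disk.

To quote the last page of the linked pdf: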

Putting it all together, the total estimated access time is
T_access = T_avg seek + T_avg rotation + T_avg transfer
         = 9 ms       + 4 ms           + 0.02 ms
         = 13.02 ms

This example illustrates some important points:
• The time to access the 512 bytes in a disk sector is dominated by the seek time and the rotational latency. Accessing the first byte in the sector takes a long time, but the remaining bytes are essentially free.
• Since the seek time and rotational latency are roughly the same, twice the seek time is a simple and reasonable rule for estimating disk access time.

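*Note: the linked pdf is from the authors' website, so no piracy.

Of course, if the data being accessed was accessed recently, there is a decent chance it is cached somewhere in the memory hierarchy, in which case the access time is extremely small (practically "near instant" compared to disk access time).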
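
As a rough illustration of where the quoted numbers can come from, here is a minimal sketch of the model in Java. The class and method names are hypothetical, and the 7200 RPM rotational rate and 400 sectors per track are assumed parameters, chosen to be consistent with the 4 ms and 0.02 ms terms above; the quoted passage itself does not state them.

```java
/** Back-of-envelope disk access time, following the quoted three-term model. */
final class DiskAccessModel {

    /** Average rotational latency in ms: on average, half of one full rotation. */
    static double avgRotationMs(double rpm) {
        return 0.5 * (60_000.0 / rpm);
    }

    /** Time in ms for a single sector to pass under the head. */
    static double avgTransferMs(double rpm, double sectorsPerTrack) {
        return (60_000.0 / rpm) / sectorsPerTrack;
    }

    /** T_access = T_avg seek + T_avg rotation + T_avg transfer, all in ms. */
    static double accessMs(double avgSeekMs, double rpm, double sectorsPerTrack) {
        return avgSeekMs + avgRotationMs(rpm) + avgTransferMs(rpm, sectorsPerTrack);
    }

    public static void main(String[] args) {
        // Assumed drive parameters (not given in the quote): 7200 RPM, 400 sectors/track.
        double rpm = 7200;
        double sectorsPerTrack = 400;
        double seekMs = 9.0; // average seek time taken from the quote

        System.out.printf("rotation = %.2f ms%n", avgRotationMs(rpm));
        System.out.printf("transfer = %.3f ms%n", avgTransferMs(rpm, sectorsPerTrack));
        System.out.printf("total    = %.2f ms%n", accessMs(seekMs, rpm, sectorsPerTrack));
    }
}
```

With these assumed parameters the unrounded total comes out near 13.2 ms. The quote rounds the rotational latency down to 4 ms, which is how it arrives at 13.02 ms.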

2) And more specifically, what is the difference between performance for accessing a specific index in a file? Is this a constant time operation? Or does it depend on how "far down" the index is?

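If the location you seek to is not stored sequentially near the previous one, you may pay another seek + rotation delay. It depends on where in the file you are seeking, and on where that data is physically stored on the disk. For example, a fragmented file is guaranteed to cause extra disk seeks when the whole file is read.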
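
To put rough numbers on the fragmentation point, the following back-of-envelope sketch reuses the constants from the quote: about 13 ms of seek plus rotation for each non-contiguous piece, and 0.02 ms of transfer per 512-byte sector. The class and method names are hypothetical, and the model is deliberately crude: it assumes one seek + rotation per fragment and ignores track switches, read-ahead, and caching.

```java
/** Crude estimate of the time to read a whole file stored in N contiguous fragments. */
final class FragmentedReadEstimate {
    // Constants taken from the quoted example (9 ms seek + 4 ms rotation, 0.02 ms per sector).
    static final double SEEK_PLUS_ROTATION_MS = 13.0;
    static final double TRANSFER_MS_PER_SECTOR = 0.02;
    static final int SECTOR_BYTES = 512;

    /**
     * Assumes each fragment costs one seek + rotation, and that every sector
     * then transfers at the sequential rate. Real disks will deviate from this.
     */
    static double readMs(long fileBytes, int fragments) {
        long sectors = (fileBytes + SECTOR_BYTES - 1) / SECTOR_BYTES; // round up
        return fragments * SEEK_PLUS_ROTATION_MS + sectors * TRANSFER_MS_PER_SECTOR;
    }

    public static void main(String[] args) {
        long oneMiB = 1L << 20;
        System.out.printf("contiguous:    %.2f ms%n", readMs(oneMiB, 1));
        System.out.printf("100 fragments: %.2f ms%n", readMs(oneMiB, 100));
    }
}
```

Under these assumptions a contiguous 1 MiB file takes roughly 54 ms, while the same file split into 100 fragments takes roughly 1.3 seconds, almost all of it spent seeking.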

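One thing to keep in mind is that even if you only request a few bytes, physical reads tend to happen in multiples of a fixed-size chunk (the sector size), and the result ends up in cache. So you may later seek to some nearby location in the file and get lucky: it is already in the cache.

By the way, if you are interested in the subject, the full chapter on the memory hierarchy in that book is pure gold.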
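
The same idea can be applied at the application level. The sketch below is one possible way to do it, not something the answer prescribes: it reads a whole aligned block with `RandomAccessFile` whenever a byte is requested and keeps that one block in memory, so a later request for a nearby offset issues no new read. The class name `BlockCachedReader`, the single-block cache, and the 4096-byte block size in the usage note are all assumptions made for illustration. The counter tracks reads issued to the operating system, which may itself serve them from its own cache rather than from the disk.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Reads single bytes through a one-block cache to illustrate block-granular I/O. */
final class BlockCachedReader implements AutoCloseable {
    private final RandomAccessFile file;
    private final int blockSize;
    private final byte[] block;
    private long cachedBlock = -1; // index of the block currently held, -1 if none
    private int cachedLength = 0;  // number of valid bytes in the cached block
    private int blockReads = 0;    // how many block reads were issued to the OS

    BlockCachedReader(String path, int blockSize) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.blockSize = blockSize;
        this.block = new byte[blockSize];
    }

    /** Returns the byte at the given offset as 0..255, or -1 if it lies past end of file. */
    int readByteAt(long offset) throws IOException {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be non-negative");
        }
        long blockIndex = offset / blockSize;
        if (blockIndex != cachedBlock) {
            // Cache miss: fetch the whole aligned block containing the offset.
            file.seek(blockIndex * blockSize);
            int n = 0;
            while (n < blockSize) { // read() may return fewer bytes than requested
                int r = file.read(block, n, blockSize - n);
                if (r < 0) break;   // end of file
                n += r;
            }
            cachedBlock = blockIndex;
            cachedLength = n;
            blockReads++;
        }
        int within = (int) (offset % blockSize);
        return within < cachedLength ? (block[within] & 0xFF) : -1;
    }

    int blockReads() {
        return blockReads;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

For example, with a block size of 4096, calling `readByteAt(10)` and then `readByteAt(300)` issues a single block read, while a following `readByteAt(5000)` falls in the next block and issues a second one.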

Regarding java - Running time of disk I/O algorithms, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12984699/