Effects of POSIX_FADV_DONTNEED when calculating SHA256 sum with Python (TL;DR: don't use it)

Reposted by: bug小助手 | Updated: 2023-10-26 20:56:42

Update: After more testing, this post now serves mostly as documentation.

TL;DR: Using POSIX_FADV_DONTNEED isn't worth it. Not using it gives the best speed. On AMD64 it doesn't even seem to be respected.

Environment: Raspberry Pi 4, 5 TB spinning disk attached via USB 3.0, ext4 file system.

Using 1 thread with this config gives the best speed, probably because of the spinning disk.

When calculating the SHA256 sum of all files in a directory tree (to check multiple restic repositories without having to enter the encryption password for each one), the disk read speed displayed in nmon and by node_exporter is almost twice as high when using POSIX_FADV_DONTNEED. This advice tells the kernel that the data is no longer needed, so it does not have to keep it in the cache. That makes sense here because these files are read only once and would otherwise pollute the system's cache and slow the system down, because other data would be evicted and then miss in the cache.

Without POSIX_FADV_DONTNEED the read speed is between 60 and 90 MB/s. With POSIX_FADV_DONTNEED the read speed is between 155 MB/s and 175 MB/s, so about twice as fast. This value is shown in nmon and by Prometheus node_exporter in combination with VictoriaMetrics. However, using the time command gives completely different results. Between runs the caches were cleared with sync; echo 3 > /proc/sys/vm/drop_caches
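Since the displayed disk speed and the time command disagree, wall-clock time is the number to compare. The following is a minimal sketch of measuring it from inside Python instead of with time; hash_file and timed_run are hypothetical helper names, not part of the script in this post, and the caches still have to be dropped between runs as described above.

```python
import hashlib
import time


def hash_file(path, chunk_size=65536):
    """Return (SHA256 hex digest, bytes read) for one file, read in chunks."""
    digest = hashlib.sha256()
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
            total += len(data)
    return digest.hexdigest(), total


def timed_run(paths):
    """Hash all paths and return (elapsed seconds, bytes read, MB per second)."""
    start = time.perf_counter()
    total_bytes = 0
    for path in paths:
        _, size = hash_file(path)
        total_bytes += size
    elapsed = time.perf_counter() - start
    # Effective throughput as seen by the program, not by the block layer
    mb_per_s = total_bytes / 1e6 / elapsed if elapsed > 0 else float("inf")
    return elapsed, total_bytes, mb_per_s
```

Comparing mb_per_s from such a run with the value shown in nmon makes it visible when the monitoring tool reports more disk traffic than the program actually consumes.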

With posix_fadvise(fd, 0, bytesRead) the run took 37 s and the lower disk speed was displayed. With posix_fadvise(fd, 0, 0) about twice the disk speed was displayed, but the run actually took 1 min 8 s.
When using

def posix_fadvise(fd, offset, length):
    return

only 29 s were needed, so the fastest result was reached by not using POSIX_FADV_DONTNEED at all.

So the disk speed shown on the Raspberry Pi is wrong, whereas a more accurate speed is shown on AMD64. On the Raspberry Pi, the cache size in VictoriaMetrics shows that the cache isn't growing when using POSIX_FADV_DONTNEED, so the flag is respected.
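Whether the flag is respected can also be checked without a metrics stack by reading the page cache size directly. This is a small sketch assuming a Linux system where /proc/meminfo has a "Cached:" line in kB; parse_cached_kib and read_cached_kib are hypothetical helper names.

```python
def parse_cached_kib(meminfo_text):
    """Return the 'Cached' value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # "SwapCached:" does not match because the line must start with "Cached:"
        if line.startswith("Cached:"):
            return int(line.split()[1])
    raise ValueError("no 'Cached:' line found")


def read_cached_kib(path="/proc/meminfo"):
    """Read the current page cache size from the running system (Linux only)."""
    with open(path) as f:
        return parse_cached_kib(f.read())
```

Calling read_cached_kib() before and after a run gives a rough indication: if the value grows by about the amount of data read, the advice was not honored.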

EDIT: On a hosted VM with an SSD and much more performance, even when using 4 threads, using POSIX_FADV_DONTNEED makes it reproducibly about a factor of 5 slower. Between every run I did # echo 3 > /proc/sys/vm/drop_caches
Very strange.

EDIT2: On a physical host using a spinning disk connected via USB 3.0 and 1 thread, it takes 1m25s to read all the files when using POSIX_FADV_DONTNEED. After clearing the cache with # echo 3 > /proc/sys/vm/drop_caches and not using POSIX_FADV_DONTNEED, it only takes 12 seconds to calculate the checksums. So a factor of 7 (!) difference.

Update 20.09.2023: With VictoriaMetrics I can see that POSIX_FADV_DONTNEED seems not to be respected regarding the cache on AMD64 (on the RPi it is, see above): you can see the cache growing despite setting the flag.
There is no noticeable difference in speed (using the time command) between using posix_fadvise(fd, 0, bytesRead) (real 1m53,630s user 0m48,819s sys 0m5,627s) and immediately returning in def posix_fadvise(fd, offset, length): (real 1m52,675s user 0m51,346s sys 0m6,928s). Using posix_fadvise(fd, 0, 0) takes real 2m31,398s user 1m2,004s sys 0m16,178s.

import concurrent.futures
import ctypes
import hashlib
import os
import sys

# Constants for posix_fadvise
POSIX_FADV_DONTNEED = 4

base_directory = '/home/pi/5TB'
num_threads = 1  # Adjust the number of threads as needed
# Define posix_fadvise function
def posix_fadvise(fd, offset, length):
    # return  # uncomment to skip the advice entirely (the fastest variant in the measurements above)
    libc = ctypes.CDLL("libc.so.6")
    ret = libc.posix_fadvise(fd, offset, length, POSIX_FADV_DONTNEED)
    if ret != 0:
        raise OSError(f"posix_fadvise failed with error code {ret}")

def calculate_sha256(file_path):
    try:
        # Calculate the SHA256 checksum of the file
        sha256_hash = hashlib.sha256()
        bytesRead = 0  # Initialize the counter for bytes read
        with open(file_path, 'rb') as f:
            fd = f.fileno()  # Get file descriptor
            # Variant measured above: advise on the whole file up front
            # (length 0 means "through the end of the file")
            # posix_fadvise(fd, 0, 0)
            while True:
                data = f.read(65536)  # Read in 64KB chunks
                if not data:
                    break
                bytesRead += len(data)
                sha256_hash.update(data)
                posix_fadvise(fd, 0, bytesRead)  # advise that everything read so far is no longer needed
        
        checksum = sha256_hash.hexdigest()

        # Check if the checksum matches the filename
        filename = os.path.basename(file_path)
        if checksum != filename:
            sys.stderr.write(f"Error: Checksum mismatch for file '{file_path}'\n")
        
        return file_path
    except Exception as e:
        sys.stderr.write(f"Error processing file '{file_path}': {str(e)}\n")
        return None

def process_files_in_directory(directory):
    files = [os.path.join(directory, filename) for filename in os.listdir(directory) if os.path.isfile(os.path.join(directory, filename))]

    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
        for file in executor.map(calculate_sha256, files):
            if file is not None:
                results.append(file)

    return results

if __name__ == "__main__":
    checked_count = 0
    for root, _, _ in os.walk(base_directory):
        checked_files = process_files_in_directory(root)
        checked_count += len(checked_files)
        if checked_count % 100 == 0:
            sys.stdout.write(f"Checked {checked_count} files...\n")
            sys.stdout.flush()  # Flush the stdout buffer to write immediately

    sys.stdout.write(f"Checked {checked_count} files in total.\n")

More answers

I think you are doing it backwards. Instead of preventing the data from going into the page cache, you can simply mmap the file and work within the page cache directly, then MADV_FREE on it when you are done. This way you are still using a single copy of the data, but it is the page cache's copy rather than your local copy.
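A minimal sketch of the mmap part of this suggestion, assuming Python's standard mmap module; sha256_mmap is a hypothetical name. The madvise step is left out on purpose, because it is not clear that MADV_FREE applies to a file-backed mapping, so this only shows hashing straight from the mapped pages.

```python
import hashlib
import mmap


def sha256_mmap(path):
    """Hash a file through a read-only memory mapping instead of read() copies."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        try:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            # An empty file cannot be mapped; return the digest of no data
            return digest.hexdigest()
        with mm:
            # hashlib accepts any buffer object, so the mapping is hashed in place
            digest.update(mm)
    return digest.hexdigest()
```

This has not been measured against the numbers in the question, so whether it is faster on the Raspberry Pi setup remains an open question.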

By calling posix_fadvise with offset == 0 and length == 0 you're telling the kernel that you don't need any byte of the entire file, yet you immediately proceed to read from it again. If the kernel has been reading ahead because the bytes are spinning past the read head anyway, that could explain the performance hit. You probably want to set length to the number of bytes you've already read.
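One way to follow this advice is to pass only ranges that have already been consumed. The sketch below is a variation, not the commenter's exact suggestion: it advises just the chunk that was read last instead of the range from 0 to bytesRead, and it uses os.posix_fadvise from the standard library (Python 3.3+, Unix only) in place of the ctypes wrapper. sha256_drop_behind is a hypothetical name.

```python
import hashlib
import os


def sha256_drop_behind(path, chunk_size=65536):
    """Hash a file and advise DONTNEED only for data that was already read."""
    digest = hashlib.sha256()
    # os.posix_fadvise is not available on every platform
    can_advise = hasattr(os, "posix_fadvise")
    offset = 0
    with open(path, "rb") as f:
        fd = f.fileno()
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
            if can_advise:
                try:
                    # Only the chunk just consumed; a length of 0 would mean
                    # "through the end of the file" and hit unread data too
                    os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # the advice is best effort; hashing continues
            offset += len(data)
    return digest.hexdigest()
```

Given the measurements in the question, there is no reason to expect this to beat the variant without any advice; it only avoids discarding data that is about to be read.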

Thanks for the hint. I used this to do some additional measurements. I've updated the whole post. When not using POSIX_FADV_DONTNEED, the best speeds are reached on AMD64 as well as on the Raspberry Pi.
